* fix: prevent empty reasoning messages in streaming interfaces
Prevents empty "Thinking..." indicators from appearing in clients by
filtering out reasoning messages with no content at the source.
Changes:
- Gemini: Don't emit ReasoningMessage when only thought_signature exists
- Gemini: Only emit reasoning content if text is non-empty
- Anthropic: Don't emit ReasoningMessage for BetaSignatureDelta
- Anthropic: Only emit reasoning content if thinking text is non-empty
This fixes the issue where providers send signature metadata before
actual thinking content, causing empty reasoning blocks to appear
in the UI after responses complete.
Affects: Gemini reasoning, Anthropic extended thinking
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat: enable Datadog LLM Observability for memgpt-server
Enables DD_LLMOBS to track LLM calls, prompts, completions, and costs
in production for memgpt-server.
Changes:
- Add DD_LLMOBS_ENABLED=1 and DD_LLMOBS_ML_APP=memgpt-server in:
- .github/workflows/deploy-core.yml (GitHub Actions deployment)
- justfile (Helm deployment secrets)
- apps/core/letta/server/rest_api/app.py (runtime config)
This provides visibility into:
- LLM API calls and latency
- Prompt/completion content and tokens
- Model costs and usage
- Error rates per model/provider
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* dd llmobs
* Revert "fix: prevent empty reasoning messages in streaming interfaces"
This reverts commit a900228b3611de49eb5f740f68dc76a657fc9b14.
---------
Co-authored-by: Letta <noreply@letta.com>
Swap the order of @trace_method and @raise_on_invalid_id decorators
across all service managers so that @trace_method is always the first
wrapper applied to the function (positioned directly above the method).
This ensures the ID validation happens before tracing begins, which is
the intended execution order.
Files modified:
- agent_manager.py (23 occurrences)
- archive_manager.py (11 occurrences)
- block_manager.py (7 occurrences)
- file_manager.py (6 occurrences)
- group_manager.py (9 occurrences)
- identity_manager.py (10 occurrences)
- job_manager.py (7 occurrences)
- message_manager.py (2 occurrences)
- provider_manager.py (3 occurrences)
- sandbox_config_manager.py (7 occurrences)
- source_manager.py (5 occurrences)
- step_manager.py (13 occurrences)
* refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns
Complete migration to encrypted-only storage for sensitive fields:
- Remove dual-write to plaintext columns (token, custom_headers,
authorization_code, access_token, refresh_token, client_secret)
- Read only from _enc columns, not from plaintext fallback
- Remove helper methods (get_token_secret, set_token_secret, etc.)
- Remove Secret.from_db() and Secret.to_dict() methods
- Update tests to verify encrypted-only behavior
After this change, plaintext columns can be set to NULL manually
since they are no longer read from or written to.
* fix test
* rename
* update
* union
* fix test
* fix: run PBKDF2 in thread pool to prevent event loop freeze
Problem: Event loop freezes for 100-500ms during secret decryption, blocking
all HTTP requests and async operations. The diagnostic monitor detected the
main thread stuck in PBKDF2 HMAC SHA256 computation at:
apps/core/letta/helpers/crypto_utils.py:51 (_derive_key)
apps/core/letta/schemas/secret.py:161 (get_plaintext)
Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for
security, but running it synchronously on the main thread blocks the event loop.
Stack trace showed:
Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2
Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03)
Solution:
1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop
2. Add async versions of encrypt/decrypt methods
3. Add LRU cache for derived keys (deterministic results)
4. Add async get_plaintext_async() method to Secret class
Changes:
- apps/core/letta/helpers/crypto_utils.py:
- Added ThreadPoolExecutor for crypto operations
- Added @lru_cache(maxsize=256) to _derive_key_cached()
- Added _derive_key_async() using loop.run_in_executor()
- Added encrypt_async() and decrypt_async() methods
- Added warnings to sync methods about blocking behavior
- apps/core/letta/schemas/secret.py:
- Added get_plaintext_async() method
- Added warnings to get_plaintext() about blocking behavior
Benefits:
- Event loop no longer freezes during secret decryption
- HTTP requests continue processing while crypto runs in background
- Derived keys are cached, reducing CPU usage for repeated operations
- Backward compatible - sync methods still work for non-async code
Performance impact:
- Before: 100-500ms event loop block per decryption
- After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms
Next steps (follow-up PRs):
- Migrate all async callsites to use get_plaintext_async()
- Add metrics to track sync vs async usage
- Consider reducing PBKDF2 iterations if security allows
* update
* test
---------
Co-authored-by: Letta Bot <jinjpeng@gmail.com>
* feat: project_id uniqueness for tools
* prevent double upsert of global tools
* use default project if no header for sdk
* reorder unique constraint for performance
* use separate session for check conflict
* feature flag adding project id header in cloud api
* add my migration after one on main
* remove comment
* stage and publish api
* web set project id just for tools
* includes instead of startswith