letta-server

Author	SHA1	Message	Date
cthomas	2d82faad70	fix: skip agent relationship joins for source upload (#7335 )	2025-12-17 17:31:03 -08:00
jnjpng	350f3a751c	fix: update more plaintext non async callsites (#7223 ) * bae * update * fix * clean up * last	2025-12-17 17:31:02 -08:00
jnjpng	591420876a	fix: correct decorator order for trace_method and raise_on_invalid_id (#7226 ) Swap the order of @trace_method and @raise_on_invalid_id decorators across all service managers so that @trace_method is always the first wrapper applied to the function (positioned directly above the method). This ensures the ID validation happens before tracing begins, which is the intended execution order. Files modified: - agent_manager.py (23 occurrences) - archive_manager.py (11 occurrences) - block_manager.py (7 occurrences) - file_manager.py (6 occurrences) - group_manager.py (9 occurrences) - identity_manager.py (10 occurrences) - job_manager.py (7 occurrences) - message_manager.py (2 occurrences) - provider_manager.py (3 occurrences) - sandbox_config_manager.py (7 occurrences) - source_manager.py (5 occurrences) - step_manager.py (13 occurrences)	2025-12-17 17:31:02 -08:00
jnjpng	25d75d6528	fix: update async get plaintext callsites (#7069 ) * base * resolve * fix * fix	2025-12-17 17:31:02 -08:00
Sarah Wooders	f1bd246e9b	feat: use token streaming for anthropic summarization (#7105 )	2025-12-17 17:31:02 -08:00
jnjpng	00ba2d09f3	refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns (#6751 ) * refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns Complete migration to encrypted-only storage for sensitive fields: - Remove dual-write to plaintext columns (token, custom_headers, authorization_code, access_token, refresh_token, client_secret) - Read only from _enc columns, not from plaintext fallback - Remove helper methods (get_token_secret, set_token_secret, etc.) - Remove Secret.from_db() and Secret.to_dict() methods - Update tests to verify encrypted-only behavior After this change, plaintext columns can be set to NULL manually since they are no longer read from or written to. * fix test * rename * update * union * fix test	2025-12-17 17:31:02 -08:00
Ari Webb	4878b49fa1	chore: bounds check for assistant message index (#7070 )	2025-12-17 17:31:02 -08:00
Sarah Wooders	bd9f3aca9b	fix: fix `prompt_acknowledgement` usage and update summarization prompts (#7012 )	2025-12-15 12:03:09 -08:00
Sarah Wooders	812bfd16dd	Revert "feat: project_id uniqueness for tools" (#7007 ) Revert "feat: project_id uniqueness for tools (#6604)" This reverts commit 2c4b6397041e2c965493525fc52e056f10d1bdb6.	2025-12-15 12:03:09 -08:00
Sarah Wooders	a731e01e88	fix: use `model` instead of `model_settings` (#6834 )	2025-12-15 12:03:09 -08:00
Kian Jones	bce1749408	fix: run PBKDF2 in thread pool to prevent event loop freeze (#6763 ) * fix: run PBKDF2 in thread pool to prevent event loop freeze Problem: Event loop freezes for 100-500ms during secret decryption, blocking all HTTP requests and async operations. The diagnostic monitor detected the main thread stuck in PBKDF2 HMAC SHA256 computation at: apps/core/letta/helpers/crypto_utils.py:51 (_derive_key) apps/core/letta/schemas/secret.py:161 (get_plaintext) Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for security, but running it synchronously on the main thread blocks the event loop. Stack trace showed: Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2 Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03) Solution: 1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop 2. Add async versions of encrypt/decrypt methods 3. Add LRU cache for derived keys (deterministic results) 4. Add async get_plaintext_async() method to Secret class Changes: - apps/core/letta/helpers/crypto_utils.py: - Added ThreadPoolExecutor for crypto operations - Added @lru_cache(maxsize=256) to _derive_key_cached() - Added _derive_key_async() using loop.run_in_executor() - Added encrypt_async() and decrypt_async() methods - Added warnings to sync methods about blocking behavior - apps/core/letta/schemas/secret.py: - Added get_plaintext_async() method - Added warnings to get_plaintext() about blocking behavior Benefits: - Event loop no longer freezes during secret decryption - HTTP requests continue processing while crypto runs in background - Derived keys are cached, reducing CPU usage for repeated operations - Backward compatible - sync methods still work for non-async code Performance impact: - Before: 100-500ms event loop block per decryption - After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms Next steps (follow-up PRs): - Migrate all async callsites to use get_plaintext_async() - Add metrics to track sync vs async usage - Consider reducing PBKDF2 iterations if security allows * update * test --------- Co-authored-by: Letta Bot <jinjpeng@gmail.com>	2025-12-15 12:03:09 -08:00
Ari Webb	c1aa01db6f	feat: project_id uniqueness for tools (#6604 ) * feat: project_id uniqueness for tools * prevent double upsert of global tools * use default project if no header for sdk * reorder unique constraint for performance * use separate session for check conflict * feature flag adding project id header in cloud api * add my migration after one on main * remove comment * stage and publish api * web set project id just for tools * includes instead of startswith	2025-12-15 12:03:09 -08:00
Sarah Wooders	0634aa13a1	fix: avoid holding sessions open (#6769 )	2025-12-15 12:03:09 -08:00
Sarah Wooders	c9ad2fd7c4	chore: move things to debug logging (#6610 )	2025-12-15 12:03:09 -08:00
jnjpng	4be813b956	fix: migrate sandbox and agent environment variables to encrypted only (#6623 ) * base * remove unnecessary db migration * update * fix * update * update * comments * fix * revert * anotha --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:03:08 -08:00
jnjpng	714c537dc5	chore: change e2b sandbox error logs from debug to warning (#6726 ) Update log level for tool execution errors in e2b sandbox from debug to warning for better visibility when troubleshooting issues. Co-authored-by: Jin Peng <jinjpeng@users.noreply.github.com>	2025-12-15 12:02:34 -08:00
Sarah Wooders	7ea297231a	feat: add `compaction_settings` to agents (#6625 ) * initial commit * Add database migration for compaction_settings field This migration adds the compaction_settings column to the agents table to support customized summarization configuration for each agent. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix * rename * update apis * fix tests * update web test --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <kian@letta.com>	2025-12-15 12:02:34 -08:00
Kian Jones	15cede7281	fix: prevent db connection pool exhaustion in multi-agent tool executor (#6619 ) Problem: When executing a tool that sends messages to many agents matching tags, the code used asyncio.gather to process all agents concurrently. Each agent processing creates database operations (run creation, message storage), leading to N concurrent database connections. Example: If 100 agents match the tags, 100 simultaneous database connections are created, exhausting the connection pool and causing errors. Root cause: asyncio.gather(*[_process_agent(...) for agent in agents]) creates all coroutines and runs them concurrently, each opening a DB session. Solution: Process agents sequentially instead of concurrently. While this is slower, it prevents database connection pool exhaustion. The operation is still async, so it won't block the event loop. Changes: - apps/core/letta/services/tool_executor/multi_agent_tool_executor.py: - Replaced asyncio.gather with sequential for loop - Added explanatory comment about why sequential processing is needed Impact: With 100 matching agents: - Before: 100 concurrent DB connections (pool exhaustion) - After: 1 DB connection at a time (no pool exhaustion) Note: This follows the same pattern as PR #6617 which fixed a similar issue in file attachment operations.	2025-12-15 12:02:34 -08:00
Kian Jones	fbd89c9360	fix: replace all 'PRODUCTION' references with 'prod' for consistency (#6627 ) * fix: replace all 'PRODUCTION' references with 'prod' for consistency Problem: Codebase had 11 references to 'PRODUCTION' (uppercase) that should use 'prod' (lowercase) for consistency with the deployment workflows and environment normalization. Changes across 8 files: 1. Source files (using settings.environment): - letta/functions/function_sets/multi_agent.py - letta/services/tool_manager.py - letta/services/tool_executor/multi_agent_tool_executor.py - letta/services/helpers/agent_manager_helper.py All checks changed from: settings.environment == "PRODUCTION" To: settings.environment == "prod" 2. OTEL resource configuration: - letta/otel/resource.py - Updated _normalize_environment_tag() to handle 'prod' directly - Removed 'PRODUCTION' -> 'prod' mapping (no longer needed) - Updated device.id check from _env != "PRODUCTION" to _env != "prod" 3. Test files: - tests/managers/conftest.py - Fixture parameter changed from "PRODUCTION" to "prod" - tests/managers/test_agent_manager.py (3 occurrences) - tests/managers/test_tool_manager.py (2 occurrences) All test checks changed to use "prod" Result: Complete consistency across the codebase: - All environment checks use "prod" instead of "PRODUCTION" - Normalization function simplified (no special case for PRODUCTION) - Tests use correct "prod" value - Matches deployment workflow configuration from PR #6626 This completes the environment naming standardization effort. * fix: update settings.py environment description to use 'prod' instead of 'PRODUCTION' The field description still referenced PRODUCTION as an example value. Updated to use lowercase 'prod' for consistency with actual usage. Before: "Application environment (PRODUCTION, DEV, CANARY, etc. - normalized to lowercase for OTEL tags)" After: "Application environment (prod, dev, canary, etc. - lowercase values used for OTEL tags)"	2025-12-15 12:02:34 -08:00
Kian Jones	08ccc8b399	fix: prevent db connection pool exhaustion in file status checks (#6620 ) Problem: When listing files with status checking enabled, the code used asyncio.gather to check and update status for all files concurrently. Each status check may update the file in the database (e.g., for timeouts or embedding completion), leading to N concurrent database connections. Example: Listing 100 files with status checking creates 100 simultaneous database update operations, exhausting the connection pool. Root cause: asyncio.gather(*[check_and_update_file_status(f) for f in files]) processes all files concurrently, each potentially creating DB updates. Solution: Check and update file status sequentially instead of concurrently. While this is slower, it prevents database connection pool exhaustion when listing many files. Changes: - apps/core/letta/services/file_manager.py: - Replaced asyncio.gather with sequential for loop - Added explanatory comment about db pool exhaustion prevention Impact: With 100 files: - Before: Up to 100 concurrent DB connections (pool exhaustion) - After: 1 DB connection at a time (no pool exhaustion) Note: This follows the same pattern as PR #6617 and #6619 which fixed similar issues in file attachment and multi-agent tool execution.	2025-12-15 12:02:34 -08:00
Kian Jones	1a2e0aa8b7	fix: prevent db connection pool exhaustion in MCP server manager (#6622 ) Problem: When creating an MCP server with many tools, the code used two asyncio.gather calls - one for tool creation and one for mapping creation. Each operation involves database INSERT/UPDATE, leading to 2N concurrent database connections. Example: An MCP server with 50 tools creates 50 + 50 = 100 simultaneous database connections (tools + mappings), severely exhausting the pool. Root cause: 1. asyncio.gather([create_mcp_tool_async(...) for tool in tools]) 2. asyncio.gather([create_mcp_tool_mapping(...) for tool in results]) Both process operations concurrently, each opening a DB session. Solution: Process tool creation and mapping sequentially in a single loop. Create each tool, then immediately create its mapping if successful. This: - Reduces connection count from 2N to 1 - Maintains proper error handling per tool - Prevents database connection pool exhaustion Changes: - apps/core/letta/services/mcp_server_manager.py: - Replaced two asyncio.gather calls with single sequential loop - Create mapping immediately after each successful tool creation - Maintained return_exceptions=True behavior with try/except - Added explanatory comment about db pool exhaustion prevention Impact: With 50 MCP tools: - Before: 100 concurrent DB connections (50 tools + 50 mappings, pool exhaustion) - After: 1 DB connection at a time (no pool exhaustion) Note: This follows the same pattern as PR #6617, #6619, #6620, and #6621 which fixed similar issues throughout the codebase.	2025-12-15 12:02:34 -08:00
Kian Jones	43aa97b7d2	fix: prevent db connection pool exhaustion in MCP tool creation (#6621 ) Problem: When creating an MCP server with many tools, the code used asyncio.gather to create all tools concurrently. Each tool creation involves database operations (INSERT with upsert logic), leading to N concurrent database connections. Example: An MCP server with 50 tools creates 50 simultaneous database connections during server creation, exhausting the connection pool. Root cause: asyncio.gather(*[create_mcp_tool_async(...) for tool in tools]) processes all tool creations concurrently, each opening a DB session. Solution: Create tools sequentially instead of concurrently. While this takes longer for server creation, it prevents database connection pool exhaustion and maintains error handling by catching exceptions per tool. Changes: - apps/core/letta/services/mcp_manager.py: - Replaced asyncio.gather with sequential for loop - Maintained return_exceptions=True behavior with try/except - Added explanatory comment about db pool exhaustion prevention Impact: With 50 MCP tools: - Before: 50 concurrent DB connections (pool exhaustion) - After: 1 DB connection at a time (no pool exhaustion) Note: This follows the same pattern as PR #6617, #6619, and #6620 which fixed similar issues in file operations, multi-agent execution, and file status checks.	2025-12-15 12:02:34 -08:00
cthomas	0d77b373e6	fix: remove concurrent db writes for file upload (#6617 )	2025-12-15 12:02:34 -08:00
jnjpng	3221ed8a14	fix: update base provider to only handle _enc fields (#6591 ) * base * update * another pass * fix * generate * fix test * don't set on create * last fixes --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:34 -08:00
Sarah Wooders	c8fa77a01f	feat: cleanup cancellation code and add more logging (#6588 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	70c57c5072	fix: various patches to summarizer (#6597 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	8440e319e2	Revert "feat: enable provider models persistence" (#6590 ) Revert "feat: enable provider models persistence (#6193)" This reverts commit 9682aff32640a6ee8cf71a6f18c9fa7cda25c40e.	2025-12-15 12:02:34 -08:00
Sarah Wooders	bbd52e291c	feat: refactor summarization and message persistence code [LET-6464] (#6561 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	821549817d	chore: add error logging if run updates are invalid (#6582 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	fca5774795	feat: store run errors on streaming (#6573 )	2025-12-15 12:02:34 -08:00
Ari Webb	848a73125c	feat: enable provider models persistence (#6193 ) * Revert "fix test" This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1. * fix server and better test * test fix, get api key for base and byok? * set letta default endpoint * try to fix timeout for test * fix for letta api key * Delete apps/core/tests/sdk_v1/conftest.py * Update utils.py * clean up a few issues * fix filtering on list_llm_models * soft delete models with provider * add one more test * fix ci * add timeout * band aid for letta embedding provider * info instead of error logs when creating models	2025-12-15 12:02:34 -08:00
jnjpng	2536942be2	fix: combined tool manager improvements - tracing and redundant fetches (#6570 ) * fix: combined tool manager improvements - tracing and redundant fetches This PR combines improvements from #6530 and #6535: - Add tracer import to enable proper tracing spans - Improve update check logic to verify actual field changes before updating - Return current_tool directly when no update is needed (avoids redundant fetch) - Add structured tracing spans to update_tool_by_id_async for better observability - Fix decorator order for better error handling (raise_on_invalid_id before trace_method) - Remove unnecessary tracing spans in create_or_update_tool_async 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * revert: remove tracing spans from update_tool_by_id_async Remove the tracer span additions from update_tool_by_id_async while keeping all other improvements (decorator order fix, redundant fetch removal, and improved update check logic). 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:34 -08:00
cthomas	a7c0bad857	fix: unbound var in summarization [LET-6484] (#6568 ) * fix: unbound var in summarization * fix indentation	2025-12-15 12:02:34 -08:00
Cameron	a56c6571d2	fix: update fetch_webpage docstring to reflect actual implementation (#6503 ) The docstring incorrectly stated that fetch_webpage uses Jina AI reader. Updated to accurately describe the actual implementation which uses: 1. Exa API (if EXA_API_KEY is available) 2. Trafilatura (fallback) 3. Readability + html2text (final fallback) 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2025-12-15 12:02:34 -08:00
Ari Webb	4092820f3a	feat: add project id scoping for tools backend changes (#6529 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	4f1fbe45aa	feat: add index and concurrency control for tools (fixed alembic) (#6552 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	c8c06168e2	Revert "feat: add index and concurrency control for tools " (#6551 ) Revert "feat: add index and concurrency control for tools (#6547)" This reverts commit f4abf8e061bc2f5e08853b5ce5775a7f8626463a.	2025-12-15 12:02:34 -08:00
Sarah Wooders	a2d3011d84	feat: add index and concurrency control for tools (#6547 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	a18caf69f7	fix: undo tool manager changes (#6546 )	2025-12-15 12:02:34 -08:00
jnjpng	0c21d20914	fix: redundant tool fetch on no op update and fix tool manager tracing (#6535 ) base Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:33 -08:00
cthomas	6fd73d4dcf	chore: add more summarize tracing (#6532 )	2025-12-15 12:02:33 -08:00
Kian Jones	edeac2c679	fix: fix gemini otel bug and add tracing for tool upsert (#6523 ) add tracing for tool upsert, and fix gemini otel bug	2025-12-15 12:02:33 -08:00
jnjpng	7058d63bd8	fix: update tool check and add tracing to update tool logic (#6530 ) base Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:33 -08:00
cthomas	109989ac1a	fix: tracing for summarizer errors (#6524 )	2025-12-15 12:02:33 -08:00
jnjpng	e5bda413c0	fix: agent environment variables not using encrypted values (#6520 ) * base * clean up --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:33 -08:00
Kian Jones	0bac88d88a	feat: run id correlation for core logs (#6518 ) run-id correlation	2025-12-15 12:02:33 -08:00
Ari Webb	eb547bb96e	fix: clear message history no longer deletes messages (#6515 ) * fix: clear message history no longer deletes messages * toast and make it stay for 8 secs * fix test --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-12-15 12:02:33 -08:00
jnjpng	89ca8fe8b0	feat: read from encrypted values with error on fallback [LET-4250] (#6484 ) * base * warning --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:33 -08:00
Kian Jones	964fc05cb9	chore: add more tracing to tool upsert endpoint (#6471 ) * add tracing * fix hallucination	2025-12-15 12:02:19 -08:00
jnjpng	c90eacdcbb	fix: step metrics not found (#6472 ) * base * comment --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:19 -08:00
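
Several commits above (#6617, #6619, #6620, #6621, #6622) describe the same change: replacing asyncio.gather with a sequential loop so that only one database connection is in use at a time. The following is a minimal sketch of that pattern, not the code from those commits. FakePool, run_query, and peak_connections are hypothetical names invented for illustration; the pool only counts how many callers are inside it at once.

```python
import asyncio


class FakePool:
    """Hypothetical stand-in for a DB connection pool; records peak concurrent use."""

    def __init__(self):
        self.in_use = 0
        self.peak = 0

    async def run_query(self, item):
        self.in_use += 1
        self.peak = max(self.peak, self.in_use)
        await asyncio.sleep(0)  # yield control, as a real query would while waiting on I/O
        self.in_use -= 1
        return item


async def process_concurrently(pool, items):
    # "Before" pattern: every coroutine starts at once, so N items hold N connections.
    return await asyncio.gather(*[pool.run_query(i) for i in items])


async def process_sequentially(pool, items):
    # "After" pattern: one item at a time; the try/except keeps per-item error
    # handling, similar in spirit to gather(..., return_exceptions=True).
    results = []
    for item in items:
        try:
            results.append(await pool.run_query(item))
        except Exception as exc:
            results.append(exc)
    return results


def peak_connections(worker, n):
    """Run `worker` over n items and report the peak number of simultaneous queries."""
    pool = FakePool()
    asyncio.run(worker(pool, list(range(n))))
    return pool.peak
```

Calling peak_connections(process_concurrently, 100) and peak_connections(process_sequentially, 100) shows the trade-off the commit messages describe: the sequential version is slower overall but never holds more than one connection.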


1643 Commits
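
Commit bce1749408 (#6763) describes moving PBKDF2 key derivation off the event loop with a ThreadPoolExecutor, loop.run_in_executor(), and an @lru_cache(maxsize=256) on the derived key. The sketch below illustrates that approach using only the Python standard library. It is an assumption-laden illustration, not the contents of crypto_utils.py: the function names, the worker count, and the 32-byte key length are hypothetical, while the SHA256 hash, the 100k iteration count, and the cache size are taken from the commit message.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Hypothetical executor dedicated to CPU-heavy crypto work (worker count is an assumption).
_crypto_executor = ThreadPoolExecutor(max_workers=2)


@lru_cache(maxsize=256)
def derive_key_cached(password: bytes, salt: bytes) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA256; results are deterministic, so caching is safe."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000, dklen=32)


async def derive_key_async(password: bytes, salt: bytes) -> bytes:
    """Run the derivation in a worker thread so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_crypto_executor, derive_key_cached, password, salt)
```

An async caller would use `key = await derive_key_async(password, salt)`. Repeated calls with the same password and salt are served from the cache rather than re-running the derivation, which matches the "LRU cache hits" benefit listed in the commit message.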