Commit Graph

1643 Commits

Author SHA1 Message Date
cthomas
2d82faad70 fix: skip agent relationship joins for source upload (#7335) 2025-12-17 17:31:03 -08:00
jnjpng
350f3a751c fix: update more plaintext non async callsites (#7223)
* bae

* update

* fix

* clean up

* last
2025-12-17 17:31:02 -08:00
jnjpng
591420876a fix: correct decorator order for trace_method and raise_on_invalid_id (#7226)
Swap the order of @trace_method and @raise_on_invalid_id decorators
across all service managers so that @trace_method is always the first
wrapper applied to the function (positioned directly above the method).

This ensures the ID validation happens before tracing begins, which is
the intended execution order.

Files modified:
- agent_manager.py (23 occurrences)
- archive_manager.py (11 occurrences)
- block_manager.py (7 occurrences)
- file_manager.py (6 occurrences)
- group_manager.py (9 occurrences)
- identity_manager.py (10 occurrences)
- job_manager.py (7 occurrences)
- message_manager.py (2 occurrences)
- provider_manager.py (3 occurrences)
- sandbox_config_manager.py (7 occurrences)
- source_manager.py (5 occurrences)
- step_manager.py (13 occurrences)
2025-12-17 17:31:02 -08:00
jnjpng
25d75d6528 fix: update aysnc get plaintext callsites (#7069)
* base

* resolve

* fix

* fix
2025-12-17 17:31:02 -08:00
Sarah Wooders
f1bd246e9b feat: use token streaming for anthropic summarization (#7105) 2025-12-17 17:31:02 -08:00
jnjpng
00ba2d09f3 refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns (#6751)
* refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns

Complete migration to encrypted-only storage for sensitive fields:

- Remove dual-write to plaintext columns (token, custom_headers,
  authorization_code, access_token, refresh_token, client_secret)
- Read only from _enc columns, not from plaintext fallback
- Remove helper methods (get_token_secret, set_token_secret, etc.)
- Remove Secret.from_db() and Secret.to_dict() methods
- Update tests to verify encrypted-only behavior

After this change, plaintext columns can be set to NULL manually
since they are no longer read from or written to.

* fix test

* rename

* update

* union

* fix test
2025-12-17 17:31:02 -08:00
Ari Webb
4878b49fa1 chore: bounds check for assistant message index (#7070) 2025-12-17 17:31:02 -08:00
Sarah Wooders
bd9f3aca9b fix: fix prompt_acknowledgement usage and update summarization prompts (#7012) 2025-12-15 12:03:09 -08:00
Sarah Wooders
812bfd16dd Revert "feat: project_id uniqueness for tools" (#7007)
Revert "feat: project_id uniqueness for tools (#6604)"

This reverts commit 2c4b6397041e2c965493525fc52e056f10d1bdb6.
2025-12-15 12:03:09 -08:00
Sarah Wooders
a731e01e88 fix: use model instead of model_settings (#6834) 2025-12-15 12:03:09 -08:00
Kian Jones
bce1749408 fix: run PBKDF2 in thread pool to prevent event loop freeze (#6763)
* fix: run PBKDF2 in thread pool to prevent event loop freeze

Problem: Event loop freezes for 100-500ms during secret decryption, blocking
all HTTP requests and async operations. The diagnostic monitor detected the
main thread stuck in PBKDF2 HMAC SHA256 computation at:
  apps/core/letta/helpers/crypto_utils.py:51 (_derive_key)
  apps/core/letta/schemas/secret.py:161 (get_plaintext)

Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for
security, but running it synchronously on the main thread blocks the event loop.

Stack trace showed:
  Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2
  Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03)

Solution:
1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop
2. Add async versions of encrypt/decrypt methods
3. Add LRU cache for derived keys (deterministic results)
4. Add async get_plaintext_async() method to Secret class

Changes:
- apps/core/letta/helpers/crypto_utils.py:
  - Added ThreadPoolExecutor for crypto operations
  - Added @lru_cache(maxsize=256) to _derive_key_cached()
  - Added _derive_key_async() using loop.run_in_executor()
  - Added encrypt_async() and decrypt_async() methods
  - Added warnings to sync methods about blocking behavior

- apps/core/letta/schemas/secret.py:
  - Added get_plaintext_async() method
  - Added warnings to get_plaintext() about blocking behavior

Benefits:
- Event loop no longer freezes during secret decryption
- HTTP requests continue processing while crypto runs in background
- Derived keys are cached, reducing CPU usage for repeated operations
- Backward compatible - sync methods still work for non-async code

Performance impact:
- Before: 100-500ms event loop block per decryption
- After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms

Next steps (follow-up PRs):
- Migrate all async callsites to use get_plaintext_async()
- Add metrics to track sync vs async usage
- Consider reducing PBKDF2 iterations if security allows

* update

* test

---------

Co-authored-by: Letta Bot <jinjpeng@gmail.com>
2025-12-15 12:03:09 -08:00
Ari Webb
c1aa01db6f feat: project_id uniqueness for tools (#6604)
* feat: project_id uniqueness for tools

* prevent double upsert of global tools

* use default project if no header for sdk

* reorder unique constraint for performance

* use separate session for check conflict

* feature flag adding project id header in cloud api

* add my migration after one on main

* remove comment

* stage and publish api

* web set project id just for tools

* includes instead of startswith
2025-12-15 12:03:09 -08:00
Sarah Wooders
0634aa13a1 fix: avoid holding sessions open (#6769) 2025-12-15 12:03:09 -08:00
Sarah Wooders
c9ad2fd7c4 chore: move things to debug logging (#6610) 2025-12-15 12:03:09 -08:00
jnjpng
4be813b956 fix: migrate sandbox and agent environment variables to encrypted only (#6623)
* base

* remove unnnecessary db migration

* update

* fix

* update

* update

* comments

* fix

* revert

* anotha

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:03:08 -08:00
jnjpng
714c537dc5 chore: change e2b sandbox error logs from debug to warning (#6726)
Update log level for tool execution errors in e2b sandbox from debug
to warning for better visibility when troubleshooting issues.

Co-authored-by: Jin Peng <jinjpeng@users.noreply.github.com>
2025-12-15 12:02:34 -08:00
Sarah Wooders
7ea297231a feat: add compaction_settings to agents (#6625)
* initial commit

* Add database migration for compaction_settings field

This migration adds the compaction_settings column to the agents table
to support customized summarization configuration for each agent.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix

* rename

* update apis

* fix tests

* update web test

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <kian@letta.com>
2025-12-15 12:02:34 -08:00
Kian Jones
15cede7281 fix: prevent db connection pool exhaustion in multi-agent tool executor (#6619)
Problem: When executing a tool that sends messages to many agents matching
tags, the code used asyncio.gather to process all agents concurrently. Each
agent processing creates database operations (run creation, message storage),
leading to N concurrent database connections.

Example: If 100 agents match the tags, 100 simultaneous database connections
are created, exhausting the connection pool and causing errors.

Root cause: asyncio.gather(*[_process_agent(...) for agent in agents])
creates all coroutines and runs them concurrently, each opening a DB session.

Solution: Process agents sequentially instead of concurrently. While this is
slower, it prevents database connection pool exhaustion. The operation is
still async, so it won't block the event loop.

Changes:
- apps/core/letta/services/tool_executor/multi_agent_tool_executor.py:
  - Replaced asyncio.gather with sequential for loop
  - Added explanatory comment about why sequential processing is needed

Impact: With 100 matching agents:
- Before: 100 concurrent DB connections (pool exhaustion)
- After: 1 DB connection at a time (no pool exhaustion)

Note: This follows the same pattern as PR #6617 which fixed a similar issue
in file attachment operations.
2025-12-15 12:02:34 -08:00
Kian Jones
fbd89c9360 fix: replace all 'PRODUCTION' references with 'prod' for consistency (#6627)
* fix: replace all 'PRODUCTION' references with 'prod' for consistency

Problem: Codebase had 11 references to 'PRODUCTION' (uppercase) that should
use 'prod' (lowercase) for consistency with the deployment workflows and
environment normalization.

Changes across 8 files:

1. Source files (using settings.environment):
   - letta/functions/function_sets/multi_agent.py
   - letta/services/tool_manager.py
   - letta/services/tool_executor/multi_agent_tool_executor.py
   - letta/services/helpers/agent_manager_helper.py
   All checks changed from: settings.environment == "PRODUCTION"
   To: settings.environment == "prod"

2. OTEL resource configuration:
   - letta/otel/resource.py
     - Updated _normalize_environment_tag() to handle 'prod' directly
     - Removed 'PRODUCTION' -> 'prod' mapping (no longer needed)
     - Updated device.id check from _env != "PRODUCTION" to _env != "prod"

3. Test files:
   - tests/managers/conftest.py
     - Fixture parameter changed from "PRODUCTION" to "prod"
   - tests/managers/test_agent_manager.py (3 occurrences)
   - tests/managers/test_tool_manager.py (2 occurrences)
   All test checks changed to use "prod"

Result: Complete consistency across the codebase:
- All environment checks use "prod" instead of "PRODUCTION"
- Normalization function simplified (no special case for PRODUCTION)
- Tests use correct "prod" value
- Matches deployment workflow configuration from PR #6626

This completes the environment naming standardization effort.

* fix: update settings.py environment description to use 'prod' instead of 'PRODUCTION'

The field description still referenced PRODUCTION as an example value.
Updated to use lowercase 'prod' for consistency with actual usage.

Before: "Application environment (PRODUCTION, DEV, CANARY, etc. - normalized to lowercase for OTEL tags)"
After: "Application environment (prod, dev, canary, etc. - lowercase values used for OTEL tags)"
2025-12-15 12:02:34 -08:00
Kian Jones
08ccc8b399 fix: prevent db connection pool exhaustion in file status checks (#6620)
Problem: When listing files with status checking enabled, the code used
asyncio.gather to check and update status for all files concurrently. Each
status check may update the file in the database (e.g., for timeouts or
embedding completion), leading to N concurrent database connections.

Example: Listing 100 files with status checking creates 100 simultaneous
database update operations, exhausting the connection pool.

Root cause: asyncio.gather(*[check_and_update_file_status(f) for f in files])
processes all files concurrently, each potentially creating DB updates.

Solution: Check and update file status sequentially instead of concurrently.
While this is slower, it prevents database connection pool exhaustion when
listing many files.

Changes:
- apps/core/letta/services/file_manager.py:
  - Replaced asyncio.gather with sequential for loop
  - Added explanatory comment about db pool exhaustion prevention

Impact: With 100 files:
- Before: Up to 100 concurrent DB connections (pool exhaustion)
- After: 1 DB connection at a time (no pool exhaustion)

Note: This follows the same pattern as PR #6617 and #6619 which fixed
similar issues in file attachment and multi-agent tool execution.
2025-12-15 12:02:34 -08:00
Kian Jones
1a2e0aa8b7 fix: prevent db connection pool exhaustion in MCP server manager (#6622)
Problem: When creating an MCP server with many tools, the code used two
asyncio.gather calls - one for tool creation and one for mapping creation.
Each operation involves database INSERT/UPDATE, leading to 2N concurrent
database connections.

Example: An MCP server with 50 tools creates 50 + 50 = 100 simultaneous
database connections (tools + mappings), severely exhausting the pool.

Root cause:
1. asyncio.gather(*[create_mcp_tool_async(...) for tool in tools])
2. asyncio.gather(*[create_mcp_tool_mapping(...) for tool in results])
Both process operations concurrently, each opening a DB session.

Solution: Process tool creation and mapping sequentially in a single loop.
Create each tool, then immediately create its mapping if successful. This:
- Reduces connection count from 2N to 1
- Maintains proper error handling per tool
- Prevents database connection pool exhaustion

Changes:
- apps/core/letta/services/mcp_server_manager.py:
  - Replaced two asyncio.gather calls with single sequential loop
  - Create mapping immediately after each successful tool creation
  - Maintained return_exceptions=True behavior with try/except
  - Added explanatory comment about db pool exhaustion prevention

Impact: With 50 MCP tools:
- Before: 100 concurrent DB connections (50 tools + 50 mappings, pool exhaustion)
- After: 1 DB connection at a time (no pool exhaustion)

Note: This follows the same pattern as PR #6617, #6619, #6620, and #6621
which fixed similar issues throughout the codebase.
2025-12-15 12:02:34 -08:00
Kian Jones
43aa97b7d2 fix: prevent db connection pool exhaustion in MCP tool creation (#6621)
Problem: When creating an MCP server with many tools, the code used
asyncio.gather to create all tools concurrently. Each tool creation involves
database operations (INSERT with upsert logic), leading to N concurrent
database connections.

Example: An MCP server with 50 tools creates 50 simultaneous database
connections during server creation, exhausting the connection pool.

Root cause: asyncio.gather(*[create_mcp_tool_async(...) for tool in tools])
processes all tool creations concurrently, each opening a DB session.

Solution: Create tools sequentially instead of concurrently. While this takes
longer for server creation, it prevents database connection pool exhaustion
and maintains error handling by catching exceptions per tool.

Changes:
- apps/core/letta/services/mcp_manager.py:
  - Replaced asyncio.gather with sequential for loop
  - Maintained return_exceptions=True behavior with try/except
  - Added explanatory comment about db pool exhaustion prevention

Impact: With 50 MCP tools:
- Before: 50 concurrent DB connections (pool exhaustion)
- After: 1 DB connection at a time (no pool exhaustion)

Note: This follows the same pattern as PR #6617, #6619, and #6620 which
fixed similar issues in file operations, multi-agent execution, and file
status checks.
2025-12-15 12:02:34 -08:00
cthomas
0d77b373e6 fix: remove concurrent db writes for file upload (#6617) 2025-12-15 12:02:34 -08:00
jnjpng
3221ed8a14 fix: update base provider to only handle _enc fields (#6591)
* base

* update

* another pass

* fix

* generate

* fix test

* don't set on create

* last fixes

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:34 -08:00
Sarah Wooders
c8fa77a01f feat: cleanup cancellation code and add more logging (#6588) 2025-12-15 12:02:34 -08:00
Sarah Wooders
70c57c5072 fix: various patches to summarizer (#6597) 2025-12-15 12:02:34 -08:00
Sarah Wooders
8440e319e2 Revert "feat: enable provider models persistence" (#6590)
Revert "feat: enable provider models persistence (#6193)"

This reverts commit 9682aff32640a6ee8cf71a6f18c9fa7cda25c40e.
2025-12-15 12:02:34 -08:00
Sarah Wooders
bbd52e291c feat: refactor summarization and message persistence code [LET-6464] (#6561) 2025-12-15 12:02:34 -08:00
Sarah Wooders
821549817d chore: add error logging if run updates are invalid (#6582) 2025-12-15 12:02:34 -08:00
Sarah Wooders
fca5774795 feat: store run errors on streaming (#6573) 2025-12-15 12:02:34 -08:00
Ari Webb
848a73125c feat: enable provider models persistence (#6193)
* Revert "fix test"

This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1.

* fix server and better test

* test fix, get api key for base and byok?

* set letta default endpoint

* try to fix timeout for test

* fix for letta api key

* Delete apps/core/tests/sdk_v1/conftest.py

* Update utils.py

* clean up a few issues

* fix filterning on list_llm_models

* soft delete models with provider

* add one more test

* fix ci

* add timeout

* band aid for letta embedding provider

* info instead of error logs when creating models
2025-12-15 12:02:34 -08:00
jnjpng
2536942be2 fix: combined tool manager improvements - tracing and redundant fetches (#6570)
* fix: combined tool manager improvements - tracing and redundant fetches

This PR combines improvements from #6530 and #6535:

- Add tracer import to enable proper tracing spans
- Improve update check logic to verify actual field changes before updating
- Return current_tool directly when no update is needed (avoids redundant fetch)
- Add structured tracing spans to update_tool_by_id_async for better observability
- Fix decorator order for better error handling (raise_on_invalid_id before trace_method)
- Remove unnecessary tracing spans in create_or_update_tool_async

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* revert: remove tracing spans from update_tool_by_id_async

Remove the tracer span additions from update_tool_by_id_async while keeping
all other improvements (decorator order fix, redundant fetch removal, and
improved update check logic).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:34 -08:00
cthomas
a7c0bad857 fix: unbound var in summarization [LET-6484] (#6568)
* fix: unbound var in summarization

* fix indentation
2025-12-15 12:02:34 -08:00
Cameron
a56c6571d2 fix: update fetch_webpage docstring to reflect actual implementation (#6503)
The docstring incorrectly stated that fetch_webpage uses Jina AI reader.
Updated to accurately describe the actual implementation which uses:
1. Exa API (if EXA_API_KEY is available)
2. Trafilatura (fallback)
3. Readability + html2text (final fallback)

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2025-12-15 12:02:34 -08:00
Ari Webb
4092820f3a feat: add project id scoping for tools backend changes (#6529) 2025-12-15 12:02:34 -08:00
Sarah Wooders
4f1fbe45aa feat: add index and concurrency control for tools (fixed alembic) (#6552) 2025-12-15 12:02:34 -08:00
Sarah Wooders
c8c06168e2 Revert "feat: add index and concurrency control for tools " (#6551)
Revert "feat: add index and concurrency control for tools  (#6547)"

This reverts commit f4abf8e061bc2f5e08853b5ce5775a7f8626463a.
2025-12-15 12:02:34 -08:00
Sarah Wooders
a2d3011d84 feat: add index and concurrency control for tools (#6547) 2025-12-15 12:02:34 -08:00
Sarah Wooders
a18caf69f7 fix: undo tool manager changes (#6546) 2025-12-15 12:02:34 -08:00
jnjpng
0c21d20914 fix: redundant tool fetch on no op update and fix tool manager tracing (#6535)
base

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:33 -08:00
cthomas
6fd73d4dcf chore: add more summarize tracing (#6532) 2025-12-15 12:02:33 -08:00
Kian Jones
edeac2c679 fix: fix gemini otel bug and add tracing for tool upsert (#6523)
add tracing for tool upsert, and fix gemini otel bug
2025-12-15 12:02:33 -08:00
jnjpng
7058d63bd8 fix: update tool check and add tracing to update tool logic (#6530)
base

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:33 -08:00
cthomas
109989ac1a fix: tracing for summarizer errors (#6524) 2025-12-15 12:02:33 -08:00
jnjpng
e5bda413c0 fix: agent environment variables not using encrypted values (#6520)
* base

* clean up

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:33 -08:00
Kian Jones
0bac88d88a feat: run id correlation for core logs (#6518)
run-id correlation
2025-12-15 12:02:33 -08:00
Ari Webb
eb547bb96e fix: clear message history no longer deletes messages (#6515)
* fix: clear message history no longer deletes messages

* toast and make it stay for 8 secs

* fix test

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-12-15 12:02:33 -08:00
jnjpng
89ca8fe8b0 feat: read from encrypted values with error on fallback [LET-4250] (#6484)
* base

* warning

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:33 -08:00
Kian Jones
964fc05cb9 chore: add more tracing to tool upsert endpoint (#6471)
* add tracing

* fix hallucination
2025-12-15 12:02:19 -08:00
jnjpng
c90eacdcbb fix: step metrics not found (#6472)
* base

* comment

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:19 -08:00