letta-server

Author	SHA1	Message	Date
Ari Webb	dd0e513951	fix: lazy load conversations [LET-7682] (#9629 ) fix: lazy load conversations	2026-03-03 18:34:01 -08:00
Sarah Wooders	afbc416972	feat(core): add model/model_settings override fields to conversation create/update (#9607 )	2026-02-24 10:55:26 -08:00
Kian Jones	f5c4ab50f4	chore: add ty + pre-commit hook and repeal even more ruff rules (#9504 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import * remove all ignores, add FastAPI rules and Ruff rules * add ty and precommit * ruff stuff * ty check fixes * ty check fixes pt 2 * error on invalid	2026-02-24 10:55:11 -08:00
Kian Jones	25d54dd896	chore: enable F821, F401, W293 (#9503 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import	2026-02-24 10:55:08 -08:00
Kian Jones	0d42afa151	fix(core): catch LockNotAvailableError and return 409 instead of 500 (#9359 ) Re-apply changes on top of latest main to resolve merge conflicts. - Add DatabaseLockNotAvailableError custom exception in orm/errors.py - Catch asyncpg LockNotAvailableError and pgcode 55P03 in _handle_dbapi_error - Register FastAPI exception handler returning 409 with Retry-After header 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	71e0a8aab9	fix(core): use INSERT ON CONFLICT DO NOTHING for provider model sync (#9342 ) * fix(core): use INSERT ON CONFLICT DO NOTHING for provider model sync Replaces try/except around model.create_async() with pg_insert() .on_conflict_do_nothing() to prevent UniqueViolationError from being raised at the asyncpg driver level during concurrent model syncs. The previous approach caught the exception in Python but ddtrace still captured it at the driver level, causing Datadog error tracking noise. Fixes Datadog issue d8dec148-d535-11f0-95eb-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * cleaner impl * fix --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Ari Webb <ari@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	4eb27e23b3	fix(core): add deadlock retry logic to ORM write operations (#9352 ) Adds automatic retry with exponential backoff for PostgreSQL deadlock errors (40P01) in all ORM write methods: create_async, update_async, batch_create_async, hard_delete_async, and bulk_hard_delete_async. For update_async, column values are snapshotted before the commit attempt so they can be restored after rollback clears them. Also adds DatabaseDeadlockError to _handle_dbapi_error as a fallback when retries are exhausted. Datadog: https://us5.datadoghq.com/error-tracking/issue/53ccdd7a-f0cc-11f0-8969-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>	2026-02-24 10:52:07 -08:00
Kian Jones	d592ec3135	fix: handle DBAPIError wrapping asyncpg DeadlockDetectedError (#9355 ) SQLAlchemy wraps asyncpg's DeadlockDetectedError in a DBAPIError, which was falling through to the generic 500 handler. Now detected at both the ORM level (_handle_dbapi_error) and FastAPI handler level, returning 409 with Retry-After header. Datadog: https://us5.datadoghq.com/error-tracking/issue/2f1dc54c-dab6-11f0-a828-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	21e880907f	feat(core): structure memory directory and block labels [LET-7336] (#9309 )	2026-02-24 10:52:06 -08:00
Sarah Wooders	e0a23f7039	feat: add usage columns to steps table (#9270 ) * feat: add usage columns to steps table Adds denormalized usage fields to the steps table for easier querying: - model_handle: The model handle (e.g., "openai/gpt-4o-mini") - cached_input_tokens: Tokens served from cache - cache_write_tokens: Tokens written to cache (Anthropic) - reasoning_tokens: Reasoning/thinking tokens These fields mirror LettaUsageStatistics and are extracted from the existing prompt_tokens_details and completion_tokens_details JSON columns. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate OpenAPI specs and SDK for usage columns 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>	2026-02-24 10:52:06 -08:00
Kian Jones	c1a02fa180	feat: add metadata-only provider trace storage option (#9155 ) * feat: add metadata-only provider trace storage option Add support for writing provider traces to a lightweight metadata-only table (~1.5GB) instead of the full table (~725GB) since request/response JSON is now stored in GCS. - Add `LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY` setting - Create `provider_trace_metadata` table via alembic migration - Conditionally write to new table when flag is enabled - Include backfill script for migrating existing data 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate API spec and SDK * fix: use composite PK (created_at, id) for provider_trace_metadata Aligns with GCS partitioning structure (raw/date=YYYY-MM-DD/{id}.json.gz) and enables efficient date-range queries via the B-tree index. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * ammendments * fix: add bulk data copy to migration Copy existing provider_traces metadata in-migration instead of separate backfill script. Creates indexes after bulk insert for better performance. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: remove data copy from migration, create empty table only Old data stays in provider_traces, new writes go to provider_trace_metadata when flag is enabled. Full traces are in GCS anyway. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: address PR comments - Remove GCS mention from ProviderTraceMetadata docstring - Move metadata object creation outside session context 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: reads always use full provider_traces table The metadata_only flag should only control writes. Reads always go to the full table to avoid returning ProviderTraceMetadata where ProviderTrace is expected. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: enable metadata-only provider trace writes in prod Add LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY=true to all Helm values (memgpt-server and lettuce-py, prod and dev). 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
Kian Jones	0099a95a43	fix(sec): first pass of ensuring actor id is required everywhere (#9126 ) first pass of ensuring actor id is required	2026-01-29 12:44:04 -08:00
Kian Jones	e3fb00f970	feat(crouton): add orgId, userId, Compaction_Settings and LLM_Config (#9022 ) * LC one shot? * api changes * fix summarizer nameerror	2026-01-29 12:44:04 -08:00
Ari Webb	5c06918042	fix: don't need embedding model for self hosted [LET-7009] (#8935 ) * fix: don't need embedding model for self hosted * stage publish api * passes tests * add test * remove unnecessary upgrades * update revision order db migrations * add timeout for ci	2026-01-29 12:44:04 -08:00
Ari Webb	4ec6649caf	feat: byok provider models in db also (#8317 ) * feat: byok provider models in db also * make tests and sync api * fix inconsistent state with recreating provider of same name * fix sync on byok creation * update revision * move stripe code for testing purposes * revert * add refresh byok models endpoint * just stage publish api * add tests * reorder revision * add test for name clashes	2026-01-29 12:43:53 -08:00
Kian Jones	2ee28c3264	feat: add telemetry source identifier (#8918 ) * add telemetry source * add source to provider trace	2026-01-19 15:54:44 -08:00
Kian Jones	9418ab9815	feat: add provider trace backend abstraction for multi-backend telemetry (#8814 ) * feat: add provider trace backend abstraction for multi-backend telemetry Introduces a pluggable backend system for provider traces: - Base class with async/sync create and read interfaces - PostgreSQL backend (existing behavior) - ClickHouse backend (via OTEL instrumentation) - Socket backend (writes to Unix socket for crouton sidecar) - Factory for instantiating backends from config Refactors TelemetryManager to use backends with support for: - Multi-backend writes (concurrent via asyncio.gather) - Primary backend for reads (first in config list) - Graceful error handling per backend Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND (comma-separated) Example: "postgres,socket" for dual-write to Postgres and crouton 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add protocol version to socket backend records Adds PROTOCOL_VERSION constant to socket backend: - Included in every telemetry record sent to crouton - Must match ProtocolVersion in apps/crouton/main.go - Enables crouton to detect and reject incompatible messages 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: remove organization_id from ProviderTraceCreate calls The organization_id is now handled via the actor parameter in the telemetry manager, not through ProviderTraceCreate schema. This fixes validation errors after changing ProviderTraceCreate to inherit from BaseProviderTrace which forbids extra fields. 
🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * consolidate provider trace * add clickhouse-connect to fix bug on main lmao * auto generated sdk changes, and deployment details, and clikchouse prefix bug and added fields to runs trace return api * auto generated sdk changes, and deployment details, and clikchouse prefix bug and added fields to runs trace return api * consolidate provider trace * consolidate provider trace bug fix --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:43 -08:00
Sarah Wooders	b888c4c17a	feat: allow for conversation-level isolation of blocks (#8684 ) * feat: add conversation_id parameter to context endpoint [LET-6989] Add optional conversation_id query parameter to retrieve_agent_context_window. When provided, the endpoint uses messages from the specific conversation instead of the agent's default message_ids. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate SDK after context endpoint update [LET-6989] 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add isolated blocks support for conversations Allows conversations to have their own copies of specific memory blocks (e.g., todo_list) that override agent defaults, enabling conversation-specific state isolation. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * undo * update apis * test * cleanup * fix tests * simplify * move override logic * patch --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:39 -08:00
github-actions[bot]	2460b36f97	fix: handle asyncpg QueryCanceledError for statement timeouts (#8241 ) The handle_db_timeout decorator only caught SQLAlchemy's TimeoutError (for pool/connection timeouts) but not asyncpg's QueryCanceledError which is thrown when PostgreSQL's statement_timeout kills a long-running query. This fix: - Import asyncpg.exceptions.QueryCanceledError - Update handle_db_timeout decorator to catch QueryCanceledError and wrap it in DatabaseTimeoutError - Update _handle_dbapi_error method to also handle wrapped QueryCanceledError Fixes #8108 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-19 15:54:38 -08:00
cthomas	ab4ccfca31	feat: add tags support to blocks (#8474 ) * feat: add tags support to blocks * fix: add timestamps and org scoping to blocks_tags Addresses PR feedback: 1. Migration: Added timestamps (created_at, updated_at), soft delete (is_deleted), audit fields (_created_by_id, _last_updated_by_id), and organization_id to blocks_tags table for filtering support. Follows SQLite baseline pattern (composite PK of block_id+tag, no separate id column) to avoid insert failures. 2. ORM: Relationship already correct with lazy="raise" to prevent implicit joins and passive_deletes=True for efficient CASCADE deletes. 3. Schema: Changed normalize_tags() from Any to dict for type safety. 4. SQLite: Added blocks_tags to SQLite baseline schema to prevent table-not-found errors. 5. Code: Updated all tag row inserts to include organization_id. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add ORM columns and update SQLite baseline for blocks_tags Fixes test failures (CompileError: Unconsumed column names: organization_id): 1. ORM: Added organization_id, timestamps, audit fields to BlocksTags ORM model to match database schema from migrations. 2. SQLite baseline: Added full column set to blocks_tags (organization_id, timestamps, audit fields) to match PostgreSQL schema. 3. Test: Added 'tags' to expected Block schema fields. This ensures SQLite and PostgreSQL have matching schemas and the ORM can consume all columns that the code inserts. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * revert change to existing alembic migration * fix: remove passive_deletes and SQLite support for blocks_tags 1. Removed passive_deletes=True from Block.tags relationship to match AgentsTags pattern (neither have ondelete CASCADE in DB schema). 2. Removed SQLite branch from _replace_block_pivot_rows_async since blocks_tags table is PostgreSQL-only (migration skips SQLite). 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * api sync --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:38 -08:00
cthomas	e964307f6a	feat: add lazy=raise for passage-org relationship (#8482 )	2026-01-12 10:57:49 -08:00
cthomas	cc975b5d15	fix: reuse db session when loading pending approval (#8363 )	2026-01-12 10:57:48 -08:00
cthomas	9b359418d0	feat: add pending approval field on agent state (#8361 ) * feat: add pending approval field on agent state * test failures	2026-01-12 10:57:48 -08:00
cthomas	f87c607115	fix: add encrypted placeholder constant (#8354 )	2026-01-12 10:57:48 -08:00
cthomas	a54513c343	feat: move decryption outside db session (#8323 ) * feat: move decryption outside db session * fix pydantic error	2026-01-12 10:57:48 -08:00
Sarah Wooders	87d920782f	feat: add conversation and conversation_messages tables for concurrent messaging (#8182 )	2026-01-12 10:57:48 -08:00
cthomas	3aaab90b4c	feat: use bounded concurrency for decryption (#8296 )	2026-01-12 10:57:48 -08:00
cthomas	3b0b2cbee1	fix: unbounded file to pydantic conversion (#8292 ) * fix: unbounded file to pydantic conversion * remove var name	2026-01-12 10:57:48 -08:00
cthomas	9be2ab2c3d	fix: skip additional query in stale data error handling (#8171 )	2026-01-12 10:57:48 -08:00
cthomas	e904330dde	fix: step metrics db timeouts [LET-6697] (#8136 ) fix: step metrics db timeouts	2026-01-12 10:57:47 -08:00
cthomas	d489d8837f	fix: broken pagination for objects with missing created_at [LET-6699] (#8139 ) fix: broken pagination for objects with missing created_at	2026-01-12 10:57:47 -08:00
cthomas	0dd1df306a	fix: concurrent block update rollback [LET-6695] (#8133 ) fix: concurrent block update rollback	2026-01-12 10:57:47 -08:00
cthomas	b6535b7590	fix: agent-runs relationship load causing timeouts [LET-6694] (#8129 ) fix: agent-runs relationship load causing timeouts	2026-01-12 10:57:47 -08:00
Sarah Wooders	acd8dd7bcf	feat: make embedding_config optional on agent creation (#7553 ) * feat: make embedding_config optional on agent creation - Remove requirement for embedding_config in agent creation - Add EmbeddingConfigRequiredError for operations that need embeddings - Add null checks in sleeptime agent creation, passage insert, archive creation - Register new error in app.py exception handlers 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: update API schemas for optional embedding_config 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-12 10:57:19 -08:00
Ari Webb	4d8d9757aa	feat: add request-id for steps [LET-6587] (#7349 ) * feat: add request-id for steps * order revisions correctly * stage publish api	2026-01-12 10:57:19 -08:00
Ari Webb	79c2319403	chore: add project constraint on tools db (#7360 )	2025-12-17 17:32:27 -08:00
jnjpng	25d75d6528	fix: update async get plaintext callsites (#7069 ) * base * resolve * fix * fix	2025-12-17 17:31:02 -08:00
jnjpng	00ba2d09f3	refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns (#6751 ) * refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns Complete migration to encrypted-only storage for sensitive fields: - Remove dual-write to plaintext columns (token, custom_headers, authorization_code, access_token, refresh_token, client_secret) - Read only from _enc columns, not from plaintext fallback - Remove helper methods (get_token_secret, set_token_secret, etc.) - Remove Secret.from_db() and Secret.to_dict() methods - Update tests to verify encrypted-only behavior After this change, plaintext columns can be set to NULL manually since they are no longer read from or written to. * fix test * rename * update * union * fix test	2025-12-17 17:31:02 -08:00
Sarah Wooders	812bfd16dd	Revert "feat: project_id uniqueness for tools" (#7007 ) Revert "feat: project_id uniqueness for tools (#6604)" This reverts commit 2c4b6397041e2c965493525fc52e056f10d1bdb6.	2025-12-15 12:03:09 -08:00
Kian Jones	bce1749408	fix: run PBKDF2 in thread pool to prevent event loop freeze (#6763 ) * fix: run PBKDF2 in thread pool to prevent event loop freeze Problem: Event loop freezes for 100-500ms during secret decryption, blocking all HTTP requests and async operations. The diagnostic monitor detected the main thread stuck in PBKDF2 HMAC SHA256 computation at: apps/core/letta/helpers/crypto_utils.py:51 (_derive_key) apps/core/letta/schemas/secret.py:161 (get_plaintext) Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for security, but running it synchronously on the main thread blocks the event loop. Stack trace showed: Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2 Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03) Solution: 1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop 2. Add async versions of encrypt/decrypt methods 3. Add LRU cache for derived keys (deterministic results) 4. Add async get_plaintext_async() method to Secret class Changes: - apps/core/letta/helpers/crypto_utils.py: - Added ThreadPoolExecutor for crypto operations - Added @lru_cache(maxsize=256) to _derive_key_cached() - Added _derive_key_async() using loop.run_in_executor() - Added encrypt_async() and decrypt_async() methods - Added warnings to sync methods about blocking behavior - apps/core/letta/schemas/secret.py: - Added get_plaintext_async() method - Added warnings to get_plaintext() about blocking behavior Benefits: - Event loop no longer freezes during secret decryption - HTTP requests continue processing while crypto runs in background - Derived keys are cached, reducing CPU usage for repeated operations - Backward compatible - sync methods still work for non-async code Performance impact: - Before: 100-500ms event loop block per decryption - After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms Next steps (follow-up PRs): - Migrate all async callsites to use 
get_plaintext_async() - Add metrics to track sync vs async usage - Consider reducing PBKDF2 iterations if security allows * update * test --------- Co-authored-by: Letta Bot <jinjpeng@gmail.com>	2025-12-15 12:03:09 -08:00
Ari Webb	c1aa01db6f	feat: project_id uniqueness for tools (#6604 ) * feat: project_id uniqueness for tools * prevent double upsert of global tools * use default project if no header for sdk * reorder unique constraint for performance * use separate session for check conflict * feature flag adding project id header in cloud api * add my migration after one on main * remove comment * stage and publish api * web set project id just for tools * includes instead of startswith	2025-12-15 12:03:09 -08:00
jnjpng	4be813b956	fix: migrate sandbox and agent environment variables to encrypted only (#6623 ) * base * remove unnecessary db migration * update * fix * update * update * comments * fix * revert * anotha --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:03:08 -08:00
Sarah Wooders	7ea297231a	feat: add `compaction_settings` to agents (#6625 ) * initial commit * Add database migration for compaction_settings field This migration adds the compaction_settings column to the agents table to support customized summarization configuration for each agent. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix * rename * update apis * fix tests * update web test --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <kian@letta.com>	2025-12-15 12:02:34 -08:00
Ari Webb	4092820f3a	feat: add project id scoping for tools backend changes (#6529 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	4f1fbe45aa	feat: add index and concurrency control for tools (fixed alembic) (#6552 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	c8c06168e2	Revert "feat: add index and concurrency control for tools " (#6551 ) Revert "feat: add index and concurrency control for tools (#6547)" This reverts commit f4abf8e061bc2f5e08853b5ce5775a7f8626463a.	2025-12-15 12:02:34 -08:00
Sarah Wooders	a2d3011d84	feat: add index and concurrency control for tools (#6547 )	2025-12-15 12:02:34 -08:00
Charles Packer	131891e05f	feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449 ) * feat: init refactor * feat: add helper code * fix: missing file + test * fix: just state/publish api	2025-12-15 12:02:19 -08:00
cthomas	6a47bf4946	feat: add fk reference for agent groups [LET-6260] (#6313 ) feat: add fk reference for agent groups	2025-11-24 19:10:27 -08:00
cthomas	6dd3f6fecd	feat: add back ref for mcp server on org (#6269 )	2025-11-24 19:10:11 -08:00
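
Note on 4eb27e23b3 (deadlock retry): the commit message describes automatic retry with exponential backoff for PostgreSQL deadlock errors (40P01) in ORM write methods. The sketch below illustrates that general pattern only; it is not the repository's code. `DeadlockError`, `retry_on_deadlock`, and `FlakyWrite` are hypothetical names, the retry count and delays are arbitrary, and the random jitter is an addition the commit message does not mention.

```python
import asyncio
import random


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40P01."""

    pgcode = "40P01"


async def retry_on_deadlock(operation, max_retries=3, base_delay=0.05):
    """Await `operation()`; on deadlock, back off exponentially and retry.

    After `max_retries` retries the last DeadlockError is re-raised so the
    caller can map it to an error response.
    """
    for attempt in range(max_retries + 1):
        try:
            return await operation()
        except DeadlockError:
            if attempt == max_retries:
                raise  # retries exhausted
            delay = base_delay * (2 ** attempt)  # 1x, 2x, 4x, ...
            # Jitter (an assumption, not from the commit) spreads out retries
            # from writers that deadlocked against each other.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))


class FlakyWrite:
    """Hypothetical write that deadlocks a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise DeadlockError("deadlock detected")
        return "committed"
```

A caller would wrap the write, for example `await retry_on_deadlock(FlakyWrite(2))`. A real implementation would also need to roll back the session before each retry, which the commit message handles by snapshotting column values in update_async.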

560 Commits
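
Note on bce1749408 (PBKDF2 in thread pool): the commit message describes moving key derivation off the event loop with a ThreadPoolExecutor and loop.run_in_executor(), plus an @lru_cache(maxsize=256) on the derived key. The sketch below shows that pattern under stated assumptions: it uses the standard library's hashlib.pbkdf2_hmac rather than whatever primitive the repository uses, and the function names, pool size, and key length are hypothetical.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Dedicated pool for CPU-heavy crypto work; the size is an arbitrary choice.
_executor = ThreadPoolExecutor(max_workers=2)


@lru_cache(maxsize=256)
def derive_key_cached(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256.

    The result is deterministic for a given (password, salt, iterations),
    so repeated calls can be served from the cache.
    """
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


async def derive_key_async(password: bytes, salt: bytes) -> bytes:
    """Run the derivation in the thread pool so the event loop stays responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, derive_key_cached, password, salt)
```

Async callers would use `key = await derive_key_async(password, salt)`. The derivation still costs the same CPU time on a cache miss; it only stops blocking other coroutines while it runs. Caching keeps derived keys in process memory, which is a trade-off to weigh.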