Commit Graph

544 Commits

Author SHA1 Message Date
Kian Jones
9418ab9815 feat: add provider trace backend abstraction for multi-backend telemetry (#8814)
* feat: add provider trace backend abstraction for multi-backend telemetry

Introduces a pluggable backend system for provider traces:
- Base class with async/sync create and read interfaces
- PostgreSQL backend (existing behavior)
- ClickHouse backend (via OTEL instrumentation)
- Socket backend (writes to Unix socket for crouton sidecar)
- Factory for instantiating backends from config

Refactors TelemetryManager to use backends with support for:
- Multi-backend writes (concurrent via asyncio.gather)
- Primary backend for reads (first in config list)
- Graceful error handling per backend

Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND (comma-separated)
Example: "postgres,socket" for dual-write to Postgres and crouton
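
The multi-backend behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `TraceBackend`, `InMemoryBackend`, `TelemetrySketch`, and `parse_backends` are hypothetical names, and the in-memory backend stands in for the Postgres, ClickHouse, and socket backends.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class TraceBackend(ABC):
    """Hypothetical base interface; the real base class may differ."""

    @abstractmethod
    async def create_async(self, trace: dict) -> None: ...

    @abstractmethod
    async def read_async(self, trace_id: str) -> Optional[dict]: ...


class InMemoryBackend(TraceBackend):
    """Stand-in backend used only for this illustration."""

    def __init__(self, fail: bool = False) -> None:
        self.rows: Dict[str, dict] = {}
        self.fail = fail

    async def create_async(self, trace: dict) -> None:
        if self.fail:
            raise ConnectionError("backend unavailable")
        self.rows[trace["id"]] = trace

    async def read_async(self, trace_id: str) -> Optional[dict]:
        return self.rows.get(trace_id)


def parse_backends(value: str) -> List[str]:
    """Split a comma-separated config value such as "postgres,socket"."""
    return [name.strip() for name in value.split(",") if name.strip()]


class TelemetrySketch:
    def __init__(self, backends: List[TraceBackend]) -> None:
        self.backends = backends

    async def create(self, trace: dict) -> List[Exception]:
        # Write to every backend concurrently; return_exceptions=True keeps
        # one failing backend from cancelling the other writes.
        results = await asyncio.gather(
            *(backend.create_async(trace) for backend in self.backends),
            return_exceptions=True,
        )
        return [r for r in results if isinstance(r, Exception)]

    async def read(self, trace_id: str) -> Optional[dict]:
        # Reads go to the primary backend, i.e. the first one configured.
        return await self.backends[0].read_async(trace_id)
```

Returning the collected exceptions instead of raising is one way to get per-backend error handling; a real implementation would more likely log them.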

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add protocol version to socket backend records

Adds PROTOCOL_VERSION constant to socket backend:
- Included in every telemetry record sent to crouton
- Must match ProtocolVersion in apps/crouton/main.go
- Enables crouton to detect and reject incompatible messages
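
A rough sketch of the version check, shown in Python on both sides for brevity (the real receiver is the Go sidecar). The field name `protocol_version`, the value `1`, and the newline-delimited JSON framing are assumptions made for illustration only.

```python
import json

PROTOCOL_VERSION = 1  # assumed value; the real constant must match crouton


def encode_record(payload: dict) -> bytes:
    """Attach the protocol version to every outgoing record."""
    record = {"protocol_version": PROTOCOL_VERSION, **payload}
    return (json.dumps(record) + "\n").encode("utf-8")


def accept_record(line: bytes, expected_version: int = PROTOCOL_VERSION) -> dict:
    """Receiver-side check: reject records written with another version."""
    record = json.loads(line)
    found = record.get("protocol_version")
    if found != expected_version:
        raise ValueError(
            f"incompatible protocol version: got {found}, expected {expected_version}"
        )
    return record
```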

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: remove organization_id from ProviderTraceCreate calls

The organization_id is now handled via the actor parameter in the
telemetry manager, not through ProviderTraceCreate schema. This fixes
validation errors after changing ProviderTraceCreate to inherit from
BaseProviderTrace which forbids extra fields.

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* consolidate provider trace

* add clickhouse-connect to fix bug on main lmao

* auto generated sdk changes, and deployment details, and clickhouse prefix bug and added fields to runs trace return api

* auto generated sdk changes, and deployment details, and clickhouse prefix bug and added fields to runs trace return api

* consolidate provider trace

* consolidate provider trace bug fix

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:43 -08:00
Sarah Wooders
b888c4c17a feat: allow for conversation-level isolation of blocks (#8684)
* feat: add conversation_id parameter to context endpoint [LET-6989]

Add optional conversation_id query parameter to retrieve_agent_context_window.
When provided, the endpoint uses messages from the specific conversation
instead of the agent's default message_ids.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate SDK after context endpoint update [LET-6989]

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add isolated blocks support for conversations

Allows conversations to have their own copies of specific memory blocks (e.g., todo_list) that override agent defaults, enabling conversation-specific state isolation.
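
The override rule can be pictured as a simple merge. This sketch assumes blocks are keyed by label and represented as plain strings; `resolve_blocks` is a hypothetical helper, not a function from this change.

```python
from typing import Dict


def resolve_blocks(
    agent_blocks: Dict[str, str],
    conversation_blocks: Dict[str, str],
) -> Dict[str, str]:
    """Return the blocks visible to one conversation.

    Conversation-level copies replace the agent default with the same
    label; every other block is shared with the agent.
    """
    resolved = dict(agent_blocks)          # start from the agent defaults
    resolved.update(conversation_blocks)   # isolated copies take precedence
    return resolved


agent_defaults = {"persona": "helpful assistant", "todo_list": "(shared list)"}
conversation_overrides = {"todo_list": "(list for this conversation only)"}
effective = resolve_blocks(agent_defaults, conversation_overrides)
```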

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* undo

* update apis

* test

* cleanup

* fix tests

* simplify

* move override logic

* patch

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:39 -08:00
github-actions[bot]
2460b36f97 fix: handle asyncpg QueryCanceledError for statement timeouts (#8241)
The handle_db_timeout decorator only caught SQLAlchemy's TimeoutError
(for pool/connection timeouts) but not asyncpg's QueryCanceledError
which is thrown when PostgreSQL's statement_timeout kills a long-running
query.

This fix:
- Import asyncpg.exceptions.QueryCanceledError
- Update handle_db_timeout decorator to catch QueryCanceledError and wrap
  it in DatabaseTimeoutError
- Update _handle_dbapi_error method to also handle wrapped QueryCanceledError
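
A minimal sketch of the decorator change, assuming stand-in exception classes so the example runs without asyncpg or SQLAlchemy installed. In the real code the caught types would be `asyncpg.exceptions.QueryCanceledError` and SQLAlchemy's `TimeoutError`; the body of the actual `handle_db_timeout` may differ.

```python
import asyncio
import functools


class QueryCanceledError(Exception):
    """Stand-in for asyncpg.exceptions.QueryCanceledError."""


class PoolTimeoutError(Exception):
    """Stand-in for sqlalchemy.exc.TimeoutError (pool/connection timeouts)."""


class DatabaseTimeoutError(Exception):
    """Application-level error raised for any database timeout."""


def handle_db_timeout(func):
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except (PoolTimeoutError, QueryCanceledError) as exc:
            # Both timeout flavours surface as one error type; the original
            # exception is preserved as __cause__ for debugging.
            raise DatabaseTimeoutError(f"{func.__name__} timed out") from exc

    return wrapper


@handle_db_timeout
async def slow_query() -> None:
    # Simulates PostgreSQL cancelling a statement via statement_timeout.
    raise QueryCanceledError("canceling statement due to statement timeout")


@handle_db_timeout
async def fast_query() -> int:
    return 1
```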

Fixes #8108

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
2026-01-19 15:54:38 -08:00
cthomas
ab4ccfca31 feat: add tags support to blocks (#8474)
* feat: add tags support to blocks

* fix: add timestamps and org scoping to blocks_tags

Addresses PR feedback:

1. Migration: Added timestamps (created_at, updated_at), soft delete
   (is_deleted), audit fields (_created_by_id, _last_updated_by_id),
   and organization_id to blocks_tags table for filtering support.
   Follows SQLite baseline pattern (composite PK of block_id+tag, no
   separate id column) to avoid insert failures.

2. ORM: Relationship already correct with lazy="raise" to prevent
   implicit joins and passive_deletes=True for efficient CASCADE deletes.

3. Schema: Changed normalize_tags() from Any to dict for type safety.

4. SQLite: Added blocks_tags to SQLite baseline schema to prevent
   table-not-found errors.

5. Code: Updated all tag row inserts to include organization_id.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add ORM columns and update SQLite baseline for blocks_tags

Fixes test failures (CompileError: Unconsumed column names: organization_id):

1. ORM: Added organization_id, timestamps, audit fields to BlocksTags
   ORM model to match database schema from migrations.

2. SQLite baseline: Added full column set to blocks_tags (organization_id,
   timestamps, audit fields) to match PostgreSQL schema.

3. Test: Added 'tags' to expected Block schema fields.

This ensures SQLite and PostgreSQL have matching schemas and the ORM
can consume all columns that the code inserts.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* revert change to existing alembic migration

* fix: remove passive_deletes and SQLite support for blocks_tags

1. Removed passive_deletes=True from Block.tags relationship to match
   AgentsTags pattern (neither have ondelete CASCADE in DB schema).

2. Removed SQLite branch from _replace_block_pivot_rows_async since
   blocks_tags table is PostgreSQL-only (migration skips SQLite).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* api sync

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:38 -08:00
cthomas
e964307f6a feat: add lazy=raise for passage-org relationship (#8482) 2026-01-12 10:57:49 -08:00
cthomas
cc975b5d15 fix: reuse db session when loading pending approval (#8363) 2026-01-12 10:57:48 -08:00
cthomas
9b359418d0 feat: add pending approval field on agent state (#8361)
* feat: add pending approval field on agent state

* test failures
2026-01-12 10:57:48 -08:00
cthomas
f87c607115 fix: add encrypted placeholder constant (#8354) 2026-01-12 10:57:48 -08:00
cthomas
a54513c343 feat: move decryption outside db session (#8323)
* feat: move decryption outside db session

* fix pydantic error
2026-01-12 10:57:48 -08:00
Sarah Wooders
87d920782f feat: add conversation and conversation_messages tables for concurrent messaging (#8182) 2026-01-12 10:57:48 -08:00
cthomas
3aaab90b4c feat: use bounded concurrency for decryption (#8296) 2026-01-12 10:57:48 -08:00
cthomas
3b0b2cbee1 fix: unbounded file to pydantic conversion (#8292)
* fix: unbounded file to pydantic conversion

* remove var name
2026-01-12 10:57:48 -08:00
cthomas
9be2ab2c3d fix: skip additional query in stale data error handling (#8171) 2026-01-12 10:57:48 -08:00
cthomas
e904330dde fix: step metrics db timeouts [LET-6697] (#8136)
fix: step metrics db timeouts
2026-01-12 10:57:47 -08:00
cthomas
d489d8837f fix: broken pagination for objects with missing created_at [LET-6699] (#8139)
fix: broken pagination for objects with missing created_at
2026-01-12 10:57:47 -08:00
cthomas
0dd1df306a fix: concurrent block update rollback [LET-6695] (#8133)
fix: concurrent block update rollback
2026-01-12 10:57:47 -08:00
cthomas
b6535b7590 fix: agent-runs relationship load causing timeouts [LET-6694] (#8129)
fix: agent-runs relationship load causing timeouts
2026-01-12 10:57:47 -08:00
Sarah Wooders
acd8dd7bcf feat: make embedding_config optional on agent creation (#7553)
* feat: make embedding_config optional on agent creation

- Remove requirement for embedding_config in agent creation
- Add EmbeddingConfigRequiredError for operations that need embeddings
- Add null checks in sleeptime agent creation, passage insert, archive creation
- Register new error in app.py exception handlers

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: update API schemas for optional embedding_config

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-12 10:57:19 -08:00
Ari Webb
4d8d9757aa feat: add request-id for steps [LET-6587] (#7349)
* feat: add request-id for steps

* order revisions correctly

* stage publish api
2026-01-12 10:57:19 -08:00
Ari Webb
79c2319403 chore: add project constraint on tools db (#7360) 2025-12-17 17:32:27 -08:00
jnjpng
25d75d6528 fix: update async get plaintext callsites (#7069)
* base

* resolve

* fix

* fix
2025-12-17 17:31:02 -08:00
jnjpng
00ba2d09f3 refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns (#6751)
* refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns

Complete migration to encrypted-only storage for sensitive fields:

- Remove dual-write to plaintext columns (token, custom_headers,
  authorization_code, access_token, refresh_token, client_secret)
- Read only from _enc columns, not from plaintext fallback
- Remove helper methods (get_token_secret, set_token_secret, etc.)
- Remove Secret.from_db() and Secret.to_dict() methods
- Update tests to verify encrypted-only behavior

After this change, plaintext columns can be set to NULL manually
since they are no longer read from or written to.

* fix test

* rename

* update

* union

* fix test
2025-12-17 17:31:02 -08:00
Sarah Wooders
812bfd16dd Revert "feat: project_id uniqueness for tools" (#7007)
Revert "feat: project_id uniqueness for tools (#6604)"

This reverts commit 2c4b6397041e2c965493525fc52e056f10d1bdb6.
2025-12-15 12:03:09 -08:00
Kian Jones
bce1749408 fix: run PBKDF2 in thread pool to prevent event loop freeze (#6763)
* fix: run PBKDF2 in thread pool to prevent event loop freeze

Problem: Event loop freezes for 100-500ms during secret decryption, blocking
all HTTP requests and async operations. The diagnostic monitor detected the
main thread stuck in PBKDF2 HMAC SHA256 computation at:
  apps/core/letta/helpers/crypto_utils.py:51 (_derive_key)
  apps/core/letta/schemas/secret.py:161 (get_plaintext)

Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for
security, but running it synchronously on the main thread blocks the event loop.

Stack trace showed:
  Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2
  Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03)

Solution:
1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop
2. Add async versions of encrypt/decrypt methods
3. Add LRU cache for derived keys (deterministic results)
4. Add async get_plaintext_async() method to Secret class

Changes:
- apps/core/letta/helpers/crypto_utils.py:
  - Added ThreadPoolExecutor for crypto operations
  - Added @lru_cache(maxsize=256) to _derive_key_cached()
  - Added _derive_key_async() using loop.run_in_executor()
  - Added encrypt_async() and decrypt_async() methods
  - Added warnings to sync methods about blocking behavior

- apps/core/letta/schemas/secret.py:
  - Added get_plaintext_async() method
  - Added warnings to get_plaintext() about blocking behavior

Benefits:
- Event loop no longer freezes during secret decryption
- HTTP requests continue processing while crypto runs in background
- Derived keys are cached, reducing CPU usage for repeated operations
- Backward compatible - sync methods still work for non-async code

Performance impact:
- Before: 100-500ms event loop block per decryption
- After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms

Next steps (follow-up PRs):
- Migrate all async callsites to use get_plaintext_async()
- Add metrics to track sync vs async usage
- Consider reducing PBKDF2 iterations if security allows

* update

* test

---------

Co-authored-by: Letta Bot <jinjpeng@gmail.com>
2025-12-15 12:03:09 -08:00
Ari Webb
c1aa01db6f feat: project_id uniqueness for tools (#6604)
* feat: project_id uniqueness for tools

* prevent double upsert of global tools

* use default project if no header for sdk

* reorder unique constraint for performance

* use separate session for check conflict

* feature flag adding project id header in cloud api

* add my migration after one on main

* remove comment

* stage and publish api

* web set project id just for tools

* includes instead of startswith
2025-12-15 12:03:09 -08:00
jnjpng
4be813b956 fix: migrate sandbox and agent environment variables to encrypted only (#6623)
* base

* remove unnecessary db migration

* update

* fix

* update

* update

* comments

* fix

* revert

* anotha

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:03:08 -08:00
Sarah Wooders
7ea297231a feat: add compaction_settings to agents (#6625)
* initial commit

* Add database migration for compaction_settings field

This migration adds the compaction_settings column to the agents table
to support customized summarization configuration for each agent.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix

* rename

* update apis

* fix tests

* update web test

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <kian@letta.com>
2025-12-15 12:02:34 -08:00
Ari Webb
4092820f3a feat: add project id scoping for tools backend changes (#6529) 2025-12-15 12:02:34 -08:00
Sarah Wooders
4f1fbe45aa feat: add index and concurrency control for tools (fixed alembic) (#6552) 2025-12-15 12:02:34 -08:00
Sarah Wooders
c8c06168e2 Revert "feat: add index and concurrency control for tools " (#6551)
Revert "feat: add index and concurrency control for tools  (#6547)"

This reverts commit f4abf8e061bc2f5e08853b5ce5775a7f8626463a.
2025-12-15 12:02:34 -08:00
Sarah Wooders
a2d3011d84 feat: add index and concurrency control for tools (#6547) 2025-12-15 12:02:34 -08:00
Charles Packer
131891e05f feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449)
* feat: init refactor

* feat: add helper code

* fix: missing file + test

* fix: just stage/publish api
2025-12-15 12:02:19 -08:00
cthomas
6a47bf4946 feat: add fk reference for agent groups [LET-6260] (#6313)
feat: add fk reference for agent groups
2025-11-24 19:10:27 -08:00
cthomas
6dd3f6fecd feat: add back ref for mcp server on org (#6269) 2025-11-24 19:10:11 -08:00
cthomas
369274a219 fix: greenlet spawn errors in deletion path (#6267) 2025-11-24 19:10:11 -08:00
Kian Jones
78600fac23 chore: index suggestions from planetscale suggestions (#6111)
suggestions from ps dashboard
2025-11-13 15:36:56 -08:00
Sarah Wooders
982501f6fa feat: create model table to persist available models (#5835)
---------

Co-authored-by: Ari Webb <arijwebb@gmail.com>
Co-authored-by: Ari Webb <ari@letta.com>
2025-11-13 15:36:56 -08:00
Sarah Wooders
6eeb3c90bb feat: bring back model_settings and remove validation again (#6104) 2025-11-13 15:36:56 -08:00
Sarah Wooders
ddc87418f4 feat: revert model_settings (#6089) 2025-11-13 15:36:56 -08:00
Sarah Wooders
0b1fe096ec feat: split up handle and model_settings (#6022) 2025-11-13 15:36:56 -08:00
jnjpng
849d0dc64a feat: provider-specific model configuration (#5873) (#5874) 2025-11-13 15:36:55 -08:00
Christina Tong
ef3df907c5 feat: add last_stop_reason to AgentState [LET-5911] (#5772)
* feat: add last_stop_reason to AgentState [LET-5911]

* undo agent loop changes, use update_run_by_id_async

* add run manager test

* add integration tests

* remove comment

* remove duplicate test
2025-11-13 15:36:55 -08:00
Sarah Wooders
cfeed463a9 Revert "feat: provider-specific model configuration " (#5873)
Revert "feat: provider-specific model configuration  (#5774)"

This reverts commit 34a334949a3ef72cd49ff0ca3da9e85d16daa57c.
2025-11-13 15:36:20 -08:00
Sarah Wooders
aaa12a393c feat: provider-specific model configuration (#5774)
* initial code updates

* add models

* cleanup

* support overriding

* add apis

* cleanup reasoning interfaces to match models

* update schemas

* update apis

* add new field

* remove parallel

* various fixes

* modify schemas

* fix

* fix

* make model optional

* undo model schema change

* update schemas

* update schemas

* format

* fix tests

* attempt to patch web

* fix docs

* change schemas

* update error

* fix tests

* delete tests

* clean up undefined matching conditional

---------

Co-authored-by: jnjpng <jin@letta.com>
Co-authored-by: Letta Bot <noreply@letta.com>
2025-11-13 15:36:14 -08:00
Sarah Wooders
6654473514 fix: handle block race conditions (#5819) 2025-11-13 15:36:14 -08:00
Sarah Wooders
a566900533 chore: add back test_server.py (#5783) 2025-11-13 15:36:00 -08:00
jnjpng
d09f321af7 fix: before and after pagination comparison based off order [LET-5833] (#5712)
base

Co-authored-by: Letta Bot <noreply@letta.com>
2025-10-24 15:14:21 -07:00
Matthew Zhou
e7e86124f9 feat: Support embedding config on the archive [LET-5832] (#5714)
* Add embedding config field to archives

* Fix alembic script

* Simplify archive manager

* Fern autogen

* Fix failing tests

* Fix alembic
2025-10-24 15:14:21 -07:00
Ari Webb
c7c0d7507c feat: add new mcp_servers routes [LET-4321] (#5675)
---------

Co-authored-by: Ari Webb <ari@letta.com>
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2025-10-24 15:14:21 -07:00
cthomas
5e43291436 feat: remove relationship fields on identities (#5691) 2025-10-24 15:14:09 -07:00