letta-server

Author	SHA1	Message	Date
Sarah Wooders	3fdf2b6c79	chore: deprecate old agent messaging (#9120 )	2026-02-24 10:52:06 -08:00
Sarah Wooders	4096b30cd7	feat: log LLM traces to clickhouse (#9111 ) * feat: add non-streaming option for conversation messages - Add ConversationMessageRequest with stream=True default (backwards compatible) - stream=true (default): SSE streaming via StreamingService - stream=false: JSON response via AgentLoop.load().step() 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate API schema for ConversationMessageRequest * feat: add direct ClickHouse storage for raw LLM traces Adds ability to store raw LLM request/response payloads directly in ClickHouse, bypassing OTEL span attribute size limits. This enables debugging and analytics on large LLM payloads (>10MB system prompts, large tool schemas, etc.). New files: - letta/schemas/llm_raw_trace.py: Pydantic schema with ClickHouse row helper - letta/services/llm_raw_trace_writer.py: Async batching writer (fire-and-forget) - letta/services/llm_raw_trace_reader.py: Reader with query methods - scripts/sql/clickhouse/llm_raw_traces.ddl: Production table DDL - scripts/sql/clickhouse/llm_raw_traces_local.ddl: Local dev DDL - apps/core/clickhouse-init.sql: Local dev initialization Modified: - letta/settings.py: Added 4 settings (store_llm_raw_traces, ttl, batch_size, flush_interval) - letta/llm_api/llm_client_base.py: Integration into request_async_with_telemetry - compose.yaml: Added ClickHouse service for local dev - justfile: Added clickhouse, clickhouse-cli, clickhouse-traces commands Feature disabled by default (LETTA_STORE_LLM_RAW_TRACES=false). Uses ZSTD(3) compression for 10-30x reduction on JSON payloads. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: address code review feedback for LLM raw traces Fixes based on code review feedback: 1. Fix ClickHouse endpoint parsing - default to secure=False for raw host:port inputs (was defaulting to HTTPS which breaks local dev) 2. 
Make raw trace writes truly fire-and-forget - use asyncio.create_task() instead of awaiting, so JSON serialization doesn't block request path 3. Add bounded queue (maxsize=10000) - prevents unbounded memory growth under load. Drops traces with warning if queue is full. 4. Fix deprecated asyncio usage - get_running_loop() instead of get_event_loop() 5. Add org_id fallback - use _telemetry_org_id if actor doesn't have it 6. Remove unused imports - json import in reader 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add missing asyncio import and simplify JSON serialization - Add missing 'import asyncio' that was causing 'name asyncio is not defined' error - Remove unnecessary clean_double_escapes() function - the JSON is stored correctly, the clickhouse-client CLI was just adding extra escaping when displaying - Update just clickhouse-trace to use Python client for correct JSON output 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test: add clickhouse raw trace integration test * test: simplify clickhouse trace assertions * refactor: centralize usage parsing and stream error traces Use per-client usage helpers for raw trace extraction and ensure streaming errors log requests with error metadata. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test: exercise provider usage parsing live Make live OpenAI/Anthropic/Gemini requests with credential gating and validate Anthropic cache usage mapping when present. 
👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test: fix usage parsing tests to pass - Use GoogleAIClient with GEMINI_API_KEY instead of GoogleVertexClient - Update model to gemini-2.0-flash (1.5-flash deprecated in v1beta) - Add tools=[] for Gemini/Anthropic build_request_data 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: extract_usage_statistics returns LettaUsageStatistics Standardize on LettaUsageStatistics as the canonical usage format returned by client helpers. Inline UsageStatistics construction for ChatCompletionResponse where needed. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add is_byok and llm_config_json columns to ClickHouse traces Extend llm_raw_traces table with: - is_byok (UInt8): Track BYOK vs base provider usage for billing analytics - llm_config_json (String, ZSTD): Store full LLM config for debugging and analysis This enables queries like: - BYOK usage breakdown by provider/model - Config parameter analysis (temperature, max_tokens, etc.) 
- Debugging specific request configurations * feat: add tests for error traces, llm_config_json, and cache tokens - Update llm_raw_trace_reader.py to query new columns (is_byok, cached_input_tokens, cache_write_tokens, reasoning_tokens, llm_config_json) - Add test_error_trace_stored_in_clickhouse to verify error fields - Add test_cache_tokens_stored_for_anthropic to verify cache token storage - Update existing tests to verify llm_config_json is stored correctly - Make llm_config required in log_provider_trace_async() - Simplify provider extraction to use provider_name directly 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * ci: add ClickHouse integration tests to CI pipeline - Add use-clickhouse option to reusable-test-workflow.yml - Add ClickHouse service container with otel database - Add schema initialization step using clickhouse-init.sql - Add ClickHouse env vars (CLICKHOUSE_ENDPOINT, etc.) - Add separate clickhouse-integration-tests job running integration_test_clickhouse_llm_raw_traces.py 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: simplify provider and org_id extraction in raw trace writer - Use model_endpoint_type.value for provider (not provider_name) - Simplify org_id to just self.actor.organization_id (actor is always pydantic) 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: simplify LLMRawTraceWriter with _enabled flag - Check ClickHouse env vars once at init, set _enabled flag - Early return in write_async/flush_async if not enabled - Remove ValueError raises (never used) - Simplify _get_client (no validation needed since already checked) 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add LLMRawTraceWriter shutdown to FastAPI lifespan Properly flush pending traces on graceful shutdown via lifespan instead of relying only on atexit handler. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add agent_tags column to ClickHouse traces Store agent tags as Array(String) for filtering/analytics by tag. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * cleanup * fix(ci): fix ClickHouse schema initialization in CI - Create database separately before loading SQL file - Remove CREATE DATABASE from SQL file (handled in CI step) - Add verification step to confirm table was created - Use -sf flag for curl to fail on HTTP errors 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: simplify LLM trace writer with ClickHouse async_insert - Use ClickHouse async_insert for server-side batching instead of manual queue/flush loop - Sync cloud DDL schema with clickhouse-init.sql (add missing columns) - Remove redundant llm_raw_traces_local.ddl - Remove unused batch_size/flush_interval settings - Update tests for simplified writer Key changes: - async_insert=1, wait_for_async_insert=1 for reliable server-side batching - Simple per-trace retry with exponential backoff (max 3 retries) - ~150 lines removed from writer 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: consolidate ClickHouse direct writes into TelemetryManager backend - Add clickhouse_direct backend to provider_trace_backends - Remove duplicate ClickHouse write logic from llm_client_base.py - Configure via LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse_direct The clickhouse_direct backend: - Converts ProviderTrace to LLMRawTrace - Extracts usage stats from response JSON - Writes via LLMRawTraceWriter with async_insert 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: address PR review comments and fix llm_config bug Review comment fixes: - Rename clickhouse_direct -> clickhouse_analytics (clearer purpose) - 
Remove ClickHouse from OSS compose.yaml, create separate compose.clickhouse.yaml - Delete redundant scripts/test_llm_raw_traces.py (use pytest tests) - Remove unused llm_raw_traces_ttl_days setting (TTL handled in DDL) - Fix socket description leak in telemetry_manager docstring - Add cloud-only comment to clickhouse-init.sql - Update justfile to use separate compose file Bug fix: - Fix llm_config not being passed to ProviderTrace in telemetry - Now correctly populates provider, model, is_byok for all LLM calls - Affects both request_async_with_telemetry and log_provider_trace_async DDL optimizations: - Add secondary indexes (bloom_filter for agent_id, model, step_id) - Add minmax indexes for is_byok, is_error - Change model and error_type to LowCardinality for faster GROUP BY 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: rename llm_raw_traces -> llm_traces Address review feedback that "raw" is misleading since we denormalize fields. Renames: - Table: llm_raw_traces -> llm_traces - Schema: LLMRawTrace -> LLMTrace - Files: llm_raw_trace_{reader,writer}.py -> llm_trace_{reader,writer}.py - Setting: store_llm_raw_traces -> store_llm_traces 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: update workflow references to llm_traces Missed renaming table name in CI workflow files. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: update clickhouse_direct -> clickhouse_analytics in docstring 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: remove inaccurate OTEL size limit comments The 4MB limit is our own truncation logic, not an OTEL protocol limit. The real benefit is denormalized columns for analytics queries. 
🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: remove local ClickHouse dev setup (cloud-only feature) - Delete clickhouse-init.sql and compose.clickhouse.yaml - Remove local clickhouse just commands - Update CI to use cloud DDL with MergeTree for testing clickhouse_analytics is a cloud-only feature. For local dev, use postgres backend. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: restore compose.yaml to match main 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: merge clickhouse_analytics into clickhouse backend Per review feedback - having two separate backends was confusing. Now the clickhouse backend: - Writes to llm_traces table (denormalized for cost analytics) - Reads from OTEL traces table (will cut over to llm_traces later) Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: correct path to DDL file in CI workflow 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: add provider index to DDL for faster filtering 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: configure telemetry backend in clickhouse tests Tests need to set telemetry_settings.provider_trace_backends to include 'clickhouse', otherwise traces are routed to default postgres backend. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: set provider_trace_backend field, not property provider_trace_backends is a computed property, need to set the underlying provider_trace_backend string field instead. 
🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: error trace test and error_type extraction - Add TelemetryManager to error trace test so traces get written - Fix error_type extraction to check top-level before nested error dict 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: use provider_trace.id for trace correlation across backends - Pass provider_trace.id to LLMTrace instead of auto-generating - Log warning if ID is missing (shouldn't happen, helps debug) - Fallback to new UUID only if not set 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: trace ID correlation and concurrency issues - Strip "provider_trace-" prefix from ID for UUID storage in ClickHouse - Add asyncio.Lock to serialize writes (clickhouse_connect not thread-safe) - Fix Anthropic prompt_tokens to include cached tokens for cost analytics - Log warning if provider_trace.id is missing 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Caren Thomas <carenthomas@gmail.com>	2026-02-24 10:52:06 -08:00
jnjpng	24ea7dbaed	feat: include tools as part of token estimate in compact (#9242 ) * base * fix	2026-02-24 10:52:06 -08:00
Ari Webb	0bbb9c9bc0	feat: add reasoning zai openrouter (#9189 ) * feat: add reasoning zai openrouter * add openrouter reasoning * stage + publish api * openrouter reasoning always on * revert * fix * remove reference * do	2026-02-24 10:52:06 -08:00
jnjpng	d28ccc0be6	feat: add summary message and event on compaction (#9144 ) * base * update * update * revert formatting * routes * legacy * fix * review * update	2026-02-24 10:52:05 -08:00
Kian Jones	b0f8e16ac0	fix: check both unique constraints in provider model sync (#9193 ) The sync_provider_models_async function was only checking for existing models by (handle, organization_id, model_type) before creating, but the database has a second unique constraint on (name, provider_id, model_type). This caused UniqueConstraintViolationError when a model with the same name/provider already existed under a different handle. 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:05 -08:00
Kevin Lin	3a13c63f60	fix: preserve slashes in memory block paths (#9172 ) Remove .replace("/", "_") from memory block label processing to allow hierarchical memory organization with slashes. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
Kian Jones	c1a02fa180	feat: add metadata-only provider trace storage option (#9155 ) * feat: add metadata-only provider trace storage option Add support for writing provider traces to a lightweight metadata-only table (~1.5GB) instead of the full table (~725GB) since request/response JSON is now stored in GCS. - Add `LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY` setting - Create `provider_trace_metadata` table via alembic migration - Conditionally write to new table when flag is enabled - Include backfill script for migrating existing data 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate API spec and SDK * fix: use composite PK (created_at, id) for provider_trace_metadata Aligns with GCS partitioning structure (raw/date=YYYY-MM-DD/{id}.json.gz) and enables efficient date-range queries via the B-tree index. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * ammendments * fix: add bulk data copy to migration Copy existing provider_traces metadata in-migration instead of separate backfill script. Creates indexes after bulk insert for better performance. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: remove data copy from migration, create empty table only Old data stays in provider_traces, new writes go to provider_trace_metadata when flag is enabled. Full traces are in GCS anyway. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: address PR comments - Remove GCS mention from ProviderTraceMetadata docstring - Move metadata object creation outside session context 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: reads always use full provider_traces table The metadata_only flag should only control writes. Reads always go to the full table to avoid returning ProviderTraceMetadata where ProviderTrace is expected. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: enable metadata-only provider trace writes in prod Add LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY=true to all Helm values (memgpt-server and lettuce-py, prod and dev). 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
amysguan	69cad47e6a	fix: respect enable_reasoner setting from .af imports instead of falling back to model defaults (#9163 ) Co-authored-by: Amy Guan <amy@letta.com>	2026-01-29 12:44:04 -08:00
Ari Webb	a798cc90c4	fix: openrouter provider (#9166 ) * fix: openrouter provider * just stage publish api * web openapi	2026-01-29 12:44:04 -08:00
Ari Webb	9ce1249738	feat: openrouter byok (#9148 ) * feat: openrouter byok * new client is unnecessary * revert json diffs	2026-01-29 12:44:04 -08:00
Shubham Naik	bb2145c24c	connections (#9113 ) * chore: release code * chore: release code * chore: release code * chore: release code * chore: release code * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: change paths * chore: remote * chore: support multi project chat	2026-01-29 12:44:04 -08:00
Kian Jones	42b1e741dc	fix: prevent duplicate block attachment in sleeptime agents (#9150 ) Check if a block with the same label already exists before attaching to sleeptime agents. This prevents UniqueConstraintViolationError on the (agent_id, block_label) constraint when the same block is attached multiple times due to race conditions. 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
Ari Webb	04e6d668ec	fix: make it so sync updates model_endpoint info (#9138 )	2026-01-29 12:44:04 -08:00
Charles Packer	0c016d3ee3	fix(core): correct cursor direction for descending pagination in list_agent_blocks_async (#9122 ) The cursor-based pagination was not accounting for sort order. When using descending order (the default), "after cursor X" should return items with id < X (items that come after X in the descending result set), but the code was using id > X which caused infinite loops in clients iterating through pages. This fix adjusts the cursor comparison based on the sort order: - ascending: after=id > X, before=id < X - descending: after=id < X, before=id > X Note: Other pagination methods (list_agent_sources_async, list_agent_tools_async, list_agent_groups_async) may have the same issue and should be audited. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
cthomas	674bfe95ec	fix: no default actor setting check (#9124 ) * fix: no default actor setting check * prevent default actor creation in fallback	2026-01-29 12:44:04 -08:00
Kian Jones	0099a95a43	fix(sec): first pass of ensuring actor id is required everywhere (#9126 ) first pass of ensuring actor id is required	2026-01-29 12:44:04 -08:00
Sarah Wooders	b34ad43691	feat: add minimax byok to ui (#9101 ) * fix: patch minimax * feat: add frontend changes for minimax * add logo, fix backend * better check for is minimax * more references fixed for minimax * start reverting unnecessary changes * revert backend changes, just ui * fix minimax fully * fix test * add key to deploy action --------- Co-authored-by: Ari Webb <ari@letta.com> Co-authored-by: Ari Webb <arijwebb@gmail.com>	2026-01-29 12:44:04 -08:00
Shelley Pham	5dc70e48eb	Shelley/let 7218 editor should be compatible with typescript [LET-7218] (#9087 ) * fix python icon not showing up * make typescript compatible for updating tools in typescript * Update flags.ts * display tools properly in navigation * add default json schema to newly created tools * add typescript to code editor * make editor typescript compatible * Update ToolsEditor.tsx * typescript compatible editor * sandbox stuff * update breadcrumb icon * pass in source type to tool simulator * undo * Update tool-editor.cy.ts	2026-01-29 12:44:04 -08:00
github-actions[bot]	1d1bb29a43	feat: add override_model support for agent file import (#9058 )	2026-01-29 12:44:04 -08:00
cthomas	4c2253dc76	fix: use repr() fallback for empty exception messages in error logging (#9047 ) Problem: Error logs showed empty detail fields when exceptions had no message: ``` Error during step processing: Run run-xxx stopped with unknown error: , error_data: {...'detail': ''} ``` This made debugging production issues difficult as the actual error type was hidden. Root Cause: Python exceptions created with no arguments (e.g., `Exception()` or caught and re-raised in certain ways) have `str(e) == ""`: ```python e = Exception() str(e) # Returns "" repr(e) # Returns "Exception()" ``` When exceptions with empty string representations were caught, all logging and error messages showed blank details. Fix: Use `str(e) or repr(e)` fallback pattern in 3 places: 1. `letta_agent_v3.py` stream() exception handler (line 406) 2. `letta_agent_v3.py` step() exception handler (line 928) 3. `streaming_service.py` generic exception handler (line 469) Result: - Error logs now show `Exception()` or similar instead of empty string - Helps identify exception types even when message is missing - Better production debugging without changing exception handling logic 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
cthomas	ca40eff7bc	fix: ensure stop_reason is always set when marking runs as failed (#9045 ) Problem: Production error showed runs being marked as failed with stop_reason=None, which violates LettaStopReason's Pydantic schema (requires valid enum value). This caused cascading validation errors that got stored in metadata. Example error: ``` Run is already in a terminal state failed with stop reason None, but is being updated with data {'status': 'failed', 'stop_reason': None, 'metadata': {'error': "1 validation error for LettaStopReason\nstop_reason Input should be 'end_turn', 'error', ... [type=enum, input_value=None]"}} ``` Root Causes: 1. routers/v1/agents.py had 3 exception handlers creating RunUpdate(status=failed) without stop_reason 2. Success path assumed result.stop_reason always exists (AttributeError if None) 3. run_manager.py tried to create LettaStopReason(stop_reason=None) when refreshing result messages Fixes: 1. Added stop_reason=StopReasonType.error to 3 exception handlers 2. Added defensive None checks before accessing result.stop_reason.stop_reason 3. Added fallback to StopReasonType.error when pydantic_run.stop_reason is None Trigger: OpenAI BadRequestError for invalid tool schema → exception handlers marked run as failed without stop_reason → validation error when constructing response 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
Ari Webb	e5afbd0972	fix: base url wrong (#9040 )	2026-01-29 12:44:04 -08:00
Sarah Wooders	25e9539a6e	feat: add batch passage create and optional search `query` (#8866 )	2026-01-29 12:44:04 -08:00
cthomas	c162de5127	fix: use shared event + .athrow() to properly set stream_was_cancelle… (#9019 ) fix: use shared event + .athrow() to properly set stream_was_cancelled flag Problem: When a run is cancelled via /cancel endpoint, `stream_was_cancelled` remained False because `RunCancelledException` was raised in the consumer code (wrapper), which closes the generator from outside. This causes Python to skip the generator's except blocks and jump directly to finally with the wrong flag value. Solution: 1. Shared `asyncio.Event` registry for cross-layer cancellation signaling 2. `cancellation_aware_stream_wrapper` sets the event when cancellation detected 3. Wrapper uses `.athrow()` to inject exception INTO generator (not consumer-side raise) 4. All streaming interfaces check event in `finally` block to set flag correctly 5. `streaming_service.py` handles `RunCancelledException` gracefully, yields [DONE] Changes: - streaming_response.py: Event registry + .athrow() injection + graceful handling - openai_streaming_interface.py: 3 classes check event in finally - gemini_streaming_interface.py: Check event in finally - anthropic_.py: Catch RunCancelledException - simple_llm_stream_adapter.py: Create & pass event to interfaces - streaming_service.py: Handle RunCancelledException, yield [DONE], skip double-update - routers/v1/{conversations,runs}.py: Pass event to wrapper - integration_test_human_in_the_loop.py: New test for approval + cancellation Tests:* - test_tool_call with cancellation (OpenAI models) ✅ - test_approve_with_cancellation (approval flow + concurrent cancel) ✅ Known cosmetic warnings (pre-existing): - "Run already in terminal state" - agent loop tries to update after /cancel - "Stream ended without terminal event" - background streaming timing race 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:44:04 -08:00
Kian Jones	e3fb00f970	feat(crouton): add orgId, userId, Compaction_Settings and LLM_Config (#9022 ) * LC one shot? * api changes * fix summarizer nameerror	2026-01-29 12:44:04 -08:00
Ari Webb	5c06918042	fix: don't need embedding model for self hosted [LET-7009] (#8935 ) * fix: don't need embedding model for self hosted * stage publish api * passes tests * add test * remove unnecessary upgrades * update revision order db migrations * add timeout for ci	2026-01-29 12:44:04 -08:00
Kian Jones	2bb4caffc3	fix: remove unused embedding generation (#9013 ) * remove unused embedding generation * prevent double embed * fix embedding dimension comparison and valueerror	2026-01-29 12:43:53 -08:00
Ari Webb	5c1512237f	fix: restore deleted provider session conflicts (#9001 )	2026-01-29 12:43:53 -08:00
Kian Jones	4d256b3399	feat: add agent_id, run_id, step_id to summarization provider traces (#8996 ) * feat: add agent_id, run_id, step_id to summarization provider traces Summarization LLM calls were missing telemetry context (agent_id, agent_tags, run_id, step_id), making it impossible to attribute summarization costs to specific agents or trace them back to the step that triggered compaction. Changes: - Add step_id param to simple_summary() and set_telemetry_context() - Add agent_id, agent_tags, run_id, step_id to summarize_all() and summarize_via_sliding_window() - Update Summarizer class to accept and pass telemetry context - Update LettaAgentV3.compact() to pass full telemetry context - Update LettaAgentV2.summarize_conversation_history() with run_id/step_id - Update LettaAgent (v1) streaming methods with run_id param - Add run_id/step_id to SummarizeParams for Temporal activities 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: update test mock to accept new summarization params 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:43:53 -08:00
Kian Jones	1ab21af725	fix: safer type coercion for tools (#8990 ) * mvp * perform type coercion in sandbox * fix: safely resolve typing annotations on host Use an AST whitelist for generic annotations to avoid eval while keeping list/dict coercion working. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-29 12:43:53 -08:00
Ari Webb	2e826577d9	fix: fix zai and others byok (#8991 ) * fix: fix zai and other byok providers * fix test * get endpoint from typed provider and add test * also add base_url on provider create	2026-01-29 12:43:53 -08:00
Kian Jones	7133083b81	fix: agent_tags for provider traces (#8989 ) * add include tags * include agent_tags and pass them into the adapter	2026-01-29 12:43:53 -08:00
Charles Packer	2fc592e0b6	feat(core): add image support in tool returns [LET-7140] (#8985 ) * feat(core): add image support in tool returns [LET-7140] Enable tool_return to support both string and ImageContent content parts, matching the pattern used for user message inputs. This allows tools executed client-side to return images back to the agent. Changes: - Add LettaToolReturnContentUnion type for text/image content parts - Update ToolReturn schema to accept Union[str, List[content parts]] - Update converters for each provider: - OpenAI Chat Completions: placeholder text for images - OpenAI Responses API: full image support - Anthropic: full image support with base64 - Google: placeholder text for images - Add resolve_tool_return_images() for URL-to-base64 conversion - Make create_approval_response_message_from_input() async 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): support images in Google tool returns as sibling parts Following the gemini-cli pattern: images in tool returns are sent as sibling inlineData parts alongside the functionResponse, rather than inside it. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test(core): add integration tests for multi-modal tool returns [LET-7140] Tests verify that: - Models with image support (Anthropic, OpenAI Responses API) can see images in tool returns and identify the secret text - Models without image support (Chat Completions) get placeholder text and cannot see the actual image content - Tool returns with images persist correctly in the database Uses secret.png test image containing hidden text "FIREBRAWL" that models must identify to pass the test. Also fixes misleading comment about Anthropic only supporting base64 images - they support URLs too, we just pre-resolve for consistency. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: simplify tool return image support implementation Reduce code verbosity while maintaining all functionality: - Extract _resolve_url_to_base64() helper in message_helper.py (eliminates duplication) - Add _get_text_from_part() helper for text extraction - Add _get_base64_image_data() helper for image data extraction - Add _tool_return_to_google_parts() to simplify Google implementation - Add _image_dict_to_data_url() for OpenAI Responses format - Use walrus operator and list comprehensions where appropriate - Add integration_test_multi_modal_tool_returns.py to CI workflow Net change: -120 lines while preserving all features and test coverage. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(tests): improve prompt for multi-modal tool return tests Make prompts more direct to reduce LLM flakiness: - Simplify tool description: "Retrieves a secret image with hidden text. Call this function to get the image." - Change user prompt from verbose request to direct command: "Call the get_secret_image function now." - Apply to both test methods This reduces ambiguity and makes tool calling more reliable across different LLM models. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix bugs * test(core): add google_ai/gemini-2.0-flash-exp to multi-modal tests Add Gemini model to test coverage for multi-modal tool returns. Google AI already supports images in tool returns via sibling inlineData parts. 
👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(ui): handle multi-modal tool_return type in frontend components Convert Union<string, LettaToolReturnContentUnion[]> to string for display: - ViewRunDetails: Convert array to '[Image here]' placeholder - ToolCallMessageComponent: Convert array to '[Image here]' placeholder Fixes TypeScript errors in web, desktop-ui, and docker-ui type-checks. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Caren Thomas <carenthomas@gmail.com>	2026-01-29 12:43:53 -08:00
Ari Webb	4ec6649caf	feat: byok provider models in db also (#8317 ) * feat: byok provider models in db also * make tests and sync api * fix inconsistent state with recreating provider of same name * fix sync on byok creation * update revision * move stripe code for testing purposes * revert * add refresh byok models endpoint * just stage publish api * add tests * reorder revision * add test for name clashes	2026-01-29 12:43:53 -08:00
Kevin Lin	b5519f02fb	feat: make tool return messages more explicit [LET-7145] (#8986 ) prompt	2026-01-29 12:43:53 -08:00
Charles Packer	238894eebd	fix(core): disable MCP stdio servers by default (#8969 ) * fix(core): disable MCP stdio servers by default Stdio MCP servers spawn local processes on the host, which is not suitable for multi-tenant or shared server deployments. This change: - Changes `mcp_disable_stdio` default from False to True - Enforces the setting in `get_mcp_client()` and `create_mcp_server_from_config()` - Users running local/single-user deployments can set MCP_DISABLE_STDIO=false to enable stdio-based MCP servers (e.g., for npx/uvx tools) 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * update ci * push --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: jnjpng <jin@letta.com> Co-authored-by: Letta Bot <jinjpeng@gmail.com>	2026-01-29 12:43:53 -08:00
Ari Webb	5645ca8107	fix: use labels for error messages for builtin memory tools [LET-7095] (#8941 ) * fix: use labels for error messages for builtin memory tools * catch specific error	2026-01-29 12:43:53 -08:00
cthomas	5f58819bbf	fix: false positives for token markers (#8942 )	2026-01-29 12:43:23 -08:00
Ari Webb	7a104a74fe	fix: allow conversation_search time only queries [LET-7094] (#8939 ) fix: allow conversation_search time only queries	2026-01-19 15:54:44 -08:00
cthomas	b3a213a56f	revert: fix: include assistant response in sliding window summarizati… (#8929 ) revert: fix: include assistant response in sliding window summarization slice	2026-01-19 15:54:44 -08:00
cthomas	4be366470b	fix: summarizer cutoff bug (#8928 ) * fix: summarizer cutoff bug * update comment with context	2026-01-19 15:54:44 -08:00
github-actions[bot]	90f3ab9184	fix: validate URL scheme in fetch_webpage to reject file:// URLs (#8889 ) Adds validation to the fetch_webpage tool to ensure only HTTP/HTTPS URLs are accepted. Previously, passing a file:// URL would cause an unhandled requests.exceptions.InvalidSchema error. Now it raises a clear ValueError with a helpful error message. Fixes: requests.exceptions.InvalidSchema: No connection adapters were found for 'file://...' 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-19 15:54:44 -08:00
Kian Jones	2ee28c3264	feat: add telemetry source identifier (#8918 ) * add telemetry source * add source to provider trace	2026-01-19 15:54:44 -08:00
github-actions[bot]	15de765876	fix: downgrade MCP tool error logs from warning to debug (#8846 ) Change the log level for expected MCP tool execution failures (ToolError, McpError) from warning to debug in fastmcp_client.py to match base_client.py. These errors occur when an LLM calls an MCP tool with missing/invalid arguments - they are expected user-facing issues from external MCP servers, not system errors that should trigger production alerts. Fixes #8845 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-19 15:54:43 -08:00
github-actions[bot]	940d648d42	fix(mcp): handle MCP tool errors gracefully to prevent Datadog alerts (#8687 ) MCP tool errors (ToolError, McpError) are expected user-facing errors from external MCP servers (e.g., "No connected account found"). These were propagating through @trace_method decorator and being recorded as errors in Datadog APM. Changes: - Add try/except to catch expected MCP errors in ExternalMCPToolExecutor - Return ToolExecutionResult with status="error" instead of re-raising - Log expected errors at INFO level instead of letting them trace as ERROR - Remove stray 'pass' statement that was a no-op Fixes #8685 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-19 15:54:43 -08:00
Kian Jones	81b5d71889	feat: add agents and log error properly (#8914 ) * add agents and log error properly * fix llm stream adapter	2026-01-19 15:54:43 -08:00
Kian Jones	b0dfdd2725	fix commas in justfile helm secret setting and bug with missing metadata (#8874 )	2026-01-19 15:54:43 -08:00
Kian Jones	a92e868ee6	feat: centralize telemetry logging at LLM client level (#8815 ) * feat: centralize telemetry logging at LLM client level Moves telemetry logging from individual adapters to LLMClientBase: - Add TelemetryStreamWrapper for streaming telemetry on stream close - Add request_async_with_telemetry() for non-streaming requests - Add stream_async_with_telemetry() for streaming requests - Add set_telemetry_context() to configure agent_id, run_id, step_id Updates adapters and agents to use new pattern: - LettaLLMAdapter now accepts agent_id/run_id in constructor - Adapters call set_telemetry_context() before LLM requests - Removes duplicate telemetry logging from adapters - Enriches traces with agent_id, run_id, call_type metadata 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: accumulate streaming response content for telemetry TelemetryStreamWrapper now extracts actual response data from chunks: - Content text (concatenated from deltas) - Tool calls (id, name, arguments) - Model name, finish reason, usage stats 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: move streaming telemetry to caller (option 3) - Remove TelemetryStreamWrapper class - Add log_provider_trace_async() helper to LLMClientBase - stream_async_with_telemetry() now just returns raw stream - Callers log telemetry after processing with rich interface data Updated callers: - summarizer.py: logs content + usage after stream processing - letta_agent.py: logs tool_call, reasoning, model, usage 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: pass agent_id and run_id to parent adapter class LettaLLMStreamAdapter was not passing agent_id/run_id to parent, causing "unexpected keyword argument" errors. 
🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:43 -08:00
Kian Jones	9418ab9815	feat: add provider trace backend abstraction for multi-backend telemetry (#8814 ) * feat: add provider trace backend abstraction for multi-backend telemetry Introduces a pluggable backend system for provider traces: - Base class with async/sync create and read interfaces - PostgreSQL backend (existing behavior) - ClickHouse backend (via OTEL instrumentation) - Socket backend (writes to Unix socket for crouton sidecar) - Factory for instantiating backends from config Refactors TelemetryManager to use backends with support for: - Multi-backend writes (concurrent via asyncio.gather) - Primary backend for reads (first in config list) - Graceful error handling per backend Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND (comma-separated) Example: "postgres,socket" for dual-write to Postgres and crouton 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add protocol version to socket backend records Adds PROTOCOL_VERSION constant to socket backend: - Included in every telemetry record sent to crouton - Must match ProtocolVersion in apps/crouton/main.go - Enables crouton to detect and reject incompatible messages 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: remove organization_id from ProviderTraceCreate calls The organization_id is now handled via the actor parameter in the telemetry manager, not through ProviderTraceCreate schema. This fixes validation errors after changing ProviderTraceCreate to inherit from BaseProviderTrace which forbids extra fields. 
🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * consolidate provider trace * add clickhouse-connect to fix bug on main lmao * auto generated sdk changes, and deployment details, and clickhouse prefix bug and added fields to runs trace return api * auto generated sdk changes, and deployment details, and clickhouse prefix bug and added fields to runs trace return api * consolidate provider trace * consolidate provider trace bug fix --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:43 -08:00
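
Note on 9418ab9815 (#8814): the commit message describes concurrent multi-backend writes via asyncio.gather with per-backend error handling, configured through a comma-separated backend list. The sketch below illustrates that general pattern only. The names MemoryBackend, parse_backends, and write_to_all are hypothetical and are not taken from the Letta codebase.

```python
import asyncio
from typing import Any


class MemoryBackend:
    """Hypothetical stand-in for a trace backend (not Letta's real class)."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail
        self.records: list[dict[str, Any]] = []

    async def create(self, trace: dict[str, Any]) -> None:
        if self.fail:
            raise RuntimeError(f"{self.name} backend unavailable")
        self.records.append(trace)


def parse_backends(raw: str) -> list[str]:
    """Split a comma-separated config value such as "postgres,socket"."""
    return [part.strip() for part in raw.split(",") if part.strip()]


async def write_to_all(
    backends: list[MemoryBackend], trace: dict[str, Any]
) -> dict[str, Exception]:
    """Write one trace to every backend concurrently.

    return_exceptions=True keeps one failing backend from cancelling
    the others; failures are returned keyed by backend name.
    """
    results = await asyncio.gather(
        *(backend.create(trace) for backend in backends),
        return_exceptions=True,
    )
    return {
        backend.name: result
        for backend, result in zip(backends, results)
        if isinstance(result, Exception)
    }
```

With this shape, a caller can log the returned failures per backend while the remaining writes still complete, which is one way to read "graceful error handling per backend".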

1773 Commits
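
Note on 90f3ab9184 (#8889): the commit message says fetch_webpage now accepts only HTTP/HTTPS URLs and raises a ValueError instead of letting requests fail with InvalidSchema on file:// input. A minimal sketch of that kind of check, using only the standard library, might look like the following. The function name validate_http_url and the error wording are assumptions, not the actual implementation.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def validate_http_url(url: str) -> str:
    """Return url unchanged if it is an HTTP(S) URL, else raise ValueError.

    Hypothetical helper: the real fetch_webpage check may differ.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(
            f"Unsupported URL scheme {parsed.scheme!r}: only http and https are allowed"
        )
    if not parsed.netloc:
        raise ValueError("URL is missing a host")
    return url
```

Validating before the request is issued turns an unhandled library exception into a clear error that can be shown to the caller.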