Initialize `result` and `run_status` variables before the try block
to prevent UnboundLocalError in the finally block when exceptions
occur early in execution (e.g., during AgentLoop.load()).
Previously, if an exception was raised before `result` was assigned
inside the try block, the finally block would fail when trying to
access `result`, masking the original error with an UnboundLocalError.
This fix ensures the finally block can safely check and use these
variables regardless of when/where an exception occurs.
fix: check for actual error content, not just "error" key presence
OpenAI Responses API returns {"error": null} on success, which
incorrectly triggered is_error=True. Now we check if error_data
is truthy rather than just checking key existence.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Both letta_llm_stream_adapter and simple_llm_stream_adapter were
creating ProviderTrace without call_type, causing "unknown" in
ClickHouse analytics.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* refactor: extract compact logic to shared function
Extract the compaction logic from LettaAgentV3.compact() into a
standalone compact_messages() function that can be shared between
the agent and temporal workflows.
Changes:
- Create apps/core/letta/services/summarizer/compact.py with:
- compact_messages(): Core compaction logic
- build_summarizer_llm_config(): LLM config builder for summarization
- CompactResult: Dataclass for compaction results
- Update LettaAgentV3.compact() to use compact_messages()
- Update temporal summarize_conversation_history activity to use
compact_messages() instead of the old Summarizer class
- Add use_summary_role parameter to SummarizeParams
This ensures consistent summarization behavior across different
execution paths and prevents drift as we improve the implementation.
* chore: clean up verbose comments
* fix: correct CompactionSettings import path
* fix: correct count_tokens import from summarizer_sliding_window
* fix: update test patch path for count_tokens_with_tools
After extracting compact logic to compact.py, the test was patching
the old location. Update the patch path to the new module location.
* fix: update test to use build_summarizer_llm_config from compact.py
The function was moved from LettaAgentV3._build_summarizer_llm_config
to compact.py as a standalone function.
* fix: add early check for system prompt size in compact_messages
Check if the system prompt alone exceeds the context window before
attempting summarization. The system prompt cannot be compacted,
so fail fast with SystemPromptTokenExceededError.
* fix: properly propagate SystemPromptTokenExceededError from compact
The exception handler in _step() was not setting the correct stop_reason
for SystemPromptTokenExceededError, which caused the finally block to
return early and swallow the exception.
Add special handling to set stop_reason to context_window_overflow_in_system_prompt
when SystemPromptTokenExceededError is caught.
* revert: remove redundant SystemPromptTokenExceededError handling
The special handling in the outer exception handler is redundant because
stop_reason is already set in the inner handler at line 943. The actual
fix for the test was the early check in compact_messages(), not this
redundant handling.
* fix: correctly re-raise SystemPromptTokenExceededError
The inner exception handler was using 'raise e' which re-raised the outer
ContextWindowExceededError instead of the current SystemPromptTokenExceededError.
Changed to 'raise' to correctly re-raise the current exception. This bug
was pre-existing but masked because _check_for_system_prompt_overflow was
only called as a fallback. The new early check in compact_messages() exposed it.
* revert: remove early check and restore raise e to match main behavior
* fix: set should_continue=False and correctly re-raise exception
- Add should_continue=False in SystemPromptTokenExceededError handler (matching main's _check_for_system_prompt_overflow behavior)
- Fix raise e -> raise to correctly propagate SystemPromptTokenExceededError
Note: test_large_system_prompt_summarization still fails locally but passes on main.
Need to investigate why exception isn't propagating correctly on refactored branch.
* fix: add SystemPromptTokenExceededError handler for post-step compaction
The post-step compaction (line 1066) was missing a SystemPromptTokenExceededError
exception handler. When compact_messages() raised this error, it would be caught
by the outer exception handler which would:
1. Set stop_reason to "error" instead of "context_window_overflow_in_system_prompt"
2. Not set should_continue = False
3. Get swallowed by the finally block (line 1126) which returns early
This caused test_large_system_prompt_summarization to fail because the exception
never propagated to the test.
The fix adds the same exception handler pattern used in the retry compaction flow
(line 941-946), ensuring proper state is set before re-raising.
This issue only affected the refactored code because on main, _check_for_system_prompt_overflow()
was an instance method that set should_continue/stop_reason BEFORE raising. In the refactor,
compact_messages() is a standalone function that cannot set instance state, so the caller
must handle the exception and set the state.
* feat: add ID format validation to agent and user schemas
Reuse existing validator types (ToolId, SourceId, BlockId, MessageId,
IdentityId, UserId) from letta.validators to enforce ID format validation
at the schema level. This ensures malformed IDs are rejected with a 422
validation error instead of causing 500 database errors.
Changes:
- CreateAgent: validate tool_ids, source_ids, folder_ids, block_ids, identity_ids
- UpdateAgent: validate tool_ids, source_ids, folder_ids, block_ids, message_ids, identity_ids
- UserUpdate: validate id
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
* fix: override ID validation in AgentSchema for agent file portability
AgentSchema extends CreateAgent but needs to allow arbitrary short IDs
(e.g., tool-0, block-0) for portable agent files. Override the validated
ID fields to use plain List[str] instead of the validated types.
Also fix test_agent.af to use proper UUID-format IDs.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: revert test_agent.af - short IDs are valid for agent files
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix openapi schema
---------
Co-authored-by: Letta <noreply@letta.com>
Log error traces to ClickHouse when streaming requests fail,
matching the behavior in letta_llm_stream_adapter.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* feat: add non-streaming option for conversation messages
- Add ConversationMessageRequest with stream=True default (backwards compatible)
- stream=true (default): SSE streaming via StreamingService
- stream=false: JSON response via AgentLoop.load().step()
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API schema for ConversationMessageRequest
* feat: add direct ClickHouse storage for raw LLM traces
Adds ability to store raw LLM request/response payloads directly in ClickHouse,
bypassing OTEL span attribute size limits. This enables debugging and analytics
on large LLM payloads (>10MB system prompts, large tool schemas, etc.).
New files:
- letta/schemas/llm_raw_trace.py: Pydantic schema with ClickHouse row helper
- letta/services/llm_raw_trace_writer.py: Async batching writer (fire-and-forget)
- letta/services/llm_raw_trace_reader.py: Reader with query methods
- scripts/sql/clickhouse/llm_raw_traces.ddl: Production table DDL
- scripts/sql/clickhouse/llm_raw_traces_local.ddl: Local dev DDL
- apps/core/clickhouse-init.sql: Local dev initialization
Modified:
- letta/settings.py: Added 4 settings (store_llm_raw_traces, ttl, batch_size, flush_interval)
- letta/llm_api/llm_client_base.py: Integration into request_async_with_telemetry
- compose.yaml: Added ClickHouse service for local dev
- justfile: Added clickhouse, clickhouse-cli, clickhouse-traces commands
Feature disabled by default (LETTA_STORE_LLM_RAW_TRACES=false).
Uses ZSTD(3) compression for 10-30x reduction on JSON payloads.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: address code review feedback for LLM raw traces
Fixes based on code review feedback:
1. Fix ClickHouse endpoint parsing - default to secure=False for raw host:port
inputs (was defaulting to HTTPS which breaks local dev)
2. Make raw trace writes truly fire-and-forget - use asyncio.create_task()
instead of awaiting, so JSON serialization doesn't block request path
3. Add bounded queue (maxsize=10000) - prevents unbounded memory growth
under load. Drops traces with warning if queue is full.
4. Fix deprecated asyncio usage - get_running_loop() instead of get_event_loop()
5. Add org_id fallback - use _telemetry_org_id if actor doesn't have it
6. Remove unused imports - json import in reader
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add missing asyncio import and simplify JSON serialization
- Add missing 'import asyncio' that was causing 'name asyncio is not defined' error
- Remove unnecessary clean_double_escapes() function - the JSON is stored correctly,
the clickhouse-client CLI was just adding extra escaping when displaying
- Update just clickhouse-trace to use Python client for correct JSON output
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* test: add clickhouse raw trace integration test
* test: simplify clickhouse trace assertions
* refactor: centralize usage parsing and stream error traces
Use per-client usage helpers for raw trace extraction and ensure streaming errors log requests with error metadata.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* test: exercise provider usage parsing live
Make live OpenAI/Anthropic/Gemini requests with credential gating and validate Anthropic cache usage mapping when present.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* test: fix usage parsing tests to pass
- Use GoogleAIClient with GEMINI_API_KEY instead of GoogleVertexClient
- Update model to gemini-2.0-flash (1.5-flash deprecated in v1beta)
- Add tools=[] for Gemini/Anthropic build_request_data
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: extract_usage_statistics returns LettaUsageStatistics
Standardize on LettaUsageStatistics as the canonical usage format returned by client helpers. Inline UsageStatistics construction for ChatCompletionResponse where needed.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat: add is_byok and llm_config_json columns to ClickHouse traces
Extend llm_raw_traces table with:
- is_byok (UInt8): Track BYOK vs base provider usage for billing analytics
- llm_config_json (String, ZSTD): Store full LLM config for debugging and analysis
This enables queries like:
- BYOK usage breakdown by provider/model
- Config parameter analysis (temperature, max_tokens, etc.)
- Debugging specific request configurations
* feat: add tests for error traces, llm_config_json, and cache tokens
- Update llm_raw_trace_reader.py to query new columns (is_byok,
cached_input_tokens, cache_write_tokens, reasoning_tokens, llm_config_json)
- Add test_error_trace_stored_in_clickhouse to verify error fields
- Add test_cache_tokens_stored_for_anthropic to verify cache token storage
- Update existing tests to verify llm_config_json is stored correctly
- Make llm_config required in log_provider_trace_async()
- Simplify provider extraction to use provider_name directly
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* ci: add ClickHouse integration tests to CI pipeline
- Add use-clickhouse option to reusable-test-workflow.yml
- Add ClickHouse service container with otel database
- Add schema initialization step using clickhouse-init.sql
- Add ClickHouse env vars (CLICKHOUSE_ENDPOINT, etc.)
- Add separate clickhouse-integration-tests job running
integration_test_clickhouse_llm_raw_traces.py
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: simplify provider and org_id extraction in raw trace writer
- Use model_endpoint_type.value for provider (not provider_name)
- Simplify org_id to just self.actor.organization_id (actor is always pydantic)
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: simplify LLMRawTraceWriter with _enabled flag
- Check ClickHouse env vars once at init, set _enabled flag
- Early return in write_async/flush_async if not enabled
- Remove ValueError raises (never used)
- Simplify _get_client (no validation needed since already checked)
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add LLMRawTraceWriter shutdown to FastAPI lifespan
Properly flush pending traces on graceful shutdown via lifespan
instead of relying only on atexit handler.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat: add agent_tags column to ClickHouse traces
Store agent tags as Array(String) for filtering/analytics by tag.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* cleanup
* fix(ci): fix ClickHouse schema initialization in CI
- Create database separately before loading SQL file
- Remove CREATE DATABASE from SQL file (handled in CI step)
- Add verification step to confirm table was created
- Use -sf flag for curl to fail on HTTP errors
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: simplify LLM trace writer with ClickHouse async_insert
- Use ClickHouse async_insert for server-side batching instead of manual queue/flush loop
- Sync cloud DDL schema with clickhouse-init.sql (add missing columns)
- Remove redundant llm_raw_traces_local.ddl
- Remove unused batch_size/flush_interval settings
- Update tests for simplified writer
Key changes:
- async_insert=1, wait_for_async_insert=1 for reliable server-side batching
- Simple per-trace retry with exponential backoff (max 3 retries)
- ~150 lines removed from writer
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: consolidate ClickHouse direct writes into TelemetryManager backend
- Add clickhouse_direct backend to provider_trace_backends
- Remove duplicate ClickHouse write logic from llm_client_base.py
- Configure via LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse_direct
The clickhouse_direct backend:
- Converts ProviderTrace to LLMRawTrace
- Extracts usage stats from response JSON
- Writes via LLMRawTraceWriter with async_insert
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: address PR review comments and fix llm_config bug
Review comment fixes:
- Rename clickhouse_direct -> clickhouse_analytics (clearer purpose)
- Remove ClickHouse from OSS compose.yaml, create separate compose.clickhouse.yaml
- Delete redundant scripts/test_llm_raw_traces.py (use pytest tests)
- Remove unused llm_raw_traces_ttl_days setting (TTL handled in DDL)
- Fix socket description leak in telemetry_manager docstring
- Add cloud-only comment to clickhouse-init.sql
- Update justfile to use separate compose file
Bug fix:
- Fix llm_config not being passed to ProviderTrace in telemetry
- Now correctly populates provider, model, is_byok for all LLM calls
- Affects both request_async_with_telemetry and log_provider_trace_async
DDL optimizations:
- Add secondary indexes (bloom_filter for agent_id, model, step_id)
- Add minmax indexes for is_byok, is_error
- Change model and error_type to LowCardinality for faster GROUP BY
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: rename llm_raw_traces -> llm_traces
Address review feedback that "raw" is misleading since we denormalize fields.
Renames:
- Table: llm_raw_traces -> llm_traces
- Schema: LLMRawTrace -> LLMTrace
- Files: llm_raw_trace_{reader,writer}.py -> llm_trace_{reader,writer}.py
- Setting: store_llm_raw_traces -> store_llm_traces
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: update workflow references to llm_traces
Missed renaming table name in CI workflow files.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: update clickhouse_direct -> clickhouse_analytics in docstring
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: remove inaccurate OTEL size limit comments
The 4MB limit is our own truncation logic, not an OTEL protocol limit.
The real benefit is denormalized columns for analytics queries.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: remove local ClickHouse dev setup (cloud-only feature)
- Delete clickhouse-init.sql and compose.clickhouse.yaml
- Remove local clickhouse just commands
- Update CI to use cloud DDL with MergeTree for testing
clickhouse_analytics is a cloud-only feature. For local dev, use postgres backend.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: restore compose.yaml to match main
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: merge clickhouse_analytics into clickhouse backend
Per review feedback - having two separate backends was confusing.
Now the clickhouse backend:
- Writes to llm_traces table (denormalized for cost analytics)
- Reads from OTEL traces table (will cut over to llm_traces later)
Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: correct path to DDL file in CI workflow
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: add provider index to DDL for faster filtering
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: configure telemetry backend in clickhouse tests
Tests need to set telemetry_settings.provider_trace_backends to include
'clickhouse', otherwise traces are routed to default postgres backend.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: set provider_trace_backend field, not property
provider_trace_backends is a computed property, need to set the
underlying provider_trace_backend string field instead.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: error trace test and error_type extraction
- Add TelemetryManager to error trace test so traces get written
- Fix error_type extraction to check top-level before nested error dict
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: use provider_trace.id for trace correlation across backends
- Pass provider_trace.id to LLMTrace instead of auto-generating
- Log warning if ID is missing (shouldn't happen, helps debug)
- Fallback to new UUID only if not set
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: trace ID correlation and concurrency issues
- Strip "provider_trace-" prefix from ID for UUID storage in ClickHouse
- Add asyncio.Lock to serialize writes (clickhouse_connect not thread-safe)
- Fix Anthropic prompt_tokens to include cached tokens for cost analytics
- Log warning if provider_trace.id is missing
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
fix: handle oversized text in embedding requests with recursive chunking
When message text exceeds the embedding model's context length, recursively
split it until all chunks can be embedded successfully.
Changes:
- `tpuf_client.py`: Add `_split_text_in_half()` helper for recursive splitting
- `tpuf_client.py`: Add `_generate_embeddings_with_chunking()` that retries
with splits on context length errors
- `tpuf_client.py`: Store `message_id` and `chunk_index` columns in Turbopuffer
- `tpuf_client.py`: Deduplicate query results by `message_id`
- `tpuf_client.py`: Use `LettaInvalidArgumentError` instead of `ValueError`
- `tpuf_client.py`: Move LLMClient import to top of file
- `openai_client.py`: Remove fixed truncation (chunking handles this now)
- Add tests for `_split_text_in_half` and chunked query deduplication
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Queries Postgres for statement_timeout on connection checkout and adds
it as db.statement_timeout attribute on cursor execution spans.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
The LettaAgentV3 (and LettaAgentV2) agents inherit from BaseAgentV2,
which unlike the original BaseAgent class, did not expose an agent_id
attribute. This caused AttributeError: 'LettaAgentV3' object has no
attribute 'agent_id' when code attempted to access self.agent_id.
This fix adds an agent_id property to BaseAgentV2 that returns
self.agent_state.id, maintaining backward compatibility with code
that expects the self.agent_id interface from the original BaseAgent.
Closes#8805🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
When users send images as base64 data URLs (data:image/jpeg;base64,...),
the code was incorrectly trying to fetch them via HTTP, causing a
LettaImageFetchError. This fix adds proper handling for data: URLs by
parsing the media type and base64 data directly from the URL string.
Fixes#8957🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
The per_file_view_window_char_limit column is defined as INTEGER (32-bit)
in PostgreSQL. Without API-level validation, users could pass values
exceeding int32 max (2,147,483,647), causing database errors.
Changes:
- Added MAX_INT32, MAX_PER_FILE_VIEW_WINDOW_CHAR_LIMIT, and
MAX_FILES_OPEN_LIMIT constants
- Added field validators to CreateAgent and UpdateAgent schemas to
enforce these limits
Fixes#8936🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
The sync_provider_models_async function was only checking for existing
models by (handle, organization_id, model_type) before creating, but
the database has a second unique constraint on (name, provider_id,
model_type). This caused UniqueConstraintViolationError when a model
with the same name/provider already existed under a different handle.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Cloud SQL's connection pooler can return connections with stale
session settings from previous clients. If any client (Datadog,
monitoring tools, manual psql) sets statement_timeout, that setting
persists on the pooled connection.
This explicitly sets statement_timeout=0 via asyncpg's server_settings
on every connection, ensuring consistent behavior regardless of
connection pool state.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* feat: add metadata-only provider trace storage option
Add support for writing provider traces to a lightweight metadata-only
table (~1.5GB) instead of the full table (~725GB) since request/response
JSON is now stored in GCS.
- Add `LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY` setting
- Create `provider_trace_metadata` table via alembic migration
- Conditionally write to new table when flag is enabled
- Include backfill script for migrating existing data
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
* fix: use composite PK (created_at, id) for provider_trace_metadata
Aligns with GCS partitioning structure (raw/date=YYYY-MM-DD/{id}.json.gz)
and enables efficient date-range queries via the B-tree index.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* ammendments
* fix: add bulk data copy to migration
Copy existing provider_traces metadata in-migration instead of separate
backfill script. Creates indexes after bulk insert for better performance.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: remove data copy from migration, create empty table only
Old data stays in provider_traces, new writes go to provider_trace_metadata
when flag is enabled. Full traces are in GCS anyway.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: address PR comments
- Remove GCS mention from ProviderTraceMetadata docstring
- Move metadata object creation outside session context
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: reads always use full provider_traces table
The metadata_only flag should only control writes. Reads always go to
the full table to avoid returning ProviderTraceMetadata where
ProviderTrace is expected.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat: enable metadata-only provider trace writes in prod
Add LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY=true to all
Helm values (memgpt-server and lettuce-py, prod and dev).
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: non-streaming conversation messages endpoint
**Problems:**
1. `AssertionError: run_id is required when enforce_run_id_set is True`
- Non-streaming path didn't create a run before calling `step()`
2. `ResponseValidationError: Unable to extract tag using discriminator 'message_type'`
- `response_model=LettaStreamingResponse` but non-streaming returns `LettaResponse`
**Fixes:**
1. Add run creation before calling `step()` (mirrors agents endpoint)
2. Set run_id in Redis for cancellation support
3. Pass `run_id` to `step()`
4. Change `response_model` from `LettaStreamingResponse` to `LettaResponse`
(streaming returns `StreamingResponse` which bypasses response_model validation)
**Test:**
Added `test_conversation_non_streaming_raw_http` to verify the fix.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* api sync
---------
Co-authored-by: Letta <noreply@letta.com>
**Problem:**
When a user sends a message with an image URL that times out or fails to
fetch, the server returns a 500 Internal Server Error with a generic message.
This is confusing because the user doesn't know what went wrong.
**Root Cause:**
`LettaImageFetchError` was not registered in the exception handlers, so it
bubbled up as an unhandled exception.
**Fix:**
Register `LettaImageFetchError` with the 400 Bad Request handler. Now users
get a clear error message like:
```
Failed to fetch image from https://...: Timeout after 2 attempts
```
This tells users exactly what went wrong so they can retry with a different
image or verify the URL is accessible.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
**Error:**
```
TypeError: LettaAgentV2.__init__() got an unexpected keyword argument 'conversation_id'
```
**Trace:** https://letta.grafana.net/goto/afbk4da3fuxhcf?orgId=stacks-1189126
**Problem:**
The `POST /v1/conversations/{conversation_id}/compact` endpoint was failing
because `LettaAgentV3` inherits from `LettaAgentV2` without overriding
`__init__`, so passing `conversation_id` to the constructor failed.
**Fix:**
1. Add `__init__` to `LettaAgentV3` that accepts optional `conversation_id`
2. Remove redundant `conversation_id` param from `_checkpoint_messages` -
use `self.conversation_id` consistently instead
3. Clean up internal callers that were passing `conversation_id=self.conversation_id`
Backward compatible - existing code creating `LettaAgentV3(agent_state, actor)`
still works since `conversation_id` defaults to `None`.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* feat: add ID format validation to batch request schema
Add ID format validation to LettaBatchRequest using existing validator
types from letta.validators.
Changes:
- LettaBatchRequest.agent_id: str → AgentId
This ensures malformed agent IDs in batch requests are rejected with 422
validation errors instead of causing 500 database errors.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add ID format validation to identity schemas
Add ID format validation to IdentityCreate, IdentityUpsert, and IdentityUpdate
schemas using existing validator types from letta.validators.
Changes:
- agent_ids: Optional[List[str]] → Optional[List[AgentId]]
- block_ids: Optional[List[str]] → Optional[List[BlockId]]
This ensures malformed IDs are rejected with 422 validation errors instead
of causing 500 database errors.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add ID format validation to group schemas
Add ID format validation to GroupCreate, GroupUpdate, and manager config
schemas using existing validator types from letta.validators.
Changes:
- GroupCreate/GroupUpdate: agent_ids → List[AgentId], shared_block_ids → List[BlockId]
- SupervisorManager, DynamicManager, SleeptimeManager, VoiceSleeptimeManager:
manager_agent_id → AgentId
- Update variants: manager_agent_id → Optional[AgentId]
This ensures malformed IDs are rejected with 422 validation errors instead
of causing 500 database errors.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
---------
Co-authored-by: Letta <noreply@letta.com>
Check if a block with the same label already exists before attaching
to sleeptime agents. This prevents UniqueConstraintViolationError on
the (agent_id, block_label) constraint when the same block is attached
multiple times due to race conditions.
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
**Problem:**
When retrying an approval response, the idempotency check only looked at
the last message. If the approved tool triggered server-side tool calls
(e.g., `memory`), those tool returns would be the last message, causing
the idempotency check to fail with:
"Cannot process approval response: No tool call is currently awaiting approval."
**Root Cause:**
The check at line 249 only validated `current_in_context_messages[-1]`,
but server-side tool calls can add additional tool return messages after
the original approved tool's return.
**Fix:**
Search the last 10 messages (instead of just the last one) for a tool
return matching the approval's tool_call_ids. This handles the case where
server-side tool calls happen after the approved tool executes, while
keeping the search bounded and efficient.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
The cursor-based pagination was not accounting for sort order. When using
descending order (the default), "after cursor X" should return items with
id < X (items that come after X in the descending result set), but the code
was using id > X which caused infinite loops in clients iterating through pages.
This fix adjusts the cursor comparison based on the sort order:
- ascending: after=id > X, before=id < X
- descending: after=id < X, before=id > X
Note: Other pagination methods (list_agent_sources_async, list_agent_tools_async,
list_agent_groups_async) may have the same issue and should be audited.
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>