Commit Graph

7073 Commits

Author SHA1 Message Date
jnjpng
9bf10b4761 fix: initialize result and run_status before try block in send_message (#9265)
Initialize `result` and `run_status` variables before the try block
to prevent UnboundLocalError in the finally block when exceptions
occur early in execution (e.g., during AgentLoop.load()).

Previously, if an exception was raised before `result` was assigned
inside the try block, the finally block would fail when trying to
access `result`, masking the original error with an UnboundLocalError.

This fix ensures the finally block can safely check and use these
variables regardless of when/where an exception occurs.
2026-02-24 10:52:06 -08:00
Ari Webb
170886b8a8 fix: compaction on 413 from openai (#9263) 2026-02-24 10:52:06 -08:00
cthomas
24e51f611c fix: check for actual error content in ClickHouse traces (#9260)
fix: check for actual error content, not just "error" key presence

OpenAI Responses API returns {"error": null} on success, which
incorrectly triggered is_error=True. Now we check if error_data
is truthy rather than just checking key existence.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
cthomas
e8a565a384 fix: add missing call_type to stream adapter ProviderTrace (#9259)
Both letta_llm_stream_adapter and simple_llm_stream_adapter were
creating ProviderTrace without call_type, causing "unknown" in
ClickHouse analytics.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
f48b60634f refactor: extract compact logic to shared function for temporal (#9249)
* refactor: extract compact logic to shared function

Extract the compaction logic from LettaAgentV3.compact() into a
standalone compact_messages() function that can be shared between
the agent and temporal workflows.

Changes:
- Create apps/core/letta/services/summarizer/compact.py with:
  - compact_messages(): Core compaction logic
  - build_summarizer_llm_config(): LLM config builder for summarization
  - CompactResult: Dataclass for compaction results
- Update LettaAgentV3.compact() to use compact_messages()
- Update temporal summarize_conversation_history activity to use
  compact_messages() instead of the old Summarizer class
- Add use_summary_role parameter to SummarizeParams

This ensures consistent summarization behavior across different
execution paths and prevents drift as we improve the implementation.

* chore: clean up verbose comments

* fix: correct CompactionSettings import path

* fix: correct count_tokens import from summarizer_sliding_window

* fix: update test patch path for count_tokens_with_tools

After extracting compact logic to compact.py, the test was patching
the old location. Update the patch path to the new module location.

* fix: update test to use build_summarizer_llm_config from compact.py

The function was moved from LettaAgentV3._build_summarizer_llm_config
to compact.py as a standalone function.

* fix: add early check for system prompt size in compact_messages

Check if the system prompt alone exceeds the context window before
attempting summarization. The system prompt cannot be compacted,
so fail fast with SystemPromptTokenExceededError.

* fix: properly propagate SystemPromptTokenExceededError from compact

The exception handler in _step() was not setting the correct stop_reason
for SystemPromptTokenExceededError, which caused the finally block to
return early and swallow the exception.

Add special handling to set stop_reason to context_window_overflow_in_system_prompt
when SystemPromptTokenExceededError is caught.

* revert: remove redundant SystemPromptTokenExceededError handling

The special handling in the outer exception handler is redundant because
stop_reason is already set in the inner handler at line 943. The actual
fix for the test was the early check in compact_messages(), not this
redundant handling.

* fix: correctly re-raise SystemPromptTokenExceededError

The inner exception handler was using 'raise e' which re-raised the outer
ContextWindowExceededError instead of the current SystemPromptTokenExceededError.

Changed to 'raise' to correctly re-raise the current exception. This bug
was pre-existing but masked because _check_for_system_prompt_overflow was
only called as a fallback. The new early check in compact_messages() exposed it.

* revert: remove early check and restore raise e to match main behavior

* fix: set should_continue=False and correctly re-raise exception

- Add should_continue=False in SystemPromptTokenExceededError handler (matching main's _check_for_system_prompt_overflow behavior)
- Fix raise e -> raise to correctly propagate SystemPromptTokenExceededError

Note: test_large_system_prompt_summarization still fails locally but passes on main.
Need to investigate why exception isn't propagating correctly on refactored branch.

* fix: add SystemPromptTokenExceededError handler for post-step compaction

The post-step compaction (line 1066) was missing a SystemPromptTokenExceededError
exception handler. When compact_messages() raised this error, it would be caught
by the outer exception handler which would:
1. Set stop_reason to "error" instead of "context_window_overflow_in_system_prompt"
2. Not set should_continue = False
3. Get swallowed by the finally block (line 1126) which returns early

This caused test_large_system_prompt_summarization to fail because the exception
never propagated to the test.

The fix adds the same exception handler pattern used in the retry compaction flow
(line 941-946), ensuring proper state is set before re-raising.

This issue only affected the refactored code because on main, _check_for_system_prompt_overflow()
was an instance method that set should_continue/stop_reason BEFORE raising. In the refactor,
compact_messages() is a standalone function that cannot set instance state, so the caller
must handle the exception and set the state.
2026-02-24 10:52:06 -08:00
Kian Jones
74dc6225ba feat: add support for YAML config file (#8999)
* feat: add simplified YAML config file support

Simple hierarchical YAML config that maps to environment variables:

```yaml
letta:
    telemetry:
        enable_datadog: true
    pg_host: localhost
```

Maps to:
- LETTA_TELEMETRY_ENABLE_DATADOG=true
- LETTA_PG_HOST=localhost

Config loaded at settings startup. Environment variables take precedence.

Closes #8997

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

* Rename config files from 'config.yaml' to 'conf.yaml'

* add conf.yaml

* modifications (#9199)

* feat: add simplified YAML config file support

Simple hierarchical YAML config that maps to environment variables:

```yaml
letta:
    telemetry:
        enable_datadog: true
    pg_host: localhost
```

Maps to:
- LETTA_TELEMETRY_ENABLE_DATADOG=true
- LETTA_PG_HOST=localhost

Config loaded at settings startup. Environment variables take precedence.

Closes #8997

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

* Rename config files from 'config.yaml' to 'conf.yaml'

* add conf.yaml

* fixes?

---------

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

---------

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
2026-02-24 10:52:06 -08:00
Kian Jones
a206f7f345 feat: add ID format validation to agent and user schemas (#9151)
* feat: add ID format validation to agent and user schemas

Reuse existing validator types (ToolId, SourceId, BlockId, MessageId,
IdentityId, UserId) from letta.validators to enforce ID format validation
at the schema level. This ensures malformed IDs are rejected with a 422
validation error instead of causing 500 database errors.

Changes:
- CreateAgent: validate tool_ids, source_ids, folder_ids, block_ids, identity_ids
- UpdateAgent: validate tool_ids, source_ids, folder_ids, block_ids, message_ids, identity_ids
- UserUpdate: validate id

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

* fix: override ID validation in AgentSchema for agent file portability

AgentSchema extends CreateAgent but needs to allow arbitrary short IDs
(e.g., tool-0, block-0) for portable agent files. Override the validated
ID fields to use plain List[str] instead of the validated types.

Also fix test_agent.af to use proper UUID-format IDs.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: revert test_agent.af - short IDs are valid for agent files

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix openapi schema

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Kian Jones
025eeaa363 fix: override validation for group for agentfile import (#9248)
* override validation for group for agentfile import

* fix the rest of groupcreate

* add api changes
2026-02-24 10:52:06 -08:00
Sarah Wooders
3fdf2b6c79 chore: deprecate old agent messaging (#9120) 2026-02-24 10:52:06 -08:00
cthomas
cd7e80acc3 fix: add error trace logging to simple_llm_stream_adapter (#9247)
Log error traces to ClickHouse when streaming requests fail,
matching the behavior in letta_llm_stream_adapter.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
4096b30cd7 feat: log LLM traces to clickhouse (#9111)
* feat: add non-streaming option for conversation messages

- Add ConversationMessageRequest with stream=True default (backwards compatible)
- stream=true (default): SSE streaming via StreamingService
- stream=false: JSON response via AgentLoop.load().step()

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API schema for ConversationMessageRequest

* feat: add direct ClickHouse storage for raw LLM traces

    Adds ability to store raw LLM request/response payloads directly in ClickHouse,
    bypassing OTEL span attribute size limits. This enables debugging and analytics
    on large LLM payloads (>10MB system prompts, large tool schemas, etc.).

    New files:
    - letta/schemas/llm_raw_trace.py: Pydantic schema with ClickHouse row helper
    - letta/services/llm_raw_trace_writer.py: Async batching writer (fire-and-forget)
    - letta/services/llm_raw_trace_reader.py: Reader with query methods
    - scripts/sql/clickhouse/llm_raw_traces.ddl: Production table DDL
    - scripts/sql/clickhouse/llm_raw_traces_local.ddl: Local dev DDL
    - apps/core/clickhouse-init.sql: Local dev initialization

    Modified:
    - letta/settings.py: Added 4 settings (store_llm_raw_traces, ttl, batch_size, flush_interval)
    - letta/llm_api/llm_client_base.py: Integration into request_async_with_telemetry
    - compose.yaml: Added ClickHouse service for local dev
    - justfile: Added clickhouse, clickhouse-cli, clickhouse-traces commands

    Feature disabled by default (LETTA_STORE_LLM_RAW_TRACES=false).
    Uses ZSTD(3) compression for 10-30x reduction on JSON payloads.

    🤖 Generated with [Letta Code](https://letta.com)

    Co-Authored-By: Letta <noreply@letta.com>

* fix: address code review feedback for LLM raw traces

Fixes based on code review feedback:

1. Fix ClickHouse endpoint parsing - default to secure=False for raw host:port
   inputs (was defaulting to HTTPS which breaks local dev)

2. Make raw trace writes truly fire-and-forget - use asyncio.create_task()
   instead of awaiting, so JSON serialization doesn't block request path

3. Add bounded queue (maxsize=10000) - prevents unbounded memory growth
   under load. Drops traces with warning if queue is full.

4. Fix deprecated asyncio usage - get_running_loop() instead of get_event_loop()

5. Add org_id fallback - use _telemetry_org_id if actor doesn't have it

6. Remove unused imports - json import in reader

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add missing asyncio import and simplify JSON serialization

- Add missing 'import asyncio' that was causing 'name asyncio is not defined' error
- Remove unnecessary clean_double_escapes() function - the JSON is stored correctly,
  the clickhouse-client CLI was just adding extra escaping when displaying
- Update just clickhouse-trace to use Python client for correct JSON output

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: add clickhouse raw trace integration test

* test: simplify clickhouse trace assertions

* refactor: centralize usage parsing and stream error traces

Use per-client usage helpers for raw trace extraction and ensure streaming errors log requests with error metadata.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: exercise provider usage parsing live

Make live OpenAI/Anthropic/Gemini requests with credential gating and validate Anthropic cache usage mapping when present.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: fix usage parsing tests to pass

- Use GoogleAIClient with GEMINI_API_KEY instead of GoogleVertexClient
- Update model to gemini-2.0-flash (1.5-flash deprecated in v1beta)
- Add tools=[] for Gemini/Anthropic build_request_data

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: extract_usage_statistics returns LettaUsageStatistics

Standardize on LettaUsageStatistics as the canonical usage format returned by client helpers. Inline UsageStatistics construction for ChatCompletionResponse where needed.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add is_byok and llm_config_json columns to ClickHouse traces

Extend llm_raw_traces table with:
- is_byok (UInt8): Track BYOK vs base provider usage for billing analytics
- llm_config_json (String, ZSTD): Store full LLM config for debugging and analysis

This enables queries like:
- BYOK usage breakdown by provider/model
- Config parameter analysis (temperature, max_tokens, etc.)
- Debugging specific request configurations

* feat: add tests for error traces, llm_config_json, and cache tokens

- Update llm_raw_trace_reader.py to query new columns (is_byok,
  cached_input_tokens, cache_write_tokens, reasoning_tokens, llm_config_json)
- Add test_error_trace_stored_in_clickhouse to verify error fields
- Add test_cache_tokens_stored_for_anthropic to verify cache token storage
- Update existing tests to verify llm_config_json is stored correctly
- Make llm_config required in log_provider_trace_async()
- Simplify provider extraction to use provider_name directly

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* ci: add ClickHouse integration tests to CI pipeline

- Add use-clickhouse option to reusable-test-workflow.yml
- Add ClickHouse service container with otel database
- Add schema initialization step using clickhouse-init.sql
- Add ClickHouse env vars (CLICKHOUSE_ENDPOINT, etc.)
- Add separate clickhouse-integration-tests job running
  integration_test_clickhouse_llm_raw_traces.py

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify provider and org_id extraction in raw trace writer

- Use model_endpoint_type.value for provider (not provider_name)
- Simplify org_id to just self.actor.organization_id (actor is always pydantic)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify LLMRawTraceWriter with _enabled flag

- Check ClickHouse env vars once at init, set _enabled flag
- Early return in write_async/flush_async if not enabled
- Remove ValueError raises (never used)
- Simplify _get_client (no validation needed since already checked)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add LLMRawTraceWriter shutdown to FastAPI lifespan

Properly flush pending traces on graceful shutdown via lifespan
instead of relying only on atexit handler.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add agent_tags column to ClickHouse traces

Store agent tags as Array(String) for filtering/analytics by tag.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* cleanup

* fix(ci): fix ClickHouse schema initialization in CI

- Create database separately before loading SQL file
- Remove CREATE DATABASE from SQL file (handled in CI step)
- Add verification step to confirm table was created
- Use -sf flag for curl to fail on HTTP errors

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify LLM trace writer with ClickHouse async_insert

- Use ClickHouse async_insert for server-side batching instead of manual queue/flush loop
- Sync cloud DDL schema with clickhouse-init.sql (add missing columns)
- Remove redundant llm_raw_traces_local.ddl
- Remove unused batch_size/flush_interval settings
- Update tests for simplified writer

Key changes:
- async_insert=1, wait_for_async_insert=1 for reliable server-side batching
- Simple per-trace retry with exponential backoff (max 3 retries)
- ~150 lines removed from writer

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: consolidate ClickHouse direct writes into TelemetryManager backend

- Add clickhouse_direct backend to provider_trace_backends
- Remove duplicate ClickHouse write logic from llm_client_base.py
- Configure via LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse_direct

The clickhouse_direct backend:
- Converts ProviderTrace to LLMRawTrace
- Extracts usage stats from response JSON
- Writes via LLMRawTraceWriter with async_insert

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: address PR review comments and fix llm_config bug

Review comment fixes:
- Rename clickhouse_direct -> clickhouse_analytics (clearer purpose)
- Remove ClickHouse from OSS compose.yaml, create separate compose.clickhouse.yaml
- Delete redundant scripts/test_llm_raw_traces.py (use pytest tests)
- Remove unused llm_raw_traces_ttl_days setting (TTL handled in DDL)
- Fix socket description leak in telemetry_manager docstring
- Add cloud-only comment to clickhouse-init.sql
- Update justfile to use separate compose file

Bug fix:
- Fix llm_config not being passed to ProviderTrace in telemetry
- Now correctly populates provider, model, is_byok for all LLM calls
- Affects both request_async_with_telemetry and log_provider_trace_async

DDL optimizations:
- Add secondary indexes (bloom_filter for agent_id, model, step_id)
- Add minmax indexes for is_byok, is_error
- Change model and error_type to LowCardinality for faster GROUP BY

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: rename llm_raw_traces -> llm_traces

Address review feedback that "raw" is misleading since we denormalize fields.

Renames:
- Table: llm_raw_traces -> llm_traces
- Schema: LLMRawTrace -> LLMTrace
- Files: llm_raw_trace_{reader,writer}.py -> llm_trace_{reader,writer}.py
- Setting: store_llm_raw_traces -> store_llm_traces

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update workflow references to llm_traces

Missed renaming table name in CI workflow files.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update clickhouse_direct -> clickhouse_analytics in docstring

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove inaccurate OTEL size limit comments

The 4MB limit is our own truncation logic, not an OTEL protocol limit.
The real benefit is denormalized columns for analytics queries.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove local ClickHouse dev setup (cloud-only feature)

- Delete clickhouse-init.sql and compose.clickhouse.yaml
- Remove local clickhouse just commands
- Update CI to use cloud DDL with MergeTree for testing

clickhouse_analytics is a cloud-only feature. For local dev, use postgres backend.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: restore compose.yaml to match main

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: merge clickhouse_analytics into clickhouse backend

Per review feedback - having two separate backends was confusing.

Now the clickhouse backend:
- Writes to llm_traces table (denormalized for cost analytics)
- Reads from OTEL traces table (will cut over to llm_traces later)

Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: correct path to DDL file in CI workflow

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: add provider index to DDL for faster filtering

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: configure telemetry backend in clickhouse tests

Tests need to set telemetry_settings.provider_trace_backends to include
'clickhouse', otherwise traces are routed to default postgres backend.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: set provider_trace_backend field, not property

provider_trace_backends is a computed property, need to set the
underlying provider_trace_backend string field instead.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: error trace test and error_type extraction

- Add TelemetryManager to error trace test so traces get written
- Fix error_type extraction to check top-level before nested error dict

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use provider_trace.id for trace correlation across backends

- Pass provider_trace.id to LLMTrace instead of auto-generating
- Log warning if ID is missing (shouldn't happen, helps debug)
- Fallback to new UUID only if not set

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: trace ID correlation and concurrency issues

- Strip "provider_trace-" prefix from ID for UUID storage in ClickHouse
- Add asyncio.Lock to serialize writes (clickhouse_connect not thread-safe)
- Fix Anthropic prompt_tokens to include cached tokens for cost analytics
- Log warning if provider_trace.id is missing

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
2026-02-24 10:52:06 -08:00
jnjpng
24ea7dbaed feat: include tools as part of token estimate in compact (#9242)
* base

* fix
2026-02-24 10:52:06 -08:00
Ari Webb
0bbb9c9bc0 feat: add reasoning zai openrouter (#9189)
* feat: add reasoning zai openrouter

* add openrouter reasoning

* stage + publish api

* openrouter reasoning always on

* revert

* fix

* remove reference

* do
2026-02-24 10:52:06 -08:00
Kian Jones
01cb00ae10 Revert "fix: truncate oversized text in embedding requests" (#9227)
Revert "fix: truncate oversized text in embedding requests (#9196)"

This reverts commit a9c342087e022519c63d62fb76b72aed8859539b.
2026-02-24 10:52:06 -08:00
Kian Jones
68eb076135 Revert "fix: force statement_timeout=0 on all database connections" (#9226)
Revert "fix: force statement_timeout=0 on all database connections (#9184)"

This reverts commit 0d3d9ea76bae586520ae8f50badb203ffd441675.
2026-02-24 10:52:06 -08:00
Kian Jones
ddd1a11a93 Revert "feat: add statement_timeout to SQLAlchemy OTEL spans" (#9224)
Revert "feat: add statement_timeout to SQLAlchemy OTEL spans (#9220)"

This reverts commit b8f3ed334f29b11d21d391e46c06224fa259399e.
2026-02-24 10:52:06 -08:00
Kian Jones
630c147b13 fix: truncate oversized text in embedding requests (#9196)
fix: handle oversized text in embedding requests with recursive chunking

When message text exceeds the embedding model's context length, recursively
split it until all chunks can be embedded successfully.

Changes:
- `tpuf_client.py`: Add `_split_text_in_half()` helper for recursive splitting
- `tpuf_client.py`: Add `_generate_embeddings_with_chunking()` that retries
  with splits on context length errors
- `tpuf_client.py`: Store `message_id` and `chunk_index` columns in Turbopuffer
- `tpuf_client.py`: Deduplicate query results by `message_id`
- `tpuf_client.py`: Use `LettaInvalidArgumentError` instead of `ValueError`
- `tpuf_client.py`: Move LLMClient import to top of file
- `openai_client.py`: Remove fixed truncation (chunking handles this now)
- Add tests for `_split_text_in_half` and chunked query deduplication

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
3f23a23227 feat: add compaction stats (#9219)
* base

* update

* last

* generate

* fix test
2026-02-24 10:52:06 -08:00
Kian Jones
720fc9c758 feat: add statement_timeout to SQLAlchemy OTEL spans (#9220)
Queries Postgres for statement_timeout on connection checkout and adds
it as db.statement_timeout attribute on cursor execution spans.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
df2e666ced test: skip deepwiki SSE MCP server test (#9218)
The deepwiki SSE MCP server is deprecated, so skip this test.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
e25a0c9cdf feat: update compact endpoint to store summary message (#9215)
* base

* add tests
2026-02-24 10:52:06 -08:00
github-actions[bot]
9471fab2cf fix: Add agent_id property to BaseAgentV2 for backward compatibility (#8806)
The LettaAgentV3 (and LettaAgentV2) agents inherit from BaseAgentV2,
which unlike the original BaseAgent class, did not expose an agent_id
attribute. This caused AttributeError: 'LettaAgentV3' object has no
attribute 'agent_id' when code attempted to access self.agent_id.

This fix adds an agent_id property to BaseAgentV2 that returns
self.agent_state.id, maintaining backward compatibility with code
that expects the self.agent_id interface from the original BaseAgent.

Closes #8805

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
github-actions[bot]
5cde3c2ec0 fix: handle data URLs in image processing to prevent LettaImageFetchError (#8958)
When users send images as base64 data URLs (data:image/jpeg;base64,...),
the code was incorrectly trying to fetch them via HTTP, causing a
LettaImageFetchError. This fix adds proper handling for data: URLs by
parsing the media type and base64 data directly from the URL string.

Fixes #8957

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
2026-02-24 10:52:05 -08:00
github-actions[bot]
101cfefe5e fix: add retry logic for turbopuffer transient network errors (#8635)
Add async retry decorator with exponential backoff to handle transient
network connection errors (httpx.ConnectError) when connecting to
turbopuffer.com. This addresses production errors seen in memgpt-server.

Changes:
- Add `async_retry_with_backoff()` decorator for async methods
- Add `is_transient_error()` helper to identify retryable errors
- Apply retry logic to all turbopuffer network operations
- Retry config: 3 retries, 1s initial delay, 2x backoff, 10% jitter

Fixes #8390
Relates to #8155

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
2026-02-24 10:52:05 -08:00
github-actions[bot]
81398118dd fix: add validation for per_file_view_window_char_limit to prevent int32 overflow (#8937)
The per_file_view_window_char_limit column is defined as INTEGER (32-bit)
in PostgreSQL. Without API-level validation, users could pass values
exceeding int32 max (2,147,483,647), causing database errors.

Changes:
- Added MAX_INT32, MAX_PER_FILE_VIEW_WINDOW_CHAR_LIMIT, and
  MAX_FILES_OPEN_LIMIT constants
- Added field validators to CreateAgent and UpdateAgent schemas to
  enforce these limits

Fixes #8936

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
2026-02-24 10:52:05 -08:00
jnjpng
d28ccc0be6 feat: add summary message and event on compaction (#9144)
* base

* update

* update

* revert formatting

* routes

* legacy

* fix

* review

* update
2026-02-24 10:52:05 -08:00
Kian Jones
b0f8e16ac0 fix: check both unique constraints in provider model sync (#9193)
The sync_provider_models_async function was only checking for existing
models by (handle, organization_id, model_type) before creating, but
the database has a second unique constraint on (name, provider_id,
model_type). This caused UniqueConstraintViolationError when a model
with the same name/provider already existed under a different handle.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:05 -08:00
Kian Jones
3b5251fbd6 fix: force statement_timeout=0 on all database connections (#9184)
Cloud SQL's connection pooler can return connections with stale
session settings from previous clients. If any client (Datadog,
monitoring tools, manual psql) sets statement_timeout, that setting
persists on the pooled connection.

This explicitly sets statement_timeout=0 via asyncpg's server_settings
on every connection, ensuring consistent behavior regardless of
connection pool state.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:05 -08:00
Kian Jones
9a8d381b14 fix: add field validators to Block schema for int32 limit and null bytes (#9185)
- Add `validate_limit_int32` validator (mode=after) to reject limit values
  exceeding PostgreSQL INTEGER max (2147483647)
- Add `sanitize_value_null_bytes` validator (mode=before) to strip null bytes
  from block values, preventing asyncpg CharacterNotInRepertoireError

Fixes DB errors:
- "value out of int32 range" on block.limit
- "invalid byte sequence for encoding UTF8: 0x00" on block.value

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:05 -08:00
Ari Webb
7b0b1f2531 fix: warning (#9179)
* fix: warning

* just stage publish api

* note

* api
2026-02-24 10:52:05 -08:00
cthomas
65dbd7fd89 chore: bump v0.16.4 (#3168) 2026-01-29 12:50:03 -08:00
Caren Thomas
72871ff923 bump version 2026-01-29 12:45:45 -08:00
Kian Jones
00045afc3c chore: add statement timeout log (#9177)
* fix: remove duplicate 'Tag and push latest' step from crouton workflow

* startuplog

* Update apps/core/letta/server/rest_api/app.py
2026-01-29 12:44:04 -08:00
Kevin Lin
3a13c63f60 fix: preserve slashes in memory block paths (#9172)
Remove .replace("/", "_") from memory block label processing to allow hierarchical memory organization with slashes.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Ari Webb
501de90d6c fix: fix base openrouter (#9171) 2026-01-29 12:44:04 -08:00
Ari Webb
bcd90859ec fix: byok uses our key, fix that (#9169) 2026-01-29 12:44:04 -08:00
Kian Jones
c1a02fa180 feat: add metadata-only provider trace storage option (#9155)
* feat: add metadata-only provider trace storage option

Add support for writing provider traces to a lightweight metadata-only
table (~1.5GB) instead of the full table (~725GB) since request/response
JSON is now stored in GCS.

- Add `LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY` setting
- Create `provider_trace_metadata` table via alembic migration
- Conditionally write to new table when flag is enabled
- Include backfill script for migrating existing data

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

* fix: use composite PK (created_at, id) for provider_trace_metadata

Aligns with GCS partitioning structure (raw/date=YYYY-MM-DD/{id}.json.gz)
and enables efficient date-range queries via the B-tree index.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* ammendments

* fix: add bulk data copy to migration

Copy existing provider_traces metadata in-migration instead of separate
backfill script. Creates indexes after bulk insert for better performance.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: remove data copy from migration, create empty table only

Old data stays in provider_traces, new writes go to provider_trace_metadata
when flag is enabled. Full traces are in GCS anyway.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: address PR comments

- Remove GCS mention from ProviderTraceMetadata docstring
- Move metadata object creation outside session context

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: reads always use full provider_traces table

The metadata_only flag should only control writes. Reads always go to
the full table to avoid returning ProviderTraceMetadata where
ProviderTrace is expected.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: enable metadata-only provider trace writes in prod

Add LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY=true to all
Helm values (memgpt-server and lettuce-py, prod and dev).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
amysguan
69cad47e6a fix: respect enable_reasoner setting from .af imports instead of falling back to model defaults (#9163)
Co-authored-by: Amy Guan <amy@letta.com>
2026-01-29 12:44:04 -08:00
Ari Webb
a798cc90c4 fix: openrouter provider (#9166)
* fix: openrouter provider

* just stage publish api

* web openapi
2026-01-29 12:44:04 -08:00
cthomas
59ffaec8f4 fix: revert test comments (#9161) 2026-01-29 12:44:04 -08:00
Ari Webb
9ce1249738 feat: openrouter byok (#9148)
* feat: openrouter byok

* new client is unnecessary

* revert json diffs
2026-01-29 12:44:04 -08:00
cthomas
d992aa0df4 fix: non-streaming conversation messages endpoint (#9159)
* fix: non-streaming conversation messages endpoint

**Problems:**
1. `AssertionError: run_id is required when enforce_run_id_set is True`
   - Non-streaming path didn't create a run before calling `step()`

2. `ResponseValidationError: Unable to extract tag using discriminator 'message_type'`
   - `response_model=LettaStreamingResponse` but non-streaming returns `LettaResponse`

**Fixes:**
1. Add run creation before calling `step()` (mirrors agents endpoint)
2. Set run_id in Redis for cancellation support
3. Pass `run_id` to `step()`
4. Change `response_model` from `LettaStreamingResponse` to `LettaResponse`
   (streaming returns `StreamingResponse` which bypasses response_model validation)

**Test:**
Added `test_conversation_non_streaming_raw_http` to verify the fix.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* api sync

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
cthomas
dd7e28fae6 fix: return 400 instead of 500 for image fetch errors (#9157)
**Problem:**
When a user sends a message with an image URL that times out or fails to
fetch, the server returns a 500 Internal Server Error with a generic message.
This is confusing because the user doesn't know what went wrong.

**Root Cause:**
`LettaImageFetchError` was not registered in the exception handlers, so it
bubbled up as an unhandled exception.

**Fix:**
Register `LettaImageFetchError` with the 400 Bad Request handler. Now users
get a clear error message like:
```
Failed to fetch image from https://...: Timeout after 2 attempts
```

This tells users exactly what went wrong so they can retry with a different
image or verify the URL is accessible.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
cthomas
372c8dcc85 fix: add conversation_id support to LettaAgentV3 constructor (#9156)
**Error:**
```
TypeError: LettaAgentV2.__init__() got an unexpected keyword argument 'conversation_id'
```

**Trace:** https://letta.grafana.net/goto/afbk4da3fuxhcf?orgId=stacks-1189126

**Problem:**
The `POST /v1/conversations/{conversation_id}/compact` endpoint was failing
because `LettaAgentV3` inherits from `LettaAgentV2` without overriding
`__init__`, so passing `conversation_id` to the constructor failed.

**Fix:**
1. Add `__init__` to `LettaAgentV3` that accepts optional `conversation_id`
2. Remove redundant `conversation_id` param from `_checkpoint_messages` -
   use `self.conversation_id` consistently instead
3. Clean up internal callers that were passing `conversation_id=self.conversation_id`

Backward compatible - existing code creating `LettaAgentV3(agent_state, actor)`
still works since `conversation_id` defaults to `None`.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Shubham Naik
bb2145c24c connections (#9113)
* chore: release code

* chore: release code

* chore: release code

* chore: release code

* chore: release code

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: remote

* chore: support  multi project chat
2026-01-29 12:44:04 -08:00
Kian Jones
45c0a4cd0d feat: add ID format validation to batch request schema (#9154)
* feat: add ID format validation to batch request schema

Add ID format validation to LettaBatchRequest using existing validator
types from letta.validators.

Changes:
- LettaBatchRequest.agent_id: str → AgentId

This ensures malformed agent IDs in batch requests are rejected with 422
validation errors instead of causing 500 database errors.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Kian Jones
eaaca141f7 feat: add ID format validation to identity schemas (#9153)
* feat: add ID format validation to identity schemas

Add ID format validation to IdentityCreate, IdentityUpsert, and IdentityUpdate
schemas using existing validator types from letta.validators.

Changes:
- agent_ids: Optional[List[str]] → Optional[List[AgentId]]
- block_ids: Optional[List[str]] → Optional[List[BlockId]]

This ensures malformed IDs are rejected with 422 validation errors instead
of causing 500 database errors.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Kian Jones
dd8be95142 feat: add ID format validation to group schemas (#9152)
* feat: add ID format validation to group schemas

Add ID format validation to GroupCreate, GroupUpdate, and manager config
schemas using existing validator types from letta.validators.

Changes:
- GroupCreate/GroupUpdate: agent_ids → List[AgentId], shared_block_ids → List[BlockId]
- SupervisorManager, DynamicManager, SleeptimeManager, VoiceSleeptimeManager:
  manager_agent_id → AgentId
- Update variants: manager_agent_id → Optional[AgentId]

This ensures malformed IDs are rejected with 422 validation errors instead
of causing 500 database errors.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Kian Jones
42b1e741dc fix: prevent duplicate block attachment in sleeptime agents (#9150)
Check if a block with the same label already exists before attaching
to sleeptime agents. This prevents UniqueConstraintViolationError on
the (agent_id, block_label) constraint when the same block is attached
multiple times due to race conditions.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Ari Webb
04e6d668ec fix: make it so sync updates model_endpoint info (#9138) 2026-01-29 12:44:04 -08:00