Commit Graph

7133 Commits

Author SHA1 Message Date
Kian Jones
825019c2ce fix(core): handle Anthropic streaming required ValueError (#9344)
* Fix Anthropic ValueError for long-running operations

Adds proper error handling for Anthropic SDK's streaming requirement.
When operations may exceed 10 minutes, the SDK raises a ValueError.

Changes:
- Catch ValueError in sync request() method
- Provide user-friendly error directing to async API
- Async version already had this fix with streaming fallback

Fixes Datadog issue 955d10b4-ed95-11f0-a5a5-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use LLMBadRequestError instead of ValueError for Anthropic streaming constraint

ValueError maps to HTTP 400 which incorrectly implies a bad client request.
LLMBadRequestError maps to HTTP 502 (Bad Gateway) which correctly signals
that the downstream provider (Anthropic) rejected the proxied request due
to its own constraints.

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
14ef479e70 fix(core): handle empty content in Anthropic response gracefully (#9345)
Fixes Datadog issue a47619fa-d5b8-11f0-9fd7-da7ad0900000

Handle empty content in Anthropic responses gracefully by replacing RuntimeError with LLMServerError. Now logs detailed debugging information (response ID, model, stop_reason) and returns a user-friendly error instead of crashing.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
4eb27e23b3 fix(core): add deadlock retry logic to ORM write operations (#9352)
Adds automatic retry with exponential backoff for PostgreSQL deadlock
errors (40P01) in all ORM write methods: create_async, update_async,
batch_create_async, hard_delete_async, and bulk_hard_delete_async.

For update_async, column values are snapshotted before the commit
attempt so they can be restored after rollback clears them.

Also adds DatabaseDeadlockError to _handle_dbapi_error as a fallback
when retries are exhausted.

Datadog: https://us5.datadoghq.com/error-tracking/issue/53ccdd7a-f0cc-11f0-8969-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
2c0cddf9f5 fix(core): handle Google 499 CANCELLED as client disconnect, not server error (#9363)
The google.genai.errors.ClientError with code 499 (CANCELLED) indicates the
client disconnected, not a server-side failure. Previously this fell through
to the generic ClientError handler and was classified as LLMServerError,
causing false 500s in Datadog error tracking.

- Add explicit 499 handling in handle_llm_error: log at info level, return
  LLMConnectionError instead of LLMServerError
- Catch 499 during stream iteration in stream_async and end gracefully
  instead of propagating the error

Datadog: https://us5.datadoghq.com/error-tracking/issue/c8453aaa-d559-11f0-81c6-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
f20fdc73d1 fix(core): preserve Gemini thought_signature on function calls in non-streaming path (#9351)
* fix(core): preserve Gemini thought_signature on function calls in non-streaming path

The Google Gemini API requires thought_signature to be echoed back on
function call parts in multi-turn conversations. In the non-streaming
request path, the signature was only captured for subsequent function
calls (else branch) but dropped for the first/only function call (if
branch) in convert_response_to_chat_completion. This caused 400
INVALID_ARGUMENT errors on the next turn.

Additionally, when no ReasoningContent existed to carry the signature
(e.g. Gemini 2.5 Flash with include_thoughts=False), the signature was
lost in the adapter layer. Now it falls through to TextContent.

Datadog: https://us5.datadoghq.com/error-tracking/issue/17c4b114-d596-11f0-bcd6-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): preserve Gemini thought_signature in non-temporal agent path

Carry reasoning_content_signature on TextContent in letta_agent.py
at both locations where content falls through from reasoning (same
fix already applied to the adapter and temporal activity paths).

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
32d87b70d7 fix(core): return 400 for ContextWindowExceededError instead of 500 (#9348)
SystemPromptTokenExceededError (a subclass of ContextWindowExceededError)
was hitting the generic 500 handler in Datadog. This is a client-side
configuration issue (system prompt too large for context window), so it
should return 400 Bad Request.

Datadog: https://us5.datadoghq.com/error-tracking/issue/3360e252-d5f1-11f0-8467-da7ad0900000
Also covers: https://us5.datadoghq.com/error-tracking/issue/6e50c57c

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
47aedfa1a7 fix(core): convert MCP ConnectionError to LettaMCPConnectionError for proper HTTP 502 responses (#9364)
MCP server connection failures were raising Python's builtin ConnectionError,
which bypassed the LettaMCPConnectionError FastAPI exception handler and hit
Datadog as unhandled 500 errors. Now all MCP client classes convert
ConnectionError to LettaMCPConnectionError at the source, which the existing
exception handler returns as a user-friendly 502.

Datadog: https://us5.datadoghq.com/error-tracking/issue/93db4a82-fe5a-11f0-85f0-da7ad0900000

🐛 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
31d221b47e fix: handle MCP tool schemas missing properties key gracefully (#9347)
MCP tools can have an inputSchema like `{"type": "object", "additionalProperties": false}`
with no `properties` key (valid for zero-arg tools). The hard assert on `"properties" in
parameters_schema` caused an AssertionError. Replace assertions with safe defaults.
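
A minimal sketch of the safe-default idea, assuming the schema arrives as a plain dict. `normalize_parameters_schema` is a hypothetical helper name used for illustration, not the function changed in this commit.

```python
from typing import Any


def normalize_parameters_schema(parameters_schema: dict[str, Any]) -> dict[str, Any]:
    """Return a shallow copy of the schema that always has a "properties" key."""
    normalized = dict(parameters_schema)
    # Zero-argument tools may legitimately omit "properties"; default to empty.
    normalized.setdefault("properties", {})
    return normalized


schema = normalize_parameters_schema({"type": "object", "additionalProperties": False})
print(schema["properties"])  # {}
```

The input dict is left untouched, so a schema shared between callers is not mutated.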

Datadog: https://us5.datadoghq.com/error-tracking/issue/ecee9ee2-db75-11f0-ba66-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
cbbb6d776d fix(core): unwrap ExceptionGroup in OAuth stream generator (#9346)
Fixes Datadog issue 5efbb1d4-eec5-11f0-8f8e-da7ad0900000

Add ExceptionGroup unwrapping in OAuth stream exception handler.

The bug was caused by ExceptionGroup not being caught by the general
`except Exception` handler, since ExceptionGroup is a subclass of
BaseException, not Exception. This caused TaskGroup errors to escape
as unhandled ExceptionGroups in Datadog.

The fix adds an explicit ExceptionGroup handler before the general
Exception handler, following the same unwrapping pattern used in
other parts of the codebase (mcp_tool_executor.py, base_client.py).

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
745dd1e124 fix(core): reject empty API keys in Bearer auth headers (#9350)
Empty or None API keys resulted in "Bearer " header values which cause
httpx.LocalProtocolError. Use truthiness checks instead of `is not None`
to also reject empty strings before constructing Authorization headers.
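
The difference between the two checks can be sketched as follows. `build_auth_headers` is a hypothetical helper for illustration, and the real clients may build their headers differently.

```python
from typing import Optional


def build_auth_headers(api_key: Optional[str]) -> dict[str, str]:
    """Build request headers, adding Authorization only for a usable key."""
    headers = {"Content-Type": "application/json"}
    # A truthiness check rejects both None and "", so a bare "Bearer "
    # value is never emitted. `api_key is not None` would let "" through.
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


print("Authorization" in build_auth_headers(""))  # False
```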

Datadog: https://us5.datadoghq.com/error-tracking/issue/ad3c1e38-d557-11f0-a65d-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
d592ec3135 fix: handle DBAPIError wrapping asyncpg DeadlockDetectedError (#9355)
SQLAlchemy wraps asyncpg's DeadlockDetectedError in a DBAPIError,
which was falling through to the generic 500 handler. Now detected
at both the ORM level (_handle_dbapi_error) and FastAPI handler level,
returning 409 with Retry-After header.

Datadog: https://us5.datadoghq.com/error-tracking/issue/2f1dc54c-dab6-11f0-a828-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Charles Packer
f58c4a43fa fix(core): remove "edit again if necessary" from memory tool return message (#9367)
Models (especially Opus) take this instruction literally and re-call
the memory edit tool in a loop — one user saw 96 consecutive rethink
calls. Dropping the sentence stops the feedback loop while still
asking the agent to review the result.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
369cdf72c7 feat(core): store block metadata as YAML frontmatter in .md files (#9365)
* feat(core): store block metadata as YAML frontmatter in .md files

Block .md files in git repos now embed metadata (description, limit,
read_only, metadata dict) as YAML frontmatter instead of a separate
metadata/blocks.json file. Only non-default values are rendered.

Format:
  ---
  description: "Who I am"
  limit: 5000
  ---
  Block value content here...
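
A rough sketch of reading this format with the standard library only. It handles flat `key: value` lines and is not a YAML parser. `parse_frontmatter` is a hypothetical name and should not be read as the `parse_block_markdown` utility added in this commit.

```python
import json
from typing import Any


def parse_frontmatter(text: str) -> tuple[dict[str, Any], str]:
    """Split a block file into (metadata, value).

    Files without a leading '---' line are returned as a raw value.
    """
    lines = text.split("\n")
    if lines[0].strip() != "---":
        return {}, text
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # no closing delimiter: treat the file as raw value
    metadata: dict[str, Any] = {}
    for line in lines[1:end]:
        key, sep, raw = line.partition(":")
        if not sep:
            continue  # skip lines that are not "key: value"
        raw = raw.strip()
        try:
            # Quoted strings, numbers and true/false parse as JSON scalars.
            metadata[key.strip()] = json.loads(raw)
        except json.JSONDecodeError:
            metadata[key.strip()] = raw
    return metadata, "\n".join(lines[end + 1:])


meta, value = parse_frontmatter('---\ndescription: "Who I am"\nlimit: 5000\n---\nBlock value content here...')
print(meta)  # {'description': 'Who I am', 'limit': 5000}
```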

Changes:
- New block_markdown.py utility (serialize_block / parse_block_markdown)
- Updated all three write/read paths: manager.py, memfs_client.py,
  memfs_client_base.py
- block_manager_git.py now passes description/limit/read_only/metadata
  through to git commits
- Post-push sync (git_http.py) parses frontmatter and syncs metadata
  fields to Postgres
- Removed metadata/blocks.json reads/writes entirely
- Backward compat: files without frontmatter treated as raw value
- Integration test verifies frontmatter in cloned files and metadata
  sync via git push

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: derive frontmatter defaults from BaseBlock schema, not hardcoded dict

Remove _DEFAULTS dict from block_markdown.py. The core version now
imports BaseBlock and reads field defaults via model_fields. This
fixes the limit default (was 5000, should be CORE_MEMORY_BLOCK_CHAR_LIMIT=20000).

Also:
- memfs-py copy simplified to parse-only (no serialize, no letta imports)
- All hardcoded limit=5000 fallbacks replaced with CORE_MEMORY_BLOCK_CHAR_LIMIT
- Test updated: blocks with all-default metadata correctly have no frontmatter;
  frontmatter verified after setting non-default description via API

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: always include description and limit in frontmatter

description and limit are always rendered in the YAML frontmatter,
even when at their default values. Only read_only and metadata are
conditional (omitted when at defaults).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: resolve read_only from block_update before git commit

read_only was using the old Postgres value instead of the update value
when committing to git. Also adds integration test coverage for
read_only: true appearing in frontmatter after API PATCH, and
verifying it's omitted when false (default).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: add API→git round-trip coverage for description and limit

Verifies that PATCH description/limit via API is reflected in
frontmatter after git pull. Combined with the existing push→API
test (step 6), this gives full bidirectional coverage:
- API edit description/limit → pull → frontmatter updated
- Push frontmatter with description/limit → API reflects changes

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
cthomas
9dee331e6c fix(core): backfill missing blocks when git repo is incomplete (#9340)
When enable_git_memory_for_agent is called on an agent that already has
the git-memory-enabled tag, it was returning early if the repo existed,
even if the repo was missing blocks.

Now checks if all blocks are present in the repo and backfills any
missing ones.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
2cfaba3fe6 fix(core): validate file existence before creating files_agents rows (#9341)
Prevents ForeignKeyViolationError when attaching files to agents where
the file has been deleted between listing and attachment (race condition).
Now validates file IDs exist in the files table before inserting, and
skips any missing files with a warning log.

Fixes Datadog issue a1768774-d691-11f0-9330-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
cb982cae89 fix(core): skip duplicate block attachment to prevent UniqueViolationError (#9343)
Check if a block is already attached to an agent before appending to
core_memory. Prevents asyncpg UniqueViolationError on the
unique_agent_block constraint when attach_block_async is called twice
with the same (agent_id, block_id) pair.

Fixes Datadog issue d8dec148-d535-11f0-95eb-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
93e453ef8f fix(core): transform nested block labels on git memory enable (#9339)
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
58069d760e chore: update pgvector Docker image to official pgvector/pgvector (#9336)
chore: update pgvector image from ankane/pgvector:v0.8.1 to pgvector/pgvector:0.8.1-pg18-trixie
2026-02-24 10:52:06 -08:00
Kian Jones
98fa16899b fix(core): handle ExceptionGroup-wrapped ToolError in MCP clients (#9329)
Extends the ExceptionGroup unwrapping fix from mcp_tool_executor to the
base MCP client implementations (AsyncBaseMCPClient, AsyncFastMCPSSEClient,
AsyncFastMCPStreamableHTTPClient). When ToolError exceptions are wrapped
in ExceptionGroup by Python's async TaskGroup, the exception handler now
unwraps single-exception groups before checking class names.

This prevents wrapped ToolError exceptions from being logged to Datadog
as unexpected errors instead of being handled as expected validation failures.

Related to commit 1cbf1b231 which fixed the same issue in mcp_tool_executor.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Kian Jones
e7039470e9 fix(core): handle ExceptionGroup-wrapped ToolError and McpError in MCP tool execution (#9328)
* fix(core): handle ExceptionGroup-wrapped ToolError and McpError in MCP tool execution

Fixes 3 related Datadog bugs (all fastmcp.exceptions.ToolError):
- 75d43daa-ff04-11f0-81b2-da7ad0900000
- 7af6373e-0080-11f1-9855-da7ad0900000
- a322edc8-fffa-11f0-b26c-da7ad0900000

These errors were caused by ToolError and McpError exceptions bubbling up
unhandled from the MCP REST endpoint. This fix combines the approaches from
PRs #9320 and #9321:

1. Handle ExceptionGroup wrapping (Python 3.11+ async TaskGroup)
2. Check for ToolError by class name to handle module variations
3. Convert ToolError to LettaInvalidArgumentError for proper client response
4. Catch McpError and return HTTP 500 with proper error message

Issue-IDs: 75d43daa-ff04-11f0-81b2-da7ad0900000, 7af6373e-0080-11f1-9855-da7ad0900000, a322edc8-fffa-11f0-b26c-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: return 422 instead of 500 for McpError (user config issue)

* fix: use LettaMCPConnectionError instead of HTTPException for McpError

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
cthomas
70d749e859 fix(core): add retry with backoff for post-push sync race condition (#9335)
When git push completes, the webhook fires immediately but GCS upload
may still be in progress. This causes KeyError when trying to read
commit objects that haven't been uploaded yet.

Add retry with exponential backoff (1s, 2s, 4s) to handle this race.
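
A minimal sketch of such a retry loop, using the delays named above. `retry_on_key_error` and its parameters are hypothetical, and the number of attempts in the actual change may differ.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def retry_on_key_error(
    operation: Callable[[], T],
    delays: Sequence[float] = (1.0, 2.0, 4.0),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, waiting and retrying after each delay while it raises KeyError."""
    for delay in delays:
        try:
            return operation()
        except KeyError:
            sleep(delay)  # the object may not be uploaded yet; wait, then retry
    return operation()  # final attempt; a KeyError here propagates to the caller
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.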

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Kian Jones
d48932bdb6 fix(core): sanitize Unicode surrogates in all LLM client requests (#9323)
Multiple OpenAI-compatible LLM clients (Azure, Deepseek, Groq, Together, XAI, ZAI)
and Anthropic-compatible clients (Anthropic, MiniMax, Google Vertex) were overriding
request_async/stream_async without calling sanitize_unicode_surrogates, causing
UnicodeEncodeError when message content contained lone UTF-16 surrogates.

Root cause: Child classes override parent methods but omit the sanitization step that
the base OpenAIClient includes. This allows corrupted Unicode (unpaired surrogates
from malformed emoji) to reach the httpx layer, which rejects it during UTF-8 encoding.

Fix: Import and call sanitize_unicode_surrogates in all overridden request methods.
Also removed duplicate sanitize_unicode_surrogates definition from openai_client.py
that shadowed the canonical implementation in letta.helpers.json_helpers.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

Issue-ID: 10c0f2e4-f87b-11f0-b91c-da7ad0900000
2026-02-24 10:52:06 -08:00
Kian Jones
69fc934135 fix(core): handle transient database connection errors with retry logic (#9324)
Add retry mechanism for ConnectionError during asyncpg SSL handshake failures.
Implements exponential backoff (3 attempts) and returns 503 on exhaustion.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

Issue-ID: 8caf1136-0200-11f1-8f4d-da7ad0900000
2026-02-24 10:52:06 -08:00
cthomas
05b77a5fed fix(core): update memfs client to use memory/ directory (#9322)
PR #9309 changed the block storage from blocks/ to memory/ directory.
Update memfs_client.py and memfs_client_base.py to match.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Kian Jones
b0c40b6b1d fix: multi_agent flaky test (#9314)
* fix(core): handle PermissionDeniedError in provider API key validation

Fixed OpenAI PermissionDeniedError being raised as unknown error when
validating provider API keys. The check_api_key methods in OpenAI-based
providers (OpenAI, OpenRouter, Azure, Together) now properly catch and
re-raise PermissionDeniedError as LLMPermissionDeniedError.

🐛 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle Unicode surrogates in OpenAI requests

Sanitize invalid UTF-16 surrogates before sending requests to OpenAI API.
Fixes UnicodeEncodeError when message content contains unpaired surrogates
from corrupted emoji data or malformed Unicode sequences.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* try to fix

* revert random stuff

* revert some stuff

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Kian Jones
662ec082cf fix(core): handle MCP errors and API key whitespace (#9306)
* fix: strip whitespace from API keys in LLM client headers

Fixes httpx.LocalProtocolError when API keys contain leading/trailing whitespace.
Strips whitespace from API keys before using them in HTTP headers across:
- OpenAI client (openai.py)
- Mistral client (mistral.py)
- Anthropic client (anthropic_client.py)
- Anthropic schema provider (schemas/providers/anthropic.py)
- Google AI client (google_ai_client.py)
- Proxy helpers (proxy_helpers.py)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: handle McpError gracefully in MCP client execute_tool

Return error as failed result instead of re-raising to avoid Datadog alerts for expected user-facing errors like missing tool arguments.

* fix: strip whitespace from API keys before passing to httpx client

Fixes httpx.LocalProtocolError by stripping leading/trailing whitespace
from API keys before passing them to OpenAI/AsyncOpenAI clients. The
OpenAI client library constructs Authorization headers internally, and
invalid header values (like keys with leading spaces) cause protocol
errors.

Applied fix to:
- azure_client.py (AzureOpenAI/AsyncAzureOpenAI)
- deepseek_client.py (OpenAI/AsyncOpenAI)
- openai_client.py (OpenAI/AsyncOpenAI via kwargs)
- xai_client.py (OpenAI/AsyncOpenAI)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: handle JSONDecodeError in OpenAI client requests

Catches json.JSONDecodeError from OpenAI SDK when API returns invalid
JSON (typically HTML error pages from 500-series errors) and converts
to LLMServerError with helpful details.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): strip API key whitespace at schema level on write/create

Add field_validator to ProviderCreate, ProviderUpdate, and ProviderCheck
schemas to strip whitespace from api_key and access_key fields before
persistence. This ensures keys are clean at the point of entry, preventing
whitespace from being encrypted and stored in the database.

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

* refactor: remove api_key.strip() calls across all LLM clients

Remove redundant .strip() calls on api_key parameters since pydantic models
now handle whitespace trimming at the validation layer. This centralizes
the validation logic and follows DRY principles.

- Updated 13 files across multiple LLM client implementations
- Removed 34 occurrences of api_key.strip()
- Includes: OpenAI, Anthropic, Azure, Google AI, Groq, XAI, DeepSeek, ZAI, Together, Mistral
- Also updated proxy helpers and provider schemas

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: remove redundant ternary operators from api_key parameters

Remove `if api_key else None` ternaries since pydantic validation ensures
api_key is either a valid string or None. The ternary was defensive programming
that's now unnecessary with proper model-level validation.

- Simplified 23 occurrences across 7 files
- Cleaner, more concise client initialization code
- No behavioral change since pydantic already handles this

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
2026-02-24 10:52:06 -08:00
cthomas
126d8830b8 feat: set memfs env vars in deploy wf (#9318)
2026-02-24 10:52:06 -08:00
cthomas
0bdd555f33 feat: add memfs-py service (#9315)
* feat: add memfs-py service

* add tf for bucket access and secrets v2 access

* feat(memfs): add helm charts, deploy workflow, and bug fixes

- Add dev helm chart (helm/dev/memfs-py/) with CSI secrets pattern
- Update prod helm chart with CSI secrets and correct service account
- Add GitHub Actions deploy workflow
- Change port from 8284 to 8285 to avoid conflict with core's dulwich sidecar
- Fix chunked transfer encoding issue (strip HTTP_TRANSFER_ENCODING header)
- Fix timestamp parsing to handle both ISO and HTTP date formats
- Fix get_head_sha to raise FileNotFoundError on 404

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Kian Jones <kian@letta.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
21e880907f feat(core): structure memory directory and block labels [LET-7336] (#9309)
2026-02-24 10:52:06 -08:00
Kian Jones
3709be28dd fix(core): handle Google GenAI validation errors (#9307)
* fix: handle const keyword in google genai tool schemas

* fix: handle pydantic ValidationError in Google GenAI client

Fixes Datadog error tracking issue where pydantic_core.ValidationError
was raised when tool schemas contained unsupported fields (e.g., 'const',
'default', 'additionalProperties').

Changes:
- Add error handling for pydantic ValidationError in request(), request_async(), and stream_async()
- Convert validation errors to LLMBadRequestError with helpful error message
- Deep copy tool parameters before cleaning to avoid modifying shared objects
- Add imports for pydantic_core and copy module

This prevents unhandled exceptions and provides better diagnostics when
tool schemas contain fields not supported by Google AI API.
2026-02-24 10:52:06 -08:00
Kian Jones
be60697a62 fix(core): handle protocol errors and foreign key violations (#9308)
* fix(core): handle PermissionDeniedError in provider API key validation

Fixed OpenAI PermissionDeniedError being raised as unknown error when
validating provider API keys. The check_api_key methods in OpenAI-based
providers (OpenAI, OpenRouter, Azure, Together) now properly catch and
re-raise PermissionDeniedError as LLMPermissionDeniedError.

🐛 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle Unicode surrogates in OpenAI requests

Sanitize invalid UTF-16 surrogates before sending requests to OpenAI API.
Fixes UnicodeEncodeError when message content contains unpaired surrogates
from corrupted emoji data or malformed Unicode sequences.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle MCP tool schema validation errors gracefully

Catch fastmcp.exceptions.ToolError in execute_mcp_tool endpoint and
convert to LettaInvalidArgumentError (400) instead of letting it
propagate as 500 error. This is an expected user error when tool
arguments don't match the MCP tool's schema.

Fixes Datadog issue 8f2d874a-f8e5-11f0-9b25-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle ExceptionGroup-wrapped ToolError in MCP executor

When MCP tools fail with validation errors (e.g., missing required parameters),
fastmcp raises ToolError exceptions that may be wrapped in ExceptionGroup by
Python's async TaskGroup. The exception handler now unwraps single-exception
groups before checking if the error should be handled gracefully.

Fixes Calendly API "organization parameter missing" errors being logged to
Datadog instead of returning friendly error messages to users.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: handle missing agent in create_conversation to prevent foreign key violation

* Update .gitignore

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Ari Webb
85ee7ed7b4 fix: anthropic tool sanitation (#9310)
2026-02-24 10:52:06 -08:00
Kian Jones
6f746c5225 fix(core): handle Anthropic overloaded errors and Unicode encoding issues (#9305)
* fix: handle Anthropic overloaded_error in streaming interfaces

* fix: handle Unicode surrogates in OpenAI requests

Sanitize lone Unicode surrogates before sending requests to the OpenAI API.
Surrogate code points (U+D800-U+DFFF) are reserved for UTF-16 pair encoding
and raise UnicodeEncodeError when a string containing them is encoded to UTF-8.

Fixes Datadog error: 'utf-8' codec can't encode character '\ud83c' in
position 326605: surrogates not allowed

* fix: handle UnicodeEncodeError from lone Unicode surrogates in OpenAI requests

Improved sanitize_unicode_surrogates() to explicitly filter out lone
surrogate characters (U+D800 to U+DFFF) which are invalid in UTF-8.

Previous implementation used errors='ignore' which could still fail in
edge cases. New approach directly checks Unicode code points and removes
any surrogates before data reaches httpx encoding.

Also added sanitization to stream_async_responses() method which was
missing it.
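
The code-point check described above can be sketched as follows. `strip_lone_surrogates` is a hypothetical, string-only helper and not the project's `sanitize_unicode_surrogates`, whose signature is not shown here.

```python
def strip_lone_surrogates(text: str) -> str:
    """Remove code points in U+D800..U+DFFF, which UTF-8 cannot encode."""
    return "".join(ch for ch in text if not 0xD800 <= ord(ch) <= 0xDFFF)


dirty = "weather \ud83c report"
clean = strip_lone_surrogates(dirty)
clean.encode("utf-8")  # no longer raises UnicodeEncodeError
```

A correctly paired emoji is a single code point above U+FFFF in a Python `str`, so it passes through unchanged.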

Fixes: 'utf-8' codec can't encode character '\ud83c' in position X:
surrogates not allowed
2026-02-24 10:52:06 -08:00
Sarah Wooders
93249b96f5 test(core): git-backed memory repo integration (real object store) (#9298)
---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
0bdedb3c0f feat: agent generate endpoint (#9304)
* base

* update

* clean up

* update
2026-02-24 10:52:06 -08:00
Ari Webb
5c6ca705f1 Revert "feat: bring back use message packing for timezone [LET-6846]" (#9302)
Revert "feat: bring back use message packing for timezone [LET-6846] (#9256)"

This reverts commit c5017cccdef95b84fc585b26a0ddc5b7e44eb7c9.
2026-02-24 10:52:06 -08:00
Devansh Jain
644f7b9d5d chore: Add Opus 4.6 with 1M context window [OPUS-46] (#9301)
opus 4.6 1M version
2026-02-24 10:52:06 -08:00
Kevin Lin
34159ffa21 feat: add Anthropic Opus 4.6 model support (#9123)
2026-02-24 10:52:06 -08:00
jnjpng
ff69c6a32e feat: add /agents/{agent_id}/generate endpoint for direct LLM requests (#9272)
* feat: add /agents/{agent_id}/generate endpoint for direct LLM requests

Add new endpoint that makes direct LLM provider requests without agent
context, memory, tools, or state modification. This enables:
- Quick LLM queries without agent overhead
- Testing model configurations
- Simple chat completions using agent's credentials
- Comparing responses across different models

Features:
- Uses agent's LLM config by default
- Supports model override with full provider config resolution
- Non-streaming, stateless operation
- Proper error handling and validation
- Request/response schemas with Pydantic validation

Implementation:
- Add GenerateRequest and GenerateResponse schemas
- Implement generate_completion endpoint handler
- Add necessary imports (LLMError, LLMClient, HandleNotFoundError)
- Include logging and comprehensive error handling

* fix: improve error handling and fix Message construction

- Fix critical bug: use content=[TextContent(text=...)] instead of text=...
- Add explicit error handling for NoResultFound and HandleNotFoundError
- Add error handling for convert_response_to_chat_completion
- Add structured logging for debugging
- Remove unnecessary .get() calls since Pydantic validates messages

* refactor: extract generate logic to AgentCompletionService

Move the generate endpoint business logic out of the endpoint handler
into a dedicated AgentCompletionService class for better code organization
and separation of concerns.

Changes:
- Create new AgentCompletionService in services/agent_completion_service.py
- Service handles all business logic: agent validation, LLM config resolution,
  message conversion, LLM client creation, and request/response processing
- Integrate service with SyncServer initialization
- Refactor generate_completion endpoint to use the service
- Endpoint now only handles HTTP concerns (auth, error mapping)

Benefits:
- Cleaner endpoint code (reduced from ~140 lines to ~25 lines)
- Better separation of concerns (HTTP vs business logic)
- Service logic can be reused or tested independently
- Follows established patterns in the codebase (AgentManager, etc.)

* feat: simplify generate API to accept just prompt text

Simplify the client interface by accepting a simple prompt string instead
of requiring clients to format messages.

Changes:
- Update GenerateRequest schema:
  - Replace 'messages' array with simple 'prompt' string
  - Add optional 'system_prompt' for context/instructions
  - Keep 'override_model' for model selection
- Update AgentCompletionService to format messages automatically:
  - Accepts prompt and optional system_prompt
  - Constructs message array internally (system + user messages)
  - Simpler API surface for clients
- Update endpoint documentation with new simplified examples
- Regenerate OpenAPI spec and TypeScript SDK

Benefits:
- Much simpler client experience - just send text
- No need to understand message formatting
- Still supports system prompts for context
- Cleaner API that matches common use cases

Example (before):
{
  "messages": [{"role": "user", "content": "What is 2+2?"}]
}

Example (after):
{
  "prompt": "What is 2+2?"
}
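
A minimal sketch of the message construction described above, assuming a plain dataclass as a stand-in for the real GenerateRequest schema (which is Pydantic per this commit). The helper name build_messages and the dict-shaped messages are illustrative assumptions, not the actual AgentCompletionService code:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GenerateRequest:
    """Stand-in for the simplified request schema (illustrative only)."""

    prompt: str
    system_prompt: Optional[str] = None
    override_model: Optional[str] = None


def build_messages(request: GenerateRequest) -> List[Dict[str, str]]:
    """Hypothetical helper: optional system message first, then the user prompt."""
    messages: List[Dict[str, str]] = []
    if request.system_prompt:
        messages.append({"role": "system", "content": request.system_prompt})
    messages.append({"role": "user", "content": request.prompt})
    return messages
```

With this shape, a client sends only {"prompt": "What is 2+2?"} and the server side produces the equivalent of the "before" messages array.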

* test: add comprehensive integration tests for generate endpoint

Add 9 integration tests covering various scenarios:

Happy path tests:
- test_agent_generate_basic: Basic prompt -> response flow
- test_agent_generate_with_system_prompt: System prompt + user prompt
- test_agent_generate_with_model_override: Override model selection
- test_agent_generate_long_prompt: Handle longer prompts
- test_agent_generate_no_persistence: Verify no messages saved to agent

Error handling tests:
- test_agent_generate_empty_prompt_error: Empty prompt validation (422)
- test_agent_generate_invalid_agent_id: Invalid agent ID (404)
- test_agent_generate_invalid_model_override: Invalid model handle (404)

All tests verify:
- Response structure (content, model, usage)
- Proper status codes for errors
- Usage statistics (tokens, counts)
- No side effects on agent state

Tests follow the existing patterns in test_client.py and use the
letta_client SDK, assuming the generate_completion method is
auto-generated from the OpenAPI spec.

* openapi

* refactor: rename AgentCompletionService to AgentGenerateCompletionManager

Rename for better clarity and consistency with codebase naming conventions:
- Rename file: agent_completion_service.py → agent_generate_completion_manager.py
- Rename class: AgentCompletionService → AgentGenerateCompletionManager
- Rename attribute: server.agent_completion_service → server.agent_generate_completion_manager
- Update docstrings: 'Service' → 'Manager'

Changes:
- apps/core/letta/services/agent_generate_completion_manager.py (renamed + updated class)
- apps/core/letta/server/server.py (import + initialization)
- apps/core/letta/server/rest_api/routers/v1/agents.py (usage in endpoint)

No functional changes, purely a naming refactor.

* fix: remove invalid Message parameters in generate manager

Remove agent_id=None and user_id=None from Message construction.
The Message model doesn't accept these as None values, so pass only
the required parameters (role, content).

Fixes validation error:
  'Extra inputs are not permitted [type=extra_forbidden, input_value=None]'

This aligns with other Message construction patterns in the codebase
(see tools.py, memory.py examples).

* feat: improve generate endpoint validation and tests

- Add field validator for whitespace-only prompts
- Always include system message (required by Anthropic)
- Use default "You are a helpful assistant." when no system_prompt provided
- Update tests to use direct HTTP calls via httpx
- Fix test issues:
  - Use valid agent ID format (agent-{uuid})
  - Use available model (openai/gpt-4o-mini)
  - Add whitespace validation test
- All 9 integration tests passing
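
The validation and default system message described above could look roughly like this. It is a dependency-free sketch: the commit uses a Pydantic field validator, while validate_prompt and build_messages here are hypothetical plain functions standing in for it.

```python
from typing import Dict, List, Optional

# Default taken from the commit message above.
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."


def validate_prompt(prompt: str) -> str:
    """Reject empty and whitespace-only prompts (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty or whitespace-only")
    return prompt


def build_messages(prompt: str, system_prompt: Optional[str] = None) -> List[Dict[str, str]]:
    """Always emit a system message, falling back to the default."""
    validate_prompt(prompt)
    return [
        {"role": "system", "content": system_prompt or DEFAULT_SYSTEM_PROMPT},
        {"role": "user", "content": prompt},
    ]
```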
2026-02-24 10:52:06 -08:00
Sarah Wooders
597c7f3e3e fix(core): git HTTP org fallback + post-push sync (#9297) 2026-02-24 10:52:06 -08:00
cthomas
530d33c254 feat: add skills support to agentfile (#9287) 2026-02-24 10:52:06 -08:00
Sarah Wooders
fdc32f6054 fix(core): derive dulwich org context from user_id fallback (#9296)
* fix(core): derive dulwich org context from user_id fallback

Make git smart HTTP more robust in prod by:
- normalizing/injecting a single X-Organization-Id header in the FastAPI proxy
- keeping dulwich org contextvar set through WSGI iteration
- falling back to resolving org_id from user_id when X-Organization-Id is missing
- adding opt-in debug logs (env LETTA_GIT_HTTP_DEBUG_LOGS or letta_debug query)
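
A rough sketch of the header-then-user fallback listed above. The function name resolve_org_id and the lookup_org_for_user callback are hypothetical; the real code resolves the org inside the dulwich WSGI layer, which is not shown here.

```python
from typing import Callable, Mapping, Optional


def resolve_org_id(
    headers: Mapping[str, str],
    user_id: Optional[str],
    lookup_org_for_user: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Prefer X-Organization-Id; otherwise resolve the org from the user."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    org_id = normalized.get("x-organization-id")
    if org_id:
        return org_id
    if user_id is None:
        return None
    # Fallback: ask the (hypothetical) user lookup for the org.
    return lookup_org_for_user(user_id)
```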

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor(core): drop user->org cache in dulwich org fallback

Keep the dulwich org_id fallback simple by resolving org_id from user_id via
UserManager lookup when X-Organization-Id is missing, without maintaining an
in-process cache.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore(core): make git HTTP debug logging always-on

Remove opt-in toggles for git HTTP debug logs and log proxy + dulwich request
context for every git smart-HTTP request.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
c801866d89 feat: add context token estimates to llm usage (#9295)
* base

* generate

* update
2026-02-24 10:52:06 -08:00
cthomas
e8db3ac89a fix: use conversation message_ids when exporting agent with conversation_id (#9294)
When exporting an agent with a conversation_id, the export function was
setting agent_state.message_ids from the conversation, but from_agent_state
was ignoring this and fetching messages generically via list_messages.

Now from_agent_state checks if message_ids is set and fetches those specific
messages instead.
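
The branching described above, as a minimal sketch. The function select_export_messages and its two callbacks are hypothetical stand-ins for the logic inside from_agent_state, not its actual signature:

```python
from typing import Callable, List, Optional, Sequence


def select_export_messages(
    message_ids: Optional[Sequence[str]],
    fetch_by_ids: Callable[[Sequence[str]], List[dict]],
    list_messages: Callable[[], List[dict]],
) -> List[dict]:
    """Use the explicit message_ids when set, else fall back to a generic listing."""
    if message_ids:
        return fetch_by_ids(message_ids)
    return list_messages()
```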

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
df85ee970b fix: fix org id again (#9292)
* fix(core): pass org_id to dulwich via header for git HTTP

* fix(core): use actor org id for git HTTP org header

Git smart HTTP proxies were reading `organization_id` from AgentState, where
it is not present, which caused 500s during clone/push. Use the authenticated
actor's org id instead, while still performing an authorization check on the
agent.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
49d354bac1 fix(core): pass org_id to dulwich via header for git HTTP (#9291) 2026-02-24 10:52:06 -08:00
cthomas
09d7940090 fix: use string tool_choice for Groq and OpenRouter (#9267)
Some providers (Groq, OpenRouter proxied providers) only support string
values for tool_choice ("none", "auto", "required"), not the object
format {"type": "function", "name": "..."}.

When force_tool_call is set, convert to "required" instead of object
format for these providers.
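
A minimal sketch of that conversion. The provider labels, the function name resolve_tool_choice, and the "auto" default when no tool is forced are assumptions for illustration; only the "required" string and the object format come from the message above.

```python
from typing import Dict, Optional, Union

# Assumed provider labels for backends that only accept string tool_choice values.
STRING_ONLY_TOOL_CHOICE_PROVIDERS = {"groq", "openrouter"}


def resolve_tool_choice(
    provider: str, force_tool_call: Optional[str]
) -> Union[str, Dict[str, str]]:
    """Return a tool_choice value the given provider can accept."""
    if force_tool_call is None:
        # Assumed default when no specific tool is forced.
        return "auto"
    if provider in STRING_ONLY_TOOL_CHOICE_PROVIDERS:
        # These providers reject the object form, so only require *a* tool call.
        return "required"
    return {"type": "function", "name": force_tool_call}
```

Note the trade-off: "required" forces some tool call but cannot pin a specific tool by name.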

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
e0a23f7039 feat: add usage columns to steps table (#9270)
* feat: add usage columns to steps table

Adds denormalized usage fields to the steps table for easier querying:
- model_handle: The model handle (e.g., "openai/gpt-4o-mini")
- cached_input_tokens: Tokens served from cache
- cache_write_tokens: Tokens written to cache (Anthropic)
- reasoning_tokens: Reasoning/thinking tokens

These fields mirror LettaUsageStatistics and are extracted from the
existing prompt_tokens_details and completion_tokens_details JSON columns.
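
As a sketch of the denormalization described above: the column names come from this commit, but the keys read from the two JSON blobs and the function name extract_usage_columns are assumptions, since the message does not spell out the JSON layout.

```python
from typing import Any, Dict, Optional


def extract_usage_columns(
    model_handle: Optional[str],
    prompt_tokens_details: Optional[Dict[str, Any]],
    completion_tokens_details: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Flatten usage details into the new step columns (assumed JSON keys)."""
    prompt = prompt_tokens_details or {}
    completion = completion_tokens_details or {}
    return {
        "model_handle": model_handle,
        # Assumed key names inside the JSON columns; missing values stay None.
        "cached_input_tokens": prompt.get("cached_input_tokens"),
        "cache_write_tokens": prompt.get("cache_write_tokens"),
        "reasoning_tokens": completion.get("reasoning_tokens"),
    }
```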

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate OpenAPI specs and SDK for usage columns

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>
2026-02-24 10:52:06 -08:00
Ari Webb
f957beaa37 fix: add new json (#9280) 2026-02-24 10:52:06 -08:00
Ari Webb
426f6a8ca4 feat: bring back use message packing for timezone [LET-6846] (#9256)
* feat: bring back use message packing for timezone

* add tests
2026-02-24 10:52:06 -08:00