letta-server

Author	SHA1	Message	Date
Sarah Wooders	0dde155e9a	feat: Prefix cache optimization system prompt (#9381 )	2026-02-24 10:52:07 -08:00
Kian Jones	7eb85707b1	feat(tf): gpu runners and prod memory_repos (#9283 ) * add gpu runners and prod memory_repos * add lmstudio and vllm in model_settings * fix llm_configs and change variable name in reusable workflow and change perms for memory_repos to admin in tf * fix: update self-hosted provider tests to use SDK 1.0 and v2 tests - Update letta-client from ==0.1.324 to >=1.0.0 - Switch ollama/vllm/lmstudio tests to integration_test_send_message_v2.py 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: use openai provider_type for self-hosted model settings ollama/vllm/lmstudio are not valid provider_type values in the SDK model_settings schema - they use openai-compatible APIs so provider_type should be openai. The provider routing is determined by the handle prefix. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: use openai_compat_base_url for ollama/vllm/lmstudio providers When reconstructing LLMConfig from a model handle lookup, use the provider's openai_compat_base_url (which includes /v1) instead of raw base_url. This fixes 404 errors when calling ollama/vllm/lmstudio since OpenAI client expects /v1/chat/completions endpoint. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: enable redis for ollama/vllm/lmstudio tests Background streaming tests require Redis. Add use-redis: true to self-hosted provider test workflows. 
🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * add memfs-py in prod bucket access * change ollama * change packer model defaults * self-hosted provider support * disable reasoner to match the number of messages in test case, enable parallel tool calls, and pass embedding configs * remove reasoning setting not supported for ollama * add qwen3 to extra assistant message case * lower temp * prep for lmstudio and vllm * used lmstudio_openai client * skip parallel tool calls on cpu ran provider lmstudio * revert downgrade since it's so slow already * add required flags for tool call parsing etc. * change tool call parser from hermes to qwen3_xml * qwen3_xmlk -> qwen3_coder * upgrade vllm to latest container * revert to hermes (incompatible with parallel tool calls?) and skipping vllm tests on parallel tool calls * install uv redis extra * remove lmstudio --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kevin Lin	23c94ec6d3	feat: add log probabilities from OpenAI-compatible servers and SGLang native endpoint (#9240 ) * Add log probabilities support for RL training This enables Letta server to request and return log probabilities from OpenAI-compatible providers (including SGLang) for use in RL training. Changes: - LLMConfig: Add return_logprobs and top_logprobs fields - OpenAIClient: Set logprobs in ChatCompletionRequest when enabled - LettaLLMAdapter: Add logprobs field and extract from response - LettaResponse: Add logprobs field to return log probs to client - LettaRequest: Add return_logprobs/top_logprobs for per-request override - LettaAgentV3: Store and pass logprobs through to response - agents.py: Handle request-level logprobs override Usage: response = client.agents.messages.create( agent_id=agent_id, messages=[...], return_logprobs=True, top_logprobs=5, ) print(response.logprobs) # Per-token log probabilities 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * Add multi-turn token tracking for RL training via SGLang native endpoint - Add TurnTokenData schema to track token IDs and logprobs per turn - Add return_token_ids flag to LettaRequest and LLMConfig - Create SGLangNativeClient for /generate endpoint (returns output_ids) - Create SGLangNativeAdapter that uses native endpoint - Modify LettaAgentV3 to accumulate turns across LLM calls - Include turns in LettaResponse when return_token_ids=True * Fix: Add SGLang native adapter to step() method, not just stream() * Fix: Handle Pydantic Message objects in SGLang native adapter * Fix: Remove api_key reference from LLMConfig (not present) * Fix: Add missing 'created' field to ChatCompletionResponse * Add full tool support to SGLang native adapter - Format tools into prompt in Qwen-style format - Parse tool calls from <tool_call> tags in response - Format tool results as <tool_response> in user messages - Set finish_reason to 'tool_calls' when tools are called * Use 
tokenizer.apply_chat_template for proper tool formatting - Add tokenizer caching in SGLang native adapter - Use apply_chat_template when tokenizer available - Fall back to manual formatting if not - Convert Letta messages to OpenAI format for tokenizer * Fix: Use func_response instead of tool_return for ToolReturn content * Fix: Get output_token_logprobs from meta_info in SGLang response * Fix: Allow None in output_token_logprobs (SGLang format includes null) * chore: remove unrelated files from logprobs branch 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add missing call_type param to adapter constructors in letta_agent_v3 The SGLang refactor dropped call_type=LLMCallType.agent_step when extracting adapter creation into conditional blocks. Restores it for all 3 spots (SGLang in step, SimpleLLM in step, SGLang in stream). 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * just stage-api && just publish-api * fix: update expected LLMConfig fields in schema test for logprobs support 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: remove rllm provider references 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * just stage-api && just publish-api 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-65-206.ec2.internal> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	f9f1c55c93	fix: fix context preview for git (#9403 )	2026-02-24 10:52:07 -08:00
Sarah Wooders	bbc648909b	refactor: drop memory/ prefix from git memory repo file paths and update core memory rendering [LET-7356] (#9395 )	2026-02-24 10:52:07 -08:00
Ari Webb	5fd5a6dd07	feat: add new azure api maintaining backward compat (#9387 ) * feat: add new azure provider type * fix context window	2026-02-24 10:52:07 -08:00
jnjpng	226df8baef	fix: propagate context window exceeded from chatgpt oauth client (#9393 ) * base * clean up * fixes	2026-02-24 10:52:07 -08:00
Ari Webb	c08b67a26a	feat: add ToolReturnCreate to MessageCreateParams [LET-7366] (#9385 ) * fix: add ToolReturnCreate to sdk types * ci	2026-02-24 10:52:07 -08:00
Cameron	89a7a12b38	fix(core): remove send_message tool requirement from A2A messages (#9383 ) The A2A messaging tools were instructing receiving agents to use the send_message tool to reply, but that tool is often not attached to agents anymore. This caused agents confusion when they couldn't find the required tool. For synchronous functions (send_message_to_agent_and_wait_for_reply, send_message_to_agents_matching_tags, send_message_to_all_agents_in_group), the system already captures AssistantMessage automatically, so agents just need to respond normally. For the async/fire-and-forget function (send_message_to_agent_async), updated to indicate it's a one-way notification and hint that messaging tools exist without requiring a specific one. 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	6e0e1cc312	fix(core): validate run exists before creating step/step_metrics (#9382 ) Checks if the referenced run_id exists in the runs table before inserting steps and step_metrics. If the run doesn't exist (deleted or failed creation), sets run_id to None instead of hitting ForeignKeyViolationError on fk_steps_run_id. Fixes https://us5.datadoghq.com/error-tracking/issue/a1768774-d691-11f0-9330-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	4c753f3f3c	fix: handle non-JSON responses from LLM provider endpoints (#9362 ) When an OpenAI/Anthropic-compatible endpoint returns a non-JSON response (e.g. HTML error page), the SDK's paginated response parser falls back to returning a raw string. The post-parser then calls _set_private_attributes() on that string, causing an AttributeError. Add explicit AttributeError handling around SDK models.list() calls in provider check_api_key/list_llm_models_async methods, and add type guards in convert_response_to_chat_completion to reject raw strings before Pydantic model construction. Datadog: https://us5.datadoghq.com/error-tracking/issue/59a7a206-00b8-11f1-be73-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	71e0a8aab9	fix(core): use INSERT ON CONFLICT DO NOTHING for provider model sync (#9342 ) * fix(core): use INSERT ON CONFLICT DO NOTHING for provider model sync Replaces try/except around model.create_async() with pg_insert() .on_conflict_do_nothing() to prevent UniqueViolationError from being raised at the asyncpg driver level during concurrent model syncs. The previous approach caught the exception in Python but ddtrace still captured it at the driver level, causing Datadog error tracking noise. Fixes Datadog issue d8dec148-d535-11f0-95eb-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * cleaner impl * fix --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Ari Webb <ari@letta.com>	2026-02-24 10:52:07 -08:00
Charles Packer	b0e16ae50f	fix: surface GPT-5.3 Codex for ChatGPT OAuth providers (#9379 )	2026-02-24 10:52:07 -08:00
Sarah Wooders	526da4c49b	Revert "perf: optimize prefix caching by skipping system prompt rebuild on every step" (#9380 ) Revert "perf: optimize prefix caching by skipping system prompt rebuild on ev…" This reverts commit eafa4144c2577a45b7007a177b701863b98d1dfa.	2026-02-24 10:52:07 -08:00
Sarah Wooders	9dbe28e8f1	perf: optimize prefix caching by skipping system prompt rebuild on every step (#9080 )	2026-02-24 10:52:07 -08:00
Kian Jones	825019c2ce	fix(core): handle Anthropic streaming required ValueError (#9344 ) * Fix Anthropic ValueError for long-running operations Adds proper error handling for Anthropic SDK's streaming requirement. When operations may exceed 10 minutes, the SDK raises a ValueError. Changes: - Catch ValueError in sync request() method - Provide user-friendly error directing to async API - Async version already had this fix with streaming fallback Fixes Datadog issue 955d10b4-ed95-11f0-a5a5-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: use LLMBadRequestError instead of ValueError for Anthropic streaming constraint ValueError maps to HTTP 400 which incorrectly implies a bad client request. LLMBadRequestError maps to HTTP 502 (Bad Gateway) which correctly signals that the downstream provider (Anthropic) rejected the proxied request due to its own constraints. Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com> 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>	2026-02-24 10:52:07 -08:00
Kian Jones	14ef479e70	fix(core): handle empty content in Anthropic response gracefully (#9345 ) Fixes Datadog issue a47619fa-d5b8-11f0-9fd7-da7ad0900000 Handle empty content in Anthropic responses gracefully by replacing RuntimeError with LLMServerError. Now logs detailed debugging information (response ID, model, stop_reason) and returns a user-friendly error instead of crashing. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	4eb27e23b3	fix(core): add deadlock retry logic to ORM write operations (#9352 ) Adds automatic retry with exponential backoff for PostgreSQL deadlock errors (40P01) in all ORM write methods: create_async, update_async, batch_create_async, hard_delete_async, and bulk_hard_delete_async. For update_async, column values are snapshotted before the commit attempt so they can be restored after rollback clears them. Also adds DatabaseDeadlockError to _handle_dbapi_error as a fallback when retries are exhausted. Datadog: https://us5.datadoghq.com/error-tracking/issue/53ccdd7a-f0cc-11f0-8969-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>	2026-02-24 10:52:07 -08:00
Kian Jones	2c0cddf9f5	fix(core): handle Google 499 CANCELLED as client disconnect, not server error (#9363 ) The google.genai.errors.ClientError with code 499 (CANCELLED) indicates the client disconnected, not a server-side failure. Previously this fell through to the generic ClientError handler and was classified as LLMServerError, causing false 500s in Datadog error tracking. - Add explicit 499 handling in handle_llm_error: log at info level, return LLMConnectionError instead of LLMServerError - Catch 499 during stream iteration in stream_async and end gracefully instead of propagating the error Datadog: https://us5.datadoghq.com/error-tracking/issue/c8453aaa-d559-11f0-81c6-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	f20fdc73d1	fix(core): preserve Gemini thought_signature on function calls in non-streaming path (#9351 ) * fix(core): preserve Gemini thought_signature on function calls in non-streaming path The Google Gemini API requires thought_signature to be echoed back on function call parts in multi-turn conversations. In the non-streaming request path, the signature was only captured for subsequent function calls (else branch) but dropped for the first/only function call (if branch) in convert_response_to_chat_completion. This caused 400 INVALID_ARGUMENT errors on the next turn. Additionally, when no ReasoningContent existed to carry the signature (e.g. Gemini 2.5 Flash with include_thoughts=False), the signature was lost in the adapter layer. Now it falls through to TextContent. Datadog: https://us5.datadoghq.com/error-tracking/issue/17c4b114-d596-11f0-bcd6-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): preserve Gemini thought_signature in non-temporal agent path Carry reasoning_content_signature on TextContent in letta_agent.py at both locations where content falls through from reasoning (same fix already applied to the adapter and temporal activity paths). Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com> 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>	2026-02-24 10:52:07 -08:00
Kian Jones	32d87b70d7	fix(core): return 400 for ContextWindowExceededError instead of 500 (#9348 ) SystemPromptTokenExceededError (a subclass of ContextWindowExceededError) was hitting the generic 500 handler in Datadog. This is a client-side configuration issue (system prompt too large for context window), so it should return 400 Bad Request. Datadog: https://us5.datadoghq.com/error-tracking/issue/3360e252-d5f1-11f0-8467-da7ad0900000 Also covers: https://us5.datadoghq.com/error-tracking/issue/6e50c57c 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	47aedfa1a7	fix(core): convert MCP ConnectionError to LettaMCPConnectionError for proper HTTP 502 responses (#9364 ) MCP server connection failures were raising Python's builtin ConnectionError, which bypassed the LettaMCPConnectionError FastAPI exception handler and hit Datadog as unhandled 500 errors. Now all MCP client classes convert ConnectionError to LettaMCPConnectionError at the source, which the existing exception handler returns as a user-friendly 502. Datadog: https://us5.datadoghq.com/error-tracking/issue/93db4a82-fe5a-11f0-85f0-da7ad0900000 🐛 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	31d221b47e	fix: handle MCP tool schemas missing `properties` key gracefully (#9347 ) MCP tools can have an inputSchema like `{"type": "object", "additionalProperties": false}` with no `properties` key (valid for zero-arg tools). The hard assert on `"properties" in parameters_schema` caused an AssertionError. Replace assertions with safe defaults. Datadog: https://us5.datadoghq.com/error-tracking/issue/ecee9ee2-db75-11f0-ba66-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	cbbb6d776d	fix(core): unwrap ExceptionGroup in OAuth stream generator (#9346 ) Fixes Datadog issue 5efbb1d4-eec5-11f0-8f8e-da7ad0900000 Add ExceptionGroup unwrapping in OAuth stream exception handler. The bug was caused by ExceptionGroup not being caught by the general `except Exception` handler, since ExceptionGroup is a subclass of BaseException, not Exception. This caused TaskGroup errors to escape as unhandled ExceptionGroups in Datadog. The fix adds an explicit ExceptionGroup handler before the general Exception handler, following the same unwrapping pattern used in other parts of the codebase (mcp_tool_executor.py, base_client.py). 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	745dd1e124	fix(core): reject empty API keys in Bearer auth headers (#9350 ) Empty or None API keys resulted in "Bearer " header values which cause httpx.LocalProtocolError. Use truthiness checks instead of `is not None` to also reject empty strings before constructing Authorization headers. Datadog: https://us5.datadoghq.com/error-tracking/issue/ad3c1e38-d557-11f0-a65d-da7ad0900000 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	d592ec3135	fix: handle DBAPIError wrapping asyncpg DeadlockDetectedError (#9355 ) SQLAlchemy wraps asyncpg's DeadlockDetectedError in a DBAPIError, which was falling through to the generic 500 handler. Now detected at both the ORM level (_handle_dbapi_error) and FastAPI handler level, returning 409 with Retry-After header. Datadog: https://us5.datadoghq.com/error-tracking/issue/2f1dc54c-dab6-11f0-a828-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Charles Packer	f58c4a43fa	fix(core): remove "edit again if necessary" from memory tool return message (#9367 ) Models (especially Opus) take this instruction literally and re-call the memory edit tool in a loop — one user saw 96 consecutive rethink calls. Dropping the sentence stops the feedback loop while still asking the agent to review the result. 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	369cdf72c7	feat(core): store block metadata as YAML frontmatter in .md files (#9365 ) * feat(core): store block metadata as YAML frontmatter in .md files Block .md files in git repos now embed metadata (description, limit, read_only, metadata dict) as YAML frontmatter instead of a separate metadata/blocks.json file. Only non-default values are rendered. Format: --- description: "Who I am" limit: 5000 --- Block value content here... Changes: - New block_markdown.py utility (serialize_block / parse_block_markdown) - Updated all three write/read paths: manager.py, memfs_client.py, memfs_client_base.py - block_manager_git.py now passes description/limit/read_only/metadata through to git commits - Post-push sync (git_http.py) parses frontmatter and syncs metadata fields to Postgres - Removed metadata/blocks.json reads/writes entirely - Backward compat: files without frontmatter treated as raw value - Integration test verifies frontmatter in cloned files and metadata sync via git push 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: derive frontmatter defaults from BaseBlock schema, not hardcoded dict Remove _DEFAULTS dict from block_markdown.py. The core version now imports BaseBlock and reads field defaults via model_fields. This fixes the limit default (was 5000, should be CORE_MEMORY_BLOCK_CHAR_LIMIT=20000). Also: - memfs-py copy simplified to parse-only (no serialize, no letta imports) - All hardcoded limit=5000 fallbacks replaced with CORE_MEMORY_BLOCK_CHAR_LIMIT - Test updated: blocks with all-default metadata correctly have no frontmatter; frontmatter verified after setting non-default description via API 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: always include description and limit in frontmatter description and limit are always rendered in the YAML frontmatter, even when at their default values. 
Only read_only and metadata are conditional (omitted when at defaults). 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: resolve read_only from block_update before git commit read_only was using the old Postgres value instead of the update value when committing to git. Also adds integration test coverage for read_only: true appearing in frontmatter after API PATCH, and verifying it's omitted when false (default). 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test: add API→git round-trip coverage for description and limit Verifies that PATCH description/limit via API is reflected in frontmatter after git pull. Combined with the existing push→API test (step 6), this gives full bidirectional coverage: - API edit description/limit → pull → frontmatter updated - Push frontmatter with description/limit → API reflects changes 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
cthomas	9dee331e6c	fix(core): backfill missing blocks when git repo is incomplete (#9340 ) When enable_git_memory_for_agent is called on an agent that already has the git-memory-enabled tag, it was returning early if the repo existed, even if the repo was missing blocks. Now checks if all blocks are present in the repo and backfills any missing ones. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	2cfaba3fe6	fix(core): validate file existence before creating files_agents rows (#9341 ) Prevents ForeignKeyViolationError when attaching files to agents where the file has been deleted between listing and attachment (race condition). Now validates file IDs exist in the files table before inserting, and skips any missing files with a warning log. Fixes Datadog issue a1768774-d691-11f0-9330-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	cb982cae89	fix(core): skip duplicate block attachment to prevent UniqueViolationError (#9343 ) Check if a block is already attached to an agent before appending to core_memory. Prevents asyncpg UniqueViolationError on the unique_agent_block constraint when attach_block_async is called twice with the same (agent_id, block_id) pair. Fixes Datadog issue d8dec148-d535-11f0-95eb-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	93e453ef8f	fix(core): transform nested block labels on git memory enable (#9339 ) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	58069d760e	chore: update pgvector Docker image to official pgvector/pgvector (#9336 ) chore: update pgvector image from ankane/pgvector:v0.8.1 to pgvector/pgvector:0.8.1-pg18-trixie	2026-02-24 10:52:06 -08:00
Kian Jones	98fa16899b	fix(core): handle ExceptionGroup-wrapped ToolError in MCP clients (#9329 ) Extends the ExceptionGroup unwrapping fix from mcp_tool_executor to the base MCP client implementations (AsyncBaseMCPClient, AsyncFastMCPSSEClient, AsyncFastMCPStreamableHTTPClient). When ToolError exceptions are wrapped in ExceptionGroup by Python's async TaskGroup, the exception handler now unwraps single-exception groups before checking class names. This prevents wrapped ToolError exceptions from being logged to Datadog as unexpected errors instead of being handled as expected validation failures. Related to commit 1cbf1b231 which fixed the same issue in mcp_tool_executor. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Kian Jones	e7039470e9	fix(core): handle ExceptionGroup-wrapped ToolError and McpError in MCP tool execution (#9328 ) * fix(core): handle ExceptionGroup-wrapped ToolError and McpError in MCP tool execution Fixes 3 related Datadog bugs (all fastmcp.exceptions.ToolError): - 75d43daa-ff04-11f0-81b2-da7ad0900000 - 7af6373e-0080-11f1-9855-da7ad0900000 - a322edc8-fffa-11f0-b26c-da7ad0900000 These errors were caused by ToolError and McpError exceptions bubbling up unhandled from the MCP REST endpoint. This fix combines the approaches from PRs #9320 and #9321: 1. Handle ExceptionGroup wrapping (Python 3.11+ async TaskGroup) 2. Check for ToolError by class name to handle module variations 3. Convert ToolError to LettaInvalidArgumentError for proper client response 4. Catch McpError and return HTTP 500 with proper error message Issue-IDs: 75d43daa-ff04-11f0-81b2-da7ad0900000, 7af6373e-0080-11f1-9855-da7ad0900000, a322edc8-fffa-11f0-b26c-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: return 422 instead of 500 for McpError (user config issue) * fix: use LettaMCPConnectionError instead of HTTPException for McpError --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
cthomas	70d749e859	fix(core): add retry with backoff for post-push sync race condition (#9335 ) When git push completes, the webhook fires immediately but GCS upload may still be in progress. This causes KeyError when trying to read commit objects that haven't been uploaded yet. Add retry with exponential backoff (1s, 2s, 4s) to handle this race. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Kian Jones	d48932bdb6	fix(core): sanitize Unicode surrogates in all LLM client requests (#9323 ) Multiple OpenAI-compatible LLM clients (Azure, Deepseek, Groq, Together, XAI, ZAI) and Anthropic-compatible clients (Anthropic, MiniMax, Google Vertex) were overriding request_async/stream_async without calling sanitize_unicode_surrogates, causing UnicodeEncodeError when message content contained lone UTF-16 surrogates. Root cause: Child classes override parent methods but omit the sanitization step that the base OpenAIClient includes. This allows corrupted Unicode (unpaired surrogates from malformed emoji) to reach the httpx layer, which rejects it during UTF-8 encoding. Fix: Import and call sanitize_unicode_surrogates in all overridden request methods. Also removed duplicate sanitize_unicode_surrogates definition from openai_client.py that shadowed the canonical implementation in letta.helpers.json_helpers. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> Issue-ID: 10c0f2e4-f87b-11f0-b91c-da7ad0900000	2026-02-24 10:52:06 -08:00
Kian Jones	69fc934135	fix(core): handle transient database connection errors with retry logic (#9324 ) Add retry mechanism for ConnectionError during asyncpg SSL handshake failures. Implements exponential backoff (3 attempts) and returns 503 on exhaustion. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> Issue-ID: 8caf1136-0200-11f1-8f4d-da7ad0900000	2026-02-24 10:52:06 -08:00
cthomas	05b77a5fed	fix(core): update memfs client to use memory/ directory (#9322 ) PR #9309 changed the block storage from blocks/ to memory/ directory. Update memfs_client.py and memfs_client_base.py to match. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Kian Jones	b0c40b6b1d	fix: multi_agent flaky test (#9314 ) * fix(core): handle PermissionDeniedError in provider API key validation Fixed OpenAI PermissionDeniedError being raised as unknown error when validating provider API keys. The check_api_key methods in OpenAI-based providers (OpenAI, OpenRouter, Azure, Together) now properly catch and re-raise PermissionDeniedError as LLMPermissionDeniedError. 🐛 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): handle Unicode surrogates in OpenAI requests Sanitize invalid UTF-16 surrogates before sending requests to OpenAI API. Fixes UnicodeEncodeError when message content contains unpaired surrogates from corrupted emoji data or malformed Unicode sequences. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * try to fix * revert random stuff * revert some stuff --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Kian Jones	662ec082cf	fix(core): handle MCP errors and API key whitespace (#9306 ) * fix: strip whitespace from API keys in LLM client headers Fixes httpx.LocalProtocolError when API keys contain leading/trailing whitespace. Strips whitespace from API keys before using them in HTTP headers across: - OpenAI client (openai.py) - Mistral client (mistral.py) - Anthropic client (anthropic_client.py) - Anthropic schema provider (schemas/providers/anthropic.py) - Google AI client (google_ai_client.py) - Proxy helpers (proxy_helpers.py) 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: handle McpError gracefully in MCP client execute_tool Return error as failed result instead of re-raising to avoid Datadog alerts for expected user-facing errors like missing tool arguments. * fix: strip whitespace from API keys before passing to httpx client Fixes httpx.LocalProtocolError by stripping leading/trailing whitespace from API keys before passing them to OpenAI/AsyncOpenAI clients. The OpenAI client library constructs Authorization headers internally, and invalid header values (like keys with leading spaces) cause protocol errors. Applied fix to: - azure_client.py (AzureOpenAI/AsyncAzureOpenAI) - deepseek_client.py (OpenAI/AsyncOpenAI) - openai_client.py (OpenAI/AsyncOpenAI via kwargs) - xai_client.py (OpenAI/AsyncOpenAI) 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: handle JSONDecodeError in OpenAI client requests Catches json.JSONDecodeError from OpenAI SDK when API returns invalid JSON (typically HTML error pages from 500-series errors) and converts to LLMServerError with helpful details. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): strip API key whitespace at schema level on write/create Add field_validator to ProviderCreate, ProviderUpdate, and ProviderCheck schemas to strip whitespace from api_key and access_key fields before persistence. This ensures keys are clean at the point of entry, preventing whitespace from being encrypted and stored in the database. Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com> * refactor: remove api_key.strip() calls across all LLM clients Remove redundant .strip() calls on api_key parameters since pydantic models now handle whitespace trimming at the validation layer. This centralizes the validation logic and follows DRY principles. - Updated 13 files across multiple LLM client implementations - Removed 34 occurrences of api_key.strip() - Includes: OpenAI, Anthropic, Azure, Google AI, Groq, XAI, DeepSeek, ZAI, Together, Mistral - Also updated proxy helpers and provider schemas 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: remove redundant ternary operators from api_key parameters Remove `if api_key else None` ternaries since pydantic validation ensures api_key is either a valid string or None. The ternary was defensive programming that's now unnecessary with proper model-level validation. - Simplified 23 occurrences across 7 files - Cleaner, more concise client initialization code - No behavioral change since pydantic already handles this 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>	2026-02-24 10:52:06 -08:00
cthomas	126d8830b8	feat: set memfs env vars in deploy wf (#9318 )	2026-02-24 10:52:06 -08:00
cthomas	0bdd555f33	feat: add memfs-py service (#9315 ) * feat: add memfs-py service * add tf for bucket access and secrets v2 access * feat(memfs): add helm charts, deploy workflow, and bug fixes - Add dev helm chart (helm/dev/memfs-py/) with CSI secrets pattern - Update prod helm chart with CSI secrets and correct service account - Add GitHub Actions deploy workflow - Change port from 8284 to 8285 to avoid conflict with core's dulwich sidecar - Fix chunked transfer encoding issue (strip HTTP_TRANSFER_ENCODING header) - Fix timestamp parsing to handle both ISO and HTTP date formats - Fix get_head_sha to raise FileNotFoundError on 404 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Kian Jones <kian@letta.com> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Sarah Wooders	21e880907f	feat(core): structure memory directory and block labels [LET-7336] (#9309 )	2026-02-24 10:52:06 -08:00
Kian Jones	3709be28dd	fix(core): handle Google GenAI validation errors (#9307 ) * fix: handle const keyword in google genai tool schemas * fix: handle pydantic ValidationError in Google GenAI client Fixes Datadog error tracking issue where pydantic_core.ValidationError was raised when tool schemas contained unsupported fields (e.g., 'const', 'default', 'additionalProperties'). Changes: - Add error handling for pydantic ValidationError in request(), request_async(), and stream_async() - Convert validation errors to LLMBadRequestError with helpful error message - Deep copy tool parameters before cleaning to avoid modifying shared objects - Add imports for pydantic_core and copy module This prevents unhandled exceptions and provides better diagnostics when tool schemas contain fields not supported by Google AI API.	2026-02-24 10:52:06 -08:00
Kian Jones	be60697a62	fix(core): handle protocol errors and foreign key violations (#9308 ) * fix(core): handle PermissionDeniedError in provider API key validation Fixed OpenAI PermissionDeniedError being raised as unknown error when validating provider API keys. The check_api_key methods in OpenAI-based providers (OpenAI, OpenRouter, Azure, Together) now properly catch and re-raise PermissionDeniedError as LLMPermissionDeniedError. 🐛 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): handle Unicode surrogates in OpenAI requests Sanitize invalid UTF-16 surrogates before sending requests to OpenAI API. Fixes UnicodeEncodeError when message content contains unpaired surrogates from corrupted emoji data or malformed Unicode sequences. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): handle MCP tool schema validation errors gracefully Catch fastmcp.exceptions.ToolError in execute_mcp_tool endpoint and convert to LettaInvalidArgumentError (400) instead of letting it propagate as 500 error. This is an expected user error when tool arguments don't match the MCP tool's schema. Fixes Datadog issue 8f2d874a-f8e5-11f0-9b25-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): handle ExceptionGroup-wrapped ToolError in MCP executor When MCP tools fail with validation errors (e.g., missing required parameters), fastmcp raises ToolError exceptions that may be wrapped in ExceptionGroup by Python's async TaskGroup. The exception handler now unwraps single-exception groups before checking if the error should be handled gracefully. Fixes Calendly API "organization parameter missing" errors being logged to Datadog instead of returning friendly error messages to users. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: handle missing agent in create_conversation to prevent foreign key violation * Update .gitignore --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Ari Webb	85ee7ed7b4	fix: anthropic tool sanitation (#9310 )	2026-02-24 10:52:06 -08:00
Kian Jones	6f746c5225	fix(core): handle Anthropic overloaded errors and Unicode encoding issues (#9305 ) * fix: handle Anthropic overloaded_error in streaming interfaces * fix: handle Unicode surrogates in OpenAI requests Sanitize Unicode surrogate pairs before sending requests to OpenAI API. Surrogate pairs (U+D800-U+DFFF) are UTF-16 encoding artifacts that cause UnicodeEncodeError when encoding to UTF-8. Fixes Datadog error: 'utf-8' codec can't encode character '\ud83c' in position 326605: surrogates not allowed * fix: handle UnicodeEncodeError from lone Unicode surrogates in OpenAI requests Improved sanitize_unicode_surrogates() to explicitly filter out lone surrogate characters (U+D800 to U+DFFF) which are invalid in UTF-8. Previous implementation used errors='ignore' which could still fail in edge cases. New approach directly checks Unicode code points and removes any surrogates before data reaches httpx encoding. Also added sanitization to stream_async_responses() method which was missing it. Fixes: 'utf-8' codec can't encode character '\ud83c' in position X: surrogates not allowed	2026-02-24 10:52:06 -08:00
Sarah Wooders	93249b96f5	test(core): git-backed memory repo integration (real object store) (#9298 ) --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
jnjpng	0bdedb3c0f	feat: agent generate endpoint (#9304 ) * base * update * clean up * update	2026-02-24 10:52:06 -08:00

7148 Commits