Commit Graph

4831 Commits

Author SHA1 Message Date
Kevin Lin
7ec2783ded fix: increase SGLang provider default context window from 8k to 32k (#9435) 2026-02-24 10:52:07 -08:00
Kian Jones
e3dbb44fc9 feat(telem): support reading from clickhouse traces (#9431)
draft
2026-02-24 10:52:07 -08:00
Kian Jones
7c65fd77f1 fix(core): return 400 for Google GenAI ClientError bad requests (#9357)
Google genai.errors.ClientError with code 400 was being caught and
wrapped as LLMBadRequestError but returned to clients as 502 because
no dedicated FastAPI exception handler existed for LLMBadRequestError.

- Add LLMBadRequestError exception handler in app.py returning HTTP 400
- Fix ErrorCode on Google 400 bad requests from INTERNAL_SERVER_ERROR
  to INVALID_ARGUMENT
- Route Google API errors through handle_llm_error in stream_async path

Datadog: https://us5.datadoghq.com/error-tracking/issue/4eb3ff3c-d937-11f0-8177-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
3634464251 fix(core): handle anyio.BrokenResourceError for client disconnects (#9358)
Catch BrokenResourceError alongside ClosedResourceError in streaming
response, logging middleware, and app exception handlers so client
disconnects are logged at info level instead of surfacing as 500s.

Datadog: https://us5.datadoghq.com/error-tracking/issue/4f57af0c-d558-11f0-a65d-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
0d42afa151 fix(core): catch LockNotAvailableError and return 409 instead of 500 (#9359)
Re-apply changes on top of latest main to resolve merge conflicts.

- Add DatabaseLockNotAvailableError custom exception in orm/errors.py
- Catch asyncpg LockNotAvailableError and pgcode 55P03 in _handle_dbapi_error
- Register FastAPI exception handler returning 409 with Retry-After header



🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
411bb63990 fix(core): improve error handling for upstream LLM provider errors (#9423)
Handle HTML error responses from ALB/load balancers in OpenAI client and
add explicit InternalServerError handling for Anthropic upstream issues.

🐛 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
6d4e320cc3 chore: clean up dead code (#9427)
remove striaght up unsued code files
2026-02-24 10:52:07 -08:00
Kian Jones
2568c02b51 fix(core): strip quotes from MCP server header keys and values (#9349)
* fix(core): strip quotes from MCP server header keys and values

Users pasting JSON-formatted env vars into MCP server config end up with
quoted header names like `"CONTEXT7_API_KEY":` which causes
httpx.LocalProtocolError. Sanitize keys (strip surrounding quotes and
trailing colons) and values (strip surrounding quotes) in
resolve_custom_headers, resolve_environment_variables for HTTP configs,
and stdio env dicts.

Datadog: https://us5.datadoghq.com/error-tracking/issue/4a2f4af6-f2d8-11f0-930c-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: revert stdio env sanitization to pass-through

The stdio path doesn't need header/env sanitization - that's only
relevant for SSE/streamable HTTP servers with auth headers.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
ddcfeb26b1 fix(core): catch all MCP tool execution errors instead of re-raising (#9419)
* fix(core): catch all MCP tool execution errors instead of re-raising

MCP tools are external user-configured servers - any failure during
tool execution is expected and should be returned as (error_msg, False)
to the agent, not raised as an exception that hits Datadog as a 500.

Previously:
- base_client.py only caught McpError/ToolError, re-raised everything else
- fastmcp_client.py (both SSE and StreamableHTTP) always re-raised

Now all three execute_tool() methods catch all exceptions and return
the error message to the agent conversation. The agent handles tool
failures via the error message naturally.

This silences ~15 Datadog issue types including:
- fastmcp.exceptions.ToolError (validation, permissions)
- mcp.shared.exceptions.McpError (connection closed, credentials)
- httpx.HTTPStatusError (503 from Zapier, etc.)
- httpx.ConnectError, ReadTimeout, RemoteProtocolError
- requests.exceptions.ConnectionError
- builtins.ConnectionError

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): log unexpected MCP errors at warning level with traceback

Expected MCP errors (ToolError, McpError, httpx.*, ConnectionError, etc.)
log at info level. Anything else (e.g. TypeError, AttributeError from
our own code) logs at warning with exc_info=True so it still surfaces
in Datadog without crashing the request.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
382e216cbb fix(core): differentiate BYOK vs base provider in all LLM error details (#9425)
Add is_byok flag to every LLMError's details dict returned from
handle_llm_error across all providers (OpenAI, Anthropic, Google,
ChatGPT OAuth). This enables observability into whether errors
originate from Letta's production keys or user-provided BYOK keys.

The rate limit handler in app.py now returns a more helpful message
for BYOK users ("check your provider's rate limits and billing")
versus the generic message for base provider rate limits.

Datadog issues:
- https://us5.datadoghq.com/error-tracking/issue/b711c824-f490-11f0-96e4-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/76623036-f4de-11f0-8697-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/43e9888a-dfcf-11f0-a645-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
424a1ada64 fix: google gen ai format error fix (#9147)
* google gen ai format error fix

* fix(core): add $ref safety net, warning log, and unit tests for Google schema resolution

- Add `$ref` to unsupported_keys in `_clean_google_ai_schema_properties` so unresolvable refs (e.g. `#/properties/...` style) are stripped as a safety net instead of crashing the Google SDK
- Add warning log when `_resolve_json_schema_refs` encounters a ref it cannot resolve
- Deduplicate the `#/$defs/` and `#/definitions/` resolution branches
- Add 11 unit tests covering: single/multiple $defs, nested refs, refs in anyOf/allOf, array items, definitions key, unresolvable refs, and the full resolve+clean pipeline

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
02f776b016 fix: handle system messages with mixed TextContent + ImageContent (#9418)
* fix: handle system messages with mixed TextContent + ImageContent

System messages injected by external tools (e.g. packify.ai MCP) can
contain both TextContent and ImageContent. The assertions in
to_openai_responses_dicts and to_anthropic_dict expected exactly one
TextContent, causing AssertionError in production.

Extract all text parts and join them, matching how to_openai_dict
already handles this case.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: replace asserts with logger.warning + graceful skip

Asserts are the wrong tool for production input validation — if a
system message has only non-text content, we should warn and skip
rather than crash the request.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
a00270d199 fix(core): handle UTF-8 surrogate characters in API responses (#9422)
* fix(core): handle UTF-8 surrogate characters in API responses

LLM responses or user input can contain surrogate characters (U+D800-U+DFFF)
which are valid Python strings but illegal in UTF-8. ORJSONResponse rejects
them with "str is not valid UTF-8: surrogates not allowed". Add
SafeORJSONResponse that catches the TypeError and strips surrogates before
retrying serialization.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: reuse sanitize_unicode_surrogates from json_helpers

Replace the inline _sanitize_surrogates function with the existing
sanitize_unicode_surrogates helper from letta.helpers.json_helpers,
which is already used across all LLM clients.

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
jnjpng
39b25a0e3c fix: update ContextWindowCalculator to parse new system message sections (#9398)
* fix: update ContextWindowCalculator to parse new system message sections

The context window calculator was using outdated position-based parsing
that only handled 3 sections (base_instructions, memory_blocks, memory_metadata).
The actual system message now includes additional sections that were not
being tracked:

- <memory_filesystem> (git-enabled agents)
- <tool_usage_rules> (when tool rules configured)
- <directories> (when sources attached)

Changes:
- Add _extract_tag_content() helper for proper XML tag extraction
- Rewrite extract_system_components() to return a Dict with all 6 sections
- Update calculate_context_window() to count tokens for new sections
- Add new fields to ContextWindowOverview schema with backward-compatible defaults
- Add unit tests for the extraction logic

* update

* generate

* fix: check attached file in directories section instead of core_memory

Files are rendered inside <directories> tags, not <memory_blocks>.
Update validate_context_window_overview assertions accordingly.

* fix: address review feedback for context window parser

- Fix git-enabled agents regression: capture bare file blocks
  (e.g. <system/human.md>) rendered after </memory_filesystem> as
  core_memory via new _extract_git_core_memory() method
- Make _extract_top_level_tag robust: scan all occurrences to find
  tag outside container, handling nested-first + top-level-later case
- Document system_prompt tag inconsistency in docstring
- Add TODO to base_agent.py extract_dynamic_section linking to
  ContextWindowCalculator to flag parallel parser tech debt
- Add tests: git-enabled agent parsing, dual-occurrence tag
  extraction, pure text system prompt, git-enabled integration test
2026-02-24 10:52:07 -08:00
Sarah Wooders
2ffef0fb31 Fix git-memory context preview parsing (#9414)
* fix(core): handle git memory label prefix collisions in filesystem view

Prevent context window preview crashes when a block label is both a leaf and a prefix (e.g. system/human and system/human/context) by rendering a node as both file and directory. Add regression test.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): parse git-backed core memory in context window preview

ContextWindowCalculator.extract_system_components now detects git-backed memory rendering (<memory_filesystem> and <system/...> tags) when <memory_blocks> wrapper is absent, so core_memory is populated in the context preview. Add regression tests.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Shubham Naik
ca32311b9a feat: allow users to specify via query to stip messages [LET-7392] (#9411)
* feat: allow users to specify via query to stip messages

* chore: regenerate API SDK and OpenAPI spec [LET-7392]

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Ari Webb <AriWebb@users.noreply.github.com>
Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
jnjpng
6f51fa74be fix: handling should continue for non system exceeded exceptions (#9406)
* base

* add log
2026-02-24 10:52:07 -08:00
Sarah Wooders
0dde155e9a feat: Prefix cache optimization system prompt (#9381) 2026-02-24 10:52:07 -08:00
Kian Jones
7eb85707b1 feat(tf): gpu runners and prod memory_repos (#9283)
* add gpu runners and prod memory_repos

* add lmstudio and vllm in model_settings

* fix llm_configs and change variable name in reusable workflow and change perms for memory_repos to admin in tf

* fix: update self-hosted provider tests to use SDK 1.0 and v2 tests

- Update letta-client from ==0.1.324 to >=1.0.0
- Switch ollama/vllm/lmstudio tests to integration_test_send_message_v2.py

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use openai provider_type for self-hosted model settings

ollama/vllm/lmstudio are not valid provider_type values in the SDK
model_settings schema - they use openai-compatible APIs so provider_type
should be openai. The provider routing is determined by the handle prefix.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use openai_compat_base_url for ollama/vllm/lmstudio providers

When reconstructing LLMConfig from a model handle lookup, use the
provider's openai_compat_base_url (which includes /v1) instead of
raw base_url. This fixes 404 errors when calling ollama/vllm/lmstudio
since OpenAI client expects /v1/chat/completions endpoint.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: enable redis for ollama/vllm/lmstudio tests

Background streaming tests require Redis. Add use-redis: true to
self-hosted provider test workflows.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* add memfs-py in prod bucket access

* change ollama

* change packer model defaults

* self-hosted provider support

* diasble reasoner to match the number of messages in test case, enable parallel tool calls, and pass embedding configs

* remove reasoning setting not supported for ollama

* add qwen3 to extra assistant message case

* lower temp

* prep for lmstudio and vllm

* used lmstudio_openai client

* skip parallel tool calls on cpu ran provider lmstudio

* revert downgrade since it's so slow already

* add reuired flags for tool call parsing etc.

* change tool call parser from hermes to qwen3_xml

* qwen3_xmlk -> qwen3_coder

* upgrade vllm to latest container

* revert to hermes (incompatible with parallel tool calls?) and skipping vllm tests on parallel tool calls

* install uv redis extra

* remove lmstudio

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kevin Lin
23c94ec6d3 feat: add log probabilities from OpenAI-compatible servers and SGLang native endpoint (#9240)
* Add log probabilities support for RL training

This enables Letta server to request and return log probabilities from
OpenAI-compatible providers (including SGLang) for use in RL training.

Changes:
- LLMConfig: Add return_logprobs and top_logprobs fields
- OpenAIClient: Set logprobs in ChatCompletionRequest when enabled
- LettaLLMAdapter: Add logprobs field and extract from response
- LettaResponse: Add logprobs field to return log probs to client
- LettaRequest: Add return_logprobs/top_logprobs for per-request override
- LettaAgentV3: Store and pass logprobs through to response
- agents.py: Handle request-level logprobs override

Usage:
  response = client.agents.messages.create(
      agent_id=agent_id,
      messages=[...],
      return_logprobs=True,
      top_logprobs=5,
  )
  print(response.logprobs)  # Per-token log probabilities

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* Add multi-turn token tracking for RL training via SGLang native endpoint

- Add TurnTokenData schema to track token IDs and logprobs per turn
- Add return_token_ids flag to LettaRequest and LLMConfig
- Create SGLangNativeClient for /generate endpoint (returns output_ids)
- Create SGLangNativeAdapter that uses native endpoint
- Modify LettaAgentV3 to accumulate turns across LLM calls
- Include turns in LettaResponse when return_token_ids=True

* Fix: Add SGLang native adapter to step() method, not just stream()

* Fix: Handle Pydantic Message objects in SGLang native adapter

* Fix: Remove api_key reference from LLMConfig (not present)

* Fix: Add missing 'created' field to ChatCompletionResponse

* Add full tool support to SGLang native adapter

- Format tools into prompt in Qwen-style format
- Parse tool calls from <tool_call> tags in response
- Format tool results as <tool_response> in user messages
- Set finish_reason to 'tool_calls' when tools are called

* Use tokenizer.apply_chat_template for proper tool formatting

- Add tokenizer caching in SGLang native adapter
- Use apply_chat_template when tokenizer available
- Fall back to manual formatting if not
- Convert Letta messages to OpenAI format for tokenizer

* Fix: Use func_response instead of tool_return for ToolReturn content

* Fix: Get output_token_logprobs from meta_info in SGLang response

* Fix: Allow None in output_token_logprobs (SGLang format includes null)

* chore: remove unrelated files from logprobs branch

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add missing call_type param to adapter constructors in letta_agent_v3

The SGLang refactor dropped call_type=LLMCallType.agent_step when extracting
adapter creation into conditional blocks. Restores it for all 3 spots (SGLang
in step, SimpleLLM in step, SGLang in stream).

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* just stage-api && just publish-api

* fix: update expected LLMConfig fields in schema test for logprobs support

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove rllm provider references

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* just stage-api && just publish-api

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-65-206.ec2.internal>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
f9f1c55c93 fix: fix context preview for git (#9403) 2026-02-24 10:52:07 -08:00
Sarah Wooders
bbc648909b refactor: drop memory/ prefix from git memory repo file paths and update core memory rendering [LET-7356] (#9395) 2026-02-24 10:52:07 -08:00
Ari Webb
5fd5a6dd07 feat: add new azure api maintaining backward compat (#9387)
* feat: add new azure provider type

* fix context window
2026-02-24 10:52:07 -08:00
jnjpng
226df8baef fix: propagate context window exceeded from chatgpt oauth client (#9393)
* base

* clean up

* fixes
2026-02-24 10:52:07 -08:00
Ari Webb
c08b67a26a feat: add ToolReturnCreate to MessageCreateParams [LET-7366] (#9385)
* fix: add ToolReturnCreate to sdk types

* ci
2026-02-24 10:52:07 -08:00
Cameron
89a7a12b38 fix(core): remove send_message tool requirement from A2A messages (#9383)
The A2A messaging tools were instructing receiving agents to use the
send_message tool to reply, but that tool is often not attached to
agents anymore. This caused agents confusion when they couldn't find
the required tool.

For synchronous functions (send_message_to_agent_and_wait_for_reply,
send_message_to_agents_matching_tags, send_message_to_all_agents_in_group),
the system already captures AssistantMessage automatically, so agents
just need to respond normally.

For the async/fire-and-forget function (send_message_to_agent_async),
updated to indicate it's a one-way notification and hint that messaging
tools exist without requiring a specific one.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
6e0e1cc312 fix(core): validate run exists before creating step/step_metrics (#9382)
Checks if the referenced run_id exists in the runs table before
inserting steps and step_metrics. If the run doesn't exist (deleted
or failed creation), sets run_id to None instead of hitting
ForeignKeyViolationError on fk_steps_run_id.

Fixes https://us5.datadoghq.com/error-tracking/issue/a1768774-d691-11f0-9330-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
4c753f3f3c fix: handle non-JSON responses from LLM provider endpoints (#9362)
When an OpenAI/Anthropic-compatible endpoint returns a non-JSON response
(e.g. HTML error page), the SDK's paginated response parser falls back
to returning a raw string. The post-parser then calls
_set_private_attributes() on that string, causing an AttributeError.

Add explicit AttributeError handling around SDK models.list() calls in
provider check_api_key/list_llm_models_async methods, and add type
guards in convert_response_to_chat_completion to reject raw strings
before Pydantic model construction.

Datadog: https://us5.datadoghq.com/error-tracking/issue/59a7a206-00b8-11f1-be73-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
71e0a8aab9 fix(core): use INSERT ON CONFLICT DO NOTHING for provider model sync (#9342)
* fix(core): use INSERT ON CONFLICT DO NOTHING for provider model sync

Replaces try/except around model.create_async() with pg_insert()
.on_conflict_do_nothing() to prevent UniqueViolationError from being
raised at the asyncpg driver level during concurrent model syncs.
The previous approach caught the exception in Python but ddtrace still
captured it at the driver level, causing Datadog error tracking noise.

Fixes Datadog issue d8dec148-d535-11f0-95eb-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* cleaner impl

* fix

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Ari Webb <ari@letta.com>
2026-02-24 10:52:07 -08:00
Charles Packer
b0e16ae50f fix: surface GPT-5.3 Codex for ChatGPT OAuth providers (#9379) 2026-02-24 10:52:07 -08:00
Sarah Wooders
526da4c49b Revert "perf: optimize prefix caching by skipping system prompt rebuild on every step" (#9380)
Revert "perf: optimize prefix caching by skipping system prompt rebuild on ev…"

This reverts commit eafa4144c2577a45b7007a177b701863b98d1dfa.
2026-02-24 10:52:07 -08:00
Sarah Wooders
9dbe28e8f1 perf: optimize prefix caching by skipping system prompt rebuild on every step (#9080) 2026-02-24 10:52:07 -08:00
Kian Jones
825019c2ce fix(core): handle Anthropic streaming required ValueError (#9344)
* Fix Anthropic ValueError for long-running operations

Adds proper error handling for Anthropic SDK's streaming requirement.
When operations may exceed 10 minutes, the SDK raises a ValueError.

Changes:
- Catch ValueError in sync request() method
- Provide user-friendly error directing to async API
- Async version already had this fix with streaming fallback

Fixes Datadog issue 955d10b4-ed95-11f0-a5a5-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use LLMBadRequestError instead of ValueError for Anthropic streaming constraint

ValueError maps to HTTP 400 which incorrectly implies a bad client request.
LLMBadRequestError maps to HTTP 502 (Bad Gateway) which correctly signals
that the downstream provider (Anthropic) rejected the proxied request due
to its own constraints.

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
14ef479e70 fix(core): handle empty content in Anthropic response gracefully (#9345)
Fixes Datadog issue a47619fa-d5b8-11f0-9fd7-da7ad0900000

Handle empty content in Anthropic responses gracefully by replacing RuntimeError with LLMServerError. Now logs detailed debugging information (response ID, model, stop_reason) and returns a user-friendly error instead of crashing.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
4eb27e23b3 fix(core): add deadlock retry logic to ORM write operations (#9352)
Adds automatic retry with exponential backoff for PostgreSQL deadlock
errors (40P01) in all ORM write methods: create_async, update_async,
batch_create_async, hard_delete_async, and bulk_hard_delete_async.

For update_async, column values are snapshotted before the commit
attempt so they can be restored after rollback clears them.

Also adds DatabaseDeadlockError to _handle_dbapi_error as a fallback
when retries are exhausted.

Datadog: https://us5.datadoghq.com/error-tracking/issue/53ccdd7a-f0cc-11f0-8969-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
2c0cddf9f5 fix(core): handle Google 499 CANCELLED as client disconnect, not server error (#9363)
The google.genai.errors.ClientError with code 499 (CANCELLED) indicates the
client disconnected, not a server-side failure. Previously this fell through
to the generic ClientError handler and was classified as LLMServerError,
causing false 500s in Datadog error tracking.

- Add explicit 499 handling in handle_llm_error: log at info level, return
  LLMConnectionError instead of LLMServerError
- Catch 499 during stream iteration in stream_async and end gracefully
  instead of propagating the error

Datadog: https://us5.datadoghq.com/error-tracking/issue/c8453aaa-d559-11f0-81c6-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
f20fdc73d1 fix(core): preserve Gemini thought_signature on function calls in non-streaming path (#9351)
* fix(core): preserve Gemini thought_signature on function calls in non-streaming path

The Google Gemini API requires thought_signature to be echoed back on
function call parts in multi-turn conversations. In the non-streaming
request path, the signature was only captured for subsequent function
calls (else branch) but dropped for the first/only function call (if
branch) in convert_response_to_chat_completion. This caused 400
INVALID_ARGUMENT errors on the next turn.

Additionally, when no ReasoningContent existed to carry the signature
(e.g. Gemini 2.5 Flash with include_thoughts=False), the signature was
lost in the adapter layer. Now it falls through to TextContent.

Datadog: https://us5.datadoghq.com/error-tracking/issue/17c4b114-d596-11f0-bcd6-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): preserve Gemini thought_signature in non-temporal agent path

Carry reasoning_content_signature on TextContent in letta_agent.py
at both locations where content falls through from reasoning (same
fix already applied to the adapter and temporal activity paths).

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
32d87b70d7 fix(core): return 400 for ContextWindowExceededError instead of 500 (#9348)
SystemPromptTokenExceededError (a subclass of ContextWindowExceededError)
was hitting the generic 500 handler in Datadog. This is a client-side
configuration issue (system prompt too large for context window), so it
should return 400 Bad Request.

Datadog: https://us5.datadoghq.com/error-tracking/issue/3360e252-d5f1-11f0-8467-da7ad0900000
Also covers: https://us5.datadoghq.com/error-tracking/issue/6e50c57c

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
47aedfa1a7 fix(core): convert MCP ConnectionError to LettaMCPConnectionError for proper HTTP 502 responses (#9364)
MCP server connection failures were raising Python's builtin ConnectionError,
which bypassed the LettaMCPConnectionError FastAPI exception handler and hit
Datadog as unhandled 500 errors. Now all MCP client classes convert
ConnectionError to LettaMCPConnectionError at the source, which the existing
exception handler returns as a user-friendly 502.

Datadog: https://us5.datadoghq.com/error-tracking/issue/93db4a82-fe5a-11f0-85f0-da7ad0900000

🐛 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
31d221b47e fix: handle MCP tool schemas missing properties key gracefully (#9347)
MCP tools can have an inputSchema like `{"type": "object", "additionalProperties": false}`
with no `properties` key (valid for zero-arg tools). The hard assert on `"properties" in
parameters_schema` caused an AssertionError. Replace assertions with safe defaults.

Datadog: https://us5.datadoghq.com/error-tracking/issue/ecee9ee2-db75-11f0-ba66-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
cbbb6d776d fix(core): unwrap ExceptionGroup in OAuth stream generator (#9346)
Fixes Datadog issue 5efbb1d4-eec5-11f0-8f8e-da7ad0900000

Add ExceptionGroup unwrapping in OAuth stream exception handler.

The bug was caused by ExceptionGroup not being caught by the general
`except Exception` handler, since ExceptionGroup is a subclass of
BaseException, not Exception. This caused TaskGroup errors to escape
as unhandled ExceptionGroups in Datadog.

The fix adds an explicit ExceptionGroup handler before the general
Exception handler, following the same unwrapping pattern used in
other parts of the codebase (mcp_tool_executor.py, base_client.py).

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
745dd1e124 fix(core): reject empty API keys in Bearer auth headers (#9350)
Empty or None API keys resulted in "Bearer " header values which cause
httpx.LocalProtocolError. Use truthiness checks instead of `is not None`
to also reject empty strings before constructing Authorization headers.

Datadog: https://us5.datadoghq.com/error-tracking/issue/ad3c1e38-d557-11f0-a65d-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
d592ec3135 fix: handle DBAPIError wrapping asyncpg DeadlockDetectedError (#9355)
SQLAlchemy wraps asyncpg's DeadlockDetectedError in a DBAPIError,
which was falling through to the generic 500 handler. Now detected
at both the ORM level (_handle_dbapi_error) and FastAPI handler level,
returning 409 with Retry-After header.

Datadog: https://us5.datadoghq.com/error-tracking/issue/2f1dc54c-dab6-11f0-a828-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Charles Packer
f58c4a43fa fix(core): remove "edit again if necessary" from memory tool return message (#9367)
Models (especially Opus) take this instruction literally and re-call
the memory edit tool in a loop — one user saw 96 consecutive rethink
calls. Dropping the sentence stops the feedback loop while still
asking the agent to review the result.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
369cdf72c7 feat(core): store block metadata as YAML frontmatter in .md files (#9365)
* feat(core): store block metadata as YAML frontmatter in .md files

Block .md files in git repos now embed metadata (description, limit,
read_only, metadata dict) as YAML frontmatter instead of a separate
metadata/blocks.json file. Only non-default values are rendered.

Format:
  ---
  description: "Who I am"
  limit: 5000
  ---
  Block value content here...

Changes:
- New block_markdown.py utility (serialize_block / parse_block_markdown)
- Updated all three write/read paths: manager.py, memfs_client.py,
  memfs_client_base.py
- block_manager_git.py now passes description/limit/read_only/metadata
  through to git commits
- Post-push sync (git_http.py) parses frontmatter and syncs metadata
  fields to Postgres
- Removed metadata/blocks.json reads/writes entirely
- Backward compat: files without frontmatter treated as raw value
- Integration test verifies frontmatter in cloned files and metadata
  sync via git push

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: derive frontmatter defaults from BaseBlock schema, not hardcoded dict

Remove _DEFAULTS dict from block_markdown.py. The core version now
imports BaseBlock and reads field defaults via model_fields. This
fixes the limit default (was 5000, should be CORE_MEMORY_BLOCK_CHAR_LIMIT=20000).

Also:
- memfs-py copy simplified to parse-only (no serialize, no letta imports)
- All hardcoded limit=5000 fallbacks replaced with CORE_MEMORY_BLOCK_CHAR_LIMIT
- Test updated: blocks with all-default metadata correctly have no frontmatter;
  frontmatter verified after setting non-default description via API

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: always include description and limit in frontmatter

description and limit are always rendered in the YAML frontmatter,
even when at their default values. Only read_only and metadata are
conditional (omitted when at defaults).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: resolve read_only from block_update before git commit

read_only was using the old Postgres value instead of the update value
when committing to git. Also adds integration test coverage for
read_only: true appearing in frontmatter after API PATCH, and
verifying it's omitted when false (default).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: add API→git round-trip coverage for description and limit

Verifies that PATCH description/limit via API is reflected in
frontmatter after git pull. Combined with the existing push→API
test (step 6), this gives full bidirectional coverage:
- API edit description/limit → pull → frontmatter updated
- Push frontmatter with description/limit → API reflects changes

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
cthomas
9dee331e6c fix(core): backfill missing blocks when git repo is incomplete (#9340)
When enable_git_memory_for_agent is called on an agent that already has
the git-memory-enabled tag, it was returning early if the repo existed,
even if the repo was missing blocks.

Now checks if all blocks are present in the repo and backfills any
missing ones.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
2cfaba3fe6 fix(core): validate file existence before creating files_agents rows (#9341)
Prevents ForeignKeyViolationError when attaching files to agents where
the file has been deleted between listing and attachment (race condition).
Now validates file IDs exist in the files table before inserting, and
skips any missing files with a warning log.

Fixes Datadog issue a1768774-d691-11f0-9330-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
cb982cae89 fix(core): skip duplicate block attachment to prevent UniqueViolationError (#9343)
Check if a block is already attached to an agent before appending to
core_memory. Prevents asyncpg UniqueViolationError on the
unique_agent_block constraint when attach_block_async is called twice
with the same (agent_id, block_id) pair.

Fixes Datadog issue d8dec148-d535-11f0-95eb-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
93e453ef8f fix(core): transform nested block labels on git memory enable (#9339)
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
98fa16899b fix(core): handle ExceptionGroup-wrapped ToolError in MCP clients (#9329)
Extends the ExceptionGroup unwrapping fix from mcp_tool_executor to the
base MCP client implementations (AsyncBaseMCPClient, AsyncFastMCPSSEClient,
AsyncFastMCPStreamableHTTPClient). When ToolError exceptions are wrapped
in ExceptionGroup by Python's async TaskGroup, the exception handler now
unwraps single-exception groups before checking class names.

This prevents wrapped ToolError exceptions from being logged to Datadog
as unexpected errors instead of being handled as expected validation failures.

Related to commit 1cbf1b231 which fixed the same issue in mcp_tool_executor.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00