Commit Graph

7201 Commits

Author SHA1 Message Date
Shubham Naik
3247fa7065 chore: fix favoriting bugs (#9505)
* chore: fix favoriting bugs

* chore: fix favoriting bugs

* chore: fix favoriting bugs
2026-02-24 10:55:11 -08:00
Shubham Naik
4a829123cd Listener mode (#9486)
* feat: listener mode

* feat: listener mode

* feat: listener mode

* feat: listener mode

* feat: listener mode

* feat: listener mode

* chore: merge

* feat: listen mode

* feat: add some keycontrols

* feat: add some keycontrols

* chore: hotwire fix for core

* chore: restore connection
2026-02-24 10:55:11 -08:00
Kian Jones
f5c4ab50f4 chore: add ty + pre-commit hook and repeal even more ruff rules (#9504)
* auto fixes

* auto fix pt2 and transitive deps and undefined var checking locals()

* manual fixes (ignored or letta-code fixed)

* fix circular import

* remove all ignores, add FastAPI rules and Ruff rules

* add ty and precommit

* ruff stuff

* ty check fixes

* ty check fixes pt 2

* error on invalid
2026-02-24 10:55:11 -08:00
Devansh Jain
39ddda81cc feat: add Anthropic Sonnet 4.6 (#9408) 2026-02-24 10:55:11 -08:00
Kian Jones
25d54dd896 chore: enable F821, F401, W293 (#9503)
* auto fixes

* auto fix pt2 and transitive deps and undefined var checking locals()

* manual fixes (ignored or letta-code fixed)

* fix circular import
2026-02-24 10:55:08 -08:00
Ari Webb
fa70e09963 Revert "fix(core): add OpenAI prompt cache key and model-gated 24h retention" (#9502)
Revert "fix(core): add OpenAI prompt cache key and model-gated 24h retention …"

This reverts commit f5bb9c629cb7d45544e90758cdfb899bcef41912.
2026-02-24 10:52:07 -08:00
Sarah Wooders
2bf3314cef fix: import asyncio for parallel tool calls (#9501) 2026-02-24 10:52:07 -08:00
Shubham Naik
20c71523f8 chore: hotwire fix for core (#9500) 2026-02-24 10:52:07 -08:00
Shubham Naik
e66981c7e8 feat: update undertaker to use rate limiter (#9498) 2026-02-24 10:52:07 -08:00
Charles Packer
619e81ed1e fix(core): add OpenAI prompt cache key and model-gated 24h retention (#9492)
* fix(core): apply OpenAI prompt cache settings to request payloads

Set prompt_cache_key using agent and conversation context on both Responses and Chat Completions request builders, and enable 24h retention only for supported OpenAI models while excluding OpenRouter paths.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): prefix prompt cache key with letta tag

Add a `letta:` prefix to generated OpenAI prompt_cache_key values so cache-related entries are easier to identify in provider-side logs and diagnostics.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* add integration test

* skip test

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Ari Webb <ari@letta.com>
2026-02-24 10:52:07 -08:00
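The cache-key scheme this commit describes can be sketched as a small helper. The function name and key layout below are assumptions for illustration, not the actual Letta request-builder code; the real key is derived from agent and conversation context and carries the `letta:` prefix so entries are identifiable in provider-side logs:

```python
from typing import Optional


def build_prompt_cache_key(agent_id: str, conversation_id: Optional[str]) -> str:
    # Hypothetical sketch: derive a stable prompt_cache_key from agent and
    # conversation context, prefixed with "letta:" so cache-related entries
    # stand out in OpenAI-side logs and diagnostics.
    suffix = conversation_id or "default"
    return f"letta:{agent_id}:{suffix}"
```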
jnjpng
5b001a7749 fix: rename ChatGPT server error to ChatGPT API error (#9497)
fix: rename ChatGPT server error to ChatGPT API error in error messages
2026-02-24 10:52:07 -08:00
jnjpng
fbc0bb60d9 fix: retry ChatGPT 502 and upstream connection errors with exponential backoff (#9495)
502s and upstream connection errors (envoy proxy failures) from ChatGPT
were not being retried. This classifies them as LLMConnectionError (retryable)
in both the streaming and non-streaming paths, and adds retry handling in
the non-streaming HTTPStatusError handler so 502s get the same exponential
backoff treatment as transport-level connection drops.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
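A minimal sketch of the classification and backoff schedule described above, under the assumption that gateway failures map to a retryable connection-error type; the names are illustrative, not the actual client code:

```python
class LLMConnectionError(Exception):
    """Retryable transport-level failure (sketch of the error type named above)."""


def classify_status(status_code: int) -> type:
    # Gateway failures from an upstream proxy (e.g. envoy 502s) are treated
    # as retryable connection errors rather than terminal API errors.
    if status_code in (502, 503, 504):
        return LLMConnectionError
    return Exception


def backoff_schedule(attempts: int = 3, base: float = 1.0, cap: float = 30.0) -> list:
    # Exponential backoff: base * 2^n per attempt, capped at `cap` seconds.
    return [min(cap, base * 2 ** n) for n in range(attempts)]
```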
Sarah Wooders
26cbdb7b7b fix(core): skip malformed send_message entries in message conversion (#9494)
Avoid failing message-list endpoints when historical send_message tool calls are missing the expected message argument by logging and skipping malformed entries during conversion.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Charles Packer
c32d53f8a3 fix(core): remove old static landing page from Docker image (#9369)
The "Experience the new ADE" page was outdated and no longer useful.
Root path now redirects to /docs (FastAPI Swagger UI) instead.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
amysguan
80a0d1a95f Add LLM client compaction errors to traces (#9474)
* add llm client errors to traces

* update response json for telemetry

* prevent silent failures and properly log errored responses in streaming path

* remove double logging

---------

Co-authored-by: Amy Guan <amy@letta.com>
Co-authored-by: Kian Jones <kian@letta.com>
2026-02-24 10:52:07 -08:00
Shubham Naik
2f76f2c629 Agent page update (#9475)
* feat: agent page updates

* feat: agent page updates

* feat: agent page updates

* feat: agent page updates

* feat: agent page updates

* feat: agent page updates

* chore: fix code

* chore: fix code
2026-02-24 10:52:07 -08:00
jnjpng
e3eafb1977 fix: re-raise LLMError before wrapping with handle_llm_error (#9482)
LLMError exceptions are already properly formatted errors that should
propagate directly. Without this check, they get unnecessarily wrapped
by handle_llm_error, losing their original error information.
2026-02-24 10:52:07 -08:00
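The guard this commit describes amounts to an isinstance check before wrapping; a minimal sketch (the wrapper shape is assumed, not copied from the codebase):

```python
class LLMError(Exception):
    pass


def handle_llm_error(exc: Exception) -> Exception:
    # Already-classified LLM errors propagate as-is; re-wrapping them would
    # downgrade specific subtypes and lose the original error information.
    if isinstance(exc, LLMError):
        return exc
    return LLMError(f"Unhandled LLM error: {exc}")
```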
Kian Jones
2f0294165c debug: log statement_timeout + connection pid on session checkout (#9472)
* debug: log statement_timeout + connection pid on every session checkout

Temporary instrumentation to diagnose why some PlanetScale connections
have statement_timeout=5s while others have 0 (disabled).

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* debug: log statement_timeout on every checkout, not just non-zero

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: rollback implicit transaction from debug query

The SELECT implicitly begins a transaction, causing "A transaction is
already begun" errors for code that calls session.begin() explicitly.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Ari Webb
0a8a8fda54 feat: add credit verification before agent message endpoints [LET-XXXX] (#9433)
* feat: add credit verification before agent message endpoints

Add credit verification checks to message endpoints to prevent
execution when organizations have insufficient credits.

- Add InsufficientCreditsError exception type
- Add CreditVerificationService that calls step-orchestrator API
- Add credit checks to /agents/{id}/messages endpoints
- Add credit checks to /conversations/{id}/messages endpoint

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* surface error in ade

* do per step instead

* parallel check

* parallel to step

* small fixes

* stage publish api

* fixes

* revert unnecessary frontend changes

* insufficient credits stop reason

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Ari Webb
5faec5632f fix: add m2.5 (#9480)
* fix: add m2.5

* fix test
2026-02-24 10:52:07 -08:00
amysguan
9bec8c64f5 New prompts/defaults for sliding_window and all compaction (#9444)
* new prompts for sliding window and all compaction + defaults to corresponding prompt

* regenerate api spec

---------

Co-authored-by: Amy Guan <amy@letta.com>
2026-02-24 10:52:07 -08:00
github-actions[bot]
0b08164cc2 fix: update system prompt metadata label to "System prompt last recompiled" (#9477)
fix: update system prompt metadata label from "Memory blocks were last modified" to "System prompt last recompiled"

When git-based memory is enabled, there are no memory blocks, so the label
"Memory blocks were last modified" is inaccurate. Changed to
"System prompt last recompiled" which accurately reflects the timestamp meaning.

Fixes #9476



🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
f55ff3a822 fix(core): descriptive error for empty message_ids instead of IndexError (#9464)
fix(core): replace IndexError with descriptive error for empty message_ids

Agent with no in-context messages (empty/None message_ids) would crash
with a cryptic IndexError on message_ids[0]. Now raises a clear
LettaError explaining the system message was not initialized.

Datadog: https://us5.datadoghq.com/error-tracking/issue/6c061c28-0830-11f1-b060-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
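The fix boils down to a guard clause in front of the `message_ids[0]` access; a hypothetical sketch of the described behavior:

```python
class LettaError(Exception):
    pass


def first_system_message_id(message_ids):
    # Empty or None message_ids previously crashed with a cryptic
    # IndexError; raise a descriptive error instead.
    if not message_ids:
        raise LettaError(
            "Agent has no in-context messages; the system message was not initialized."
        )
    return message_ids[0]
```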
Kian Jones
80f34f134d fix(core): catch bare openai.APIError in handle_llm_error (#9468)
* fix(core): catch bare openai.APIError in handle_llm_error fallthrough

openai.APIError raised during streaming (e.g. OpenRouter credit
exhaustion) is not an APIStatusError, so it skipped the catch-all
at the end and fell through to LLMError("Unhandled"). Now bare
APIErrors that aren't context window overflows are mapped to
LLMBadRequestError.

Datadog: https://us5.datadoghq.com/error-tracking/issue/7a2c356c-0849-11f1-be66-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat(core): add LLMInsufficientCreditsError for BYOK credit exhaustion

Adds dedicated error type for insufficient credits/quota across all
providers (OpenAI, Anthropic, Google). Returns HTTP 402 with
BYOK-aware messaging instead of generic 400.

- New LLMInsufficientCreditsError class and PAYMENT_REQUIRED ErrorCode
- is_insufficient_credits_message() helper detecting credit/quota strings
- All 3 provider clients detect 402 status + credit keywords
- FastAPI handler returns 402 with "your API key" vs generic messaging
- 5 new parametrized tests covering OpenRouter, OpenAI, and negative case

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
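The `is_insufficient_credits_message()` helper mentioned above might look roughly like this; the keyword list is an assumption for illustration, and the real detection strings may differ:

```python
CREDIT_KEYWORDS = (
    "insufficient credits",
    "insufficient quota",
    "credit balance",
    "quota exceeded",
)


def is_insufficient_credits_message(message: str) -> bool:
    # Case-insensitive substring match against known credit/quota phrasings
    # used by providers such as OpenRouter, OpenAI, Anthropic, and Google.
    lowered = message.lower()
    return any(keyword in lowered for keyword in CREDIT_KEYWORDS)
```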
Ari Webb
cfd2ca3102 fix: zai clear empty messages (#9466) 2026-02-24 10:52:07 -08:00
Shubham Naik
6579f9b906 feat: add saveTemplateVersionNoProject endpoint (#9465)
* feat: add saveTemplateVersionNoProject endpoint

Added NoProject version of saveTemplateVersion endpoint:
- Backend supports both URL patterns:
  - `/v1/templates/:project_id/:template_name` (with project in path)
  - `/v1/templates/:template_name` (NoProject, uses X-Project header)
- Stainless surfaces only the cleaner NoProject version as `templates.save()`
- NoProject route exported first for correct route matching

Changes:
- Added saveTemplateVersionNoProject contract and handler
- Updated stainless.yml to map `save` method to NoProject endpoint
- Follows same pattern as other NoProject endpoints (create, delete, rollback)

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* bump

* bump

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Shubham Naik
565fd3c143 feat: add template rollback endpoint [LET-7423] (#9455)
* feat: add template rollback endpoint [LET-7423]

Adds POST /v1/templates/:template_name/rollback endpoint to restore templates to previous versions.

Key features:
- Rollback to any numbered version (1, 2, 3, etc.) or "latest"
- Auto-saves unsaved changes before rollback to prevent data loss
- Validates input (rejects "current"/"dev" as target versions)
- Preserves entity IDs and relationships across rollback
- Uses project context from X-Project header (no project_id in path)

Implementation includes:
- API contract in templatesContract.ts
- Handler in templatesRouter.ts with comprehensive error handling
- 9 E2E tests covering functionality and edge cases
- Updated stainless.yml for SDK generation

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore test

* fix: add X-Project header to rollback endpoint tests

The rollback endpoint uses project context from X-Project header instead of URL path.
Updated all rollback test calls to include the X-Project header with testProject value.

This follows the no-project-in-path pattern for template endpoints.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: support both URL patterns for rollback endpoint

Added dual URL pattern support for rollback endpoint:
- `/v1/templates/:project_id/:template_name/rollback` (with project in path)
- `/v1/templates/:template_name/rollback` (NoProject, uses X-Project header)

Backend supports both patterns, but Stainless only exposes the cleaner NoProject version for SDKs.

Key changes:
- Fixed "rollback to latest" bug by resolving target version BEFORE auto-saving
- NoProject route is exported first to ensure correct route matching order
- Updated tests to use project_id in path for better compatibility
- All 8 rollback tests passing

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* bump

* bump

* bump

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
jnjpng
778f28ccf3 fix: handle transient network errors in ChatGPT OAuth client (#9462)
- Map httpx.ReadError/WriteError/ConnectError to LLMConnectionError in
  handle_llm_error so Temporal correctly classifies them as retryable
  (previously fell through to generic non-retryable LLMError)
- Add client-level retry with exponential backoff (up to 3 attempts) on
  request_async and stream_async for transient transport errors
- Stream retry is guarded by has_yielded flag to avoid corrupting
  partial responses already consumed by the caller
2026-02-24 10:52:07 -08:00
Kian Jones
4126fdadea fix(core): preserve thought_signature on TextContent in Gemini streaming path (#9461)
get_content() was only setting signature on ReasoningContent items.
When Gemini returns a function call with thought_signature but no
ReasoningContent (e.g. include_thoughts=False), the signature was
stored on self.thinking_signature but never attached to TextContent.
This caused "missing thought_signature in functionCall parts" errors
when the message was echoed back to Gemini on the next turn.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
b9c4ed3b15 fix: catch contextwindowexceeded error on gemini (#9450)
* catch contextwindowexceeded error

* fix(core): detect Google token limit errors as ContextWindowExceededError

Google's error message says "input token count exceeds the maximum
number of tokens allowed" which doesn't contain the word "context",
so it was falling through to generic LLMBadRequestError instead of
ContextWindowExceededError. This means compaction won't auto-trigger.

Expands the detection to also match "token count" and "tokens allowed"
in addition to the existing "context" keyword.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): add missing message arg to LLMBadRequestError in OpenAI client

The generic 400 path in handle_llm_error was constructing
LLMBadRequestError without the required message positional arg,
causing TypeError in prod during summarization.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* ci: add adapters/ test suite to core unit test matrix

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(tests): update adapter error handling test expectations to match actual behavior

The streaming adapter's error handling double-wraps errors: the
AnthropicStreamingInterface calls handle_llm_error first, then the
adapter catches the result and calls handle_llm_error again, which
falls through to the base class LLMError. Updated test expectations
to match this behavior.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): prevent double-wrapping of LLMError in stream adapter

The AnthropicStreamingInterface.process() already transforms raw
provider errors into LLMError subtypes via handle_llm_error. The
adapter was catching the result and calling handle_llm_error again,
which didn't recognize the already-transformed LLMError and wrapped
it in a generic LLMError("Unhandled LLM error"). This downgraded
specific error types (LLMConnectionError, LLMServerError, etc.)
and broke retry logic that matches on specific subtypes.

Now the adapter checks if the error is already an LLMError and
re-raises it as-is. Tests restored to original correct expectations.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
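The expanded overflow detection can be sketched as a keyword match; the markers are taken from the commit message ("context", "token count", "tokens allowed"), though the real helper may be structured differently:

```python
CONTEXT_WINDOW_MARKERS = ("context", "token count", "tokens allowed")


def looks_like_context_overflow(error_message: str) -> bool:
    # Google phrases its limit error as "input token count exceeds the
    # maximum number of tokens allowed", which never contains the word
    # "context", so matching on that word alone misses it.
    lowered = error_message.lower()
    return any(marker in lowered for marker in CONTEXT_WINDOW_MARKERS)
```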
Sarah Wooders
05073ba837 fix(core): preserve git-memory formatting and enforce lock conflicts (#9451)
* test(core): strengthen git-memory system prompt stability integration coverage

Switch git-memory HTTP integration tests to OpenAI model handles and add assertions that system prompt content remains stable after normal turns and direct block value updates until explicit recompilation or reset.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): preserve git-memory formatting and enforce lock conflicts

Preserve existing markdown frontmatter formatting on block updates while still ensuring required metadata fields exist, and make post-push git sync propagate memory-repo lock conflicts as 409 responses. Also enable slash-containing core-memory block labels in route params and add regression coverage.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(memfs): fail closed on memory repo lock contention

Make memfs git commits fail closed when the per-agent Redis lock cannot be acquired, return 409 MEMORY_REPO_BUSY from the memfs files write API, and map that 409 back to core MemoryRepoBusyError so API callers receive consistent busy conflicts.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore(core): minimize git-memory fix scope to memfs lock and frontmatter paths

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: drop unrelated changes and keep memfs-focused scope

Revert branch-only changes that are not required for the memfs lock contention and frontmatter-preservation fix so the PR contains only issue-relevant files.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(memfs): lock push sync path and improve nested sync diagnostics

Serialize memfs push-to-GCS sync with the same per-agent Redis lock key used by API commits, and add targeted post-push nested-block diagnostics plus a focused nested-label sync regression test for _sync_after_push.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
d7793a4474 fix(core): stabilize system prompt refresh and expand git-memory coverage (#9438)
* fix(core): stabilize system prompt refresh and expand git-memory coverage

Only rebuild system prompts on explicit refresh paths so normal turns preserve prefix-cache stability, including git/custom prompt layouts. Add integration coverage for memory filesystem tree structure and recompile/reset system-message updates via message-id retrieval.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): recompile system prompt around compaction and stabilize source tests

Force system prompt refresh before/after compaction in LettaAgentV3 so repaired system+memory state is used and persisted across subsequent turns. Update source-system prompt tests to explicitly recompile before raw preview assertions instead of assuming automatic rebuild timing.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
5b7dd15905 fix(core): use BYOK API keys for Google AI/Vertex LLM requests (#9439)
GoogleAIClient and GoogleVertexClient were hardcoding Letta's managed
credentials for all requests, ignoring user-provided BYOK API keys.
This meant Letta was paying Google API costs for BYOK users.

Add _get_client_async and update _get_client to check BYOK overrides
(via get_byok_overrides / get_byok_overrides_async) before falling back
to managed credentials, matching the pattern used by OpenAIClient and
AnthropicClient.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Ari Webb
d0e25ae471 feat: add glm 5 to core (#9436)
* feat: add glm 5 to core

* test glm 5
2026-02-24 10:52:07 -08:00
Kevin Lin
7ec2783ded fix: increase SGLang provider default context window from 8k to 32k (#9435) 2026-02-24 10:52:07 -08:00
Kian Jones
e3dbb44fc9 feat(telem): support reading from clickhouse traces (#9431)
draft
2026-02-24 10:52:07 -08:00
Kian Jones
7c65fd77f1 fix(core): return 400 for Google GenAI ClientError bad requests (#9357)
Google genai.errors.ClientError with code 400 was being caught and
wrapped as LLMBadRequestError but returned to clients as 502 because
no dedicated FastAPI exception handler existed for LLMBadRequestError.

- Add LLMBadRequestError exception handler in app.py returning HTTP 400
- Fix ErrorCode on Google 400 bad requests from INTERNAL_SERVER_ERROR
  to INVALID_ARGUMENT
- Route Google API errors through handle_llm_error in stream_async path

Datadog: https://us5.datadoghq.com/error-tracking/issue/4eb3ff3c-d937-11f0-8177-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
02183efd5f test: enable SQLAlchemy pooling in CI tests (#9279)
* test: enable SQLAlchemy pooling in CI tests

Changes CI test config to use LETTA_DISABLE_SQLALCHEMY_POOLING=false,
enabling connection pooling to match production settings.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: remove hardcoded LETTA_DISABLE_SQLALCHEMY_POOLING fixture from conftest

Remove the fixture that hardcoded the pooling setting in test code.
The value should instead come from the CI workflow environment via
vars.LETTA_DISABLE_SQLALCHEMY_POOLING (same source as production).

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Kian Jones
3634464251 fix(core): handle anyio.BrokenResourceError for client disconnects (#9358)
Catch BrokenResourceError alongside ClosedResourceError in streaming
response, logging middleware, and app exception handlers so client
disconnects are logged at info level instead of surfacing as 500s.

Datadog: https://us5.datadoghq.com/error-tracking/issue/4f57af0c-d558-11f0-a65d-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
0d42afa151 fix(core): catch LockNotAvailableError and return 409 instead of 500 (#9359)
Re-apply changes on top of latest main to resolve merge conflicts.

- Add DatabaseLockNotAvailableError custom exception in orm/errors.py
- Catch asyncpg LockNotAvailableError and pgcode 55P03 in _handle_dbapi_error
- Register FastAPI exception handler returning 409 with Retry-After header



🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
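A sketch of the pgcode classification described above. 55P03 is PostgreSQL's `lock_not_available` error code; the class name matches the commit, but the fields and function shape here are illustrative rather than the actual handler registration:

```python
from typing import Optional


class DatabaseLockNotAvailableError(Exception):
    # Illustrative: surfaced by the API layer as a 409 with Retry-After,
    # instead of an opaque 500.
    status_code = 409
    headers = {"Retry-After": "1"}


def classify_db_error(pgcode: str) -> Optional[Exception]:
    # 55P03 = lock_not_available in PostgreSQL (e.g. NOWAIT lock attempts).
    if pgcode == "55P03":
        return DatabaseLockNotAvailableError("database lock not available")
    return None
```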
Kian Jones
411bb63990 fix(core): improve error handling for upstream LLM provider errors (#9423)
Handle HTML error responses from ALB/load balancers in OpenAI client and
add explicit InternalServerError handling for Anthropic upstream issues.

🐛 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
6d4e320cc3 chore: clean up dead code (#9427)
remove straight-up unused code files
2026-02-24 10:52:07 -08:00
Kian Jones
2568c02b51 fix(core): strip quotes from MCP server header keys and values (#9349)
* fix(core): strip quotes from MCP server header keys and values

Users pasting JSON-formatted env vars into MCP server config end up with
quoted header names like `"CONTEXT7_API_KEY":` which causes
httpx.LocalProtocolError. Sanitize keys (strip surrounding quotes and
trailing colons) and values (strip surrounding quotes) in
resolve_custom_headers, resolve_environment_variables for HTTP configs,
and stdio env dicts.

Datadog: https://us5.datadoghq.com/error-tracking/issue/4a2f4af6-f2d8-11f0-930c-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: revert stdio env sanitization to pass-through

The stdio path doesn't need header/env sanitization - that's only
relevant for SSE/streamable HTTP servers with auth headers.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
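The sanitization rule can be sketched like this; the helper names are hypothetical, but the behavior follows the commit: keys lose surrounding quotes and any trailing colon left over from pasted JSON, values just lose surrounding quotes:

```python
def _strip_quotes(s: str) -> str:
    # Remove one matching pair of surrounding single or double quotes.
    s = s.strip()
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "\"'":
        s = s[1:-1]
    return s


def sanitize_header_key(key: str) -> str:
    # A pasted JSON line like '"CONTEXT7_API_KEY":' becomes CONTEXT7_API_KEY.
    return _strip_quotes(key.strip().rstrip(":"))


def sanitize_header_value(value: str) -> str:
    return _strip_quotes(value)
```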
Kian Jones
ddcfeb26b1 fix(core): catch all MCP tool execution errors instead of re-raising (#9419)
* fix(core): catch all MCP tool execution errors instead of re-raising

MCP tools are external user-configured servers - any failure during
tool execution is expected and should be returned as (error_msg, False)
to the agent, not raised as an exception that hits Datadog as a 500.

Previously:
- base_client.py only caught McpError/ToolError, re-raised everything else
- fastmcp_client.py (both SSE and StreamableHTTP) always re-raised

Now all three execute_tool() methods catch all exceptions and return
the error message to the agent conversation. The agent handles tool
failures via the error message naturally.

This silences ~15 Datadog issue types including:
- fastmcp.exceptions.ToolError (validation, permissions)
- mcp.shared.exceptions.McpError (connection closed, credentials)
- httpx.HTTPStatusError (503 from Zapier, etc.)
- httpx.ConnectError, ReadTimeout, RemoteProtocolError
- requests.exceptions.ConnectionError
- builtins.ConnectionError

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): log unexpected MCP errors at warning level with traceback

Expected MCP errors (ToolError, McpError, httpx.*, ConnectionError, etc.)
log at info level. Anything else (e.g. TypeError, AttributeError from
our own code) logs at warning with exc_info=True so it still surfaces
in Datadog without crashing the request.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
382e216cbb fix(core): differentiate BYOK vs base provider in all LLM error details (#9425)
Add is_byok flag to every LLMError's details dict returned from
handle_llm_error across all providers (OpenAI, Anthropic, Google,
ChatGPT OAuth). This enables observability into whether errors
originate from Letta's production keys or user-provided BYOK keys.

The rate limit handler in app.py now returns a more helpful message
for BYOK users ("check your provider's rate limits and billing")
versus the generic message for base provider rate limits.

Datadog issues:
- https://us5.datadoghq.com/error-tracking/issue/b711c824-f490-11f0-96e4-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/76623036-f4de-11f0-8697-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/43e9888a-dfcf-11f0-a645-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
424a1ada64 fix: google gen ai format error fix (#9147)
* google gen ai format error fix

* fix(core): add $ref safety net, warning log, and unit tests for Google schema resolution

- Add `$ref` to unsupported_keys in `_clean_google_ai_schema_properties` so unresolvable refs (e.g. `#/properties/...` style) are stripped as a safety net instead of crashing the Google SDK
- Add warning log when `_resolve_json_schema_refs` encounters a ref it cannot resolve
- Deduplicate the `#/$defs/` and `#/definitions/` resolution branches
- Add 11 unit tests covering: single/multiple $defs, nested refs, refs in anyOf/allOf, array items, definitions key, unresolvable refs, and the full resolve+clean pipeline

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
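A minimal sketch of `#/$defs/` ref inlining as described above. This is not the actual resolver, which also deduplicates the `#/definitions/` branch and strips unresolvable refs as a safety net; here unresolvable refs are simply left in place:

```python
def resolve_refs(schema: dict, defs: dict) -> dict:
    # Recursively inline "#/$defs/<name>" references from the defs table.
    def walk(node):
        if isinstance(node, dict):
            ref = node.get("$ref", "")
            if ref.startswith("#/$defs/"):
                name = ref.split("/")[-1]
                if name in defs:
                    return walk(defs[name])
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(schema)
```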
Kian Jones
02f776b016 fix: handle system messages with mixed TextContent + ImageContent (#9418)
* fix: handle system messages with mixed TextContent + ImageContent

System messages injected by external tools (e.g. packify.ai MCP) can
contain both TextContent and ImageContent. The assertions in
to_openai_responses_dicts and to_anthropic_dict expected exactly one
TextContent, causing AssertionError in production.

Extract all text parts and join them, matching how to_openai_dict
already handles this case.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: replace asserts with logger.warning + graceful skip

Asserts are the wrong tool for production input validation — if a
system message has only non-text content, we should warn and skip
rather than crash the request.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
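The join-all-text-parts behavior can be sketched over plain dicts; the real code operates on typed TextContent/ImageContent objects, so this is an illustrative reduction:

```python
def join_text_parts(content: list) -> str:
    # Collect every text part and join them, instead of asserting there is
    # exactly one. Image-only input yields an empty string, which callers
    # can warn about and skip rather than crashing the request.
    return "\n".join(
        part["text"] for part in content if part.get("type") == "text"
    )
```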
Kian Jones
a00270d199 fix(core): handle UTF-8 surrogate characters in API responses (#9422)
* fix(core): handle UTF-8 surrogate characters in API responses

LLM responses or user input can contain surrogate characters (U+D800-U+DFFF)
which are valid Python strings but illegal in UTF-8. ORJSONResponse rejects
them with "str is not valid UTF-8: surrogates not allowed". Add
SafeORJSONResponse that catches the TypeError and strips surrogates before
retrying serialization.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: reuse sanitize_unicode_surrogates from json_helpers

Replace the inline _sanitize_surrogates function with the existing
sanitize_unicode_surrogates helper from letta.helpers.json_helpers,
which is already used across all LLM clients.

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
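Stripping surrogates reduces to a filter over code points. A sketch of what a `sanitize_unicode_surrogates`-style helper plausibly does (the actual implementation in `letta.helpers.json_helpers` may differ):

```python
def sanitize_unicode_surrogates(text: str) -> str:
    # Lone surrogate code points (U+D800..U+DFFF) are valid in Python str
    # objects but cannot be encoded as UTF-8, so orjson rejects them with
    # "str is not valid UTF-8: surrogates not allowed". Drop them.
    return "".join(ch for ch in text if not 0xD800 <= ord(ch) <= 0xDFFF)
```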
jnjpng
39b25a0e3c fix: update ContextWindowCalculator to parse new system message sections (#9398)
* fix: update ContextWindowCalculator to parse new system message sections

The context window calculator was using outdated position-based parsing
that only handled 3 sections (base_instructions, memory_blocks, memory_metadata).
The actual system message now includes additional sections that were not
being tracked:

- <memory_filesystem> (git-enabled agents)
- <tool_usage_rules> (when tool rules configured)
- <directories> (when sources attached)

Changes:
- Add _extract_tag_content() helper for proper XML tag extraction
- Rewrite extract_system_components() to return a Dict with all 6 sections
- Update calculate_context_window() to count tokens for new sections
- Add new fields to ContextWindowOverview schema with backward-compatible defaults
- Add unit tests for the extraction logic

* update

* generate

* fix: check attached file in directories section instead of core_memory

Files are rendered inside <directories> tags, not <memory_blocks>.
Update validate_context_window_overview assertions accordingly.

* fix: address review feedback for context window parser

- Fix git-enabled agents regression: capture bare file blocks
  (e.g. <system/human.md>) rendered after </memory_filesystem> as
  core_memory via new _extract_git_core_memory() method
- Make _extract_top_level_tag robust: scan all occurrences to find
  tag outside container, handling nested-first + top-level-later case
- Document system_prompt tag inconsistency in docstring
- Add TODO to base_agent.py extract_dynamic_section linking to
  ContextWindowCalculator to flag parallel parser tech debt
- Add tests: git-enabled agent parsing, dual-occurrence tag
  extraction, pure text system prompt, git-enabled integration test
2026-02-24 10:52:07 -08:00
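The tag-based extraction replacing the old position-based parsing can be sketched with a regex; this is illustrative, and the real `_extract_tag_content()` may handle nesting and top-level-only matching differently:

```python
import re
from typing import Optional


def extract_tag_content(text: str, tag: str) -> Optional[str]:
    # Pull the inner content of the first <tag>...</tag> pair;
    # return None when the section is absent from the system message.
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return match.group(1) if match else None
```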
Kian Jones
7cc1cd3dc0 feat(ci): self-hosted provider test for lmstudio (#9404)
* add gpu runners and prod memory_repos

* add lmstudio and vllm in model_settings

* fix llm_configs and change variable name in reusable workflow and change perms for memory_repos to admin in tf

* fix: update self-hosted provider tests to use SDK 1.0 and v2 tests

- Update letta-client from ==0.1.324 to >=1.0.0
- Switch ollama/vllm/lmstudio tests to integration_test_send_message_v2.py

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use openai provider_type for self-hosted model settings

ollama/vllm/lmstudio are not valid provider_type values in the SDK
model_settings schema - they use openai-compatible APIs so provider_type
should be openai. The provider routing is determined by the handle prefix.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: enable redis for ollama/vllm/lmstudio tests

Background streaming tests require Redis. Add use-redis: true to
self-hosted provider test workflows.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* prep for lmstudio and vllm

* used lmstudio_openai client

* change tool call parser from hermes to qwen3_xml

* qwen3_xmlk -> qwen3_coder

* revert to hermes (incompatible with parallel tool calls?) and skipping vllm tests on parallel tool calls

* install uv redis extra

* remove lmstudio

* create lmstudio test

* qwen3-14b on lmstudio

* try with qwen3-4b

* actually update the model config json to use qwen3-4b

* add test_providers::test_lmstudio

* bump timeout from 60 to 120 for slow lmstudio on cpu model

* misc vllm changes

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00