letta-server

Author	SHA1	Message	Date
cthomas	3d781efd21	fix(core): raise LLMEmptyResponseError for empty Anthropic responses (#9624 ) * fix(core): raise LLMEmptyResponseError for empty Anthropic responses Fixes LET-7679: Opus 4.6 occasionally returns empty responses (no content and no tool calls), causing silent failures with stop_reason=end_turn. Changes: - Add LLMEmptyResponseError class (subclass of LLMServerError) - Raise error in anthropic_client for empty non-streaming responses - Raise error in anthropic_streaming_interface for empty streaming responses - Pass through LLMError instances in handle_llm_error to preserve specific types - Add test for empty streaming response detection This allows clients (letta-code) to catch this specific error and implement retry logic with cache-busting modifications. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): set invalid_llm_response stop reason for empty responses Catch LLMEmptyResponseError specifically and set stop_reason to invalid_llm_response instead of llm_api_error. This allows clients to distinguish empty responses from transient API errors. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-03-03 18:34:01 -08:00
Kian Jones	25d54dd896	chore: enable F821, F401, W293 (#9503 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import	2026-02-24 10:55:08 -08:00
Kian Jones	80f34f134d	fix(core): catch bare openai.APIError in handle_llm_error (#9468 ) * fix(core): catch bare openai.APIError in handle_llm_error fallthrough openai.APIError raised during streaming (e.g. OpenRouter credit exhaustion) is not an APIStatusError, so it skipped the catch-all at the end and fell through to LLMError("Unhandled"). Now bare APIErrors that aren't context window overflows are mapped to LLMBadRequestError. Datadog: https://us5.datadoghq.com/error-tracking/issue/7a2c356c-0849-11f1-be66-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat(core): add LLMInsufficientCreditsError for BYOK credit exhaustion Adds dedicated error type for insufficient credits/quota across all providers (OpenAI, Anthropic, Google). Returns HTTP 402 with BYOK-aware messaging instead of generic 400. - New LLMInsufficientCreditsError class and PAYMENT_REQUIRED ErrorCode - is_insufficient_credits_message() helper detecting credit/quota strings - All 3 provider clients detect 402 status + credit keywords - FastAPI handler returns 402 with "your API key" vs generic messaging - 5 new parametrized tests covering OpenRouter, OpenAI, and negative case 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	b9c4ed3b15	fix: catch contextwindowexceeded error on gemini (#9450 ) * catch contextwindowexceeded error * fix(core): detect Google token limit errors as ContextWindowExceededError Google's error message says "input token count exceeds the maximum number of tokens allowed" which doesn't contain the word "context", so it was falling through to generic LLMBadRequestError instead of ContextWindowExceededError. This means compaction won't auto-trigger. Expands the detection to also match "token count" and "tokens allowed" in addition to the existing "context" keyword. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): add missing message arg to LLMBadRequestError in OpenAI client The generic 400 path in handle_llm_error was constructing LLMBadRequestError without the required message positional arg, causing TypeError in prod during summarization. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * ci: add adapters/ test suite to core unit test matrix 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(tests): update adapter error handling test expectations to match actual behavior The streaming adapter's error handling double-wraps errors: the AnthropicStreamingInterface calls handle_llm_error first, then the adapter catches the result and calls handle_llm_error again, which falls through to the base class LLMError. Updated test expectations to match this behavior. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): prevent double-wrapping of LLMError in stream adapter The AnthropicStreamingInterface.process() already transforms raw provider errors into LLMError subtypes via handle_llm_error. The adapter was catching the result and calling handle_llm_error again, which didn't recognize the already-transformed LLMError and wrapped it in a generic LLMError("Unhandled LLM error"). This downgraded specific error types (LLMConnectionError, LLMServerError, etc.) and broke retry logic that matches on specific subtypes. Now the adapter checks if the error is already an LLMError and re-raises it as-is. Tests restored to original correct expectations. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	eaf64fb510	fix: add LLMCallType enum and ensure call_type is set on all provider traces (#9258 ) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
github-actions[bot]	f2171447a8	fix: handle httpx.ReadError, WriteError, and ConnectError in LLM streaming clients (#8243 ) Adds explicit handling for httpx network errors (ReadError, WriteError, ConnectError) in AnthropicClient, OpenAIClient, and GoogleVertexClient. These errors can occur during streaming when the connection is unexpectedly closed while reading/writing data. Maps these errors to LLMConnectionError for consistent error handling. Fixes #8221 (and duplicate #8156) 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-12 10:57:49 -08:00
github-actions[bot]	05ec02e384	fix: handle Anthropic 413 request_too_large as ContextWindowExceededError (#8424 ) The Anthropic API returns a 413 status code with error type `request_too_large` when the request payload exceeds the maximum allowed size. This error should be converted to `ContextWindowExceededError` so the system can handle it appropriately (e.g., by summarizing the conversation to reduce context size). Changes: - Added `request_too_large` and `request exceeds the maximum size` to the early string-based error detection in `handle_llm_error` - Added specific handling for HTTP 413 status code in the `APIStatusError` handler - Added tests to verify the new error handling behavior Fixes: #8422 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-12 10:57:48 -08:00
Sarah Wooders	d5decc2a27	fix: persist streaming errors in run metadata (#8062 )	2026-01-12 10:57:47 -08:00
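
Several of the commits above revolve around one idea: translate a provider error once, then pass the translated error through unchanged so its specific subtype survives. The sketch below illustrates that pattern under stated assumptions. The class names `LLMError`, `LLMServerError`, `LLMConnectionError`, and `LLMEmptyResponseError` and the stop reasons `invalid_llm_response` and `llm_api_error` come from the commit messages; the function bodies, the helpers `check_response` and `stop_reason_for`, and the use of the built-in `ConnectionError` as a stand-in for a provider network error are hypothetical and are not the repository's actual code.

```python
class LLMError(Exception):
    """Stand-in base class for translated provider errors."""


class LLMServerError(LLMError):
    pass


class LLMConnectionError(LLMError):
    pass


class LLMEmptyResponseError(LLMServerError):
    """Raised when a response has neither content nor tool calls."""


def handle_llm_error(exc: Exception) -> Exception:
    # Already translated: return as-is so the specific subtype is preserved
    # and a second call cannot downgrade it to a generic LLMError.
    if isinstance(exc, LLMError):
        return exc
    # Assumption: the built-in ConnectionError stands in for provider
    # network errors in this sketch.
    if isinstance(exc, ConnectionError):
        return LLMConnectionError(str(exc))
    return LLMError(f"Unhandled LLM error: {exc}")


def check_response(content: str, tool_calls: list) -> None:
    # Hypothetical helper: reject a response with no content and no tool calls.
    if not content and not tool_calls:
        raise LLMEmptyResponseError("LLM returned an empty response")


def stop_reason_for(exc: Exception) -> str:
    # Hypothetical helper: empty responses get their own stop reason so
    # clients can tell them apart from transient API errors.
    if isinstance(exc, LLMEmptyResponseError):
        return "invalid_llm_response"
    return "llm_api_error"
```

Because `handle_llm_error` is idempotent in this sketch, calling it at both the streaming-interface layer and the adapter layer yields the same subtype, which is what retry logic keyed on subtypes needs.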

8 Commits
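
The context-window fixes in this history (#9450 and #8424) describe string-based detection of overflow errors. A minimal sketch of that kind of check follows, assuming a plain substring match. The marker strings and the HTTP 413 case are taken from the commit messages; the function name `is_context_window_overflow`, its signature, and the constant name are hypothetical and may not match the real `handle_llm_error` logic.

```python
from typing import Optional

# Marker substrings named in the commit messages above.
CONTEXT_OVERFLOW_MARKERS = (
    "context",
    "token count",
    "tokens allowed",
    "request_too_large",
    "request exceeds the maximum size",
)


def is_context_window_overflow(message: str, status_code: Optional[int] = None) -> bool:
    """Return True if an error looks like a context-window overflow."""
    # Per the commit for #8424, an HTTP 413 is treated as an overflow.
    if status_code == 413:
        return True
    lowered = message.lower()
    return any(marker in lowered for marker in CONTEXT_OVERFLOW_MARKERS)
```

A caller could use such a predicate to choose `ContextWindowExceededError` over a generic `LLMBadRequestError`, so that compaction or summarization can be triggered automatically.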