letta-server/letta/llm_api/error_utils.py
Kian Jones 80f34f134d fix(core): catch bare openai.APIError in handle_llm_error (#9468)
* fix(core): catch bare openai.APIError in handle_llm_error fallthrough

An openai.APIError raised during streaming (e.g. OpenRouter credit
exhaustion) is not an APIStatusError, so it skipped the catch-all
at the end of handle_llm_error and fell through to a generic
LLMError("Unhandled"). Now bare APIErrors that aren't context window
overflows are mapped to LLMBadRequestError.

Datadog: https://us5.datadoghq.com/error-tracking/issue/7a2c356c-0849-11f1-be66-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat(core): add LLMInsufficientCreditsError for BYOK credit exhaustion

Adds a dedicated error type for insufficient credits/quota across all
providers (OpenAI, Anthropic, Google). Returns HTTP 402 with
BYOK-aware messaging instead of a generic 400.

- New LLMInsufficientCreditsError class and PAYMENT_REQUIRED ErrorCode
- is_insufficient_credits_message() helper detecting credit/quota strings
- All 3 provider clients detect 402 status + credit keywords
- FastAPI handler returns 402 with "your API key" vs generic messaging
- 5 new parametrized tests covering OpenRouter, OpenAI, and the negative case
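A minimal sketch of the mapping described above. The exception classes and the `map_provider_error` function below are hypothetical stand-ins for illustration (the real Letta classes, ErrorCode wiring, and FastAPI handler are not shown); the credit check mirrors the `is_insufficient_credits_message` helper in this file.

```python
# Stand-ins for Letta's error types (assumed names, not the real classes).
class LLMBadRequestError(Exception):
    """Stand-in: maps to HTTP 400."""


class LLMInsufficientCreditsError(Exception):
    """Stand-in: maps to HTTP 402 (PAYMENT_REQUIRED)."""


def is_insufficient_credits_message(msg: str) -> bool:
    # Abbreviated version of the helper defined in error_utils.py.
    lower = msg.lower()
    return (
        "insufficient credits" in lower
        or "exceeded your current quota" in lower
    )


def map_provider_error(status_code, message: str) -> Exception:
    """Hypothetical mapper: a 402 status OR credit/quota keywords in the
    provider's error message both yield the credits error, so mid-stream
    failures (which may carry no HTTP status) are still caught."""
    if status_code == 402 or is_insufficient_credits_message(message):
        return LLMInsufficientCreditsError(message)
    return LLMBadRequestError(message)
```

Checking both the status code and the message text matters because streaming errors can surface as bare `openai.APIError`s with no status attached, leaving only the message to classify.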

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00


"""Shared helpers for provider error detection/mapping.
Keep these utilities free of heavy imports to avoid circular dependencies between
LLM clients (provider-specific) and streaming interfaces.
"""
def is_context_window_overflow_message(msg: str) -> bool:
"""Best-effort detection for context window overflow errors.
Different providers (and even different API surfaces within the same provider)
may phrase context-window errors differently. We centralize the heuristic so
all layers (clients, streaming interfaces, agent loops) behave consistently.
"""
return (
"exceeds the context window" in msg
or "This model's maximum context length is" in msg
or "maximum context length" in msg
or "context_length_exceeded" in msg
or "Input tokens exceed the configured limit" in msg
)
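For example, the helper above matches an OpenAI-style overflow message while leaving unrelated errors alone (self-contained copy of the function, for illustration):

```python
# Self-contained copy of is_context_window_overflow_message for the example.
def is_context_window_overflow_message(msg: str) -> bool:
    return (
        "exceeds the context window" in msg
        or "This model's maximum context length is" in msg
        or "maximum context length" in msg
        or "context_length_exceeded" in msg
        or "Input tokens exceed the configured limit" in msg
    )


# OpenAI-style phrasing is detected.
print(is_context_window_overflow_message(
    "This model's maximum context length is 128000 tokens."))  # True

# An unrelated provider error is not.
print(is_context_window_overflow_message("Rate limit exceeded"))  # False
```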


def is_insufficient_credits_message(msg: str) -> bool:
    """Best-effort detection for insufficient credits/quota/billing errors.

    BYOK users on OpenRouter, OpenAI, etc. may exhaust their credits mid-stream
    or get rejected pre-flight. We detect these so they map to 402 instead of 400/500.
    """
    lower = msg.lower()
    return (
        "insufficient credits" in lower
        or "requires more credits" in lower
        or "add more credits" in lower
        or "exceeded your current quota" in lower
        or "you've exceeded your budget" in lower
        or ("billing" in lower and "hard limit" in lower)
        or "can only afford" in lower
    )
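A quick illustration of the credit-detection heuristic against OpenRouter- and OpenAI-style messages, including a negative case where a lone "billing" mention is not enough (self-contained copy of the function; the sample messages are representative, not verbatim provider output):

```python
# Self-contained copy of is_insufficient_credits_message for the example.
def is_insufficient_credits_message(msg: str) -> bool:
    lower = msg.lower()
    return (
        "insufficient credits" in lower
        or "requires more credits" in lower
        or "add more credits" in lower
        or "exceeded your current quota" in lower
        or "you've exceeded your budget" in lower
        or ("billing" in lower and "hard limit" in lower)
        or "can only afford" in lower
    )


# OpenRouter-style credit exhaustion.
print(is_insufficient_credits_message(
    "Insufficient credits. Add more credits to continue."))  # True

# OpenAI-style quota message.
print(is_insufficient_credits_message(
    "You exceeded your current quota, please check your plan."))  # True

# "billing" alone does not match; it needs "hard limit" alongside it.
print(is_insufficient_credits_message("billing address updated"))  # False
```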