* fix(core): catch bare openai.APIError in handle_llm_error fallthrough
openai.APIError raised during streaming (e.g. OpenRouter credit
exhaustion) is not an APIStatusError, so it skipped the catch-all
at the end and fell through to LLMError("Unhandled"). Now bare
APIErrors that aren't context window overflows are mapped to
LLMBadRequestError.
Datadog: https://us5.datadoghq.com/error-tracking/issue/7a2c356c-0849-11f1-be66-da7ad0900000
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat(core): add LLMInsufficientCreditsError for BYOK credit exhaustion
Adds dedicated error type for insufficient credits/quota across all
providers (OpenAI, Anthropic, Google). Returns HTTP 402 with
BYOK-aware messaging instead of generic 400.
- New LLMInsufficientCreditsError class and PAYMENT_REQUIRED ErrorCode
- is_insufficient_credits_message() helper detecting credit/quota strings
- All 3 provider clients detect 402 status + credit keywords
- FastAPI handler returns 402 with "your API key" vs generic messaging
- 5 new parametrized tests covering OpenRouter, OpenAI, and a negative case
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
41 lines
1.5 KiB
Python
"""Shared helpers for provider error detection/mapping.

Keep these utilities free of heavy imports to avoid circular dependencies between
LLM clients (provider-specific) and streaming interfaces.
"""


def is_context_window_overflow_message(msg: str) -> bool:
    """Best-effort detection for context window overflow errors.

    Different providers (and even different API surfaces within the same provider)
    may phrase context-window errors differently. We centralize the heuristic so
    all layers (clients, streaming interfaces, agent loops) behave consistently.
    """

    return (
        "exceeds the context window" in msg
        or "This model's maximum context length is" in msg
        or "maximum context length" in msg
        or "context_length_exceeded" in msg
        or "Input tokens exceed the configured limit" in msg
    )


def is_insufficient_credits_message(msg: str) -> bool:
    """Best-effort detection for insufficient credits/quota/billing errors.

    BYOK users on OpenRouter, OpenAI, etc. may exhaust their credits mid-stream
    or get rejected pre-flight. We detect these so they map to 402 instead of 400/500.
    """
    lower = msg.lower()
    return (
        "insufficient credits" in lower
        or "requires more credits" in lower
        or "add more credits" in lower
        or "exceeded your current quota" in lower
        or "you've exceeded your budget" in lower
        or ("billing" in lower and "hard limit" in lower)
        or "can only afford" in lower
    )
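A minimal sketch of how a caller might combine the two helpers when it only has an error message to go on, for example a bare `openai.APIError` raised mid-stream as described in the commit message. It checks for context window overflow first, then credit exhaustion, and otherwise falls back to a generic bad request. The `ErrorKind` enum and `classify_error_message` are hypothetical names used only for illustration; the actual code raises dedicated exception classes such as `LLMInsufficientCreditsError` and `LLMBadRequestError`. The two predicates here are trimmed stand-ins for the module's helpers so the snippet runs on its own.

```python
from enum import Enum


class ErrorKind(Enum):
    """Hypothetical coarse categories, standing in for the real exception classes."""

    CONTEXT_WINDOW_EXCEEDED = "context_window_exceeded"
    INSUFFICIENT_CREDITS = "insufficient_credits"
    BAD_REQUEST = "bad_request"


def is_context_window_overflow_message(msg: str) -> bool:
    # Trimmed stand-in for the module's helper (only two of its substrings).
    return "maximum context length" in msg or "context_length_exceeded" in msg


def is_insufficient_credits_message(msg: str) -> bool:
    # Trimmed stand-in for the module's helper (only two of its substrings).
    lower = msg.lower()
    return "insufficient credits" in lower or "exceeded your current quota" in lower


def classify_error_message(msg: str) -> ErrorKind:
    """Map a provider error message to a coarse category (illustrative only)."""
    if is_context_window_overflow_message(msg):
        return ErrorKind.CONTEXT_WINDOW_EXCEEDED
    if is_insufficient_credits_message(msg):
        return ErrorKind.INSUFFICIENT_CREDITS
    # Anything unrecognized is treated as a generic bad request.
    return ErrorKind.BAD_REQUEST
```

The order of the checks only matters for a message that matches both heuristics; in this sketch such a message would be reported as a context window overflow.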