letta-server/letta/llm_api at e66981c7e86343639c705ea0c8d77460c5a75311 - letta-server - WIUF Gitea: Waiting is - Until Fullness

Fimeg/letta-server

Charles Packer 619e81ed1e fix(core): add OpenAI prompt cache key and model-gated 24h retention (#9492)

* fix(core): apply OpenAI prompt cache settings to request payloads

Set prompt_cache_key using agent and conversation context on both Responses and Chat Completions request builders, and enable 24h retention only for supported OpenAI models while excluding OpenRouter paths.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): prefix prompt cache key with letta tag

Add a `letta:` prefix to generated OpenAI prompt_cache_key values so cache-related entries are easier to identify in provider-side logs and diagnostics.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* add integration test

* skip test

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Ari Webb <ari@letta.com>

2026-02-24 10:52:07 -08:00
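The commit message above describes three behaviors: a `prompt_cache_key` derived from agent and conversation context, a `letta:` prefix on that key, and 24h retention gated on the model. The following is a minimal sketch of how that might look, not the code in `openai_client.py`. The function names, the key layout after the prefix, the `MODELS_WITH_24H_RETENTION` allow-list and its placeholder entry, and the `is_openrouter` flag are all hypothetical. The sketch also assumes OpenRouter requests skip both settings; the commit message could equally mean that only the retention setting is skipped.

```python
from typing import Any, Dict, Optional

# Hypothetical allow-list. The commit message does not name the models that
# support 24h retention, so this entry is a placeholder, not a real model name.
MODELS_WITH_24H_RETENTION = {"example-24h-model"}


def build_prompt_cache_key(agent_id: str, conversation_id: Optional[str] = None) -> str:
    """Build a cache key from agent and conversation context.

    The `letta:` prefix comes from the commit message; the layout of the
    remaining parts is an assumption made for illustration.
    """
    parts = [agent_id]
    if conversation_id:
        parts.append(conversation_id)
    return "letta:" + ":".join(parts)


def apply_prompt_cache_settings(
    payload: Dict[str, Any],
    *,
    model: str,
    agent_id: str,
    conversation_id: Optional[str] = None,
    is_openrouter: bool = False,
) -> Dict[str, Any]:
    """Return a copy of the request payload with prompt cache settings applied."""
    updated = dict(payload)

    # Assumption: OpenRouter requests are left untouched entirely.
    if is_openrouter:
        return updated

    updated["prompt_cache_key"] = build_prompt_cache_key(agent_id, conversation_id)

    # Request 24h retention only for models on the allow-list.
    if model in MODELS_WITH_24H_RETENTION:
        updated["prompt_cache_retention"] = "24h"

    return updated
```

Because the helper returns a new dict and never mutates its input, the same function could in principle be called from both the Responses and the Chat Completions request builders mentioned in the commit message.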

| Name | Last commit | Date |
| --- | --- | --- |
| `sample_response_jsons` | merge this (#4759) | 2025-09-17 15:47:40 -07:00 |
| `__init__.py` | merge this (#4759) | 2025-09-17 15:47:40 -07:00 |
| `anthropic_client.py` | fix(core): catch bare openai.APIError in handle_llm_error (#9468) | 2026-02-24 10:52:07 -08:00 |
| `anthropic_constants.py` | feat: Add structured outputs for Anthropic (#7495) | 2026-01-12 10:57:19 -08:00 |
| `azure_client.py` | fix(core): differentiate BYOK vs base provider in all LLM error details (#9425) | 2026-02-24 10:52:07 -08:00 |
| `bedrock_client.py` | feat: enable bedrock for anthropic models (#8847) | 2026-01-19 15:54:44 -08:00 |
| `chatgpt_oauth_client.py` | fix: rename ChatGPT server error to ChatGPT API error (#9497) | 2026-02-24 10:52:07 -08:00 |
| `deepseek_client.py` | fix(core): sanitize Unicode surrogates in all LLM client requests (#9323) | 2026-02-24 10:52:06 -08:00 |
| `error_utils.py` | fix(core): catch bare openai.APIError in handle_llm_error (#9468) | 2026-02-24 10:52:07 -08:00 |
| `google_ai_client.py` | fix(core): use BYOK API keys for Google AI/Vertex LLM requests (#9439) | 2026-02-24 10:52:07 -08:00 |
| `google_constants.py` | fix: max output tokens for gemini 3 models (#7322) | 2025-12-17 17:31:03 -08:00 |
| `google_vertex_client.py` | fix(core): catch bare openai.APIError in handle_llm_error (#9468) | 2026-02-24 10:52:07 -08:00 |
| `groq_client.py` | fix(core): sanitize Unicode surrogates in all LLM client requests (#9323) | 2026-02-24 10:52:06 -08:00 |
| `helpers.py` | fix: remove unused sync code (#8613) | 2026-01-19 15:54:37 -08:00 |
| `llm_api_tools.py` | fix: add LLMCallType enum and ensure call_type is set on all provider traces (#9258) | 2026-02-24 10:52:06 -08:00 |
| `llm_client_base.py` | Add LLM client compaction errors to traces (#9474) | 2026-02-24 10:52:07 -08:00 |
| `llm_client.py` | feat: openrouter byok (#9148) | 2026-01-29 12:44:04 -08:00 |
| `minimax_client.py` | fix(core): sanitize Unicode surrogates in all LLM client requests (#9323) | 2026-02-24 10:52:06 -08:00 |
| `mistral.py` | fix(core): reject empty API keys in Bearer auth headers (#9350) | 2026-02-24 10:52:07 -08:00 |
| `openai_client.py` | fix(core): add OpenAI prompt cache key and model-gated 24h retention (#9492) | 2026-02-24 10:52:07 -08:00 |
| `openai.py` | fix(core): reject empty API keys in Bearer auth headers (#9350) | 2026-02-24 10:52:07 -08:00 |
| `sglang_native_client.py` | feat: add log probabilities from OpenAI-compatible servers and SGLang native endpoint (#9240) | 2026-02-24 10:52:07 -08:00 |
| `together_client.py` | fix(core): sanitize Unicode surrogates in all LLM client requests (#9323) | 2026-02-24 10:52:06 -08:00 |
| `xai_client.py` | fix(core): sanitize Unicode surrogates in all LLM client requests (#9323) | 2026-02-24 10:52:06 -08:00 |
| `zai_client.py` | fix: zai clear empty messages (#9466) | 2026-02-24 10:52:07 -08:00 |