letta-server

Author	SHA1	Message	Date
Kian Jones	a92e868ee6	feat: centralize telemetry logging at LLM client level (#8815) * feat: centralize telemetry logging at LLM client level Moves telemetry logging from individual adapters to LLMClientBase: - Add TelemetryStreamWrapper for streaming telemetry on stream close - Add request_async_with_telemetry() for non-streaming requests - Add stream_async_with_telemetry() for streaming requests - Add set_telemetry_context() to configure agent_id, run_id, step_id Updates adapters and agents to use new pattern: - LettaLLMAdapter now accepts agent_id/run_id in constructor - Adapters call set_telemetry_context() before LLM requests - Removes duplicate telemetry logging from adapters - Enriches traces with agent_id, run_id, call_type metadata 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: accumulate streaming response content for telemetry TelemetryStreamWrapper now extracts actual response data from chunks: - Content text (concatenated from deltas) - Tool calls (id, name, arguments) - Model name, finish reason, usage stats 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: move streaming telemetry to caller (option 3) - Remove TelemetryStreamWrapper class - Add log_provider_trace_async() helper to LLMClientBase - stream_async_with_telemetry() now just returns raw stream - Callers log telemetry after processing with rich interface data Updated callers: - summarizer.py: logs content + usage after stream processing - letta_agent.py: logs tool_call, reasoning, model, usage 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: pass agent_id and run_id to parent adapter class LettaLLMStreamAdapter was not passing agent_id/run_id to parent, causing "unexpected keyword argument" errors. 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:43 -08:00
Kian Jones	9418ab9815	feat: add provider trace backend abstraction for multi-backend telemetry (#8814) * feat: add provider trace backend abstraction for multi-backend telemetry Introduces a pluggable backend system for provider traces: - Base class with async/sync create and read interfaces - PostgreSQL backend (existing behavior) - ClickHouse backend (via OTEL instrumentation) - Socket backend (writes to Unix socket for crouton sidecar) - Factory for instantiating backends from config Refactors TelemetryManager to use backends with support for: - Multi-backend writes (concurrent via asyncio.gather) - Primary backend for reads (first in config list) - Graceful error handling per backend Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND (comma-separated) Example: "postgres,socket" for dual-write to Postgres and crouton 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add protocol version to socket backend records Adds PROTOCOL_VERSION constant to socket backend: - Included in every telemetry record sent to crouton - Must match ProtocolVersion in apps/crouton/main.go - Enables crouton to detect and reject incompatible messages 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: remove organization_id from ProviderTraceCreate calls The organization_id is now handled via the actor parameter in the telemetry manager, not through ProviderTraceCreate schema. This fixes validation errors after changing ProviderTraceCreate to inherit from BaseProviderTrace which forbids extra fields. 🐙 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * consolidate provider trace * add clickhouse-connect to fix bug on main lmao * auto generated sdk changes, and deployment details, and clikchouse prefix bug and added fields to runs trace return api * auto generated sdk changes, and deployment details, and clikchouse prefix bug and added fields to runs trace return api * consolidate provider trace * consolidate provider trace bug fix --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:43 -08:00
jnjpng	e2a8a95371	fix: chatgpt oauth model endpoint (#8855) base	2026-01-19 15:54:43 -08:00
jnjpng	a98bc31bf3	fix: refactor enable strict mode for structured output (#8840) * base * test	2026-01-19 15:54:42 -08:00
jnjpng	21c70323df	fix: respect strict mode for temporal with fallback on client (#8839) * base * update	2026-01-19 15:54:42 -08:00
jnjpng	85c242077e	feat: strict tool calling setting (#8810) base	2026-01-19 15:54:42 -08:00
jnjpng	637e320b11	chore: fix up minor nits and typing for chatgpt client (#8807) base	2026-01-19 15:54:42 -08:00
jnjpng	5017cb1d12	feat: add chatgpt oauth client for codex routing (#8774) * base * refresh * use default model fallback * patch * streaming * generate	2026-01-19 15:54:42 -08:00
Sarah Wooders	97cdfb4225	Revert "feat: add strict tool calling setting [LET-6902]" (#8720) Revert "feat: add strict tool calling setting [LET-6902] (#8577)" This reverts commit 697c9d0dee6af73ec4d5d98780e2ca7632a69173.	2026-01-19 15:54:39 -08:00
github-actions[bot]	5fbf8f93e2	fix: add explicit timeouts to httpx clients to prevent ReadTimeout errors (#8538) This commit addresses the httpx.ReadTimeout error detected in production by adding explicit timeout configurations to several httpx client usages: 1. MCP SSE client: Pass mcp_connect_to_server_timeout (30s) to sse_client() 2. MCP StreamableHTTP client: Pass mcp_connect_to_server_timeout (30s) to streamablehttp_client() 3. OpenAI model list API: Add 30s timeout with 10s connect timeout 4. Google AI model list/details API: Add 30s timeout with 10s connect timeout Previously, these httpx clients were created without explicit timeouts, which could cause ReadTimeout errors when remote servers are slow to respond. Fixes #8073 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-19 15:54:38 -08:00
github-actions[bot]	85c40c8154	fix: add streaming fallback for long-running Anthropic requests (#8564) When the Anthropic SDK detects a request may exceed 10 minutes, it raises a ValueError requiring streaming mode. This fix catches that specific error in request_async and automatically falls back to streaming mode, accumulating the response into the same format as non-streaming. This resolves the production error: "ValueError: Streaming is required for operations that may take longer than 10 minutes" Fixes #8516 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-19 15:54:38 -08:00
Sarah Wooders	bdede5f90c	feat: add strict tool calling setting [LET-6902] (#8577)	2026-01-19 15:54:38 -08:00
cthomas	9b5067bed9	fix: remove unused sync code (#8613) * chore: remove unused sync code * chore: remove deprecated sync Google AI functions Removes unused sync functions that used httpx.Client (blocking): - google_ai_get_model_details() - google_ai_get_model_context_window() - GoogleGeminiProvider.get_model_context_window() All code now uses async versions with httpx.AsyncClient. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:37 -08:00
cthomas	57cb2d7566	fix: async functions must call async methods (#8612) Critical fixes: - llm_client_base.send_llm_request() now calls await self.request_async() instead of self.request() - Remove unused sync get_openai_embedding() that used sync OpenAI client - Remove deprecated compile_in_thread_async() from Memory These were blocking the event loop during LLM requests and embeddings. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:37 -08:00
Sarah Wooders	0cbdf452fa	fix: temporarily disable structured outputs for anthropic (#8491)	2026-01-12 10:57:49 -08:00
cthomas	938bb78afe	fix: handle anthropic incorrect tool id bug (#8447)	2026-01-12 10:57:49 -08:00
github-actions[bot]	f2171447a8	fix: handle httpx.ReadError, WriteError, and ConnectError in LLM streaming clients (#8243) Adds explicit handling for httpx network errors (ReadError, WriteError, ConnectError) in AnthropicClient, OpenAIClient, and GoogleVertexClient. These errors can occur during streaming when the connection is unexpectedly closed while reading/writing data. Maps these errors to LLMConnectionError for consistent error handling. Fixes #8221 (and duplicate #8156) 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-12 10:57:49 -08:00
github-actions[bot]	05ec02e384	fix: handle Anthropic 413 request_too_large as ContextWindowExceededError (#8424) The Anthropic API returns a 413 status code with error type `request_too_large` when the request payload exceeds the maximum allowed size. This error should be converted to `ContextWindowExceededError` so the system can handle it appropriately (e.g., by summarizing the conversation to reduce context size). Changes: - Added `request_too_large` and `request exceeds the maximum size` to the early string-based error detection in `handle_llm_error` - Added specific handling for HTTP 413 status code in the `APIStatusError` handler - Added tests to verify the new error handling behavior Fixes: #8422 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-12 10:57:48 -08:00
Ari Webb	02f3e3f3b9	fix: fix providers and models persistence (#8302)	2026-01-12 10:57:48 -08:00
Ari Webb	cc825b4f5c	Revert "Revert "feat: enable provider models persistence" (#6590)" (#6595)	2026-01-12 10:57:48 -08:00
github-actions[bot]	76008c61f4	fix: handle httpx.RemoteProtocolError during LLM streaming (#8206)	2026-01-12 10:57:48 -08:00
Sarah Wooders	256dfb31cf	fix: attempt to fix /context again (#7946)	2026-01-12 10:57:20 -08:00
Sarah Wooders	f512d13bc9	feat: test token counting (#7943)	2026-01-12 10:57:20 -08:00
Kevin Lin	a1dfedfb0b	feat: Add structured outputs for Anthropic (#7495)	2026-01-12 10:57:19 -08:00
jnjpng	a1fb295e69	feat: enable claude pro max plan for letta code (#7663) * base * more * rev	2026-01-12 10:57:19 -08:00
Ari Webb	cd45212acb	feat: add zai provider support (#7626) * feat: add zai provider support * add zai_api_key secret to deploy-core * add to justfile * add testing, provider integration skill * enable zai key * fix zai test * clean up skill a little * small changes	2026-01-12 10:57:19 -08:00
Kevin Lin	33afb930fc	fix: Google AI client logging as Vertex (#7337) fix	2025-12-17 17:32:27 -08:00
Ari Webb	e47eb23522	fix: max output tokens for gemini 3 models (#7322)	2025-12-17 17:31:03 -08:00
Sarah Wooders	8729a037b9	fix: handle new openai overflow error format (#7110)	2025-12-17 17:31:02 -08:00
Charles Packer	33d39f4643	fix(core): patch usage data tracking for anthropic when context caching is on (#6997)	2025-12-15 12:03:09 -08:00
Kevin Lin	4b9485a484	feat: Add max tokens exceeded to stop reasons [LET-6480] (#6576)	2025-12-15 12:03:09 -08:00
Ari Webb	4d90f37f50	feat: add gpt-5.2 support (#6698)	2025-12-15 12:02:34 -08:00
Sarah Wooders	c8fa77a01f	feat: cleanup cancellation code and add more logging (#6588)	2025-12-15 12:02:34 -08:00
Sarah Wooders	8440e319e2	Revert "feat: enable provider models persistence" (#6590) Revert "feat: enable provider models persistence (#6193)" This reverts commit 9682aff32640a6ee8cf71a6f18c9fa7cda25c40e.	2025-12-15 12:02:34 -08:00
Ari Webb	848a73125c	feat: enable provider models persistence (#6193) * Revert "fix test" This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1. * fix server and better test * test fix, get api key for base and byok? * set letta default endpoint * try to fix timeout for test * fix for letta api key * Delete apps/core/tests/sdk_v1/conftest.py * Update utils.py * clean up a few issues * fix filterning on list_llm_models * soft delete models with provider * add one more test * fix ci * add timeout * band aid for letta embedding provider * info instead of error logs when creating models	2025-12-15 12:02:34 -08:00
Devansh Jain	d1536df6f6	chore: Update deepseek client for v3.2 models (#6556) * support for v3.2 models * streaming + context window fix * fix for no assitant text from deepseek	2025-12-15 12:02:34 -08:00
Kian Jones	d6292b6eb6	fix: bug which causes unrecoverable state if previous message was an image (#6486) * trying tout gpt-5.1-codex * add unit test for message content * try to support multimodal	2025-12-15 12:02:33 -08:00
Sarah Wooders	91e3dd8b3e	feat: fix new summarizer code and add more tests (#6461)	2025-12-15 12:02:19 -08:00
Charles Packer	e67c98eedb	feat: add tests for prompt caching + fix anthropic prompt caching [LET-6373] (#6454) * feat: add tests for prompt caching * fix: add cache control breakpoints for anthropic + fix tests * fix: silence logging * fix: patch token counting error * fix: same patch on non-streaming path	2025-12-15 12:02:19 -08:00
Charles Packer	88a3743cc8	fix(core): distinguish between null and 0 for prompt caching (#6451) * fix(core): distinguish between null and 0 for prompt caching * fix: runtime errors * fix: just publish just sgate	2025-12-15 12:02:19 -08:00
Charles Packer	131891e05f	feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449) * feat: init refactor * feat: add helper code * fix: missing file + test * fix: just state/publish api	2025-12-15 12:02:19 -08:00
Sarah Wooders	807c5c18d9	feat: add gemini token counting [LET-6371] (#6444)	2025-12-15 12:02:19 -08:00
jnjpng	c6df306ccf	fix: upgrade google-genai sdk version and fix gemini 3 streaming (#6437) * base * base --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:18 -08:00
Ari Webb	d4e7428c98	feat: structured outputs for anthropic [LET-6232] (#6410) feat: structured outputs for anthropic Co-authored-by: Ari Webb <ari@letta.com>	2025-11-26 14:39:40 -08:00
Ari Webb	89c7ab5f14	feat: structured outputs for openai [LET-6233] (#6363) * first hack with test * remove changes integration test * Delete apps/core/tests/sdk_v1/integration/integration_test_send_message_v2.py * add test * remove comment * stage and publish api * deprecate base level response_schema * add param to llm_config test --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-26 14:39:39 -08:00
Charles Packer	c4699b3d17	feat: add support for opus 4.5 (#6256) * feat: add support for new model * fix: just stage-api && just publish-api (anthropic model settings changed) * fix: just stage-api && just publish-api (anthropic model settings changed) * fix: make kevlar have default reasoning on * fix: bump anthropic sdk version * fix: patch name * pin newer version anthropic --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-24 19:10:27 -08:00
jnjpng	9adeb122cd	feat: set thinking level high for gemini 3 (#6330) * base * include thoughts * update * clean --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-11-24 19:10:27 -08:00
cthomas	d903056a43	fix: list indices must be int not str error [LET-6254] (#6302) fix: list indices must be int not str error	2025-11-24 19:10:26 -08:00
cthomas	1c358c863e	fix: google api sync request (#6292) * fix: google api sync request * remove unused function	2025-11-24 19:10:26 -08:00
cthomas	2ebf863998	fix: type error in agent step (#6287)	2025-11-24 19:10:26 -08:00

335 Commits
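
The multi-backend write described in 9418ab9815 (a comma-separated backend list, concurrent writes via asyncio.gather, the first configured backend serving reads, and per-backend error handling) might look roughly like the sketch below. This is an illustration under stated assumptions, not the code from that commit: `InMemoryBackend`, `FailingBackend`, `parse_backends`, and `write_all` are hypothetical names; only the config format and the use of `asyncio.gather` come from the commit message.

```python
import asyncio


class InMemoryBackend:
    """Hypothetical stand-in for a provider trace backend."""

    def __init__(self, name):
        self.name = name
        self.records = []

    async def create(self, record):
        self.records.append(record)
        return record


class FailingBackend(InMemoryBackend):
    """Hypothetical backend that always fails, to exercise error handling."""

    async def create(self, record):
        raise ConnectionError(f"{self.name} is unavailable")


def parse_backends(value):
    """Split a comma-separated backend list such as "postgres,socket"."""
    return [part.strip() for part in value.split(",") if part.strip()]


async def write_all(backends, record):
    """Write one record to every backend concurrently.

    return_exceptions=True makes gather return exceptions as results instead
    of raising the first one, so each backend's outcome can be inspected.
    Returns a dict mapping the name of each failed backend to its exception.
    """
    results = await asyncio.gather(
        *(backend.create(record) for backend in backends),
        return_exceptions=True,
    )
    return {
        backend.name: result
        for backend, result in zip(backends, results)
        if isinstance(result, Exception)
    }


names = parse_backends("postgres, socket")
backends = [InMemoryBackend(names[0]), FailingBackend(names[1])]
failures = asyncio.run(write_all(backends, {"step_id": "step-1"}))
primary = backends[0]  # assumption: the first configured backend serves reads
```

Here the failing `socket` backend is reported in `failures` while the `postgres` write still lands, which is the behaviour the commit message calls graceful error handling per backend.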