This commit addresses the httpx.ReadTimeout error detected in production
by adding explicit timeout configurations to several httpx client usages:
1. MCP SSE client: Pass mcp_connect_to_server_timeout (30s) to sse_client()
2. MCP StreamableHTTP client: Pass mcp_connect_to_server_timeout (30s) to streamablehttp_client()
3. OpenAI model list API: Add 30s timeout with 10s connect timeout
4. Google AI model list/details API: Add 30s timeout with 10s connect timeout
Previously, these httpx clients were created without explicit timeouts,
which could cause ReadTimeout errors when remote servers are slow to respond.
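
A minimal sketch of the pattern (the endpoint URL is illustrative; the 30s/10s values are the ones named above):

```python
import httpx

# Explicit overall timeout of 30s with a 10s connect timeout, rather than
# relying on httpx's defaults, which surface as ReadTimeout on slow servers.
TIMEOUT = httpx.Timeout(30.0, connect=10.0)

async def list_models() -> dict:
    async with httpx.AsyncClient(timeout=TIMEOUT) as client:
        response = await client.get("https://api.openai.com/v1/models")  # illustrative endpoint
        response.raise_for_status()
        return response.json()
```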
Fixes #8073

🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
When the Anthropic SDK detects that a request may exceed 10 minutes, it
raises a ValueError requiring streaming mode. This fix catches that
specific error in request_async and automatically falls back to
streaming mode, accumulating the streamed response into the same format
as a non-streaming one.
This resolves the production error:
"ValueError: Streaming is required for operations that may take
longer than 10 minutes"
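
A hedged sketch of the fallback (the client setup and request shape are assumptions; `get_final_message()` is the SDK's accumulator for streamed responses):

```python
import anthropic

client = anthropic.AsyncAnthropic()  # assumed client setup

async def create_with_fallback(request_data: dict):
    try:
        return await client.messages.create(**request_data)
    except ValueError as e:
        # The SDK raises this ValueError when it estimates the request may
        # take longer than 10 minutes; anything else propagates unchanged.
        if "Streaming is required" not in str(e):
            raise
        # Fall back to streaming and accumulate the events into a complete
        # Message, matching the non-streaming response shape.
        async with client.messages.stream(**request_data) as stream:
            return await stream.get_final_message()
```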
Fixes #8516

🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
Critical fixes:
- llm_client_base.send_llm_request() now calls await self.request_async() instead of self.request()
- Remove unused sync get_openai_embedding() that used the sync OpenAI client
- Remove deprecated compile_in_thread_async() from Memory
These were blocking the event loop during LLM requests and embeddings.
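
An illustrative before/after of the first fix (class and method names follow the commit; the bodies are assumed):

```python
class LLMClientBase:
    async def request_async(self, request_data: dict):
        raise NotImplementedError  # provided by concrete clients

    async def send_llm_request(self, request_data: dict):
        # Before: self.request(request_data) performed blocking network I/O
        # inside a coroutine, stalling the event loop for the full round trip.
        return await self.request_async(request_data)
```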
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Adds explicit handling for httpx network errors (ReadError, WriteError,
ConnectError) in AnthropicClient, OpenAIClient, and GoogleVertexClient.
These errors can occur during streaming when the connection is unexpectedly
closed while reading/writing data.
Maps these errors to LLMConnectionError for consistent error handling.
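
A sketch of the mapping (the exception types and `LLMConnectionError` target come from the commit; the wrapper shape is an assumption):

```python
import httpx

class LLMConnectionError(Exception):
    """Stand-in for the library's connection error type."""

async def stream_with_error_mapping(stream):
    # Surface low-level httpx failures (connection closed mid-read/write,
    # failed connect) as one consistent error type.
    try:
        async for chunk in stream:
            yield chunk
    except (httpx.ReadError, httpx.WriteError, httpx.ConnectError) as e:
        raise LLMConnectionError(f"network error during streaming: {e}") from e
```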
Fixes#8221 (and duplicate #8156)
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
The Anthropic API returns a 413 status code with error type `request_too_large`
when the request payload exceeds the maximum allowed size. This error should
be converted to `ContextWindowExceededError` so the system can handle it
appropriately (e.g., by summarizing the conversation to reduce context size).
Changes:
- Added `request_too_large` and `request exceeds the maximum size` to the
early string-based error detection in `handle_llm_error`
- Added specific handling for HTTP 413 status code in the `APIStatusError`
handler
- Added tests to verify the new error handling behavior
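
A hedged sketch of both additions (`handle_llm_error`, the match strings, and the 413 branch are from the commit; the exact control flow is assumed):

```python
import anthropic

class ContextWindowExceededError(Exception):
    """Stand-in for the library's error type."""

def handle_llm_error(e: Exception) -> Exception:
    # Early string-based detection catches the error regardless of how the
    # SDK surfaced it; the 413 branch handles a typed APIStatusError directly.
    message = str(e).lower()
    if "request_too_large" in message or "request exceeds the maximum size" in message:
        return ContextWindowExceededError(str(e))
    if isinstance(e, anthropic.APIStatusError) and e.status_code == 413:
        return ContextWindowExceededError(str(e))
    return e
```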
Fixes #8422

🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
* feat: add zai provider support
* add zai_api_key secret to deploy-core
* add to justfile
* add testing, provider integration skill
* enable zai key
* fix zai test
* clean up skill a little
* small changes
* Revert "fix test"
This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1.
* fix server and better test
* test fix, get api key for base and byok?
* set letta default endpoint
* try to fix timeout for test
* fix for letta api key
* Delete apps/core/tests/sdk_v1/conftest.py
* Update utils.py
* clean up a few issues
* fix filtering on list_llm_models
* soft delete models with provider
* add one more test
* fix ci
* add timeout
* band aid for letta embedding provider
* info instead of error logs when creating models
* first hack with test
* remove changes integration test
* Delete apps/core/tests/sdk_v1/integration/integration_test_send_message_v2.py
* add test
* remove comment
* stage and publish api
* deprecate base level response_schema
* add param to llm_config test
---------
Co-authored-by: Ari Webb <ari@letta.com>
* feat: add support for new model
* fix: just stage-api && just publish-api (anthropic model settings changed)
* fix: make kevlar have default reasoning on
* fix: bump anthropic sdk version
* fix: patch name
* pin newer version anthropic
---------
Co-authored-by: Ari Webb <ari@letta.com>
Root cause: When splitting failed embedding batches, mid=0 for single
items created empty chunks. These empty chunks were then processed,
creating hundreds of no-op tasks that consumed memory.
Crash pattern from logs:
- 600+ 'batch_size=0' embedding tasks created
- Memory spiked 531 MB → 4.9 GB
- Pod crashed
Fixes:
1. Skip empty chunks before creating tasks
2. Guard chunk splits to prevent empty slices (mid = max(1, len//2))
3. Break early if all chunks are empty
This prevents the asyncio.gather() from creating thousands of empty
coroutines that exhaust memory.
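
A minimal sketch of the guarded split (function and variable names are illustrative; the `max(1, ...)` guard, empty-chunk skip, and early break mirror the fix):

```python
import asyncio

async def embed_with_split(items: list, embed_batch):
    """Recursively bisect a failing batch without producing empty chunks."""
    if not items:
        return []  # break early: nothing to do for an empty chunk
    try:
        return await embed_batch(items)
    except Exception:
        if len(items) <= 1:
            raise  # a single item cannot be split further
        mid = max(1, len(items) // 2)  # guard: a split never yields an empty slice
        halves = [half for half in (items[:mid], items[mid:]) if half]  # skip empties
        results = await asyncio.gather(
            *(embed_with_split(half, embed_batch) for half in halves)
        )
        return [embedding for half in results for embedding in half]
```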