letta-server

Author	SHA1	Message	Date
Charles Packer	33d39f4643	fix(core): patch usage data tracking for anthropic when context caching is on (#6997 )	2025-12-15 12:03:09 -08:00
Kevin Lin	4b9485a484	feat: Add max tokens exceeded to stop reasons [LET-6480] (#6576 )	2025-12-15 12:03:09 -08:00
Ari Webb	4d90f37f50	feat: add gpt-5.2 support (#6698 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	c8fa77a01f	feat: cleanup cancellation code and add more logging (#6588 )	2025-12-15 12:02:34 -08:00
Sarah Wooders	8440e319e2	Revert "feat: enable provider models persistence" (#6590 ) Revert "feat: enable provider models persistence (#6193)" This reverts commit 9682aff32640a6ee8cf71a6f18c9fa7cda25c40e.	2025-12-15 12:02:34 -08:00
Ari Webb	848a73125c	feat: enable provider models persistence (#6193 ) * Revert "fix test" This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1. * fix server and better test * test fix, get api key for base and byok? * set letta default endpoint * try to fix timeout for test * fix for letta api key * Delete apps/core/tests/sdk_v1/conftest.py * Update utils.py * clean up a few issues * fix filtering on list_llm_models * soft delete models with provider * add one more test * fix ci * add timeout * band aid for letta embedding provider * info instead of error logs when creating models	2025-12-15 12:02:34 -08:00
Devansh Jain	d1536df6f6	chore: Update deepseek client for v3.2 models (#6556 ) * support for v3.2 models * streaming + context window fix * fix for no assistant text from deepseek	2025-12-15 12:02:34 -08:00
Kian Jones	d6292b6eb6	fix: bug which causes unrecoverable state if previous message was an image (#6486 ) * trying out gpt-5.1-codex * add unit test for message content * try to support multimodal	2025-12-15 12:02:33 -08:00
Sarah Wooders	91e3dd8b3e	feat: fix new summarizer code and add more tests (#6461 )	2025-12-15 12:02:19 -08:00
Charles Packer	e67c98eedb	feat: add tests for prompt caching + fix anthropic prompt caching [LET-6373] (#6454 ) * feat: add tests for prompt caching * fix: add cache control breakpoints for anthropic + fix tests * fix: silence logging * fix: patch token counting error * fix: same patch on non-streaming path	2025-12-15 12:02:19 -08:00
Charles Packer	88a3743cc8	fix(core): distinguish between null and 0 for prompt caching (#6451 ) * fix(core): distinguish between null and 0 for prompt caching * fix: runtime errors * fix: just publish just sgate	2025-12-15 12:02:19 -08:00
Charles Packer	131891e05f	feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449 ) * feat: init refactor * feat: add helper code * fix: missing file + test * fix: just state/publish api	2025-12-15 12:02:19 -08:00
Sarah Wooders	807c5c18d9	feat: add gemini token counting [LET-6371] (#6444 )	2025-12-15 12:02:19 -08:00
jnjpng	c6df306ccf	fix: upgrade google-genai sdk version and fix gemini 3 streaming (#6437 ) * base * base --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-12-15 12:02:18 -08:00
Ari Webb	d4e7428c98	feat: structured outputs for anthropic [LET-6232] (#6410 ) feat: structured outputs for anthropic Co-authored-by: Ari Webb <ari@letta.com>	2025-11-26 14:39:40 -08:00
Ari Webb	89c7ab5f14	feat: structured outputs for openai [LET-6233] (#6363 ) * first hack with test * remove changes integration test * Delete apps/core/tests/sdk_v1/integration/integration_test_send_message_v2.py * add test * remove comment * stage and publish api * deprecate base level response_schema * add param to llm_config test --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-26 14:39:39 -08:00
Charles Packer	c4699b3d17	feat: add support for opus 4.5 (#6256 ) * feat: add support for new model * fix: just stage-api && just publish-api (anthropic model settings changed) * fix: just stage-api && just publish-api (anthropic model settings changed) * fix: make kevlar have default reasoning on * fix: bump anthropic sdk version * fix: patch name * pin newer version anthropic --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-24 19:10:27 -08:00
jnjpng	9adeb122cd	feat: set thinking level high for gemini 3 (#6330 ) * base * include thoughts * update * clean --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-11-24 19:10:27 -08:00
cthomas	d903056a43	fix: list indices must be int not str error [LET-6254] (#6302 ) fix: list indices must be int not str error	2025-11-24 19:10:26 -08:00
cthomas	1c358c863e	fix: google api sync request (#6292 ) * fix: google api sync request * remove unused function	2025-11-24 19:10:26 -08:00
cthomas	2ebf863998	fix: type error in agent step (#6287 )	2025-11-24 19:10:26 -08:00
cthomas	6f810d95d8	feat: add semaphore to limit embeddings creation (#6261 )	2025-11-24 19:10:11 -08:00
Ari Webb	9bff8413a0	fix: fix gemini models reasoning [LET-6228] (#6245 ) * fixed locally * fix tests --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-24 19:09:33 -08:00
Ari Webb	7380eaec13	feat: enable gpt5.1 models [LET-6178] (#6175 ) * hack at gpt51 * revert package lock * first hack * default context window --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-24 19:09:33 -08:00
Kian Jones	ddb6f3836e	Fix: prevent empty embedding batches from causing memory spikes (#6230 ) Root cause: When splitting failed embedding batches, mid=0 for single items created empty chunks. These empty chunks were then processed, creating hundreds of no-op tasks that consumed memory. Crash pattern from logs: - 600+ 'batch_size=0' embedding tasks created - Memory spiked 531 MB → 4.9 GB - Pod crashed Fixes: 1. Skip empty chunks before creating tasks 2. Guard chunk splits to prevent empty slices (mid = max(1, len//2)) 3. Break early if all chunks are empty This prevents the asyncio.gather() from creating thousands of empty coroutines that exhaust memory.	2025-11-24 19:09:33 -08:00
Sarah Wooders	a6b19bf3aa	Revert "feat: support anthropic structured outputs [LET-6190]" (#6229 ) Revert "feat: support anthropic structured outputs [LET-6190] (#6200)" This reverts commit 46f6422e68b0d35df1690118dd971c20a4ba7a81.	2025-11-24 19:09:33 -08:00
Sarah Wooders	c18af2bc81	feat: support anthropic structured outputs [LET-6190] (#6200 )	2025-11-24 19:09:33 -08:00
Ari Webb	f19a71dad1	chore: log problematic chunk (#6166 ) log problematic chunk Co-authored-by: Ari Webb <ari@letta.com>	2025-11-24 19:09:32 -08:00
Kian Jones	aafd5696c5	fix: logging and try to handle invalid embeddings (#6145 ) logging and try to handle invalid embeddings	2025-11-13 15:36:56 -08:00
jnjpng	cb63753db6	fix: f string formatting separator in anthropic client (#6058 ) * fix * fix --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-11-13 15:36:56 -08:00
Charles Packer	18029250d0	fix(core): sanitize messages to anthropic in the main path the same way (or similar) to how we do it in the token counter (#6044 ) * fix(core): sanitize messages to anthropic in the main path the same way (or similar) to how we do it in the token counter * fix: also patch poison error in backend by filtering lazily * fix: remap streaming errors (what the fuck) * fix: dedupe tool calls * fix: cleanup, removed try/catch	2025-11-13 15:36:55 -08:00
Kian Jones	6943b68288	tests: adding unit testing and fix edge case (#5992 ) cursor bugbot suggestion number 2 and adding unit testing	2025-11-13 15:36:55 -08:00
Kian Jones	688397bbd2	fix: filter out messages with empty content in anthropic client (#5987 ) * no bare raise in anthropic client. Also filter out messages with empty content * use client logic * replace empty/whitespace with period * cursor bugbot	2025-11-13 15:36:55 -08:00
Charles Packer	b42e9fa7f3	fix(core): properly map bytes overflow to context overflow error (#5930 )	2025-11-13 15:36:45 -08:00
Ari Webb	7427c0998e	feat: gemini parallel tool calling non streaming [LET-5993] (#5889 ) * first hack * just test non streaming * stream_steps should pass too * clean up --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-13 15:36:39 -08:00
Sarah Wooders	57bb051ea4	feat: add tool return truncation to summarization as a fallback [LET-5970] (#5859 )	2025-11-13 15:36:30 -08:00
Ari Webb	48cc73175b	feat: parallel tool calling for openai non streaming [LET-4593] (#5773 ) * first hack * clean up * first implementation working * revert package-lock * remove openai test * error throw * typo * Update integration_test_send_message_v2.py * Update integration_test_send_message_v2.py * refine test * Only make changes for openai non streaming * Add tests --------- Co-authored-by: Ari Webb <ari@letta.com> Co-authored-by: Matt Zhou <mattzh1314@gmail.com>	2025-11-13 15:36:14 -08:00
Charles Packer	f6e6c85c46	fix(core): patch counting bug (#5857 )	2025-11-13 15:36:14 -08:00
Sarah Wooders	d109bab587	feat: handle overloaded errors [LET-5969] (#5856 ) feat: handle overloaded errors	2025-11-13 15:36:14 -08:00
Ari Webb	f3a40a41f5	feat: updated backend to not allow minimal for codex [LET-5883] (#5760 ) * updated backend * add function in openai_client * remove values before error * remove test --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-13 15:35:34 -08:00
Sarah Wooders	e7fff12da0	feat: patch model listing to actually match `handle` [LET-5888] (#5754 )	2025-11-13 15:35:34 -08:00
Kian Jones	704d3b2d79	chore: refactor not to use warnings.warn (#5730 ) * refactor not to use warnings.warn * temp circular import fix maybe unnecessary/bad * fix Deprecation warning * fix deprecation warning and mcp thing? * revert changes to mcp server test * fix deprecation warning	2025-10-24 15:14:31 -07:00
Kian Jones	161c70f11c	chore: add logger calls instead of prints (#5720 ) add logger calls instead of prints	2025-10-24 15:14:21 -07:00
Kian Jones	6757c7e201	fix: also catch attribute error (.get() on type int for example) (#5650 ) also catch attribute error (.get() on type int for example)	2025-10-24 15:13:47 -07:00
Ari Webb	7057dce156	feat: add support for reasoning for Claude Haiku 4.5 [LET-5735] (#5585 ) frontend and backend changes Co-authored-by: Ari Webb <ari@letta.com>	2025-10-24 15:13:35 -07:00
jnjpng	b0c0c8752b	feat: support anyOf for structured output tools [LET-5615] (#5556 ) * base * works? * update tests --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-10-24 15:13:15 -07:00
Matthew Zhou	09ba075cfa	feat: Modify embedding strategy to first halve the batch size v.s. the batc… [LET-5510] (#5434 ) Modify embedding strategy to first halve the batch size v.s. the batch size	2025-10-24 15:12:11 -07:00
Matthew Zhou	0543a60538	chore: Restore chunk size for openai embeddings (#5431 ) Restore chunk size	2025-10-24 15:12:11 -07:00
Ari Webb	624c591820	Ari/let 5486 badrequesterror error code 400 error message requested [LET-5486] (#5422 ) * letta agent v2 throw exception not error * warning instead of error or exception * decrease min_chunk_size --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-10-24 15:12:11 -07:00
cthomas	731e314b5d	fix: anthropic payload tool parse error LET-5469 (#5388 ) fix: anthropic payload tool parse error	2025-10-24 15:11:31 -07:00
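
Note on ddb6f3836e (#6230): the message describes guarding the batch split with mid = max(1, len//2) and skipping empty chunks before creating tasks. A minimal sketch of that idea follows. The function name split_batch and its signature are hypothetical and are not taken from the letta codebase; the actual implementation may differ.

```python
from typing import List, Sequence


def split_batch(batch: Sequence[str]) -> List[List[str]]:
    """Split a failed batch into halves, never returning an empty chunk.

    Hypothetical helper illustrating the guard described in the commit
    message; not the project's actual code.
    """
    # Guard the split point so a single-item batch is not cut at index 0.
    mid = max(1, len(batch) // 2)
    halves = [list(batch[:mid]), list(batch[mid:])]
    # Skip empty chunks so that no zero-size tasks are scheduled for them.
    return [chunk for chunk in halves if chunk]
```

With this guard a single-item batch yields one chunk holding that item, and an empty batch yields no chunks at all, so a caller that builds one task per chunk schedules nothing for it.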

406 Commits