Commit Graph

285 Commits

Author SHA1 Message Date
cthomas
6f810d95d8 feat: add semaphore to limit embeddings creation (#6261) 2025-11-24 19:10:11 -08:00
Ari Webb
9bff8413a0 fix: fix gemini models reasoning [LET-6228] (#6245)
* fixed locally

* fix tests

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:09:33 -08:00
Ari Webb
7380eaec13 feat: enable gpt5.1 models [LET-6178] (#6175)
* hack at gpt51

* revert package lock

* first hack

* default context window

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:09:33 -08:00
Kian Jones
ddb6f3836e Fix: prevent empty embedding batches from causing memory spikes (#6230)
Root cause: When splitting failed embedding batches, mid=0 for single
items created empty chunks. These empty chunks were then processed,
creating hundreds of no-op tasks that consumed memory.

Crash pattern from logs:
- 600+ 'batch_size=0' embedding tasks created
- Memory spiked 531 MB → 4.9 GB
- Pod crashed

Fixes:
1. Skip empty chunks before creating tasks
2. Guard chunk splits to prevent empty slices (mid = max(1, len//2))
3. Break early if all chunks are empty

This prevents the asyncio.gather() from creating thousands of empty
coroutines that exhaust memory.
2025-11-24 19:09:33 -08:00
Sarah Wooders
a6b19bf3aa Revert "feat: support anthropic structured outputs [LET-6190]" (#6229)
Revert "feat: support anthropic structured outputs [LET-6190] (#6200)"

This reverts commit 46f6422e68b0d35df1690118dd971c20a4ba7a81.
2025-11-24 19:09:33 -08:00
Sarah Wooders
c18af2bc81 feat: support anthropic structured outputs [LET-6190] (#6200) 2025-11-24 19:09:33 -08:00
Ari Webb
f19a71dad1 chore: log problematic chunk (#6166)
log problematic chunk

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:09:32 -08:00
Kian Jones
aafd5696c5 fix: logging and try to handle invalid embeddings (#6145)
logging and try to handkle invalid embeddings
2025-11-13 15:36:56 -08:00
jnjpng
cb63753db6 fix: f string formatting separator in anthropic client (#6058)
* fix

* fix

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-11-13 15:36:56 -08:00
Charles Packer
18029250d0 fix(core): sanitize messages to anthropic in the main path the same way (or similar) to how we do it in the token counter (#6044)
* fix(core): sanitize messages to anthropic in the main path the same way (or similar) to how we do it in the token counter

* fix: also patch poison error in backend by filtering lazily

* fix: remap streaming errors (what the fuck)

* fix: dedupe tool clals

* fix: cleanup, removed try/catch
2025-11-13 15:36:55 -08:00
Kian Jones
6943b68288 tests: adding unit testing and fix edge case (#5992)
cursor bugbot suggestion number 2 and adding unit testing
2025-11-13 15:36:55 -08:00
Kian Jones
688397bbd2 fix: filter out messages with empty content in anthropic client (#5987)
* no bare raise in anthropic client. Also filter out messages with empty content

* use client logic

* replace empty/whitespace with period

* cursor bugbot
2025-11-13 15:36:55 -08:00
Charles Packer
b42e9fa7f3 fix(core): properly map bytes overflow to context overflow error (#5930) 2025-11-13 15:36:45 -08:00
Ari Webb
7427c0998e feat: gemini parallel tool calling non streaming [LET-5993] (#5889)
* first hack

* just test non streaming

* stream_steps should pass too

* clean up

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-13 15:36:39 -08:00
Sarah Wooders
57bb051ea4 feat: add tool return truncation to summarization as a fallback [LET-5970] (#5859) 2025-11-13 15:36:30 -08:00
Ari Webb
48cc73175b feat: parallel tool calling for openai non streaming [LET-4593] (#5773)
* first hack

* clean up

* first implementation working

* revert package-lock

* remove openai test

* error throw

* typo

* Update integration_test_send_message_v2.py

* Update integration_test_send_message_v2.py

* refine test

* Only make changes for openai non streaming

* Add tests

---------

Co-authored-by: Ari Webb <ari@letta.com>
Co-authored-by: Matt Zhou <mattzh1314@gmail.com>
2025-11-13 15:36:14 -08:00
Charles Packer
f6e6c85c46 fix(core): patch counting bug (#5857) 2025-11-13 15:36:14 -08:00
Sarah Wooders
d109bab587 feat: handle overloaded errors [LET-5969] (#5856)
feat: handle overloaded errors
2025-11-13 15:36:14 -08:00
Ari Webb
f3a40a41f5 feat: updated backend to not allow minimal for codex [LET-5883] (#5760)
* updated backend

* add function in openai_client

* remove values before error

* remove test

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-13 15:35:34 -08:00
Sarah Wooders
e7fff12da0 feat: patch model listing to actually match handle [LET-5888] (#5754) 2025-11-13 15:35:34 -08:00
Kian Jones
704d3b2d79 chore: refactor not to use warnings.warn (#5730)
* refactor not to use warnings.warn

* temp circular import fix maybe unecessary/bnad

* fix Deprecation warning

* fix deprecation warning and mcp thing?

* revert changes to mcp server test

* fix deprecation warning
2025-10-24 15:14:31 -07:00
Kian Jones
161c70f11c chore: add logger calls instead of prints (#5720)
add logger calls instead of prints
2025-10-24 15:14:21 -07:00
Kian Jones
6757c7e201 fix: also catch attribute error (.get() on type int for example) (#5650)
also catch attribute error (.get() on type int for example)
2025-10-24 15:13:47 -07:00
Ari Webb
7057dce156 feat: add support for reasoning for Claude Haiku 4.5 [LET-5735] (#5585)
frontend and backend changes

Co-authored-by: Ari Webb <ari@letta.com>
2025-10-24 15:13:35 -07:00
jnjpng
b0c0c8752b feat: support anyOf for structured output tools [LET-5615] (#5556)
* base

* works?

* update tests

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-10-24 15:13:15 -07:00
Matthew Zhou
09ba075cfa feat: Modify embedding strategy to first halve the batch size v.s. the batc… [LET-5510] (#5434)
Modify embedding strategy to first halve the batch size v.s. the batch size
2025-10-24 15:12:11 -07:00
Matthew Zhou
0543a60538 chore: Restore chunk size for openai embeddings (#5431)
Restore chunk size
2025-10-24 15:12:11 -07:00
Ari Webb
624c591820 Ari/let 5486 badrequesterror error code 400 error message requested [LET-5486] (#5422)
* letta agent v2 throw exception not error

* warning instead of error or exception

* decrease min_chunk_size

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-10-24 15:12:11 -07:00
cthomas
731e314b5d fix: anthropic payload tool parse error LET-5469 (#5388)
fix: anthropic payload tool parse error
2025-10-24 15:11:31 -07:00
Kevin Lin
08da1a64bb feat: parse reasoning_content from OAI proxies (eg. vLLM / OpenRouter) (#5372)
* reasonig_content support

* fix

* comment

* fix

* rm comment

---------

Co-authored-by: Charles Packer <packercharles@gmail.com>
2025-10-24 15:11:31 -07:00
Matthew Zhou
25f140bd13 fix: Fix anthropic step parallel tool calling and add tests [LET-5438] (#5379)
* Fix anthropic step parallel tool calling and add tests

* Remove print statements
2025-10-24 15:11:31 -07:00
cthomas
a1e771877d feat: add model name patch for openrouter (#5303)
* feat: add model name patch for openrouter

* add comment
2025-10-09 15:25:21 -07:00
Charles Packer
6889a73dcf fix(core): patch the headers not appearing in the misc anthropic routes [LET-5337] (#5274)
fix(core): patch the headers not appearing in the misc anthroipic routes
2025-10-09 13:20:53 -07:00
cthomas
9cb5156ac7 chore: sync 0.12.0 version (#3023)
Co-authored-by: Matthew Zhou <mattzh1314@gmail.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2025-10-08 16:10:51 -07:00
cthomas
f8dce88ce4 feat: support for models that do not allow None content (#5218) 2025-10-07 17:50:50 -07:00
Charles Packer
07a687880f feat(core): add sonnet 1m support [LET-4620] (#5152)
feat(core): add sonnet 1m support
2025-10-07 17:50:49 -07:00
cthomas
89321ff29a feat: handle flaky reasoning in v2 tests (#5133) 2025-10-07 17:50:49 -07:00
cthomas
93d9ff01c6 feat: add gemini native thinking (#5124)
* feat: add gemini native thinking

* update test

* revert comments
2025-10-07 17:50:49 -07:00
cthomas
6b6b82a07a feat: add streaming support to vertex client (#5106) 2025-10-07 17:50:48 -07:00
cthomas
a3545110cf feat: add full responses api support in new agent loop (#5051)
* feat: add full responses api support in new agent loop

* update matrix in workflow

* relax check for reasoning messages for high effort gpt 5

* fix indent

* one more relax
2025-10-07 17:50:48 -07:00
Matthew Zhou
df5c997da0 feat: Enable dynamic toggling of tool choice in v3 agent loop for OpenAI [LET-4564] (#5042)
* Add subsequent flag

* Finish integrating constrained/unconstrained toggling on v3 agent loop

* Update tests to run on v3

* Run lint
2025-10-07 17:50:47 -07:00
Charles Packer
1c7448eb9d fix: patch remaining failing send_message CI tests (#4974)
* fix: patch failing summarizer tests for anthropic claude 3.5

* fix: carveout for gemini-2.5-flash because it doesn't do the send_message tool call

* fix: deprecate old gemini test now that model is unavailable

* fix: deprecate old gemini test now that model is unavailable

* fix: deprecate old gemini test now that model is unavailable

* fix: patch flash flakiness

* fix: relax the gemini 2.5 flash test

* fix: relax the gemini 2.5 flash test

* fix: relax again

* fix: another flash fix

* fix: relax gpt-4o-mini

* fix: swap 4o-mini for 4.1

* fix: drop 4o-mini
2025-10-07 17:50:46 -07:00
Charles Packer
9fd901c3a5 fix(core): patch default reasoning for letta-free on letta v1 (#4953)
* fix(core): patch default reasoning for letta-free on letta v1

* fix: patch (unrelated?) bug w/ missing required for empty tools like reset_research
2025-10-07 17:50:46 -07:00
Sarah Wooders
b5de42fefd fix: patch summarizers for integration_test_send_message.py (#4919)
* fix: integration_test_send_message.py

* patch summarizer

* remove print
2025-10-07 17:50:46 -07:00
Charles Packer
81bf132712 fix: disable reasoning toggle when agent type is letta agent v1 [PRO-1346] (#4946)
* fix: change the backend logic for the reasoning toggle to be different for v1 vs old agent types, also change the frontend to disable toggling unless it's a toggle-able model

* fix: add more logic to when the reasoning switch is enabled or disabled

* fix: safety net in the backend to set config when undefined

* fix: disable prefix fill, it's a bug
2025-10-07 17:50:46 -07:00
Charles Packer
8da15aaf08 fix(core): patch issue where LLM may generate a 'noop' call [PRO-1340] (#4944)
fix(core): patch issue where LLM may generate a 'noop' call
2025-10-07 17:50:46 -07:00
Matthew Zhou
b0bc04fec7 fix: Patch batch routes (#4916)
Patch batch routes
2025-10-07 17:50:46 -07:00
Charles Packer
a4041879a4 feat: add new agent loop (squash rebase of OSS PR) (#4815)
* feat: squash rebase of OSS PR

* fix: revert changes that weren't on manual rebase

* fix: caught another one

* fix: disable force

* chore: drop print

* fix: just stage-api && just publish-api

* fix: make agent_type consistently an arg in the client

* fix: patch multi-modal support

* chore: put in todo stub

* fix: disable hardcoding for tests

* fix: patch validate agent sync (#4882)

patch validate agent sync

* fix: strip bad merge diff

* fix: revert unrelated diff

* fix: react_v2 naming -> letta_v1 naming

* fix: strip bad merge

---------

Co-authored-by: Kevin Lin <klin5061@gmail.com>
2025-10-07 17:50:45 -07:00
Charles Packer
9edc7f4d64 feat: add OpenRouterProvider (#4848)
* feat: init add of openrouter provider, doesn't work bc no key pass and no header pass

* fix: working
2025-10-07 17:50:45 -07:00
Kian Jones
b8e9a80d93 merge this (#4759)
* wait I forgot to comit locally

* cp the entire core directory and then rm the .git subdir
2025-09-17 15:47:40 -07:00