Commit Graph

309 Commits

Author SHA1 Message Date
Kevin Lin
33afb930fc fix: Google AI client logging as Vertex (#7337)
fix
2025-12-17 17:32:27 -08:00
Ari Webb
e47eb23522 fix: max output tokens for gemini 3 models (#7322) 2025-12-17 17:31:03 -08:00
Sarah Wooders
8729a037b9 fix: handle new openai overflow error format (#7110) 2025-12-17 17:31:02 -08:00
Charles Packer
33d39f4643 fix(core): patch usage data tracking for anthropic when context caching is on (#6997) 2025-12-15 12:03:09 -08:00
Kevin Lin
4b9485a484 feat: Add max tokens exceeded to stop reasons [LET-6480] (#6576) 2025-12-15 12:03:09 -08:00
Ari Webb
4d90f37f50 feat: add gpt-5.2 support (#6698) 2025-12-15 12:02:34 -08:00
Sarah Wooders
c8fa77a01f feat: cleanup cancellation code and add more logging (#6588) 2025-12-15 12:02:34 -08:00
Sarah Wooders
8440e319e2 Revert "feat: enable provider models persistence" (#6590)
Revert "feat: enable provider models persistence (#6193)"

This reverts commit 9682aff32640a6ee8cf71a6f18c9fa7cda25c40e.
2025-12-15 12:02:34 -08:00
Ari Webb
848a73125c feat: enable provider models persistence (#6193)
* Revert "fix test"

This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1.

* fix server and better test

* test fix, get api key for base and byok?

* set letta default endpoint

* try to fix timeout for test

* fix for letta api key

* Delete apps/core/tests/sdk_v1/conftest.py

* Update utils.py

* clean up a few issues

* fix filtering on list_llm_models

* soft delete models with provider

* add one more test

* fix ci

* add timeout

* band aid for letta embedding provider

* info instead of error logs when creating models
2025-12-15 12:02:34 -08:00
Devansh Jain
d1536df6f6 chore: Update deepseek client for v3.2 models (#6556)
* support for v3.2 models

* streaming + context window fix

* fix for no assistant text from deepseek
2025-12-15 12:02:34 -08:00
Kian Jones
d6292b6eb6 fix: bug which causes unrecoverable state if previous message was an image (#6486)
* trying out gpt-5.1-codex

* add unit test for message content

* try to support multimodal
2025-12-15 12:02:33 -08:00
Sarah Wooders
91e3dd8b3e feat: fix new summarizer code and add more tests (#6461) 2025-12-15 12:02:19 -08:00
Charles Packer
e67c98eedb feat: add tests for prompt caching + fix anthropic prompt caching [LET-6373] (#6454)
* feat: add tests for prompt caching

* fix: add cache control breakpoints for anthropic + fix tests

* fix: silence logging

* fix: patch token counting error

* fix: same patch on non-streaming path
2025-12-15 12:02:19 -08:00
Charles Packer
88a3743cc8 fix(core): distinguish between null and 0 for prompt caching (#6451)
* fix(core): distinguish between null and 0 for prompt caching

* fix: runtime errors

* fix: just publish just stage
2025-12-15 12:02:19 -08:00
Charles Packer
131891e05f feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449)
* feat: init refactor

* feat: add helper code

* fix: missing file + test

* fix: just stage/publish api
2025-12-15 12:02:19 -08:00
Sarah Wooders
807c5c18d9 feat: add gemini token counting [LET-6371] (#6444) 2025-12-15 12:02:19 -08:00
jnjpng
c6df306ccf fix: upgrade google-genai sdk version and fix gemini 3 streaming (#6437)
* base

* base

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:18 -08:00
Ari Webb
d4e7428c98 feat: structured outputs for anthropic [LET-6232] (#6410)
feat: structured outputs for anthropic

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-26 14:39:40 -08:00
Ari Webb
89c7ab5f14 feat: structured outputs for openai [LET-6233] (#6363)
* first hack with test

* remove changes integration test

* Delete apps/core/tests/sdk_v1/integration/integration_test_send_message_v2.py

* add test

* remove comment

* stage and publish api

* deprecate base level response_schema

* add param to llm_config test

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-26 14:39:39 -08:00
Charles Packer
c4699b3d17 feat: add support for opus 4.5 (#6256)
* feat: add support for new model

* fix: just stage-api && just publish-api (anthropic model settings changed)

* fix: just stage-api && just publish-api (anthropic model settings changed)

* fix: make kevlar have default reasoning on

* fix: bump anthropic sdk version

* fix: patch name

* pin newer version anthropic

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:10:27 -08:00
jnjpng
9adeb122cd feat: set thinking level high for gemini 3 (#6330)
* base

* include thoughts

* update

* clean

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-11-24 19:10:27 -08:00
cthomas
d903056a43 fix: list indices must be int not str error [LET-6254] (#6302)
fix: list indices must be int not str error
2025-11-24 19:10:26 -08:00
cthomas
1c358c863e fix: google api sync request (#6292)
* fix: google api sync request

* remove unused function
2025-11-24 19:10:26 -08:00
cthomas
2ebf863998 fix: type error in agent step (#6287) 2025-11-24 19:10:26 -08:00
cthomas
6f810d95d8 feat: add semaphore to limit embeddings creation (#6261) 2025-11-24 19:10:11 -08:00
Ari Webb
9bff8413a0 fix: fix gemini models reasoning [LET-6228] (#6245)
* fixed locally

* fix tests

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:09:33 -08:00
Ari Webb
7380eaec13 feat: enable gpt5.1 models [LET-6178] (#6175)
* hack at gpt51

* revert package lock

* first hack

* default context window

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:09:33 -08:00
Kian Jones
ddb6f3836e Fix: prevent empty embedding batches from causing memory spikes (#6230)
Root cause: When splitting failed embedding batches, mid=0 for single
items created empty chunks. These empty chunks were then processed,
creating hundreds of no-op tasks that consumed memory.

Crash pattern from logs:
- 600+ 'batch_size=0' embedding tasks created
- Memory spiked 531 MB → 4.9 GB
- Pod crashed

Fixes:
1. Skip empty chunks before creating tasks
2. Guard chunk splits to prevent empty slices (mid = max(1, len//2))
3. Break early if all chunks are empty

This prevents the asyncio.gather() from creating thousands of empty
coroutines that exhaust memory.
2025-11-24 19:09:33 -08:00
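The three fixes listed in ddb6f3836e can be illustrated with a minimal sketch. This is not the code from the commit: the helper names `split_chunk` and `plan_retry_chunks` are hypothetical, and the sketch only assumes that a failed batch is a list of texts that gets halved before being retried.

```python
from typing import List, Sequence


def split_chunk(chunk: Sequence[str]) -> List[List[str]]:
    """Halve a failed chunk (hypothetical helper, not the commit's code).

    Without the guard, len(chunk) // 2 is 0 for a single item, so the
    first half would be empty. max(1, ...) keeps the first half non-empty
    whenever the chunk itself is non-empty.
    """
    mid = max(1, len(chunk) // 2)
    return [list(chunk[:mid]), list(chunk[mid:])]


def plan_retry_chunks(failed_chunks: Sequence[Sequence[str]]) -> List[List[str]]:
    """Build the list of chunks to retry, skipping empty ones.

    Empty pieces are dropped before any task would be created for them,
    so a caller never schedules a batch of size 0.
    """
    retry: List[List[str]] = []
    for chunk in failed_chunks:
        for piece in split_chunk(chunk):
            if piece:  # skip empty chunks
                retry.append(piece)
    return retry
```

A caller would stop retrying when `plan_retry_chunks(...)` returns an empty list (the "break early" case) and only then hand the remaining chunks to something like `asyncio.gather()`.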
Sarah Wooders
a6b19bf3aa Revert "feat: support anthropic structured outputs [LET-6190]" (#6229)
Revert "feat: support anthropic structured outputs [LET-6190] (#6200)"

This reverts commit 46f6422e68b0d35df1690118dd971c20a4ba7a81.
2025-11-24 19:09:33 -08:00
Sarah Wooders
c18af2bc81 feat: support anthropic structured outputs [LET-6190] (#6200) 2025-11-24 19:09:33 -08:00
Ari Webb
f19a71dad1 chore: log problematic chunk (#6166)
log problematic chunk

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:09:32 -08:00
Kian Jones
aafd5696c5 fix: logging and try to handle invalid embeddings (#6145)
logging and try to handle invalid embeddings
2025-11-13 15:36:56 -08:00
jnjpng
cb63753db6 fix: f string formatting separator in anthropic client (#6058)
* fix

* fix

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-11-13 15:36:56 -08:00
Charles Packer
18029250d0 fix(core): sanitize messages to anthropic in the main path the same way (or similar) to how we do it in the token counter (#6044)
* fix(core): sanitize messages to anthropic in the main path the same way (or similar) to how we do it in the token counter

* fix: also patch poison error in backend by filtering lazily

* fix: remap streaming errors (what the fuck)

* fix: dedupe tool calls

* fix: cleanup, removed try/catch
2025-11-13 15:36:55 -08:00
Kian Jones
6943b68288 tests: adding unit testing and fix edge case (#5992)
cursor bugbot suggestion number 2 and adding unit testing
2025-11-13 15:36:55 -08:00
Kian Jones
688397bbd2 fix: filter out messages with empty content in anthropic client (#5987)
* no bare raise in anthropic client. Also filter out messages with empty content

* use client logic

* replace empty/whitespace with period

* cursor bugbot
2025-11-13 15:36:55 -08:00
Charles Packer
b42e9fa7f3 fix(core): properly map bytes overflow to context overflow error (#5930) 2025-11-13 15:36:45 -08:00
Ari Webb
7427c0998e feat: gemini parallel tool calling non streaming [LET-5993] (#5889)
* first hack

* just test non streaming

* stream_steps should pass too

* clean up

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-13 15:36:39 -08:00
Sarah Wooders
57bb051ea4 feat: add tool return truncation to summarization as a fallback [LET-5970] (#5859) 2025-11-13 15:36:30 -08:00
Ari Webb
48cc73175b feat: parallel tool calling for openai non streaming [LET-4593] (#5773)
* first hack

* clean up

* first implementation working

* revert package-lock

* remove openai test

* error throw

* typo

* Update integration_test_send_message_v2.py

* Update integration_test_send_message_v2.py

* refine test

* Only make changes for openai non streaming

* Add tests

---------

Co-authored-by: Ari Webb <ari@letta.com>
Co-authored-by: Matt Zhou <mattzh1314@gmail.com>
2025-11-13 15:36:14 -08:00
Charles Packer
f6e6c85c46 fix(core): patch counting bug (#5857) 2025-11-13 15:36:14 -08:00
Sarah Wooders
d109bab587 feat: handle overloaded errors [LET-5969] (#5856)
feat: handle overloaded errors
2025-11-13 15:36:14 -08:00
Ari Webb
f3a40a41f5 feat: updated backend to not allow minimal for codex [LET-5883] (#5760)
* updated backend

* add function in openai_client

* remove values before error

* remove test

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-13 15:35:34 -08:00
Sarah Wooders
e7fff12da0 feat: patch model listing to actually match handle [LET-5888] (#5754) 2025-11-13 15:35:34 -08:00
Kian Jones
704d3b2d79 chore: refactor not to use warnings.warn (#5730)
* refactor not to use warnings.warn

* temp circular import fix maybe unnecessary/bad

* fix Deprecation warning

* fix deprecation warning and mcp thing?

* revert changes to mcp server test

* fix deprecation warning
2025-10-24 15:14:31 -07:00
Kian Jones
161c70f11c chore: add logger calls instead of prints (#5720)
add logger calls instead of prints
2025-10-24 15:14:21 -07:00
Kian Jones
6757c7e201 fix: also catch attribute error (.get() on type int for example) (#5650)
also catch attribute error (.get() on type int for example)
2025-10-24 15:13:47 -07:00
Ari Webb
7057dce156 feat: add support for reasoning for Claude Haiku 4.5 [LET-5735] (#5585)
frontend and backend changes

Co-authored-by: Ari Webb <ari@letta.com>
2025-10-24 15:13:35 -07:00
jnjpng
b0c0c8752b feat: support anyOf for structured output tools [LET-5615] (#5556)
* base

* works?

* update tests

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-10-24 15:13:15 -07:00
Matthew Zhou
09ba075cfa feat: Modify embedding strategy to first halve the batch size v.s. the batc… [LET-5510] (#5434)
Modify embedding strategy to first halve the batch size v.s. the batch size
2025-10-24 15:12:11 -07:00