55 Commits

Author SHA1 Message Date
Ari Webb
0bbb9c9bc0 feat: add reasoning zai openrouter (#9189)
* feat: add reasoning zai openrouter

* add openrouter reasoning

* stage + publish api

* openrouter reasoning always on

* revert

* fix

* remove reference

* do
2026-02-24 10:52:06 -08:00
Sarah Wooders
221b4e6279 refactor: add extract_usage_statistics returning LettaUsageStatistics (#9065)
👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
cthomas
c162de5127 fix: use shared event + .athrow() to properly set stream_was_cancelle… (#9019)
fix: use shared event + .athrow() to properly set stream_was_cancelled flag

**Problem:**
When a run is cancelled via /cancel endpoint, `stream_was_cancelled` remained
False because `RunCancelledException` was raised in the consumer code (wrapper),
which closes the generator from outside. This causes Python to skip the
generator's except blocks and jump directly to finally with the wrong flag value.

**Solution:**
1. Shared `asyncio.Event` registry for cross-layer cancellation signaling
2. `cancellation_aware_stream_wrapper` sets the event when cancellation detected
3. Wrapper uses `.athrow()` to inject exception INTO generator (not consumer-side raise)
4. All streaming interfaces check event in `finally` block to set flag correctly
5. `streaming_service.py` handles `RunCancelledException` gracefully, yields [DONE]
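
The difference between a consumer-side raise and `.athrow()` injection can be reproduced in isolation. The sketch below is a minimal illustration, not the code from this commit: `stream`, `consumer_side_raise`, `injected_with_athrow`, and the `flags` dict are hypothetical names, `RunCancelledException` is a local stand-in class, and the shared `asyncio.Event` registry is left out.

```python
import asyncio


class RunCancelledException(Exception):
    """Local stand-in for the exception named in the commit message."""


async def stream(flags: dict):
    # Hypothetical streaming generator; `flags` records what the generator saw.
    try:
        for i in range(3):
            yield i
    except RunCancelledException:
        # Reached only if the exception is raised *inside* the generator.
        flags["stream_was_cancelled"] = True
    finally:
        flags["finally_ran"] = True


async def consumer_side_raise() -> dict:
    flags = {"stream_was_cancelled": False, "finally_ran": False}
    gen = stream(flags)
    try:
        async for _ in gen:
            # Raised in the consumer: the generator stays suspended at `yield`.
            raise RunCancelledException()
    except RunCancelledException:
        pass
    # Closing from outside throws GeneratorExit into the generator, which the
    # `except RunCancelledException` clause does not match; only `finally` runs.
    await gen.aclose()
    return flags


async def injected_with_athrow() -> dict:
    flags = {"stream_was_cancelled": False, "finally_ran": False}
    gen = stream(flags)
    await gen.__anext__()  # advance to the first `yield`
    try:
        # Inject the exception at the suspended `yield` inside the generator.
        await gen.athrow(RunCancelledException())
    except StopAsyncIteration:
        # The generator handled the exception and finished without yielding.
        pass
    return flags


if __name__ == "__main__":
    print(asyncio.run(consumer_side_raise()))
    print(asyncio.run(injected_with_athrow()))
```

In the first case the flag stays `False` even though `finally` runs. In the second the generator's own `except` block sets it, which is the behaviour the numbered steps above rely on.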

**Changes:**
- streaming_response.py: Event registry + .athrow() injection + graceful handling
- openai_streaming_interface.py: 3 classes check event in finally
- gemini_streaming_interface.py: Check event in finally
- anthropic_*.py: Catch RunCancelledException
- simple_llm_stream_adapter.py: Create & pass event to interfaces
- streaming_service.py: Handle RunCancelledException, yield [DONE], skip double-update
- routers/v1/{conversations,runs}.py: Pass event to wrapper
- integration_test_human_in_the_loop.py: New test for approval + cancellation

**Tests:**
- test_tool_call with cancellation (OpenAI models) 
- test_approve_with_cancellation (approval flow + concurrent cancel) 

**Known cosmetic warnings (pre-existing):**
- "Run already in terminal state" - agent loop tries to update after /cancel
- "Stream ended without terminal event" - background streaming timing race

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Sarah Wooders
8729a037b9 fix: handle new openai overflow error format (#7110) 2025-12-17 17:31:02 -08:00
Devansh Jain
d1536df6f6 chore: Update deepseek client for v3.2 models (#6556)
* support for v3.2 models

* streaming + context window fix

* fix for no assistant text from deepseek
2025-12-15 12:02:34 -08:00
Kian Jones
647e271c2a fix: add more logging for stream error (#6490)
* trying out gpt-5.1-codex

* add unit test for message content

* try to support multimodal

* remove ValueError and add logging on stream error

* prevent stream termination from api spec implementation errors

* fix: remove final_response references from non-Responses API interfaces

* fix: add diagnostic attributes to SimpleOpenAIResponsesStreamingInterface

* fix: remove final_response from SimpleOpenAIStreamingInterface (Chat Completions API)
2025-12-15 12:02:33 -08:00
Sarah Wooders
91e3dd8b3e feat: fix new summarizer code and add more tests (#6461) 2025-12-15 12:02:19 -08:00
Charles Packer
4af6465226 feat(core+web): store raw usage data on streams (and visualize properly in ADE) (#6452)
* feat(core): store raw usage data on streams

* fix(web): various fixes to deal w/ hardcoding against openai
2025-12-15 12:02:19 -08:00
Charles Packer
88a3743cc8 fix(core): distinguish between null and 0 for prompt caching (#6451)
* fix(core): distinguish between null and 0 for prompt caching

* fix: runtime errors

* fix: just publish just stage
2025-12-15 12:02:19 -08:00
Charles Packer
131891e05f feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449)
* feat: init refactor

* feat: add helper code

* fix: missing file + test

* fix: just stage/publish api
2025-12-15 12:02:19 -08:00
Charles Packer
081a1f6920 fix(core): patch responses api parallel tool calling not returning tool call IDs (#6079)
* fix(core): patch responses api parallel tool calling not returning tool call ids

* fix(core): patch chatcompletions as well

* fix: patch problem with gpt-4.1
2025-11-13 15:36:56 -08:00
Matthew Zhou
a699aca626 fix: Eliminate O(n^2) string growth for OpenAI [LET-6065] (#5973)
Finish
2025-11-13 15:36:55 -08:00
Matthew Zhou
d3ab51b822 feat: Support parallel tool calling streaming for responses OpenAI [LET-5977] (#5867) 2025-11-13 15:36:20 -08:00
Matthew Zhou
ff81f4153b feat: Support parallel tool calling streaming for OpenAI chat completions [LET-4594] (#5865)
* Finish chat completions parallel tool calling

* Undo comments

* Add comments

* Remove test file
2025-11-13 15:36:14 -08:00
cthomas
1848df2daa feat: add special approval request otid for openai streaming (#5744)
* feat: add special approval request otid for openai streaming

* fix import
2025-10-24 15:14:39 -07:00
cthomas
4823416af9 feat: default unpack assistant message content [LET-5404] (#5707)
feat: default unpack assistant message content
2025-10-24 15:14:20 -07:00
Kevin Lin
08da1a64bb feat: parse reasoning_content from OAI proxies (eg. vLLM / OpenRouter) (#5372)
* reasoning_content support

* fix

* comment

* fix

* rm comment

---------

Co-authored-by: Charles Packer <packercharles@gmail.com>
2025-10-24 15:11:31 -07:00
Kian Jones
c2e474e03a feat: refactor logs to parse as a single log line each and filter out 404s from sentry (#5242)
* add multiline log auto detect

* implement logger.exception()

* filter out 404

* remove potentially problematic changes
2025-10-24 15:11:31 -07:00
Matthew Zhou
5593f1450b feat: Double write to ToolCallMessage's new list tool_calls field (#5268)
* Add new tool_calls field to ToolCallMessage

* fern autogen

* Double write to new tool_calls field

* Update straggling instances
2025-10-09 13:20:52 -07:00
Sarah Wooders
ef07e03ee3 feat: add run_id to input messages and step_id to messages (#5099) 2025-10-07 17:50:48 -07:00
cthomas
a3545110cf feat: add full responses api support in new agent loop (#5051)
* feat: add full responses api support in new agent loop

* update matrix in workflow

* relax check for reasoning messages for high effort gpt 5

* fix indent

* one more relax
2025-10-07 17:50:48 -07:00
cthomas
67f8e46619 feat: add run id to streamed messages (#5037) 2025-10-07 17:50:47 -07:00
Charles Packer
a4041879a4 feat: add new agent loop (squash rebase of OSS PR) (#4815)
* feat: squash rebase of OSS PR

* fix: revert changes that weren't on manual rebase

* fix: caught another one

* fix: disable force

* chore: drop print

* fix: just stage-api && just publish-api

* fix: make agent_type consistently an arg in the client

* fix: patch multi-modal support

* chore: put in todo stub

* fix: disable hardcoding for tests

* fix: patch validate agent sync (#4882)

patch validate agent sync

* fix: strip bad merge diff

* fix: revert unrelated diff

* fix: react_v2 naming -> letta_v1 naming

* fix: strip bad merge

---------

Co-authored-by: Kevin Lin <klin5061@gmail.com>
2025-10-07 17:50:45 -07:00
Kian Jones
b8e9a80d93 merge this (#4759)
* wait I forgot to commit locally

* cp the entire core directory and then rm the .git subdir
2025-09-17 15:47:40 -07:00
Kian Jones
22f70ca07c chore: officially migrate to submodule (#4502)
* remove apps/core and apps/fern

* fix precommit

* add submodule updates in workflows

* submodule

* remove core tests

* update core revision

* Add submodules: true to all GitHub workflows

- Ensure all workflows can access git submodules
- Add submodules support to deployment, test, and CI workflows
- Fix YAML syntax issues in workflow files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove core-lint

* upgrade core with latest main of oss

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 12:45:53 -07:00
cthomas
cb7296c81d fix: approval request for streaming (#4445)
* fix: approval request for streaming

* fix: claude code attempt, unit test passing (add on to #4445) (#4448)

* fix: claude code attempt, unit test passing

* chore: update locks to 0.1.314 from 0.1.312

* chore: just stage-api && just publish-api

* chore: drop dead poetry lock

---------

Co-authored-by: Charles Packer <packercharles@gmail.com>
2025-09-05 17:43:21 -07:00
Charles Packer
264171f327 fix: patch streaming hidden reasoning event [LET-4167] (#4367)
* fix: patch streaming hidden reasoning event

* fix: patch reasoning_effort not getting passed to openai
2025-09-02 16:21:18 -07:00
Charles Packer
9d49eff204 fix: patch the streaming issue in the openai client for when inner_thoughts_in_kwargs is off [LET-4146] (#4350)
fix: patch the streaming issue in the openai client for when inner_thoughts_in_kwargs is off
2025-09-02 12:44:25 -07:00
cthomas
1edcc13778 feat: support filtering out messages when converting to openai dict (#4337)
* feat: support filtering out messages when converting to openai dict

* fix imports
2025-09-01 12:48:45 -07:00
Kian Jones
fecf6decfb chore: migrate to ruff (#4305)
* base requirements

* autofix

* Configure ruff for Python linting and formatting

- Set up minimal ruff configuration with basic checks (E, W, F, I)
- Add temporary ignores for common issues during migration
- Configure pre-commit hooks to use ruff with pass_filenames
- This enables gradual migration from black to ruff

* Delete sdj

* autofixed only

* migrate lint action

* more autofixed

* more fixes

* change precommit

* try changing the hook

* try this stuff
2025-08-29 11:11:19 -07:00
cthomas
c8b370466e fix: duplicate message stream error (#3834) 2025-08-11 14:27:35 -07:00
cthomas
db41f01ac2 feat: continue stream processing on client cancel (#3796) 2025-08-07 13:17:36 -07:00
Andy Li
ca6f474c4e feat: track metrics for runs in db 2025-08-06 15:46:50 -07:00
cthomas
7d33254f5f feat: log stream cancellation to sentry (#3759) 2025-08-05 16:07:30 -07:00
jnjpng
6b082f0447 fix: manually count tokens for streaming lmstudio models
Co-authored-by: Jin Peng <jinjpeng@Jins-MacBook-Pro.local>
Co-authored-by: Charles Packer <packercharles@gmail.com>
2025-07-29 18:12:42 -07:00
Andy Li
33c1f26ab6 feat: support for agent loop job cancelation (#2837) 2025-07-02 14:31:16 -07:00
Kevin Lin
868294533c feat: add omitted reasoning to streaming openai reasoning (#2846)
Co-authored-by: Charles Packer <packercharles@gmail.com>
2025-06-24 18:47:38 -07:00
Sarah Wooders
5fa52a2c38 fix: avoid calling model_dump on stop reason messages twice (#2811) 2025-06-13 18:25:35 -07:00
cthomas
1405464a1c feat: send stop reason in letta APIs (#2789) 2025-06-13 16:04:48 -07:00
Andy Li
33bfd14017 fix: metric tracking (#2785) 2025-06-13 13:53:10 -07:00
cthomas
605a1f410c feat: consolidate logic for finish tokens (#2779) 2025-06-12 15:24:06 -07:00
Kevin Lin
58c4448235 fix: patch reasoning models (#2703)
Co-authored-by: Charles Packer <packercharles@gmail.com>
2025-06-11 17:20:04 -07:00
Andy Li
d2252f2953 feat: otel metrics and expanded collecting (#2647)
(passed tests in last run)
2025-06-05 17:20:14 -07:00
cthomas
22c66da7bc fix: add temp hack to gracefully handle parallel tool calling (#2654) 2025-06-05 14:43:46 -07:00
Kevin Lin
0d6907c8cf fix: set openai streaming interface letta_message_id (#2648)
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
2025-06-05 12:26:01 -07:00
cthomas
a8f394d675 feat: populate tool call name and id when token streaming (#2639) 2025-06-04 17:06:44 -07:00
Matthew Zhou
7debadb3b9 fix: Change enum to fix composio tests (#2488) 2025-05-28 10:24:22 -07:00
Matthew Zhou
dba4cc9ea0 feat: Add TTFT latency from provider in traces (#2481) 2025-05-28 10:06:16 -07:00
cthomas
871e171b44 feat: add tracing to streaming interface (#2477) 2025-05-27 16:20:05 -07:00
Matthew Zhou
ad6e446849 feat: Asyncify insert archival memories (#2430)
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
2025-05-25 22:28:35 -07:00