Commit Graph

27 Commits

Author SHA1 Message Date
jnjpng
9155b4fa86 fix: use canonical stop reason mapping in redis stream finalizer (#9600)
fix: derive run status from StopReasonType mapping
2026-02-24 10:55:12 -08:00
Kian Jones
f5c4ab50f4 chore: add ty + pre-commit hook and repeal even more ruff rules (#9504)
* auto fixes

* auto fix pt2 and transitive deps and undefined var checking locals()

* manual fixes (ignored or letta-code fixed)

* fix circular import

* remove all ignores, add FastAPI rules and Ruff rules

* add ty and precommit

* ruff stuff

* ty check fixes

* ty check fixes pt 2

* error on invalid
2026-02-24 10:55:11 -08:00
cthomas
2bccd36382 Revert "fix: ensure stop_reason is always set and reduce noisy logs (… (#9086)
Revert "fix: ensure stop_reason is always set and reduce noisy logs (#9046)"

This reverts commit 4241a360579440d2697124ba69061d0e46ecc5e9.

**Problem:**
After the original change, caren-code-agent reported streams hanging
indefinitely. The trace shows ttft (time to first token) succeeds, but
the stream never closes.

**Root Cause (suspected):**
The change modified `is_complete=is_done` to `is_complete=saw_done`,
meaning error events no longer mark the stream as complete immediately.
This may cause timing issues where clients wait for more data before
the finalizer runs.

**Fix:**
Revert to the defensive "belt-and-suspenders" approach that always
appends [DONE]. The noisy logs are preferable to hanging streams.

The original comment noted: "Even if a previous chunk set `complete`,
an extra [DONE] is harmless and ensures SDKs that rely on explicit
[DONE] will exit."

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
cthomas
2a2e777807 fix: ensure stop_reason is always set and reduce noisy logs (#9046)
fix: consume [DONE] token after error events to prevent forced finalizer append

**Problem:**
Stream finalizer was frequently logging warning and appending forced [DONE]:
```
[Stream Finalizer] Appending forced [DONE] for run=run-xxx (saw_error=True,
saw_done=False, final_stop_reason=llm_api_error)
```

This happened on every error, even though streaming_service.py already yields
[DONE] after all error events.

**Root Cause:**
Line 266: `is_done = saw_done or saw_error` caused loop to break immediately
after seeing error event, BEFORE consuming the [DONE] chunk that follows:

```python
is_done = saw_done or saw_error
await writer.write_chunk(...)
if is_done:  # Breaks on error!
    break
```

Sequence:
1. streaming_service.py yields: `event: error\ndata: {...}\n\n`
2. Redis reader sees error → sets `saw_error=True`
3. Sets `is_done=True` and breaks
4. Never reads next chunk: `data: [DONE]\n\n`
5. Finalizer runs → `saw_done=False` → appends forced [DONE]

**Fix:**
1. Only break when `saw_done=True` (not `saw_error`) → allows consuming [DONE]
2. Only run finalizer when `saw_done=False` → reduces log noise

**Result:**
- [DONE] now consumed naturally from streaming_service.py error handlers
- Finalizer warning only appears when truly needed (fallback cases)
- Cleaner production logs

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
cthomas
5f58819bbf fix: false positives for token markers (#8942) 2026-01-29 12:43:23 -08:00
Ari Webb
6d859174c2 feat: make conversations throw http busy to stop race condition [LET-6842] (#8411)
* feat: make conversations throw http busy to stop race condition

* use redis lock instead

* move acquire lock into redis client, integration tests, move lock release into run manager

* fix tests, bug

* conditional import

* remove else

* better release

* run ci

* final reordering lock

* update tests

* wrong naming of lock holder token
2026-01-12 10:57:49 -08:00
Sarah Wooders
d5decc2a27 fix: persist streaming errors in run metadata (#8062) 2026-01-12 10:57:47 -08:00
cthomas
e4c58da682 fix: streaming error for stop reason chunks (#6285) 2025-11-24 19:10:26 -08:00
cthomas
1be2f61f05 feat: add new letta error message stream response type (#6192) 2025-11-24 19:10:11 -08:00
Charles Packer
2e721ddc62 fix: various hardening to prevent stale state on background mode runs (#6072)
fix: various hardening to prevent stale state on backgroun
d mode runs
2025-11-13 15:36:56 -08:00
Charles Packer
363a5c1f92 fix: fix poison state from bad approval response (#5979)
* fix: detect and fail on malformed approval responses

* fix: guard against None approvals in utils.py

* fix: add extra warning

* fix: stop silent drops in deserialize_approvals

* fix: patch v3 stream error handling to prevent sending end_turn after an error occurs, and ensures stop_reason is always set when an error occurs

* fix: Prevents infinite client hangs by ensuring a terminal event is ALWAYS sent

* fix:  Ensures terminal events are sent even if inner stream generator fails to
  send them
2025-11-13 15:36:55 -08:00
Charles Packer
468b47bef5 fix(core): patch sse streaming errors (#5906)
* fix: patch sse streaming errors

* fix: don't re-raise, but log explicitly with sentry

* chore: cleanup comments

* fix: revert change from #5907, also make sure to write out a [DONE] to close the stream
2025-11-13 15:36:39 -08:00
Charles Packer
a6077f3927 fix(core): Fix agent loop continuing after cancellation in letta_agent_v3 [LET-6006] (#5905)
* Fix agent loop continuing after cancellation in letta_agent_v3

Bug: When a run is cancelled, _check_run_cancellation() sets
self.should_continue=False and returns early from _step(), but the outer
for loop (line 245) continues to the next iteration, executing subsequent
steps even though cancellation was requested.

Symptom: User hits cancel during step 1, backend marks run as cancelled,
but agent continues executing steps 2, 3, etc.

Root cause: After the 'async for chunk in response' loop completes (line 255),
there was no check of self.should_continue before continuing to the next
iteration of the outer step loop.

Fix: Added 'if not self.should_continue: break' check after the inner loop
to exit the outer step loop when cancellation is detected. This makes v3
consistent with v2 which already had this check (line 306-307).

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>

* add integration tests

* fix: misc fixes required to get cancellations to work on letta code localhost

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2025-11-13 15:36:39 -08:00
cthomas
128afeb587 feat: fix cancellation bugs and add testing (#5353) 2025-10-24 15:11:31 -07:00
cthomas
cd900a6f4d feat: check run completion in send message tests (#5030) 2025-10-07 17:50:47 -07:00
Sarah Wooders
354205f581 feat: create new runs table [LET-4467] (#4841) 2025-10-07 17:50:47 -07:00
Kian Jones
b8e9a80d93 merge this (#4759)
* wait I forgot to comit locally

* cp the entire core directory and then rm the .git subdir
2025-09-17 15:47:40 -07:00
Kian Jones
22f70ca07c chore: officially migrate to submodule (#4502)
* remove apps/core and apps/fern

* fix precommit

* add submodule updates in workflows

* submodule

* remove core tests

* update core revision

* Add submodules: true to all GitHub workflows

- Ensure all workflows can access git submodules
- Add submodules support to deployment, test, and CI workflows
- Fix YAML syntax issues in workflow files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove core-lint

* upgrade core with latest main of oss

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 12:45:53 -07:00
Charles Packer
71a5eaa262 fix(core): change the backend mid-stream error packing to match what the FE expects [PRO-1107] (#4340)
fix(core): change the backend mid-stream error packing to match what the FE expects
2025-09-01 14:59:42 -07:00
Kian Jones
fecf6decfb chore: migrate to ruff (#4305)
* base requirements

* autofix

* Configure ruff for Python linting and formatting

- Set up minimal ruff configuration with basic checks (E, W, F, I)
- Add temporary ignores for common issues during migration
- Configure pre-commit hooks to use ruff with pass_filenames
- This enables gradual migration from black to ruff

* Delete sdj

* autofixed only

* migrate lint action

* more autofixed

* more fixes

* change precommit

* try changing the hook

* try this stuff
2025-08-29 11:11:19 -07:00
cthomas
6c59bc8dea feat: add retrieve stream testing (#4168) 2025-08-25 16:13:59 -07:00
cthomas
8909fd257b feat: add new client.runs.stream endpoint (#4165) 2025-08-25 13:42:22 -07:00
cthomas
dc83ff9f52 feat: set 3hr ttl for streaming background jobs (#4167) 2025-08-25 13:39:06 -07:00
cthomas
755731ca12 feat: populate run id in streaming response (#4161) 2025-08-25 11:55:41 -07:00
cthomas
dc9b7f28f3 feat: populate seq_id in chunks for background mode (#4160) 2025-08-25 11:51:41 -07:00
cthomas
d8c98ec23a feat: add error chunk handling for background mode (#4158) 2025-08-25 11:12:16 -07:00
cthomas
0db7fc981c feat: add background flag to streaming api (#4148) 2025-08-24 20:45:35 -07:00