letta-server/compose.yaml at ani-patches

Fimeg/letta-server

Fork 0

Files

Ani Tunturi 1d1adb261a

Test Package Installation / test-install (3.11) (push) Waiting to run

Details

Test Package Installation / test-install (3.12) (push) Waiting to run

Details

Test Package Installation / test-install (3.13) (push) Waiting to run

Details

fix: orphaned approvals, token inflation, reasoning fields, memfs redis dep

[IN TESTING — self-hosted 0.16.6, Kimi-K2.5 via Synthetic Direct]

Four independent fixes that landed together on this stack:

helpers.py — skip PendingApprovalError when the associated run is already
cancelled or failed. Stale approvals from interrupted runs were blocking all
subsequent messages on that conversation. Now checks run status before raising;
falls back to raising on lookup failure (conservative).

letta_agent_v3.py — use prompt_tokens not total_tokens for context window
estimate. total_tokens inflated the estimate by including completion tokens,
triggering premature compaction. This was causing context window resets mid-
conversation and is the root of the token inflation bug (see #3242).

openai_client.py (both build_request_data paths) — strip reasoning_content,
reasoning_content_signature, redacted_reasoning_content, omitted_reasoning_content
from message history before sending to inference backends. Fireworks and Synthetic
Direct reject these fields with 422/400 errors. exclude_none handles None values
but not actual text content from previous assistant turns.

block_manager_git.py — skip DB write when block value is unchanged. Reduces
unnecessary write amplification on every memfs sync cycle.

memfs_client_base.py — remove redis_client= kwarg from GitOperations init.
Dependency was removed upstream but the call site wasn't updated.

Dockerfile / compose files — context window and config updates for 220k limit.

2026-03-26 23:24:32 -04:00

2.5 KiB

Raw Permalink Blame History

View Raw

2.5 KiB Raw Permalink Blame History

2.5 KiB

Raw Permalink Blame History