fix: orphaned approvals, token inflation, reasoning fields, memfs redis dep

[IN TESTING — self-hosted 0.16.6, Kimi-K2.5 via Synthetic Direct]

Four independent fixes that landed together on this stack:

helpers.py — skip raising PendingApprovalError when the associated run is already
cancelled or failed. Stale approvals from interrupted runs were blocking all
subsequent messages on that conversation. The helper now checks run status before
raising, and falls back to raising when the lookup fails (conservative).
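A minimal sketch of the guard, using a hypothetical run store and status names (the real helpers.py identifiers may differ):

```python
# Runs in one of these statuses can no longer act on an approval, so a
# stale PendingApprovalError for them should not block the conversation.
TERMINAL_STATUSES = {"cancelled", "failed"}

def should_raise_pending_approval(runs: dict, run_id: str) -> bool:
    """Return False when the approval's run is already terminal."""
    try:
        run = runs[run_id]
    except KeyError:
        # Lookup failure: fall back to raising (conservative default).
        return True
    return run["status"] not in TERMINAL_STATUSES
```

The conservative fallback means a failed lookup behaves exactly like the pre-fix code, so the change can only unblock conversations, never silently drop a live approval.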

letta_agent_v3.py — use prompt_tokens, not total_tokens, for the context window
estimate. total_tokens inflated the estimate by including completion tokens,
triggering premature compaction. This was causing context window resets mid-
conversation and is the root cause of the token inflation bug (see #3242).
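The distinction can be sketched with a hypothetical OpenAI-style usage payload (prompt_tokens / completion_tokens / total_tokens as in the standard usage object; the actual letta_agent_v3.py logic is more involved):

```python
# Only the prompt side occupies the context window on the next turn, so it
# is the right basis for the estimate; total_tokens also counts the
# completion that was just generated.
def context_window_estimate(usage: dict) -> int:
    return usage["prompt_tokens"]

usage = {"prompt_tokens": 180_000, "completion_tokens": 4_000, "total_tokens": 184_000}
print(context_window_estimate(usage))  # 180000, not the inflated 184000
```

With a 220k window, an estimate inflated by a few thousand completion tokens per turn crosses the compaction threshold several turns early, which matches the mid-conversation resets described above.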

openai_client.py (both build_request_data paths) — strip reasoning_content,
reasoning_content_signature, redacted_reasoning_content, omitted_reasoning_content
from message history before sending to inference backends. Fireworks and Synthetic
Direct reject these fields with 422/400 errors. exclude_none handles None values
but not actual text content from previous assistant turns.

block_manager_git.py — skip DB write when block value is unchanged. Reduces
unnecessary write amplification on every memfs sync cycle.
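A sketch of the no-op guard, assuming a simple dict-backed store for illustration (block_manager_git.py goes through the ORM layer):

```python
# Write only when the block value actually changed; return True if written.
def update_block(store: dict, block_id: str, new_value: str) -> bool:
    if store.get(block_id) == new_value:
        # Unchanged value: skip the DB write entirely. On a memfs sync
        # cycle most blocks are unchanged, so this avoids a write per
        # block per cycle.
        return False
    store[block_id] = new_value
    return True
```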

memfs_client_base.py — remove redis_client= kwarg from GitOperations init.
Dependency was removed upstream but the call site wasn't updated.

Dockerfile / compose files — context window and config updates for 220k limit.
Ani Tunturi
2026-03-26 23:24:32 -04:00
parent 08d3c26732
commit 1d1adb261a
9 changed files with 91 additions and 15 deletions


@@ -454,6 +454,15 @@ class OpenAIClient(LLMClientBase):
        )
        request_data = data.model_dump(exclude_unset=True, exclude_none=True)
        # Strip reasoning fields (see streaming build_request_data for explanation)
        _REASONING_FIELDS = ("reasoning_content", "reasoning_content_signature",
                             "redacted_reasoning_content", "omitted_reasoning_content")
        if "messages" in request_data:
            for message in request_data["messages"]:
                for field in _REASONING_FIELDS:
                    message.pop(field, None)
        return request_data

    @trace_method
@trace_method
@@ -641,6 +650,15 @@ class OpenAIClient(LLMClientBase):
                    tool.function.strict = False
        request_data = data.model_dump(exclude_unset=True, exclude_none=True)
        # Strip reasoning fields that strict backends (Fireworks/Synthetic) reject.
        # exclude_none handles fields that are None, but reasoning_content has actual
        # text from previous assistant turns and must be explicitly removed.
        _REASONING_FIELDS = ("reasoning_content", "reasoning_content_signature",
                             "redacted_reasoning_content", "omitted_reasoning_content")
        if "messages" in request_data:
            for message in request_data["messages"]:
                for field in _REASONING_FIELDS:
                    message.pop(field, None)

        # If Ollama
        # if llm_config.handle.startswith("ollama/") and llm_config.enable_reasoner: