fix: orphaned approvals, token inflation, reasoning fields, memfs redis dep

[IN TESTING — self-hosted 0.16.6, Kimi-K2.5 via Synthetic Direct]

Four independent fixes that landed together on this stack:

helpers.py — skip PendingApprovalError when the associated run is already
cancelled or failed. Stale approvals from interrupted runs were blocking all
subsequent messages on that conversation. Now checks run status before raising;
falls back to raising on lookup failure (conservative).
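
A minimal sketch of that guard, assuming stand-ins for the real letta internals (`RunStatus`, the approval shape, and a `run_lookup` mapping are all hypothetical here):

```python
from enum import Enum


class RunStatus(Enum):
    RUNNING = "running"
    CANCELLED = "cancelled"
    FAILED = "failed"


class PendingApprovalError(Exception):
    pass


def check_pending_approval(approval: dict, run_lookup: dict) -> None:
    """Raise PendingApprovalError only if the owning run is still live.

    run_lookup maps run_id -> RunStatus; on a failed lookup we still raise
    (the conservative fallback described above).
    """
    try:
        status = run_lookup[approval["run_id"]]
    except KeyError:
        # Lookup failure: fall back to raising, as before the fix.
        raise PendingApprovalError(approval["id"])
    if status in (RunStatus.CANCELLED, RunStatus.FAILED):
        return  # stale approval from an interrupted run: skip, don't block
    raise PendingApprovalError(approval["id"])
```

A stale approval on a cancelled run now passes through silently instead of blocking the conversation; a live run still raises.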

letta_agent_v3.py — use prompt_tokens not total_tokens for context window
estimate. total_tokens inflated the estimate by including completion tokens,
triggering premature compaction. This was causing context window resets mid-
conversation and is the root of the token inflation bug (see #3242).
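
The inflation is easy to see with an OpenAI-style usage object (field names follow that convention; the `Usage` class and `context_estimate` helper are illustrative, not the actual letta code):

```python
from dataclasses import dataclass


@dataclass
class Usage:
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def context_estimate(usage: Usage) -> int:
    # Only the prompt reflects what actually sat in the context window;
    # adding completion tokens double-books the budget and trips
    # compaction before the real limit is reached.
    return usage.prompt_tokens


usage = Usage(prompt_tokens=180_000, completion_tokens=30_000)
# total_tokens reports 210_000 -- far closer to a 220k limit than the
# 180k the prompt actually occupied.
```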

openai_client.py (both build_request_data paths) — strip reasoning_content,
reasoning_content_signature, redacted_reasoning_content, omitted_reasoning_content
from message history before sending to inference backends. Fireworks and Synthetic
Direct reject these fields with 422/400 errors. exclude_none handles None values
but not actual text content from previous assistant turns.
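
The strip itself reduces to a key filter over the history. A sketch, with the field list taken from the commit and the message shape assumed to be plain dicts:

```python
REASONING_FIELDS = {
    "reasoning_content",
    "reasoning_content_signature",
    "redacted_reasoning_content",
    "omitted_reasoning_content",
}


def strip_reasoning_fields(messages: list[dict]) -> list[dict]:
    # exclude_none only drops keys whose value is None; prior assistant
    # turns carry real text in these fields, so they must be removed
    # explicitly before the request is built.
    return [
        {k: v for k, v in msg.items() if k not in REASONING_FIELDS}
        for msg in messages
    ]
```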

block_manager_git.py — skip DB write when block value is unchanged. Reduces
unnecessary write amplification on every memfs sync cycle.
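
The shape of the skip, sketched against a dict-backed store standing in for the DB (`sync_block` is a hypothetical name, not the real method):

```python
def sync_block(store: dict, block_id: str, new_value: str) -> bool:
    """Write only on change; returns True if a write happened."""
    if store.get(block_id) == new_value:
        return False  # value unchanged: skip the write entirely
    store[block_id] = new_value
    return True
```

On a memfs sync cycle where most blocks are untouched, this turns N writes per cycle into writes only for the blocks that actually changed.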

memfs_client_base.py — remove redis_client= kwarg from GitOperations init.
Dependency was removed upstream but the call site wasn't updated.

Dockerfile / compose files — context window and config updates for 220k limit.
Ani Tunturi
2026-03-26 23:24:32 -04:00
parent 08d3c26732
commit 1d1adb261a
9 changed files with 91 additions and 15 deletions


@@ -51,25 +51,46 @@ def sanitize_unicode_surrogates(value: Any) -> Any:
     return value
 
+_UNICODE_TO_ASCII = {
+    "\u2014": "--",   # em-dash
+    "\u2013": "-",    # en-dash
+    "\u2012": "-",    # figure dash
+    "\u2010": "-",    # hyphen
+    "\u2011": "-",    # non-breaking hyphen
+    "\u201c": '"',    # left double quotation mark
+    "\u201d": '"',    # right double quotation mark
+    "\u2018": "'",    # left single quotation mark
+    "\u2019": "'",    # right single quotation mark
+    "\u201a": ",",    # single low-9 quotation mark
+    "\u201e": '"',    # double low-9 quotation mark
+    "\u2026": "...",  # horizontal ellipsis
+    "\u00a0": " ",    # non-breaking space
+    "\u00ad": "",     # soft hyphen (invisible, strip)
+}
+
+
 def sanitize_control_characters(value: Any) -> Any:
-    """Recursively remove ASCII control characters (0x00-0x1F) from strings,
-    preserving tab (0x09), newline (0x0A), and carriage return (0x0D).
-
-    Some inference backends (e.g. Fireworks AI) perform strict JSON parsing on
-    the request body and reject payloads containing unescaped control characters.
-    Python's json.dumps will escape these, but certain proxy layers may
-    double-parse or re-serialize in ways that expose the raw bytes.
+    """Recursively sanitize strings for strict ASCII-only JSON backends (e.g. Synthetic).
+
+    Removes ASCII control characters (0x00-0x1F) except tab/newline/CR.
+    Replaces common non-ASCII typography (em-dash, curly quotes, ellipsis, etc.)
+    with ASCII equivalents. Strips remaining non-ASCII chars (> 0x7E) that would
+    appear as raw multi-byte UTF-8 sequences in the request body and cause parse
+    failures on backends that expect ASCII-safe JSON.
 
     This function sanitizes:
-    - Strings: strips control characters except whitespace (tab, newline, CR)
+    - Strings: replaces/strips non-ASCII; strips control chars except whitespace
     - Dicts: recursively sanitizes all string values
     - Lists: recursively sanitizes all elements
     - Other types: returned as-is
     """
     if isinstance(value, str):
+        # Replace known typographic Unicode with ASCII equivalents first
+        for uni, asc in _UNICODE_TO_ASCII.items():
+            value = value.replace(uni, asc)
         return "".join(
             char for char in value
-            if ord(char) >= 0x20  # printable
+            if 0x20 <= ord(char) <= 0x7E  # printable ASCII only
             or char in ("\t", "\n", "\r")  # allowed whitespace
         )
     elif isinstance(value, dict):