letta-server

Author	SHA1	Message	Date
Ani Tunturi	1d1adb261a	fix: orphaned approvals, token inflation, reasoning fields, memfs redis dep [IN TESTING — self-hosted 0.16.6, Kimi-K2.5 via Synthetic Direct] Four independent fixes that landed together on this stack: helpers.py — skip PendingApprovalError when the associated run is already cancelled or failed. Stale approvals from interrupted runs were blocking all subsequent messages on that conversation. Now checks run status before raising; falls back to raising on lookup failure (conservative). letta_agent_v3.py — use prompt_tokens not total_tokens for context window estimate. total_tokens inflated the estimate by including completion tokens, triggering premature compaction. This was causing context window resets mid-conversation and is the root of the token inflation bug (see #3242). openai_client.py (both build_request_data paths) — strip reasoning_content, reasoning_content_signature, redacted_reasoning_content, omitted_reasoning_content from message history before sending to inference backends. Fireworks and Synthetic Direct reject these fields with 422/400 errors. exclude_none handles None values but not actual text content from previous assistant turns. block_manager_git.py — skip DB write when block value is unchanged. Reduces unnecessary write amplification on every memfs sync cycle. memfs_client_base.py — remove redis_client= kwarg from GitOperations init. Dependency was removed upstream but the call site wasn't updated. Dockerfile / compose files — context window and config updates for 220k limit.	2026-03-26 23:24:32 -04:00
Ani Tunturi	08d3c26732	fix: sanitize control characters before sending to inference backends Fireworks (via Synthetic Direct) chokes on raw ASCII control chars (0x00-0x1F) in JSON payloads with "Unterminated string" errors. The existing sanitize_unicode_surrogates only handles U+D800-DFFF. Now we also strip control chars (preserving tab/newline/CR) at all 4 request paths — sync, async, and both streaming variants.	2026-03-21 20:23:56 -04:00
Ani Tunturi	c79de81cea	fix: accept base64 images in tool returns, not just letta-hosted The serializer and deserializer asserted ImageSourceType.letta only, rejecting base64 images from client tools like Read. Self-hosted servers with vision-capable models need this. I can finally see my own avatar. Worth the six attempts.	2026-03-21 12:41:20 -04:00
Kian Jones	f5c4ab50f4	chore: add ty + pre-commit hook and repeal even more ruff rules (#9504 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import * remove all ignores, add FastAPI rules and Ruff rules * add ty and precommit * ruff stuff * ty check fixes * ty check fixes pt 2 * error on invalid	2026-02-24 10:55:11 -08:00
Kian Jones	25d54dd896	chore: enable F821, F401, W293 (#9503 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import	2026-02-24 10:55:08 -08:00
Kian Jones	6f746c5225	fix(core): handle Anthropic overloaded errors and Unicode encoding issues (#9305 ) * fix: handle Anthropic overloaded_error in streaming interfaces * fix: handle Unicode surrogates in OpenAI requests Sanitize Unicode surrogate pairs before sending requests to OpenAI API. Surrogate pairs (U+D800-U+DFFF) are UTF-16 encoding artifacts that cause UnicodeEncodeError when encoding to UTF-8. Fixes Datadog error: 'utf-8' codec can't encode character '\ud83c' in position 326605: surrogates not allowed * fix: handle UnicodeEncodeError from lone Unicode surrogates in OpenAI requests Improved sanitize_unicode_surrogates() to explicitly filter out lone surrogate characters (U+D800 to U+DFFF) which are invalid in UTF-8. Previous implementation used errors='ignore' which could still fail in edge cases. New approach directly checks Unicode code points and removes any surrogates before data reaches httpx encoding. Also added sanitization to stream_async_responses() method which was missing it. Fixes: 'utf-8' codec can't encode character '\ud83c' in position X: surrogates not allowed	2026-02-24 10:52:06 -08:00
Ari Webb	5c6ca705f1	Revert "feat: bring back use message packing for timezone [LET-6846]" (#9302 ) Revert "feat: bring back use message packing for timezone [LET-6846] (#9256)" This reverts commit c5017cccdef95b84fc585b26a0ddc5b7e44eb7c9.	2026-02-24 10:52:06 -08:00
Ari Webb	426f6a8ca4	feat: bring back use message packing for timezone [LET-6846] (#9256 ) * feat: bring back use message packing for timezone * add tests	2026-02-24 10:52:06 -08:00
Kian Jones	01cb00ae10	Revert "fix: truncate oversized text in embedding requests" (#9227 ) Revert "fix: truncate oversized text in embedding requests (#9196)" This reverts commit a9c342087e022519c63d62fb76b72aed8859539b.	2026-02-24 10:52:06 -08:00
Kian Jones	630c147b13	fix: truncate oversized text in embedding requests (#9196 ) fix: handle oversized text in embedding requests with recursive chunking When message text exceeds the embedding model's context length, recursively split it until all chunks can be embedded successfully. Changes: - `tpuf_client.py`: Add `_split_text_in_half()` helper for recursive splitting - `tpuf_client.py`: Add `_generate_embeddings_with_chunking()` that retries with splits on context length errors - `tpuf_client.py`: Store `message_id` and `chunk_index` columns in Turbopuffer - `tpuf_client.py`: Deduplicate query results by `message_id` - `tpuf_client.py`: Use `LettaInvalidArgumentError` instead of `ValueError` - `tpuf_client.py`: Move LLMClient import to top of file - `openai_client.py`: Remove fixed truncation (chunking handles this now) - Add tests for `_split_text_in_half` and chunked query deduplication 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
github-actions[bot]	5cde3c2ec0	fix: handle data URLs in image processing to prevent LettaImageFetchError (#8958 ) When users send images as base64 data URLs (data:image/jpeg;base64,...), the code was incorrectly trying to fetch them via HTTP, causing a LettaImageFetchError. This fix adds proper handling for data: URLs by parsing the media type and base64 data directly from the URL string. Fixes #8957 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>	2026-02-24 10:52:05 -08:00
github-actions[bot]	101cfefe5e	fix: add retry logic for turbopuffer transient network errors (#8635 ) Add async retry decorator with exponential backoff to handle transient network connection errors (httpx.ConnectError) when connecting to turbopuffer.com. This addresses production errors seen in memgpt-server. Changes: - Add `async_retry_with_backoff()` decorator for async methods - Add `is_transient_error()` helper to identify retryable errors - Apply retry logic to all turbopuffer network operations - Retry config: 3 retries, 1s initial delay, 2x backoff, 10% jitter Fixes #8390 Relates to #8155 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-02-24 10:52:05 -08:00
Kian Jones	2bb4caffc3	fix: remove unused embedding generation (#9013 ) * remove unused embedding generation * prevent double embed * fix embedding dimension comparison and valueerror	2026-01-29 12:43:53 -08:00
Charles Packer	2fc592e0b6	feat(core): add image support in tool returns [LET-7140] (#8985 ) * feat(core): add image support in tool returns [LET-7140] Enable tool_return to support both string and ImageContent content parts, matching the pattern used for user message inputs. This allows tools executed client-side to return images back to the agent. Changes: - Add LettaToolReturnContentUnion type for text/image content parts - Update ToolReturn schema to accept Union[str, List[content parts]] - Update converters for each provider: - OpenAI Chat Completions: placeholder text for images - OpenAI Responses API: full image support - Anthropic: full image support with base64 - Google: placeholder text for images - Add resolve_tool_return_images() for URL-to-base64 conversion - Make create_approval_response_message_from_input() async 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): support images in Google tool returns as sibling parts Following the gemini-cli pattern: images in tool returns are sent as sibling inlineData parts alongside the functionResponse, rather than inside it. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test(core): add integration tests for multi-modal tool returns [LET-7140] Tests verify that: - Models with image support (Anthropic, OpenAI Responses API) can see images in tool returns and identify the secret text - Models without image support (Chat Completions) get placeholder text and cannot see the actual image content - Tool returns with images persist correctly in the database Uses secret.png test image containing hidden text "FIREBRAWL" that models must identify to pass the test. Also fixes misleading comment about Anthropic only supporting base64 images - they support URLs too, we just pre-resolve for consistency. 
🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * refactor: simplify tool return image support implementation Reduce code verbosity while maintaining all functionality: - Extract _resolve_url_to_base64() helper in message_helper.py (eliminates duplication) - Add _get_text_from_part() helper for text extraction - Add _get_base64_image_data() helper for image data extraction - Add _tool_return_to_google_parts() to simplify Google implementation - Add _image_dict_to_data_url() for OpenAI Responses format - Use walrus operator and list comprehensions where appropriate - Add integration_test_multi_modal_tool_returns.py to CI workflow Net change: -120 lines while preserving all features and test coverage. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(tests): improve prompt for multi-modal tool return tests Make prompts more direct to reduce LLM flakiness: - Simplify tool description: "Retrieves a secret image with hidden text. Call this function to get the image." - Change user prompt from verbose request to direct command: "Call the get_secret_image function now." - Apply to both test methods This reduces ambiguity and makes tool calling more reliable across different LLM models. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix bugs * test(core): add google_ai/gemini-2.0-flash-exp to multi-modal tests Add Gemini model to test coverage for multi-modal tool returns. Google AI already supports images in tool returns via sibling inlineData parts. 
👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(ui): handle multi-modal tool_return type in frontend components Convert Union<string, LettaToolReturnContentUnion[]> to string for display: - ViewRunDetails: Convert array to '[Image here]' placeholder - ToolCallMessageComponent: Convert array to '[Image here]' placeholder Fixes TypeScript errors in web, desktop-ui, and docker-ui type-checks. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Caren Thomas <carenthomas@gmail.com>	2026-01-29 12:43:53 -08:00
jnjpng	a98bc31bf3	fix: refactor enable strict mode for structured output (#8840 ) * base * test	2026-01-19 15:54:42 -08:00
jnjpng	85c242077e	feat: strict tool calling setting (#8810 ) base	2026-01-19 15:54:42 -08:00
Sarah Wooders	97cdfb4225	Revert "feat: add strict tool calling setting [LET-6902]" (#8720 ) Revert "feat: add strict tool calling setting [LET-6902] (#8577)" This reverts commit 697c9d0dee6af73ec4d5d98780e2ca7632a69173.	2026-01-19 15:54:39 -08:00
Sarah Wooders	ea36633cd5	fix: make sure structured outputs turned on for openai (#8669 )	2026-01-19 15:54:38 -08:00
Sarah Wooders	bdede5f90c	feat: add strict tool calling setting [LET-6902] (#8577 )	2026-01-19 15:54:38 -08:00
cthomas	870c5955d9	fix: wrap tpuf operations in thread pool (#8615 ) * fix: wrap turbopuffer vector writes in thread pool Turbopuffer library does CPU-intensive base64 encoding of vectors synchronously in async functions (_async_transform_recursive → b64encode_vector), blocking the event loop during file uploads. Solution: Created _run_turbopuffer_write_in_thread() helper that runs turbopuffer writes in an isolated event loop within a worker thread. Applied to all vector write operations: - insert_tools() - insert_archival_memories() - insert_messages() - insert_file_passages() This prevents pybase64.b64encode_as_string() from blocking the main event loop during vector encoding. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: wrap all turbopuffer operations in thread pool Extended the thread pool wrapping to ALL turbopuffer write operations, including delete operations, for complete isolation from the main event loop. All turbopuffer namespace.write() calls now run in isolated event loops within worker threads, preventing any potential CPU work from blocking. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-01-19 15:54:37 -08:00
Ari Webb	cdca1a564f	fix: conversation id not found in tpuf (#8469 ) * fix: conversation id not found in tpuf * add tests	2026-01-12 10:57:49 -08:00
jnjpng	28839f5180	fix: import cryptography default backend at top level (#8444 ) * base * comment	2026-01-12 10:57:49 -08:00
Sarah Wooders	6fddcc0c57	fix: fix agent loop (#8401 )	2026-01-12 10:57:48 -08:00
Charles Packer	ed6284cedb	feat: Add conversation_id filtering to message endpoints (#8324 ) * feat: Add conversation_id filtering to message list and search endpoints Add optional conversation_id parameter to filter messages by conversation: - client.agents.messages.list - client.messages.list - client.messages.search Changes: - Added conversation_id field to MessageSearchRequest and SearchAllMessagesRequest schemas - Added conversation_id filtering to list_messages in message_manager.py - Updated get_agent_recall_async and get_all_messages_recall_async in server.py - Added conversation_id query parameter to router endpoints - Updated Turbopuffer client to support conversation_id filtering in searches Fixes #8320 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Charles Packer <cpacker@users.noreply.github.com> * add conversation_id to message and tpuf * default messages filter for backward compatibility * add test and auto gen * fix integration test * fix test * update test --------- Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Charles Packer <cpacker@users.noreply.github.com> Co-authored-by: christinatong01 <christina@letta.com>	2026-01-12 10:57:48 -08:00
jnjpng	d55fd69b7b	chore: add comment and test for changing PBKDF2 iteration count (#8366 ) base	2026-01-12 10:57:48 -08:00
jnjpng	b68e4e74f9	fix: replace cryptography with hashlib for encryption key derivation (#8364 ) base	2026-01-12 10:57:48 -08:00
cthomas	a54513c343	feat: move decryption outside db session (#8323 ) * feat: move decryption outside db session * fix pydantic error	2026-01-12 10:57:48 -08:00
jnjpng	7e8088adc5	chore: add tracing for request middleware (#8142 ) * base * update * more	2026-01-12 10:57:47 -08:00
cthomas	a7b3f469ac	fix: more user friendly error for tpuf namespace not found [LET-6707] (#8141 ) fix: more user friendly error for tpuf namespace not found	2026-01-12 10:57:47 -08:00
cthomas	39dc1d9736	fix: image fetching timeouts [LET-6700] (#8140 ) fix: image fetching timeouts	2026-01-12 10:57:47 -08:00
jnjpng	700409d943	fix: sanitize null bytes test (#8135 ) base	2026-01-12 10:57:47 -08:00
jnjpng	fa9a98351d	fix: add redis tracing (#8132 ) base	2026-01-12 10:57:47 -08:00
github-actions[bot]	dbdd1a40e4	fix: sanitize null bytes to prevent PostgreSQL CharacterNotInRepertoireError (#8015 ) This fixes the asyncpg.exceptions.CharacterNotInRepertoireError that occurs when tool returns contain null bytes (0x00), which PostgreSQL TEXT columns reject in UTF-8 encoding. Changes: - Add sanitize_null_bytes() function to recursively remove null bytes from strings - Update json_dumps() to sanitize data before serialization - Apply sanitization in converters.py for tool_calls, tool_returns, approvals, and message_content - Add comprehensive unit tests Fixes #8014 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>	2026-01-12 10:57:47 -08:00
Kian Jones	bce1749408	fix: run PBKDF2 in thread pool to prevent event loop freeze (#6763 ) * fix: run PBKDF2 in thread pool to prevent event loop freeze Problem: Event loop freezes for 100-500ms during secret decryption, blocking all HTTP requests and async operations. The diagnostic monitor detected the main thread stuck in PBKDF2 HMAC SHA256 computation at: apps/core/letta/helpers/crypto_utils.py:51 (_derive_key) apps/core/letta/schemas/secret.py:161 (get_plaintext) Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for security, but running it synchronously on the main thread blocks the event loop. Stack trace showed: Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2 Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03) Solution: 1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop 2. Add async versions of encrypt/decrypt methods 3. Add LRU cache for derived keys (deterministic results) 4. Add async get_plaintext_async() method to Secret class Changes: - apps/core/letta/helpers/crypto_utils.py: - Added ThreadPoolExecutor for crypto operations - Added @lru_cache(maxsize=256) to _derive_key_cached() - Added _derive_key_async() using loop.run_in_executor() - Added encrypt_async() and decrypt_async() methods - Added warnings to sync methods about blocking behavior - apps/core/letta/schemas/secret.py: - Added get_plaintext_async() method - Added warnings to get_plaintext() about blocking behavior Benefits: - Event loop no longer freezes during secret decryption - HTTP requests continue processing while crypto runs in background - Derived keys are cached, reducing CPU usage for repeated operations - Backward compatible - sync methods still work for non-async code Performance impact: - Before: 100-500ms event loop block per decryption - After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms Next steps (follow-up PRs): - Migrate all async callsites to use 
get_plaintext_async() - Add metrics to track sync vs async usage - Consider reducing PBKDF2 iterations if security allows * update * test --------- Co-authored-by: Letta Bot <jinjpeng@gmail.com>	2025-12-15 12:03:09 -08:00
Sarah Wooders	7ea297231a	feat: add `compaction_settings` to agents (#6625 ) * initial commit * Add database migration for compaction_settings field This migration adds the compaction_settings column to the agents table to support customized summarization configuration for each agent. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix * rename * update apis * fix tests * update web test --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Kian Jones <kian@letta.com>	2025-12-15 12:02:34 -08:00
jnjpng	17a90538ca	fix: exclude common API key prefixes from encryption detection (#6624 ) * fix: exclude common API key prefixes from encryption detection Add a list of known API key prefixes (OpenAI, Anthropic, GitHub, AWS, Slack, etc.) to prevent is_encrypted() from incorrectly identifying plaintext credentials as encrypted values. * update * test	2025-12-15 12:02:34 -08:00
Christina Tong	972c61d0b8	chore: fallback to timestamp retrieval for message search [LET-6429] (#6510 )	2025-12-15 12:02:33 -08:00
Ari Webb	4d5be22d14	fix: utc for message/passage search tpuf [LET-6109] (#6429 ) fix: utc for message/passage search tpuf Co-authored-by: Ari Webb <ari@letta.com>	2025-12-15 12:02:18 -08:00
Ari Webb	3e02f12dfd	feat: add tool embedding and search [LET-6333] (#6398 ) * feat: add tool embedding and search * fix ci * add env variable for embedding tools --------- Co-authored-by: Ari Webb <ari@letta.com>	2025-11-26 14:39:40 -08:00
cthomas	29e38a2a42	feat: pass in user-agent to prevent 403 forbidden http error [LET-6305] (#6348 ) feat: pass in user-agent to prevent 403 forbidden http error	2025-11-24 19:10:27 -08:00
cthomas	345ea42630	feat: offload all file i/o in server endpoints LET-6252 (#6300 ) feat: offload all file i/o in server endpoints	2025-11-24 19:10:26 -08:00
cthomas	4bb116f17c	fix: sync api call in message path (#6291 ) * fix: sync api call in message path * remove unused function * add new error type	2025-11-24 19:10:26 -08:00
cthomas	1b05ecb842	fix: invalid role error in agent step (#6288 )	2025-11-24 19:10:26 -08:00
cthomas	6f810d95d8	feat: add semaphore to limit embeddings creation (#6261 )	2025-11-24 19:10:11 -08:00
Christina Tong	04611b981c	feat: filter messages search endpoint by agent id [LET-6229] (#6246 ) * feat: filter messages search endpoint by agent id [LET-6229] * add autogenerated schema/types	2025-11-24 19:09:33 -08:00
cthomas	209debeb09	feat: remove multiple pinecone client inits from startup (#6128 ) feat: remove multiple pinecone client inits	2025-11-13 15:36:56 -08:00
Sarah Wooders	5730f69ecf	feat: modal tool execution - NO FEATURE FLAGS USES MODAL [LET-4357] (#5120 ) * initial commit * add delay to deploy * fix tests * add tests * passing tests * cleanup * and use modal * working on modal * gate on tool metadata * agent state * cleanup --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-11-13 15:36:56 -08:00
Charles Packer	363a5c1f92	fix: fix poison state from bad approval response (#5979 ) * fix: detect and fail on malformed approval responses * fix: guard against None approvals in utils.py * fix: add extra warning * fix: stop silent drops in deserialize_approvals * fix: patch v3 stream error handling to prevent sending end_turn after an error occurs, and ensures stop_reason is always set when an error occurs * fix: Prevents infinite client hangs by ensuring a terminal event is ALWAYS sent * fix: Ensures terminal events are sent even if inner stream generator fails to send them	2025-11-13 15:36:55 -08:00
jnjpng	05b359b7f5	chore: add local base 64 url image for send message integration (#5969 ) * base * update * clean up * update --------- Co-authored-by: Letta Bot <noreply@letta.com>	2025-11-13 15:36:55 -08:00
Ari Webb	c7c0d7507c	feat: add new mcp_servers routes [LET-4321] (#5675 ) --------- Co-authored-by: Ari Webb <ari@letta.com> Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>	2025-10-24 15:14:21 -07:00

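Commits 08d3c26732 and 6f746c5225 describe two sanitization passes applied before a request reaches an inference backend: dropping lone surrogates (U+D800-U+DFFF) and stripping ASCII control characters while keeping tab, newline, and CR. The sketch below illustrates that idea only. The names `sanitize_text` and `sanitize_payload` are hypothetical and are not taken from the repository, whose actual implementation may differ.

```python
import re
from typing import Any

# Control characters 0x00-0x1F, excluding tab (0x09), newline (0x0A), and CR (0x0D).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")
# Surrogate code points U+D800-U+DFFF, which cannot be encoded as UTF-8.
_SURROGATES = re.compile(r"[\ud800-\udfff]")


def sanitize_text(text: str) -> str:
    """Remove lone surrogates and disallowed control characters from a string."""
    text = _SURROGATES.sub("", text)
    return _CONTROL_CHARS.sub("", text)


def sanitize_payload(value: Any) -> Any:
    """Recursively sanitize every string inside a JSON-like structure."""
    if isinstance(value, str):
        return sanitize_text(value)
    if isinstance(value, dict):
        return {key: sanitize_payload(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_payload(item) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value
```

Applying one recursive pass to the whole request body, rather than to individual fields, is one way to cover the sync, async, and streaming paths with the same helper.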

269 Commits
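Commit 101cfefe5e adds an `async_retry_with_backoff()` decorator and an `is_transient_error()` helper with 3 retries, a 1s initial delay, 2x backoff, and 10% jitter. A minimal sketch of that pattern follows. The signatures and parameter names are assumptions, and built-in `ConnectionError`/`TimeoutError` stand in for the `httpx.ConnectError` named in the commit so the example stays standard-library only.

```python
import asyncio
import functools
import random


def is_transient_error(exc: Exception) -> bool:
    # Assumption: built-in connection and timeout errors count as retryable.
    return isinstance(exc, (ConnectionError, TimeoutError))


def async_retry_with_backoff(max_retries=3, initial_delay=1.0, backoff=2.0, jitter=0.1):
    """Retry an async callable on transient errors with exponential backoff."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on non-retryable errors.
                    if attempt == max_retries or not is_transient_error(exc):
                        raise
                    # Sleep for the current delay, varied by +/- jitter.
                    await asyncio.sleep(delay * (1 + random.uniform(-jitter, jitter)))
                    delay *= backoff

        return wrapper

    return decorator
```

With the defaults, a call that keeps failing waits roughly 1s, 2s, and 4s between its four attempts before the last error propagates.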