The handle_db_timeout decorator only caught SQLAlchemy's TimeoutError
(raised for pool/connection timeouts) but not asyncpg's
QueryCanceledError, which is raised when PostgreSQL's statement_timeout
cancels a long-running query.
This fix:
- Import asyncpg.exceptions.QueryCanceledError
- Update handle_db_timeout decorator to catch QueryCanceledError and wrap
it in DatabaseTimeoutError
- Update _handle_dbapi_error method to also handle wrapped QueryCanceledError
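A minimal sketch of the broadened decorator (DatabaseTimeoutError is the
project's wrapper named above; the stand-in class here just keeps the
sketch self-contained):
```python
import functools

from asyncpg.exceptions import QueryCanceledError
from sqlalchemy.exc import TimeoutError as SQLAlchemyTimeoutError

class DatabaseTimeoutError(Exception):
    """Stand-in for the project's wrapper exception."""

def handle_db_timeout(func):
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except SQLAlchemyTimeoutError as e:
            # Pool/connection acquisition timed out.
            raise DatabaseTimeoutError(str(e)) from e
        except QueryCanceledError as e:
            # PostgreSQL's statement_timeout canceled the query mid-flight.
            raise DatabaseTimeoutError(str(e)) from e
    return wrapper
```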
Fixes #8108
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
When multiple concurrent transactions try to upsert the same tools,
they can deadlock if they acquire row locks in different orders.
This fix sorts tools by name before the bulk INSERT to ensure all
transactions acquire locks in a consistent order, preventing deadlocks.
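Roughly, the change looks like the sketch below, assuming a PostgreSQL
ON CONFLICT upsert; ToolModel and the conflict target are illustrative
stand-ins for the real ORM model:
```python
from sqlalchemy.dialects.postgresql import insert

async def bulk_upsert_tools(session, tool_rows):
    # Sort by name so every concurrent transaction acquires row locks
    # in the same order; inconsistent ordering is what allows deadlock.
    rows = sorted(tool_rows, key=lambda r: r["name"])
    stmt = insert(ToolModel).values(rows)
    stmt = stmt.on_conflict_do_update(
        index_elements=["name", "organization_id"],  # illustrative conflict target
        set_={"json_schema": stmt.excluded.json_schema},
    )
    await session.execute(stmt)
```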
Fixes #8666
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
Add ToolError to exception handling alongside McpError in MCP client classes.
ToolError is raised by fastmcp for input validation errors (e.g., missing
required properties like 'filename'). Both error types are expected user-facing
errors from external MCP servers and should be logged at warning/debug level
to avoid triggering production alerts.
Fixes the production error: "fastmcp.exceptions.ToolError: Input
validation error: 'filename' is a required property"
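The handling pattern, sketched with illustrative call-site names:
```python
from fastmcp.exceptions import ToolError
from mcp.shared.exceptions import McpError

async def call_tool_safely(session, logger, tool_name: str, arguments: dict):
    try:
        return await session.call_tool(tool_name, arguments)
    except (McpError, ToolError) as e:
        # Both are expected, user-facing failures from external MCP
        # servers; log quietly instead of triggering error alerts.
        logger.warning(f"MCP tool call failed: {e}")
        raise
```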
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
The MCP library internally uses TaskGroup for async operations, which can
raise ExceptionGroup when cleanup fails. This was causing unhandled errors
to propagate in production.
Changes:
- Update cleanup() method in AsyncBaseMCPClient to catch ExceptionGroup
using except* syntax and log errors at debug level (best-effort cleanup)
- Remove redundant try/except blocks in mcp_manager.py and
mcp_server_manager.py that incorrectly re-raised cleanup exceptions
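The cleanup now reads roughly as follows (requires Python 3.11+ for
except*; the exit-stack and logger attribute names are assumptions):
```python
async def cleanup(self):
    try:
        await self._exit_stack.aclose()
    except* Exception as eg:
        # Best-effort cleanup: the MCP library's TaskGroup teardown can
        # surface an ExceptionGroup; log at debug level and swallow it.
        for exc in eg.exceptions:
            self.logger.debug(f"Ignoring error during MCP cleanup: {exc}")
```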
Fixes #8560
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
- Fix typo "upate" -> "update" in TODO comments (mcp_manager.py, mcp_server_manager.py)
- Improve comments in OAuth callback handler to explain why MCPOAuthSession
is used directly (callback is unauthenticated, manager requires actor)
- Clean up variable naming in callback handler
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* feat: add tags support to blocks
* fix: add timestamps and org scoping to blocks_tags
Addresses PR feedback:
1. Migration: Added timestamps (created_at, updated_at), soft delete
(is_deleted), audit fields (_created_by_id, _last_updated_by_id),
and organization_id to blocks_tags table for filtering support.
Follows SQLite baseline pattern (composite PK of block_id+tag, no
separate id column) to avoid insert failures.
2. ORM: Relationship already correct with lazy="raise" to prevent
implicit joins and passive_deletes=True for efficient CASCADE deletes.
3. Schema: Changed normalize_tags() from Any to dict for type safety.
4. SQLite: Added blocks_tags to SQLite baseline schema to prevent
table-not-found errors.
5. Code: Updated all tag row inserts to include organization_id.
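For reference, a sketch of the resulting table shape in SQLAlchemy Core;
column names follow the list above, while types, defaults, and the
foreign-key target are illustrative:
```python
import sqlalchemy as sa

metadata = sa.MetaData()

blocks_tags = sa.Table(
    "blocks_tags",
    metadata,
    # Composite PK of block_id + tag, no separate id column,
    # matching the SQLite baseline pattern.
    sa.Column("block_id", sa.String, sa.ForeignKey("blocks.id"), primary_key=True),
    sa.Column("tag", sa.String, primary_key=True),
    sa.Column("organization_id", sa.String, nullable=False),
    sa.Column("created_at", sa.DateTime, server_default=sa.func.now()),
    sa.Column("updated_at", sa.DateTime, server_default=sa.func.now()),
    sa.Column("is_deleted", sa.Boolean, server_default=sa.false(), nullable=False),
    sa.Column("_created_by_id", sa.String, nullable=True),
    sa.Column("_last_updated_by_id", sa.String, nullable=True),
)
```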
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add ORM columns and update SQLite baseline for blocks_tags
Fixes test failures (CompileError: Unconsumed column names: organization_id):
1. ORM: Added organization_id, timestamps, audit fields to BlocksTags
ORM model to match database schema from migrations.
2. SQLite baseline: Added full column set to blocks_tags (organization_id,
timestamps, audit fields) to match PostgreSQL schema.
3. Test: Added 'tags' to expected Block schema fields.
This ensures SQLite and PostgreSQL have matching schemas and the ORM
can consume all columns that the code inserts.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* revert change to existing alembic migration
* fix: remove passive_deletes and SQLite support for blocks_tags
1. Removed passive_deletes=True from Block.tags relationship to match
the AgentsTags pattern (neither has ondelete CASCADE in the DB schema).
2. Removed SQLite branch from _replace_block_pivot_rows_async since
blocks_tags table is PostgreSQL-only (migration skips SQLite).
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* api sync
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: wrap turbopuffer vector writes in thread pool
The turbopuffer library does CPU-intensive base64 encoding of vectors
synchronously inside async functions (_async_transform_recursive →
b64encode_vector), blocking the event loop during file uploads.
Solution: created a _run_turbopuffer_write_in_thread() helper that runs
turbopuffer writes in an isolated event loop within a worker thread.
Applied to all vector write operations:
- insert_tools()
- insert_archival_memories()
- insert_messages()
- insert_file_passages()
This prevents pybase64.b64encode_as_string() from blocking the main
event loop during vector encoding.
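A sketch of the helper; the factory-callable shape is an assumption, and
the real helper may take the namespace and payload directly:
```python
import asyncio
from typing import Any, Callable, Coroutine

def _run_in_fresh_loop(make_coro: Callable[[], Coroutine[Any, Any, Any]]) -> Any:
    # A brand-new event loop, owned entirely by this worker thread, runs
    # the write; the library's synchronous base64 encoding happens here
    # instead of on the main loop.
    return asyncio.run(make_coro())

async def _run_turbopuffer_write_in_thread(
    make_coro: Callable[[], Coroutine[Any, Any, Any]],
) -> Any:
    # to_thread() hands the whole loop-plus-encoding off to a worker thread.
    return await asyncio.to_thread(_run_in_fresh_loop, make_coro)
```
Call sites would then pass a factory, e.g.
`_run_turbopuffer_write_in_thread(lambda: ns.write(...))`, so the
coroutine is created inside the fresh loop rather than on the main one.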
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: wrap all turbopuffer operations in thread pool
Extended the thread pool wrapping to ALL turbopuffer write operations,
including delete operations, for complete isolation from the main event loop.
All turbopuffer namespace.write() calls now run in isolated event loops
within worker threads, so no CPU-bound encoding work can block the main
event loop.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
MarkItDown.convert() does blocking file I/O and CPU-intensive PDF parsing.
This was blocking the event loop during file uploads.
Now wraps the entire markitdown pipeline (tempfile write, convert,
cleanup) in asyncio.to_thread() so it runs in a worker thread.
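A sketch of the wrapped pipeline; the function names are illustrative:
```python
import asyncio
import os
import tempfile

from markitdown import MarkItDown

def _convert_blocking(data: bytes, suffix: str) -> str:
    # Tempfile write, conversion, and cleanup all stay in the worker
    # thread; none of this blocking I/O or PDF parsing touches the loop.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as f:
        f.write(data)
        path = f.name
    try:
        return MarkItDown().convert(path).text_content
    finally:
        os.unlink(path)

async def convert_file_async(data: bytes, suffix: str) -> str:
    return await asyncio.to_thread(_convert_blocking, data, suffix)
```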
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Critical fixes:
- llm_client_base.send_llm_request() now calls await self.request_async() instead of self.request()
- Remove the unused sync get_openai_embedding(), which used the sync OpenAI client
- Remove deprecated compile_in_thread_async() from Memory
These were blocking the event loop during LLM requests and embeddings.
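The first change amounts to something like this (signature illustrative):
```python
async def send_llm_request(self, request_data):
    # Was: return self.request(request_data), a sync call that blocked
    # the event loop for the full duration of the LLM request.
    return await self.request_async(request_data)
```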
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Remove commented-out sync_value_and_value_enc model validator and
unused imports (traceback, model_validator, logger). This code was
disabled and replaced with async decryption via from_orm_async methods.
Adds explicit handling for httpx network errors (ReadError, WriteError,
ConnectError) in AnthropicClient, OpenAIClient, and GoogleVertexClient.
These errors can occur during streaming when the connection is unexpectedly
closed while reading/writing data.
Maps these errors to LLMConnectionError for consistent error handling.
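The pattern applied in each client, sketched; LLMConnectionError is the
project's error type named above, stubbed here so the sketch stands alone:
```python
import httpx

class LLMConnectionError(Exception):
    """Stand-in for the project's connection-error type."""

async def stream_with_error_mapping(stream):
    try:
        async for chunk in stream:
            yield chunk
    except (httpx.ReadError, httpx.WriteError, httpx.ConnectError) as e:
        # The connection dropped mid-stream; map to the shared error type
        # so all providers fail consistently.
        raise LLMConnectionError(f"Connection error during streaming: {e}") from e
```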
Fixes #8221 (and duplicate #8156)
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
* fix: handle missing tool_call_id in Anthropic message conversion
- Add null check for self.tool_returns before iterating
- Fall back to message's tool_call_id when tool_return.tool_call_id is None
- Improve error message to show actual tool name from message.name
- Only raise error if no valid tool_call_id is available from either source
This fixes the error "Anthropic API requires tool_use_id to be set" that
occurs when a ToolReturn object in the database doesn't have tool_call_id
set, by using the message-level tool_call_id as a fallback.
Fixes #8379
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-Authored-By: Letta <noreply@letta.com>
* fix: restrict tool_call_id fallback to single tool returns
The message-level `self.tool_call_id` is set to the first tool return's ID
for legacy compatibility. For parallel tool calls with multiple tool_returns,
using this as a fallback would incorrectly assign the first tool return's ID
to all subsequent returns missing their own ID.
This change:
- Only allows the fallback when there's exactly one tool return
- For multiple tool returns, each must have its own ID or raise an error
- Adds tool return index to error messages for better debugging
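Taken together, the conversion logic sketches out as below; attribute
names follow the description and the surrounding Message class is assumed:
```python
def _resolve_tool_use_ids(self):
    tool_returns = self.tool_returns or []  # null check before iterating
    for i, tool_return in enumerate(tool_returns):
        tool_use_id = tool_return.tool_call_id
        if tool_use_id is None and len(tool_returns) == 1:
            # Legacy fallback: the message-level tool_call_id mirrors the
            # first return's ID, so reusing it is only safe when there is
            # exactly one return; with parallel tool calls it would
            # mislabel later returns.
            tool_use_id = self.tool_call_id
        if tool_use_id is None:
            raise ValueError(
                f"Anthropic API requires tool_use_id to be set "
                f"(tool return {i}, tool: {self.name})"
            )
        yield tool_use_id
```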
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
The Anthropic API returns a 413 status code with error type `request_too_large`
when the request payload exceeds the maximum allowed size. This error should
be converted to `ContextWindowExceededError` so the system can handle it
appropriately (e.g., by summarizing the conversation to reduce context size).
Changes:
- Added `request_too_large` and `request exceeds the maximum size` to the
early string-based error detection in `handle_llm_error`
- Added specific handling for HTTP 413 status code in the `APIStatusError`
handler
- Added tests to verify the new error handling behavior
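Sketched handling; ContextWindowExceededError is the project's type,
stubbed here for self-containment:
```python
import anthropic

class ContextWindowExceededError(Exception):
    """Stand-in for the project's context-overflow error."""

def handle_llm_error(e: Exception) -> Exception:
    # Early string-based detection: some SDK paths only expose a message.
    msg = str(e).lower()
    if "request_too_large" in msg or "request exceeds the maximum size" in msg:
        return ContextWindowExceededError(str(e))
    if isinstance(e, anthropic.APIStatusError) and e.status_code == 413:
        # Payload too large: treat it like a context overflow so the
        # caller can summarize and retry.
        return ContextWindowExceededError(str(e))
    return e
```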
Fixes: #8422
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
MCP errors from external servers (e.g., "The specified key does not exist")
are user-facing issues, not system errors. Downgrading the log level from
warning to debug prevents these expected failures from triggering production
alerts in Datadog/Sentry.
Fixes #8370
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
* feat: Add conversation_id filtering to message list and search endpoints
Add optional conversation_id parameter to filter messages by conversation:
- client.agents.messages.list
- client.messages.list
- client.messages.search
Changes:
- Added conversation_id field to MessageSearchRequest and SearchAllMessagesRequest schemas
- Added conversation_id filtering to list_messages in message_manager.py
- Updated get_agent_recall_async and get_all_messages_recall_async in server.py
- Added conversation_id query parameter to router endpoints
- Updated Turbopuffer client to support conversation_id filtering in searches
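From the client side, the new filter is used like this; parameter names
other than conversation_id, and the ID value itself, are illustrative:
```python
# List only the messages belonging to one conversation.
messages = client.agents.messages.list(
    agent_id=agent.id,
    conversation_id="conv-123",  # illustrative ID
)

# The same filter applies to cross-agent search.
results = client.messages.search(
    query="deployment checklist",
    conversation_id="conv-123",
)
```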
Fixes #8320
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Charles Packer <cpacker@users.noreply.github.com>
* add conversation_id to message and tpuf
* default messages filter for backward compatibility
* add test and auto gen
* fix integration test
* fix test
* update test
---------
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Charles Packer <cpacker@users.noreply.github.com>
Co-authored-by: christinatong01 <christina@letta.com>