The cursor-based pagination was not accounting for sort order. When using
descending order (the default), "after cursor X" should return items with
id < X (items that come after X in the descending result set), but the code
was using id > X which caused infinite loops in clients iterating through pages.
This fix adjusts the cursor comparison based on the sort order:
- ascending: after=id > X, before=id < X
- descending: after=id < X, before=id > X
Note: Other pagination methods (list_agent_sources_async, list_agent_tools_async,
list_agent_groups_async) may have the same issue and should be audited.
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* fix: remove deprecation from agent passages endpoints
The client.agent.passages endpoints (list, create, search, delete) were
incorrectly marked as deprecated. This would break significant amounts
of user code and negatively impact developer experience.
Fixes#9116
Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com>
* stage publish api
---------
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com>
Co-authored-by: Ari Webb <ari@letta.com>
The MiniMaxProvider class was missing a check_api_key() implementation,
causing /v1/providers/check to return a 500 error when validating
MiniMax API keys. The base Provider class raises NotImplementedError.
This adds check_api_key() using the Anthropic client (since MiniMax uses
an Anthropic-compatible API), following the same pattern as AnthropicProvider.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
Revert "fix: ensure stop_reason is always set and reduce noisy logs (#9046)"
This reverts commit 4241a360579440d2697124ba69061d0e46ecc5e9.
**Problem:**
After the original change, caren-code-agent reported streams hanging
indefinitely. The trace shows ttft (time to first token) succeeds, but
the stream never closes.
**Root Cause (suspected):**
The change modified `is_complete=is_done` to `is_complete=saw_done`,
meaning error events no longer mark the stream as complete immediately.
This may cause timing issues where clients wait for more data before
the finalizer runs.
**Fix:**
Revert to the defensive "belt-and-suspenders" approach that always
appends [DONE]. The noisy logs are preferable to hanging streams.
The original comment noted: "Even if a previous chunk set `complete`,
an extra [DONE] is harmless and ensures SDKs that rely on explicit
[DONE] will exit."
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
**Problem:**
Runs failed with error:
```
Argument step_id does not match type <class 'str'>; is None of type <class 'NoneType'>
```
This happened when processing approval responses where the original
approval request message had `step_id=None`.
**Root Cause:**
Line 672 in `_step()` directly used `approval_request.step_id`:
```python
step_id = approval_request.step_id # Can be None!
step_metrics = await self.step_manager.get_step_metrics_async(step_id=step_id, ...)
```
`Message.step_id` is `Optional[str]` (default None), but `get_step_metrics_async`
has `step_id: str` with `@enforce_types` validation.
Old approval messages or edge cases could have `step_id=None`, causing
the enforce_types decorator to reject the call.
**Fix:**
Check if `step_id is None` and generate a new step_id + initialize step
checkpoint if needed, instead of assuming step_id always exists.
**Note:**
Similar issue exists in letta_agent_v2.py and temporal agents, but v2
is deprecated.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
fix: load default provider config when summarizer uses different provider
**Problem:**
Summarization failed when agent used one provider (e.g., Google AI) but
summarizer config specified a different provider (e.g., Anthropic):
```python
# Agent LLM config
model_endpoint_type='google_ai', handle='gemini-something/gemini-2.5-pro',
context_window=100000
# Summarizer config
model='anthropic/claude-haiku-4-5-20251001'
# Bug: Resulting summarizer_llm_config mixed Google + Anthropic settings
model='claude-haiku-4-5-20251001', model_endpoint_type='google_ai', # ❌ Wrong endpoint!
context_window=100000 # ❌ Google's context window, not Anthropic's default!
```
This sent Claude requests to Google AI endpoints with incorrect parameters.
**Root Cause:**
`_build_summarizer_llm_config()` always copied the agent's LLM config as base,
then patched model/provider fields. But this kept all provider-specific settings
(endpoint, context_window, etc.) from the wrong provider.
**Fix:**
1. Parse provider_name from summarizer handle
2. Check if it matches agent's model_endpoint_type (or provider_name for custom)
3. **If YES** → Use agent config as base, override model/handle (same provider)
4. **If NO** → Load default config via `provider_manager.get_llm_config_from_handle()` (new provider)
**Example Flow:**
```python
# Agent: google_ai/gemini-2.5-pro
# Summarizer: anthropic/claude-haiku
provider_name = "anthropic" # Parsed from handle
provider_matches = ("anthropic" == "google_ai") # False ❌
# Different provider → load default Anthropic config
base = await provider_manager.get_llm_config_from_handle(
handle="anthropic/claude-haiku",
actor=self.actor
)
# Returns: model_endpoint_type='anthropic', endpoint='https://api.anthropic.com', etc. ✅
```
**Result:**
- Summarizer with different provider gets correct default config
- No more mixing Google endpoints with Anthropic models
- Same-provider summarizers still inherit agent settings efficiently
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* feat: add conversation_id to message search results
Add conversation_id field to all *MessageListResult classes
(SystemMessageListResult, UserMessageListResult, ReasoningMessageListResult,
AssistantMessageListResult) so that conversation IDs are returned from
the /messages/search endpoint alongside agent IDs.
Fixes#9055
Co-authored-by: Charles Packer <cpacker@users.noreply.github.com>
* chore: regenerate SDK and OpenAPI spec
Regenerate autogenerated files after adding conversation_id to
message search result schemas.
Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>
---------
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Charles Packer <cpacker@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>
fix: gracefully skip assistant messages with empty content in LLM format conversion
**Problem:**
Context window calculation crashed with AssertionError when converting messages
to Google/Anthropic/OpenAI format:
```
AssertionError at line 2047: assert self.tool_calls is not None or
text_content is not None or len(self.content) > 1
```
This happened when loading agents with old/malformed messages that had
`content=None` or `content=[]` in the database.
**Root Cause:**
The Message ORM model allows `content: Optional[List[...]] = None` (line 252),
but format conversion methods assumed content would always have extractable text
or tool calls.
Scenarios that triggered crashes:
1. Assistant message with `content=None` (old migrations/edge cases)
2. Assistant message with `content=[]` (message creation bugs)
3. Assistant message with single non-text content that doesn't match extraction logic
**Fix:**
Replaced assertions with defensive checks in 3 conversion methods:
1. `to_google_dict()` (line 2054) - Return None to skip unconvertible messages
2. `to_openai_responses_api_dicts()` (line 1476) - Return early to skip
3. `to_anthropic_dict()` (line 1794) - Return None to skip
Pattern: Check for empty content, return None/early to skip gracefully.
**Result:**
- Context window calculation no longer crashes on malformed/old messages
- Messages with no convertible content are silently skipped
- Consistent with existing Anthropic reasoning-only message handling (line 1308)
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
**Problem:**
Error logs showed empty detail fields when exceptions had no message:
```
Error during step processing:
Run run-xxx stopped with unknown error: , error_data: {...'detail': ''}
```
This made debugging production issues difficult as the actual error type
was hidden.
**Root Cause:**
Python exceptions created with no arguments (e.g., `Exception()` or caught
and re-raised in certain ways) have `str(e) == ""`:
```python
e = Exception()
str(e) # Returns ""
repr(e) # Returns "Exception()"
```
When exceptions with empty string representations were caught, all logging
and error messages showed blank details.
**Fix:**
Use `str(e) or repr(e)` fallback pattern in 3 places:
1. `letta_agent_v3.py` stream() exception handler (line 406)
2. `letta_agent_v3.py` step() exception handler (line 928)
3. `streaming_service.py` generic exception handler (line 469)
**Result:**
- Error logs now show `Exception()` or similar instead of empty string
- Helps identify exception types even when message is missing
- Better production debugging without changing exception handling logic
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
fix: consume [DONE] token after error events to prevent forced finalizer append
**Problem:**
Stream finalizer was frequently logging warning and appending forced [DONE]:
```
[Stream Finalizer] Appending forced [DONE] for run=run-xxx (saw_error=True,
saw_done=False, final_stop_reason=llm_api_error)
```
This happened on every error, even though streaming_service.py already yields
[DONE] after all error events.
**Root Cause:**
Line 266: `is_done = saw_done or saw_error` caused loop to break immediately
after seeing error event, BEFORE consuming the [DONE] chunk that follows:
```python
is_done = saw_done or saw_error
await writer.write_chunk(...)
if is_done: # Breaks on error!
break
```
Sequence:
1. streaming_service.py yields: `event: error\ndata: {...}\n\n`
2. Redis reader sees error → sets `saw_error=True`
3. Sets `is_done=True` and breaks
4. Never reads next chunk: `data: [DONE]\n\n`
5. Finalizer runs → `saw_done=False` → appends forced [DONE]
**Fix:**
1. Only break when `saw_done=True` (not `saw_error`) → allows consuming [DONE]
2. Only run finalizer when `saw_done=False` → reduces log noise
**Result:**
- [DONE] now consumed naturally from streaming_service.py error handlers
- Finalizer warning only appears when truly needed (fallback cases)
- Cleaner production logs
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
**Problem:**
Production error showed runs being marked as failed with stop_reason=None,
which violates LettaStopReason's Pydantic schema (requires valid enum value).
This caused cascading validation errors that got stored in metadata.
Example error:
```
Run is already in a terminal state failed with stop reason None, but is being
updated with data {'status': 'failed', 'stop_reason': None, 'metadata':
{'error': "1 validation error for LettaStopReason\nstop_reason Input should
be 'end_turn', 'error', ... [type=enum, input_value=None]"}}
```
**Root Causes:**
1. routers/v1/agents.py had 3 exception handlers creating RunUpdate(status=failed)
without stop_reason
2. Success path assumed result.stop_reason always exists (AttributeError if None)
3. run_manager.py tried to create LettaStopReason(stop_reason=None) when
refreshing result messages
**Fixes:**
1. Added stop_reason=StopReasonType.error to 3 exception handlers
2. Added defensive None checks before accessing result.stop_reason.stop_reason
3. Added fallback to StopReasonType.error when pydantic_run.stop_reason is None
**Trigger:**
OpenAI BadRequestError for invalid tool schema → exception handlers marked
run as failed without stop_reason → validation error when constructing response
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
fix: use shared event + .athrow() to properly set stream_was_cancelled flag
**Problem:**
When a run is cancelled via /cancel endpoint, `stream_was_cancelled` remained
False because `RunCancelledException` was raised in the consumer code (wrapper),
which closes the generator from outside. This causes Python to skip the
generator's except blocks and jump directly to finally with the wrong flag value.
**Solution:**
1. Shared `asyncio.Event` registry for cross-layer cancellation signaling
2. `cancellation_aware_stream_wrapper` sets the event when cancellation detected
3. Wrapper uses `.athrow()` to inject exception INTO generator (not consumer-side raise)
4. All streaming interfaces check event in `finally` block to set flag correctly
5. `streaming_service.py` handles `RunCancelledException` gracefully, yields [DONE]
**Changes:**
- streaming_response.py: Event registry + .athrow() injection + graceful handling
- openai_streaming_interface.py: 3 classes check event in finally
- gemini_streaming_interface.py: Check event in finally
- anthropic_*.py: Catch RunCancelledException
- simple_llm_stream_adapter.py: Create & pass event to interfaces
- streaming_service.py: Handle RunCancelledException, yield [DONE], skip double-update
- routers/v1/{conversations,runs}.py: Pass event to wrapper
- integration_test_human_in_the_loop.py: New test for approval + cancellation
**Tests:**
- test_tool_call with cancellation (OpenAI models) ✅
- test_approve_with_cancellation (approval flow + concurrent cancel) ✅
**Known cosmetic warnings (pre-existing):**
- "Run already in terminal state" - agent loop tries to update after /cancel
- "Stream ended without terminal event" - background streaming timing race
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* fix: don't need embedding model for self hosted
* stage publish api
* passes tests
* add test
* remove unnecessary upgrades
* update revision order db migrations
* add timeout for ci
* make favorite tag a const
* add favorite:user:{userId} for favorites
* favorite agent upon initial creation
* rename const
* add eslint ignore
* expect favorite tag
* test: add comprehensive provider trace telemetry tests
Add two test files for provider trace telemetry:
1. test_provider_trace.py - Integration tests for:
- Basic agent steps (streaming and non-streaming)
- Tool calls
- Telemetry context fields (agent_id, agent_tags, step_id, run_id)
- Multi-step conversations
- Request/response JSON content
2. test_provider_trace_summarization.py - Unit tests for:
- simple_summary() telemetry context passing
- summarize_all() telemetry pass-through
- summarize_via_sliding_window() telemetry pass-through
- Summarizer class runtime vs constructor telemetry
- LLMClient.set_telemetry_context() method
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* test: add telemetry tests for tool generation, adapters, and agent versions
Add comprehensive unit tests for provider trace telemetry:
- TestToolGenerationTelemetry: Verify /generate-tool endpoint sets
call_type="tool_generation" and has no agent context
- TestLLMClientTelemetryContext: Verify LLMClient.set_telemetry_context
accepts all telemetry fields
- TestAdapterTelemetryAttributes: Verify base adapter and subclasses
(LettaLLMRequestAdapter, LettaLLMStreamAdapter) support telemetry attrs
- TestSummarizerTelemetry: Verify Summarizer stores and passes telemetry
- TestAgentAdapterInstantiation: Verify LettaAgentV2 creates Summarizer
with correct agent_id
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* ci: add provider trace telemetry tests to unit test workflow
Add the new provider trace test files to the CI matrix:
- test_provider_trace_backends.py
- test_provider_trace_summarization.py
- test_provider_trace_agents.py
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: update socket backend test to match new record structure
The socket backend record structure changed - step_id/run_id are now
at top level, and model/usage are nested in request/response objects.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add step_id to V1 agent telemetry context
Pass step_id to set_telemetry_context in both streaming and non-streaming
paths in LettaAgent (v1). The step_id is available via step_metrics.id
in the non-streaming path and passed explicitly in the streaming path.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add agent_id, run_id, step_id to summarization provider traces
Summarization LLM calls were missing telemetry context (agent_id,
agent_tags, run_id, step_id), making it impossible to attribute
summarization costs to specific agents or trace them back to the
step that triggered compaction.
Changes:
- Add step_id param to simple_summary() and set_telemetry_context()
- Add agent_id, agent_tags, run_id, step_id to summarize_all() and
summarize_via_sliding_window()
- Update Summarizer class to accept and pass telemetry context
- Update LettaAgentV3.compact() to pass full telemetry context
- Update LettaAgentV2.summarize_conversation_history() with run_id/step_id
- Update LettaAgent (v1) streaming methods with run_id param
- Add run_id/step_id to SummarizeParams for Temporal activities
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: update test mock to accept new summarization params
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* mvp
* perfrom type coercion in sandbox
* fix: safely resolve typing annotations on host
Use an AST whitelist for generic annotations to avoid eval while keeping list/dict coercion working.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* feat(core): add image support in tool returns [LET-7140]
Enable tool_return to support both string and ImageContent content parts,
matching the pattern used for user message inputs. This allows tools
executed client-side to return images back to the agent.
Changes:
- Add LettaToolReturnContentUnion type for text/image content parts
- Update ToolReturn schema to accept Union[str, List[content parts]]
- Update converters for each provider:
- OpenAI Chat Completions: placeholder text for images
- OpenAI Responses API: full image support
- Anthropic: full image support with base64
- Google: placeholder text for images
- Add resolve_tool_return_images() for URL-to-base64 conversion
- Make create_approval_response_message_from_input() async
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): support images in Google tool returns as sibling parts
Following the gemini-cli pattern: images in tool returns are sent as
sibling inlineData parts alongside the functionResponse, rather than
inside it.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* test(core): add integration tests for multi-modal tool returns [LET-7140]
Tests verify that:
- Models with image support (Anthropic, OpenAI Responses API) can see
images in tool returns and identify the secret text
- Models without image support (Chat Completions) get placeholder text
and cannot see the actual image content
- Tool returns with images persist correctly in the database
Uses secret.png test image containing hidden text "FIREBRAWL" that
models must identify to pass the test.
Also fixes misleading comment about Anthropic only supporting base64
images - they support URLs too, we just pre-resolve for consistency.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: simplify tool return image support implementation
Reduce code verbosity while maintaining all functionality:
- Extract _resolve_url_to_base64() helper in message_helper.py (eliminates duplication)
- Add _get_text_from_part() helper for text extraction
- Add _get_base64_image_data() helper for image data extraction
- Add _tool_return_to_google_parts() to simplify Google implementation
- Add _image_dict_to_data_url() for OpenAI Responses format
- Use walrus operator and list comprehensions where appropriate
- Add integration_test_multi_modal_tool_returns.py to CI workflow
Net change: -120 lines while preserving all features and test coverage.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(tests): improve prompt for multi-modal tool return tests
Make prompts more direct to reduce LLM flakiness:
- Simplify tool description: "Retrieves a secret image with hidden text. Call this function to get the image."
- Change user prompt from verbose request to direct command: "Call the get_secret_image function now."
- Apply to both test methods
This reduces ambiguity and makes tool calling more reliable across different LLM models.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix bugs
* test(core): add google_ai/gemini-2.0-flash-exp to multi-modal tests
Add Gemini model to test coverage for multi-modal tool returns. Google AI already supports images in tool returns via sibling inlineData parts.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(ui): handle multi-modal tool_return type in frontend components
Convert Union<string, LettaToolReturnContentUnion[]> to string for display:
- ViewRunDetails: Convert array to '[Image here]' placeholder
- ToolCallMessageComponent: Convert array to '[Image here]' placeholder
Fixes TypeScript errors in web, desktop-ui, and docker-ui type-checks.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
* feat: byok provider models in db also
* make tests and sync api
* fix inconsistent state with recreating provider of same name
* fix sync on byok creation
* update revision
* move stripe code for testing purposes
* revert
* add refresh byok models endpoint
* just stage publish api
* add tests
* reorder revision
* add test for name clashes