* feat: add order_by and order params to /v1/conversations list endpoint [LET-7628]
Added sorting support to the conversations list endpoint, matching the pattern from /v1/agents.
**API Changes:**
- Added `order` query param: "asc" or "desc" (default: "desc")
- Added `order_by` query param: "created_at" or "last_run_completion" (default: "created_at")
**Implementation:**
**created_at ordering:**
- Simple ORDER BY on ConversationModel.created_at
- No join required, fast query
- Nulls not applicable (created_at always set)
**last_run_completion ordering:**
- LEFT JOIN with runs table using subquery
- Subquery: MAX(completed_at) grouped by conversation_id
- Uses OUTER JOIN so conversations with no runs are included
- Nulls last ordering (conversations with no runs go to end)
- Index on runs.conversation_id ensures performant join
**Pagination:**
- Cursor-based pagination with `after` parameter
- Handles null values correctly for last_run_completion
- For created_at: simple timestamp comparison
- For last_run_completion: complex null-aware cursor logic
**Performance:**
- Existing index: `ix_runs_conversation_id` on runs table
- Subquery with GROUP BY is efficient for this use case
- OUTER JOIN ensures conversations without runs are included
**Follows agents pattern:**
- Same parameter names (order, order_by)
- Same Literal types and defaults
- Converts "asc"/"desc" to ascending boolean internally
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: order
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add recent_chunks field to feedback endpoint
Accept an optional recent_chunks string in the /v1/metadata/feedback
POST body and forward it to PostHog. Contains the last 100 truncated
streaming chunks as JSON for debugging issues reported via /feedback.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate openapi specs with recent_chunks field
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: make agent_id optional in conversations list endpoint [LET-7612]
Allow listing all conversations without filtering by agent_id.
**Router changes (conversations.py):**
- Changed agent_id from required (`...`) to optional (`None`)
- Updated description to clarify behavior
- Updated docstring to reflect optional filtering
**Manager changes (conversation_manager.py):**
- Updated list_conversations signature: agent_id: str → Optional[str]
- Updated docstring to clarify optional behavior
- Summary search query: conditionally adds agent_id filter only if provided
- Default list logic: passes agent_id (can be None) to list_async
**How it works:**
- Without agent_id: returns all conversations for the user's organization
- With agent_id: returns conversations filtered by that agent
- list_async handles None gracefully via **kwargs pattern
**Use case:**
- Cloud UI can list all user conversations across agents
- Still supports filtering by agent_id when needed
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: update logs
* chore: update logs
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add two-way mode control for listener connections
Enable bidirectional permission mode control between letta-cloud UI and letta-code instances.
**Backend:**
- Added ModeChangeMessage and ModeChangedMessage to WebSocket protocol
- Added sendModeChange endpoint (/v1/listeners/:connectionId/mode)
- listenersRouter publishes mode_change via Redis Pub/Sub
- listenersHandler handles mode_changed acknowledgments from letta-code
- Stores current mode in Redis for UI state sync
**Contract:**
- Added sendModeChange contract with PermissionModeSchema
- 4 modes: default, acceptEdits, plan, bypassPermissions
**Frontend:**
- Extended PermissionMode type to 4 modes (was 2: ask/never)
- PermissionModeSelector now shows all 4 modes with descriptions
- Added disabled prop (grayed out when Cloud orchestrator selected)
- PermissionModeContext.sendModeChangeToDevice() calls API
- AgentMessenger sends mode changes to device on mode/device change
- Updated auto-approval logic (only in Cloud mode, only for bypassPermissions)
- Updated inputMode logic (device handles approvals, not cloud)
**Translations:**
- Updated en.json with 4 mode labels and descriptions
- Removed legacy "askAlways" and "neverAsk" keys
**Mode Behavior:**
- default: Ask permission for each tool
- acceptEdits: Auto-approve file edits only
- plan: Read-only exploration (denies writes)
- bypassPermissions: Auto-approve everything
**Lint Fixes:**
- Removed unused imports and functions from trackingMiddleware.ts
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: store mode in connectionData and show approvals for all modes
**Backend:**
- Fixed Redis WRONGTYPE error - store currentMode inside connectionData object
- Changed const connectionData to let connectionData (needs mutation)
- Updated mode_changed handler to reassign entire connectionData object
- Updated ping handler for consistency (also reassigns connectionData)
- Added currentMode field to ListenerConnectionSchema (optional)
**Frontend:**
- Simplified inputMode logic - always show approval UI when toolCallsToApprove.length > 0
- Removed mode-specific approval filtering (show approvals even in bypass/acceptEdits for visibility)
- Users can see what tools are being auto-approved during execution
**Why:**
- Redis key is a JSON string (via setRedisData), not a hash
- Cannot use hset on string keys - causes WRONGTYPE error
- Must update entire object via setRedisData like ping handler does
- Approval visibility helpful for debugging/understanding agent behavior
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: use useMutation hook for sendModeChange instead of direct call
cloudAPI is initialized via initTsrReactQuery, so sendModeChange is a
mutation hook object, not a callable function. Use .useMutation() at the
component level and mutateAsync in the callback.
Co-authored-by: Shubham Naik <4shub@users.noreply.github.com>
* chore: update logs
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Shubham Naik <4shub@users.noreply.github.com>
* feat: add flattened feedback context fields for PostHog filterability
Add system info (local_time, device_type, cwd), session stats (token counts,
timing), agent info (agent_name, agent_description, model), and account info
(billing_tier) as flat top-level fields to the /v1/metadata/feedback endpoint.
Flat fields allow direct PostHog filtering/breakdown without HogQL.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API specs
---------
Co-authored-by: Letta <noreply@letta.com>
* new prompts for sliding window and all compaction + defaults to corresponding prompt
* regenerate api spec
---------
Co-authored-by: Amy Guan <amy@letta.com>
* feat: add template rollback endpoint [LET-7423]
Adds POST /v1/templates/:template_name/rollback endpoint to restore templates to previous versions.
Key features:
- Rollback to any numbered version (1, 2, 3, etc.) or "latest"
- Auto-saves unsaved changes before rollback to prevent data loss
- Validates input (rejects "current"/"dev" as target versions)
- Preserves entity IDs and relationships across rollback
- Uses project context from X-Project header (no project_id in path)
Implementation includes:
- API contract in templatesContract.ts
- Handler in templatesRouter.ts with comprehensive error handling
- 9 E2E tests covering functionality and edge cases
- Updated stainless.yml for SDK generation
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore test
* fix: add X-Project header to rollback endpoint tests
The rollback endpoint uses project context from X-Project header instead of URL path.
Updated all rollback test calls to include the X-Project header with testProject value.
This follows the no-project-in-path pattern for template endpoints.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat: support both URL patterns for rollback endpoint
Added dual URL pattern support for rollback endpoint:
- `/v1/templates/:project_id/:template_name/rollback` (with project in path)
- `/v1/templates/:template_name/rollback` (NoProject, uses X-Project header)
Backend supports both patterns, but Stainless only exposes the cleaner NoProject version for SDKs.
Key changes:
- Fixed "rollback to latest" bug by resolving target version BEFORE auto-saving
- NoProject route is exported first to ensure correct route matching order
- Updated tests to use project_id in path for better compatibility
- All 8 rollback tests passing
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* bump
* bump
* bump
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: update ContextWindowCalculator to parse new system message sections
The context window calculator was using outdated position-based parsing
that only handled 3 sections (base_instructions, memory_blocks, memory_metadata).
The actual system message now includes additional sections that were not
being tracked:
- <memory_filesystem> (git-enabled agents)
- <tool_usage_rules> (when tool rules configured)
- <directories> (when sources attached)
Changes:
- Add _extract_tag_content() helper for proper XML tag extraction
- Rewrite extract_system_components() to return a Dict with all 6 sections
- Update calculate_context_window() to count tokens for new sections
- Add new fields to ContextWindowOverview schema with backward-compatible defaults
- Add unit tests for the extraction logic
* update
* generate
* fix: check attached file in directories section instead of core_memory
Files are rendered inside <directories> tags, not <memory_blocks>.
Update validate_context_window_overview assertions accordingly.
* fix: address review feedback for context window parser
- Fix git-enabled agents regression: capture bare file blocks
(e.g. <system/human.md>) rendered after </memory_filesystem> as
core_memory via new _extract_git_core_memory() method
- Make _extract_top_level_tag robust: scan all occurrences to find
tag outside container, handling nested-first + top-level-later case
- Document system_prompt tag inconsistency in docstring
- Add TODO to base_agent.py extract_dynamic_section linking to
ContextWindowCalculator to flag parallel parser tech debt
- Add tests: git-enabled agent parsing, dual-occurrence tag
extraction, pure text system prompt, git-enabled integration test
* Add log probabilities support for RL training
This enables Letta server to request and return log probabilities from
OpenAI-compatible providers (including SGLang) for use in RL training.
Changes:
- LLMConfig: Add return_logprobs and top_logprobs fields
- OpenAIClient: Set logprobs in ChatCompletionRequest when enabled
- LettaLLMAdapter: Add logprobs field and extract from response
- LettaResponse: Add logprobs field to return log probs to client
- LettaRequest: Add return_logprobs/top_logprobs for per-request override
- LettaAgentV3: Store and pass logprobs through to response
- agents.py: Handle request-level logprobs override
Usage:
response = client.agents.messages.create(
agent_id=agent_id,
messages=[...],
return_logprobs=True,
top_logprobs=5,
)
print(response.logprobs) # Per-token log probabilities
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* Add multi-turn token tracking for RL training via SGLang native endpoint
- Add TurnTokenData schema to track token IDs and logprobs per turn
- Add return_token_ids flag to LettaRequest and LLMConfig
- Create SGLangNativeClient for /generate endpoint (returns output_ids)
- Create SGLangNativeAdapter that uses native endpoint
- Modify LettaAgentV3 to accumulate turns across LLM calls
- Include turns in LettaResponse when return_token_ids=True
* Fix: Add SGLang native adapter to step() method, not just stream()
* Fix: Handle Pydantic Message objects in SGLang native adapter
* Fix: Remove api_key reference from LLMConfig (not present)
* Fix: Add missing 'created' field to ChatCompletionResponse
* Add full tool support to SGLang native adapter
- Format tools into prompt in Qwen-style format
- Parse tool calls from <tool_call> tags in response
- Format tool results as <tool_response> in user messages
- Set finish_reason to 'tool_calls' when tools are called
* Use tokenizer.apply_chat_template for proper tool formatting
- Add tokenizer caching in SGLang native adapter
- Use apply_chat_template when tokenizer available
- Fall back to manual formatting if not
- Convert Letta messages to OpenAI format for tokenizer
* Fix: Use func_response instead of tool_return for ToolReturn content
* Fix: Get output_token_logprobs from meta_info in SGLang response
* Fix: Allow None in output_token_logprobs (SGLang format includes null)
* chore: remove unrelated files from logprobs branch
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add missing call_type param to adapter constructors in letta_agent_v3
The SGLang refactor dropped call_type=LLMCallType.agent_step when extracting
adapter creation into conditional blocks. Restores it for all 3 spots (SGLang
in step, SimpleLLM in step, SGLang in stream).
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* just stage-api && just publish-api
* fix: update expected LLMConfig fields in schema test for logprobs support
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: remove rllm provider references
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* just stage-api && just publish-api
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-65-206.ec2.internal>
Co-authored-by: Letta <noreply@letta.com>
* feat: add /agents/{agent_id}/generate endpoint for direct LLM requests
Add new endpoint that makes direct LLM provider requests without agent
context, memory, tools, or state modification. This enables:
- Quick LLM queries without agent overhead
- Testing model configurations
- Simple chat completions using agent's credentials
- Comparing responses across different models
Features:
- Uses agent's LLM config by default
- Supports model override with full provider config resolution
- Non-streaming, stateless operation
- Proper error handling and validation
- Request/response schemas with Pydantic validation
Implementation:
- Add GenerateRequest and GenerateResponse schemas
- Implement generate_completion endpoint handler
- Add necessary imports (LLMError, LLMClient, HandleNotFoundError)
- Include logging and comprehensive error handling
* fix: improve error handling and fix Message construction
- Fix critical bug: use content=[TextContent(text=...)] instead of text=...
- Add explicit error handling for NoResultFound and HandleNotFoundError
- Add error handling for convert_response_to_chat_completion
- Add structured logging for debugging
- Remove unnecessary .get() calls since Pydantic validates messages
* refactor: extract generate logic to AgentCompletionService
Move the generate endpoint business logic out of the endpoint handler
into a dedicated AgentCompletionService class for better code organization
and separation of concerns.
Changes:
- Create new AgentCompletionService in services/agent_completion_service.py
- Service handles all business logic: agent validation, LLM config resolution,
message conversion, LLM client creation, and request/response processing
- Integrate service with SyncServer initialization
- Refactor generate_completion endpoint to use the service
- Endpoint now only handles HTTP concerns (auth, error mapping)
Benefits:
- Cleaner endpoint code (reduced from ~140 lines to ~25 lines)
- Better separation of concerns (HTTP vs business logic)
- Service logic can be reused or tested independently
- Follows established patterns in the codebase (AgentManager, etc.)
* feat: simplify generate API to accept just prompt text
Simplify the client interface by accepting a simple prompt string instead
of requiring clients to format messages.
Changes:
- Update GenerateRequest schema:
- Replace 'messages' array with simple 'prompt' string
- Add optional 'system_prompt' for context/instructions
- Keep 'override_model' for model selection
- Update AgentCompletionService to format messages automatically:
- Accepts prompt and optional system_prompt
- Constructs message array internally (system + user messages)
- Simpler API surface for clients
- Update endpoint documentation with new simplified examples
- Regenerate OpenAPI spec and TypeScript SDK
Benefits:
- Much simpler client experience - just send text
- No need to understand message formatting
- Still supports system prompts for context
- Cleaner API that matches common use cases
Example (before):
{
"messages": [{"role": "user", "content": "What is 2+2?"}]
}
Example (after):
{
"prompt": "What is 2+2?"
}
* test: add comprehensive integration tests for generate endpoint
Add 9 integration tests covering various scenarios:
Happy path tests:
- test_agent_generate_basic: Basic prompt -> response flow
- test_agent_generate_with_system_prompt: System prompt + user prompt
- test_agent_generate_with_model_override: Override model selection
- test_agent_generate_long_prompt: Handle longer prompts
- test_agent_generate_no_persistence: Verify no messages saved to agent
Error handling tests:
- test_agent_generate_empty_prompt_error: Empty prompt validation (422)
- test_agent_generate_invalid_agent_id: Invalid agent ID (404)
- test_agent_generate_invalid_model_override: Invalid model handle (404)
All tests verify:
- Response structure (content, model, usage)
- Proper status codes for errors
- Usage statistics (tokens, counts)
- No side effects on agent state
Tests follow existing test patterns in test_client.py and use the
letta_client SDK (assuming generate_completion method is auto-generated
from the OpenAPI spec).
* openapi
* refactor: rename AgentCompletionService to AgentGenerateCompletionManager
Rename for better clarity and consistency with codebase naming conventions:
- Rename file: agent_completion_service.py → agent_generate_completion_manager.py
- Rename class: AgentCompletionService → AgentGenerateCompletionManager
- Rename attribute: server.agent_completion_service → server.agent_generate_completion_manager
- Update docstrings: 'Service' → 'Manager'
Changes:
- apps/core/letta/services/agent_generate_completion_manager.py (renamed + updated class)
- apps/core/letta/server/server.py (import + initialization)
- apps/core/letta/server/rest_api/routers/v1/agents.py (usage in endpoint)
No functional changes, purely a naming refactor.
* fix: remove invalid Message parameters in generate manager
Remove agent_id=None and user_id=None from Message construction.
The Message model doesn't accept these as None values - only pass
required parameters (role, content).
Fixes validation error:
'Extra inputs are not permitted [type=extra_forbidden, input_value=None]'
This aligns with other Message construction patterns in the codebase
(see tools.py, memory.py examples).
* feat: improve generate endpoint validation and tests
- Add field validator for whitespace-only prompts
- Always include system message (required by Anthropic)
- Use default "You are a helpful assistant." when no system_prompt provided
- Update tests to use direct HTTP calls via httpx
- Fix test issues:
- Use valid agent ID format (agent-{uuid})
- Use available model (openai/gpt-4o-mini)
- Add whitespace validation test
- All 9 integration tests passing
* feat: add usage columns to steps table
Adds denormalized usage fields to the steps table for easier querying:
- model_handle: The model handle (e.g., "openai/gpt-4o-mini")
- cached_input_tokens: Tokens served from cache
- cache_write_tokens: Tokens written to cache (Anthropic)
- reasoning_tokens: Reasoning/thinking tokens
These fields mirror LettaUsageStatistics and are extracted from the
existing prompt_tokens_details and completion_tokens_details JSON columns.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate OpenAPI specs and SDK for usage columns
🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>
* feat: add ID format validation to agent and user schemas
Reuse existing validator types (ToolId, SourceId, BlockId, MessageId,
IdentityId, UserId) from letta.validators to enforce ID format validation
at the schema level. This ensures malformed IDs are rejected with a 422
validation error instead of causing 500 database errors.
Changes:
- CreateAgent: validate tool_ids, source_ids, folder_ids, block_ids, identity_ids
- UpdateAgent: validate tool_ids, source_ids, folder_ids, block_ids, message_ids, identity_ids
- UserUpdate: validate id
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
* fix: override ID validation in AgentSchema for agent file portability
AgentSchema extends CreateAgent but needs to allow arbitrary short IDs
(e.g., tool-0, block-0) for portable agent files. Override the validated
ID fields to use plain List[str] instead of the validated types.
Also fix test_agent.af to use proper UUID-format IDs.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: revert test_agent.af - short IDs are valid for agent files
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix openapi schema
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: non-streaming conversation messages endpoint
**Problems:**
1. `AssertionError: run_id is required when enforce_run_id_set is True`
- Non-streaming path didn't create a run before calling `step()`
2. `ResponseValidationError: Unable to extract tag using discriminator 'message_type'`
- `response_model=LettaStreamingResponse` but non-streaming returns `LettaResponse`
**Fixes:**
1. Add run creation before calling `step()` (mirrors agents endpoint)
2. Set run_id in Redis for cancellation support
3. Pass `run_id` to `step()`
4. Change `response_model` from `LettaStreamingResponse` to `LettaResponse`
(streaming returns `StreamingResponse` which bypasses response_model validation)
**Test:**
Added `test_conversation_non_streaming_raw_http` to verify the fix.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* api sync
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add ID format validation to batch request schema
Add ID format validation to LettaBatchRequest using existing validator
types from letta.validators.
Changes:
- LettaBatchRequest.agent_id: str → AgentId
This ensures malformed agent IDs in batch requests are rejected with 422
validation errors instead of causing 500 database errors.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add ID format validation to identity schemas
Add ID format validation to IdentityCreate, IdentityUpsert, and IdentityUpdate
schemas using existing validator types from letta.validators.
Changes:
- agent_ids: Optional[List[str]] → Optional[List[AgentId]]
- block_ids: Optional[List[str]] → Optional[List[BlockId]]
This ensures malformed IDs are rejected with 422 validation errors instead
of causing 500 database errors.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add ID format validation to group schemas
Add ID format validation to GroupCreate, GroupUpdate, and manager config
schemas using existing validator types from letta.validators.
Changes:
- GroupCreate/GroupUpdate: agent_ids → List[AgentId], shared_block_ids → List[BlockId]
- SupervisorManager, DynamicManager, SleeptimeManager, VoiceSleeptimeManager:
manager_agent_id → AgentId
- Update variants: manager_agent_id → Optional[AgentId]
This ensures malformed IDs are rejected with 422 validation errors instead
of causing 500 database errors.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate API spec and SDK
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: remove deprecation from agent passages endpoints
The client.agent.passages endpoints (list, create, search, delete) were
incorrectly marked as deprecated. This would break significant amounts
of user code and negatively impact developer experience.
Fixes#9116
Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com>
* stage publish api
---------
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com>
Co-authored-by: Ari Webb <ari@letta.com>