letta-server

Author	SHA1	Message	Date
cthomas	416ffc7cd7	Add billing context to LLM telemetry traces (#9745 ) * feat: add billing context to LLM telemetry traces Add billing metadata (plan type, cost source, customer ID) to LLM traces in ClickHouse for cost analytics and attribution. Data Flow: - Cloud-API: Extract billing info from subscription in rate limiting, set x-billing-* headers - Core: Parse headers into BillingContext object via dependencies - Adapters: Flow billing_context through all LLM adapters (blocking & streaming) - Agent: Pass billing_context to step() and stream() methods - ClickHouse: Store in billing_plan_type, billing_cost_source, billing_customer_id columns Changes: - Add BillingContext schema to provider_trace.py - Add billing columns to llm_traces ClickHouse table DDL - Update getCustomerSubscription to fetch stripeCustomerId from organization_billing_details - Propagate billing_context through agent step flow, adapters, and streaming service - Update ProviderTrace and LLMTrace to include billing metadata - Regenerate SDK with autogen Production Deployment: Requires env vars: LETTA_PROVIDER_TRACE_BACKEND=clickhouse, LETTA_STORE_LLM_TRACES=true, CLICKHOUSE_* 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add billing_context parameter to agent step methods - Add billing_context to BaseAgent and BaseAgentV2 abstract methods - Update LettaAgent, LettaAgentV2, LettaAgentV3 step methods - Update multi-agent groups: SleeptimeMultiAgentV2, V3, V4 - Fix test_utils.py to include billing header parameters - Import BillingContext in all affected files * fix: add billing_context to stream methods - Add billing_context parameter to BaseAgentV2.stream() - Add billing_context parameter to LettaAgentV2.stream() - LettaAgentV3.stream() already has it from previous commit * fix: exclude billing headers from OpenAPI spec Mark billing headers as internal (include_in_schema=False) so they don't appear in the public API. These are internal headers between cloud-api and core, not part of the public SDK. Regenerated SDK with stage-api - removes 10,650 lines of bloat that was causing OOM during Next.js build. * refactor: return billing context from handleUnifiedRateLimiting instead of mutating req Instead of passing req into handleUnifiedRateLimiting and mutating headers inside it: - Return billing context fields (billingPlanType, billingCostSource, billingCustomerId) from handleUnifiedRateLimiting - Set headers in handleMessageRateLimiting (middleware layer) after getting the result - This fixes step-orchestrator compatibility since it doesn't have a real Express req object * chore: remove extra gencode * p --------- Co-authored-by: Letta <noreply@letta.com>	2026-03-03 18:34:13 -08:00
Christina Tong	c8ae02a1fb	feat(core): sort agents by updated_at [LET-7771] (#9730 ) feat(core): sort agents by last_updated_at	2026-03-03 18:34:02 -08:00
Shubham Naik	c247496027	Clean up server (#9728 ) * feat: clean up production * feat: clean up production * feat: clean up production * feat: clean up production * feat: clean up production * feat: clean up production * feat: clean up production * feat: clean up production	2026-03-03 18:34:02 -08:00
cthomas	aa66e81a71	feat: add debug logs in telem endpoint (#9723 ) * feat: add debug logs in telem endpoint * api sync * fix: add debug_log_tail to FeedbackProperty type Add debug_log_tail field to FeedbackProperty interface in service-analytics to fix type error when sending debug log data in feedback and telemetry. Also add e2e tests for feedback and error telemetry with debug_log_tail. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-03-03 18:34:02 -08:00
cthomas	ab784e702d	feat: support default convo in list and cancel endpoints (#9707 ) * feat: support default convo in list and cancel endpoints * also support in compact endpoint * api sync	2026-03-03 18:34:02 -08:00
cthomas	39a537a9a5	feat: add default convo support to conversations endpoint (#9706 ) * feat: add default convo support to conversations endpoint * api sync	2026-03-03 18:34:02 -08:00
Shubham Naik	fd4a8e73a5	More stream fixes (#9702 ) * chore: more stream fixes * chore: more stream fixes * chore: more stream fixes * chore: next * chore: next * chore: next	2026-03-03 18:34:02 -08:00
Shubham Naik	357a3ad15b	Shub/let 7721 make env permanent [LET-7721] (#9683 ) * chore: env permanent * chore: env permanent * feat: add persistent environments with hybrid DB + Redis storage [LET-7721] Implements persistent storage for letta-code listener connections (environments) with hybrid PostgreSQL + Redis architecture: Database Layer: - Add `environments` table with device tracking, connection metadata, soft deletes - Store userId/apiKeyOwner, connection history (firstSeenAt, lastSeenAt) - Unique constraint on (organizationId, deviceId) - one environment per device per org - Auto-undelete previously deleted environments on reconnect API Layer: - Update environmentsContract with new fields (id, firstSeenAt, lastSeenAt, metadata) - Add deleteEnvironment endpoint (soft delete, closes WebSocket if online) - Add onlineOnly filter to listConnections for efficient online-only queries - Export ListConnectionsResponse type for proper client typing Router Implementation: - register(): Create/update DB environment, generate ephemeral connectionId - listConnections(): Hybrid query strategy (DB-first for all, Redis-first for onlineOnly) - deleteEnvironment(): Soft delete with Redis Pub/Sub for graceful WebSocket close - Filter by connectionId in DB using inArray() for onlineOnly performance WebSocket Handler: - Moved from apps/cloud-api to libs/utils-server for reusability - Update DB on connect/disconnect only (not heartbeat) - minimal write load - Store currentPodId and userId/apiKeyOwner on connect - Clear currentConnectionId/currentPodId on disconnect/error Shared Types: - Add EnvironmentMetadata interface in libs/types for cross-layer consistency - Update Redis schema to include currentMode field UI Components: - Add DeleteDeviceModal with offline-only restriction - Update DeviceSelector with delete button on hover for offline devices - Proper cache updates using ListConnectionsResponse type - Add translations for delete modal 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * docs: update letta remote setup instructions [LET-7721] Update local setup guide with clearer instructions: - Remove hardcoded ngrok URL requirement (ngrok generates URL automatically) - Update env var to use CLOUD_API_ENDPOINT_OVERRIDE - Add proper API key and base URL format - Include alternative setup using letta-code repo with bun dev 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: fix env * fix: lint errors and make migration idempotent [LET-7721] - Remove unused imports (HiddenOnMobile, VisibleOnMobile, MiddleTruncate) - Fix type imports (use `import type` for type-only imports) - Remove non-null assertions in environmentsRouter (use safe null checks + filter) - Make migration idempotent with IF NOT EXISTS for table, indexes, and constraints - Use DO $$ block for foreign key constraint (handles duplicate_object exception) 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: fix env --------- Co-authored-by: Letta <noreply@letta.com>	2026-03-03 18:34:01 -08:00
Shubham Naik	5d55d4ccd4	chore: rebuild docs (#9674 )	2026-03-03 18:34:01 -08:00
jnjpng	bd6f2e792c	feat: accept recent_chunks in error telemetry schema (#9662 ) * feat: accept recent_chunks in error telemetry schema Add recent_chunks field to ErrorDataSchema (Zod) and LettaCodeErrorProperty (analytics type) so the server can receive and forward chunk diagnostics attached to error telemetry events. * chore: regenerate openapi with recent_chunks field	2026-03-03 18:34:01 -08:00
Kevin Lin	895acb9f4e	feat(core): add gpt-5.3-codex model support (#9628 ) * feat(core): add gpt-5.3-codex model support Add OpenAI gpt-5.3-codex model: context window overrides, model pricing and capabilities, none-reasoning-effort support, and test config. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * just stage-api && just publish-api --------- Co-authored-by: Letta <noreply@letta.com>	2026-03-03 18:34:01 -08:00
Caren Thomas	ce54fb1a00	bump version	2026-02-24 10:58:16 -08:00
amysguan	47b0c87ebe	Add modes `self` and `self_sliding_window` for prompt caching (#9372 ) * add self compaction method with proper caching (pass in tools, don't refresh sys prompt beforehand) + sliding fallback * updated prompts for self compaction * add tests for self, self_sliding_window modes and w/o refresh messages before compaction * add cache logging to summarization * better handling to prevent agent from continuing convo on self modes * if mode changes via summarize endpoint, will use default prompt for the new mode --------- Co-authored-by: Amy Guan <amy@letta.com>	2026-02-24 10:55:26 -08:00
Ari Webb	c325b2b002	feat: add memfs file list and read endpoints to cloud-api [LET-7437] (#9520 ) * feat: add memfs file list and read endpoints to cloud-api * fix ci * add env var * tests and refactor memoryFilesRouter * rename env var * fix path parameter error * fix test * stage publish api * memfs in helm	2026-02-24 10:55:26 -08:00
Ari Webb	62967bcca0	feat: parallel tool calling minimax provider [LET-7647] (#9613 ) * feat: parallel tool calling minimax provider * stage publish api	2026-02-24 10:55:26 -08:00
Shubham Naik	f082fd5061	feat: add order_by and order params to /v1/conversations list endpoin… (#9599 ) * feat: add order_by and order params to /v1/conversations list endpoint [LET-7628] Added sorting support to the conversations list endpoint, matching the pattern from /v1/agents. API Changes: - Added `order` query param: "asc" or "desc" (default: "desc") - Added `order_by` query param: "created_at" or "last_run_completion" (default: "created_at") Implementation: created_at ordering: - Simple ORDER BY on ConversationModel.created_at - No join required, fast query - Nulls not applicable (created_at always set) last_run_completion ordering: - LEFT JOIN with runs table using subquery - Subquery: MAX(completed_at) grouped by conversation_id - Uses OUTER JOIN so conversations with no runs are included - Nulls last ordering (conversations with no runs go to end) - Index on runs.conversation_id ensures performant join Pagination: - Cursor-based pagination with `after` parameter - Handles null values correctly for last_run_completion - For created_at: simple timestamp comparison - For last_run_completion: complex null-aware cursor logic Performance: - Existing index: `ix_runs_conversation_id` on runs table - Subquery with GROUP BY is efficient for this use case - OUTER JOIN ensures conversations without runs are included Follows agents pattern: - Same parameter names (order, order_by) - Same Literal types and defaults - Converts "asc"/"desc" to ascending boolean internally 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: order --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:26 -08:00
Kevin Lin	8fc77af685	fix(memory): standardize tool parameter names (#9552 ) fix(memory): standardize tool parameter names Use old_string/new_string across memory edit tools, docs, tests, and starter kits to avoid mismatched parameter names. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> EOF )	2026-02-24 10:55:24 -08:00
jnjpng	ae24bd1034	feat: add recent_chunks to feedback endpoint for diagnostics (#9594 ) * feat: add recent_chunks field to feedback endpoint Accept an optional recent_chunks string in the /v1/metadata/feedback POST body and forward it to PostHog. Contains the last 100 truncated streaming chunks as JSON for debugging issues reported via /feedback. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate openapi specs with recent_chunks field --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:12 -08:00
github-actions[bot]	ba67621e1b	feat: add conversation deletion endpoint (soft delete) [LET-7286] (#9230 ) * feat: add conversation deletion endpoint (soft delete) [LET-7286] - Add DELETE /conversations/{conversation_id} endpoint - Filter soft-deleted conversations from list operations - Add check_is_deleted=True to update/delete operations Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com> 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: add tests, update SDK and stainless for delete conversation - Add 5 integration tests for DELETE conversation endpoint - Run stage-api to regenerate OpenAPI spec and SDK - Add delete method to conversations in stainless.yml Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com> 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * test: add manager-level tests for conversation soft delete [LET-7286] - test_delete_conversation_removes_from_list - test_delete_conversation_double_delete_raises - test_update_deleted_conversation_raises - test_delete_conversation_excluded_from_summary_search 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>	2026-02-24 10:55:12 -08:00
Shubham Naik	73c824f5d2	feat: make agent_id optional in conversations list endpoint [LET-7612] (#9585 ) * feat: make agent_id optional in conversations list endpoint [LET-7612] Allow listing all conversations without filtering by agent_id. Router changes (conversations.py): - Changed agent_id from required (`...`) to optional (`None`) - Updated description to clarify behavior - Updated docstring to reflect optional filtering Manager changes (conversation_manager.py): - Updated list_conversations signature: agent_id: str → Optional[str] - Updated docstring to clarify optional behavior - Summary search query: conditionally adds agent_id filter only if provided - Default list logic: passes agent_id (can be None) to list_async How it works: - Without agent_id: returns all conversations for the user's organization - With agent_id: returns conversations filtered by that agent - list_async handles None gracefully via kwargs pattern Use case:** - Cloud UI can list all user conversations across agents - Still supports filtering by agent_id when needed 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: update logs * chore: update logs --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:12 -08:00
Shubham Naik	34bab3cf9a	Shub/let listener mode control (#9584 ) * feat: add two-way mode control for listener connections Enable bidirectional permission mode control between letta-cloud UI and letta-code instances. Backend: - Added ModeChangeMessage and ModeChangedMessage to WebSocket protocol - Added sendModeChange endpoint (/v1/listeners/:connectionId/mode) - listenersRouter publishes mode_change via Redis Pub/Sub - listenersHandler handles mode_changed acknowledgments from letta-code - Stores current mode in Redis for UI state sync Contract: - Added sendModeChange contract with PermissionModeSchema - 4 modes: default, acceptEdits, plan, bypassPermissions Frontend: - Extended PermissionMode type to 4 modes (was 2: ask/never) - PermissionModeSelector now shows all 4 modes with descriptions - Added disabled prop (grayed out when Cloud orchestrator selected) - PermissionModeContext.sendModeChangeToDevice() calls API - AgentMessenger sends mode changes to device on mode/device change - Updated auto-approval logic (only in Cloud mode, only for bypassPermissions) - Updated inputMode logic (device handles approvals, not cloud) Translations: - Updated en.json with 4 mode labels and descriptions - Removed legacy "askAlways" and "neverAsk" keys Mode Behavior: - default: Ask permission for each tool - acceptEdits: Auto-approve file edits only - plan: Read-only exploration (denies writes) - bypassPermissions: Auto-approve everything Lint Fixes: - Removed unused imports and functions from trackingMiddleware.ts 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: store mode in connectionData and show approvals for all modes Backend: - Fixed Redis WRONGTYPE error - store currentMode inside connectionData object - Changed const connectionData to let connectionData (needs mutation) - Updated mode_changed handler to reassign entire connectionData object - Updated ping handler for consistency (also reassigns connectionData) - Added currentMode field to ListenerConnectionSchema (optional) Frontend: - Simplified inputMode logic - always show approval UI when toolCallsToApprove.length > 0 - Removed mode-specific approval filtering (show approvals even in bypass/acceptEdits for visibility) - Users can see what tools are being auto-approved during execution Why: - Redis key is a JSON string (via setRedisData), not a hash - Cannot use hset on string keys - causes WRONGTYPE error - Must update entire object via setRedisData like ping handler does - Approval visibility helpful for debugging/understanding agent behavior 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: use useMutation hook for sendModeChange instead of direct call cloudAPI is initialized via initTsrReactQuery, so sendModeChange is a mutation hook object, not a callable function. Use .useMutation() at the component level and mutateAsync in the callback. Co-authored-by: Shubham Naik <4shub@users.noreply.github.com> * chore: update logs --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Shubham Naik <4shub@users.noreply.github.com>	2026-02-24 10:55:12 -08:00
jnjpng	b29d063ba7	feat: forward letta-code feedback context fields to PostHog (#9567 ) * feat: add flattened feedback context fields for PostHog filterability Add system info (local_time, device_type, cwd), session stats (token counts, timing), agent info (agent_name, agent_description, model), and account info (billing_tier) as flat top-level fields to the /v1/metadata/feedback endpoint. Flat fields allow direct PostHog filtering/breakdown without HogQL. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate API specs --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:12 -08:00
amysguan	33969d7190	Default to lightweight compaction model instead of agent's model (#9488 ) --------- Co-authored-by: Amy Guan <amy@letta.com>	2026-02-24 10:55:12 -08:00
cthomas	8ffc515674	fix: flip parallel_tool_calls setting default (#9541 )	2026-02-24 10:55:11 -08:00
Shubham Naik	3247fa7065	chore: fix favoriting bugs (#9505 ) * chore: fix favoriting bugs * chore: fix favoriting bugs * chore: fix favoriting bugs	2026-02-24 10:55:11 -08:00
Shubham Naik	4a829123cd	Listener mode (#9486 ) * feat: listener mdoe * feat: listener mdoe * feat: listener mdoe * feat: listener mdoe * feat: listener mdoe * feat: listener mdoe * chore: merge * feat: listen mode * feat: add some keycontrols * feat: add some keycontrols * chore: hotwire fix for core * chore: restore connection	2026-02-24 10:55:11 -08:00
Shubham Naik	2f76f2c629	Agent page update (#9475 ) * feat: agent page updates * feat: agent page updates * feat: agent page updates * feat: agent page updates * feat: agent page updates * feat: agent page updates * chore: fix code * chore: fix code	2026-02-24 10:52:07 -08:00
Ari Webb	0a8a8fda54	feat: add credit verification before agent message endpoints [LET-XXXX] (#9433 ) * feat: add credit verification before agent message endpoints Add credit verification checks to message endpoints to prevent execution when organizations have insufficient credits. - Add InsufficientCreditsError exception type - Add CreditVerificationService that calls step-orchestrator API - Add credit checks to /agents/{id}/messages endpoints - Add credit checks to /conversations/{id}/messages endpoint 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * surface error in ade * do per step instead * parallel check * parallel to step * small fixes * stage publish api * fixes * revert unnecessary frontend changes * insufficient credits stop reason --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
amysguan	9bec8c64f5	New prompts/defaults for `sliding_window` and `all` compaction (#9444 ) * new prompts for sliding window and all compaction + defaults to corresponding prompt * regenerate api spec --------- Co-authored-by: Amy Guan <amy@letta.com>	2026-02-24 10:52:07 -08:00
Shubham Naik	6579f9b906	feat: add saveTemplateVersionNoProject endpoint (#9465 ) * feat: add saveTemplateVersionNoProject endpoint Added NoProject version of saveTemplateVersion endpoint: - Backend supports both URL patterns: - `/v1/templates/:project_id/:template_name` (with project in path) - `/v1/templates/:template_name` (NoProject, uses X-Project header) - Stainless surfaces only the cleaner NoProject version as `templates.save()` - NoProject route exported first for correct route matching Changes: - Added saveTemplateVersionNoProject contract and handler - Updated stainless.yml to map `save` method to NoProject endpoint - Follows same pattern as other NoProject endpoints (create, delete, rollback) 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * bump * bump --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Shubham Naik	565fd3c143	feat: add template rollback endpoint [LET-7423] (#9455 ) * feat: add template rollback endpoint [LET-7423] Adds POST /v1/templates/:template_name/rollback endpoint to restore templates to previous versions. Key features: - Rollback to any numbered version (1, 2, 3, etc.) or "latest" - Auto-saves unsaved changes before rollback to prevent data loss - Validates input (rejects "current"/"dev" as target versions) - Preserves entity IDs and relationships across rollback - Uses project context from X-Project header (no project_id in path) Implementation includes: - API contract in templatesContract.ts - Handler in templatesRouter.ts with comprehensive error handling - 9 E2E tests covering functionality and edge cases - Updated stainless.yml for SDK generation 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore test * fix: add X-Project header to rollback endpoint tests The rollback endpoint uses project context from X-Project header instead of URL path. Updated all rollback test calls to include the X-Project header with testProject value. This follows the no-project-in-path pattern for template endpoints. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: support both URL patterns for rollback endpoint Added dual URL pattern support for rollback endpoint: - `/v1/templates/:project_id/:template_name/rollback` (with project in path) - `/v1/templates/:template_name/rollback` (NoProject, uses X-Project header) Backend supports both patterns, but Stainless only exposes the cleaner NoProject version for SDKs. Key changes: - Fixed "rollback to latest" bug by resolving target version BEFORE auto-saving - NoProject route is exported first to ensure correct route matching order - Updated tests to use project_id in path for better compatibility - All 8 rollback tests passing 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * bump * bump * bump --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
jnjpng	39b25a0e3c	fix: update ContextWindowCalculator to parse new system message sections (#9398 ) * fix: update ContextWindowCalculator to parse new system message sections The context window calculator was using outdated position-based parsing that only handled 3 sections (base_instructions, memory_blocks, memory_metadata). The actual system message now includes additional sections that were not being tracked: - <memory_filesystem> (git-enabled agents) - <tool_usage_rules> (when tool rules configured) - <directories> (when sources attached) Changes: - Add _extract_tag_content() helper for proper XML tag extraction - Rewrite extract_system_components() to return a Dict with all 6 sections - Update calculate_context_window() to count tokens for new sections - Add new fields to ContextWindowOverview schema with backward-compatible defaults - Add unit tests for the extraction logic * update * generate * fix: check attached file in directories section instead of core_memory Files are rendered inside <directories> tags, not <memory_blocks>. Update validate_context_window_overview assertions accordingly. * fix: address review feedback for context window parser - Fix git-enabled agents regression: capture bare file blocks (e.g. <system/human.md>) rendered after </memory_filesystem> as core_memory via new _extract_git_core_memory() method - Make _extract_top_level_tag robust: scan all occurrences to find tag outside container, handling nested-first + top-level-later case - Document system_prompt tag inconsistency in docstring - Add TODO to base_agent.py extract_dynamic_section linking to ContextWindowCalculator to flag parallel parser tech debt - Add tests: git-enabled agent parsing, dual-occurrence tag extraction, pure text system prompt, git-enabled integration test	2026-02-24 10:52:07 -08:00
Shubham Naik	ca32311b9a	feat: allow users to specify via query to stip messages [LET-7392] (#9411 ) * feat: allow users to specify via query to stip messages * chore: regenerate API SDK and OpenAPI spec [LET-7392] 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Ari Webb <AriWebb@users.noreply.github.com> Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	0dde155e9a	feat: Prefix cache optimization system prompt (#9381 )	2026-02-24 10:52:07 -08:00
Kevin Lin	23c94ec6d3	feat: add log probabilities from OpenAI-compatible servers and SGLang native endpoint (#9240 ) * Add log probabilities support for RL training This enables Letta server to request and return log probabilities from OpenAI-compatible providers (including SGLang) for use in RL training. Changes: - LLMConfig: Add return_logprobs and top_logprobs fields - OpenAIClient: Set logprobs in ChatCompletionRequest when enabled - LettaLLMAdapter: Add logprobs field and extract from response - LettaResponse: Add logprobs field to return log probs to client - LettaRequest: Add return_logprobs/top_logprobs for per-request override - LettaAgentV3: Store and pass logprobs through to response - agents.py: Handle request-level logprobs override Usage: response = client.agents.messages.create( agent_id=agent_id, messages=[...], return_logprobs=True, top_logprobs=5, ) print(response.logprobs) # Per-token log probabilities 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * Add multi-turn token tracking for RL training via SGLang native endpoint - Add TurnTokenData schema to track token IDs and logprobs per turn - Add return_token_ids flag to LettaRequest and LLMConfig - Create SGLangNativeClient for /generate endpoint (returns output_ids) - Create SGLangNativeAdapter that uses native endpoint - Modify LettaAgentV3 to accumulate turns across LLM calls - Include turns in LettaResponse when return_token_ids=True * Fix: Add SGLang native adapter to step() method, not just stream() * Fix: Handle Pydantic Message objects in SGLang native adapter * Fix: Remove api_key reference from LLMConfig (not present) * Fix: Add missing 'created' field to ChatCompletionResponse * Add full tool support to SGLang native adapter - Format tools into prompt in Qwen-style format - Parse tool calls from <tool_call> tags in response - Format tool results as <tool_response> in user messages - Set finish_reason to 'tool_calls' when tools are called * Use tokenizer.apply_chat_template for proper tool formatting - Add tokenizer caching in SGLang native adapter - Use apply_chat_template when tokenizer available - Fall back to manual formatting if not - Convert Letta messages to OpenAI format for tokenizer * Fix: Use func_response instead of tool_return for ToolReturn content * Fix: Get output_token_logprobs from meta_info in SGLang response * Fix: Allow None in output_token_logprobs (SGLang format includes null) * chore: remove unrelated files from logprobs branch 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add missing call_type param to adapter constructors in letta_agent_v3 The SGLang refactor dropped call_type=LLMCallType.agent_step when extracting adapter creation into conditional blocks. Restores it for all 3 spots (SGLang in step, SimpleLLM in step, SGLang in stream). 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * just stage-api && just publish-api * fix: update expected LLMConfig fields in schema test for logprobs support 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: remove rllm provider references 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * just stage-api && just publish-api 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-65-206.ec2.internal> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Ari Webb	c08b67a26a	feat: add ToolReturnCreate to MessageCreateParams [LET-7366] (#9385 ) * fix: add ToolReturnCreate to sdk types * ci	2026-02-24 10:52:07 -08:00
Sarah Wooders	526da4c49b	Revert "perf: optimize prefix caching by skipping system prompt rebuild on every step" (#9380 ) Revert "perf: optimize prefix caching by skipping system prompt rebuild on ev…" This reverts commit eafa4144c2577a45b7007a177b701863b98d1dfa.	2026-02-24 10:52:07 -08:00
Sarah Wooders	9dbe28e8f1	perf: optimize prefix caching by skipping system prompt rebuild on every step (#9080 )	2026-02-24 10:52:07 -08:00
Sarah Wooders	21e880907f	feat(core): structure memory directory and block labels [LET-7336] (#9309 )	2026-02-24 10:52:06 -08:00
jnjpng	0bdedb3c0f	feat: agent generate endpoint (#9304 ) * base * update * clean up * update	2026-02-24 10:52:06 -08:00
Kevin Lin	34159ffa21	feat: add Anthropic Opus 4.6 model support (#9123 )	2026-02-24 10:52:06 -08:00
jnjpng	ff69c6a32e	feat: add /agents/{agent_id}/generate endpoint for direct LLM requests (#9272 ) * feat: add /agents/{agent_id}/generate endpoint for direct LLM requests Add new endpoint that makes direct LLM provider requests without agent context, memory, tools, or state modification. This enables: - Quick LLM queries without agent overhead - Testing model configurations - Simple chat completions using agent's credentials - Comparing responses across different models Features: - Uses agent's LLM config by default - Supports model override with full provider config resolution - Non-streaming, stateless operation - Proper error handling and validation - Request/response schemas with Pydantic validation Implementation: - Add GenerateRequest and GenerateResponse schemas - Implement generate_completion endpoint handler - Add necessary imports (LLMError, LLMClient, HandleNotFoundError) - Include logging and comprehensive error handling * fix: improve error handling and fix Message construction - Fix critical bug: use content=[TextContent(text=...)] instead of text=... - Add explicit error handling for NoResultFound and HandleNotFoundError - Add error handling for convert_response_to_chat_completion - Add structured logging for debugging - Remove unnecessary .get() calls since Pydantic validates messages * refactor: extract generate logic to AgentCompletionService Move the generate endpoint business logic out of the endpoint handler into a dedicated AgentCompletionService class for better code organization and separation of concerns. Changes: - Create new AgentCompletionService in services/agent_completion_service.py - Service handles all business logic: agent validation, LLM config resolution, message conversion, LLM client creation, and request/response processing - Integrate service with SyncServer initialization - Refactor generate_completion endpoint to use the service - Endpoint now only handles HTTP concerns (auth, error mapping) Benefits: - Cleaner endpoint code (reduced from ~140 lines to ~25 lines) - Better separation of concerns (HTTP vs business logic) - Service logic can be reused or tested independently - Follows established patterns in the codebase (AgentManager, etc.) * feat: simplify generate API to accept just prompt text Simplify the client interface by accepting a simple prompt string instead of requiring clients to format messages. Changes: - Update GenerateRequest schema: - Replace 'messages' array with simple 'prompt' string - Add optional 'system_prompt' for context/instructions - Keep 'override_model' for model selection - Update AgentCompletionService to format messages automatically: - Accepts prompt and optional system_prompt - Constructs message array internally (system + user messages) - Simpler API surface for clients - Update endpoint documentation with new simplified examples - Regenerate OpenAPI spec and TypeScript SDK Benefits: - Much simpler client experience - just send text - No need to understand message formatting - Still supports system prompts for context - Cleaner API that matches common use cases Example (before): { "messages": [{"role": "user", "content": "What is 2+2?"}] } Example (after): { "prompt": "What is 2+2?" } * test: add comprehensive integration tests for generate endpoint Add 9 integration tests covering various scenarios: Happy path tests: - test_agent_generate_basic: Basic prompt -> response flow - test_agent_generate_with_system_prompt: System prompt + user prompt - test_agent_generate_with_model_override: Override model selection - test_agent_generate_long_prompt: Handle longer prompts - test_agent_generate_no_persistence: Verify no messages saved to agent Error handling tests: - test_agent_generate_empty_prompt_error: Empty prompt validation (422) - test_agent_generate_invalid_agent_id: Invalid agent ID (404) - test_agent_generate_invalid_model_override: Invalid model handle (404) All tests verify: - Response structure (content, model, usage) - Proper status codes for errors - Usage statistics (tokens, counts) - No side effects on agent state Tests follow existing test patterns in test_client.py and use the letta_client SDK (assuming generate_completion method is auto-generated from the OpenAPI spec). * openapi * refactor: rename AgentCompletionService to AgentGenerateCompletionManager Rename for better clarity and consistency with codebase naming conventions: - Rename file: agent_completion_service.py → agent_generate_completion_manager.py - Rename class: AgentCompletionService → AgentGenerateCompletionManager - Rename attribute: server.agent_completion_service → server.agent_generate_completion_manager - Update docstrings: 'Service' → 'Manager' Changes: - apps/core/letta/services/agent_generate_completion_manager.py (renamed + updated class) - apps/core/letta/server/server.py (import + initialization) - apps/core/letta/server/rest_api/routers/v1/agents.py (usage in endpoint) No functional changes, purely a naming refactor. * fix: remove invalid Message parameters in generate manager Remove agent_id=None and user_id=None from Message construction. The Message model doesn't accept these as None values - only pass required parameters (role, content). Fixes validation error: 'Extra inputs are not permitted [type=extra_forbidden, input_value=None]' This aligns with other Message construction patterns in the codebase (see tools.py, memory.py examples). * feat: improve generate endpoint validation and tests - Add field validator for whitespace-only prompts - Always include system message (required by Anthropic) - Use default "You are a helpful assistant." when no system_prompt provided - Update tests to use direct HTTP calls via httpx - Fix test issues: - Use valid agent ID format (agent-{uuid}) - Use available model (openai/gpt-4o-mini) - Add whitespace validation test - All 9 integration tests passing	2026-02-24 10:52:06 -08:00
cthomas	530d33c254	feat: add skills support to agentfile (#9287 )	2026-02-24 10:52:06 -08:00
jnjpng	c801866d89	feat: add context token estimates to llm usage (#9295 ) * base * generate * update	2026-02-24 10:52:06 -08:00
Sarah Wooders	e0a23f7039	feat: add usage columns to steps table (#9270 ) * feat: add usage columns to steps table Adds denormalized usage fields to the steps table for easier querying: - model_handle: The model handle (e.g., "openai/gpt-4o-mini") - cached_input_tokens: Tokens served from cache - cache_write_tokens: Tokens written to cache (Anthropic) - reasoning_tokens: Reasoning/thinking tokens These fields mirror LettaUsageStatistics and are extracted from the existing prompt_tokens_details and completion_tokens_details JSON columns. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate OpenAPI specs and SDK for usage columns 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com> --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>	2026-02-24 10:52:06 -08:00
Kian Jones	a206f7f345	feat: add ID format validation to agent and user schemas (#9151 ) * feat: add ID format validation to agent and user schemas Reuse existing validator types (ToolId, SourceId, BlockId, MessageId, IdentityId, UserId) from letta.validators to enforce ID format validation at the schema level. This ensures malformed IDs are rejected with a 422 validation error instead of causing 500 database errors. Changes: - CreateAgent: validate tool_ids, source_ids, folder_ids, block_ids, identity_ids - UpdateAgent: validate tool_ids, source_ids, folder_ids, block_ids, message_ids, identity_ids - UserUpdate: validate id 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate API spec and SDK * fix: override ID validation in AgentSchema for agent file portability AgentSchema extends CreateAgent but needs to allow arbitrary short IDs (e.g., tool-0, block-0) for portable agent files. Override the validated ID fields to use plain List[str] instead of the validated types. Also fix test_agent.af to use proper UUID-format IDs. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: regenerate API spec and SDK 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: revert test_agent.af - short IDs are valid for agent files 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix openapi schema --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:06 -08:00
Kian Jones	025eeaa363	fix: override validation for group for agentfile import (#9248 ) * override validation for group for agentfile import * fix the rest of groupcreate * add api changes	2026-02-24 10:52:06 -08:00
Sarah Wooders	3fdf2b6c79	chore: deprecate old agent messaging (#9120 )	2026-02-24 10:52:06 -08:00
Ari Webb	0bbb9c9bc0	feat: add reasoning zai openrouter (#9189 ) * feat: add reasoning zai openrouter * add openrouter reasoning * stage + publish api * openrouter reasoning always on * revert * fix * remove reference * do	2026-02-24 10:52:06 -08:00
jnjpng	3f23a23227	feat: add compaction stats (#9219 ) * base * update * last * generate * fix test	2026-02-24 10:52:06 -08:00

1 2 3 4 5 ...

392 Commits