letta-server

Author	SHA1	Message	Date
cthomas	416ffc7cd7	Add billing context to LLM telemetry traces (#9745) * feat: add billing context to LLM telemetry traces Add billing metadata (plan type, cost source, customer ID) to LLM traces in ClickHouse for cost analytics and attribution. Data Flow: - Cloud-API: Extract billing info from subscription in rate limiting, set x-billing-* headers - Core: Parse headers into BillingContext object via dependencies - Adapters: Flow billing_context through all LLM adapters (blocking & streaming) - Agent: Pass billing_context to step() and stream() methods - ClickHouse: Store in billing_plan_type, billing_cost_source, billing_customer_id columns Changes: - Add BillingContext schema to provider_trace.py - Add billing columns to llm_traces ClickHouse table DDL - Update getCustomerSubscription to fetch stripeCustomerId from organization_billing_details - Propagate billing_context through agent step flow, adapters, and streaming service - Update ProviderTrace and LLMTrace to include billing metadata - Regenerate SDK with autogen Production Deployment: Requires env vars: LETTA_PROVIDER_TRACE_BACKEND=clickhouse, LETTA_STORE_LLM_TRACES=true, CLICKHOUSE_* 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: add billing_context parameter to agent step methods - Add billing_context to BaseAgent and BaseAgentV2 abstract methods - Update LettaAgent, LettaAgentV2, LettaAgentV3 step methods - Update multi-agent groups: SleeptimeMultiAgentV2, V3, V4 - Fix test_utils.py to include billing header parameters - Import BillingContext in all affected files * fix: add billing_context to stream methods - Add billing_context parameter to BaseAgentV2.stream() - Add billing_context parameter to LettaAgentV2.stream() - LettaAgentV3.stream() already has it from previous commit * fix: exclude billing headers from OpenAPI spec Mark billing headers as internal (include_in_schema=False) so they don't appear in the public API. These are internal headers between cloud-api and core, not part of the public SDK. Regenerated SDK with stage-api - removes 10,650 lines of bloat that was causing OOM during Next.js build. * refactor: return billing context from handleUnifiedRateLimiting instead of mutating req Instead of passing req into handleUnifiedRateLimiting and mutating headers inside it: - Return billing context fields (billingPlanType, billingCostSource, billingCustomerId) from handleUnifiedRateLimiting - Set headers in handleMessageRateLimiting (middleware layer) after getting the result - This fixes step-orchestrator compatibility since it doesn't have a real Express req object * chore: remove extra gencode * p --------- Co-authored-by: Letta <noreply@letta.com>	2026-03-03 18:34:13 -08:00
Kian Jones	f5c4ab50f4	chore: add ty + pre-commit hook and repeal even more ruff rules (#9504) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import * remove all ignores, add FastAPI rules and Ruff rules * add ty and precommit * ruff stuff * ty check fixes * ty check fixes pt 2 * error on invalid	2026-02-24 10:55:11 -08:00
Sarah Wooders	d7793a4474	fix(core): stabilize system prompt refresh and expand git-memory coverage (#9438) * fix(core): stabilize system prompt refresh and expand git-memory coverage Only rebuild system prompts on explicit refresh paths so normal turns preserve prefix-cache stability, including git/custom prompt layouts. Add integration coverage for memory filesystem tree structure and recompile/reset system-message updates via message-id retrieval. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): recompile system prompt around compaction and stabilize source tests Force system prompt refresh before/after compaction in LettaAgentV3 so repaired system+memory state is used and persisted across subsequent turns. Update source-system prompt tests to explicitly recompile before raw preview assertions instead of assuming automatic rebuild timing. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
jnjpng	39b25a0e3c	fix: update ContextWindowCalculator to parse new system message sections (#9398) * fix: update ContextWindowCalculator to parse new system message sections The context window calculator was using outdated position-based parsing that only handled 3 sections (base_instructions, memory_blocks, memory_metadata). The actual system message now includes additional sections that were not being tracked: - <memory_filesystem> (git-enabled agents) - <tool_usage_rules> (when tool rules configured) - <directories> (when sources attached) Changes: - Add _extract_tag_content() helper for proper XML tag extraction - Rewrite extract_system_components() to return a Dict with all 6 sections - Update calculate_context_window() to count tokens for new sections - Add new fields to ContextWindowOverview schema with backward-compatible defaults - Add unit tests for the extraction logic * update * generate * fix: check attached file in directories section instead of core_memory Files are rendered inside <directories> tags, not <memory_blocks>. Update validate_context_window_overview assertions accordingly. * fix: address review feedback for context window parser - Fix git-enabled agents regression: capture bare file blocks (e.g. <system/human.md>) rendered after </memory_filesystem> as core_memory via new _extract_git_core_memory() method - Make _extract_top_level_tag robust: scan all occurrences to find tag outside container, handling nested-first + top-level-later case - Document system_prompt tag inconsistency in docstring - Add TODO to base_agent.py extract_dynamic_section linking to ContextWindowCalculator to flag parallel parser tech debt - Add tests: git-enabled agent parsing, dual-occurrence tag extraction, pure text system prompt, git-enabled integration test	2026-02-24 10:52:07 -08:00
Kevin Lin	35b5383724	feat: match line number rendering to Anthropic / OAI defaults (#5492)	2025-10-24 15:13:15 -07:00
Matthew Zhou	3c2911e9ae	feat: Remove jinja2 (#2971) Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>	2025-09-17 10:00:49 -07:00
Matthew Zhou	2e3cabc080	feat: Add search messages endpoint [LET-4144] (#4434) * Add search messages endpoint * Run fern autogen and fix tests	2025-09-05 14:28:27 -07:00
Matthew Zhou	fcfc40eaf2	feat: Add project id to message schema [LET-4166] (#4433) * Add project id * Propagate through update message by id async * Add project id testing	2025-09-04 16:50:41 -07:00
Matthew Zhou	23b2769dc4	feat: Allow agent archival tools to insert/search with tags [LET-4072] (#4300) * Finish modifying archival memory tools * Add tags * Add disabled test	2025-08-29 11:55:06 -07:00
cthomas	395faf3ed8	feat: add prompt generator for on the fly system prompt generation (#4060)	2025-08-20 15:49:40 -07:00
cthomas	13c916afaa	feat: offload jinja to threadpool LET-3615 (#3787)	2025-08-06 16:34:28 -07:00
cthomas	21f0373b45	feat: remove redundant memory compilation in agent step (#3785)	2025-08-06 15:46:18 -07:00
cthomas	8919f06b67	feat: convert compile system prompt to async (#3685)	2025-07-31 15:49:59 -07:00
cthomas	f77a259d07	feat: asyncify jinja templates (#3580)	2025-07-26 23:17:24 -07:00
Kevin Lin	d4a63a1826	feat: add max files and file open window to system prompt (#3515)	2025-07-23 17:05:53 -07:00
Matthew Zhou	2845c86f5f	fix: Fix context window compilation issues for files (#3272)	2025-07-10 11:29:36 -07:00
Matthew Zhou	2263ffd07c	feat: Various file fixes and improvements (#3125)	2025-07-01 15:21:52 -07:00
Andy Li	3654fa8c26	feat: expand summarizer providers	2025-07-01 15:07:38 -07:00
Kevin Lin	29ce769549	feat: replace file descriptions with per data source description (#3067) Co-authored-by: Matt Zhou <mattzh1314@gmail.com>	2025-07-01 11:51:52 -07:00
Sarah Wooders	2e06feafbf	fix: add more timezone fixes (#3025)	2025-06-25 14:51:37 -07:00
cthomas	a332b34830	feat: make create_async route consistent with other message routes (#2877)	2025-06-19 13:51:51 -07:00
cthomas	97986b0f03	feat: send stop reason in letta APIs (#2789)	2025-06-13 16:04:48 -07:00
Shangyin Tan	22b640a5dd	fix: adjust memory summary's recall message count (#2778)	2025-06-12 17:11:36 -07:00
cthomas	ccb203b264	feat: consolidate logic for finish tokens (#2779)	2025-06-12 15:24:06 -07:00
Matthew Zhou	ba3d59bba5	feat: Add prompting to guide tool rule usage (#2742)	2025-06-10 16:21:27 -07:00
cthomas	883050e761	feat: set request heartbeat for max steps (#2739)	2025-06-10 15:26:07 -07:00
Matthew Zhou	318a7c769b	feat: Search files returns citations of the filenames that were searched (#2689)	2025-06-06 15:34:03 -07:00
Matthew Zhou	285acd0324	feat: Add summarization for more scenarios (#2499)	2025-05-29 11:10:13 -07:00
cthomas	6f72ac0c3a	feat(asyncify): agent batch sync db calls (#2348)	2025-05-22 15:33:27 -07:00
cthomas	35d4914b4f	feat(asyncify): remove non async memory methods (#2245)	2025-05-20 17:56:54 -07:00
Andy Li	02c2e4adec	fix: missing await in base async fetch (#2156)	2025-05-13 13:57:21 -07:00
Andy Li	ca895f1987	fix: rebuilding memory async (#2149)	2025-05-13 12:14:51 -07:00
Andy Li	8e2417aa2f	feat: async db client (#2076)	2025-05-12 17:15:14 -07:00
Matthew Zhou	beed896d60	feat: Support offline memory in voice agent (#1347)	2025-04-28 15:03:02 -07:00
cthomas	8f5a43b886	feat: add otid to new agent loop (#1635)	2025-04-09 16:50:41 -07:00
Matthew Zhou	3ba79db859	feat: Enable Anthropic streaming on new agent loop (#1550)	2025-04-03 19:40:48 -07:00
Matthew Zhou	acd0a525aa	feat: Async agent loop (#1387)	2025-03-27 15:24:50 -07:00
Matthew Zhou	353af9aefe	feat: Finish async memory rewriting agent for voice (#1161)	2025-03-03 13:58:06 -08:00

38 Commits