38 Commits

Author SHA1 Message Date
cthomas
416ffc7cd7 Add billing context to LLM telemetry traces (#9745)
* feat: add billing context to LLM telemetry traces

Add billing metadata (plan type, cost source, customer ID) to LLM traces in ClickHouse for cost analytics and attribution.

**Data Flow:**
- Cloud-API: Extract billing info from subscription in rate limiting, set x-billing-* headers
- Core: Parse headers into BillingContext object via dependencies
- Adapters: Flow billing_context through all LLM adapters (blocking & streaming)
- Agent: Pass billing_context to step() and stream() methods
- ClickHouse: Store in billing_plan_type, billing_cost_source, billing_customer_id columns
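
The Core and ClickHouse steps of the flow above could look roughly like the sketch below. It is illustrative only: the commit says only `x-billing-*`, so the three concrete header names, the `parse_billing_context` and `to_trace_columns` helpers, and the dataclass field names are assumptions, not the actual implementation.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional

# Hypothetical header names; the commit only specifies the "x-billing-*" prefix.
PLAN_TYPE_HEADER = "x-billing-plan-type"
COST_SOURCE_HEADER = "x-billing-cost-source"
CUSTOMER_ID_HEADER = "x-billing-customer-id"


@dataclass(frozen=True)
class BillingContext:
    """Illustrative stand-in for the BillingContext schema named in the commit."""

    plan_type: Optional[str] = None
    cost_source: Optional[str] = None
    customer_id: Optional[str] = None


def parse_billing_context(headers: Mapping[str, str]) -> Optional[BillingContext]:
    """Build a BillingContext from request headers, or None if no billing header is set."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    ctx = BillingContext(
        plan_type=lowered.get(PLAN_TYPE_HEADER),
        cost_source=lowered.get(COST_SOURCE_HEADER),
        customer_id=lowered.get(CUSTOMER_ID_HEADER),
    )
    # An all-empty context carries no information, so report it as absent.
    return None if ctx == BillingContext() else ctx


def to_trace_columns(ctx: Optional[BillingContext]) -> Dict[str, Optional[str]]:
    """Map the context onto the ClickHouse column names listed in the commit."""
    return {
        "billing_plan_type": ctx.plan_type if ctx else None,
        "billing_cost_source": ctx.cost_source if ctx else None,
        "billing_customer_id": ctx.customer_id if ctx else None,
    }
```

Returning `None` when no billing header is present lets downstream adapters treat billing metadata as optional, which matches requests that do not come through cloud-api.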

**Changes:**
- Add BillingContext schema to provider_trace.py
- Add billing columns to llm_traces ClickHouse table DDL
- Update getCustomerSubscription to fetch stripeCustomerId from organization_billing_details
- Propagate billing_context through agent step flow, adapters, and streaming service
- Update ProviderTrace and LLMTrace to include billing metadata
- Regenerate SDK with autogen

**Production Deployment:**
Requires env vars: LETTA_PROVIDER_TRACE_BACKEND=clickhouse, LETTA_STORE_LLM_TRACES=true, CLICKHOUSE_*

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add billing_context parameter to agent step methods

- Add billing_context to BaseAgent and BaseAgentV2 abstract methods
- Update LettaAgent, LettaAgentV2, LettaAgentV3 step methods
- Update multi-agent groups: SleeptimeMultiAgentV2, V3, V4
- Fix test_utils.py to include billing header parameters
- Import BillingContext in all affected files

* fix: add billing_context to stream methods

- Add billing_context parameter to BaseAgentV2.stream()
- Add billing_context parameter to LettaAgentV2.stream()
- LettaAgentV3.stream() already has it from previous commit

* fix: exclude billing headers from OpenAPI spec

Mark billing headers as internal (include_in_schema=False) so they don't appear in the public API.
These are internal headers between cloud-api and core, not part of the public SDK.

Regenerated SDK with stage-api, which removes 10,650 lines of bloat that were causing OOM during the Next.js build.

* refactor: return billing context from handleUnifiedRateLimiting instead of mutating req

Instead of passing req into handleUnifiedRateLimiting and mutating headers inside it:
- Return billing context fields (billingPlanType, billingCostSource, billingCustomerId) from handleUnifiedRateLimiting
- Set headers in handleMessageRateLimiting (middleware layer) after getting the result
- This fixes step-orchestrator compatibility since it doesn't have a real Express req object

* chore: remove extra gencode

* p

---------

Co-authored-by: Letta <noreply@letta.com>
2026-03-03 18:34:13 -08:00
Kian Jones
f5c4ab50f4 chore: add ty + pre-commit hook and repeal even more ruff rules (#9504)
* auto fixes

* auto fix pt2 and transitive deps and undefined var checking locals()

* manual fixes (ignored or letta-code fixed)

* fix circular import

* remove all ignores, add FastAPI rules and Ruff rules

* add ty and precommit

* ruff stuff

* ty check fixes

* ty check fixes pt 2

* error on invalid
2026-02-24 10:55:11 -08:00
Sarah Wooders
d7793a4474 fix(core): stabilize system prompt refresh and expand git-memory coverage (#9438)
* fix(core): stabilize system prompt refresh and expand git-memory coverage

Only rebuild system prompts on explicit refresh paths so normal turns preserve prefix-cache stability, including git/custom prompt layouts. Add integration coverage for memory filesystem tree structure and recompile/reset system-message updates via message-id retrieval.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): recompile system prompt around compaction and stabilize source tests

Force system prompt refresh before/after compaction in LettaAgentV3 so repaired system+memory state is used and persisted across subsequent turns. Update source-system prompt tests to explicitly recompile before raw preview assertions instead of assuming automatic rebuild timing.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
jnjpng
39b25a0e3c fix: update ContextWindowCalculator to parse new system message sections (#9398)
* fix: update ContextWindowCalculator to parse new system message sections

The context window calculator was using outdated position-based parsing
that only handled 3 sections (base_instructions, memory_blocks, memory_metadata).
The actual system message now includes additional sections that were not
being tracked:

- <memory_filesystem> (git-enabled agents)
- <tool_usage_rules> (when tool rules configured)
- <directories> (when sources attached)

Changes:
- Add _extract_tag_content() helper for proper XML tag extraction
- Rewrite extract_system_components() to return a Dict with all 6 sections
- Update calculate_context_window() to count tokens for new sections
- Add new fields to ContextWindowOverview schema with backward-compatible defaults
- Add unit tests for the extraction logic
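
A minimal sketch of tag-based extraction in the spirit of the changes above, as opposed to position-based parsing. The function names and signatures here are hypothetical, and it assumes each of the six sections is wrapped in a tag with the same name as the section; the real `_extract_tag_content()` and `extract_system_components()` may differ.

```python
import re
from typing import Dict, Optional

# The six sections named in the commit message; the tag spellings for the
# first three are an assumption.
SECTION_TAGS = [
    "base_instructions",
    "memory_blocks",
    "memory_metadata",
    "memory_filesystem",
    "tool_usage_rules",
    "directories",
]


def extract_tag_content(text: str, tag: str) -> Optional[str]:
    """Return the stripped body of the first <tag>...</tag> pair, or None if absent."""
    name = re.escape(tag)
    # DOTALL lets the body span multiple lines; the non-greedy match stops at
    # the first closing tag.
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    return match.group(1).strip() if match else None


def extract_system_components(system_message: str) -> Dict[str, Optional[str]]:
    """Map every known section name to its content (None when the section is missing)."""
    return {tag: extract_tag_content(system_message, tag) for tag in SECTION_TAGS}
```

Because missing sections come back as `None` rather than shifting later sections into earlier slots, optional sections such as `<tool_usage_rules>` no longer break the parse.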

* update

* generate

* fix: check attached file in directories section instead of core_memory

Files are rendered inside <directories> tags, not <memory_blocks>.
Update validate_context_window_overview assertions accordingly.

* fix: address review feedback for context window parser

- Fix git-enabled agents regression: capture bare file blocks
  (e.g. <system/human.md>) rendered after </memory_filesystem> as
  core_memory via new _extract_git_core_memory() method
- Make _extract_top_level_tag robust: scan all occurrences to find
  tag outside container, handling nested-first + top-level-later case
- Document system_prompt tag inconsistency in docstring
- Add TODO to base_agent.py extract_dynamic_section linking to
  ContextWindowCalculator to flag parallel parser tech debt
- Add tests: git-enabled agent parsing, dual-occurrence tag
  extraction, pure text system prompt, git-enabled integration test
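
The "scan all occurrences" idea for the nested-first, top-level-later case can be sketched as follows. This is an assumption-laden illustration: `extract_top_level_tag` and its `container` parameter are hypothetical, and the actual `_extract_top_level_tag` may track nesting differently.

```python
import re
from typing import Optional


def extract_top_level_tag(text: str, tag: str, container: str) -> Optional[str]:
    """Return the body of the first <tag> block that is not inside a <container> block."""
    tag_name = re.escape(tag)
    container_name = re.escape(container)

    # Character spans covered by every container block.
    container_spans = [
        match.span()
        for match in re.finditer(
            rf"<{container_name}>.*?</{container_name}>", text, re.DOTALL
        )
    ]

    # Check every occurrence of the tag, not just the first one, and skip
    # those that start inside a container span.
    for match in re.finditer(rf"<{tag_name}>(.*?)</{tag_name}>", text, re.DOTALL):
        start = match.start()
        if not any(begin <= start < end for begin, end in container_spans):
            return match.group(1).strip()
    return None
```

Stopping at the first occurrence would return the nested copy; iterating over all matches is what lets the later top-level block win.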
2026-02-24 10:52:07 -08:00
Kevin Lin
35b5383724 feat: match line number rendering to Anthropic / OAI defaults (#5492) 2025-10-24 15:13:15 -07:00
Matthew Zhou
3c2911e9ae feat: Remove jinja2 (#2971)
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2025-09-17 10:00:49 -07:00
Matthew Zhou
2e3cabc080 feat: Add search messages endpoint [LET-4144] (#4434)
* Add search messages endpoint

* Run fern autogen and fix tests
2025-09-05 14:28:27 -07:00
Matthew Zhou
fcfc40eaf2 feat: Add project id to message schema [LET-4166] (#4433)
* Add project id

* Propagate through update message by id async

* Add project id testing
2025-09-04 16:50:41 -07:00
Matthew Zhou
23b2769dc4 feat: Allow agent archival tools to insert/search with tags [LET-4072] (#4300)
* Finish modifying archival memory tools

* Add tags

* Add disabled test
2025-08-29 11:55:06 -07:00
cthomas
395faf3ed8 feat: add prompt generator for on the fly system prompt generation (#4060) 2025-08-20 15:49:40 -07:00
cthomas
13c916afaa feat: offload jinja to threadpool LET-3615 (#3787) 2025-08-06 16:34:28 -07:00
cthomas
21f0373b45 feat: remove redundant memory compilation in agent step (#3785) 2025-08-06 15:46:18 -07:00
cthomas
8919f06b67 feat: convert compile system prompt to async (#3685) 2025-07-31 15:49:59 -07:00
cthomas
f77a259d07 feat: asyncify jinja templates (#3580) 2025-07-26 23:17:24 -07:00
Kevin Lin
d4a63a1826 feat: add max files and file open window to system prompt (#3515) 2025-07-23 17:05:53 -07:00
Matthew Zhou
2845c86f5f fix: Fix context window compilation issues for files (#3272) 2025-07-10 11:29:36 -07:00
Matthew Zhou
2263ffd07c feat: Various file fixes and improvements (#3125) 2025-07-01 15:21:52 -07:00
Andy Li
3654fa8c26 feat: expand summarizer providers 2025-07-01 15:07:38 -07:00
Kevin Lin
29ce769549 feat: replace file descriptions with per data source description (#3067)
Co-authored-by: Matt Zhou <mattzh1314@gmail.com>
2025-07-01 11:51:52 -07:00
Sarah Wooders
2e06feafbf fix: add more timezone fixes (#3025) 2025-06-25 14:51:37 -07:00
cthomas
a332b34830 feat: make create_async route consistent with other message routes (#2877) 2025-06-19 13:51:51 -07:00
cthomas
97986b0f03 feat: send stop reason in letta APIs (#2789) 2025-06-13 16:04:48 -07:00
Shangyin Tan
22b640a5dd fix: adjust memory summary's recall message count (#2778) 2025-06-12 17:11:36 -07:00
cthomas
ccb203b264 feat: consolidate logic for finish tokens (#2779) 2025-06-12 15:24:06 -07:00
Matthew Zhou
ba3d59bba5 feat: Add prompting to guide tool rule usage (#2742) 2025-06-10 16:21:27 -07:00
cthomas
883050e761 feat: set request heartbeat for max steps (#2739) 2025-06-10 15:26:07 -07:00
Matthew Zhou
318a7c769b feat: Search files returns citations of the filenames that were searched (#2689) 2025-06-06 15:34:03 -07:00
Matthew Zhou
285acd0324 feat: Add summarization for more scenarios (#2499) 2025-05-29 11:10:13 -07:00
cthomas
6f72ac0c3a feat(asyncify): agent batch sync db calls (#2348) 2025-05-22 15:33:27 -07:00
cthomas
35d4914b4f feat(asyncify): remove non async memory methods (#2245) 2025-05-20 17:56:54 -07:00
Andy Li
02c2e4adec fix: missing await in base async fetch (#2156) 2025-05-13 13:57:21 -07:00
Andy Li
ca895f1987 fix: rebuilding memory async (#2149) 2025-05-13 12:14:51 -07:00
Andy Li
8e2417aa2f feat: async db client (#2076) 2025-05-12 17:15:14 -07:00
Matthew Zhou
beed896d60 feat: Support offline memory in voice agent (#1347) 2025-04-28 15:03:02 -07:00
cthomas
8f5a43b886 feat: add otid to new agent loop (#1635) 2025-04-09 16:50:41 -07:00
Matthew Zhou
3ba79db859 feat: Enable Anthropic streaming on new agent loop (#1550) 2025-04-03 19:40:48 -07:00
Matthew Zhou
acd0a525aa feat: Async agent loop (#1387) 2025-03-27 15:24:50 -07:00
Matthew Zhou
353af9aefe feat: Finish async memory rewriting agent for voice (#1161) 2025-03-03 13:58:06 -08:00