letta-server/letta/schemas/provider_trace.py
cthomas 416ffc7cd7 Add billing context to LLM telemetry traces (#9745)
* feat: add billing context to LLM telemetry traces

Add billing metadata (plan type, cost source, customer ID) to LLM traces in ClickHouse for cost analytics and attribution.

**Data Flow:**
- Cloud-API: Extract billing info from subscription in rate limiting, set x-billing-* headers
- Core: Parse headers into BillingContext object via dependencies
- Adapters: Flow billing_context through all LLM adapters (blocking & streaming)
- Agent: Pass billing_context to step() and stream() methods
- ClickHouse: Store in billing_plan_type, billing_cost_source, billing_customer_id columns
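
The flow above can be sketched roughly as follows. This is a minimal illustration, not the code in this change. The exact header names are assumptions (the commit only gives the `x-billing-*` prefix), `parse_billing_headers` and `to_clickhouse_columns` are hypothetical helpers, and a stdlib dataclass stands in for the pydantic `BillingContext` model so the sketch has no third-party dependencies.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class BillingContext:
    """Stand-in for the pydantic BillingContext schema in provider_trace.py."""

    plan_type: Optional[str] = None
    cost_source: Optional[str] = None
    customer_id: Optional[str] = None


# Assumed header names: the commit message only specifies the x-billing-* prefix.
HEADER_TO_FIELD = {
    "x-billing-plan-type": "plan_type",
    "x-billing-cost-source": "cost_source",
    "x-billing-customer-id": "customer_id",
}


def parse_billing_headers(headers: Mapping[str, str]) -> Optional[BillingContext]:
    """Build a BillingContext from request headers, or None if none are set."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    values = {
        field: lowered.get(header) or None
        for header, field in HEADER_TO_FIELD.items()
    }
    if not any(values.values()):
        return None
    return BillingContext(**values)


def to_clickhouse_columns(ctx: Optional[BillingContext]) -> Dict[str, Optional[str]]:
    """Flatten the context into the billing_* column names listed above."""
    return {
        "billing_plan_type": ctx.plan_type if ctx else None,
        "billing_cost_source": ctx.cost_source if ctx else None,
        "billing_customer_id": ctx.customer_id if ctx else None,
    }
```

Returning `None` when no billing header is present mirrors the `Optional[BillingContext]` field on `ProviderTrace`, so requests that bypass cloud-api still produce a valid trace row with empty billing columns.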

**Changes:**
- Add BillingContext schema to provider_trace.py
- Add billing columns to llm_traces ClickHouse table DDL
- Update getCustomerSubscription to fetch stripeCustomerId from organization_billing_details
- Propagate billing_context through agent step flow, adapters, and streaming service
- Update ProviderTrace and LLMTrace to include billing metadata
- Regenerate SDK with autogen

**Production Deployment:**
Requires env vars: LETTA_PROVIDER_TRACE_BACKEND=clickhouse, LETTA_STORE_LLM_TRACES=true, CLICKHOUSE_*
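
A pre-deployment check for these settings might look like the sketch below. `missing_trace_settings` is a hypothetical helper, not part of this change. The case-insensitive comparison against `"true"` is an assumption about how the server parses the flag, and because the individual `CLICKHOUSE_*` variable names are not listed here, the check only verifies that at least one variable with that prefix exists.

```python
from typing import List, Mapping


def missing_trace_settings(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with the trace-related env vars."""
    problems = []
    if env.get("LETTA_PROVIDER_TRACE_BACKEND") != "clickhouse":
        problems.append("LETTA_PROVIDER_TRACE_BACKEND must be 'clickhouse'")
    # Assumption: the flag is treated as a case-insensitive boolean string.
    if env.get("LETTA_STORE_LLM_TRACES", "").lower() != "true":
        problems.append("LETTA_STORE_LLM_TRACES must be 'true'")
    # The concrete CLICKHOUSE_* names are not specified, so only check the prefix.
    if not any(name.startswith("CLICKHOUSE_") for name in env):
        problems.append("no CLICKHOUSE_* variables are set")
    return problems
```

Calling `missing_trace_settings(os.environ)` at startup and logging the result would surface a misconfigured deployment before any traces are silently dropped.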

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add billing_context parameter to agent step methods

- Add billing_context to BaseAgent and BaseAgentV2 abstract methods
- Update LettaAgent, LettaAgentV2, LettaAgentV3 step methods
- Update multi-agent groups: SleeptimeMultiAgentV2, V3, V4
- Fix test_utils.py to include billing header parameters
- Import BillingContext in all affected files

* fix: add billing_context to stream methods

- Add billing_context parameter to BaseAgentV2.stream()
- Add billing_context parameter to LettaAgentV2.stream()
- LettaAgentV3.stream() already has it from previous commit

* fix: exclude billing headers from OpenAPI spec

Mark billing headers as internal (include_in_schema=False) so they don't appear in the public API.
These are internal headers between cloud-api and core, not part of the public SDK.

Regenerated SDK with stage-api - removes 10,650 lines of bloat that was causing OOM during Next.js build.

* refactor: return billing context from handleUnifiedRateLimiting instead of mutating req

Instead of passing req into handleUnifiedRateLimiting and mutating headers inside it:
- Return billing context fields (billingPlanType, billingCostSource, billingCustomerId) from handleUnifiedRateLimiting
- Set headers in handleMessageRateLimiting (middleware layer) after getting the result
- This fixes step-orchestrator compatibility since it doesn't have a real Express req object
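
The shape of this refactor can be illustrated with a small sketch. The real code lives in the cloud-api (TypeScript/Express) layer, so the Python below is only an analogy: the function names are snake_case renderings of the ones above, while the `RateLimitResult` type, the `subscription` dictionary shape, and the header names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RateLimitResult:
    """Hypothetical return value carrying the decision plus billing fields."""

    allowed: bool
    billing_plan_type: Optional[str] = None
    billing_cost_source: Optional[str] = None
    billing_customer_id: Optional[str] = None


def handle_unified_rate_limiting(subscription: Dict[str, str]) -> RateLimitResult:
    """Decide and report. Takes no request object, so any caller can use it."""
    return RateLimitResult(
        allowed=True,  # the real rate-limit decision is out of scope for this sketch
        billing_plan_type=subscription.get("plan_type"),
        billing_cost_source=subscription.get("cost_source"),
        billing_customer_id=subscription.get("customer_id"),
    )


def handle_message_rate_limiting(
    headers: Dict[str, str], subscription: Dict[str, str]
) -> bool:
    """Middleware layer: the only place that writes billing headers."""
    result = handle_unified_rate_limiting(subscription)
    # Assumed header names: the commit only specifies the x-billing-* prefix.
    pairs = {
        "x-billing-plan-type": result.billing_plan_type,
        "x-billing-cost-source": result.billing_cost_source,
        "x-billing-customer-id": result.billing_customer_id,
    }
    for name, value in pairs.items():
        if value is not None:
            headers[name] = value
    return result.allowed
```

Because the inner function returns plain data, a caller without an HTTP request (such as the step-orchestrator) can consume the billing fields directly instead of faking a request object.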

* chore: remove extra gencode

* p

---------

Co-authored-by: Letta <noreply@letta.com>
2026-03-03 18:34:13 -08:00


from __future__ import annotations

from datetime import datetime
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field

from letta.helpers.datetime_helpers import get_utc_time
from letta.schemas.enums import PrimitiveType
from letta.schemas.letta_base import OrmMetadataBase


class BillingContext(BaseModel):
    """Billing context for LLM request cost tracking."""

    plan_type: Optional[str] = Field(None, description="Subscription tier")
    cost_source: Optional[str] = Field(None, description="Cost source: 'quota' or 'credits'")
    customer_id: Optional[str] = Field(None, description="Customer ID for billing records")


class BaseProviderTrace(OrmMetadataBase):
    __id_prefix__ = PrimitiveType.PROVIDER_TRACE.value


class ProviderTrace(BaseProviderTrace):
    """
    Letta's internal representation of a provider trace.

    Attributes:
        id (str): The unique identifier of the provider trace.
        request_json (Dict[str, Any]): JSON content of the provider request.
        response_json (Dict[str, Any]): JSON content of the provider response.
        step_id (str): ID of the step that this trace is associated with.
        agent_id (str): ID of the agent that generated this trace.
        agent_tags (list[str]): Tags associated with the agent for filtering.
        call_type (str): Type of call (agent_step, summarization, etc.).
        run_id (str): ID of the run this trace is associated with.
        source (str): Source service that generated this trace (memgpt-server, lettuce-py).
        org_id (str): The unique identifier of the organization.
        user_id (str): The unique identifier of the user who initiated the request.
        compaction_settings (Dict[str, Any]): Compaction/summarization settings (only for summarization calls).
        llm_config (Dict[str, Any]): LLM configuration used for this call (only for non-summarization calls).
        billing_context (BillingContext): Billing context parsed from request headers.
        created_at (datetime): The timestamp when the object was created.
    """

    id: str = BaseProviderTrace.generate_id_field()
    request_json: Dict[str, Any] = Field(..., description="JSON content of the provider request")
    response_json: Dict[str, Any] = Field(..., description="JSON content of the provider response")
    step_id: Optional[str] = Field(None, description="ID of the step that this trace is associated with")

    # Telemetry context fields
    agent_id: Optional[str] = Field(None, description="ID of the agent that generated this trace")
    agent_tags: Optional[list[str]] = Field(None, description="Tags associated with the agent for filtering")
    call_type: Optional[str] = Field(None, description="Type of call (agent_step, summarization, etc.)")
    run_id: Optional[str] = Field(None, description="ID of the run this trace is associated with")
    source: Optional[str] = Field(None, description="Source service that generated this trace (memgpt-server, lettuce-py)")

    # v2 protocol fields
    org_id: Optional[str] = Field(None, description="ID of the organization")
    user_id: Optional[str] = Field(None, description="ID of the user who initiated the request")
    compaction_settings: Optional[Dict[str, Any]] = Field(None, description="Compaction/summarization settings (summarization calls only)")
    llm_config: Optional[Dict[str, Any]] = Field(None, description="LLM configuration used for this call (non-summarization calls only)")
    billing_context: Optional[BillingContext] = Field(None, description="Billing context from request headers")

    created_at: datetime = Field(default_factory=get_utc_time, description="The timestamp when the object was created.")


class ProviderTraceMetadata(BaseProviderTrace):
    """Metadata-only representation of a provider trace (no request/response JSON)."""

    id: str = BaseProviderTrace.generate_id_field()
    step_id: Optional[str] = Field(None, description="ID of the step that this trace is associated with")

    # Telemetry context fields
    agent_id: Optional[str] = Field(None, description="ID of the agent that generated this trace")
    agent_tags: Optional[list[str]] = Field(None, description="Tags associated with the agent for filtering")
    call_type: Optional[str] = Field(None, description="Type of call (agent_step, summarization, etc.)")
    run_id: Optional[str] = Field(None, description="ID of the run this trace is associated with")
    source: Optional[str] = Field(None, description="Source service that generated this trace (memgpt-server, lettuce-py)")

    # v2 protocol fields
    org_id: Optional[str] = Field(None, description="ID of the organization")
    user_id: Optional[str] = Field(None, description="ID of the user who initiated the request")

    created_at: datetime = Field(default_factory=get_utc_time, description="The timestamp when the object was created.")