Files
letta-server/alembic/versions/a1b2c3d4e5f8_create_provider_trace_metadata_table.py
Kian Jones c1a02fa180 feat: add metadata-only provider trace storage option (#9155)
* feat: add metadata-only provider trace storage option

Add support for writing provider traces to a lightweight metadata-only
table (~1.5GB) instead of the full table (~725GB) since request/response
JSON is now stored in GCS.

- Add `LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY` setting
- Create `provider_trace_metadata` table via alembic migration
- Conditionally write to new table when flag is enabled
- Include backfill script for migrating existing data

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API spec and SDK

* fix: use composite PK (created_at, id) for provider_trace_metadata

Aligns with GCS partitioning structure (raw/date=YYYY-MM-DD/{id}.json.gz)
and enables efficient date-range queries via the B-tree index.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* ammendments

* fix: add bulk data copy to migration

Copy existing provider_traces metadata in-migration instead of separate
backfill script. Creates indexes after bulk insert for better performance.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: remove data copy from migration, create empty table only

Old data stays in provider_traces, new writes go to provider_trace_metadata
when flag is enabled. Full traces are in GCS anyway.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: address PR comments

- Remove GCS mention from ProviderTraceMetadata docstring
- Move metadata object creation outside session context

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: reads always use full provider_traces table

The metadata_only flag should only control writes. Reads always go to
the full table to avoid returning ProviderTraceMetadata where
ProviderTrace is expected.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: enable metadata-only provider trace writes in prod

Add LETTA_TELEMETRY_PROVIDER_TRACE_PG_METADATA_ONLY=true to all
Helm values (memgpt-server and lettuce-py, prod and dev).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00

60 lines
2.2 KiB
Python

"""create provider_trace_metadata table
Revision ID: a1b2c3d4e5f8
Revises: 9275f62ad282
Create Date: 2026-01-28
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
from letta.settings import settings
revision: str = "a1b2c3d4e5f8"
down_revision: Union[str, None] = "9275f62ad282"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
if not settings.letta_pg_uri_no_default:
return
op.create_table(
"provider_trace_metadata",
sa.Column("id", sa.String(), nullable=False),
sa.Column("step_id", sa.String(), nullable=True),
sa.Column("agent_id", sa.String(), nullable=True),
sa.Column("agent_tags", sa.JSON(), nullable=True),
sa.Column("call_type", sa.String(), nullable=True),
sa.Column("run_id", sa.String(), nullable=True),
sa.Column("source", sa.String(), nullable=True),
sa.Column("org_id", sa.String(), nullable=True),
sa.Column("user_id", sa.String(), nullable=True),
sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.text("now()"), nullable=False),
sa.Column("updated_at", sa.DateTime(timezone=True), server_default=sa.text("now()"), nullable=True),
sa.Column("is_deleted", sa.Boolean(), server_default=sa.text("FALSE"), nullable=False),
sa.Column("_created_by_id", sa.String(), nullable=True),
sa.Column("_last_updated_by_id", sa.String(), nullable=True),
sa.Column("organization_id", sa.String(), nullable=False),
sa.ForeignKeyConstraint(
["organization_id"],
["organizations.id"],
),
sa.PrimaryKeyConstraint("created_at", "id"),
)
op.create_index("ix_provider_trace_metadata_step_id", "provider_trace_metadata", ["step_id"], unique=False)
op.create_index("ix_provider_trace_metadata_id", "provider_trace_metadata", ["id"], unique=True)
def downgrade() -> None:
if not settings.letta_pg_uri_no_default:
return
op.drop_index("ix_provider_trace_metadata_id", table_name="provider_trace_metadata")
op.drop_index("ix_provider_trace_metadata_step_id", table_name="provider_trace_metadata")
op.drop_table("provider_trace_metadata")