Commit Graph

67 Commits

Author SHA1 Message Date
Kian Jones
382e216cbb fix(core): differentiate BYOK vs base provider in all LLM error details (#9425)
Add is_byok flag to every LLMError's details dict returned from
handle_llm_error across all providers (OpenAI, Anthropic, Google,
ChatGPT OAuth). This enables observability into whether errors
originate from Letta's production keys or user-provided BYOK keys.

The rate limit handler in app.py now returns a more helpful message
for BYOK users ("check your provider's rate limits and billing")
versus the generic message for base provider rate limits.

Datadog issues:
- https://us5.datadoghq.com/error-tracking/issue/b711c824-f490-11f0-96e4-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/76623036-f4de-11f0-8697-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/43e9888a-dfcf-11f0-a645-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
203b6ead7c fix: remove duplicate provider trace logging and dead code (#9278)
Provider traces were being created twice per step:
1. Via `request_async_with_telemetry` / `log_provider_trace_async` in LLMClient
2. Via direct `create_provider_trace_async` calls in LettaAgent

This caused duplicate records in provider_trace_metadata (Postgres) and
llm_traces (ClickHouse) for every agent step.

Changes:
- Remove redundant direct `create_provider_trace_async` calls from letta_agent.py
- Remove no-op `stream_async_with_telemetry` method (was just a pass-through to `stream_async`)
- Update callers to use `stream_async` directly

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
amysguan
16c96cc3c0 Fix sliding window cutoff logic (#9261)
* fix sliding window cutoff calculations to use agent instead of summarizer config

* allow approval messages with tool_calls as valid cutoffs, prevent approval pairs from being split

* update tests with updated sliding window parameters

---------

Co-authored-by: Amy Guan <amy@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
eaf64fb510 fix: add LLMCallType enum and ensure call_type is set on all provider traces (#9258)
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
f48b60634f refactor: extract compact logic to shared function for temporal (#9249)
* refactor: extract compact logic to shared function

Extract the compaction logic from LettaAgentV3.compact() into a
standalone compact_messages() function that can be shared between
the agent and temporal workflows.

Changes:
- Create apps/core/letta/services/summarizer/compact.py with:
  - compact_messages(): Core compaction logic
  - build_summarizer_llm_config(): LLM config builder for summarization
  - CompactResult: Dataclass for compaction results
- Update LettaAgentV3.compact() to use compact_messages()
- Update temporal summarize_conversation_history activity to use
  compact_messages() instead of the old Summarizer class
- Add use_summary_role parameter to SummarizeParams

This ensures consistent summarization behavior across different
execution paths and prevents drift as we improve the implementation.

* chore: clean up verbose comments

* fix: correct CompactionSettings import path

* fix: correct count_tokens import from summarizer_sliding_window

* fix: update test patch path for count_tokens_with_tools

After extracting compact logic to compact.py, the test was patching
the old location. Update the patch path to the new module location.

* fix: update test to use build_summarizer_llm_config from compact.py

The function was moved from LettaAgentV3._build_summarizer_llm_config
to compact.py as a standalone function.

* fix: add early check for system prompt size in compact_messages

Check if the system prompt alone exceeds the context window before
attempting summarization. The system prompt cannot be compacted,
so fail fast with SystemPromptTokenExceededError.

* fix: properly propagate SystemPromptTokenExceededError from compact

The exception handler in _step() was not setting the correct stop_reason
for SystemPromptTokenExceededError, which caused the finally block to
return early and swallow the exception.

Add special handling to set stop_reason to context_window_overflow_in_system_prompt
when SystemPromptTokenExceededError is caught.

* revert: remove redundant SystemPromptTokenExceededError handling

The special handling in the outer exception handler is redundant because
stop_reason is already set in the inner handler at line 943. The actual
fix for the test was the early check in compact_messages(), not this
redundant handling.

* fix: correctly re-raise SystemPromptTokenExceededError

The inner exception handler was using 'raise e' which re-raised the outer
ContextWindowExceededError instead of the current SystemPromptTokenExceededError.

Changed to 'raise' to correctly re-raise the current exception. This bug
was pre-existing but masked because _check_for_system_prompt_overflow was
only called as a fallback. The new early check in compact_messages() exposed it.

* revert: remove early check and restore raise e to match main behavior

* fix: set should_continue=False and correctly re-raise exception

- Add should_continue=False in SystemPromptTokenExceededError handler (matching main's _check_for_system_prompt_overflow behavior)
- Fix raise e -> raise to correctly propagate SystemPromptTokenExceededError

Note: test_large_system_prompt_summarization still fails locally but passes on main.
Need to investigate why exception isn't propagating correctly on refactored branch.

* fix: add SystemPromptTokenExceededError handler for post-step compaction

The post-step compaction (line 1066) was missing a SystemPromptTokenExceededError
exception handler. When compact_messages() raised this error, it would be caught
by the outer exception handler which would:
1. Set stop_reason to "error" instead of "context_window_overflow_in_system_prompt"
2. Not set should_continue = False
3. Get swallowed by the finally block (line 1126) which returns early

This caused test_large_system_prompt_summarization to fail because the exception
never propagated to the test.

The fix adds the same exception handler pattern used in the retry compaction flow
(line 941-946), ensuring proper state is set before re-raising.

This issue only affected the refactored code because on main, _check_for_system_prompt_overflow()
was an instance method that set should_continue/stop_reason BEFORE raising. In the refactor,
compact_messages() is a standalone function that cannot set instance state, so the caller
must handle the exception and set the state.
2026-02-24 10:52:06 -08:00
Sarah Wooders
4096b30cd7 feat: log LLM traces to clickhouse (#9111)
* feat: add non-streaming option for conversation messages

- Add ConversationMessageRequest with stream=True default (backwards compatible)
- stream=true (default): SSE streaming via StreamingService
- stream=false: JSON response via AgentLoop.load().step()

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API schema for ConversationMessageRequest

* feat: add direct ClickHouse storage for raw LLM traces

    Adds ability to store raw LLM request/response payloads directly in ClickHouse,
    bypassing OTEL span attribute size limits. This enables debugging and analytics
    on large LLM payloads (>10MB system prompts, large tool schemas, etc.).

    New files:
    - letta/schemas/llm_raw_trace.py: Pydantic schema with ClickHouse row helper
    - letta/services/llm_raw_trace_writer.py: Async batching writer (fire-and-forget)
    - letta/services/llm_raw_trace_reader.py: Reader with query methods
    - scripts/sql/clickhouse/llm_raw_traces.ddl: Production table DDL
    - scripts/sql/clickhouse/llm_raw_traces_local.ddl: Local dev DDL
    - apps/core/clickhouse-init.sql: Local dev initialization

    Modified:
    - letta/settings.py: Added 4 settings (store_llm_raw_traces, ttl, batch_size, flush_interval)
    - letta/llm_api/llm_client_base.py: Integration into request_async_with_telemetry
    - compose.yaml: Added ClickHouse service for local dev
    - justfile: Added clickhouse, clickhouse-cli, clickhouse-traces commands

    Feature disabled by default (LETTA_STORE_LLM_RAW_TRACES=false).
    Uses ZSTD(3) compression for 10-30x reduction on JSON payloads.

    🤖 Generated with [Letta Code](https://letta.com)

    Co-Authored-By: Letta <noreply@letta.com>

* fix: address code review feedback for LLM raw traces

Fixes based on code review feedback:

1. Fix ClickHouse endpoint parsing - default to secure=False for raw host:port
   inputs (was defaulting to HTTPS which breaks local dev)

2. Make raw trace writes truly fire-and-forget - use asyncio.create_task()
   instead of awaiting, so JSON serialization doesn't block request path

3. Add bounded queue (maxsize=10000) - prevents unbounded memory growth
   under load. Drops traces with warning if queue is full.

4. Fix deprecated asyncio usage - get_running_loop() instead of get_event_loop()

5. Add org_id fallback - use _telemetry_org_id if actor doesn't have it

6. Remove unused imports - json import in reader

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add missing asyncio import and simplify JSON serialization

- Add missing 'import asyncio' that was causing 'name asyncio is not defined' error
- Remove unnecessary clean_double_escapes() function - the JSON is stored correctly,
  the clickhouse-client CLI was just adding extra escaping when displaying
- Update just clickhouse-trace to use Python client for correct JSON output

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: add clickhouse raw trace integration test

* test: simplify clickhouse trace assertions

* refactor: centralize usage parsing and stream error traces

Use per-client usage helpers for raw trace extraction and ensure streaming errors log requests with error metadata.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: exercise provider usage parsing live

Make live OpenAI/Anthropic/Gemini requests with credential gating and validate Anthropic cache usage mapping when present.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: fix usage parsing tests to pass

- Use GoogleAIClient with GEMINI_API_KEY instead of GoogleVertexClient
- Update model to gemini-2.0-flash (1.5-flash deprecated in v1beta)
- Add tools=[] for Gemini/Anthropic build_request_data

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: extract_usage_statistics returns LettaUsageStatistics

Standardize on LettaUsageStatistics as the canonical usage format returned by client helpers. Inline UsageStatistics construction for ChatCompletionResponse where needed.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add is_byok and llm_config_json columns to ClickHouse traces

Extend llm_raw_traces table with:
- is_byok (UInt8): Track BYOK vs base provider usage for billing analytics
- llm_config_json (String, ZSTD): Store full LLM config for debugging and analysis

This enables queries like:
- BYOK usage breakdown by provider/model
- Config parameter analysis (temperature, max_tokens, etc.)
- Debugging specific request configurations

* feat: add tests for error traces, llm_config_json, and cache tokens

- Update llm_raw_trace_reader.py to query new columns (is_byok,
  cached_input_tokens, cache_write_tokens, reasoning_tokens, llm_config_json)
- Add test_error_trace_stored_in_clickhouse to verify error fields
- Add test_cache_tokens_stored_for_anthropic to verify cache token storage
- Update existing tests to verify llm_config_json is stored correctly
- Make llm_config required in log_provider_trace_async()
- Simplify provider extraction to use provider_name directly

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* ci: add ClickHouse integration tests to CI pipeline

- Add use-clickhouse option to reusable-test-workflow.yml
- Add ClickHouse service container with otel database
- Add schema initialization step using clickhouse-init.sql
- Add ClickHouse env vars (CLICKHOUSE_ENDPOINT, etc.)
- Add separate clickhouse-integration-tests job running
  integration_test_clickhouse_llm_raw_traces.py

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify provider and org_id extraction in raw trace writer

- Use model_endpoint_type.value for provider (not provider_name)
- Simplify org_id to just self.actor.organization_id (actor is always pydantic)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify LLMRawTraceWriter with _enabled flag

- Check ClickHouse env vars once at init, set _enabled flag
- Early return in write_async/flush_async if not enabled
- Remove ValueError raises (never used)
- Simplify _get_client (no validation needed since already checked)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add LLMRawTraceWriter shutdown to FastAPI lifespan

Properly flush pending traces on graceful shutdown via lifespan
instead of relying only on atexit handler.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add agent_tags column to ClickHouse traces

Store agent tags as Array(String) for filtering/analytics by tag.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* cleanup

* fix(ci): fix ClickHouse schema initialization in CI

- Create database separately before loading SQL file
- Remove CREATE DATABASE from SQL file (handled in CI step)
- Add verification step to confirm table was created
- Use -sf flag for curl to fail on HTTP errors

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify LLM trace writer with ClickHouse async_insert

- Use ClickHouse async_insert for server-side batching instead of manual queue/flush loop
- Sync cloud DDL schema with clickhouse-init.sql (add missing columns)
- Remove redundant llm_raw_traces_local.ddl
- Remove unused batch_size/flush_interval settings
- Update tests for simplified writer

Key changes:
- async_insert=1, wait_for_async_insert=1 for reliable server-side batching
- Simple per-trace retry with exponential backoff (max 3 retries)
- ~150 lines removed from writer

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: consolidate ClickHouse direct writes into TelemetryManager backend

- Add clickhouse_direct backend to provider_trace_backends
- Remove duplicate ClickHouse write logic from llm_client_base.py
- Configure via LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse_direct

The clickhouse_direct backend:
- Converts ProviderTrace to LLMRawTrace
- Extracts usage stats from response JSON
- Writes via LLMRawTraceWriter with async_insert

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: address PR review comments and fix llm_config bug

Review comment fixes:
- Rename clickhouse_direct -> clickhouse_analytics (clearer purpose)
- Remove ClickHouse from OSS compose.yaml, create separate compose.clickhouse.yaml
- Delete redundant scripts/test_llm_raw_traces.py (use pytest tests)
- Remove unused llm_raw_traces_ttl_days setting (TTL handled in DDL)
- Fix socket description leak in telemetry_manager docstring
- Add cloud-only comment to clickhouse-init.sql
- Update justfile to use separate compose file

Bug fix:
- Fix llm_config not being passed to ProviderTrace in telemetry
- Now correctly populates provider, model, is_byok for all LLM calls
- Affects both request_async_with_telemetry and log_provider_trace_async

DDL optimizations:
- Add secondary indexes (bloom_filter for agent_id, model, step_id)
- Add minmax indexes for is_byok, is_error
- Change model and error_type to LowCardinality for faster GROUP BY

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: rename llm_raw_traces -> llm_traces

Address review feedback that "raw" is misleading since we denormalize fields.

Renames:
- Table: llm_raw_traces -> llm_traces
- Schema: LLMRawTrace -> LLMTrace
- Files: llm_raw_trace_{reader,writer}.py -> llm_trace_{reader,writer}.py
- Setting: store_llm_raw_traces -> store_llm_traces

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update workflow references to llm_traces

Missed renaming table name in CI workflow files.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update clickhouse_direct -> clickhouse_analytics in docstring

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove inaccurate OTEL size limit comments

The 4MB limit is our own truncation logic, not an OTEL protocol limit.
The real benefit is denormalized columns for analytics queries.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove local ClickHouse dev setup (cloud-only feature)

- Delete clickhouse-init.sql and compose.clickhouse.yaml
- Remove local clickhouse just commands
- Update CI to use cloud DDL with MergeTree for testing

clickhouse_analytics is a cloud-only feature. For local dev, use postgres backend.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: restore compose.yaml to match main

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: merge clickhouse_analytics into clickhouse backend

Per review feedback - having two separate backends was confusing.

Now the clickhouse backend:
- Writes to llm_traces table (denormalized for cost analytics)
- Reads from OTEL traces table (will cut over to llm_traces later)

Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: correct path to DDL file in CI workflow

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: add provider index to DDL for faster filtering

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: configure telemetry backend in clickhouse tests

Tests need to set telemetry_settings.provider_trace_backends to include
'clickhouse', otherwise traces are routed to default postgres backend.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: set provider_trace_backend field, not property

provider_trace_backends is a computed property, need to set the
underlying provider_trace_backend string field instead.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: error trace test and error_type extraction

- Add TelemetryManager to error trace test so traces get written
- Fix error_type extraction to check top-level before nested error dict

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use provider_trace.id for trace correlation across backends

- Pass provider_trace.id to LLMTrace instead of auto-generating
- Log warning if ID is missing (shouldn't happen, helps debug)
- Fallback to new UUID only if not set

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: trace ID correlation and concurrency issues

- Strip "provider_trace-" prefix from ID for UUID storage in ClickHouse
- Add asyncio.Lock to serialize writes (clickhouse_connect not thread-safe)
- Fix Anthropic prompt_tokens to include cached tokens for cost analytics
- Log warning if provider_trace.id is missing

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
2026-02-24 10:52:06 -08:00
jnjpng
24ea7dbaed feat: include tools as part of token estimate in compact (#9242)
* base

* fix
2026-02-24 10:52:06 -08:00
Kian Jones
e3fb00f970 feat(crouton): add orgId, userId, Compaction_Settings and LLM_Config (#9022)
* LC one shot?

* api changes

* fix summarizer nameerror
2026-01-29 12:44:04 -08:00
Kian Jones
4d256b3399 feat: add agent_id, run_id, step_id to summarization provider traces (#8996)
* feat: add agent_id, run_id, step_id to summarization provider traces

Summarization LLM calls were missing telemetry context (agent_id,
agent_tags, run_id, step_id), making it impossible to attribute
summarization costs to specific agents or trace them back to the
step that triggered compaction.

Changes:
- Add step_id param to simple_summary() and set_telemetry_context()
- Add agent_id, agent_tags, run_id, step_id to summarize_all() and
  summarize_via_sliding_window()
- Update Summarizer class to accept and pass telemetry context
- Update LettaAgentV3.compact() to pass full telemetry context
- Update LettaAgentV2.summarize_conversation_history() with run_id/step_id
- Update LettaAgent (v1) streaming methods with run_id param
- Add run_id/step_id to SummarizeParams for Temporal activities

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update test mock to accept new summarization params

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:43:53 -08:00
cthomas
b3a213a56f revert: fix: include assistant response in sliding window summarizati… (#8929)
revert: fix: include assistant response in sliding window summarization slice
2026-01-19 15:54:44 -08:00
cthomas
4be366470b fix: summarizer cutoff bug (#8928)
* fix: summarizer cutoff bug

* update comment with context
2026-01-19 15:54:44 -08:00
Kian Jones
81b5d71889 feat: add agents and log error properly (#8914)
* add agents and log error properly

* fix llm stream adapter
2026-01-19 15:54:43 -08:00
Kian Jones
b0dfdd2725 fix commas in justfile helm secret setting and bug with missing metadata (#8874) 2026-01-19 15:54:43 -08:00
Kian Jones
a92e868ee6 feat: centralize telemetry logging at LLM client level (#8815)
* feat: centralize telemetry logging at LLM client level

Moves telemetry logging from individual adapters to LLMClientBase:
- Add TelemetryStreamWrapper for streaming telemetry on stream close
- Add request_async_with_telemetry() for non-streaming requests
- Add stream_async_with_telemetry() for streaming requests
- Add set_telemetry_context() to configure agent_id, run_id, step_id

Updates adapters and agents to use new pattern:
- LettaLLMAdapter now accepts agent_id/run_id in constructor
- Adapters call set_telemetry_context() before LLM requests
- Removes duplicate telemetry logging from adapters
- Enriches traces with agent_id, run_id, call_type metadata

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: accumulate streaming response content for telemetry

TelemetryStreamWrapper now extracts actual response data from chunks:
- Content text (concatenated from deltas)
- Tool calls (id, name, arguments)
- Model name, finish reason, usage stats

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: move streaming telemetry to caller (option 3)

- Remove TelemetryStreamWrapper class
- Add log_provider_trace_async() helper to LLMClientBase
- stream_async_with_telemetry() now just returns raw stream
- Callers log telemetry after processing with rich interface data

Updated callers:
- summarizer.py: logs content + usage after stream processing
- letta_agent.py: logs tool_call, reasoning, model, usage

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: pass agent_id and run_id to parent adapter class

LettaLLMStreamAdapter was not passing agent_id/run_id to parent,
causing "unexpected keyword argument" errors.

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:43 -08:00
Sarah Wooders
ef0f8e20d9 fix: improve summarization prompts (#8764) 2026-01-19 15:54:41 -08:00
cthomas
c3276dda17 fix: summarization drops reasoning content mid turn [LET-7039] (#8769)
fix: summarization drops reasoning content mid turn
2026-01-19 15:54:41 -08:00
Sarah Wooders
e2c6550363 fix: include assistant response in sliding window summarization slice (#8749) 2026-01-19 15:54:41 -08:00
Sarah Wooders
c675c289ef feat: add more summary logging (#7405) 2026-01-12 10:57:19 -08:00
Sarah Wooders
f1bd246e9b feat: use token streaming for anthropic summarization (#7105) 2025-12-17 17:31:02 -08:00
Ari Webb
4878b49fa1 chore: bounds check for assistant message index (#7070) 2025-12-17 17:31:02 -08:00
Sarah Wooders
bd9f3aca9b fix: fix prompt_acknowledgement usage and update summarization prompts (#7012) 2025-12-15 12:03:09 -08:00
Sarah Wooders
a731e01e88 fix: use model instead of model_settings (#6834) 2025-12-15 12:03:09 -08:00
Sarah Wooders
7ea297231a feat: add compaction_settings to agents (#6625)
* initial commit

* Add database migration for compaction_settings field

This migration adds the compaction_settings column to the agents table
to support customized summarization configuration for each agent.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix

* rename

* update apis

* fix tests

* update web test

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <kian@letta.com>
2025-12-15 12:02:34 -08:00
Sarah Wooders
70c57c5072 fix: various patches to summarizer (#6597) 2025-12-15 12:02:34 -08:00
Sarah Wooders
bbd52e291c feat: refactor summarization and message persistence code [LET-6464] (#6561) 2025-12-15 12:02:34 -08:00
cthomas
a7c0bad857 fix: unbound var in summarization [LET-6484] (#6568)
* fix: unbound var in summarization

* fix indentation
2025-12-15 12:02:34 -08:00
cthomas
6fd73d4dcf chore: add more summarize tracing (#6532) 2025-12-15 12:02:33 -08:00
Sarah Wooders
90e2030f41 fix: fix manual summarization (#6466) 2025-12-15 12:02:19 -08:00
Sarah Wooders
bd97b23025 feat: fallback to all mode for summarizer if error (#6465) 2025-12-15 12:02:19 -08:00
Sarah Wooders
91e3dd8b3e feat: fix new summarizer code and add more tests (#6461) 2025-12-15 12:02:19 -08:00
Sarah Wooders
c0b422c4c6 fix: patch summarizer and add tests (#6457) 2025-12-15 12:02:19 -08:00
Sarah Wooders
1939a9d185 feat: patch summarizer without changes to AgentState (#6450) 2025-12-15 12:02:19 -08:00
cthomas
4bb116f17c fix: sync api call in message path (#6291)
* fix: sync api call in message path

* remove unused function

* add new error type
2025-11-24 19:10:26 -08:00
Kian Jones
aedddcb1d1 chore: log payload on all summarizer failures for debugging (#5952)
* log the request data on fallback too

* log at every layer lol
2025-11-13 15:36:50 -08:00
Kian Jones
d33fd219db chore: debug to info (#5941)
debug to info
2025-11-13 15:36:50 -08:00
Charles Packer
aa7093c585 fix: patch hole in the fallback summarizer where we weren't actually truncating (#5919)
* fix: patch hole in the fallback summarizer where we weren't actually truncating

* fix: remove no-op

* chore: comment

* fix: simplify the new fallback

* fix: properly handle images in summarizer payload
2025-11-13 15:36:50 -08:00
Kian Jones
c81fded806 chore: dump payload on error (#5918)
dump payload on error
2025-11-13 15:36:39 -08:00
Sarah Wooders
57bb051ea4 feat: add tool return truncation to summarization as a fallback [LET-5970] (#5859) 2025-11-13 15:36:30 -08:00
Kian Jones
c2e474e03a feat: refactor logs to parse as a single log line each and filter out 404s from sentry (#5242)
* add multiline log auto detect

* implement logger.exception()

* filter out 404

* remove potentially problematic changes
2025-10-24 15:11:31 -07:00
Sarah Wooders
ef07e03ee3 feat: add run_id to input messages and step_id to messages (#5099) 2025-10-07 17:50:48 -07:00
Charles Packer
1c7448eb9d fix: patch remaining failing send_message CI tests (#4974)
* fix: patch failing summarizer tests for anthropic claude 3.5

* fix: carveout for gemini-2.5-flash because it doesn't do the send_message tool call

* fix: deprecate old gemini test now that model is unavailable

* fix: deprecate old gemini test now that model is unavailable

* fix: deprecate old gemini test now that model is unavailable

* fix: patch flash flakiness

* fix: relax the gemini 2.5 flash test

* fix: relax the gemini 2.5 flash test

* fix: relax again

* fix: another flash fix

* fix: relax gpt-4o-mini

* fix: swap 4o-mini for 4.1

* fix: drop 4o-mini
2025-10-07 17:50:46 -07:00
Charles Packer
a4041879a4 feat: add new agent loop (squash rebase of OSS PR) (#4815)
* feat: squash rebase of OSS PR

* fix: revert changes that weren't on manual rebase

* fix: caught another one

* fix: disable force

* chore: drop print

* fix: just stage-api && just publish-api

* fix: make agent_type consistently an arg in the client

* fix: patch multi-modal support

* chore: put in todo stub

* fix: disable hardcoding for tests

* fix: patch validate agent sync (#4882)

patch validate agent sync

* fix: strip bad merge diff

* fix: revert unrelated diff

* fix: react_v2 naming -> letta_v1 naming

* fix: strip bad merge

---------

Co-authored-by: Kevin Lin <klin5061@gmail.com>
2025-10-07 17:50:45 -07:00
Kian Jones
b8e9a80d93 merge this (#4759)
* wait I forgot to comit locally

* cp the entire core directory and then rm the .git subdir
2025-09-17 15:47:40 -07:00
Kian Jones
22f70ca07c chore: officially migrate to submodule (#4502)
* remove apps/core and apps/fern

* fix precommit

* add submodule updates in workflows

* submodule

* remove core tests

* update core revision

* Add submodules: true to all GitHub workflows

- Ensure all workflows can access git submodules
- Add submodules support to deployment, test, and CI workflows
- Fix YAML syntax issues in workflow files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove core-lint

* upgrade core with latest main of oss

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 12:45:53 -07:00
Matthew Zhou
acaf820009 feat: Add project_id to message filtering/inserting to Turbopuffer [LET-4252] (#4466)
* Add project_id

* Fern autogen
2025-09-08 14:35:15 -07:00
Matthew Zhou
2ef47d8002 feat: Add search messages endpoint [LET-4144] (#4434)
* Add search messages endpoint

* Run fern autogen and fix tests
2025-09-05 14:28:27 -07:00
Matthew Zhou
cbf2e09e13 feat: Add project id to message schema [LET-4166] (#4433)
* Add project id

* Propogate through update message by id async

* Add project id testing
2025-09-04 16:50:41 -07:00
cthomas
1edcc13778 feat: support filtering out messages when converting to openai dict (#4337)
* feat: support filtering out messages when converting to openai dict

* fix imports
2025-09-01 12:48:45 -07:00
Kian Jones
fecf6decfb chore: migrate to ruff (#4305)
* base requirements

* autofix

* Configure ruff for Python linting and formatting

- Set up minimal ruff configuration with basic checks (E, W, F, I)
- Add temporary ignores for common issues during migration
- Configure pre-commit hooks to use ruff with pass_filenames
- This enables gradual migration from black to ruff

* Delete sdj

* autofixed only

* migrate lint action

* more autofixed

* more fixes

* change precommit

* try changing the hook

* try this stuff
2025-08-29 11:11:19 -07:00
Sarah Wooders
f10f27a9ce fix: avoid using summarization agent when not using voice (#4097) 2025-08-25 13:23:24 -07:00