Commit Graph

335 Commits

Author SHA1 Message Date
Kian Jones
a92e868ee6 feat: centralize telemetry logging at LLM client level (#8815)
* feat: centralize telemetry logging at LLM client level

Moves telemetry logging from individual adapters to LLMClientBase:
- Add TelemetryStreamWrapper for streaming telemetry on stream close
- Add request_async_with_telemetry() for non-streaming requests
- Add stream_async_with_telemetry() for streaming requests
- Add set_telemetry_context() to configure agent_id, run_id, step_id

Updates adapters and agents to use new pattern:
- LettaLLMAdapter now accepts agent_id/run_id in constructor
- Adapters call set_telemetry_context() before LLM requests
- Removes duplicate telemetry logging from adapters
- Enriches traces with agent_id, run_id, call_type metadata

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: accumulate streaming response content for telemetry

TelemetryStreamWrapper now extracts actual response data from chunks:
- Content text (concatenated from deltas)
- Tool calls (id, name, arguments)
- Model name, finish reason, usage stats

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: move streaming telemetry to caller (option 3)

- Remove TelemetryStreamWrapper class
- Add log_provider_trace_async() helper to LLMClientBase
- stream_async_with_telemetry() now just returns raw stream
- Callers log telemetry after processing with rich interface data

Updated callers:
- summarizer.py: logs content + usage after stream processing
- letta_agent.py: logs tool_call, reasoning, model, usage

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: pass agent_id and run_id to parent adapter class

LettaLLMStreamAdapter was not passing agent_id/run_id to parent,
causing "unexpected keyword argument" errors.

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:43 -08:00
Kian Jones
9418ab9815 feat: add provider trace backend abstraction for multi-backend telemetry (#8814)
* feat: add provider trace backend abstraction for multi-backend telemetry

Introduces a pluggable backend system for provider traces:
- Base class with async/sync create and read interfaces
- PostgreSQL backend (existing behavior)
- ClickHouse backend (via OTEL instrumentation)
- Socket backend (writes to Unix socket for crouton sidecar)
- Factory for instantiating backends from config

Refactors TelemetryManager to use backends with support for:
- Multi-backend writes (concurrent via asyncio.gather)
- Primary backend for reads (first in config list)
- Graceful error handling per backend

Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND (comma-separated)
Example: "postgres,socket" for dual-write to Postgres and crouton

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add protocol version to socket backend records

Adds PROTOCOL_VERSION constant to socket backend:
- Included in every telemetry record sent to crouton
- Must match ProtocolVersion in apps/crouton/main.go
- Enables crouton to detect and reject incompatible messages

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: remove organization_id from ProviderTraceCreate calls

The organization_id is now handled via the actor parameter in the
telemetry manager, not through ProviderTraceCreate schema. This fixes
validation errors after changing ProviderTraceCreate to inherit from
BaseProviderTrace which forbids extra fields.

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* consolidate provider trace

* add clickhouse-connect to fix bug on main lmao

* auto generated sdk changes, and deployment details, and clikchouse prefix bug and added fields to runs trace return api

* auto generated sdk changes, and deployment details, and clikchouse prefix bug and added fields to runs trace return api

* consolidate provider trace

* consolidate provider trace bug fix

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:43 -08:00
jnjpng
e2a8a95371 fix: chatgpt oauth model endpoint (#8855)
base
2026-01-19 15:54:43 -08:00
jnjpng
a98bc31bf3 fix: refactor enable strict mode for structured output (#8840)
* base

* test
2026-01-19 15:54:42 -08:00
jnjpng
21c70323df fix: respect strict mode for temporal with fallback on client (#8839)
* base

* update
2026-01-19 15:54:42 -08:00
jnjpng
85c242077e feat: strict tool calling setting (#8810)
base
2026-01-19 15:54:42 -08:00
jnjpng
637e320b11 chore: fix up minor nits and typing for chatgpt client (#8807)
base
2026-01-19 15:54:42 -08:00
jnjpng
5017cb1d12 feat: add chatgpt oauth client for codex routing (#8774)
* base

* refresh

* use default model fallback

* patch

* streaming

* generate
2026-01-19 15:54:42 -08:00
Sarah Wooders
97cdfb4225 Revert "feat: add strict tool calling setting [LET-6902]" (#8720)
Revert "feat: add strict tool calling setting [LET-6902] (#8577)"

This reverts commit 697c9d0dee6af73ec4d5d98780e2ca7632a69173.
2026-01-19 15:54:39 -08:00
github-actions[bot]
5fbf8f93e2 fix: add explicit timeouts to httpx clients to prevent ReadTimeout errors (#8538)
This commit addresses the httpx.ReadTimeout error detected in production
by adding explicit timeout configurations to several httpx client usages:

1. MCP SSE client: Pass mcp_connect_to_server_timeout (30s) to sse_client()
2. MCP StreamableHTTP client: Pass mcp_connect_to_server_timeout (30s) to streamablehttp_client()
3. OpenAI model list API: Add 30s timeout with 10s connect timeout
4. Google AI model list/details API: Add 30s timeout with 10s connect timeout

Previously, these httpx clients were created without explicit timeouts,
which could cause ReadTimeout errors when remote servers are slow to respond.

Fixes #8073

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
2026-01-19 15:54:38 -08:00
github-actions[bot]
85c40c8154 fix: add streaming fallback for long-running Anthropic requests (#8564)
When the Anthropic SDK detects a request may exceed 10 minutes, it
raises a ValueError requiring streaming mode. This fix catches that
specific error in request_async and automatically falls back to
streaming mode, accumulating the response into the same format as
non-streaming.

This resolves the production error:
"ValueError: Streaming is required for operations that may take
longer than 10 minutes"

Fixes #8516

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
2026-01-19 15:54:38 -08:00
Sarah Wooders
bdede5f90c feat: add strict tool calling setting [LET-6902] (#8577) 2026-01-19 15:54:38 -08:00
cthomas
9b5067bed9 fix: remove unused sync code (#8613)
* chore: remove unused sync code

* chore: remove deprecated sync Google AI functions

Removes unused sync functions that used httpx.Client (blocking):
- google_ai_get_model_details()
- google_ai_get_model_context_window()
- GoogleGeminiProvider.get_model_context_window()

All code now uses async versions with httpx.AsyncClient.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:37 -08:00
cthomas
57cb2d7566 fix: async functions must call async methods (#8612)
Critical fixes:
- llm_client_base.send_llm_request() now calls await self.request_async() instead of self.request()
- Remove unused sync get_openai_embedding() that used sync OpenAI client
- Remove deprecated compile_in_thread_async() from Memory

These were blocking the event loop during LLM requests and embeddings.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-19 15:54:37 -08:00
Sarah Wooders
0cbdf452fa fix: temporarily disable structured outputs for anthropic (#8491) 2026-01-12 10:57:49 -08:00
cthomas
938bb78afe fix: handle anthropic incorrect tool id bug (#8447) 2026-01-12 10:57:49 -08:00
github-actions[bot]
f2171447a8 fix: handle httpx.ReadError, WriteError, and ConnectError in LLM streaming clients (#8243)
Adds explicit handling for httpx network errors (ReadError, WriteError,
ConnectError) in AnthropicClient, OpenAIClient, and GoogleVertexClient.
These errors can occur during streaming when the connection is unexpectedly
closed while reading/writing data.

Maps these errors to LLMConnectionError for consistent error handling.

Fixes #8221 (and duplicate #8156)

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
2026-01-12 10:57:49 -08:00
github-actions[bot]
05ec02e384 fix: handle Anthropic 413 request_too_large as ContextWindowExceededError (#8424)
The Anthropic API returns a 413 status code with error type `request_too_large`
when the request payload exceeds the maximum allowed size. This error should
be converted to `ContextWindowExceededError` so the system can handle it
appropriately (e.g., by summarizing the conversation to reduce context size).

Changes:
- Added `request_too_large` and `request exceeds the maximum size` to the
  early string-based error detection in `handle_llm_error`
- Added specific handling for HTTP 413 status code in the `APIStatusError`
  handler
- Added tests to verify the new error handling behavior

Fixes: #8422

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: datadog-official[bot] <datadog-official[bot]@users.noreply.github.com>
Co-authored-by: Kian Jones <11655409+kianjones9@users.noreply.github.com>
2026-01-12 10:57:48 -08:00
Ari Webb
02f3e3f3b9 fix: fix providers and models persistence (#8302) 2026-01-12 10:57:48 -08:00
Ari Webb
cc825b4f5c Revert "Revert "feat: enable provider models persistence" (#6590)" (#6595) 2026-01-12 10:57:48 -08:00
github-actions[bot]
76008c61f4 fix: handle httpx.RemoteProtocolError during LLM streaming (#8206) 2026-01-12 10:57:48 -08:00
Sarah Wooders
256dfb31cf fix: attempt to fix /context again (#7946) 2026-01-12 10:57:20 -08:00
Sarah Wooders
f512d13bc9 feat: test token counting (#7943) 2026-01-12 10:57:20 -08:00
Kevin Lin
a1dfedfb0b feat: Add structured outputs for Anthropic (#7495) 2026-01-12 10:57:19 -08:00
jnjpng
a1fb295e69 feat: enable claude pro max plan for letta code (#7663)
* base

* more

* rev
2026-01-12 10:57:19 -08:00
Ari Webb
cd45212acb feat: add zai provider support (#7626)
* feat: add zai provider support

* add zai_api_key secret to deploy-core

* add to justfile

* add testing, provider integration skill

* enable zai key

* fix zai test

* clean up skill a little

* small changes
2026-01-12 10:57:19 -08:00
Kevin Lin
33afb930fc fix: Google AI client logging as Vertex (#7337)
fix
2025-12-17 17:32:27 -08:00
Ari Webb
e47eb23522 fix: max output tokens for gemini 3 models (#7322) 2025-12-17 17:31:03 -08:00
Sarah Wooders
8729a037b9 fix: handle new openai overflow error format (#7110) 2025-12-17 17:31:02 -08:00
Charles Packer
33d39f4643 fix(core): patch usage data tracking for anthropic when context caching is on (#6997) 2025-12-15 12:03:09 -08:00
Kevin Lin
4b9485a484 feat: Add max tokens exceeded to stop reasons [LET-6480] (#6576) 2025-12-15 12:03:09 -08:00
Ari Webb
4d90f37f50 feat: add gpt-5.2 support (#6698) 2025-12-15 12:02:34 -08:00
Sarah Wooders
c8fa77a01f feat: cleanup cancellation code and add more logging (#6588) 2025-12-15 12:02:34 -08:00
Sarah Wooders
8440e319e2 Revert "feat: enable provider models persistence" (#6590)
Revert "feat: enable provider models persistence (#6193)"

This reverts commit 9682aff32640a6ee8cf71a6f18c9fa7cda25c40e.
2025-12-15 12:02:34 -08:00
Ari Webb
848a73125c feat: enable provider models persistence (#6193)
* Revert "fix test"

This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1.

* fix server and better test

* test fix, get api key for base and byok?

* set letta default endpoint

* try to fix timeout for test

* fix for letta api key

* Delete apps/core/tests/sdk_v1/conftest.py

* Update utils.py

* clean up a few issues

* fix filterning on list_llm_models

* soft delete models with provider

* add one more test

* fix ci

* add timeout

* band aid for letta embedding provider

* info instead of error logs when creating models
2025-12-15 12:02:34 -08:00
Devansh Jain
d1536df6f6 chore: Update deepseek client for v3.2 models (#6556)
* support for v3.2 models

* streaming + context window fix

* fix for no assitant text from deepseek
2025-12-15 12:02:34 -08:00
Kian Jones
d6292b6eb6 fix: bug which causes unrecoverable state if previous message was an image (#6486)
* trying tout gpt-5.1-codex

* add unit test for message content

* try to support multimodal
2025-12-15 12:02:33 -08:00
Sarah Wooders
91e3dd8b3e feat: fix new summarizer code and add more tests (#6461) 2025-12-15 12:02:19 -08:00
Charles Packer
e67c98eedb feat: add tests for prompt caching + fix anthropic prompt caching [LET-6373] (#6454)
* feat: add tests for prompt caching

* fix: add cache control breakpoints for anthropic + fix tests

* fix: silence logging

* fix: patch token counting error

* fix: same patch on non-streaming path
2025-12-15 12:02:19 -08:00
Charles Packer
88a3743cc8 fix(core): distinguish between null and 0 for prompt caching (#6451)
* fix(core): distinguish between null and 0 for prompt caching

* fix: runtime errors

* fix: just publish just sgate
2025-12-15 12:02:19 -08:00
Charles Packer
131891e05f feat: add tracking of advanced usage data (eg caching) [LET-6372] (#6449)
* feat: init refactor

* feat: add helper code

* fix: missing file + test

* fix: just state/publish api
2025-12-15 12:02:19 -08:00
Sarah Wooders
807c5c18d9 feat: add gemini token counting [LET-6371] (#6444) 2025-12-15 12:02:19 -08:00
jnjpng
c6df306ccf fix: upgrade google-genai sdk version and fix gemini 3 streaming (#6437)
* base

* base

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:02:18 -08:00
Ari Webb
d4e7428c98 feat: structured outputs for anthropic [LET-6232] (#6410)
feat: structured outputs for anthropic

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-26 14:39:40 -08:00
Ari Webb
89c7ab5f14 feat: structured outputs for openai [LET-6233] (#6363)
* first hack with test

* remove changes integration test

* Delete apps/core/tests/sdk_v1/integration/integration_test_send_message_v2.py

* add test

* remove comment

* stage and publish api

* deprecate base level response_schema

* add param to llm_config test

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-26 14:39:39 -08:00
Charles Packer
c4699b3d17 feat: add support for opus 4.5 (#6256)
* feat: add support for new model

* fix: just stage-api && just publish-api (anthropic model settings changed)

* fix: just stage-api && just publish-api (anthropic model settings changed)

* fix: make kevlar have default reasoning on

* fix: bump anthropic sdk version

* fix: patch name

* pin newer version anthropic

---------

Co-authored-by: Ari Webb <ari@letta.com>
2025-11-24 19:10:27 -08:00
jnjpng
9adeb122cd feat: set thinking level high for gemini 3 (#6330)
* base

* include thoughts

* update

* clean

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-11-24 19:10:27 -08:00
cthomas
d903056a43 fix: list indices must be int not str error [LET-6254] (#6302)
fix: list indices must be int not str error
2025-11-24 19:10:26 -08:00
cthomas
1c358c863e fix: google api sync request (#6292)
* fix: google api sync request

* remove unused function
2025-11-24 19:10:26 -08:00
cthomas
2ebf863998 fix: type error in agent step (#6287) 2025-11-24 19:10:26 -08:00