feat: centralize telemetry logging at LLM client level (#8815)

* feat: centralize telemetry logging at LLM client level

Moves telemetry logging from individual adapters to LLMClientBase:
- Add TelemetryStreamWrapper for streaming telemetry on stream close
- Add request_async_with_telemetry() for non-streaming requests
- Add stream_async_with_telemetry() for streaming requests
- Add set_telemetry_context() to configure agent_id, run_id, step_id

Updates adapters and agents to use new pattern:
- LettaLLMAdapter now accepts agent_id/run_id in constructor
- Adapters call set_telemetry_context() before LLM requests
- Removes duplicate telemetry logging from adapters
- Enriches traces with agent_id, run_id, call_type metadata
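
A minimal sketch of the pattern described above, not the actual Letta code: the client stores a telemetry context once, and a wrapper method logs one trace per request. Only the method names `set_telemetry_context` and `request_async_with_telemetry` come from this commit; `SketchClient`, `InMemoryTelemetry`, and the trace dictionary layout are assumptions made for illustration.

```python
from __future__ import annotations

import asyncio
from typing import Any


class InMemoryTelemetry:
    """Hypothetical telemetry sink that keeps traces in a list."""

    def __init__(self) -> None:
        self.traces: list[dict[str, Any]] = []

    async def log(self, trace: dict[str, Any]) -> None:
        self.traces.append(trace)


class SketchClient:
    """Hypothetical stand-in for a base LLM client (not the real LLMClientBase)."""

    def __init__(self) -> None:
        self._telemetry: InMemoryTelemetry | None = None
        self._context: dict[str, Any] = {}

    def set_telemetry_context(self, telemetry_manager: InMemoryTelemetry | None = None, **context: Any) -> None:
        # Remember who is calling so every later trace carries the same ids.
        self._telemetry = telemetry_manager
        self._context = context

    async def request_async(self, request_data: dict[str, Any]) -> dict[str, Any]:
        # Placeholder for the provider call; it simply echoes the prompt.
        return {"echo": request_data["prompt"]}

    async def request_async_with_telemetry(self, request_data: dict[str, Any]) -> dict[str, Any]:
        response = await self.request_async(request_data)
        if self._telemetry is not None:
            # One trace per request, enriched with the stored context.
            await self._telemetry.log({"request": request_data, "response": response, **self._context})
        return response


async def demo() -> tuple[dict[str, Any], list[dict[str, Any]]]:
    sink = InMemoryTelemetry()
    client = SketchClient()
    client.set_telemetry_context(
        telemetry_manager=sink,
        agent_id="agent-1",
        run_id="run-1",
        step_id="step-1",
        call_type="agent_step",
    )
    response = await client.request_async_with_telemetry({"prompt": "hi"})
    return response, sink.traces


demo_response, demo_traces = asyncio.run(demo())
```

With logging kept in one place, adapters only need to set the context before each request.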

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: accumulate streaming response content for telemetry

TelemetryStreamWrapper now extracts actual response data from chunks:
- Content text (concatenated from deltas)
- Tool calls (id, name, arguments)
- Model name, finish reason, usage stats
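
The accumulation described above could look roughly like the sketch below. The chunk layout (plain dicts with `content`, `tool_calls`, `model`, `finish_reason`, and `usage` keys) and the `StreamSummary` class are assumptions for illustration; real provider chunks have their own shapes, and this is not the removed `TelemetryStreamWrapper`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class StreamSummary:
    """Hypothetical accumulator for what a stream produced."""

    content: str = ""
    tool_calls: dict[int, dict[str, str]] = field(default_factory=dict)
    model: str | None = None
    finish_reason: str | None = None
    usage: dict[str, int] | None = None

    def add(self, chunk: dict[str, Any]) -> None:
        # Text arrives as deltas, so concatenate.
        self.content += chunk.get("content", "")
        # Tool calls arrive in pieces keyed by index; arguments are concatenated.
        for delta in chunk.get("tool_calls", []):
            call = self.tool_calls.setdefault(
                delta["index"], {"id": "", "name": "", "arguments": ""}
            )
            call["id"] = delta.get("id") or call["id"]
            call["name"] = delta.get("name") or call["name"]
            call["arguments"] += delta.get("arguments", "")
        # Scalar fields: keep the latest non-empty value seen.
        self.model = chunk.get("model") or self.model
        self.finish_reason = chunk.get("finish_reason") or self.finish_reason
        self.usage = chunk.get("usage") or self.usage


chunks = [
    {"model": "example-model", "content": "Hel"},
    {"content": "lo", "tool_calls": [{"index": 0, "id": "call_1", "name": "search", "arguments": '{"q":'}]},
    {"tool_calls": [{"index": 0, "arguments": ' "x"}'}], "finish_reason": "tool_calls", "usage": {"total_tokens": 12}},
]

summary = StreamSummary()
for chunk in chunks:
    summary.add(chunk)
```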

* refactor: move streaming telemetry to caller (option 3)

- Remove TelemetryStreamWrapper class
- Add log_provider_trace_async() helper to LLMClientBase
- stream_async_with_telemetry() now returns the raw stream and does no logging
- Callers log telemetry after processing the stream, using the richer data from the streaming interface

Updated callers:
- summarizer.py: logs content + usage after stream processing
- letta_agent.py: logs tool_call, reasoning, model, usage
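
A rough sketch of the caller-side pattern, under stated assumptions: the caller consumes the raw stream first, then logs a single trace with what it collected. Only the name `log_provider_trace_async` comes from this commit; its signature here, the `fake_stream` generator, and the chunk fields are hypothetical.

```python
import asyncio
from typing import Any, AsyncIterator

# Hypothetical in-memory store standing in for a telemetry backend.
logged: list[dict[str, Any]] = []


async def log_provider_trace_async(request_data: dict[str, Any], response_json: dict[str, Any]) -> None:
    """Assumed signature; the real helper lives on the LLM client."""
    logged.append({"request": request_data, "response": response_json})


async def fake_stream() -> AsyncIterator[dict[str, Any]]:
    # Stand-in for the raw provider stream returned to the caller.
    for piece in ("Hel", "lo"):
        yield {"content": piece, "tokens": 1}


async def summarize(request_data: dict[str, Any]) -> str:
    content, completion_tokens = "", 0
    async for chunk in fake_stream():
        content += chunk["content"]
        completion_tokens += chunk["tokens"]
    # Log once, after processing, when content and usage are fully known.
    await log_provider_trace_async(
        request_data,
        {"content": content, "usage": {"completion_tokens": completion_tokens}},
    )
    return content


result = asyncio.run(summarize({"prompt": "greet"}))
```

Logging after the loop means the trace reflects what the caller actually parsed, at the cost of each caller having to remember to log.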

* fix: pass agent_id and run_id to parent adapter class

LettaLLMStreamAdapter was not passing agent_id/run_id to parent,
causing "unexpected keyword argument" errors.
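
One way this kind of error can arise, shown with hypothetical classes (`Parent`, `BrokenChild`, `FixedChild`) rather than the real adapters: the subclass constructor does not accept or forward the new keyword arguments, so a caller that passes them gets a `TypeError`.

```python
class Parent:
    def __init__(self, client, config, agent_id=None, run_id=None):
        self.client = client
        self.config = config
        self.agent_id = agent_id
        self.run_id = run_id


class BrokenChild(Parent):
    # Signature was not updated, so agent_id is rejected at call time.
    def __init__(self, client, config, run_id=None):
        super().__init__(client, config)
        self.run_id = run_id


class FixedChild(Parent):
    # Accepts both ids and forwards them to the parent constructor.
    def __init__(self, client, config, agent_id=None, run_id=None):
        super().__init__(client, config, agent_id=agent_id, run_id=run_id)


try:
    BrokenChild("client", "config", agent_id="agent-1", run_id="run-1")
    broken_error = ""
except TypeError as exc:
    broken_error = str(exc)

fixed = FixedChild("client", "config", agent_id="agent-1", run_id="run-1")
```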

---------

Co-authored-by: Letta <noreply@letta.com>
Kian Jones
2026-01-16 22:23:48 -08:00
committed by Sarah Wooders
parent 9418ab9815
commit a92e868ee6
10 changed files with 216 additions and 19 deletions


@@ -20,9 +20,17 @@ class LettaLLMAdapter(ABC):
     through a consistent API.
     """
-    def __init__(self, llm_client: LLMClientBase, llm_config: LLMConfig) -> None:
+    def __init__(
+        self,
+        llm_client: LLMClientBase,
+        llm_config: LLMConfig,
+        agent_id: str | None = None,
+        run_id: str | None = None,
+    ) -> None:
         self.llm_client: LLMClientBase = llm_client
         self.llm_config: LLMConfig = llm_config
+        self.agent_id: str | None = agent_id
+        self.run_id: str | None = run_id
         self.message_id: str | None = None
         self.request_data: dict | None = None
         self.response_data: dict | None = None


@@ -26,9 +26,8 @@ class LettaLLMStreamAdapter(LettaLLMAdapter):
     specific streaming formats.
     """
-    def __init__(self, llm_client: LLMClientBase, llm_config: LLMConfig, run_id: str | None = None) -> None:
-        super().__init__(llm_client, llm_config)
-        self.run_id = run_id
+    def __init__(self, llm_client: LLMClientBase, llm_config: LLMConfig, agent_id: str | None = None, run_id: str | None = None) -> None:
+        super().__init__(llm_client, llm_config, agent_id=agent_id, run_id=run_id)
         self.interface: OpenAIStreamingInterface | AnthropicStreamingInterface | None = None
     async def invoke_llm(


@@ -38,9 +38,16 @@ class SimpleLLMRequestAdapter(LettaLLMRequestAdapter):
         # Store request data
         self.request_data = request_data
-        # Make the blocking LLM request
+        # Set telemetry context and make the blocking LLM request
+        self.llm_client.set_telemetry_context(
+            telemetry_manager=self.telemetry_manager,
+            step_id=step_id,
+            agent_id=self.agent_id,
+            run_id=self.run_id,
+            call_type="agent_step",
+        )
         try:
-            self.response_data = await self.llm_client.request_async(request_data, self.llm_config)
+            self.response_data = await self.llm_client.request_async_with_telemetry(request_data, self.llm_config)
         except Exception as e:
             raise self.llm_client.handle_llm_error(e)