feat: centralize telemetry logging at LLM client level (#8815)

* feat: centralize telemetry logging at LLM client level

Moves telemetry logging from individual adapters to LLMClientBase:
- Add TelemetryStreamWrapper for streaming telemetry on stream close
- Add request_async_with_telemetry() for non-streaming requests
- Add stream_async_with_telemetry() for streaming requests
- Add set_telemetry_context() to configure agent_id, run_id, step_id

Updates adapters and agents to use new pattern:
- LettaLLMAdapter now accepts agent_id/run_id in constructor
- Adapters call set_telemetry_context() before LLM requests
- Removes duplicate telemetry logging from adapters
- Enriches traces with agent_id, run_id, call_type metadata
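The pattern above can be sketched as follows. This is a hedged, minimal stand-in, not the real Letta classes: the names `SketchLLMClient`, `SketchAdapter`, and `prepare_request` are illustrative, and only the shape of the interaction (adapter takes ids in its constructor, sets context on the client before the request) follows the description above.

```python
# Hedged sketch of the new adapter pattern; classes are illustrative
# stand-ins, not the real LLMClientBase / LettaLLMAdapter code.
from typing import Optional


class SketchLLMClient:
    """Stand-in for LLMClientBase: holds telemetry context set by callers."""

    def __init__(self):
        self._telemetry_agent_id: Optional[str] = None
        self._telemetry_run_id: Optional[str] = None
        self._telemetry_call_type: Optional[str] = None

    def set_telemetry_context(self, agent_id=None, run_id=None, call_type=None):
        self._telemetry_agent_id = agent_id
        self._telemetry_run_id = run_id
        self._telemetry_call_type = call_type


class SketchAdapter:
    """Stand-in for an adapter: accepts agent_id/run_id in the constructor."""

    def __init__(self, client: SketchLLMClient, agent_id: str, run_id: str):
        self.client = client
        self.agent_id = agent_id
        self.run_id = run_id

    def prepare_request(self):
        # The adapter sets the context once, before issuing the LLM
        # request, instead of logging telemetry itself.
        self.client.set_telemetry_context(
            agent_id=self.agent_id, run_id=self.run_id, call_type="chat_completion"
        )


client = SketchLLMClient()
adapter = SketchAdapter(client, agent_id="agent-123", run_id="run-456")
adapter.prepare_request()
```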

🐙 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: accumulate streaming response content for telemetry

TelemetryStreamWrapper now extracts actual response data from chunks:
- Content text (concatenated from deltas)
- Tool calls (id, name, arguments)
- Model name, finish reason, usage stats
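The accumulation described above can be sketched like this; assuming OpenAI-style streaming chunks, with a hypothetical `accumulate_chunks` helper standing in for the wrapper's logic (the real class was later removed in favor of caller-side logging):

```python
# Hedged sketch of delta accumulation over OpenAI-style streaming
# chunks; accumulate_chunks is a hypothetical helper, not Letta code.
def accumulate_chunks(chunks):
    content_parts = []
    tool_calls = {}  # keyed by index so partial tool_call deltas merge
    finish_reason = None
    for chunk in chunks:
        delta = chunk.get("delta", {})
        if delta.get("content"):
            content_parts.append(delta["content"])
        for tc in delta.get("tool_calls", []):
            entry = tool_calls.setdefault(
                tc.get("index", 0), {"id": None, "name": None, "arguments": ""}
            )
            if tc.get("id"):
                entry["id"] = tc["id"]
            fn = tc.get("function", {})
            if fn.get("name"):
                entry["name"] = fn["name"]
            entry["arguments"] += fn.get("arguments", "")
        if chunk.get("finish_reason"):
            finish_reason = chunk["finish_reason"]
    return {
        "content": "".join(content_parts),
        "tool_calls": list(tool_calls.values()),
        "finish_reason": finish_reason,
    }


chunks = [
    {"delta": {"content": "Hel"}},
    {"delta": {"content": "lo"}},
    {"delta": {"tool_calls": [{"index": 0, "id": "call_1",
                               "function": {"name": "search", "arguments": '{"q":'}}]}},
    {"delta": {"tool_calls": [{"index": 0, "function": {"arguments": '"x"}'}}]}},
    {"delta": {}, "finish_reason": "tool_calls"},
]
result = accumulate_chunks(chunks)
```

Keying tool calls by `index` lets argument fragments from later deltas append onto the right call.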


* refactor: move streaming telemetry to caller (option 3)

- Remove TelemetryStreamWrapper class
- Add log_provider_trace_async() helper to LLMClientBase
- stream_async_with_telemetry() now just returns raw stream
- Callers log telemetry after processing with rich interface data

Updated callers:
- summarizer.py: logs content + usage after stream processing
- letta_agent.py: logs tool_call, reasoning, model, usage
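The "option 3" caller flow above can be sketched as follows; `SketchClient` and its method bodies are illustrative assumptions, but the sequence (get raw stream, consume it, then log with the extracted data) mirrors the description:

```python
# Hedged sketch of the caller-side telemetry pattern; SketchClient is a
# stand-in, not LLMClientBase. Only the call sequence follows the commit.
import asyncio


class SketchClient:
    def __init__(self):
        self.traces = []

    async def stream_async_with_telemetry(self, request_data, llm_config=None):
        # Returns the raw stream untouched; no wrapper is involved.
        async def stream():
            for part in ["Hello", ", ", "world"]:
                yield part
        return stream()

    async def log_provider_trace_async(self, request_data, response_json):
        # The real helper forwards to TelemetryManager using the context
        # set via set_telemetry_context(); here we just record it.
        self.traces.append({"request": request_data, "response": response_json})


async def caller():
    client = SketchClient()
    request = {"messages": [{"role": "user", "content": "hi"}]}
    stream = await client.stream_async_with_telemetry(request)
    parts = [p async for p in stream]        # caller processes the stream
    content = "".join(parts)
    await client.log_provider_trace_async(   # telemetry logged afterwards,
        request, {"content": content}        # with the rich extracted data
    )
    return client.traces


traces = asyncio.run(caller())
```

Moving logging to the caller means the trace can include interface-level data (tool calls, reasoning, usage) that a generic stream wrapper cannot see.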


* fix: pass agent_id and run_id to parent adapter class

LettaLLMStreamAdapter was not passing agent_id/run_id to parent,
causing "unexpected keyword argument" errors.
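A minimal reproduction of the shape of this fix, with illustrative class names (`BaseAdapter`/`StreamAdapter` stand in for the real hierarchy): the subclass must both accept the ids and forward them to the parent, otherwise callers passing `agent_id`/`run_id` hit a `TypeError`.

```python
# Hedged sketch of the keyword-forwarding fix; class names are
# illustrative, not the real LettaLLMStreamAdapter hierarchy.
from typing import Optional


class BaseAdapter:
    def __init__(self, agent_id: Optional[str] = None, run_id: Optional[str] = None):
        self.agent_id = agent_id
        self.run_id = run_id


class StreamAdapter(BaseAdapter):
    def __init__(self, agent_id: Optional[str] = None, run_id: Optional[str] = None):
        # The fix: forward the ids to the parent instead of dropping them.
        super().__init__(agent_id=agent_id, run_id=run_id)


adapter = StreamAdapter(agent_id="agent-1", run_id="run-1")
```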


---------

Co-authored-by: Letta <noreply@letta.com>
Author: Kian Jones
Date: 2026-01-16 22:23:48 -08:00
Committed by: Sarah Wooders
Parent: 9418ab9815
Commit: a92e868ee6
10 changed files with 216 additions and 19 deletions


@@ -37,6 +37,103 @@ class LLMClientBase:
        self.actor = actor
        self.put_inner_thoughts_first = put_inner_thoughts_first
        self.use_tool_naming = use_tool_naming
        self._telemetry_manager: Optional["TelemetryManager"] = None
        self._telemetry_agent_id: Optional[str] = None
        self._telemetry_run_id: Optional[str] = None
        self._telemetry_step_id: Optional[str] = None
        self._telemetry_call_type: Optional[str] = None

    def set_telemetry_context(
        self,
        telemetry_manager: Optional["TelemetryManager"] = None,
        agent_id: Optional[str] = None,
        run_id: Optional[str] = None,
        step_id: Optional[str] = None,
        call_type: Optional[str] = None,
    ) -> None:
        """Set telemetry context for provider trace logging."""
        self._telemetry_manager = telemetry_manager
        self._telemetry_agent_id = agent_id
        self._telemetry_run_id = run_id
        self._telemetry_step_id = step_id
        self._telemetry_call_type = call_type

    async def request_async_with_telemetry(self, request_data: dict, llm_config: LLMConfig) -> dict:
        """Wrapper around request_async that logs telemetry for all requests, including errors.

        Call set_telemetry_context() first to set agent_id, run_id, etc.
        """
        from letta.log import get_logger

        logger = get_logger(__name__)
        response_data = None
        error_msg = None
        try:
            response_data = await self.request_async(request_data, llm_config)
            return response_data
        except Exception as e:
            error_msg = str(e)
            raise
        finally:
            if self._telemetry_manager and settings.track_provider_trace:
                if self.actor is None:
                    logger.warning(f"Skipping telemetry: actor is None (call_type={self._telemetry_call_type})")
                else:
                    try:
                        pydantic_actor = self.actor.to_pydantic() if hasattr(self.actor, "to_pydantic") else self.actor
                        await self._telemetry_manager.create_provider_trace_async(
                            actor=pydantic_actor,
                            provider_trace=ProviderTrace(
                                request_json=request_data,
                                response_json=response_data if response_data else {"error": error_msg},
                                step_id=self._telemetry_step_id,
                                agent_id=self._telemetry_agent_id,
                                run_id=self._telemetry_run_id,
                                call_type=self._telemetry_call_type,
                            ),
                        )
                    except Exception as e:
                        logger.warning(f"Failed to log telemetry: {e}")

    async def stream_async_with_telemetry(self, request_data: dict, llm_config: LLMConfig):
        """Return the raw stream; the caller logs telemetry after processing via log_provider_trace_async().

        Call set_telemetry_context() first to set agent_id, run_id, etc.
        After consuming the stream, call log_provider_trace_async() with the response data.
        """
        return await self.stream_async(request_data, llm_config)

    async def log_provider_trace_async(self, request_data: dict, response_json: dict) -> None:
        """Log provider trace telemetry. Call after processing the LLM response.

        Uses the telemetry context set via set_telemetry_context().
        """
        from letta.log import get_logger

        logger = get_logger(__name__)
        if not self._telemetry_manager or not settings.track_provider_trace:
            return
        if self.actor is None:
            logger.warning(f"Skipping telemetry: actor is None (call_type={self._telemetry_call_type})")
            return
        try:
            pydantic_actor = self.actor.to_pydantic() if hasattr(self.actor, "to_pydantic") else self.actor
            await self._telemetry_manager.create_provider_trace_async(
                actor=pydantic_actor,
                provider_trace=ProviderTrace(
                    request_json=request_data,
                    response_json=response_json,
                    step_id=self._telemetry_step_id,
                    agent_id=self._telemetry_agent_id,
                    run_id=self._telemetry_run_id,
                    call_type=self._telemetry_call_type,
                ),
            )
        except Exception as e:
            logger.warning(f"Failed to log telemetry: {e}")

    @trace_method
    async def send_llm_request(