feat: centralize telemetry logging at LLM client level (#8815)

* feat: centralize telemetry logging at LLM client level

  Moves telemetry logging from individual adapters to LLMClientBase:
  - Add TelemetryStreamWrapper for streaming telemetry on stream close
  - Add request_async_with_telemetry() for non-streaming requests
  - Add stream_async_with_telemetry() for streaming requests
  - Add set_telemetry_context() to configure agent_id, run_id, step_id

  Updates adapters and agents to use new pattern:
  - LettaLLMAdapter now accepts agent_id/run_id in constructor
  - Adapters call set_telemetry_context() before LLM requests
  - Removes duplicate telemetry logging from adapters
  - Enriches traces with agent_id, run_id, call_type metadata

  🐙 Generated with [Letta Code](https://letta.com)

  Co-Authored-By: Letta <noreply@letta.com>

* fix: accumulate streaming response content for telemetry

  TelemetryStreamWrapper now extracts actual response data from chunks:
  - Content text (concatenated from deltas)
  - Tool calls (id, name, arguments)
  - Model name, finish reason, usage stats

  🐙 Generated with [Letta Code](https://letta.com)

  Co-Authored-By: Letta <noreply@letta.com>

* refactor: move streaming telemetry to caller (option 3)

  - Remove TelemetryStreamWrapper class
  - Add log_provider_trace_async() helper to LLMClientBase
  - stream_async_with_telemetry() now just returns raw stream
  - Callers log telemetry after processing with rich interface data

  Updated callers:
  - summarizer.py: logs content + usage after stream processing
  - letta_agent.py: logs tool_call, reasoning, model, usage

  🐙 Generated with [Letta Code](https://letta.com)

  Co-Authored-By: Letta <noreply@letta.com>

* fix: pass agent_id and run_id to parent adapter class

  LettaLLMStreamAdapter was not passing agent_id/run_id to parent,
  causing "unexpected keyword argument" errors.

  🐙 Generated with [Letta Code](https://letta.com)

  Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-16 22:23:48 -08:00
parent 9418ab9815
commit a92e868ee6
10 changed files with 216 additions and 19 deletions
--- a/letta/adapters/letta_llm_adapter.py
+++ b/letta/adapters/letta_llm_adapter.py
@@ -20,9 +20,17 @@ class LettaLLMAdapter(ABC):
    through a consistent API.
    """

-    def __init__(self, llm_client: LLMClientBase, llm_config: LLMConfig) -> None:
+    def __init__(
+        self,
+        llm_client: LLMClientBase,
+        llm_config: LLMConfig,
+        agent_id: str | None = None,
+        run_id: str | None = None,
+    ) -> None:
        self.llm_client: LLMClientBase = llm_client
        self.llm_config: LLMConfig = llm_config
+        self.agent_id: str | None = agent_id
+        self.run_id: str | None = run_id
        self.message_id: str | None = None
        self.request_data: dict | None = None
        self.response_data: dict | None = None
--- a/letta/adapters/letta_llm_stream_adapter.py
+++ b/letta/adapters/letta_llm_stream_adapter.py
@@ -26,9 +26,8 @@ class LettaLLMStreamAdapter(LettaLLMAdapter):
    specific streaming formats.
    """

-    def __init__(self, llm_client: LLMClientBase, llm_config: LLMConfig, run_id: str | None = None) -> None:
-        super().__init__(llm_client, llm_config)
-        self.run_id = run_id
+    def __init__(self, llm_client: LLMClientBase, llm_config: LLMConfig, agent_id: str | None = None, run_id: str | None = None) -> None:
+        super().__init__(llm_client, llm_config, agent_id=agent_id, run_id=run_id)
        self.interface: OpenAIStreamingInterface | AnthropicStreamingInterface | None = None

    async def invoke_llm(
--- a/letta/adapters/simple_llm_request_adapter.py
+++ b/letta/adapters/simple_llm_request_adapter.py
@@ -38,9 +38,16 @@ class SimpleLLMRequestAdapter(LettaLLMRequestAdapter):
        # Store request data
        self.request_data = request_data

-        # Make the blocking LLM request
+        # Set telemetry context and make the blocking LLM request
+        self.llm_client.set_telemetry_context(
+            telemetry_manager=self.telemetry_manager,
+            step_id=step_id,
+            agent_id=self.agent_id,
+            run_id=self.run_id,
+            call_type="agent_step",
+        )
        try:
-            self.response_data = await self.llm_client.request_async(request_data, self.llm_config)
+            self.response_data = await self.llm_client.request_async_with_telemetry(request_data, self.llm_config)
        except Exception as e:
            raise self.llm_client.handle_llm_error(e)