refactor: extract compact logic to shared function for temporal (#9249)
* refactor: extract compact logic to shared function

  Extract the compaction logic from LettaAgentV3.compact() into a standalone
  compact_messages() function that can be shared between the agent and temporal
  workflows.

  Changes:
  - Create apps/core/letta/services/summarizer/compact.py with:
    - compact_messages(): Core compaction logic
    - build_summarizer_llm_config(): LLM config builder for summarization
    - CompactResult: Dataclass for compaction results
  - Update LettaAgentV3.compact() to use compact_messages()
  - Update temporal summarize_conversation_history activity to use
    compact_messages() instead of the old Summarizer class
  - Add use_summary_role parameter to SummarizeParams

  This ensures consistent summarization behavior across different execution
  paths and prevents drift as we improve the implementation.

* chore: clean up verbose comments

* fix: correct CompactionSettings import path

* fix: correct count_tokens import from summarizer_sliding_window

* fix: update test patch path for count_tokens_with_tools

  After extracting compact logic to compact.py, the test was patching the old
  location. Update the patch path to the new module location.

* fix: update test to use build_summarizer_llm_config from compact.py

  The function was moved from LettaAgentV3._build_summarizer_llm_config to
  compact.py as a standalone function.

* fix: add early check for system prompt size in compact_messages

  Check whether the system prompt alone exceeds the context window before
  attempting summarization. The system prompt cannot be compacted, so fail
  fast with SystemPromptTokenExceededError.

* fix: properly propagate SystemPromptTokenExceededError from compact

  The exception handler in _step() was not setting the correct stop_reason for
  SystemPromptTokenExceededError, which caused the finally block to return
  early and swallow the exception. Add special handling to set stop_reason to
  context_window_overflow_in_system_prompt when SystemPromptTokenExceededError
  is caught.
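The extracted module's shape, as described above, can be sketched as follows. This is an illustrative stand-in only: the actual signatures and fields in apps/core/letta/services/summarizer/compact.py are not shown in this commit message, so everything below is an assumption, not the Letta API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the shared compaction module described in the
# commit message; names mirror the commit text, bodies are stand-ins.
@dataclass
class CompactResult:
    """Result of a compaction pass (illustrative fields)."""
    summary_text: str
    summary_role: str
    evicted_message_ids: list = field(default_factory=list)

def build_summarizer_llm_config(agent_llm_config, summarizer_config=None):
    # Fall back to the agent's own LLM config when no summarizer
    # override is configured.
    return summarizer_config or agent_llm_config

def compact_messages(messages, llm_config, use_summary_role=False):
    # Stand-in for the shared compaction entry point used by both the
    # agent and the temporal workflow. A real implementation would call
    # the summarizer model via llm_config; here we just join contents.
    summary = " ".join(m["content"] for m in messages)
    role = "summary" if use_summary_role else "user"
    return CompactResult(
        summary_text=summary,
        summary_role=role,
        evicted_message_ids=[m["id"] for m in messages],
    )

result = compact_messages(
    [{"id": "m1", "content": "hi"}, {"id": "m2", "content": "there"}],
    llm_config={"model": "gpt-4o-mini"},
)
print(result.summary_text)          # hi there
print(result.evicted_message_ids)   # ['m1', 'm2']
```

Both call sites (the agent and the temporal activity) would then share one function, which is what keeps the two execution paths from drifting.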
* revert: remove redundant SystemPromptTokenExceededError handling

  The special handling in the outer exception handler is redundant because
  stop_reason is already set in the inner handler at line 943. The actual fix
  for the test was the early check in compact_messages(), not this redundant
  handling.

* fix: correctly re-raise SystemPromptTokenExceededError

  The inner exception handler was using 'raise e', which re-raised the outer
  ContextWindowExceededError instead of the current
  SystemPromptTokenExceededError. Changed to 'raise' to correctly re-raise the
  current exception. This bug was pre-existing but masked because
  _check_for_system_prompt_overflow was only called as a fallback. The new
  early check in compact_messages() exposed it.

* revert: remove early check and restore raise e to match main behavior

* fix: set should_continue=False and correctly re-raise exception

  - Add should_continue=False in the SystemPromptTokenExceededError handler
    (matching main's _check_for_system_prompt_overflow behavior)
  - Fix raise e -> raise to correctly propagate SystemPromptTokenExceededError

  Note: test_large_system_prompt_summarization still fails locally but passes
  on main. Need to investigate why the exception isn't propagating correctly
  on the refactored branch.

* fix: add SystemPromptTokenExceededError handler for post-step compaction

  The post-step compaction (line 1066) was missing a
  SystemPromptTokenExceededError exception handler. When compact_messages()
  raised this error, it would be caught by the outer exception handler, which
  would:
  1. Set stop_reason to "error" instead of
     "context_window_overflow_in_system_prompt"
  2. Not set should_continue = False
  3. Get swallowed by the finally block (line 1126), which returns early

  This caused test_large_system_prompt_summarization to fail because the
  exception never propagated to the test. The fix adds the same exception
  handler pattern used in the retry compaction flow (lines 941-946), ensuring
  proper state is set before re-raising.

  This issue only affected the refactored code because on main,
  _check_for_system_prompt_overflow() was an instance method that set
  should_continue/stop_reason BEFORE raising. In the refactor,
  compact_messages() is a standalone function that cannot set instance state,
  so the caller must handle the exception and set the state.
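The 'raise e' vs bare 'raise' distinction behind these fixes can be demonstrated with a minimal self-contained example. The exception names mirror the ones in the commit message, but they are plain stand-in classes here, not the Letta exceptions:

```python
class ContextWindowExceededError(Exception):
    pass

class SystemPromptTokenExceededError(Exception):
    pass

def run(use_bare_raise: bool) -> str:
    """Return the name of the exception that escapes the nested handlers."""
    try:
        try:
            raise ContextWindowExceededError("context overflow")
        except ContextWindowExceededError as e:
            # Retry path: compaction itself fails with a different error.
            try:
                raise SystemPromptTokenExceededError("system prompt too large")
            except SystemPromptTokenExceededError:
                if use_bare_raise:
                    raise    # re-raises the *current* exception
                raise e      # bug: re-raises the *outer* exception instead
    except Exception as caught:
        return type(caught).__name__

print(run(True))   # SystemPromptTokenExceededError
print(run(False))  # ContextWindowExceededError
```

A bare `raise` inside an `except` block re-raises the exception currently being handled, which is what the fix relies on to let SystemPromptTokenExceededError propagate to the caller.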
@@ -1046,10 +1046,10 @@ async def test_v3_summarize_hard_eviction_when_still_over_threshold(
     # summarize_conversation_history to run and then hit the branch where the
     # *post*-summarization token count is still above the proactive
     # summarization threshold. We simulate that by patching the
-    # letta_agent_v3-level count_tokens_with_tools helper to report an extremely large
+    # count_tokens_with_tools helper to report an extremely large
     # token count for the first call (post-summary) and a small count for the
     # second call (after hard eviction).
-    with patch("letta.agents.letta_agent_v3.count_tokens_with_tools") as mock_count_tokens:
+    with patch("letta.services.summarizer.compact.count_tokens_with_tools") as mock_count_tokens:
         # First call: pretend the summarized context is still huge relative to
         # this model's context window so that we always trigger the
         # hard-eviction path. Second call: minimal context (system only) is
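The patch-path change in this hunk follows the standard unittest.mock rule: patch a name where it is *looked up*, not where it is defined. Once the helper moved into the compact module, the test had to patch it there. A self-contained sketch (the module names "helpers" and "compact" are made up here, not the Letta modules):

```python
import sys
import types
from unittest.mock import patch

# Build two in-memory modules: "compact" imports a token-counting helper
# from "helpers", just as compact.py imports count_tokens_with_tools.
helpers = types.ModuleType("helpers")
helpers.count_tokens = lambda text: len(text.split())
sys.modules["helpers"] = helpers

compact = types.ModuleType("compact")
exec(
    "from helpers import count_tokens\n"
    "def needs_compaction(text, limit):\n"
    "    return count_tokens(text) > limit\n",
    compact.__dict__,
)
sys.modules["compact"] = compact

# Patching the *defining* module has no effect: "compact" already holds
# its own reference to the original function.
with patch("helpers.count_tokens", return_value=10_000):
    assert compact.needs_compaction("short text", 100) is False

# Patching the *using* module replaces the reference the code under test
# actually calls, so the mock takes effect.
with patch("compact.count_tokens", return_value=10_000):
    assert compact.needs_compaction("short text", 100) is True
```

The same logic explains the earlier "update test patch path" commit: after the refactor, `letta.agents.letta_agent_v3` no longer holds the reference that the compaction code calls, so the old patch target silently stopped intercepting anything.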
||||
@@ -314,12 +314,12 @@ async def test_compaction_settings_model_uses_separate_llm_config_for_summarizat
     the LLMConfig used for the summarizer request.
     """

-    from letta.agents.letta_agent_v3 import LettaAgentV3
     from letta.schemas.agent import AgentState as PydanticAgentState
     from letta.schemas.enums import AgentType, MessageRole
     from letta.schemas.memory import Memory
     from letta.schemas.message import Message as PydanticMessage
     from letta.schemas.model import OpenAIModelSettings, OpenAIReasoning
+    from letta.services.summarizer.compact import build_summarizer_llm_config

     # Base agent LLM config
     base_llm_config = LLMConfig.default_config("gpt-4o-mini")
|
||||
@@ -406,16 +406,11 @@ async def test_compaction_settings_model_uses_separate_llm_config_for_summarizat
         tool_rules=None,
     )

-    # Create a mock agent instance to call the instance method
-    mock_agent = Mock(spec=LettaAgentV3)
-    mock_agent.actor = default_user
-    mock_agent.logger = Mock()
-
-    # Use the instance method to derive summarizer llm_config
-    summarizer_llm_config = await LettaAgentV3._build_summarizer_llm_config(
-        mock_agent,
+    # Use the shared function to derive summarizer llm_config
+    summarizer_llm_config = await build_summarizer_llm_config(
         agent_llm_config=agent_state.llm_config,
         summarizer_config=agent_state.compaction_settings,
         actor=default_user,
     )

     # Agent model remains the base model
||||