OpenAI Chat Completions requires tool message content to be a string,
so images in tool returns were silently replaced with [Image omitted].
Now: text stays in the tool return, images get injected as a user
message right after. The model actually sees what the tool saw.
to_openai_dict also cleaned up — image handling lives in
to_openai_dicts_from_list where it can inject the extra message.
* fix: add "max" to AnthropicModelSettings effort type
The effort field on AnthropicModelSettings only accepted
"low" | "medium" | "high", but the LLMConfig.effort field and the
Anthropic API both support "max" for Opus 4.6. This type mismatch
caused Pydantic validation to reject conversation updates that set
effort to "max" (mapped from xhigh in letta-code).
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta Code <noreply@letta.com>
* generate
---------
Co-authored-by: Letta Code <noreply@letta.com>
Add OpenAI's GPT-5.3 Chat model (128K context, 16K output) with pricing
specs, and remove the "chat" keyword filter so chat variants are listed.
🐾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta Code <noreply@letta.com>
* feat: add billing context to LLM telemetry traces
Add billing metadata (plan type, cost source, customer ID) to LLM traces in ClickHouse for cost analytics and attribution.
**Data Flow:**
- Cloud-API: Extract billing info from subscription in rate limiting, set x-billing-* headers
- Core: Parse headers into BillingContext object via dependencies
- Adapters: Flow billing_context through all LLM adapters (blocking & streaming)
- Agent: Pass billing_context to step() and stream() methods
- ClickHouse: Store in billing_plan_type, billing_cost_source, billing_customer_id columns
**Changes:**
- Add BillingContext schema to provider_trace.py
- Add billing columns to llm_traces ClickHouse table DDL
- Update getCustomerSubscription to fetch stripeCustomerId from organization_billing_details
- Propagate billing_context through agent step flow, adapters, and streaming service
- Update ProviderTrace and LLMTrace to include billing metadata
- Regenerate SDK with autogen
**Production Deployment:**
Requires env vars: LETTA_PROVIDER_TRACE_BACKEND=clickhouse, LETTA_STORE_LLM_TRACES=true, CLICKHOUSE_*
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add billing_context parameter to agent step methods
- Add billing_context to BaseAgent and BaseAgentV2 abstract methods
- Update LettaAgent, LettaAgentV2, LettaAgentV3 step methods
- Update multi-agent groups: SleeptimeMultiAgentV2, V3, V4
- Fix test_utils.py to include billing header parameters
- Import BillingContext in all affected files
* fix: add billing_context to stream methods
- Add billing_context parameter to BaseAgentV2.stream()
- Add billing_context parameter to LettaAgentV2.stream()
- LettaAgentV3.stream() already has it from previous commit
* fix: exclude billing headers from OpenAPI spec
Mark billing headers as internal (include_in_schema=False) so they don't appear in the public API.
These are internal headers between cloud-api and core, not part of the public SDK.
Regenerated SDK with stage-api - removes 10,650 lines of bloat that was causing OOM during Next.js build.
* refactor: return billing context from handleUnifiedRateLimiting instead of mutating req
Instead of passing req into handleUnifiedRateLimiting and mutating headers inside it:
- Return billing context fields (billingPlanType, billingCostSource, billingCustomerId) from handleUnifiedRateLimiting
- Set headers in handleMessageRateLimiting (middleware layer) after getting the result
- This fixes step-orchestrator compatibility since it doesn't have a real Express req object
* chore: remove extra gencode
* p
---------
Co-authored-by: Letta <noreply@letta.com>
fix zai context window (functionally [advertised context window] - [max output tokens]) and properly pass in max tokens so Z.ai doesn't default to 65k for GLM-5
* fix: preserve agent max_tokens when caller doesn't explicitly set it
When updating an agent with convenience fields (model, model_settings)
but without an explicit max_tokens, the server was constructing a fresh
LLMConfig via get_llm_config_from_handle_async. The Pydantic validator
on LLMConfig hardcodes max_tokens=16384 for gpt-5* models, silently
overriding the agent's existing value (e.g. 128000).
This was triggered by reasoning tab-switch in the CLI, which sends
model + model_settings (with reasoning_effort) but no max_tokens.
Now, when request.max_tokens is None we carry forward the agent's
current max_tokens instead of accepting the provider default.
* fix: use correct 128k max_output_tokens defaults for gpt-5.2/5.3
- Update OpenAI provider fallback to return 128000 for gpt-5.2*/5.3*
models (except -chat variants which are 16k)
- Update LLMConfig Pydantic validator to match
- Update gpt-5.2 default_config factory to use 128000
- Move server-side max_tokens preservation guard into the
model_settings branch where llm_config is already available
- Remove `limit` from YAML frontmatter in `serialize_block()` and
`merge_frontmatter_with_body()` (deprecated for git-base memory)
- Remove `limit` from `_render_memory_blocks_git()` in-context rendering
- Existing frontmatter with `limit` is automatically cleaned up on next write
- Parsing still accepts `limit` from frontmatter for backward compatibility
- Increase `CORE_MEMORY_BLOCK_CHAR_LIMIT` from 20,000 to 100,000
- Update integration tests to assert `limit` is not in frontmatter
Fixes#9536
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
fix: align ChatGPT OAuth GPT-5 max output token defaults
Update ChatGPT OAuth provider defaults so GPT-5 family models report 128k max output tokens based on current OpenAI model docs, avoiding incorrect 16k values in /v1/models responses.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* fix: migrate Anthropic Haiku test model off retired release
Update Anthropic Haiku references in integration and usage parsing tests to a supported model id so test requests stop failing with 404 model not found errors.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: use canonical Anthropic Haiku handle in tests
Replace dated Anthropic Haiku handle references with the canonical provider handle so handle-based model resolution does not fail in batch and client tests.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
The Pydantic validator `set_model_specific_defaults` was checking
`values.get("max_tokens") is None`, which matched both "field not
provided" and "field explicitly set to null". This meant users could
not disable the max output tokens limit for GPT-5/GPT-4.1 models -
the validator would always override null with a default value during
request deserialization.
Changed to `"max_tokens" not in values` so that an explicit
`max_tokens: null` is preserved while still applying defaults when
the field is omitted entirely.
LlmConfig._to_model_settings() for Anthropic built an AnthropicModelSettings
object without passing effort=self.effort, so GET /agents/{id} never returned
the effort field in model_settings even when it was stored on the agent.
The Letta Code CLI derives the reasoning tier displayed in the status bar
from model_settings.effort (canonical source), so the footer always showed
e.g. "Sonnet 4.6" instead of "Sonnet 4.6 (high)" after a model switch.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* fix(core): add OpenAI prompt cache key and model-gated 24h retention (#9492)
* fix(core): apply OpenAI prompt cache settings to request payloads
Set prompt_cache_key using agent and conversation context on both Responses and Chat Completions request builders, and enable 24h retention only for supported OpenAI models while excluding OpenRouter paths.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): prefix prompt cache key with letta tag
Add a `letta:` prefix to generated OpenAI prompt_cache_key values so cache-related entries are easier to identify in provider-side logs and diagnostics.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* add integration test
* skip test
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Ari Webb <ari@letta.com>
* fix(core): only set prompt_cache_retention, drop prompt_cache_key
Two issues with the original prompt_cache_key approach:
1. Key exceeded 64-char max (agent-<uuid>:conv-<uuid> = 90 chars)
2. Setting an explicit key disrupted OpenAI's default prefix-hash
routing, dropping cache hit rates from 40-45% to 10-13%
OpenAI's default routing (hash of first ~256 tokens) already provides
good cache affinity since each agent has a unique system prompt.
We only need prompt_cache_retention="24h" for extended retention.
Also fixes:
- Operator precedence bug in _supports_extended_prompt_cache_retention
- Removes incorrect gpt-5.2-codex exclusion (it IS supported per docs)
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Charles Packer <packercharles@gmail.com>
Co-authored-by: Letta <noreply@letta.com>
* auto fixes
* auto fix pt2 and transitive deps and undefined var checking locals()
* manual fixes (ignored or letta-code fixed)
* fix circular import
* remove all ignores, add FastAPI rules and Ruff rules
* add ty and precommit
* ruff stuff
* ty check fixes
* ty check fixes pt 2
* error on invalid
* fix(core): apply OpenAI prompt cache settings to request payloads
Set prompt_cache_key using agent and conversation context on both Responses and Chat Completions request builders, and enable 24h retention only for supported OpenAI models while excluding OpenRouter paths.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): prefix prompt cache key with letta tag
Add a `letta:` prefix to generated OpenAI prompt_cache_key values so cache-related entries are easier to identify in provider-side logs and diagnostics.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* add integration test
* skip test
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Ari Webb <ari@letta.com>
Avoid failing message-list endpoints when historical send_message tool calls are missing the expected message argument by logging and skipping malformed entries during conversion.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* fix(core): stabilize system prompt refresh and expand git-memory coverage
Only rebuild system prompts on explicit refresh paths so normal turns preserve prefix-cache stability, including git/custom prompt layouts. Add integration coverage for memory filesystem tree structure and recompile/reset system-message updates via message-id retrieval.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): recompile system prompt around compaction and stabilize source tests
Force system prompt refresh before/after compaction in LettaAgentV3 so repaired system+memory state is used and persisted across subsequent turns. Update source-system prompt tests to explicitly recompile before raw preview assertions instead of assuming automatic rebuild timing.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: handle system messages with mixed TextContent + ImageContent
System messages injected by external tools (e.g. packify.ai MCP) can
contain both TextContent and ImageContent. The assertions in
to_openai_responses_dicts and to_anthropic_dict expected exactly one
TextContent, causing AssertionError in production.
Extract all text parts and join them, matching how to_openai_dict
already handles this case.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: replace asserts with logger.warning + graceful skip
Asserts are the wrong tool for production input validation — if a
system message has only non-text content, we should warn and skip
rather than crash the request.
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: update ContextWindowCalculator to parse new system message sections
The context window calculator was using outdated position-based parsing
that only handled 3 sections (base_instructions, memory_blocks, memory_metadata).
The actual system message now includes additional sections that were not
being tracked:
- <memory_filesystem> (git-enabled agents)
- <tool_usage_rules> (when tool rules configured)
- <directories> (when sources attached)
Changes:
- Add _extract_tag_content() helper for proper XML tag extraction
- Rewrite extract_system_components() to return a Dict with all 6 sections
- Update calculate_context_window() to count tokens for new sections
- Add new fields to ContextWindowOverview schema with backward-compatible defaults
- Add unit tests for the extraction logic
* update
* generate
* fix: check attached file in directories section instead of core_memory
Files are rendered inside <directories> tags, not <memory_blocks>.
Update validate_context_window_overview assertions accordingly.
* fix: address review feedback for context window parser
- Fix git-enabled agents regression: capture bare file blocks
(e.g. <system/human.md>) rendered after </memory_filesystem> as
core_memory via new _extract_git_core_memory() method
- Make _extract_top_level_tag robust: scan all occurrences to find
tag outside container, handling nested-first + top-level-later case
- Document system_prompt tag inconsistency in docstring
- Add TODO to base_agent.py extract_dynamic_section linking to
ContextWindowCalculator to flag parallel parser tech debt
- Add tests: git-enabled agent parsing, dual-occurrence tag
extraction, pure text system prompt, git-enabled integration test
* fix(core): handle git memory label prefix collisions in filesystem view
Prevent context window preview crashes when a block label is both a leaf and a prefix (e.g. system/human and system/human/context) by rendering a node as both file and directory. Add regression test.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): parse git-backed core memory in context window preview
ContextWindowCalculator.extract_system_components now detects git-backed memory rendering (<memory_filesystem> and <system/...> tags) when <memory_blocks> wrapper is absent, so core_memory is populated in the context preview. Add regression tests.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* Add log probabilities support for RL training
This enables Letta server to request and return log probabilities from
OpenAI-compatible providers (including SGLang) for use in RL training.
Changes:
- LLMConfig: Add return_logprobs and top_logprobs fields
- OpenAIClient: Set logprobs in ChatCompletionRequest when enabled
- LettaLLMAdapter: Add logprobs field and extract from response
- LettaResponse: Add logprobs field to return log probs to client
- LettaRequest: Add return_logprobs/top_logprobs for per-request override
- LettaAgentV3: Store and pass logprobs through to response
- agents.py: Handle request-level logprobs override
Usage:
response = client.agents.messages.create(
agent_id=agent_id,
messages=[...],
return_logprobs=True,
top_logprobs=5,
)
print(response.logprobs) # Per-token log probabilities
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* Add multi-turn token tracking for RL training via SGLang native endpoint
- Add TurnTokenData schema to track token IDs and logprobs per turn
- Add return_token_ids flag to LettaRequest and LLMConfig
- Create SGLangNativeClient for /generate endpoint (returns output_ids)
- Create SGLangNativeAdapter that uses native endpoint
- Modify LettaAgentV3 to accumulate turns across LLM calls
- Include turns in LettaResponse when return_token_ids=True
* Fix: Add SGLang native adapter to step() method, not just stream()
* Fix: Handle Pydantic Message objects in SGLang native adapter
* Fix: Remove api_key reference from LLMConfig (not present)
* Fix: Add missing 'created' field to ChatCompletionResponse
* Add full tool support to SGLang native adapter
- Format tools into prompt in Qwen-style format
- Parse tool calls from <tool_call> tags in response
- Format tool results as <tool_response> in user messages
- Set finish_reason to 'tool_calls' when tools are called
* Use tokenizer.apply_chat_template for proper tool formatting
- Add tokenizer caching in SGLang native adapter
- Use apply_chat_template when tokenizer available
- Fall back to manual formatting if not
- Convert Letta messages to OpenAI format for tokenizer
* Fix: Use func_response instead of tool_return for ToolReturn content
* Fix: Get output_token_logprobs from meta_info in SGLang response
* Fix: Allow None in output_token_logprobs (SGLang format includes null)
* chore: remove unrelated files from logprobs branch
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: add missing call_type param to adapter constructors in letta_agent_v3
The SGLang refactor dropped call_type=LLMCallType.agent_step when extracting
adapter creation into conditional blocks. Restores it for all 3 spots (SGLang
in step, SimpleLLM in step, SGLang in stream).
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* just stage-api && just publish-api
* fix: update expected LLMConfig fields in schema test for logprobs support
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: remove rllm provider references
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* just stage-api && just publish-api
🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-65-206.ec2.internal>
Co-authored-by: Letta <noreply@letta.com>
When an OpenAI/Anthropic-compatible endpoint returns a non-JSON response
(e.g. HTML error page), the SDK's paginated response parser falls back
to returning a raw string. The post-parser then calls
_set_private_attributes() on that string, causing an AttributeError.
Add explicit AttributeError handling around SDK models.list() calls in
provider check_api_key/list_llm_models_async methods, and add type
guards in convert_response_to_chat_completion to reject raw strings
before Pydantic model construction.
Datadog: https://us5.datadoghq.com/error-tracking/issue/59a7a206-00b8-11f1-be73-da7ad0900000🤖 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>
* fix: strip whitespace from API keys in LLM client headers
Fixes httpx.LocalProtocolError when API keys contain leading/trailing whitespace.
Strips whitespace from API keys before using them in HTTP headers across:
- OpenAI client (openai.py)
- Mistral client (mistral.py)
- Anthropic client (anthropic_client.py)
- Anthropic schema provider (schemas/providers/anthropic.py)
- Google AI client (google_ai_client.py)
- Proxy helpers (proxy_helpers.py)
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: handle McpError gracefully in MCP client execute_tool
Return error as failed result instead of re-raising to avoid Datadog alerts for expected user-facing errors like missing tool arguments.
* fix: strip whitespace from API keys before passing to httpx client
Fixes httpx.LocalProtocolError by stripping leading/trailing whitespace
from API keys before passing them to OpenAI/AsyncOpenAI clients. The
OpenAI client library constructs Authorization headers internally, and
invalid header values (like keys with leading spaces) cause protocol
errors.
Applied fix to:
- azure_client.py (AzureOpenAI/AsyncAzureOpenAI)
- deepseek_client.py (OpenAI/AsyncOpenAI)
- openai_client.py (OpenAI/AsyncOpenAI via kwargs)
- xai_client.py (OpenAI/AsyncOpenAI)
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: handle JSONDecodeError in OpenAI client requests
Catches json.JSONDecodeError from OpenAI SDK when API returns invalid
JSON (typically HTML error pages from 500-series errors) and converts
to LLMServerError with helpful details.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): strip API key whitespace at schema level on write/create
Add field_validator to ProviderCreate, ProviderUpdate, and ProviderCheck
schemas to strip whitespace from api_key and access_key fields before
persistence. This ensures keys are clean at the point of entry, preventing
whitespace from being encrypted and stored in the database.
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
* refactor: remove api_key.strip() calls across all LLM clients
Remove redundant .strip() calls on api_key parameters since pydantic models
now handle whitespace trimming at the validation layer. This centralizes
the validation logic and follows DRY principles.
- Updated 13 files across multiple LLM client implementations
- Removed 34 occurrences of api_key.strip()
- Includes: OpenAI, Anthropic, Azure, Google AI, Groq, XAI, DeepSeek, ZAI, Together, Mistral
- Also updated proxy helpers and provider schemas
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* refactor: remove redundant ternary operators from api_key parameters
Remove `if api_key else None` ternaries since pydantic validation ensures
api_key is either a valid string or None. The ternary was defensive programming
that's now unnecessary with proper model-level validation.
- Simplified 23 occurrences across 7 files
- Cleaner, more concise client initialization code
- No behavioral change since pydantic already handles this
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>
* fix(core): handle PermissionDeniedError in provider API key validation
Fixed OpenAI PermissionDeniedError being raised as unknown error when
validating provider API keys. The check_api_key methods in OpenAI-based
providers (OpenAI, OpenRouter, Azure, Together) now properly catch and
re-raise PermissionDeniedError as LLMPermissionDeniedError.
🐛 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): handle Unicode surrogates in OpenAI requests
Sanitize invalid UTF-16 surrogates before sending requests to OpenAI API.
Fixes UnicodeEncodeError when message content contains unpaired surrogates
from corrupted emoji data or malformed Unicode sequences.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): handle MCP tool schema validation errors gracefully
Catch fastmcp.exceptions.ToolError in execute_mcp_tool endpoint and
convert to LettaInvalidArgumentError (400) instead of letting it
propagate as 500 error. This is an expected user error when tool
arguments don't match the MCP tool's schema.
Fixes Datadog issue 8f2d874a-f8e5-11f0-9b25-da7ad0900000
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): handle ExceptionGroup-wrapped ToolError in MCP executor
When MCP tools fail with validation errors (e.g., missing required parameters),
fastmcp raises ToolError exceptions that may be wrapped in ExceptionGroup by
Python's async TaskGroup. The exception handler now unwraps single-exception
groups before checking if the error should be handled gracefully.
Fixes Calendly API "organization parameter missing" errors being logged to
Datadog instead of returning friendly error messages to users.
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix: handle missing agent in create_conversation to prevent foreign key violation
* Update .gitignore
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: handle Anthropic overloaded_error in streaming interfaces
* fix: handle Unicode surrogates in OpenAI requests
Sanitize Unicode surrogate pairs before sending requests to OpenAI API.
Surrogate pairs (U+D800-U+DFFF) are UTF-16 encoding artifacts that cause
UnicodeEncodeError when encoding to UTF-8.
Fixes Datadog error: 'utf-8' codec can't encode character '\ud83c' in
position 326605: surrogates not allowed
* fix: handle UnicodeEncodeError from lone Unicode surrogates in OpenAI requests
Improved sanitize_unicode_surrogates() to explicitly filter out lone
surrogate characters (U+D800 to U+DFFF) which are invalid in UTF-8.
Previous implementation used errors='ignore' which could still fail in
edge cases. New approach directly checks Unicode code points and removes
any surrogates before data reaches httpx encoding.
Also added sanitization to stream_async_responses() method which was
missing it.
Fixes: 'utf-8' codec can't encode character '\ud83c' in position X:
surrogates not allowed
When exporting an agent with a conversation_id, the export function was
setting agent_state.message_ids from the conversation, but from_agent_state
was ignoring this and fetching messages generically via list_messages.
Now from_agent_state checks if message_ids is set and fetches those specific
messages instead.
👾 Generated with [Letta Code](https://letta.com)
Co-authored-by: Letta <noreply@letta.com>