Commit Graph

2040 Commits

Author SHA1 Message Date
Kian Jones
7c65fd77f1 fix(core): return 400 for Google GenAI ClientError bad requests (#9357)
Google genai.errors.ClientError with code 400 was being caught and
wrapped as LLMBadRequestError but returned to clients as 502 because
no dedicated FastAPI exception handler existed for LLMBadRequestError.

- Add LLMBadRequestError exception handler in app.py returning HTTP 400
- Fix ErrorCode on Google 400 bad requests from INTERNAL_SERVER_ERROR
  to INVALID_ARGUMENT
- Route Google API errors through handle_llm_error in stream_async path

Datadog: https://us5.datadoghq.com/error-tracking/issue/4eb3ff3c-d937-11f0-8177-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
3634464251 fix(core): handle anyio.BrokenResourceError for client disconnects (#9358)
Catch BrokenResourceError alongside ClosedResourceError in streaming
response, logging middleware, and app exception handlers so client
disconnects are logged at info level instead of surfacing as 500s.

Datadog: https://us5.datadoghq.com/error-tracking/issue/4f57af0c-d558-11f0-a65d-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
0d42afa151 fix(core): catch LockNotAvailableError and return 409 instead of 500 (#9359)
Re-apply changes on top of latest main to resolve merge conflicts.

- Add DatabaseLockNotAvailableError custom exception in orm/errors.py
- Catch asyncpg LockNotAvailableError and pgcode 55P03 in _handle_dbapi_error
- Register FastAPI exception handler returning 409 with Retry-After header



🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
382e216cbb fix(core): differentiate BYOK vs base provider in all LLM error details (#9425)
Add is_byok flag to every LLMError's details dict returned from
handle_llm_error across all providers (OpenAI, Anthropic, Google,
ChatGPT OAuth). This enables observability into whether errors
originate from Letta's production keys or user-provided BYOK keys.

The rate limit handler in app.py now returns a more helpful message
for BYOK users ("check your provider's rate limits and billing")
versus the generic message for base provider rate limits.

Datadog issues:
- https://us5.datadoghq.com/error-tracking/issue/b711c824-f490-11f0-96e4-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/76623036-f4de-11f0-8697-da7ad0900000
- https://us5.datadoghq.com/error-tracking/issue/43e9888a-dfcf-11f0-a645-da7ad0900000

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
a00270d199 fix(core): handle UTF-8 surrogate characters in API responses (#9422)
* fix(core): handle UTF-8 surrogate characters in API responses

LLM responses or user input can contain surrogate characters (U+D800-U+DFFF)
which are valid Python strings but illegal in UTF-8. ORJSONResponse rejects
them with "str is not valid UTF-8: surrogates not allowed". Add
SafeORJSONResponse that catches the TypeError and strips surrogates before
retrying serialization.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: reuse sanitize_unicode_surrogates from json_helpers

Replace the inline _sanitize_surrogates function with the existing
sanitize_unicode_surrogates helper from letta.helpers.json_helpers,
which is already used across all LLM clients.

Co-authored-by: Kian Jones <kianjones9@users.noreply.github.com>

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
2026-02-24 10:52:07 -08:00
Shubham Naik
ca32311b9a feat: allow users to specify via query to stip messages [LET-7392] (#9411)
* feat: allow users to specify via query to stip messages

* chore: regenerate API SDK and OpenAPI spec [LET-7392]

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Ari Webb <AriWebb@users.noreply.github.com>
Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Ari Webb <AriWebb@users.noreply.github.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
0dde155e9a feat: Prefix cache optimization system prompt (#9381) 2026-02-24 10:52:07 -08:00
Kevin Lin
23c94ec6d3 feat: add log probabilities from OpenAI-compatible servers and SGLang native endpoint (#9240)
* Add log probabilities support for RL training

This enables Letta server to request and return log probabilities from
OpenAI-compatible providers (including SGLang) for use in RL training.

Changes:
- LLMConfig: Add return_logprobs and top_logprobs fields
- OpenAIClient: Set logprobs in ChatCompletionRequest when enabled
- LettaLLMAdapter: Add logprobs field and extract from response
- LettaResponse: Add logprobs field to return log probs to client
- LettaRequest: Add return_logprobs/top_logprobs for per-request override
- LettaAgentV3: Store and pass logprobs through to response
- agents.py: Handle request-level logprobs override

Usage:
  response = client.agents.messages.create(
      agent_id=agent_id,
      messages=[...],
      return_logprobs=True,
      top_logprobs=5,
  )
  print(response.logprobs)  # Per-token log probabilities

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* Add multi-turn token tracking for RL training via SGLang native endpoint

- Add TurnTokenData schema to track token IDs and logprobs per turn
- Add return_token_ids flag to LettaRequest and LLMConfig
- Create SGLangNativeClient for /generate endpoint (returns output_ids)
- Create SGLangNativeAdapter that uses native endpoint
- Modify LettaAgentV3 to accumulate turns across LLM calls
- Include turns in LettaResponse when return_token_ids=True

* Fix: Add SGLang native adapter to step() method, not just stream()

* Fix: Handle Pydantic Message objects in SGLang native adapter

* Fix: Remove api_key reference from LLMConfig (not present)

* Fix: Add missing 'created' field to ChatCompletionResponse

* Add full tool support to SGLang native adapter

- Format tools into prompt in Qwen-style format
- Parse tool calls from <tool_call> tags in response
- Format tool results as <tool_response> in user messages
- Set finish_reason to 'tool_calls' when tools are called

* Use tokenizer.apply_chat_template for proper tool formatting

- Add tokenizer caching in SGLang native adapter
- Use apply_chat_template when tokenizer available
- Fall back to manual formatting if not
- Convert Letta messages to OpenAI format for tokenizer

* Fix: Use func_response instead of tool_return for ToolReturn content

* Fix: Get output_token_logprobs from meta_info in SGLang response

* Fix: Allow None in output_token_logprobs (SGLang format includes null)

* chore: remove unrelated files from logprobs branch

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add missing call_type param to adapter constructors in letta_agent_v3

The SGLang refactor dropped call_type=LLMCallType.agent_step when extracting
adapter creation into conditional blocks. Restores it for all 3 spots (SGLang
in step, SimpleLLM in step, SGLang in stream).

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* just stage-api && just publish-api

* fix: update expected LLMConfig fields in schema test for logprobs support

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove rllm provider references

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* just stage-api && just publish-api

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-65-206.ec2.internal>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
bbc648909b refactor: drop memory/ prefix from git memory repo file paths and update core memory rendering [LET-7356] (#9395) 2026-02-24 10:52:07 -08:00
Charles Packer
b0e16ae50f fix: surface GPT-5.3 Codex for ChatGPT OAuth providers (#9379) 2026-02-24 10:52:07 -08:00
Sarah Wooders
526da4c49b Revert "perf: optimize prefix caching by skipping system prompt rebuild on every step" (#9380)
Revert "perf: optimize prefix caching by skipping system prompt rebuild on ev…"

This reverts commit eafa4144c2577a45b7007a177b701863b98d1dfa.
2026-02-24 10:52:07 -08:00
Sarah Wooders
9dbe28e8f1 perf: optimize prefix caching by skipping system prompt rebuild on every step (#9080) 2026-02-24 10:52:07 -08:00
Kian Jones
32d87b70d7 fix(core): return 400 for ContextWindowExceededError instead of 500 (#9348)
SystemPromptTokenExceededError (a subclass of ContextWindowExceededError)
was hitting the generic 500 handler in Datadog. This is a client-side
configuration issue (system prompt too large for context window), so it
should return 400 Bad Request.

Datadog: https://us5.datadoghq.com/error-tracking/issue/3360e252-d5f1-11f0-8467-da7ad0900000
Also covers: https://us5.datadoghq.com/error-tracking/issue/6e50c57c

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
47aedfa1a7 fix(core): convert MCP ConnectionError to LettaMCPConnectionError for proper HTTP 502 responses (#9364)
MCP server connection failures were raising Python's builtin ConnectionError,
which bypassed the LettaMCPConnectionError FastAPI exception handler and hit
Datadog as unhandled 500 errors. Now all MCP client classes convert
ConnectionError to LettaMCPConnectionError at the source, which the existing
exception handler returns as a user-friendly 502.

Datadog: https://us5.datadoghq.com/error-tracking/issue/93db4a82-fe5a-11f0-85f0-da7ad0900000

🐛 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
cbbb6d776d fix(core): unwrap ExceptionGroup in OAuth stream generator (#9346)
Fixes Datadog issue 5efbb1d4-eec5-11f0-8f8e-da7ad0900000

Add ExceptionGroup unwrapping in OAuth stream exception handler.

The bug was caused by ExceptionGroup not being caught by the general
`except Exception` handler, since ExceptionGroup is a subclass of
BaseException, not Exception. This caused TaskGroup errors to escape
as unhandled ExceptionGroups in Datadog.

The fix adds an explicit ExceptionGroup handler before the general
Exception handler, following the same unwrapping pattern used in
other parts of the codebase (mcp_tool_executor.py, base_client.py).

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
d592ec3135 fix: handle DBAPIError wrapping asyncpg DeadlockDetectedError (#9355)
SQLAlchemy wraps asyncpg's DeadlockDetectedError in a DBAPIError,
which was falling through to the generic 500 handler. Now detected
at both the ORM level (_handle_dbapi_error) and FastAPI handler level,
returning 409 with Retry-After header.

Datadog: https://us5.datadoghq.com/error-tracking/issue/2f1dc54c-dab6-11f0-a828-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
369cdf72c7 feat(core): store block metadata as YAML frontmatter in .md files (#9365)
* feat(core): store block metadata as YAML frontmatter in .md files

Block .md files in git repos now embed metadata (description, limit,
read_only, metadata dict) as YAML frontmatter instead of a separate
metadata/blocks.json file. Only non-default values are rendered.

Format:
  ---
  description: "Who I am"
  limit: 5000
  ---
  Block value content here...

Changes:
- New block_markdown.py utility (serialize_block / parse_block_markdown)
- Updated all three write/read paths: manager.py, memfs_client.py,
  memfs_client_base.py
- block_manager_git.py now passes description/limit/read_only/metadata
  through to git commits
- Post-push sync (git_http.py) parses frontmatter and syncs metadata
  fields to Postgres
- Removed metadata/blocks.json reads/writes entirely
- Backward compat: files without frontmatter treated as raw value
- Integration test verifies frontmatter in cloned files and metadata
  sync via git push

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: derive frontmatter defaults from BaseBlock schema, not hardcoded dict

Remove _DEFAULTS dict from block_markdown.py. The core version now
imports BaseBlock and reads field defaults via model_fields. This
fixes the limit default (was 5000, should be CORE_MEMORY_BLOCK_CHAR_LIMIT=20000).

Also:
- memfs-py copy simplified to parse-only (no serialize, no letta imports)
- All hardcoded limit=5000 fallbacks replaced with CORE_MEMORY_BLOCK_CHAR_LIMIT
- Test updated: blocks with all-default metadata correctly have no frontmatter;
  frontmatter verified after setting non-default description via API

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: always include description and limit in frontmatter

description and limit are always rendered in the YAML frontmatter,
even when at their default values. Only read_only and metadata are
conditional (omitted when at defaults).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: resolve read_only from block_update before git commit

read_only was using the old Postgres value instead of the update value
when committing to git. Also adds integration test coverage for
read_only: true appearing in frontmatter after API PATCH, and
verifying it's omitted when false (default).

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: add API→git round-trip coverage for description and limit

Verifies that PATCH description/limit via API is reflected in
frontmatter after git pull. Combined with the existing push→API
test (step 6), this gives full bidirectional coverage:
- API edit description/limit → pull → frontmatter updated
- Push frontmatter with description/limit → API reflects changes

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Sarah Wooders
93e453ef8f fix(core): transform nested block labels on git memory enable (#9339)
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:07 -08:00
Kian Jones
e7039470e9 fix(core): handle ExceptionGroup-wrapped ToolError and McpError in MCP tool execution (#9328)
* fix(core): handle ExceptionGroup-wrapped ToolError and McpError in MCP tool execution

Fixes 3 related Datadog bugs (all fastmcp.exceptions.ToolError):
- 75d43daa-ff04-11f0-81b2-da7ad0900000
- 7af6373e-0080-11f1-9855-da7ad0900000
- a322edc8-fffa-11f0-b26c-da7ad0900000

These errors were caused by ToolError and McpError exceptions bubbling up
unhandled from the MCP REST endpoint. This fix combines the approaches from
PRs #9320 and #9321:

1. Handle ExceptionGroup wrapping (Python 3.11+ async TaskGroup)
2. Check for ToolError by class name to handle module variations
3. Convert ToolError to LettaInvalidArgumentError for proper client response
4. Catch McpError and return HTTP 500 with proper error message

Issue-IDs: 75d43daa-ff04-11f0-81b2-da7ad0900000, 7af6373e-0080-11f1-9855-da7ad0900000, a322edc8-fffa-11f0-b26c-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: return 422 instead of 500 for McpError (user config issue)

* fix: use LettaMCPConnectionError instead of HTTPException for McpError

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
cthomas
70d749e859 fix(core): add retry with backoff for post-push sync race condition (#9335)
When git push completes, the webhook fires immediately but GCS upload
may still be in progress. This causes KeyError when trying to read
commit objects that haven't been uploaded yet.

Add retry with exponential backoff (1s, 2s, 4s) to handle this race.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Kian Jones
69fc934135 fix(core): handle transient database connection errors with retry logic (#9324)
Add retry mechanism for ConnectionError during asyncpg SSL handshake failures.
Implements exponential backoff (3 attempts) and returns 503 on exhaustion.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

Issue-ID: 8caf1136-0200-11f1-8f4d-da7ad0900000
2026-02-24 10:52:06 -08:00
cthomas
05b77a5fed fix(core): update memfs client to use memory/ directory (#9322)
PR #9309 changed the block storage from blocks/ to memory/ directory.
Update memfs_client.py and memfs_client_base.py to match.

🐾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
cthomas
0bdd555f33 feat: add memfs-py service (#9315)
* feat: add memfs-py service

* add tf for bucket access and secrets v2 access

* feat(memfs): add helm charts, deploy workflow, and bug fixes

- Add dev helm chart (helm/dev/memfs-py/) with CSI secrets pattern
- Update prod helm chart with CSI secrets and correct service account
- Add GitHub Actions deploy workflow
- Change port from 8284 to 8285 to avoid conflict with core's dulwich sidecar
- Fix chunked transfer encoding issue (strip HTTP_TRANSFER_ENCODING header)
- Fix timestamp parsing to handle both ISO and HTTP date formats
- Fix get_head_sha to raise FileNotFoundError on 404

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Kian Jones <kian@letta.com>
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
21e880907f feat(core): structure memory directory and block labels [LET-7336] (#9309) 2026-02-24 10:52:06 -08:00
Kian Jones
be60697a62 fix(core): handle protocol errors and foreign key violations (#9308)
* fix(core): handle PermissionDeniedError in provider API key validation

Fixed OpenAI PermissionDeniedError being raised as unknown error when
validating provider API keys. The check_api_key methods in OpenAI-based
providers (OpenAI, OpenRouter, Azure, Together) now properly catch and
re-raise PermissionDeniedError as LLMPermissionDeniedError.

🐛 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle Unicode surrogates in OpenAI requests

Sanitize invalid UTF-16 surrogates before sending requests to OpenAI API.
Fixes UnicodeEncodeError when message content contains unpaired surrogates
from corrupted emoji data or malformed Unicode sequences.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle MCP tool schema validation errors gracefully

Catch fastmcp.exceptions.ToolError in execute_mcp_tool endpoint and
convert to LettaInvalidArgumentError (400) instead of letting it
propagate as 500 error. This is an expected user error when tool
arguments don't match the MCP tool's schema.

Fixes Datadog issue 8f2d874a-f8e5-11f0-9b25-da7ad0900000

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): handle ExceptionGroup-wrapped ToolError in MCP executor

When MCP tools fail with validation errors (e.g., missing required parameters),
fastmcp raises ToolError exceptions that may be wrapped in ExceptionGroup by
Python's async TaskGroup. The exception handler now unwraps single-exception
groups before checking if the error should be handled gracefully.

Fixes Calendly API "organization parameter missing" errors being logged to
Datadog instead of returning friendly error messages to users.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: handle missing agent in create_conversation to prevent foreign key violation

* Update .gitignore

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
93249b96f5 test(core): git-backed memory repo integration (real object store) (#9298)
---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
0bdedb3c0f feat: agent generate endpoint (#9304)
* base

* update

* clean up

* update
2026-02-24 10:52:06 -08:00
Ari Webb
5c6ca705f1 Revert "feat: bring back use message packing for timezone [LET-6846]" (#9302)
Revert "feat: bring back use message packing for timezone [LET-6846] (#9256)"

This reverts commit c5017cccdef95b84fc585b26a0ddc5b7e44eb7c9.
2026-02-24 10:52:06 -08:00
jnjpng
ff69c6a32e feat: add /agents/{agent_id}/generate endpoint for direct LLM requests (#9272)
* feat: add /agents/{agent_id}/generate endpoint for direct LLM requests

Add new endpoint that makes direct LLM provider requests without agent
context, memory, tools, or state modification. This enables:
- Quick LLM queries without agent overhead
- Testing model configurations
- Simple chat completions using agent's credentials
- Comparing responses across different models

Features:
- Uses agent's LLM config by default
- Supports model override with full provider config resolution
- Non-streaming, stateless operation
- Proper error handling and validation
- Request/response schemas with Pydantic validation

Implementation:
- Add GenerateRequest and GenerateResponse schemas
- Implement generate_completion endpoint handler
- Add necessary imports (LLMError, LLMClient, HandleNotFoundError)
- Include logging and comprehensive error handling

* fix: improve error handling and fix Message construction

- Fix critical bug: use content=[TextContent(text=...)] instead of text=...
- Add explicit error handling for NoResultFound and HandleNotFoundError
- Add error handling for convert_response_to_chat_completion
- Add structured logging for debugging
- Remove unnecessary .get() calls since Pydantic validates messages

* refactor: extract generate logic to AgentCompletionService

Move the generate endpoint business logic out of the endpoint handler
into a dedicated AgentCompletionService class for better code organization
and separation of concerns.

Changes:
- Create new AgentCompletionService in services/agent_completion_service.py
- Service handles all business logic: agent validation, LLM config resolution,
  message conversion, LLM client creation, and request/response processing
- Integrate service with SyncServer initialization
- Refactor generate_completion endpoint to use the service
- Endpoint now only handles HTTP concerns (auth, error mapping)

Benefits:
- Cleaner endpoint code (reduced from ~140 lines to ~25 lines)
- Better separation of concerns (HTTP vs business logic)
- Service logic can be reused or tested independently
- Follows established patterns in the codebase (AgentManager, etc.)

* feat: simplify generate API to accept just prompt text

Simplify the client interface by accepting a simple prompt string instead
of requiring clients to format messages.

Changes:
- Update GenerateRequest schema:
  - Replace 'messages' array with simple 'prompt' string
  - Add optional 'system_prompt' for context/instructions
  - Keep 'override_model' for model selection
- Update AgentCompletionService to format messages automatically:
  - Accepts prompt and optional system_prompt
  - Constructs message array internally (system + user messages)
  - Simpler API surface for clients
- Update endpoint documentation with new simplified examples
- Regenerate OpenAPI spec and TypeScript SDK

Benefits:
- Much simpler client experience - just send text
- No need to understand message formatting
- Still supports system prompts for context
- Cleaner API that matches common use cases

Example (before):
{
  "messages": [{"role": "user", "content": "What is 2+2?"}]
}

Example (after):
{
  "prompt": "What is 2+2?"
}

* test: add comprehensive integration tests for generate endpoint

Add 9 integration tests covering various scenarios:

Happy path tests:
- test_agent_generate_basic: Basic prompt -> response flow
- test_agent_generate_with_system_prompt: System prompt + user prompt
- test_agent_generate_with_model_override: Override model selection
- test_agent_generate_long_prompt: Handle longer prompts
- test_agent_generate_no_persistence: Verify no messages saved to agent

Error handling tests:
- test_agent_generate_empty_prompt_error: Empty prompt validation (422)
- test_agent_generate_invalid_agent_id: Invalid agent ID (404)
- test_agent_generate_invalid_model_override: Invalid model handle (404)

All tests verify:
- Response structure (content, model, usage)
- Proper status codes for errors
- Usage statistics (tokens, counts)
- No side effects on agent state

Tests follow existing test patterns in test_client.py and use the
letta_client SDK (assuming generate_completion method is auto-generated
from the OpenAPI spec).

* openapi

* refactor: rename AgentCompletionService to AgentGenerateCompletionManager

Rename for better clarity and consistency with codebase naming conventions:
- Rename file: agent_completion_service.py → agent_generate_completion_manager.py
- Rename class: AgentCompletionService → AgentGenerateCompletionManager
- Rename attribute: server.agent_completion_service → server.agent_generate_completion_manager
- Update docstrings: 'Service' → 'Manager'

Changes:
- apps/core/letta/services/agent_generate_completion_manager.py (renamed + updated class)
- apps/core/letta/server/server.py (import + initialization)
- apps/core/letta/server/rest_api/routers/v1/agents.py (usage in endpoint)

No functional changes, purely a naming refactor.

* fix: remove invalid Message parameters in generate manager

Remove agent_id=None and user_id=None from Message construction.
The Message model doesn't accept these as None values - only pass
required parameters (role, content).

Fixes validation error:
  'Extra inputs are not permitted [type=extra_forbidden, input_value=None]'

This aligns with other Message construction patterns in the codebase
(see tools.py, memory.py examples).

* feat: improve generate endpoint validation and tests

- Add field validator for whitespace-only prompts
- Always include system message (required by Anthropic)
- Use default "You are a helpful assistant." when no system_prompt provided
- Update tests to use direct HTTP calls via httpx
- Fix test issues:
  - Use valid agent ID format (agent-{uuid})
  - Use available model (openai/gpt-4o-mini)
  - Add whitespace validation test
- All 9 integration tests passing
2026-02-24 10:52:06 -08:00
Sarah Wooders
597c7f3e3e fix(core): git HTTP org fallback + post-push sync (#9297) 2026-02-24 10:52:06 -08:00
cthomas
530d33c254 feat: add skills support to agentfile (#9287) 2026-02-24 10:52:06 -08:00
Sarah Wooders
fdc32f6054 fix(core): derive dulwich org context from user_id fallback (#9296)
* fix(core): derive dulwich org context from user_id fallback

Make git smart HTTP more robust in prod by:
- normalizing/injecting a single X-Organization-Id header in the FastAPI proxy
- keeping dulwich org contextvar set through WSGI iteration
- falling back to resolving org_id from user_id when X-Organization-Id is missing
- adding opt-in debug logs (env LETTA_GIT_HTTP_DEBUG_LOGS or letta_debug query)

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor(core): drop user->org cache in dulwich org fallback

Keep the dulwich org_id fallback simple by resolving org_id from user_id via
UserManager lookup when X-Organization-Id is missing, without maintaining an
in-process cache.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore(core): make git HTTP debug logging always-on

Remove opt-in toggles for git HTTP debug logs and log proxy + dulwich request
context for every git smart-HTTP request.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
df85ee970b fix: fix org id again (#9292)
* fix(core): pass org_id to dulwich via header for git HTTP

* fix(core): use actor org id for git HTTP org header

Git smart HTTP proxies were reading `organization_id` from AgentState, which
is not present and caused 500s during clone/push. Use the authenticated
actor's org id while still performing an authorization check on the agent.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
49d354bac1 fix(core): pass org_id to dulwich via header for git HTTP (#9291) 2026-02-24 10:52:06 -08:00
Ari Webb
426f6a8ca4 feat: bring back use message packing for timezone [LET-6846] (#9256)
* feat: bring back use message packing for timezone

* add tests
2026-02-24 10:52:06 -08:00
Sarah Wooders
50a60c1393 feat: git smart HTTP for agent memory repos (#9257)
* feat(core): add git-backed memory repos and block manager

Introduce a GCS-backed git repository per agent as the source of truth for core
memory blocks. Add a GitEnabledBlockManager that writes block updates to git and
syncs values back into Postgres as a cache.

Default newly-created memory repos to the `main` branch.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat(core): serve memory repos over git smart HTTP

Run dulwich's WSGI HTTPGitApplication on a local sidecar port and proxy
/v1/git/* through FastAPI to support git clone/fetch/push directly against
GCS-backed memory repos.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): create memory repos on demand and stabilize git HTTP

- Ensure MemoryRepoManager creates the git repo on first write (instead of 500ing)
  and avoids rewriting history by only auto-creating on FileNotFoundError.
- Simplify dulwich-thread async execution and auto-create empty repos on first
  git clone.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): make dulwich optional for CI installs

Guard dulwich imports in the git smart HTTP router so the core server can boot
(and CI tests can run) without installing the memory-repo extra.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): guard git HTTP WSGI init when dulwich missing

Avoid instantiating dulwich's HTTPGitApplication at import time when dulwich
isn't installed (common in CI installs).

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): avoid masking send_message errors in finally

Initialize `result` before the agent loop so error paths (e.g. approval
validation) don't raise UnboundLocalError in the run-tracking finally block.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): stop event loop watchdog on FastAPI shutdown

Ensure the EventLoopWatchdog thread is stopped during FastAPI lifespan
shutdown to avoid daemon threads logging during interpreter teardown (seen in CI
unit tests).

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore(core): remove send_*_message_to_agent from SyncServer

Drop send_message_to_agent and send_group_message_to_agent from SyncServer and
route internal fire-and-forget messaging through send_messages helpers instead.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): backfill git memory repo when tag added

When an agent is updated to include the git-memory-enabled tag, ensure the
git-backed memory repo is created and initialized from the agent's current
blocks. Also support configuring the memory repo object store via
LETTA_OBJECT_STORE_URI.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): preserve block tags on git-enabled updates

When updating a block for a git-memory-enabled agent, keep block tags in sync
with PostgreSQL (tags are not currently stored in the git repo).

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore(core): remove git-state legacy shims

- Rename optional dependency extra from memory-repo to git-state
- Drop legacy object-store env aliases and unused region config
- Simplify memory repo metadata to a single canonical format
- Remove unused repo-cache invalidation helper

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix(core): keep PR scope for git-backed blocks

- Revert unrelated change in fire-and-forget multi-agent send helper
- Route agent block updates-by-label through injected block manager only when needed

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
Sarah Wooders
eaf64fb510 fix: add LLMCallType enum and ensure call_type is set on all provider traces (#9258)
Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
cthomas
96c4b7175e fix: initialize result and run_status before try block (#9264)
Fixes UnboundLocalError when client disconnects (EndOfStream) during
request processing - the finally block tried to access `result` which
was never assigned.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:06 -08:00
jnjpng
9bf10b4761 fix: initialize result and run_status before try block in send_message (#9265)
Initialize `result` and `run_status` variables before the try block
to prevent UnboundLocalError in the finally block when exceptions
occur early in execution (e.g., during AgentLoop.load()).

Previously, if an exception was raised before `result` was assigned
inside the try block, the finally block would fail when trying to
access `result`, masking the original error with an UnboundLocalError.

This fix ensures the finally block can safely check and use these
variables regardless of when/where an exception occurs.
2026-02-24 10:52:06 -08:00
Sarah Wooders
3fdf2b6c79 chore: deprecate old agent messaging (#9120) 2026-02-24 10:52:06 -08:00
Sarah Wooders
4096b30cd7 feat: log LLM traces to clickhouse (#9111)
* feat: add non-streaming option for conversation messages

- Add ConversationMessageRequest with stream=True default (backwards compatible)
- stream=true (default): SSE streaming via StreamingService
- stream=false: JSON response via AgentLoop.load().step()

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: regenerate API schema for ConversationMessageRequest

* feat: add direct ClickHouse storage for raw LLM traces

    Adds ability to store raw LLM request/response payloads directly in ClickHouse,
    bypassing OTEL span attribute size limits. This enables debugging and analytics
    on large LLM payloads (>10MB system prompts, large tool schemas, etc.).

    New files:
    - letta/schemas/llm_raw_trace.py: Pydantic schema with ClickHouse row helper
    - letta/services/llm_raw_trace_writer.py: Async batching writer (fire-and-forget)
    - letta/services/llm_raw_trace_reader.py: Reader with query methods
    - scripts/sql/clickhouse/llm_raw_traces.ddl: Production table DDL
    - scripts/sql/clickhouse/llm_raw_traces_local.ddl: Local dev DDL
    - apps/core/clickhouse-init.sql: Local dev initialization

    Modified:
    - letta/settings.py: Added 4 settings (store_llm_raw_traces, ttl, batch_size, flush_interval)
    - letta/llm_api/llm_client_base.py: Integration into request_async_with_telemetry
    - compose.yaml: Added ClickHouse service for local dev
    - justfile: Added clickhouse, clickhouse-cli, clickhouse-traces commands

    Feature disabled by default (LETTA_STORE_LLM_RAW_TRACES=false).
    Uses ZSTD(3) compression for 10-30x reduction on JSON payloads.

    🤖 Generated with [Letta Code](https://letta.com)

    Co-Authored-By: Letta <noreply@letta.com>

* fix: address code review feedback for LLM raw traces

Fixes based on code review feedback:

1. Fix ClickHouse endpoint parsing - default to secure=False for raw host:port
   inputs (was defaulting to HTTPS which breaks local dev)

2. Make raw trace writes truly fire-and-forget - use asyncio.create_task()
   instead of awaiting, so JSON serialization doesn't block request path

3. Add bounded queue (maxsize=10000) - prevents unbounded memory growth
   under load. Drops traces with warning if queue is full.

4. Fix deprecated asyncio usage - get_running_loop() instead of get_event_loop()

5. Add org_id fallback - use _telemetry_org_id if actor doesn't have it

6. Remove unused imports - json import in reader

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add missing asyncio import and simplify JSON serialization

- Add missing 'import asyncio' that was causing 'name asyncio is not defined' error
- Remove unnecessary clean_double_escapes() function - the JSON is stored correctly,
  the clickhouse-client CLI was just adding extra escaping when displaying
- Update just clickhouse-trace to use Python client for correct JSON output

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: add clickhouse raw trace integration test

* test: simplify clickhouse trace assertions

* refactor: centralize usage parsing and stream error traces

Use per-client usage helpers for raw trace extraction and ensure streaming errors log requests with error metadata.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: exercise provider usage parsing live

Make live OpenAI/Anthropic/Gemini requests with credential gating and validate Anthropic cache usage mapping when present.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* test: fix usage parsing tests to pass

- Use GoogleAIClient with GEMINI_API_KEY instead of GoogleVertexClient
- Update model to gemini-2.0-flash (1.5-flash deprecated in v1beta)
- Add tools=[] for Gemini/Anthropic build_request_data

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: extract_usage_statistics returns LettaUsageStatistics

Standardize on LettaUsageStatistics as the canonical usage format returned by client helpers. Inline UsageStatistics construction for ChatCompletionResponse where needed.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add is_byok and llm_config_json columns to ClickHouse traces

Extend llm_raw_traces table with:
- is_byok (UInt8): Track BYOK vs base provider usage for billing analytics
- llm_config_json (String, ZSTD): Store full LLM config for debugging and analysis

This enables queries like:
- BYOK usage breakdown by provider/model
- Config parameter analysis (temperature, max_tokens, etc.)
- Debugging specific request configurations

* feat: add tests for error traces, llm_config_json, and cache tokens

- Update llm_raw_trace_reader.py to query new columns (is_byok,
  cached_input_tokens, cache_write_tokens, reasoning_tokens, llm_config_json)
- Add test_error_trace_stored_in_clickhouse to verify error fields
- Add test_cache_tokens_stored_for_anthropic to verify cache token storage
- Update existing tests to verify llm_config_json is stored correctly
- Make llm_config required in log_provider_trace_async()
- Simplify provider extraction to use provider_name directly

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* ci: add ClickHouse integration tests to CI pipeline

- Add use-clickhouse option to reusable-test-workflow.yml
- Add ClickHouse service container with otel database
- Add schema initialization step using clickhouse-init.sql
- Add ClickHouse env vars (CLICKHOUSE_ENDPOINT, etc.)
- Add separate clickhouse-integration-tests job running
  integration_test_clickhouse_llm_raw_traces.py

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify provider and org_id extraction in raw trace writer

- Use model_endpoint_type.value for provider (not provider_name)
- Simplify org_id to just self.actor.organization_id (actor is always pydantic)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify LLMRawTraceWriter with _enabled flag

- Check ClickHouse env vars once at init, set _enabled flag
- Early return in write_async/flush_async if not enabled
- Remove ValueError raises (never used)
- Simplify _get_client (no validation needed since already checked)

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: add LLMRawTraceWriter shutdown to FastAPI lifespan

Properly flush pending traces on graceful shutdown via lifespan
instead of relying only on atexit handler.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: add agent_tags column to ClickHouse traces

Store agent tags as Array(String) for filtering/analytics by tag.

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* cleanup

* fix(ci): fix ClickHouse schema initialization in CI

- Create database separately before loading SQL file
- Remove CREATE DATABASE from SQL file (handled in CI step)
- Add verification step to confirm table was created
- Use -sf flag for curl to fail on HTTP errors

🐾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: simplify LLM trace writer with ClickHouse async_insert

- Use ClickHouse async_insert for server-side batching instead of manual queue/flush loop
- Sync cloud DDL schema with clickhouse-init.sql (add missing columns)
- Remove redundant llm_raw_traces_local.ddl
- Remove unused batch_size/flush_interval settings
- Update tests for simplified writer

Key changes:
- async_insert=1, wait_for_async_insert=1 for reliable server-side batching
- Simple per-trace retry with exponential backoff (max 3 retries)
- ~150 lines removed from writer

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: consolidate ClickHouse direct writes into TelemetryManager backend

- Add clickhouse_direct backend to provider_trace_backends
- Remove duplicate ClickHouse write logic from llm_client_base.py
- Configure via LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse_direct

The clickhouse_direct backend:
- Converts ProviderTrace to LLMRawTrace
- Extracts usage stats from response JSON
- Writes via LLMRawTraceWriter with async_insert

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: address PR review comments and fix llm_config bug

Review comment fixes:
- Rename clickhouse_direct -> clickhouse_analytics (clearer purpose)
- Remove ClickHouse from OSS compose.yaml, create separate compose.clickhouse.yaml
- Delete redundant scripts/test_llm_raw_traces.py (use pytest tests)
- Remove unused llm_raw_traces_ttl_days setting (TTL handled in DDL)
- Fix socket description leak in telemetry_manager docstring
- Add cloud-only comment to clickhouse-init.sql
- Update justfile to use separate compose file

Bug fix:
- Fix llm_config not being passed to ProviderTrace in telemetry
- Now correctly populates provider, model, is_byok for all LLM calls
- Affects both request_async_with_telemetry and log_provider_trace_async

DDL optimizations:
- Add secondary indexes (bloom_filter for agent_id, model, step_id)
- Add minmax indexes for is_byok, is_error
- Change model and error_type to LowCardinality for faster GROUP BY

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: rename llm_raw_traces -> llm_traces

Address review feedback that "raw" is misleading since we denormalize fields.

Renames:
- Table: llm_raw_traces -> llm_traces
- Schema: LLMRawTrace -> LLMTrace
- Files: llm_raw_trace_{reader,writer}.py -> llm_trace_{reader,writer}.py
- Setting: store_llm_raw_traces -> store_llm_traces

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update workflow references to llm_traces

Missed renaming table name in CI workflow files.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: update clickhouse_direct -> clickhouse_analytics in docstring

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove inaccurate OTEL size limit comments

The 4MB limit is our own truncation logic, not an OTEL protocol limit.
The real benefit is denormalized columns for analytics queries.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: remove local ClickHouse dev setup (cloud-only feature)

- Delete clickhouse-init.sql and compose.clickhouse.yaml
- Remove local clickhouse just commands
- Update CI to use cloud DDL with MergeTree for testing

clickhouse_analytics is a cloud-only feature. For local dev, use postgres backend.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: restore compose.yaml to match main

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* refactor: merge clickhouse_analytics into clickhouse backend

Per review feedback - having two separate backends was confusing.

Now the clickhouse backend:
- Writes to llm_traces table (denormalized for cost analytics)
- Reads from OTEL traces table (will cut over to llm_traces later)

Config: LETTA_TELEMETRY_PROVIDER_TRACE_BACKEND=postgres,clickhouse

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: correct path to DDL file in CI workflow

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* chore: add provider index to DDL for faster filtering

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: configure telemetry backend in clickhouse tests

Tests need to set telemetry_settings.provider_trace_backends to include
'clickhouse', otherwise traces are routed to default postgres backend.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: set provider_trace_backend field, not property

provider_trace_backends is a computed property, need to set the
underlying provider_trace_backend string field instead.

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: error trace test and error_type extraction

- Add TelemetryManager to error trace test so traces get written
- Fix error_type extraction to check top-level before nested error dict

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: use provider_trace.id for trace correlation across backends

- Pass provider_trace.id to LLMTrace instead of auto-generating
- Log warning if ID is missing (shouldn't happen, helps debug)
- Fallback to new UUID only if not set

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* fix: trace ID correlation and concurrency issues

- Strip "provider_trace-" prefix from ID for UUID storage in ClickHouse
- Add asyncio.Lock to serialize writes (clickhouse_connect not thread-safe)
- Fix Anthropic prompt_tokens to include cached tokens for cost analytics
- Log warning if provider_trace.id is missing

🤖 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

---------

Co-authored-by: Letta <noreply@letta.com>
Co-authored-by: Caren Thomas <carenthomas@gmail.com>
2026-02-24 10:52:06 -08:00
Kian Jones
68eb076135 Revert "fix: force statement_timeout=0 on all database connections" (#9226)
Revert "fix: force statement_timeout=0 on all database connections (#9184)"

This reverts commit 0d3d9ea76bae586520ae8f50badb203ffd441675.
2026-02-24 10:52:06 -08:00
jnjpng
e25a0c9cdf feat: update compact endpoint to store summary message (#9215)
* base

* add tests
2026-02-24 10:52:06 -08:00
jnjpng
d28ccc0be6 feat: add summary message and event on compaction (#9144)
* base

* update

* update

* revert formatting

* routes

* legacy

* fix

* review

* update
2026-02-24 10:52:05 -08:00
Kian Jones
3b5251fbd6 fix: force statement_timeout=0 on all database connections (#9184)
Cloud SQL's connection pooler can return connections with stale
session settings from previous clients. If any client (Datadog,
monitoring tools, manual psql) sets statement_timeout, that setting
persists on the pooled connection.

This explicitly sets statement_timeout=0 via asyncpg's server_settings
on every connection, ensuring consistent behavior regardless of
connection pool state.

🤖 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-02-24 10:52:05 -08:00
Ari Webb
7b0b1f2531 fix: warning (#9179)
* fix: warning

* just stage publish api

* note

* api
2026-02-24 10:52:05 -08:00
Kian Jones
00045afc3c chore: add statement timeout log (#9177)
* fix: remove duplicate 'Tag and push latest' step from crouton workflow

* startuplog

* Update apps/core/letta/server/rest_api/app.py
2026-01-29 12:44:04 -08:00
cthomas
d992aa0df4 fix: non-streaming conversation messages endpoint (#9159)
* fix: non-streaming conversation messages endpoint

**Problems:**
1. `AssertionError: run_id is required when enforce_run_id_set is True`
   - Non-streaming path didn't create a run before calling `step()`

2. `ResponseValidationError: Unable to extract tag using discriminator 'message_type'`
   - `response_model=LettaStreamingResponse` but non-streaming returns `LettaResponse`

**Fixes:**
1. Add run creation before calling `step()` (mirrors agents endpoint)
2. Set run_id in Redis for cancellation support
3. Pass `run_id` to `step()`
4. Change `response_model` from `LettaStreamingResponse` to `LettaResponse`
   (streaming returns `StreamingResponse` which bypasses response_model validation)

**Test:**
Added `test_conversation_non_streaming_raw_http` to verify the fix.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* api sync

---------

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
cthomas
dd7e28fae6 fix: return 400 instead of 500 for image fetch errors (#9157)
**Problem:**
When a user sends a message with an image URL that times out or fails to
fetch, the server returns a 500 Internal Server Error with a generic message.
This is confusing because the user doesn't know what went wrong.

**Root Cause:**
`LettaImageFetchError` was not registered in the exception handlers, so it
bubbled up as an unhandled exception.

**Fix:**
Register `LettaImageFetchError` with the 400 Bad Request handler. Now users
get a clear error message like:
```
Failed to fetch image from https://...: Timeout after 2 attempts
```

This tells users exactly what went wrong so they can retry with a different
image or verify the URL is accessible.

👾 Generated with [Letta Code](https://letta.com)

Co-authored-by: Letta <noreply@letta.com>
2026-01-29 12:44:04 -08:00
Shubham Naik
bb2145c24c connections (#9113)
* chore: release code

* chore: release code

* chore: release code

* chore: release code

* chore: release code

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: change paths

* chore: remote

* chore: support  multi project chat
2026-01-29 12:44:04 -08:00