* feat: add memfs-py service
* add tf for bucket access and secrets v2 access
* feat(memfs): add helm charts, deploy workflow, and bug fixes
- Add dev helm chart (helm/dev/memfs-py/) with CSI secrets pattern
- Update prod helm chart with CSI secrets and correct service account
- Add GitHub Actions deploy workflow
- Change port from 8284 to 8285 to avoid conflict with core's dulwich sidecar
- Fix chunked transfer encoding issue (strip HTTP_TRANSFER_ENCODING header)
- Fix timestamp parsing to handle both ISO and HTTP date formats
- Fix get_head_sha to raise FileNotFoundError on 404
🐾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Kian Jones <kian@letta.com>
Co-authored-by: Letta <noreply@letta.com>
* feat: add /agents/{agent_id}/generate endpoint for direct LLM requests
Add new endpoint that makes direct LLM provider requests without agent
context, memory, tools, or state modification. This enables:
- Quick LLM queries without agent overhead
- Testing model configurations
- Simple chat completions using agent's credentials
- Comparing responses across different models
Features:
- Uses agent's LLM config by default
- Supports model override with full provider config resolution
- Non-streaming, stateless operation
- Proper error handling and validation
- Request/response schemas with Pydantic validation
Implementation:
- Add GenerateRequest and GenerateResponse schemas
- Implement generate_completion endpoint handler
- Add necessary imports (LLMError, LLMClient, HandleNotFoundError)
- Include logging and comprehensive error handling
* fix: improve error handling and fix Message construction
- Fix critical bug: use content=[TextContent(text=...)] instead of text=...
- Add explicit error handling for NoResultFound and HandleNotFoundError
- Add error handling for convert_response_to_chat_completion
- Add structured logging for debugging
- Remove unnecessary .get() calls since Pydantic validates messages
* refactor: extract generate logic to AgentCompletionService
Move the generate endpoint business logic out of the endpoint handler
into a dedicated AgentCompletionService class for better code organization
and separation of concerns.
Changes:
- Create new AgentCompletionService in services/agent_completion_service.py
- Service handles all business logic: agent validation, LLM config resolution,
message conversion, LLM client creation, and request/response processing
- Integrate service with SyncServer initialization
- Refactor generate_completion endpoint to use the service
- Endpoint now only handles HTTP concerns (auth, error mapping)
Benefits:
- Cleaner endpoint code (reduced from ~140 lines to ~25 lines)
- Better separation of concerns (HTTP vs business logic)
- Service logic can be reused or tested independently
- Follows established patterns in the codebase (AgentManager, etc.)
* feat: simplify generate API to accept just prompt text
Simplify the client interface by accepting a simple prompt string instead
of requiring clients to format messages.
Changes:
- Update GenerateRequest schema:
- Replace 'messages' array with simple 'prompt' string
- Add optional 'system_prompt' for context/instructions
- Keep 'override_model' for model selection
- Update AgentCompletionService to format messages automatically:
- Accepts prompt and optional system_prompt
- Constructs message array internally (system + user messages)
- Simpler API surface for clients
- Update endpoint documentation with new simplified examples
- Regenerate OpenAPI spec and TypeScript SDK
Benefits:
- Much simpler client experience - just send text
- No need to understand message formatting
- Still supports system prompts for context
- Cleaner API that matches common use cases
Example (before):
{
"messages": [{"role": "user", "content": "What is 2+2?"}]
}
Example (after):
{
"prompt": "What is 2+2?"
}
* test: add comprehensive integration tests for generate endpoint
Add 9 integration tests covering various scenarios:
Happy path tests:
- test_agent_generate_basic: Basic prompt -> response flow
- test_agent_generate_with_system_prompt: System prompt + user prompt
- test_agent_generate_with_model_override: Override model selection
- test_agent_generate_long_prompt: Handle longer prompts
- test_agent_generate_no_persistence: Verify no messages saved to agent
Error handling tests:
- test_agent_generate_empty_prompt_error: Empty prompt validation (422)
- test_agent_generate_invalid_agent_id: Invalid agent ID (404)
- test_agent_generate_invalid_model_override: Invalid model handle (404)
All tests verify:
- Response structure (content, model, usage)
- Proper status codes for errors
- Usage statistics (tokens, counts)
- No side effects on agent state
Tests follow existing test patterns in test_client.py and use the
letta_client SDK (assuming generate_completion method is auto-generated
from the OpenAPI spec).
* openapi
* refactor: rename AgentCompletionService to AgentGenerateCompletionManager
Rename for better clarity and consistency with codebase naming conventions:
- Rename file: agent_completion_service.py → agent_generate_completion_manager.py
- Rename class: AgentCompletionService → AgentGenerateCompletionManager
- Rename attribute: server.agent_completion_service → server.agent_generate_completion_manager
- Update docstrings: 'Service' → 'Manager'
Changes:
- apps/core/letta/services/agent_generate_completion_manager.py (renamed + updated class)
- apps/core/letta/server/server.py (import + initialization)
- apps/core/letta/server/rest_api/routers/v1/agents.py (usage in endpoint)
No functional changes, purely a naming refactor.
* fix: remove invalid Message parameters in generate manager
Remove agent_id=None and user_id=None from Message construction.
The Message model doesn't accept these as None values - only pass
required parameters (role, content).
Fixes validation error:
'Extra inputs are not permitted [type=extra_forbidden, input_value=None]'
This aligns with other Message construction patterns in the codebase
(see tools.py, memory.py examples).
* feat: improve generate endpoint validation and tests
- Add field validator for whitespace-only prompts
- Always include system message (required by Anthropic)
- Use default "You are a helpful assistant." when no system_prompt provided
- Update tests to use direct HTTP calls via httpx
- Fix test issues:
- Use valid agent ID format (agent-{uuid})
- Use available model (openai/gpt-4o-mini)
- Add whitespace validation test
- All 9 integration tests passing
* feat(core): add git-backed memory repos and block manager
Introduce a GCS-backed git repository per agent as the source of truth for core
memory blocks. Add a GitEnabledBlockManager that writes block updates to git and
syncs values back into Postgres as a cache.
Default newly-created memory repos to the `main` branch.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* feat(core): serve memory repos over git smart HTTP
Run dulwich's WSGI HTTPGitApplication on a local sidecar port and proxy
/v1/git/* through FastAPI to support git clone/fetch/push directly against
GCS-backed memory repos.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): create memory repos on demand and stabilize git HTTP
- Ensure MemoryRepoManager creates the git repo on first write (instead of 500ing)
and avoids rewriting history by only auto-creating on FileNotFoundError.
- Simplify dulwich-thread async execution and auto-create empty repos on first
git clone.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): make dulwich optional for CI installs
Guard dulwich imports in the git smart HTTP router so the core server can boot
(and CI tests can run) without installing the memory-repo extra.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): guard git HTTP WSGI init when dulwich missing
Avoid instantiating dulwich's HTTPGitApplication at import time when dulwich
isn't installed (common in CI installs).
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): avoid masking send_message errors in finally
Initialize `result` before the agent loop so error paths (e.g. approval
validation) don't raise UnboundLocalError in the run-tracking finally block.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): stop event loop watchdog on FastAPI shutdown
Ensure the EventLoopWatchdog thread is stopped during FastAPI lifespan
shutdown to avoid daemon threads logging during interpreter teardown (seen in CI
unit tests).
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore(core): remove send_*_message_to_agent from SyncServer
Drop send_message_to_agent and send_group_message_to_agent from SyncServer and
route internal fire-and-forget messaging through send_messages helpers instead.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): backfill git memory repo when tag added
When an agent is updated to include the git-memory-enabled tag, ensure the
git-backed memory repo is created and initialized from the agent's current
blocks. Also support configuring the memory repo object store via
LETTA_OBJECT_STORE_URI.
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): preserve block tags on git-enabled updates
When updating a block for a git-memory-enabled agent, keep block tags in sync
with PostgreSQL (tags are not currently stored in the git repo).
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore(core): remove git-state legacy shims
- Rename optional dependency extra from memory-repo to git-state
- Drop legacy object-store env aliases and unused region config
- Simplify memory repo metadata to a single canonical format
- Remove unused repo-cache invalidation helper
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* fix(core): keep PR scope for git-backed blocks
- Revert unrelated change in fire-and-forget multi-agent send helper
- Route agent block updates-by-label through injected block manager only when needed
👾 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
---------
Co-authored-by: Letta <noreply@letta.com>
* fix: don't need embedding model for self hosted
* stage publish api
* passes tests
* add test
* remove unnecessary upgrades
* update revision order db migrations
* add timeout for ci
* feat: byok provider models in db also
* make tests and sync api
* fix inconsistent state with recreating provider of same name
* fix sync on byok creation
* update revision
* move stripe code for testing purposes
* revert
* add refresh byok models endpoint
* just stage publish api
* add tests
* reorder revision
* add test for name clashes
* feat: enable bedrock for anthropic models
* parallel tool calls in ade
* attempt add to ci
* update tests
* add env vars
* hardcode region
* get it working
* debugging
* add bedrock extra
* default env var [skip ci]
* run ci
* reasoner model update
* secrets
* clean up log
* clean up
* feat: Add conversation_id filtering to message list and search endpoints
Add optional conversation_id parameter to filter messages by conversation:
- client.agents.messages.list
- client.messages.list
- client.messages.search
Changes:
- Added conversation_id field to MessageSearchRequest and SearchAllMessagesRequest schemas
- Added conversation_id filtering to list_messages in message_manager.py
- Updated get_agent_recall_async and get_all_messages_recall_async in server.py
- Added conversation_id query parameter to router endpoints
- Updated Turbopuffer client to support conversation_id filtering in searches
Fixes#8320🤖 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Charles Packer <cpacker@users.noreply.github.com>
* add conversation_id to message and tpuf
* default messages filter for backward compatibility
* add test and auto gen
* fix integration test
* fix test
* update test
---------
Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com>
Co-authored-by: Charles Packer <cpacker@users.noreply.github.com>
Co-authored-by: christinatong01 <christina@letta.com>
* feat: add message_types filter to list messages endpoint
Add the ability to filter messages by type when listing message history
via GET /v1/agents/{agent_id}/messages. This brings parity with the
create message endpoint which already supports include_return_message_types.
Changes:
- Add message_types query parameter to list_messages endpoint in agents.py
- Add message_types parameter to get_agent_recall_async in server.py
- Filter messages by message_type after LettaMessage conversion
- Add test for message_types filtering
Closes#8277
Written by Cameron ◯ Letta Code
> "Simplicity is the ultimate sophistication." - Leonardo da Vinci
🐙 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
* chore: regenerate OpenAPI spec and SDK for message_types filter
🐧 Generated with [Letta Code](https://letta.com)
Co-Authored-By: Letta <noreply@letta.com>
Written by Cameron ◯ Letta Code
"The only way to do great work is to love what you do." - Steve Jobs
---------
Co-authored-by: Letta <noreply@letta.com>
* feat: add zai provider support
* add zai_api_key secret to deploy-core
* add to justfile
* add testing, provider integration skill
* enable zai key
* fix zai test
* clean up skill a little
* small changes
* Revert "fix test"
This reverts commit 5126815f23cefb4edad3e3bf9e7083209dcc7bf1.
* fix server and better test
* test fix, get api key for base and byok?
* set letta default endpoint
* try to fix timeout for test
* fix for letta api key
* Delete apps/core/tests/sdk_v1/conftest.py
* Update utils.py
* clean up a few issues
* fix filterning on list_llm_models
* soft delete models with provider
* add one more test
* fix ci
* add timeout
* band aid for letta embedding provider
* info instead of error logs when creating models
* claude code first pass
* rename routes
* search_messages and list_messages
* revert agents messagesearch
* generate api
* fix backend for list all messages
* request for message search
* return list of letta message
* add tests
* error in archive endpoint
* archive delete return type wrong
* optional params for archive creation
* add passage to tpuf on create
* fix archive manager
* support global passage search
* search by agent
* just do basic org wide search for now
* change message test to be about fresh data, cleanup after
---------
Co-authored-by: Ari Webb <ari@letta.com>
* parallel tool calling in model settings
* configs for send message sdk v1
* change models for all tests
---------
Co-authored-by: Ari Webb <ari@letta.com>
* add memory tracking to core
* move to asyncio from threading.Thread
* remove threading.thread all the way
* delay decorator monitoring initialization until after event loop is registered
* context manager to decorator
* add psutil