* feat: add /agents/{agent_id}/generate endpoint for direct LLM requests
Add new endpoint that makes direct LLM provider requests without agent
context, memory, tools, or state modification. This enables:
- Quick LLM queries without agent overhead
- Testing model configurations
- Simple chat completions using agent's credentials
- Comparing responses across different models
Features:
- Uses agent's LLM config by default
- Supports model override with full provider config resolution
- Non-streaming, stateless operation
- Proper error handling and validation
- Request/response schemas with Pydantic validation
Implementation:
- Add GenerateRequest and GenerateResponse schemas
- Implement generate_completion endpoint handler
- Add necessary imports (LLMError, LLMClient, HandleNotFoundError)
- Include logging and comprehensive error handling
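For illustration, the schemas added here might look roughly like the following sketch (the field names echo the rest of this description; exact types, defaults, and the shape of the usage payload are assumptions):

from typing import Optional

from pydantic import BaseModel, Field


class GenerateRequest(BaseModel):
    # OpenAI-style message dicts, e.g. {"role": "user", "content": "..."}
    messages: list[dict] = Field(..., description="Messages to send to the model")
    # Optional model handle used to override the agent's configured LLM
    override_model: Optional[str] = Field(None, description="Model handle override")


class GenerateResponse(BaseModel):
    content: str = Field(..., description="Text of the completion")
    model: str = Field(..., description="Model that produced the completion")
    usage: dict = Field(default_factory=dict, description="Token usage statistics")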
* fix: improve error handling and fix Message construction
- Fix critical bug: use content=[TextContent(text=...)] instead of text=...
- Add explicit error handling for NoResultFound and HandleNotFoundError
- Add error handling for convert_response_to_chat_completion
- Add structured logging for debugging
- Remove unnecessary .get() calls since Pydantic validates messages
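For illustration, the explicit error mapping might be shaped roughly like this (the helper name and import paths are assumptions; the 404s follow the test expectations later in this description, and the 502 for provider errors is an assumption):

from fastapi import HTTPException
from sqlalchemy.exc import NoResultFound

# Assumed import path for letta's error classes
from letta.errors import HandleNotFoundError, LLMError


def map_generate_error(exc: Exception, agent_id: str) -> HTTPException:
    """Translate lower-level exceptions into HTTP errors for the endpoint."""
    if isinstance(exc, NoResultFound):
        return HTTPException(status_code=404, detail=f"Agent {agent_id} not found")
    if isinstance(exc, HandleNotFoundError):
        return HTTPException(status_code=404, detail="Model handle could not be resolved")
    if isinstance(exc, LLMError):
        return HTTPException(status_code=502, detail=f"LLM provider error: {exc}")
    return HTTPException(status_code=500, detail="Unexpected error during generation")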
* refactor: extract generate logic to AgentCompletionService
Move the generate endpoint business logic out of the endpoint handler
into a dedicated AgentCompletionService class for better code organization
and separation of concerns.
Changes:
- Create new AgentCompletionService in services/agent_completion_service.py
- Service handles all business logic: agent validation, LLM config resolution,
message conversion, LLM client creation, and request/response processing
- Integrate service with SyncServer initialization
- Refactor generate_completion endpoint to use the service
- Endpoint now only handles HTTP concerns (auth, error mapping)
Benefits:
- Cleaner endpoint code (reduced from ~140 lines to ~25 lines)
- Better separation of concerns (HTTP vs business logic)
- Service logic can be reused or tested independently
- Follows established patterns in the codebase (AgentManager, etc.)
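A condensed, illustrative sketch of the split (only the class name and its responsibilities come from this change; the method and dependency names below are assumptions):

class AgentCompletionService:
    """Business logic behind POST /agents/{agent_id}/generate."""

    def __init__(self, agent_manager, llm_client_factory):
        self.agent_manager = agent_manager            # validates and loads agents
        self.llm_client_factory = llm_client_factory  # builds provider-specific LLM clients

    async def generate(self, agent_id, request, actor):
        # Validate the agent and resolve which LLM config to use
        agent = self.agent_manager.get_agent_by_id(agent_id=agent_id, actor=actor)
        llm_config = agent.llm_config
        if request.override_model is not None:
            llm_config = self.agent_manager.resolve_handle(request.override_model, actor=actor)

        # Stateless, non-streaming call: convert messages, hit the provider,
        # and return the converted response; nothing is written back to the agent
        client = self.llm_client_factory(llm_config)
        return await client.send_llm_request(messages=request.messages, llm_config=llm_config)

The endpoint itself then reduces to authentication, a single call into server.agent_completion_service.generate(...), and mapping exceptions to HTTP status codes.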
* feat: simplify generate API to accept just prompt text
Simplify the client interface by accepting a simple prompt string instead
of requiring clients to format messages.
Changes:
- Update GenerateRequest schema:
- Replace 'messages' array with simple 'prompt' string
- Add optional 'system_prompt' for context/instructions
- Keep 'override_model' for model selection
- Update AgentCompletionService to format messages automatically:
- Accepts prompt and optional system_prompt
- Constructs message array internally (system + user messages)
- Simpler API surface for clients
- Update endpoint documentation with new simplified examples
- Regenerate OpenAPI spec and TypeScript SDK
Benefits:
- Much simpler client experience - just send text
- No need to understand message formatting
- Still supports system prompts for context
- Cleaner API that matches common use cases
Example (before):
{
  "messages": [{"role": "user", "content": "What is 2+2?"}]
}
Example (after):
{
  "prompt": "What is 2+2?"
}
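For illustration, the message construction the service now performs internally might look roughly like this (the helper name is hypothetical):

from typing import Optional


def build_messages(prompt: str, system_prompt: Optional[str] = None) -> list[dict]:
    """Assemble the OpenAI-style message array from the simplified request."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages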
* test: add comprehensive integration tests for generate endpoint
Add 9 integration tests covering various scenarios:
Happy path tests:
- test_agent_generate_basic: Basic prompt -> response flow
- test_agent_generate_with_system_prompt: System prompt + user prompt
- test_agent_generate_with_model_override: Override model selection
- test_agent_generate_long_prompt: Handle longer prompts
- test_agent_generate_no_persistence: Verify no messages saved to agent
Error handling tests:
- test_agent_generate_empty_prompt_error: Empty prompt validation (422)
- test_agent_generate_invalid_agent_id: Invalid agent ID (404)
- test_agent_generate_invalid_model_override: Invalid model handle (404)
All tests verify:
- Response structure (content, model, usage)
- Proper status codes for errors
- Usage statistics (tokens, counts)
- No side effects on agent state
Tests follow existing test patterns in test_client.py and use the
letta_client SDK (assuming generate_completion method is auto-generated
from the OpenAPI spec).
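As a condensed illustration, the basic happy-path test might look roughly like this (fixture names, the generate_completion method, the usage shape, and the message-listing call are all assumptions):

def test_agent_generate_basic(client, agent):
    response = client.agents.generate_completion(
        agent_id=agent.id,
        prompt="What is 2+2?",
    )

    # Response structure: content, model, and usage should all be populated
    assert response.content
    assert response.model
    assert response.usage.total_tokens > 0

    # No side effects: nothing from this request is persisted to the agent
    messages_after = client.agents.messages.list(agent_id=agent.id)
    assert all("2+2" not in str(m) for m in messages_after)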
* openapi
* refactor: rename AgentCompletionService to AgentGenerateCompletionManager
Rename for better clarity and consistency with codebase naming conventions:
- Rename file: agent_completion_service.py → agent_generate_completion_manager.py
- Rename class: AgentCompletionService → AgentGenerateCompletionManager
- Rename attribute: server.agent_completion_service → server.agent_generate_completion_manager
- Update docstrings: 'Service' → 'Manager'
Changes:
- apps/core/letta/services/agent_generate_completion_manager.py (renamed + updated class)
- apps/core/letta/server/server.py (import + initialization)
- apps/core/letta/server/rest_api/routers/v1/agents.py (usage in endpoint)
No functional changes, purely a naming refactor.
* fix: remove invalid Message parameters in generate manager
Remove agent_id=None and user_id=None from Message construction.
The Message model doesn't accept these as None values - only pass
required parameters (role, content).
Fixes validation error:
'Extra inputs are not permitted [type=extra_forbidden, input_value=None]'
This aligns with other Message construction patterns in the codebase
(see tools.py, memory.py examples).
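For reference, the corrected construction looks roughly like this (import paths are assumed from the files cited above):

from letta.schemas.letta_message_content import TextContent
from letta.schemas.message import Message

# Only required fields are passed; agent_id/user_id are omitted entirely
# rather than set to None, and content is a list of content parts
msg = Message(
    role="user",
    content=[TextContent(text="What is 2+2?")],
)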
* feat: improve generate endpoint validation and tests
- Add field validator for whitespace-only prompts
- Always include system message (required by Anthropic)
- Use default "You are a helpful assistant." when no system_prompt provided
- Update tests to use direct HTTP calls via httpx
- Fix test issues:
- Use valid agent ID format (agent-{uuid})
- Use available model (openai/gpt-4o-mini)
- Add whitespace validation test
- All 9 integration tests passing
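A sketch of the prompt validation and default system message described above (Pydantic v2's field_validator and the helper name are assumptions; other request fields are omitted for brevity):

from typing import Optional

from pydantic import BaseModel, field_validator

DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."


class GenerateRequest(BaseModel):
    prompt: str
    system_prompt: Optional[str] = None

    @field_validator("prompt")
    @classmethod
    def prompt_not_blank(cls, v: str) -> str:
        # Whitespace-only prompts fail validation and surface to clients as HTTP 422
        if not v.strip():
            raise ValueError("prompt must not be empty or whitespace-only")
        return v


def system_message_for(request: GenerateRequest) -> dict:
    # A system message is always included (Anthropic requires one); fall back
    # to the default when no system_prompt was supplied
    return {"role": "system", "content": request.system_prompt or DEFAULT_SYSTEM_PROMPT}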
Letta (formerly MemGPT)
Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time.
- Letta Code: run agents locally in your terminal
- Letta API: build agents into your applications
Get started in the CLI
Requires Node.js 18+
- Install the Letta Code CLI tool: npm install -g @letta-ai/letta-code
- Run letta in your terminal to launch an agent with memory running on your local computer
When running the CLI tool, your agent can help you code and do any task you can do on your computer.
Letta Code supports skills and subagents, and bundles pre-built skills/subagents for advanced memory and continual learning. Letta is fully model-agnostic, though we recommend Opus 4.5 and GPT-5.2 for best performance (see our model leaderboard for our rankings).
Get started with the Letta API
Use the Letta API to integrate stateful agents into your own applications. Letta has a full-featured agents API, plus Python and TypeScript SDKs (view our API reference).
Installation
TypeScript / Node.js:
npm install @letta-ai/letta-client
Python:
pip install letta-client
Hello World example
Below is a quick example of creating a stateful agent and sending it a message (requires a Letta API key). See the full quickstart guide for complete documentation.
TypeScript:
import Letta from "@letta-ai/letta-client";

const client = new Letta({ apiKey: process.env.LETTA_API_KEY });

// Create your agent
const agentState = await client.agents.create({
  model: "openai/gpt-5.2",
  memory_blocks: [
    {
      label: "human",
      value:
        "Name: Timber. Status: dog. Occupation: building Letta, infrastructure to democratize self-improving superintelligence",
    },
    {
      label: "persona",
      value:
        "I am a self-improving superintelligence. Timber is my best friend and collaborator.",
    },
  ],
  tools: ["web_search", "fetch_webpage"],
});

console.log("Agent created with ID:", agentState.id);

// Send your agent a message
const response = await client.agents.messages.create(agentState.id, {
  input: "What do you know about me?",
});

for (const message of response.messages) {
  console.log(message);
}
Python:
from letta_client import Letta
import os

client = Letta(api_key=os.getenv("LETTA_API_KEY"))

# Create your agent
agent_state = client.agents.create(
    model="openai/gpt-5.2",
    memory_blocks=[
        {
            "label": "human",
            "value": "Name: Timber. Status: dog. Occupation: building Letta, infrastructure to democratize self-improving superintelligence"
        },
        {
            "label": "persona",
            "value": "I am a self-improving superintelligence. Timber is my best friend and collaborator."
        }
    ],
    tools=["web_search", "fetch_webpage"]
)

print(f"Agent created with ID: {agent_state.id}")

# Send your agent a message
response = client.agents.messages.create(
    agent_id=agent_state.id,
    input="What do you know about me?"
)

for message in response.messages:
    print(message)
Contributing
Letta is an open source project built by over a hundred contributors from around the world. There are many ways to get involved in the Letta OSS project!
- Join the Discord: Chat with the Letta devs and other AI developers.
- Chat on our forum: If you're not into Discord, check out our developer forum.
- Follow our socials: Twitter/X, LinkedIn, YouTube
Legal notices: By using Letta and related Letta services (such as the Letta endpoint or hosted service), you are agreeing to our privacy policy and terms of service.
