docs: clarify agent tools vs SDK endpoints in archival memory guides (#5555)

Separate agent tool calls (archival_memory_insert, archival_memory_search)
from SDK endpoints (client.agents.passages.*) throughout the archival memory
documentation to reduce confusion between what agents do autonomously vs what
developers do via the SDK.

Changes:
- Remove language tabs from agent tool examples (not language-specific)
- Add Info callout explaining the two interaction levels
- Add SDK search example for symmetry with agent search example
- Reorganize best practices into "Agent" and "Developer (SDK)" sections
- Remove comparison table
- Add troubleshooting section with common issues
- Update pagination references to use cursor-based approach
This commit is contained in:
Cameron Pfiffer
2025-10-20 09:20:12 -07:00
committed by Caren Thomas
parent e61d276f5d
commit b0423b4912
3 changed files with 828 additions and 0 deletions


@@ -0,0 +1,411 @@
---
title: Best Practices
subtitle: Patterns, pitfalls, and advanced usage
slug: guides/agents/archival-best-practices
---
## Agent best practices
These patterns help agents use archival memory effectively during conversations.
### 1. Avoid over-insertion
The most common pitfall is inserting too many memories, creating clutter. Trust the agent to decide what's worth storing long-term.
### 2. Use tags consistently
Establish a tag taxonomy and stick to it. Good language models typically handle tagging well.
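On the developer side, one way to keep SDK-inserted tags aligned with that taxonomy is to normalize them before insertion. The sketch below is illustrative only: `ALLOWED_TAGS` and `normalize_tags` are hypothetical names, not part of Letta.

```python
# Hypothetical taxonomy; replace with the tags your agents actually use.
ALLOWED_TAGS = {"technical", "policy", "user_info", "nexus-6"}


def normalize_tags(raw_tags, allowed=ALLOWED_TAGS):
    """Lowercase, trim, and de-duplicate tags; reject anything outside the taxonomy."""
    cleaned = []
    for tag in raw_tags:
        tag = tag.strip().lower().replace(" ", "_")
        if tag not in allowed:
            raise ValueError(f"unknown tag: {tag!r}")
        if tag not in cleaned:
            cleaned.append(tag)
    return cleaned
```

Rejecting unknown tags, rather than silently accepting them, keeps near-duplicates such as `spec` and `specifications` from drifting into the archive.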
### 3. Add context to insertions
❌ Don't: "Likes replicants"
✅ Do: "Deckard shows unusual empathy toward replicants, particularly Rachael, suggesting possible replicant identity"
### 4. Let agents experiment
Agents can test different query styles to understand what works:
```python
# What the agent does (agent tool call)
archival_memory_search(query="How does the Voight-Kampff test work?")
archival_memory_search(query="Voight-Kampff procedure")
archival_memory_search(query="replicant detection method")
```
**Important:** Have the agent record what it learns from these experiments in a memory block (such as `archival_tracking` or `archival_policies`) rather than in archival memory itself, which would clutter the archive with meta-information.
## Developer best practices (SDK)
These patterns help developers configure and manage archival memory via the SDK.
### Backfilling archives
Developers can pre-load archival memory with existing knowledge via the SDK:
<CodeGroup>
```typescript TypeScript
// Load company policies
const policies = [
  "All replicants must undergo Voight-Kampff testing upon arrival",
  "Blade Runner units are authorized to retire rogue replicants",
  "Tyrell Corporation employees must report suspected replicants immediately"
];
for (const policy of policies) {
  await client.agents.passages.insert(agent.id, {
    content: policy,
    tags: ["policy", "company", "protocol"]
  });
}

// Load technical documentation
const docs = [
  {
    content: "Nexus-6 replicants: Superior strength, agility, and intelligence. Four-year lifespan prevents emotional development.",
    tags: ["technical", "nexus-6", "specifications"]
  },
  {
    content: "Voight-Kampff test: Measures capillary dilation, blush response, and pupil dilation to detect replicants.",
    tags: ["technical", "testing", "voight-kampff"]
  }
];
for (const doc of docs) {
  await client.agents.passages.insert(agent.id, {
    content: doc.content,
    tags: doc.tags
  });
}
```
```python Python
# Load company policies
policies = [
    "All replicants must undergo Voight-Kampff testing upon arrival",
    "Blade Runner units are authorized to retire rogue replicants",
    "Tyrell Corporation employees must report suspected replicants immediately"
]
for policy in policies:
    client.agents.passages.insert(
        agent_id=agent.id,
        content=policy,
        tags=["policy", "company", "protocol"]
    )

# Load technical documentation
docs = [
    {
        "content": "Nexus-6 replicants: Superior strength, agility, and intelligence. Four-year lifespan prevents emotional development.",
        "tags": ["technical", "nexus-6", "specifications"]
    },
    {
        "content": "Voight-Kampff test: Measures capillary dilation, blush response, and pupil dilation to detect replicants.",
        "tags": ["technical", "testing", "voight-kampff"]
    }
]
for doc in docs:
    client.agents.passages.insert(
        agent_id=agent.id,
        content=doc["content"],
        tags=doc["tags"]
    )
```
</CodeGroup>
**Use cases for backfilling:**
- Migrating knowledge bases to Letta
- Seeding specialized agents with domain knowledge
- Loading historical conversation logs
- Importing research libraries
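When migrating a larger knowledge base, the same content often appears more than once. A minimal sketch of a backfill loop that skips exact duplicates is shown here; `backfill` is a hypothetical helper, and the only SDK call it relies on is the `client.agents.passages.insert(agent_id=..., content=..., tags=...)` form used above.

```python
def backfill(client, agent_id, items):
    """Insert (content, tags) pairs, skipping repeated content.

    Returns the number of passages actually inserted.
    """
    seen = set()
    inserted = 0
    for content, tags in items:
        # Treat entries as duplicates if they differ only in case or whitespace.
        key = " ".join(content.split()).lower()
        if key in seen:
            continue
        seen.add(key)
        client.agents.passages.insert(
            agent_id=agent_id,
            content=content,
            tags=list(tags),
        )
        inserted += 1
    return inserted
```

This only de-duplicates within one run; it does not check what is already stored in the archive.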
### Create an archival policies block
Help your agent learn how to use archival memory effectively by creating a dedicated memory block for archival usage policies:
<CodeGroup>
```typescript TypeScript
await client.blocks.create({
  label: "archival_policies",
  value: `
When to insert into archival:
- User preferences and important facts about the user
- Technical specifications and reference information
- Significant decisions or outcomes from conversations
When NOT to insert:
- Temporary conversational context
- Information already stored
- Trivial details or pleasantries
Search strategies:
- Use natural language questions for best results
- Include tags when filtering by category
- Try semantic variations if first search doesn't find what you need
`
});
```
```python Python
client.blocks.create(
    label="archival_policies",
    value="""
When to insert into archival:
- User preferences and important facts about the user
- Technical specifications and reference information
- Significant decisions or outcomes from conversations
When NOT to insert:
- Temporary conversational context
- Information already stored
- Trivial details or pleasantries
Search strategies:
- Use natural language questions for best results
- Include tags when filtering by category
- Try semantic variations if first search doesn't find what you need
"""
)
```
</CodeGroup>
You can improve this block through conversation with your agent:
> **You:** "I noticed you didn't store the fact that I prefer TypeScript for backend development. Update your archival policies block to ensure you capture language preferences in the future."
> **Agent:** Updates the archival_policies block to include "Programming language preferences" under "When to insert into archival"
This collaborative approach helps agents learn from mistakes and improve their archival memory usage over time.
### Track query effectiveness
Build self-improving agents by having them track archival search effectiveness in a memory block:
<CodeGroup>
```typescript TypeScript
// Create a memory block for tracking
await client.blocks.create({
  label: "archival_tracking",
  value: `
Query patterns: Natural language questions work best
Recent searches: "test procedures" (3 results), "replicant specs" (5 results)
Success rate: ~85% of searches return relevant results
Frequently searched topics: [technical specifications, protocols, case histories]
Common patterns: Queries about technical specs work better than vague questions
Improvements needed: Add more tags for better filtering
`
});
```
```python Python
# Create a memory block for tracking
client.blocks.create(
    label="archival_tracking",
    value="""
Query patterns: Natural language questions work best
Recent searches: "test procedures" (3 results), "replicant specs" (5 results)
Success rate: ~85% of searches return relevant results
Frequently searched topics: [technical specifications, protocols, case histories]
Common patterns: Queries about technical specs work better than vague questions
Improvements needed: Add more tags for better filtering
"""
)
```
</CodeGroup>
The agent can update this block based on search results and continuously refine its archival strategy.
### Enforcing archival usage with tool rules
If your agent forgets to use archival memory, you should first try prompting the agent to use it more consistently. If prompting alone doesn't work, you can enforce archival usage with [tool rules](/guides/agents/tool-rules).
**Force archival search at turn start:**
<CodeGroup>
```typescript TypeScript
await client.agents.update(agent.id, {
  toolRules: [
    { type: "init", toolName: "archival_memory_search" }
  ]
});
```
```python Python
from letta_client.types import InitToolRule

client.agents.update(
    agent_id=agent.id,
    tool_rules=[
        InitToolRule(tool_name="archival_memory_search")
    ]
)
```
</CodeGroup>
**Require archival insertion before exit:**
<CodeGroup>
```typescript TypeScript
await client.agents.update(agent.id, {
  toolRules: [
    {
      type: "child",
      toolName: "send_message",
      children: ["archival_memory_insert"]
    }
  ]
});
```
```python Python
from letta_client.types import ChildToolRule

client.agents.update(
    agent_id=agent.id,
    tool_rules=[
        ChildToolRule(
            tool_name="send_message",
            children=["archival_memory_insert"]
        )
    ]
)
```
</CodeGroup>
<Info>
**Using the ADE:** Tool rules can also be configured in the Agent Development Environment's Tool Manager interface.
</Info>
<Warning>
**Note:** Anthropic models don't support strict structured output, so tool rules may not be enforced. Use OpenAI or Gemini models for guaranteed tool rule compliance.
</Warning>
**When to use tool rules:**
- Knowledge management agents that should always search context
- Agents that need to learn from every interaction
- Librarian/archivist agents focused on information storage
**Latency considerations:** Forcing archival search adds a tool call at the start of every turn. For latency-sensitive applications (like customer support), consider making archival search optional.
[Learn more about tool rules →](/guides/agents/tool-rules)
### Modifying archival memories
While agents cannot modify archival memories, developers can update or delete them via the SDK:
<CodeGroup>
```typescript TypeScript
// Update a memory
await client.agents.passages.update(agent.id, passage.id, {
  content: "Updated content",
  tags: ["new", "tags"]
});

// Delete a memory
await client.agents.passages.delete(agent.id, passage.id);
```
```python Python
# Update a memory
client.agents.passages.update(
    agent_id=agent.id,
    passage_id=passage.id,
    content="Updated content",
    tags=["new", "tags"]
)

# Delete a memory
client.agents.passages.delete(
    agent_id=agent.id,
    passage_id=passage.id
)
```
</CodeGroup>
This allows you to:
- Fix incorrect information
- Update outdated facts
- Remove sensitive or irrelevant data
- Reorganize tag structures
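As an illustration of reorganizing tags, the following sketch renames one tag across a set of passages. `rename_tag` is a hypothetical helper; it assumes each passage object exposes `id`, `content`, and `tags` attributes, and it uses only the `update` call shown above.

```python
def rename_tag(client, agent_id, passages, old, new):
    """Replace `old` with `new` on every passage that carries it.

    Returns the number of passages that were updated.
    """
    changed = 0
    for passage in passages:
        tags = list(passage.tags or [])
        if old not in tags:
            continue
        # Swap the tag in place, preserving order and dropping a duplicate
        # if `new` was already present on the passage.
        updated = list(dict.fromkeys(new if t == old else t for t in tags))
        client.agents.passages.update(
            agent_id=agent_id,
            passage_id=passage.id,
            content=passage.content,
            tags=updated,
        )
        changed += 1
    return changed
```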
## Troubleshooting
### Why can't my agent delete or modify archival memories?
Archival memory is designed to be **agent-immutable** by default. Agents can only insert and search, not modify or delete. This is intentional to prevent agents from "forgetting" important information.
**Solution:** If you need to modify or delete archival memories, use the SDK via `client.agents.passages.update()` or `client.agents.passages.delete()`.
### When should I use the SDK vs letting the agent handle archival?
**Let the agent handle it when:**
- The agent needs to decide what's worth remembering during conversations
- You want the agent to curate its own knowledge base
- Information emerges naturally from user interactions
**Use the SDK when:**
- Pre-loading knowledge before the agent starts (backfilling)
- Cleaning up incorrect or outdated information
- Bulk operations (importing documentation, migrating data)
- Managing memories outside of agent conversations
### My agent isn't using archival memory
**Common causes:**
1. **Agent doesn't know to use it** - Add guidance to the agent's system prompt or create an `archival_policies` memory block
2. **Agent doesn't need it yet** - With small amounts of information, agents may rely on conversation history instead
3. **Model limitations** - Some models are better at tool use than others
**Solutions:**
- Add explicit instructions in the agent's prompt about when to use archival
- Use tool rules to enforce archival usage (see "Enforcing archival usage with tool rules" above)
- Try a different model (OpenAI and Gemini models handle tool use well)
### Search returns no results or wrong results
**Common causes:**
1. **Empty archive** - Agent or developer hasn't inserted any memories yet
2. **Query mismatch** - Query doesn't semantically match stored content
3. **Tag filters too restrictive** - Filtering by tags that don't exist or are too narrow
**Solutions:**
- Verify memories exist using `client.agents.passages.list()` (uses cursor-based pagination with `after`, `before`, and `limit` parameters)
- Try broader or rephrased queries
- Check tags by listing passages to see what's actually stored
- Remove tag filters temporarily to see if that's the issue
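To check what is actually stored, a developer can walk the whole archive with the cursor parameters mentioned above. This is a sketch under two assumptions that are not confirmed by this guide: that `list()` returns a plain list of passages, and that the `after` cursor is the ID of the last passage on the previous page. `iter_passages` is a hypothetical helper name.

```python
def iter_passages(client, agent_id, page_size=100):
    """Yield every passage for an agent, following the `after` cursor page by page."""
    cursor = None
    while True:
        kwargs = {"agent_id": agent_id, "limit": page_size}
        if cursor is not None:
            kwargs["after"] = cursor
        page = client.agents.passages.list(**kwargs)
        if not page:
            return  # an empty page means the archive is exhausted
        yield from page
        cursor = page[-1].id  # assumption: the cursor is the last passage's ID
```

Collecting `set(tag for p in iter_passages(client, agent.id) for tag in p.tags)` then shows which tags exist, which helps diagnose overly restrictive tag filters.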
### Agent inserting too many memories
**Common causes:**
1. **No guidance** - Agent doesn't know when to insert vs when not to
2. **Tool rules forcing insertion** - Tool rules may require archival use
3. **Agent being overly cautious** - Some models default to storing everything
**Solutions:**
- Create an `archival_policies` block with clear guidelines (see "Create an archival policies block" above)
- Review and adjust tool rules if you're using them
- Add explicit examples of what NOT to store in the agent's prompt
## Next steps
<CardGroup cols={2}>
<Card
title="Searching & Querying"
href="/guides/agents/archival-search"
>
Learn how to search archival memory effectively
</Card>
<Card
title="Archival Memory Overview"
href="/guides/agents/archival-memory"
>
Back to archival memory overview
</Card>
<Card
title="Memory Blocks"
href="/guides/agents/memory-blocks"
>
Learn about always-visible memory
</Card>
<Card
title="Tool Rules"
href="/guides/agents/tool-rules"
>
Advanced tool execution constraints
</Card>
</CardGroup>


@@ -0,0 +1,196 @@
---
title: Archival Memory
subtitle: Long-term semantic storage for agent knowledge
slug: guides/agents/archival-memory
---
## What is archival memory?
Archival memory is a semantically searchable database where agents store facts, knowledge, and information for long-term retrieval. Unlike memory blocks that are always visible, archival memory is queried on-demand when relevant.
**Key characteristics:**
- **Agent-immutable** - Agents can insert and search archival memories but cannot modify or delete them (developers can, via the SDK)
- **Unlimited storage** - No practical size limits
- **Semantic search** - Find information by meaning, not exact keywords
- **Tagged organization** - Agents can categorize memories with tags
**Best for:** Event descriptions, reports, articles, historical records, and reference material that doesn't change frequently.
## When to use archival memory
**Use archival memory for:**
- Document repositories (API docs, technical guides, research papers)
- Conversation logs beyond the context window
- Customer interaction history and support tickets
- Reports, articles, and written content
- Code examples and technical references
- Training materials and educational content
- User research data and feedback
- Historical records and event logs
**Don't use archival memory for:**
- Information that should always be visible → Use memory blocks
- Frequently changing state → Use memory blocks
- Current working memory → Use scratchpad blocks
- Information that needs frequent modification → Use memory blocks
## How agents interact with archival memory
<Info>
**Two ways to interact with archival memory:**
**Agent tools** - What agents do autonomously during conversations:
- `archival_memory_insert` - Store new information
- `archival_memory_search` - Query for relevant memories
**SDK endpoints** - What developers do via `client.agents.passages.*`:
- Insert, search, list, update, and delete memories programmatically
- Manage archival content outside of agent conversations
</Info>
Agents have two primary tools for archival memory: `archival_memory_insert` and `archival_memory_search`.
### Inserting information
**Agents** can insert memories during conversations using the `archival_memory_insert` tool:
```python
# What the agent does (agent tool call)
archival_memory_insert(
    content="Deckard retired six replicants in the off-world colonies before returning to Los Angeles",
    tags=["replicant", "history", "retirement"]
)
```
**Developers** can also insert programmatically via the SDK:
<CodeGroup>
```typescript TypeScript
await client.agents.passages.insert(agent.id, {
  content: "The Tyrell Corporation's motto: 'More human than human'",
  tags: ["company", "motto", "tyrell"]
});
```
```python Python
client.agents.passages.insert(
    agent_id=agent.id,
    content="The Tyrell Corporation's motto: 'More human than human'",
    tags=["company", "motto", "tyrell"]
)
```
</CodeGroup>
### Searching for information
**Agents** can search semantically using the `archival_memory_search` tool:
```python
# What the agent does (agent tool call)
results = archival_memory_search(
    query="replicant lifespan",
    tags=["technical"],  # Optional: filter by tags
    page=0
)
```
**Developers** can also search programmatically via the SDK:
<CodeGroup>
```typescript TypeScript
const results = await client.agents.passages.search(agent.id, {
  query: "replicant lifespan",
  tags: ["technical"],
  page: 0
});
```
```python Python
results = client.agents.passages.search(
    agent_id=agent.id,
    query="replicant lifespan",
    tags=["technical"],
    page=0
)
```
</CodeGroup>
Search returns **semantically relevant** results: it matches on concepts and meaning, not just exact keywords. For example, searching for "artificial memories" will find "implanted memories" even though the exact words don't match.
[Learn more about search and querying →](/guides/agents/archival-search)
## Real-world examples
### Example 1: Personal knowledge manager
An agent with 30k+ archival memories tracking:
- Personal preferences and history
- Technical learnings and insights
- Article summaries and research notes
- Conversation highlights
### Example 2: Social media agent
An agent with 32k+ memories tracking interactions:
- User preferences and conversation history
- Common topics and interests
- Interaction patterns and communication styles
- Tags by user, topic, and interaction type
### Example 3: Customer support agent
- Stores ticket resolutions and common issues
- Tags by product, issue type, priority
- Searches archival for similar past issues
- Learns from successful resolutions over time
### Example 4: Research assistant
- Stores paper summaries with key findings
- Tags by topic, methodology, author
- Cross-references related research
- Builds a semantic knowledge graph
## Archival memory vs conversation search
<Tip>
**Archival memory** is for **intentional** storage:
- Agents decide what's worth remembering long-term
- Used for facts, knowledge, and reference material
- Curated by the agent through active insertion
**Conversation search** is for **historical** retrieval:
- Searches through actual past messages
- Used to recall what was said in previous conversations
- Automatic - no agent curation needed
**Example:**
- User says: "I prefer Python for data science projects"
- **Archival:** Agent inserts "User prefers Python for data science" as a fact
- **Conversation search:** Agent can search for the original message later
Use archival for structured knowledge, conversation search for historical context.
</Tip>
## Next steps
<CardGroup cols={2}>
<Card
title="Searching & Querying"
href="/guides/agents/archival-search"
>
Learn how to write effective queries and filter results
</Card>
<Card
title="Best Practices"
href="/guides/agents/archival-best-practices"
>
Patterns, pitfalls, and advanced usage
</Card>
<Card
title="Memory Blocks"
href="/guides/agents/memory-blocks"
>
Learn about always-visible memory
</Card>
<Card
title="Agent Memory Overview"
href="/guides/agents/memory"
>
Understand Letta's memory system
</Card>
</CardGroup>


@@ -0,0 +1,221 @@
---
title: Searching & Querying
subtitle: How to search archival memory effectively
slug: guides/agents/archival-search
---
## Search result format
<Info>
**What agents receive:** Each result contains:
- `content` - The stored text
- `tags` - Associated tags
- `timestamp` - When the memory was created
- `relevance` - Scoring with `rrf_score`, `vector_rank`, `fts_rank`
Letta uses **hybrid search** combining semantic (vector) and keyword (full-text) search, ranked using Reciprocal Rank Fusion (RRF). Higher `rrf_score` means more relevant.
</Info>
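To make the ranking step concrete, here is a minimal sketch of Reciprocal Rank Fusion in its standard form, where each item scores the sum of `1 / (k + rank)` over the lists it appears in. The constant `k=60` is the conventional default from the RRF literature, not a value confirmed for Letta, and the two input rankings are made-up IDs.

```python
def rrf_scores(rankings, k=60):
    """Fuse several ranked lists of IDs with Reciprocal Rank Fusion.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return scores


vector_ranking = ["a", "b", "c"]  # hypothetical semantic-search order
fts_ranking = ["b", "d", "a"]     # hypothetical full-text order
scores = rrf_scores([vector_ranking, fts_ranking])
fused = sorted(scores, key=scores.get, reverse=True)
print(fused)  # ['b', 'a', 'd', 'c']
```

Items that appear in both lists (`a` and `b`) outrank items found by only one search, which is why a memory that matches both semantically and by keyword tends to surface first.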
## Writing effective queries
By default, Letta embeds archival memories with OpenAI's `text-embedding-3-small` model, which handles natural language questions well. Agents can use several query styles:
**Natural language questions work best:**
```python
# What the agent does (agent tool call)
archival_memory_search(query="How does the test work?")
# Returns: "The Voight-Kampff test measures involuntary emotional responses..."
```
**Keywords also work:**
```python
# What the agent does (agent tool call)
archival_memory_search(query="replicant lifespan")
# Returns memories containing both keywords and semantically related concepts
```
**Concept-based queries leverage semantic understanding:**
```python
# What the agent does (agent tool call)
archival_memory_search(query="artificial memories")
# Returns: "...experimental replicant with implanted memories..."
# (semantic match despite different terminology)
```
<Tip>
**Pagination:** Agents receive multiple results per search. If an agent doesn't paginate correctly, you can instruct it to adjust the `page` parameter or remind it to iterate through results.
</Tip>
## Filtering by time
Agents can search by date ranges:
```python
# What the agent does (agent tool call)
# Recent memories
archival_memory_search(
    query="test results",
    start_datetime="2025-09-29T00:00:00"
)
# Specific time window
archival_memory_search(
    query="replicant cases",
    start_datetime="2025-09-29T00:00:00",
    end_datetime="2025-09-30T23:59:59"
)
```
<Info>
**Agent datetime awareness:**
- Agents know the current day but not the current time
- Agents can see timestamps of messages they've received
- Agents cannot control insertion timestamps (automatic)
- Developers can backdate memories via SDK with `created_at`
- Time filtering enables queries like "what did we discuss last week?"
</Info>
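The datetime strings in the examples above are plain ISO 8601 timestamps. As a small illustration of how such a window could be computed (for instance when writing test fixtures or prompts), the sketch below uses only the Python standard library; `day_window` is a hypothetical helper, not part of Letta.

```python
from datetime import datetime, timedelta


def day_window(now, days_back):
    """Return (start, end) ISO strings from midnight `days_back` days ago to the end of today."""
    start = (now - timedelta(days=days_back)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    end = now.replace(hour=23, minute=59, second=59, microsecond=0)
    return start.isoformat(), end.isoformat()


start, end = day_window(datetime(2025, 9, 30, 15, 0), days_back=1)
print(start, end)  # 2025-09-29T00:00:00 2025-09-30T23:59:59
```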
## Tags and organization
Tags help agents organize and filter archival memories. **Agents always know what tags exist in their archive** since tag lists are compiled into the context window.
**Common tag patterns:**
- `user_info`, `professional`, `personal_history`
- `documentation`, `technical`, `reference`
- `conversation`, `milestone`, `event`
- `company_policy`, `procedure`, `guideline`
**Tag search modes:**
- Match any tag
- Match all tags
- Combine either mode with a date-range filter
Example of organized tagging:
```python
# What the agent does (agent tool call)
# Atomic memory with precise tags
archival_memory_insert(
    content="Nexus-6 replicants have a four-year lifespan",
    tags=["technical", "replicant", "nexus-6"]
)
# Later, easy retrieval
archival_memory_search(
    query="how long do replicants live",
    tags=["technical"]
)
```
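The difference between matching any tag and matching all tags can be illustrated with plain set operations. This sketch only models the semantics; `matches` is a hypothetical function and says nothing about how Letta implements tag filtering.

```python
def matches(passage_tags, filter_tags, mode="any"):
    """Return True if a passage's tags satisfy the filter under the given mode."""
    passage_set, wanted = set(passage_tags), set(filter_tags)
    if mode == "any":
        return bool(passage_set & wanted)  # at least one tag in common
    if mode == "all":
        return wanted <= passage_set       # every filter tag is present
    raise ValueError(f"unknown mode: {mode!r}")


tags = ["technical", "replicant", "nexus-6"]
print(matches(tags, ["technical", "policy"], mode="any"))  # True
print(matches(tags, ["technical", "policy"], mode="all"))  # False
```

"Match all" narrows results quickly, so it is the first thing to relax when a tagged search comes back empty.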
## Performance and scale
<Info>
Archival memory has no practical size limits and remains fast at scale:
**Letta Cloud:** Uses [TurboPuffer](https://turbopuffer.com/) for extremely fast semantic search, even with hundreds of thousands of memories.
**Self-hosted:** Uses pgvector (PostgreSQL) for vector search. Performance scales well with proper indexing.
**Letta Desktop:** Uses SQLite with vector search extensions. Suitable for personal use cases.
Each backend is built to handle large archives; for self-hosted deployments, search speed depends on proper indexing.
</Info>
## Embedding models and search quality
Archival search quality depends on the agent's embedding model:
**Letta Cloud:** All agents use `text-embedding-3-small`, which is optimized for most use cases. This model cannot be changed.
**Self-hosted:** Embedding model is pinned to the agent at creation. The default `text-embedding-3-small` is sufficient for nearly all use cases.
### Changing embedding models (self-hosted only)
To change an agent's embedding model, you must:
1. List and export all archival memories
2. Delete all archival memories
3. Update the agent's embedding model
4. Re-insert all memories (they'll be re-embedded)
<Warning>
Changing embedding models is a destructive operation. Export your archival memories first.
</Warning>
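The four steps above can be sketched as a single function. This is an outline under stated assumptions, not a supported migration tool: `reembed_archive` is a hypothetical name, passages are assumed to expose `id`, `content`, and `tags`, and the model update itself is left to a caller-supplied `update_embedding_model` callable because this guide does not show that call.

```python
def reembed_archive(client, agent_id, passages, update_embedding_model):
    """Export, delete, switch model, then re-insert. Returns the exported records."""
    # 1. Export: keep plain copies so nothing is lost if a later step fails.
    exported = [
        {"content": p.content, "tags": list(p.tags or [])} for p in passages
    ]
    # 2. Delete the existing passages.
    for p in passages:
        client.agents.passages.delete(agent_id=agent_id, passage_id=p.id)
    # 3. Update the embedding model (caller-supplied; not shown in this guide).
    update_embedding_model()
    # 4. Re-insert; each passage is embedded again on insertion.
    for record in exported:
        client.agents.passages.insert(
            agent_id=agent_id,
            content=record["content"],
            tags=record["tags"],
        )
    return exported
```

In practice, write the exported records to disk before step 2. Note that this sketch does not preserve the original creation timestamps.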
## Programmatic access (SDK)
Developers can manage archival memory programmatically via the SDK:
<CodeGroup>
```typescript TypeScript
// Insert a memory
await client.agents.passages.insert(agent.id, {
  content: "The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
  tags: ["technical", "testing", "protocol"]
});

// Search memories
const results = await client.agents.passages.search(agent.id, {
  query: "testing procedures",
  tags: ["protocol"],
  page: 0
});

// List all memories
const passages = await client.agents.passages.list(agent.id, {
  limit: 100
});

// Get a specific memory
const passage = await client.agents.passages.get(agent.id, passageId);
```
```python Python
# Insert a memory
client.agents.passages.insert(
    agent_id=agent.id,
    content="The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
    tags=["technical", "testing", "protocol"]
)

# Search memories
results = client.agents.passages.search(
    agent_id=agent.id,
    query="testing procedures",
    tags=["protocol"],
    page=0
)

# List all memories
passages = client.agents.passages.list(
    agent_id=agent.id,
    limit=100
)

# Get a specific memory
passage = client.agents.passages.get(
    agent_id=agent.id,
    passage_id=passage_id
)
```
</CodeGroup>
## Next steps
<CardGroup cols={2}>
<Card
title="Best Practices"
href="/guides/agents/archival-best-practices"
>
Learn patterns, pitfalls, and advanced usage
</Card>
<Card
title="Archival Memory Overview"
href="/guides/agents/archival-memory"
>
Back to archival memory overview
</Card>
</CardGroup>