From b0423b4912c1677defdbb66c1a581fa45e89fefc Mon Sep 17 00:00:00 2001
From: Cameron Pfiffer
Date: Mon, 20 Oct 2025 09:20:12 -0700
Subject: [PATCH] docs: clarify agent tools vs SDK endpoints in archival memory guides (#5555)

Separate agent tool calls (archival_memory_insert, archival_memory_search) from SDK endpoints (client.agents.passages.*) throughout the archival memory documentation to reduce confusion between what agents do autonomously vs what developers do via the SDK.

Changes:
- Remove language tabs from agent tool examples (not language-specific)
- Add Info callout explaining the two interaction levels
- Add SDK search example for symmetry with agent search example
- Reorganize best practices into "Agent" and "Developer (SDK)" sections
- Remove comparison table
- Add troubleshooting section with common issues
- Update pagination references to use cursor-based approach
---
 fern/pages/agents/archival_best_practices.mdx | 411 ++++++++++++++++++
 .../pages/agents/archival_memory_overview.mdx | 196 +++++++++
 fern/pages/agents/archival_search.mdx         | 221 ++++++++++
 3 files changed, 828 insertions(+)
 create mode 100644 fern/pages/agents/archival_best_practices.mdx
 create mode 100644 fern/pages/agents/archival_memory_overview.mdx
 create mode 100644 fern/pages/agents/archival_search.mdx

diff --git a/fern/pages/agents/archival_best_practices.mdx b/fern/pages/agents/archival_best_practices.mdx
new file mode 100644
index 00000000..78a2ec9b
--- /dev/null
+++ b/fern/pages/agents/archival_best_practices.mdx
@@ -0,0 +1,411 @@
+---
+title: Best Practices
+subtitle: Patterns, pitfalls, and advanced usage
+slug: guides/agents/archival-best-practices
+---
+
+## Agent best practices
+
+These patterns help agents use archival memory effectively during conversations.
+
+### 1. Avoid over-insertion
+
+The most common pitfall is inserting too many memories, creating clutter. Trust the agent to decide what's worth storing long-term.
+
+### 2. Use tags consistently
+
+Establish a tag taxonomy and stick to it. Good language models typically handle tagging well.
+
+### 3. Add context to insertions
+
+❌ Don't: "Likes replicants"
+✅ Do: "Deckard shows unusual empathy toward replicants, particularly Rachael, suggesting possible replicant identity"
+
+### 4. Let agents experiment
+
+Agents can test different query styles to understand what works:
+
+```python
+# What the agent does (agent tool call)
+archival_memory_search(query="How does the Voight-Kampff test work?")
+archival_memory_search(query="Voight-Kampff procedure")
+archival_memory_search(query="replicant detection method")
+```
+
+**Important:** Have the agent persist learnings from experimentation in a memory block (like `archival_tracking` or `archival_policies`), not in archival itself (avoid meta-clutter).
+
+## Developer best practices (SDK)
+
+These patterns help developers configure and manage archival memory via the SDK.
+
+### Backfilling archives
+
+Developers can pre-load archival memory with existing knowledge via the SDK:
+
+
+```typescript TypeScript
+// Load company policies
+const policies = [
+  "All replicants must undergo Voight-Kampff testing upon arrival",
+  "Blade Runner units are authorized to retire rogue replicants",
+  "Tyrell Corporation employees must report suspected replicants immediately"
+];
+
+for (const policy of policies) {
+  await client.agents.passages.insert(agent.id, {
+    content: policy,
+    tags: ["policy", "company", "protocol"]
+  });
+}
+
+// Load technical documentation
+const docs = [
+  {
+    content: "Nexus-6 replicants: Superior strength, agility, and intelligence. Four-year lifespan prevents emotional development.",
+    tags: ["technical", "nexus-6", "specifications"]
+  },
+  {
+    content: "Voight-Kampff test: Measures capillary dilation, blush response, and pupil dilation to detect replicants.",
+    tags: ["technical", "testing", "voight-kampff"]
+  }
+];
+
+for (const doc of docs) {
+  await client.agents.passages.insert(agent.id, {
+    content: doc.content,
+    tags: doc.tags
+  });
+}
+```
+```python Python
+# Load company policies
+policies = [
+    "All replicants must undergo Voight-Kampff testing upon arrival",
+    "Blade Runner units are authorized to retire rogue replicants",
+    "Tyrell Corporation employees must report suspected replicants immediately"
+]
+
+for policy in policies:
+    client.agents.passages.insert(
+        agent_id=agent.id,
+        content=policy,
+        tags=["policy", "company", "protocol"]
+    )
+
+# Load technical documentation
+docs = [
+    {
+        "content": "Nexus-6 replicants: Superior strength, agility, and intelligence. Four-year lifespan prevents emotional development.",
+        "tags": ["technical", "nexus-6", "specifications"]
+    },
+    {
+        "content": "Voight-Kampff test: Measures capillary dilation, blush response, and pupil dilation to detect replicants.",
+        "tags": ["technical", "testing", "voight-kampff"]
+    }
+]
+
+for doc in docs:
+    client.agents.passages.insert(
+        agent_id=agent.id,
+        content=doc["content"],
+        tags=doc["tags"]
+    )
+```
+
+
+**Use cases for backfilling:**
+- Migrating knowledge bases to Letta
+- Seeding specialized agents with domain knowledge
+- Loading historical conversation logs
+- Importing research libraries
+
+### Create an archival policies block
+
+Help your agent learn how to use archival memory effectively by creating a dedicated memory block for archival usage policies:
+
+
+```typescript TypeScript
+await client.blocks.create({
+  label: "archival_policies",
+  value: `
+    When to insert into archival:
+    - User preferences and important facts about the user
+    - Technical specifications and reference information
+    - Significant decisions or outcomes from conversations
+
+    When NOT to insert:
+    - Temporary conversational context
+    - Information already stored
+    - Trivial details or pleasantries
+
+    Search strategies:
+    - Use natural language questions for best results
+    - Include tags when filtering by category
+    - Try semantic variations if first search doesn't find what you need
+  `
+});
+```
+```python Python
+client.blocks.create(
+    label="archival_policies",
+    value="""
+    When to insert into archival:
+    - User preferences and important facts about the user
+    - Technical specifications and reference information
+    - Significant decisions or outcomes from conversations
+
+    When NOT to insert:
+    - Temporary conversational context
+    - Information already stored
+    - Trivial details or pleasantries
+
+    Search strategies:
+    - Use natural language questions for best results
+    - Include tags when filtering by category
+    - Try semantic variations if first search doesn't find what you need
+    """
+)
+```
+
+
+You can improve this block through conversation with your agent:
+
+> **You:** "I noticed you didn't store the fact that I prefer TypeScript for backend development. Update your archival policies block to ensure you capture language preferences in the future."
+
+> **Agent:** Updates the archival_policies block to include "Programming language preferences" under "When to insert into archival"
+
+This collaborative approach helps agents learn from mistakes and improve their archival memory usage over time.
+
+### Track query effectiveness
+
+Build self-improving agents by having them track archival search effectiveness in a memory block:
+
+
+```typescript TypeScript
+// Create a memory block for tracking
+await client.blocks.create({
+  label: "archival_tracking",
+  value: `
+    Query patterns: Natural language questions work best
+    Recent searches: "test procedures" (3 results), "replicant specs" (5 results)
+    Success rate: ~85% of searches return relevant results
+    Frequently searched topics: [technical specifications, protocols, case histories]
+    Common patterns: Queries about technical specs work better than vague questions
+    Improvements needed: Add more tags for better filtering
+  `
+});
+```
+```python Python
+# Create a memory block for tracking
+client.blocks.create(
+    label="archival_tracking",
+    value="""
+    Query patterns: Natural language questions work best
+    Recent searches: "test procedures" (3 results), "replicant specs" (5 results)
+    Success rate: ~85% of searches return relevant results
+    Frequently searched topics: [technical specifications, protocols, case histories]
+    Common patterns: Queries about technical specs work better than vague questions
+    Improvements needed: Add more tags for better filtering
+    """
+)
+```
+
+
+The agent can update this block based on search results and continuously refine its archival strategy.
+
+### Enforcing archival usage with tool rules
+
+If your agent forgets to use archival memory, you should first try prompting the agent to use it more consistently. If prompting alone doesn't work, you can enforce archival usage with [tool rules](/guides/agents/tool-rules).
+
+**Force archival search at turn start:**
+
+
+```typescript TypeScript
+await client.agents.update(agent.id, {
+  toolRules: [
+    { type: "init", toolName: "archival_memory_search" }
+  ]
+});
+```
+```python Python
+from letta_client.types import InitToolRule
+
+client.agents.update(
+    agent_id=agent.id,
+    tool_rules=[
+        InitToolRule(tool_name="archival_memory_search")
+    ]
+)
+```
+
+
+**Require archival insertion before exit:**
+
+
+```typescript TypeScript
+await client.agents.update(agent.id, {
+  toolRules: [
+    {
+      type: "child",
+      toolName: "send_message",
+      children: ["archival_memory_insert"]
+    }
+  ]
+});
+```
+```python Python
+from letta_client.types import ChildToolRule
+
+client.agents.update(
+    agent_id=agent.id,
+    tool_rules=[
+        ChildToolRule(
+            tool_name="send_message",
+            children=["archival_memory_insert"]
+        )
+    ]
+)
+```
+
+
+
+**Using the ADE:** Tool rules can also be configured in the Agent Development Environment's Tool Manager interface.
+
+
+
+**Note:** Anthropic models don't support strict structured output, so tool rules may not be enforced. Use OpenAI or Gemini models for guaranteed tool rule compliance.
+
+
+**When to use tool rules:**
+- Knowledge management agents that should always search context
+- Agents that need to learn from every interaction
+- Librarian/archivist agents focused on information storage
+
+**Latency considerations:** Forcing archival search adds a tool call at the start of every turn. For latency-sensitive applications (like customer support), consider making archival search optional.
+
+[Learn more about tool rules →](/guides/agents/tool-rules)
+
+### Modifying archival memories
+
+While agents cannot modify archival memories, developers can update or delete them via the SDK:
+
+
+```typescript TypeScript
+// Update a memory
+await client.agents.passages.update(agent.id, passage.id, {
+  content: "Updated content",
+  tags: ["new", "tags"]
+});
+
+// Delete a memory
+await client.agents.passages.delete(agent.id, passage.id);
+```
+```python Python
+# Update a memory
+client.agents.passages.update(
+    agent_id=agent.id,
+    passage_id=passage.id,
+    content="Updated content",
+    tags=["new", "tags"]
+)
+
+# Delete a memory
+client.agents.passages.delete(
+    agent_id=agent.id,
+    passage_id=passage.id
+)
+```
+
+
+This allows you to:
+- Fix incorrect information
+- Update outdated facts
+- Remove sensitive or irrelevant data
+- Reorganize tag structures
+
+## Troubleshooting
+
+### Why can't my agent delete or modify archival memories?
+
+Archival memory is designed to be **agent-immutable** by default. Agents can only insert and search, not modify or delete. This is intentional to prevent agents from "forgetting" important information.
+
+**Solution:** If you need to modify or delete archival memories, use the SDK via `client.agents.passages.update()` or `client.agents.passages.delete()`.
+
+### When should I use the SDK vs letting the agent handle archival?
+
+**Let the agent handle it when:**
+- The agent needs to decide what's worth remembering during conversations
+- You want the agent to curate its own knowledge base
+- Information emerges naturally from user interactions
+
+**Use the SDK when:**
+- Pre-loading knowledge before the agent starts (backfilling)
+- Cleaning up incorrect or outdated information
+- Bulk operations (importing documentation, migrating data)
+- Managing memories outside of agent conversations
+
+### My agent isn't using archival memory
+
+**Common causes:**
+1. **Agent doesn't know to use it** - Add guidance to the agent's system prompt or create an `archival_policies` memory block
+2. **Agent doesn't need it yet** - With small amounts of information, agents may rely on conversation history instead
+3. **Model limitations** - Some models are better at tool use than others
+
+**Solutions:**
+- Add explicit instructions in the agent's prompt about when to use archival
+- Use tool rules to enforce archival usage (see "Enforcing archival usage with tool rules" above)
+- Try a different model (OpenAI and Gemini models handle tool use well)
+
+### Search returns no results or wrong results
+
+**Common causes:**
+1. **Empty archive** - Agent or developer hasn't inserted any memories yet
+2. **Query mismatch** - Query doesn't semantically match stored content
+3. **Tag filters too restrictive** - Filtering by tags that don't exist or are too narrow
+
+**Solutions:**
+- Verify memories exist using `client.agents.passages.list()` (uses cursor-based pagination with `after`, `before`, and `limit` parameters)
+- Try broader or rephrased queries
+- Check tags by listing passages to see what's actually stored
+- Remove tag filters temporarily to see if that's the issue
+
+### Agent inserting too many memories
+
+**Common causes:**
+1. **No guidance** - Agent doesn't know when to insert vs when not to
+2. **Tool rules forcing insertion** - Tool rules may require archival use
+3. **Agent being overly cautious** - Some models default to storing everything
+
+**Solutions:**
+- Create an `archival_policies` block with clear guidelines (see "Create an archival policies block" above)
+- Review and adjust tool rules if you're using them
+- Add explicit examples of what NOT to store in the agent's prompt
+
+## Next steps
+
+
+
+  Learn how to search archival memory effectively
+
+
+  Back to archival memory overview
+
+
+  Learn about always-visible memory
+
+
+  Advanced tool execution constraints
+
+
diff --git a/fern/pages/agents/archival_memory_overview.mdx b/fern/pages/agents/archival_memory_overview.mdx
new file mode 100644
index 00000000..643b90e7
--- /dev/null
+++ b/fern/pages/agents/archival_memory_overview.mdx
@@ -0,0 +1,196 @@
+---
+title: Archival Memory
+subtitle: Long-term semantic storage for agent knowledge
+slug: guides/agents/archival-memory
+---
+
+## What is archival memory?
+
+Archival memory is a semantically searchable database where agents store facts, knowledge, and information for long-term retrieval. Unlike memory blocks that are always visible, archival memory is queried on-demand when relevant.
+
+**Key characteristics:**
+- **Agent-immutable** - Agents cannot easily modify or delete archival memories (though developers can via SDK)
+- **Unlimited storage** - No practical size limits
+- **Semantic search** - Find information by meaning, not exact keywords
+- **Tagged organization** - Agents can categorize memories with tags
+
+**Best for:** Event descriptions, reports, articles, historical records, and reference material that doesn't change frequently.
+
+## When to use archival memory
+
+**Use archival memory for:**
+- Document repositories (API docs, technical guides, research papers)
+- Conversation logs beyond the context window
+- Customer interaction history and support tickets
+- Reports, articles, and written content
+- Code examples and technical references
+- Training materials and educational content
+- User research data and feedback
+- Historical records and event logs
+
+**Don't use archival memory for:**
+- Information that should always be visible → Use memory blocks
+- Frequently changing state → Use memory blocks
+- Current working memory → Use scratchpad blocks
+- Information that needs frequent modification → Use memory blocks
+
+## How agents interact with archival memory
+
+
+**Two ways to interact with archival memory:**
+
+**Agent tools** - What agents do autonomously during conversations:
+- `archival_memory_insert` - Store new information
+- `archival_memory_search` - Query for relevant memories
+
+**SDK endpoints** - What developers do via `client.agents.passages.*`:
+- Insert, search, list, update, and delete memories programmatically
+- Manage archival content outside of agent conversations
+
+
+Agents have two primary tools for archival memory: `archival_memory_insert` and `archival_memory_search`.
+
+### Inserting information
+
+**Agents** can insert memories during conversations using the `archival_memory_insert` tool:
+
+```python
+# What the agent does (agent tool call)
+archival_memory_insert(
+    content="Deckard retired six replicants in the off-world colonies before returning to Los Angeles",
+    tags=["replicant", "history", "retirement"]
+)
+```
+
+**Developers** can also insert programmatically via the SDK:
+
+
+```typescript TypeScript
+await client.agents.passages.insert(agent.id, {
+  content: "The Tyrell Corporation's motto: 'More human than human'",
+  tags: ["company", "motto", "tyrell"]
+});
+```
+```python Python
+client.agents.passages.insert(
+    agent_id=agent.id,
+    content="The Tyrell Corporation's motto: 'More human than human'",
+    tags=["company", "motto", "tyrell"]
+)
+```
+
+
+### Searching for information
+
+**Agents** can search semantically using the `archival_memory_search` tool:
+
+```python
+# What the agent does (agent tool call)
+results = archival_memory_search(
+    query="replicant lifespan",
+    tags=["technical"],  # Optional: filter by tags
+    page=0
+)
+```
+
+**Developers** can also search programmatically via the SDK:
+
+
+```typescript TypeScript
+const results = await client.agents.passages.search(agent.id, {
+  query: "replicant lifespan",
+  tags: ["technical"],
+  page: 0
+});
+```
+```python Python
+results = client.agents.passages.search(
+    agent_id=agent.id,
+    query="replicant lifespan",
+    tags=["technical"],
+    page=0
+)
+```
+
+
+Results return **semantically relevant** information - meaning the search understands concepts and meaning, not just exact keywords. For example, searching for "artificial memories" will find "implanted memories" even though the exact words don't match.
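
To make "semantically relevant" more concrete, here is a minimal sketch of how embedding-based matching can rank stored text against a query. It is an illustration only: the three-dimensional vectors are hand-picked toy values (not output of any real embedding model), and `cosine_similarity` is a hypothetical helper, not part of the Letta SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings: assumed values chosen for illustration only.
toy_embeddings = {
    "implanted memories": [0.9, 0.1, 0.0],
    "four-year lifespan": [0.1, 0.9, 0.1],
    "Tyrell Corporation motto": [0.0, 0.2, 0.9],
}

# Stands in for the embedding of the query "artificial memories".
query_embedding = [0.8, 0.2, 0.1]

# Rank stored texts by similarity to the query, most similar first.
ranked = sorted(
    toy_embeddings,
    key=lambda text: cosine_similarity(query_embedding, toy_embeddings[text]),
    reverse=True,
)
```

With these toy vectors, "implanted memories" ranks first even though it shares only one word with the query, because its vector points in nearly the same direction as the query vector.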
+
+[Learn more about search and querying →](/guides/agents/archival-search)
+
+## Real-world examples
+
+### Example 1: Personal knowledge manager
+An agent with 30k+ archival memories tracking:
+- Personal preferences and history
+- Technical learnings and insights
+- Article summaries and research notes
+- Conversation highlights
+
+### Example 2: Social media agent
+An agent with 32k+ memories tracking interactions:
+- User preferences and conversation history
+- Common topics and interests
+- Interaction patterns and communication styles
+- Tags by user, topic, and interaction type
+
+### Example 3: Customer support agent
+- Stores ticket resolutions and common issues
+- Tags by product, issue type, priority
+- Searches archival for similar past issues
+- Learns from successful resolutions over time
+
+### Example 4: Research assistant
+- Stores paper summaries with key findings
+- Tags by topic, methodology, author
+- Cross-references related research
+- Builds a semantic knowledge graph
+
+## Archival memory vs conversation search
+
+
+**Archival memory** is for **intentional** storage:
+- Agents decide what's worth remembering long-term
+- Used for facts, knowledge, and reference material
+- Curated by the agent through active insertion
+
+**Conversation search** is for **historical** retrieval:
+- Searches through actual past messages
+- Used to recall what was said in previous conversations
+- Automatic - no agent curation needed
+
+**Example:**
+- User says: "I prefer Python for data science projects"
+- **Archival:** Agent inserts "User prefers Python for data science" as a fact
+- **Conversation search:** Agent can search for the original message later
+
+Use archival for structured knowledge, conversation search for historical context.
+
+
+## Next steps
+
+
+
+  Learn how to write effective queries and filter results
+
+
+  Patterns, pitfalls, and advanced usage
+
+
+  Learn about always-visible memory
+
+
+  Understand Letta's memory system
+
+
diff --git a/fern/pages/agents/archival_search.mdx b/fern/pages/agents/archival_search.mdx
new file mode 100644
index 00000000..e7ad05d7
--- /dev/null
+++ b/fern/pages/agents/archival_search.mdx
@@ -0,0 +1,221 @@
+---
+title: Searching & Querying
+subtitle: How to search archival memory effectively
+slug: guides/agents/archival-search
+---
+
+## Search result format
+
+
+**What agents receive:** Each result contains:
+- `content` - The stored text
+- `tags` - Associated tags
+- `timestamp` - When the memory was created
+- `relevance` - Scoring with `rrf_score`, `vector_rank`, `fts_rank`
+
+Letta uses **hybrid search** combining semantic (vector) and keyword (full-text) search, ranked using Reciprocal Rank Fusion (RRF). Higher `rrf_score` means more relevant.
+
+
+## Writing effective queries
+
+Letta uses OpenAI's `text-embedding-3-small` model, which handles natural language questions well. Agents can use various query styles:
+
+**Natural language questions work best:**
+
+```python
+# What the agent does (agent tool call)
+archival_memory_search(query="How does the test work?")
+# Returns: "The Voight-Kampff test measures involuntary emotional responses..."
+```
+
+**Keywords also work:**
+
+```python
+# What the agent does (agent tool call)
+archival_memory_search(query="replicant lifespan")
+# Returns memories containing both keywords and semantically related concepts
+```
+
+**Concept-based queries leverage semantic understanding:**
+
+```python
+# What the agent does (agent tool call)
+archival_memory_search(query="artificial memories")
+# Returns: "...experimental replicant with implanted memories..."
+# (semantic match despite different terminology)
+```
+
+
+**Pagination:** Agents receive multiple results per search. If an agent doesn't paginate correctly, you can instruct it to adjust the `page` parameter or remind it to iterate through results.
+
+
+## Filtering by time
+
+Agents can search by date ranges:
+
+```python
+# What the agent does (agent tool call)
+
+# Recent memories
+archival_memory_search(
+    query="test results",
+    start_datetime="2025-09-29T00:00:00"
+)
+
+# Specific time window
+archival_memory_search(
+    query="replicant cases",
+    start_datetime="2025-09-29T00:00:00",
+    end_datetime="2025-09-30T23:59:59"
+)
+```
+
+
+**Agent datetime awareness:**
+- Agents know the current day but not the current time
+- Agents can see timestamps of messages they've received
+- Agents cannot control insertion timestamps (automatic)
+- Developers can backdate memories via SDK with `created_at`
+- Time filtering enables queries like "what did we discuss last week?"
+
+
+## Tags and organization
+
+Tags help agents organize and filter archival memories. **Agents always know what tags exist in their archive** since tag lists are compiled into the context window.
+
+**Common tag patterns:**
+- `user_info`, `professional`, `personal_history`
+- `documentation`, `technical`, `reference`
+- `conversation`, `milestone`, `event`
+- `company_policy`, `procedure`, `guideline`
+
+**Tag search modes:**
+- Match any tag
+- Match all tags
+- Filter by date ranges
+
+Example of organized tagging:
+
+```python
+# What the agent does (agent tool call)
+
+# Atomic memory with precise tags
+archival_memory_insert(
+    content="Nexus-6 replicants have a four-year lifespan",
+    tags=["technical", "replicant", "nexus-6"]
+)
+
+# Later, easy retrieval
+archival_memory_search(
+    query="how long do replicants live",
+    tags=["technical"]
+)
+```
+
+## Performance and scale
+
+
+Archival memory has no practical size limits and remains fast at scale:
+
+**Letta Cloud:** Uses [TurboPuffer](https://turbopuffer.com/) for extremely fast semantic search, even with hundreds of thousands of memories.
+
+**Self-hosted:** Uses pgvector (PostgreSQL) for vector search. Performance scales well with proper indexing.
+
+**Letta Desktop:** Uses SQLite with vector search extensions. Suitable for personal use cases.
+
+No matter the backend, archival memory scales to large archives without performance degradation.
+
+
+## Embedding models and search quality
+
+Archival search quality depends on the agent's embedding model:
+
+**Letta Cloud:** All agents use `text-embedding-3-small`, which is optimized for most use cases. This model cannot be changed.
+
+**Self-hosted:** Embedding model is pinned to the agent at creation. The default `text-embedding-3-small` is sufficient for nearly all use cases.
+
+### Changing embedding models (self-hosted only)
+
+To change an agent's embedding model, you must:
+1. List and export all archival memories
+2. Delete all archival memories
+3. Update the agent's embedding model
+4. Re-insert all memories (they'll be re-embedded)
+
+
+Changing embedding models is a destructive operation. Export your archival memories first.
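
The four steps above can be sketched as a single routine. This is a hedged illustration rather than the SDK's own migration path: `Passage`, `InMemoryPassages`, and `reembed_archive` are hypothetical names, the in-memory store only mimics the shape of the `list` / `delete` / `insert` calls shown in these guides, and the real passage objects may use different field names. The embedding-model update is passed in as a callback because that call is not shown in this guide.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Passage:
    # Hypothetical record shape; the real SDK object may differ.
    id: str
    content: str
    tags: List[str] = field(default_factory=list)


class InMemoryPassages:
    """Stand-in for client.agents.passages, for illustration only."""

    def __init__(self) -> None:
        self._rows: List[Passage] = []
        self._next_id = 0

    def insert(self, agent_id: str, content: str, tags: List[str]) -> Passage:
        passage = Passage(id=f"passage-{self._next_id}", content=content, tags=[*tags])
        self._next_id += 1
        self._rows.append(passage)
        return passage

    def list(self, agent_id: str, after: Optional[str] = None, limit: int = 100) -> List[Passage]:
        # Cursor-based paging: return rows that come after the given passage id.
        start = 0
        if after is not None:
            start = next(i for i, p in enumerate(self._rows) if p.id == after) + 1
        return self._rows[start:start + limit]

    def delete(self, agent_id: str, passage_id: str) -> None:
        self._rows = [p for p in self._rows if p.id != passage_id]


def reembed_archive(
    passages,
    agent_id: str,
    switch_embedding_model: Callable[[], None],
    page_size: int = 100,
) -> int:
    """Export, delete, switch model, and re-insert; returns the number of memories moved."""
    # 1. List and export every memory, following the cursor page by page.
    exported: List[Passage] = []
    cursor = None
    while True:
        page = passages.list(agent_id, after=cursor, limit=page_size)
        if not page:
            break
        exported.extend(page)
        cursor = page[-1].id

    # 2. Delete all archival memories.
    for passage in exported:
        passages.delete(agent_id, passage.id)

    # 3. Update the agent's embedding model (call not shown in this guide).
    switch_embedding_model()

    # 4. Re-insert so each memory is embedded again under the new model.
    for passage in exported:
        passages.insert(agent_id, content=passage.content, tags=passage.tags)
    return len(exported)


# Demo against the in-memory stand-in.
store = InMemoryPassages()
for i in range(5):
    store.insert("agent-1", content=f"memory {i}", tags=["demo"])
moved = reembed_archive(store, "agent-1", switch_embedding_model=lambda: None, page_size=2)
```

In a real migration the exported list should be written to durable storage before step 2, since a failure after deletion would otherwise lose the archive. Re-inserted memories also receive new insertion timestamps unless they are backdated.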
+
+
+## Programmatic access (SDK)
+
+Developers can manage archival memory programmatically via the SDK:
+
+
+```typescript TypeScript
+// Insert a memory
+await client.agents.passages.insert(agent.id, {
+  content: "The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
+  tags: ["technical", "testing", "protocol"]
+});
+
+// Search memories
+const results = await client.agents.passages.search(agent.id, {
+  query: "testing procedures",
+  tags: ["protocol"],
+  page: 0
+});
+
+// List all memories
+const passages = await client.agents.passages.list(agent.id, {
+  limit: 100
+});
+
+// Get a specific memory
+const passage = await client.agents.passages.get(agent.id, passageId);
+```
+```python Python
+# Insert a memory
+client.agents.passages.insert(
+    agent_id=agent.id,
+    content="The Voight-Kampff test requires a minimum of 20 cross-referenced questions",
+    tags=["technical", "testing", "protocol"]
+)
+
+# Search memories
+results = client.agents.passages.search(
+    agent_id=agent.id,
+    query="testing procedures",
+    tags=["protocol"],
+    page=0
+)
+
+# List all memories
+passages = client.agents.passages.list(
+    agent_id=agent.id,
+    limit=100
+)
+
+# Get a specific memory
+passage = client.agents.passages.get(
+    agent_id=agent.id,
+    passage_id=passage_id
+)
+```
+
+
+## Next steps
+
+
+
+  Learn patterns, pitfalls, and advanced usage
+
+
+  Back to archival memory overview
+
+