docs: add docs on rag tutorials (#5729)

This commit is contained in:
Sarah Wooders
2025-10-24 19:07:50 -07:00
committed by Caren Thomas
parent 787740e3cb
commit b15b04cec0
17 changed files with 3589 additions and 0 deletions


@@ -707,6 +707,14 @@ navigation:
contents:
- page: Building customer-specific relationship agents
path: pages/tutorials/customer-specific-agents.mdx
- section: RAG
contents:
- page: RAG Overview
path: pages/cookbooks/rag-overview.mdx
- page: Connect a Vector DB to Letta
path: pages/cookbooks/rag-simple.mdx
- page: Agentic RAG
path: pages/cookbooks/rag-agentic.mdx
- tab: leaderboard
layout:

BIN fern/images/chroma-keys.png (new file)
BIN fern/images/hf-token.png (new file)


@@ -0,0 +1,59 @@
---
title: RAG with Letta
subtitle: Connect your custom RAG pipeline to Letta agents
slug: guides/rag/overview
---
If you have an existing Retrieval-Augmented Generation (RAG) pipeline, you can connect it to your Letta agents. While Letta provides built-in features like archival memory, you can integrate your own RAG pipeline just as you would with any LLM API. This gives you full control over your data and retrieval methods.
## What is RAG?
Retrieval-Augmented Generation (RAG) enhances LLM responses by retrieving relevant information from external data sources before generating an answer. Instead of relying on the model's training data, a RAG system:
1. Takes a user query.
2. Searches a vector database for relevant documents.
3. Includes those documents in the LLM's context.
4. Generates an informed response based on the retrieved information.
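The loop above can be sketched end-to-end in a few lines. This toy version, with a keyword-overlap retriever over an in-memory list standing in for a real embedding model and vector database, is illustrative only:

```python
from typing import List

def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Toy retrieval step: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    return sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str, context_docs: List[str]) -> str:
    """Include the retrieved documents in the LLM's context."""
    context = "\n".join(context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The Transformer architecture relies entirely on attention mechanisms.",
    "BERT is pre-trained with a masked language modeling objective.",
    "Croissants are a laminated pastry.",
]
query = "What does the Transformer rely on?"
prompt = build_prompt(query, retrieve(query, docs))
```

In a real pipeline, `retrieve` is replaced by a vector database query and `prompt` is sent to the LLM; the overall shape stays the same.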
## Choosing Your RAG Approach
Letta supports two approaches for integrating RAG, depending on how much control you want over the retrieval process.
| Aspect | Simple RAG | Agentic RAG |
|--------|------------|-------------|
| **Who Controls Retrieval** | Your application controls when retrieval happens and what the retrieval query is. | The agent decides when to retrieve and what query to use. |
| **Context Inclusion** | You can always include retrieval results in the context. | Retrieval happens only when the agent determines it's needed. |
| **Latency** | Lower; typically a single hop, since the agent doesn't need to make a tool call. | Higher; retrieval requires additional tool calls. |
| **Client Code** | More complex, as it handles retrieval logic. | Simpler, as it just sends the user query. |
| **Customization** | You have full control via your retrieval function. | You have full control via your custom tool definition. |
Both approaches work with any vector database. Our tutorials include examples for **ChromaDB**, **MongoDB Atlas**, and **Qdrant**.
## Next Steps
Ready to integrate RAG with your Letta agents?
<CardGroup cols={2}>
<Card
title="Simple RAG Tutorial"
icon="fa-sharp fa-light fa-magnifying-glass"
href="/guides/rag/simple"
iconPosition="left"
>
Learn how to manage retrieval on the client-side and inject context directly into your agent's messages.
</Card>
<Card
title="Agentic RAG Tutorial"
icon="fa-sharp fa-light fa-robot"
href="/guides/rag/agentic"
iconPosition="left"
>
Learn how to empower your agent with custom search tools for autonomous retrieval.
</Card>
</CardGroup>
## Additional Resources
- [Custom Tools](/guides/agents/custom-tools) - Learn more about creating custom tools for your agents.
- [Memory Management](/guides/agents/memory) - Understand how Letta's built-in memory works.
- [Agent Development Environment](/guides/ade) - Configure and test your agents in the web interface.


@@ -0,0 +1,471 @@
---
title: Connect Your Custom RAG Pipeline to a Letta Agent
subtitle: A step-by-step guide to integrating external vector databases with Letta Cloud.
slug: cookbooks/custom-rag-integration
---
You've built a powerful Retrieval-Augmented Generation (RAG) pipeline with its own vector database, but now you want to connect it to an intelligent agent. This guide is for developers who want to integrate their existing RAG stack with Letta, giving them full control over their data while leveraging Letta's advanced agentic capabilities.
In this tutorial, we'll build a research assistant that uses a ChromaDB Cloud database to answer questions about scientific papers. We will explore two distinct methods for achieving this.
### What You'll Learn
- **Standard RAG:** How to manage retrieval on your client and inject context directly into the agent's prompt. This gives you maximum control over the data the agent sees.
- **Agentic RAG:** How to empower your agent with a custom tool, allowing it to decide when and what to search in your vector database. This creates a more autonomous and flexible agent.
## Prerequisites
To follow along, you need free accounts for the following platforms:
- **[Letta](https://www.letta.com):** To access the agent development platform
- **[ChromaDB Cloud](https://www.trychroma.com/):** To host our vector database
You will also need Python 3.8+ and a code editor.
### Getting Your API Keys
We'll need two API keys for this tutorial.
<AccordionGroup>
<Accordion title="Get your Letta API Key">
<Steps>
<Step title="Create a Letta Account">
If you don't have one, sign up for a free account at [letta.com](https://www.letta.com).
</Step>
<Step title="Navigate to API Keys">
Once logged in, click on **API keys** in the sidebar.
![Letta API Key Navigation](/images/letta-api-key-nav.png)
</Step>
<Step title="Create and Copy Your Key">
Click **+ Create API key**, give it a descriptive name, and click **Confirm**. Copy the key and save it somewhere safe.
</Step>
</Steps>
</Accordion>
<Accordion title="Get your ChromaDB Cloud API Key">
<Steps>
<Step title="Create a ChromaDB Cloud Account">
Sign up for a free account on the [ChromaDB Cloud website](https://www.trychroma.com/).
</Step>
<Step title="Create a New Database">
From your dashboard, create a new database.
![ChromaDB New Project](/images/chroma-new-project.png)
</Step>
<Step title="Get Your API Key and Host">
In your project settings, you will find your **API key** along with your **tenant** and **database** identifiers. We'll need all three for our scripts.
![ChromaDB Keys](/images/chroma-keys.png)
</Step>
</Steps>
</Accordion>
</AccordionGroup>
Once you have these keys, create a `.env` file in your project directory and add them like this:
```
LETTA_API_KEY="..."
CHROMA_API_KEY="..."
CHROMA_TENANT="..."
CHROMA_DATABASE="..."
```
## Part 1: Standard RAG — Full Control on the Client-Side
In the standard RAG approach, our application takes the lead. It fetches the relevant information from our ChromaDB database and then passes this context, along with our query, to a simple Letta agent. This method is direct, transparent, and keeps all the retrieval logic in our client application.
### Step 1: Set Up the Cloud Vector Database
First, we need to populate our ChromaDB Cloud database with the content of the research papers. We'll use two papers for this demo: ["Attention Is All You Need"](https://arxiv.org/abs/1706.03762) and ["BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"](https://arxiv.org/abs/1810.04805).
Before we begin, let's create a Python virtual environment to keep our dependencies isolated:
```shell
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
```
Download the research papers we'll be using:
```shell
curl -o 1706.03762.pdf https://arxiv.org/pdf/1706.03762.pdf
curl -o 1810.04805.pdf https://arxiv.org/pdf/1810.04805.pdf
```
Now, create a `requirements.txt` file with the necessary Python libraries:
```
letta-client
chromadb
pypdf
python-dotenv
```
Install them using pip:
```shell
pip install -r requirements.txt
```
Now, create a `setup.py` file. This script will load the PDFs, split them into manageable chunks, and ingest them into a ChromaDB collection named `rag_collection`.
```python
import os

import chromadb
import pypdf
from dotenv import load_dotenv

load_dotenv()

def main():
    # Connect to ChromaDB Cloud
    client = chromadb.CloudClient(
        tenant=os.getenv("CHROMA_TENANT"),
        database=os.getenv("CHROMA_DATABASE"),
        api_key=os.getenv("CHROMA_API_KEY")
    )

    # Create or get the collection
    collection = client.get_or_create_collection("rag_collection")

    # Ingest PDFs, one document per page
    pdf_files = ["1706.03762.pdf", "1810.04805.pdf"]
    for pdf_file in pdf_files:
        print(f"Ingesting {pdf_file}...")
        reader = pypdf.PdfReader(pdf_file)
        for i, page in enumerate(reader.pages):
            collection.add(
                ids=[f"{pdf_file}-{i}"],
                documents=[page.extract_text()]
            )

    print("\nIngestion complete!")
    print(f"Total documents in collection: {collection.count()}")

if __name__ == "__main__":
    main()
```
Run the script from your terminal:
```shell
python setup.py
```
This script connects to your ChromaDB Cloud instance, creates a collection, and adds the text content of each page from the PDFs as a separate document. Your vector database is now ready.
### Step 2: Create a "Stateless" Letta Agent
For the standard RAG approach, the Letta agent doesn't need any special tools or complex instructions. Its only job is to answer a question based on the context we provide. We can create this agent programmatically using the Letta SDK.
Create a file named `create_agent.py`:
```python
import os

from letta_client import Letta
from dotenv import load_dotenv

load_dotenv()

# Initialize the Letta client
client = Letta(token=os.getenv("LETTA_API_KEY"))

# Create the agent
agent = client.agents.create(
    name="Stateless RAG Agent",
    description="This agent answers questions based on provided context. It has no tools or special memory.",
    memory_blocks=[
        {
            "label": "persona",
            "value": "You are a helpful research assistant. Answer the user's question based *only* on the context provided."
        }
    ]
)

print(f"Agent '{agent.name}' created with ID: {agent.id}")
```
Run this script once to create the agent in your Letta project.
```shell
python create_agent.py
```
![Stateless Agent in Letta UI](/images/stateless-agent-ui.png)
### Step 3: Query, Format, and Ask
Now we'll write the main script, `standard_rag.py`, that ties everything together. This script will:
1. Take a user's question.
2. Query the `rag_collection` collection in ChromaDB to find the most relevant document chunks.
3. Construct a detailed prompt that includes both the user's question and the retrieved context.
4. Send this combined prompt to our stateless Letta agent and print the response.
```python
import os

import chromadb
from letta_client import Letta
from dotenv import load_dotenv

load_dotenv()

# Initialize clients
letta_client = Letta(token=os.getenv("LETTA_API_KEY"))
chroma_client = chromadb.CloudClient(
    tenant=os.getenv("CHROMA_TENANT"),
    database=os.getenv("CHROMA_DATABASE"),
    api_key=os.getenv("CHROMA_API_KEY")
)

AGENT_ID = "your-stateless-agent-id"  # Replace with your agent ID

def main():
    while True:
        question = input("Ask a question about the research papers: ")
        if question.lower() in ['exit', 'quit']:
            break

        # 1. Query ChromaDB
        collection = chroma_client.get_collection("rag_collection")
        results = collection.query(query_texts=[question], n_results=3)
        context = "\n".join(results["documents"][0])

        # 2. Construct the prompt
        prompt = f'''Context from research paper:
{context}

Question: {question}

Answer:'''

        # 3. Send to Letta Agent
        response = letta_client.agents.messages.create(
            agent_id=AGENT_ID,
            messages=[{"role": "user", "content": prompt}]
        )
        for message in response.messages:
            if message.message_type == 'assistant_message':
                print(f"Agent: {message.content}")

if __name__ == "__main__":
    main()
```
<Note>
Replace `your-stateless-agent-id` with the actual ID of the agent you created in the previous step.
</Note>
When you run this script, your application performs the retrieval, and the Letta agent simply provides the answer based on the context it receives. This gives you full control over the data pipeline.
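Because your application builds the prompt, you can also enforce a context budget before anything reaches the agent. A minimal sketch (the `fit_context` helper and its character budget are illustrative assumptions, not part of the Letta SDK):

```python
from typing import List

def fit_context(chunks: List[str], max_chars: int = 8000) -> str:
    """Greedily pack retrieved chunks into a character budget, in rank order."""
    selected: List[str] = []
    used = 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break  # drop lower-ranked chunks that would overflow the budget
        selected.append(chunk)
        used += len(chunk) + 1  # +1 for the joining newline
    return "\n".join(selected)

# Only the top-ranked chunks that fit the budget are kept.
chunks = ["a" * 3000, "b" * 3000, "c" * 3000]
context = fit_context(chunks, max_chars=6500)
```

You would then interpolate `context` into the prompt exactly as in `standard_rag.py`.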
## Part 2: Agentic RAG — Empowering Your Agent with Tools
In the agentic RAG approach, we delegate the retrieval process to the agent itself. Instead of our application deciding what to search for, we provide the agent with a custom tool that allows it to query our ChromaDB database directly. This makes the agent more autonomous and our client-side code much simpler.
### Step 4: Create a Custom Search Tool
A Letta tool is essentially a Python function that your agent can call. We'll create a function that searches our ChromaDB collection and returns the results. Letta handles the complexities of exposing this function to the agent securely.
Create a new file named `tools.py`:
```python
import os

import chromadb

def search_research_papers(query_text: str, n_results: int = 1) -> str:
    """
    Searches the research paper collection for a given query.

    Args:
        query_text (str): The text to search for.
        n_results (int): The number of results to return.

    Returns:
        str: The most relevant document(s), joined by blank lines.
    """
    # This tool code is executed on the Letta server. It expects the ChromaDB
    # credentials to be passed as environment variables.
    api_key = os.getenv("CHROMA_API_KEY")
    tenant = os.getenv("CHROMA_TENANT")
    database = os.getenv("CHROMA_DATABASE")
    if not all([api_key, tenant, database]):
        # If run locally without the env vars, this will fail early.
        # When run by the agent, these are provided by the tool execution environment.
        raise ValueError("CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE must be set as environment variables.")

    client = chromadb.CloudClient(
        tenant=tenant,
        database=database,
        api_key=api_key
    )
    collection = client.get_or_create_collection("rag_collection")
    try:
        results = collection.query(
            query_texts=[query_text],
            n_results=n_results
        )
        # Join the retrieved documents into a single string for the agent.
        return "\n\n".join(results['documents'][0])
    except Exception as e:
        return f"Tool failed with error: {e}"
```
This function, `search_research_papers`, takes a query, connects to our database, retrieves the most relevant document chunks (one by default), and returns them as a single string.
### Step 5: Configure a "Smart" Research Agent
Next, we'll create a new, more advanced agent. This agent will have a specific persona that instructs it on how to behave and, most importantly, it will be equipped with our new search tool.
Create a file named `create_agentic_agent.py`:
```python
import os

from letta_client import Letta
from dotenv import load_dotenv

from tools import search_research_papers

load_dotenv()

# Initialize the Letta client
client = Letta(token=os.getenv("LETTA_API_KEY"))

# Create a tool from our Python function
search_tool = client.tools.create_from_function(func=search_research_papers)

# Define the agent's persona
persona = """You are a world-class research assistant. Your goal is to answer questions accurately by searching through a database of research papers. When a user asks a question, first use the `search_research_papers` tool to find relevant information. Then, answer the user's question based on the information returned by the tool."""

# Create the agent with the tool attached
agent = client.agents.create(
    name="Agentic RAG Assistant",
    description="A smart agent that can search a vector database to answer questions.",
    memory_blocks=[
        {
            "label": "persona",
            "value": persona
        }
    ],
    tools=[search_tool.name]
)

print(f"Agent '{agent.name}' created with ID: {agent.id}")
```
Run this script to create the agent:
```shell
python create_agentic_agent.py
```
#### Configure Tool Dependencies and Environment Variables
For the tool to work within Letta's environment, we need to configure its dependencies and environment variables through the Letta dashboard.
<Steps>
<Step title="Find your agent">
Navigate to your Letta dashboard and find the "Agentic RAG Assistant" agent you just created.
</Step>
<Step title="Access the ADE">
Click on your agent to open the Agent Development Environment (ADE).
</Step>
<Step title="Configure Dependencies">
- In the ADE, select **Tools** from the sidebar
- Find and click on the `search_research_papers` tool
- Click on the **Dependencies** tab
- Add `chromadb` as a dependency
![Letta Dependencies Configuration](/images/letta-dep-config.png)
</Step>
<Step title="Configure Environment Variables">
- In the same tool configuration, navigate to **Simulator** > **Environment**
- Add the following environment variables with their corresponding values from your `.env` file:
- `CHROMA_API_KEY`
- `CHROMA_TENANT`
- `CHROMA_DATABASE`
![Letta Tool Configuration](/images/letta-tool-config.png)
</Step>
</Steps>
Now, when the agent calls this tool, Letta's execution environment will know to install `chromadb` and will have access to the necessary credentials to connect to your database.
### Step 6: Let the Agent Lead the Conversation
With the agentic setup, our client-side code becomes incredibly simple. We no longer need to worry about retrieving context; we just send the user's raw question to the agent and let it handle the rest.
Create the `agentic_rag.py` script:
```python
import os

from letta_client import Letta
from dotenv import load_dotenv

load_dotenv()

# Initialize client
letta_client = Letta(token=os.getenv("LETTA_API_KEY"))

AGENT_ID = "your-agentic-agent-id"  # Replace with your new agent ID

def main():
    while True:
        user_query = input("Ask a question about the research papers: ")
        if user_query.lower() in ['exit', 'quit']:
            break

        # Send the raw question; the agent decides whether and what to retrieve.
        response = letta_client.agents.messages.create(
            agent_id=AGENT_ID,
            messages=[{"role": "user", "content": user_query}]
        )
        for message in response.messages:
            if message.message_type == 'assistant_message':
                print(f"Agent: {message.content}")

if __name__ == "__main__":
    main()
```
<Note>
Replace `your-agentic-agent-id` with the ID of the new agent you just created.
</Note>
When you run this script, the agent receives the question, understands from its persona that it needs to search for information, calls the `search_research_papers` tool, gets the context, and then formulates an answer. All the RAG logic is handled by the agent, not your application.
## Which Approach Is Right for You?
We've explored two powerful methods for connecting a custom RAG pipeline to a Letta agent. The best choice depends on your specific needs.
- **Use Standard RAG when...**
- You want to maintain complete, fine-grained control over the retrieval process.
- Your retrieval logic is complex and better handled by your application code.
- You want to keep your agent as simple as possible and minimize its autonomy.
- **Use Agentic RAG when...**
- You want to build a more autonomous agent that can handle complex, multi-step queries.
- You prefer simpler, cleaner client-side code.
- You want the agent to decide *when* and *what* to search for, leading to more dynamic conversations.
## What's Next?
Now that you've integrated a custom RAG pipeline, you can expand on this foundation. Here are a few ideas:
<AccordionGroup>
<Accordion title="Integrate Other Vector Databases">
Swap out ChromaDB for other providers like Weaviate, Pinecone, or a database you already have in production. The core logic remains the same: create a tool that queries your database and equip your agent with it.
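The shared shape can be sketched provider-agnostically. In this illustrative sketch, `make_search_tool` and `fake_backend` are hypothetical names; when registering with Letta you would still write a single top-level function with the provider call inlined, as in `tools.py`:

```python
from typing import Callable, List

def make_search_tool(query_backend: Callable[[str, int], List[str]]):
    """Wrap any provider-specific query function in the shape a Letta tool needs:
    a plain function that takes a query string and returns a string."""
    def search_knowledge_base(query_text: str, n_results: int = 3) -> str:
        try:
            hits = query_backend(query_text, n_results)
            return "\n\n".join(hits)
        except Exception as e:
            return f"Tool failed with error: {e}"
    return search_knowledge_base

# Stub backend standing in for a Weaviate/Pinecone/Qdrant query call.
def fake_backend(query: str, n: int) -> List[str]:
    return [f"doc {i} about {query}" for i in range(n)]

search = make_search_tool(fake_backend)
```

Only the backend call changes between providers; the tool's signature and return type stay the same.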
</Accordion>
<Accordion title="Build More Complex Tools">
Create tools that not only read from your database but also write new information to it. This would allow your agent to learn from its interactions and update its own knowledge base over time.
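A minimal sketch of such a write tool (hypothetical: `collection` here is any object with a ChromaDB-style `add` method; in a real Letta tool you would construct the client inside the function, as in `search_research_papers`):

```python
import uuid
from typing import List

def remember_fact(collection, fact: str) -> str:
    """Store a fact the agent learned during conversation back into the knowledge base."""
    doc_id = f"agent-note-{uuid.uuid4().hex[:8]}"
    collection.add(ids=[doc_id], documents=[fact])
    return f"Stored fact with id {doc_id}"

# In-memory stand-in for a real vector DB collection, for local testing.
class FakeCollection:
    def __init__(self):
        self.docs = {}
    def add(self, ids: List[str], documents: List[str]):
        self.docs.update(zip(ids, documents))

collection = FakeCollection()
status = remember_fact(collection, "BERT uses WordPiece tokenization.")
```

Returning a status string lets the agent confirm the write succeeded in its next reasoning step.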
</Accordion>
<Accordion title="Add More Data Sources">
Expand your RAG pipeline to include more documents, web pages, or other sources of information. The more comprehensive your data source, the more capable your agent will become.
</Accordion>
</AccordionGroup>