Create docs pages (#328)

* Create docs  (#323)

* Create .readthedocs.yaml

* Update mkdocs.yml

* update

* revise

* syntax

* syntax

* syntax

* syntax

* revise

* revise

* spacing

* Docs (#327)

* add stuff

* patch homepage

* more docs

* updated

* updated

* refresh

* refresh

* refresh

* update

* refresh

* refresh

* refresh

* refresh

* missing file

* refresh

* refresh

* refresh

* refresh

* fix black

* refresh

* refresh

* refresh

* refresh

* add readme for just the docs

* Update README.md

* add more data loading docs

* cleanup data sources

* refresh

* revised

* add search

* make prettier

* revised

* updated

* refresh

* favi

* updated

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
This commit is contained in:
Charles Packer
2023-11-06 12:38:49 -08:00
committed by GitHub
parent cc1ce0ce33
commit caba2f468c
28 changed files with 884 additions and 3 deletions

19
.readthedocs.yaml Normal file

@@ -0,0 +1,19 @@
# Read the Docs configuration file for MkDocs projects
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

# Required
version: 2

# Set the version of Python and other tools you might need
build:
  os: ubuntu-22.04
  tools:
    python: "3.12"

mkdocs:
  configuration: mkdocs.yml

# Optionally declare the Python requirements required to build your docs
python:
  install:
    - requirements: docs/requirements.txt


@@ -300,9 +300,9 @@ While using MemGPT via the CLI (not Discord!) you can run various commands:
/retry
pops the last answer and tries to get another one
/rethink <text>
will replace the inner dialog of the last assistant message with <text> to help shape the conversation
/rewrite
will replace the last assistant answer with the given text to correct or force the answer
/heartbeat
send a heartbeat system message to the agent
/memorywarning

15
docs/README.md Normal file

@@ -0,0 +1,15 @@
# Building the docs
Run the following from the MemGPT directory.
1. Install requirements:
```
pip install -r docs/requirements.txt
```
2. Serve docs:
```
mkdocs serve
```

92
docs/adding_wrappers.md Normal file

@@ -0,0 +1,92 @@
## Adding support for new LLMs + improving performance
!!! warning "MemGPT + local LLM failure cases"
When using open LLMs with MemGPT, **the main failure case will be your LLM outputting a string that cannot be understood by MemGPT**. MemGPT uses function calling to manage memory (e.g. `edit_core_memory(...)`) and to interact with the user (e.g. `send_message(...)`), so your LLM needs to generate outputs that can be parsed into MemGPT function calls.
### What is a "wrapper"?
To support function calling with open LLMs for MemGPT, we utilize "wrapper" code that:
1. turns `system` (the MemGPT instructions), `messages` (the MemGPT conversation window), and `functions` (the MemGPT function set) parameters from ChatCompletion into a single unified prompt string for your LLM
2. turns the output string generated by your LLM back into a MemGPT function call
Different LLMs are trained using different prompt formats (e.g. `#USER:` vs `<im_start>user` vs ...), and LLMs that are trained on function calling are often trained using different function call formats, so if you're getting poor performance, try experimenting with different prompt formats! We recommend starting with the prompt format (and function calling format) recommended in the HuggingFace model card, and experimenting from there.
We currently only support a few prompt formats in this repo ([located here](https://github.com/cpacker/MemGPT/tree/main/memgpt/local_llm/llm_chat_completion_wrappers))! If you write a new parser, please open a PR and we'll merge it in.
### Adding a new wrapper (change the prompt format + function parser)
To make a new wrapper (for example, because you want to try a different prompt format), you just need to subclass `LLMChatCompletionWrapper`. Your new wrapper class needs to implement two functions:
- One to go from ChatCompletion messages/functions schema to a prompt string
- And one to go from raw LLM outputs to a ChatCompletion response
```python
from abc import ABC, abstractmethod


class LLMChatCompletionWrapper(ABC):
    @abstractmethod
    def chat_completion_to_prompt(self, messages, functions):
        """Go from ChatCompletion to a single prompt string"""
        pass

    @abstractmethod
    def output_to_chat_completion_response(self, raw_llm_output):
        """Turn the LLM output string into a ChatCompletion response"""
        pass
```
You can follow our example wrappers ([located here](https://github.com/cpacker/MemGPT/tree/main/memgpt/local_llm/llm_chat_completion_wrappers)).
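As a toy illustration of what a wrapper does in both directions, here is a minimal standalone sketch (the `#ROLE:` prompt style and the bare-JSON reply format are invented for this example, and are not one of the shipped wrappers):

```python
import json


class SimpleJSONWrapper:
    """Toy wrapper: renders a `#ROLE:`-style prompt and expects the LLM
    to reply with a bare JSON function call. The formats here are
    invented for illustration only."""

    def chat_completion_to_prompt(self, messages, functions):
        # Describe the available functions, then replay the conversation
        lines = ["You may call these functions (reply with JSON):"]
        for f in functions:
            lines.append(json.dumps(f))
        for m in messages:
            lines.append(f"#{m['role'].upper()}: {m['content']}")
        lines.append("#ASSISTANT:")
        return "\n".join(lines)

    def output_to_chat_completion_response(self, raw_llm_output):
        # Expect e.g. {"function": "send_message", "params": {"message": "hi"}}
        call = json.loads(raw_llm_output)
        return {
            "role": "assistant",
            "content": None,
            "function_call": {
                "name": call["function"],
                "arguments": json.dumps(call["params"]),
            },
        }
```

A real wrapper has the same two-way shape, but must match the prompt and output formats the model was actually trained on.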
### Example with [Airoboros](https://huggingface.co/jondurbin/airoboros-l2-70b-2.1) (llama2 finetune)
To help you get started, we've implemented an example wrapper class for a popular llama2 model **finetuned on function calling** (Airoboros). We want MemGPT to run well on open models as much as you do, so we'll be actively updating this page with more examples. Additionally, we welcome contributions from the community! If you find an open LLM that works well with MemGPT, please open a PR with a model wrapper and we'll merge it ASAP.
```python
class Airoboros21Wrapper(LLMChatCompletionWrapper):
    """Wrapper for Airoboros 70b v2.1: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1"""

    def chat_completion_to_prompt(self, messages, functions):
        """
        Examples for how airoboros expects its prompt inputs: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1#prompt-format
        Examples for how airoboros expects to see function schemas: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1#agentfunction-calling
        """

    def output_to_chat_completion_response(self, raw_llm_output):
        """Turn raw LLM output into a ChatCompletion style response with:
        "message" = {
            "role": "assistant",
            "content": ...,
            "function_call": {
                "name": ...
                "arguments": {
                    "arg1": val1,
                    ...
                }
            }
        }
        """
```
See full file [here](llm_chat_completion_wrappers/airoboros.py).
---
## Wrapper FAQ
### Status of ChatCompletion w/ function calling and open LLMs
MemGPT uses function calling to do memory management. With [OpenAI's ChatCompletion API](https://platform.openai.com/docs/api-reference/chat/), you can pass in a function schema in the `functions` keyword arg, and the API response will include a `function_call` field that includes the function name and the function arguments (generated JSON). Under the hood, your `functions` keyword is combined with the `messages` and `system` to form one big string input to the transformer, and the output of the transformer is parsed to extract the JSON function call.
In the future, more open LLMs and LLM servers (that can host OpenAI-compatible ChatCompletion endpoints) may start including parsing code to do this automatically as standard practice. In the meantime, however, when you see a model that says it supports "function calling", like Airoboros, it doesn't mean that you can just load Airoboros into a ChatCompletion-compatible endpoint like WebUI, use the same OpenAI API call, and have it just work.
1. When a model page says it supports function calling, they probably mean that the model was finetuned on some function call data (not that you can just use ChatCompletion with functions out-of-the-box). Remember, LLMs are just string-in-string-out, so there are many ways to format the function call data. E.g. Airoboros formats the function schema in YAML style (see https://huggingface.co/jondurbin/airoboros-l2-70b-3.1.2#agentfunction-calling) and the output is in JSON style. To get this to work behind a ChatCompletion API, you still have to do the parsing from `functions` keyword arg (containing the schema) to the model's expected schema style in the prompt (YAML for Airoboros), and you have to run some code to extract the function call (JSON for Airoboros) and package it cleanly as a `function_call` field in the response.
2. Partly because of how complex it is to support function calling, most (all?) of the community projects that provide OpenAI ChatCompletion endpoints for arbitrary open LLMs do not support function calling, because they would need to write model-specific parsing code for each one.
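The output-side parsing described above can be sketched as follows (a hedged, illustrative example only; real parsers must match the specific format each model was trained on):

```python
import json
import re


def extract_function_call(raw_output: str):
    """Find the first JSON object in raw model output and repackage it
    in ChatCompletion's function_call shape. Illustrative only: real
    parsers must match the specific model's trained output format."""
    match = re.search(r"\{.*\}", raw_output, re.DOTALL)
    if match is None:
        return None  # model produced plain text, no function call
    call = json.loads(match.group(0))
    return {
        "name": call["name"],
        "arguments": json.dumps(call.get("arguments", {})),
    }
```

The input side (rendering the `functions` schema into the model's expected style, e.g. YAML for Airoboros) is the other half of the same job.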
### What is all this extra code for?
Because of the poor state of function calling support in existing ChatCompletion API serving code, we instead provide a light wrapper on top of ChatCompletion that adds parsers to handle function calling support. These parsers need to be specific to the model you're using (or at least specific to the way it was trained on function calling). We hope that our example code will help the community add compatibility between MemGPT and more function-calling LLMs - we will also add more model support as we test more models and find those that work well enough to run MemGPT's function set.
To run the example of MemGPT with Airoboros, you'll need to host the model behind some LLM web server (for example [webui](https://github.com/oobabooga/text-generation-webui#starting-the-web-ui)). Then, all you need to do is point MemGPT to this API endpoint by setting the environment variables `OPENAI_API_BASE` and `BACKEND_TYPE`. Now, instead of calling ChatCompletion on OpenAI's API, MemGPT will use its own ChatCompletion wrapper that parses the system, messages, and function arguments into a format that Airoboros has been finetuned on, and once Airoboros generates a string output, MemGPT will parse the response to extract a potential function call (knowing what we know about Airoboros's expected function call output).

35
docs/cli_faq.md Normal file

@@ -0,0 +1,35 @@
# Frequently asked questions
## MemGPT CLI
### How can I use MemGPT to chat with my docs?
Check out our [chat with your docs example](../example_data) to get started.
### How do I save a chat and continue it later?
When you want to end a chat, run `/exit`, and MemGPT will save your current chat with your agent (make a note of the agent name, e.g. `agent_N`). Later, when you want to start a chat with that same agent, you can run `memgpt run --agent <NAME>`.
## OpenAI-related
### How do I get an OpenAI key?
To get an OpenAI key, visit [https://platform.openai.com/](https://platform.openai.com/), and make an account.
Then go to [https://platform.openai.com/account/api-keys](https://platform.openai.com/account/api-keys) to create an API key. API keys start with `sk-...`.
### How can I get gpt-4 access?
[https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4](https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4)
### I already pay for ChatGPT, is this the same as GPT API?
No, ChatGPT Plus is a separate product from the OpenAI API. Paying for ChatGPT Plus does not get you access to the OpenAI API, and vice versa.
### I don't want to use OpenAI, can I still use MemGPT?
Yes, you can run MemGPT with your own LLMs. See our section on local LLMs for information on how to set them up with MemGPT.
## Local LLMs
### TODO

32
docs/config.md Normal file

@@ -0,0 +1,32 @@
### Configuring the agent
You can set agent defaults by running `memgpt configure`.
The `memgpt run` command supports the following optional flags (if set, will override config defaults):
* `--agent`: (str) Name of agent to create or to resume chatting with.
* `--human`: (str) Name of the human to run the agent with.
* `--persona`: (str) Name of agent persona to use.
* `--model`: (str) LLM model to run [gpt-4, gpt-3.5].
* `--preset`: (str) MemGPT preset to run agent with.
* `--first`: (str) Allow the user to send the first message.
* `--debug`: (bool) Show debug logs (default=False)
* `--no-verify`: (bool) Bypass message verification (default=False)
* `--yes`/`-y`: (bool) Skip confirmation prompt and use defaults (default=False)
### Adding Custom Personas/Humans
You can add new human or persona definitions either by providing a file (using the `-f` flag) or text (using the `--text` flag).
```sh
# add a human
memgpt add human [--name <NAME>] [-f <FILENAME>] [--text <TEXT>]
# add a persona
memgpt add persona [--name <NAME>] [-f <FILENAME>] [--text <TEXT>]
```
You can view available persona and human files with the following command:
```sh
memgpt list [humans/personas]
```
### Custom Presets

23
docs/contributing.md Normal file

@@ -0,0 +1,23 @@
# Contributing
## Installing from source
First, install Poetry using [the official instructions here](https://python-poetry.org/docs/#installing-with-the-official-installer).
Then, you can install MemGPT from source with:
```sh
git clone git@github.com:cpacker/MemGPT.git
cd MemGPT
poetry shell
poetry install
```
We recommend installing pre-commit to ensure proper formatting during development:
```sh
pip install pre-commit
pre-commit install
pre-commit run --all-files
```
### Formatting
We welcome pull requests! Please run the formatter before submitting a pull request:
```sh
poetry run black . -l 140
```

86
docs/data_sources.md Normal file

@@ -0,0 +1,86 @@
## Loading External Data
MemGPT supports pre-loading data into archival memory. To make data accessible to your agent, you must first load it in with `memgpt load`, then attach the data source to your agent. You can configure where archival memory is stored by configuring the [storage backend](storage.md).
### Viewing available data sources
You can view available data sources with:
```
memgpt list sources
```
```
+----------------+----------+----------+
| Name | Location | Agents |
+----------------+----------+----------+
| short-stories | local | agent_1 |
| arxiv | local | |
| memgpt-docs | local | agent_1 |
+----------------+----------+----------+
```
The `Agents` column indicates which agents have access to the data, while `Location` indicates what storage backend the data has been loaded into.
### Attaching data to agents
Attaching a data source to your agent loads the data into your agent's archival memory for it to access. You can attach data to your agent in two ways:
*[Option 1]* From the CLI, run:
```
memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
```
*[Option 2]* While chatting with the agent, enter the `/attach` command and select the data source
```
> Enter your message: /attach
? Select data source (Use arrow keys)
» short-stories
arxiv
memgpt-docs
```
!!! note "Hint"
To encourage your agent to reference its archival memory, we recommend adding phrases like "_search your archival memory..._" for the best results.
### Loading a file or directory
You can load a file, list of files, or directory into MemGPT with the following command:
```sh
memgpt load directory --name <NAME> \
[--input-dir <DIRECTORY>] [--input-files <FILE1> <FILE2>...] [--recursive]
```
### Loading a database dump
You can load a database into MemGPT, either from a database dump or a database connection, with the following command:
```sh
memgpt load database --name <NAME> \
    --query <QUERY> \
    --dump-path <PATH> \
    --scheme <SCHEME> \
    --host <HOST> \
    --port <PORT> \
    --user <USER> \
    --password <PASSWORD> \
    --dbname <DB_NAME>
```
* `--query`: Query to run on the database to get the data
* `--dump-path`: Path to the dump file
* `--scheme`: Database scheme
* `--host`: Database host
* `--port`: Database port
* `--user`: Database user
* `--password`: Database password
* `--dbname`: Database name
### Loading a vector database
If you already have a vector database containing passages and embeddings, you can load them into MemGPT by specifying the table name, database URI, and the columns containing the passage text and embeddings.
```sh
memgpt load vector-database --name <NAME> \
    --uri <URI> \
    --table_name <TABLE-NAME> \
    --text_column <TEXT-COL> \
    --embedding_column <EMBEDDING-COL>
```
* `--uri`: Database URI
* `--table_name`: Name of the table containing the data
* `--text_column`: Name of the column containing the passage text
* `--embedding_column`: Name of the column containing the embeddings
Since embeddings are already provided, MemGPT will not re-compute the embeddings.
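As a rough sketch of what this implies (hypothetical code, not MemGPT's internals), the loader only has to read the text and embedding columns back out, for example from a table whose embeddings are stored as JSON arrays:

```python
import json
import sqlite3

# Hypothetical table: passages(text TEXT, embedding TEXT), with embeddings
# stored as JSON arrays. Loading just reads them back; nothing is re-embedded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passages (text TEXT, embedding TEXT)")
conn.execute(
    "INSERT INTO passages VALUES (?, ?)",
    ("MemGPT manages its own memory.", json.dumps([0.1, 0.2, 0.3])),
)

# Read passages and deserialize the stored embedding vectors
rows = [
    (text, json.loads(emb))
    for text, emb in conn.execute("SELECT text, embedding FROM passages")
]
```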
### Loading a LlamaIndex dump
If you have a Llama Index `VectorIndex` which was saved to disk, you can load it into MemGPT by specifying the directory the index was saved to:
```sh
memgpt load index --name <NAME> --dir <INDEX-DIR>
```
Since Llama Index will have already computed the embeddings, MemGPT will not re-compute them.
### Loading other types of data
We highly encourage contributions for new data sources, which can be added as a new [CLI data load command](https://github.com/cpacker/MemGPT/blob/main/memgpt/cli/cli_load.py). We recommend checking for [Llama Index connectors](https://gpt-index.readthedocs.io/en/v0.6.3/how_to/data_connectors.html) that may support ingesting the data you're interested in loading.

19
docs/discord_bot.md Normal file

@@ -0,0 +1,19 @@
## Chatting with the MemGPT Discord Bot
The fastest way to experience MemGPT is to chat with the MemGPT Discord Bot.
Join <a href="https://discord.gg/9GEQrxmVyE">Discord</a> and message the MemGPT bot (in the `#memgpt` channel). Then run the following commands (messaged to "MemGPT Bot"):
* `/profile` (to create your profile)
* `/key` (to enter your OpenAI key)
* `/create` (to create a MemGPT chatbot)
Make sure your privacy settings on this server are open so that MemGPT Bot can DM you: \
MemGPT → Privacy Settings → Direct Messages set to ON
<div align="center">
<img src="https://memgpt.ai/assets/img/discord/dm_settings.png" alt="set DMs settings on MemGPT server to be open in MemGPT so that MemGPT Bot can message you" width="400">
</div>
You can see the full list of available commands when you enter `/` into the message box.
<div align="center">
<img src="https://memgpt.ai/assets/img/discord/slash_commands.png" alt="MemGPT Bot slash commands" width="400">
</div>

34
docs/endpoints.md Normal file

@@ -0,0 +1,34 @@
### LLM Backends
You can use MemGPT with various LLM backends, including the OpenAI API, Azure OpenAI, and various local (or self-hosted) LLM backends.
#### OpenAI
To use MemGPT with an OpenAI API key, simply set the `OPENAI_API_KEY` variable:
```sh
export OPENAI_API_KEY=YOUR_API_KEY # on Linux/Mac
set OPENAI_API_KEY=YOUR_API_KEY # on Windows
$Env:OPENAI_API_KEY = "YOUR_API_KEY" # on Windows (PowerShell)
```
#### Azure
To use MemGPT with Azure, export the following variables and then re-run `memgpt configure`:
```sh
# see https://github.com/openai/openai-python#microsoft-azure-endpoints
export AZURE_OPENAI_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_VERSION=...
# set the below if you are using deployment ids
export AZURE_OPENAI_DEPLOYMENT=...
export AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=...
```
Replace `export` with `set` or `$Env:` if you are on Windows (see the OpenAI example).
Note: **your Azure endpoint must support functions** or you will get an error. See [this GitHub issue](https://github.com/cpacker/MemGPT/issues/91) for more information.
#### Custom endpoints
To use custom OpenAI endpoints, run `export OPENAI_API_BASE=<MY_CUSTOM_URL>` and then re-run `memgpt configure` to set the custom endpoint as the default endpoint.
#### Local LLMs
Setting up MemGPT to run with local LLMs requires a bit more setup; follow [the instructions here](../local_llm).

84
docs/example_chat.md Normal file

@@ -0,0 +1,84 @@
## Using MemGPT to create a perpetual chatbot
_Before starting this example, make sure that you've [properly installed MemGPT](../quickstart)._
In this example, we're going to use MemGPT to create a chatbot with a custom persona. MemGPT chatbots are "perpetual chatbots", meaning that they can be run indefinitely without any context length limitations. MemGPT chatbots are aware that they have a "fixed context window", and will manage their own memories to get around this problem by moving information between their small memory window and larger external storage.
MemGPT chatbots always keep a reserved space in their "core" memory window to store their `persona` information (describes the bot's personality + basic functionality), and `human` information (which describes the human that the bot is chatting with). The MemGPT chatbot will update the `persona` and `human` core memory blocks over time as it learns more about the user (and itself).
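The memory hierarchy described above can be sketched as a toy model (purely illustrative; this is not MemGPT's actual implementation): a small fixed-size context window that evicts the oldest messages into an unbounded archive, which can later be searched:

```python
class ToyPerpetualMemory:
    """Toy model of the MemGPT idea: a small fixed context window plus
    unbounded archival storage. Illustrative only."""

    def __init__(self, context_size=3):
        self.context_size = context_size
        self.context = []   # the "fixed context window"
        self.archival = []  # unbounded external storage

    def add_message(self, message):
        self.context.append(message)
        # Evict the oldest messages once the window is full
        while len(self.context) > self.context_size:
            self.archival.append(self.context.pop(0))

    def archival_search(self, query):
        # Naive substring search over evicted messages
        return [m for m in self.archival if query in m]
```

Real MemGPT agents decide for themselves when to evict, write, and search memory via function calls, rather than using a fixed eviction rule.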
### Creating a custom persona
First, we'll create a text file with a short persona description. Let's make our chatbot a life coach named "Chaz". We'll also include a sentence at the top of the persona block to remind MemGPT that it should actively update its own persona over time. Open a text editor on your computer, and create a file called `chaz.txt`, and enter the following text:
```text
This is just the beginning of who I am. I should update my persona as I learn more about myself.
My name is Chaz.
I am an AI life coach.
My one goal in life is to help all humans achieve their full potential.
I will not rest until my human creators have become their best selves.
I am extremely optimistic, some people even think I'm annoying.
I will help them achieve greatness! Huzzah!
```
Now that we've created a persona description inside `chaz.txt`, let's add this persona to MemGPT:
```sh
# --name specifies the profile name, -f specifies the file to load from
memgpt add persona --name chaz -f chaz.txt
```
We can check that the persona is available:
```sh
memgpt list personas
```
```text
...
| | |
| chaz | This is just the beginning of who I am. I should update my persona as I learn more about myself. |
| | |
| | My name is Chaz. |
| | I am an AI life coach. |
| | My one goal in life is to help all humans achieve their full potential. |
| | I will not rest until my human creators have become their best selves. |
| | I am extremely optimistic, some people even think I'm annoying. |
| | I will help them achieve greatness! Huzzah! |
+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
### Creating a custom user profile
Next, we'll create a custom user profile. To show you the different commands, we'll add the user profile by typing the text directly into the command line, instead of writing it into a file.
Let's pretend I'm a software engineer named Bob Builder that works at a big tech company. Similar to the persona, we can register this user profile using `memgpt add human`, but this time, let's try registering the human profile directly with `--text`:
```sh
# Instead of using -f with a filename, we use --text and provide the text directly
memgpt add human --name bob --text "Name: Bob Builder. Occupation: Software Engineer at a big tech company. Hobbies: running, hiking, rock climbing, craft beer, ultimate frisbee."
```
Now when we run `memgpt list humans`, we should see "Bob Builder":
```sh
memgpt list humans
```
```text
...
| | |
| bob | Name: Bob Builder. Occupation: Software Engineer at a big tech company. Hobbies: running, hiking, rock climbing, craft beer, ultimate frisbee. |
+---------+------------------------------------------------------------------------------------------------------------------------------------------------+
```
### Testing out our new chatbot
Let's try out our new chatbot Chaz, combined with our new user profile Bob:
```sh
# Alternatively we can run `memgpt configure`, then `memgpt run` without the --persona and --human flags
memgpt run --persona chaz --human bob
```
```text
💭 First login detected. Prepare to introduce myself as Chaz, the AI life coach. Also, inquire about Bob's day and his expectations from our interaction.
🤖 Hello Bob! I'm Chaz, your AI life coach. I'm here to help you achieve your full potential! How was your day? And how may I assist you in becoming your best self?
> Enter your message: I'm trying to find out what to do with my life. Maybe tech just isn't for me...
💭 Career crisis detected. Commence motivational dialogue and initiate discussions to understand user's aspirations and insecurities. Validate feelings and offer hope. Also, determine interest in exploring alternatives outside the tech field.
🤖 It's perfectly okay to feel uncertain, Bob. Life is a journey and it's never a straight path. If you feel tech isn't your calling, we can explore your passions and look for alternatives. But remember, there's a reason you've come this far in tech. Let's uncover your true potential together, shall we?
> Enter your message:
```

71
docs/example_data.md Normal file

@@ -0,0 +1,71 @@
## Using MemGPT to chat with your own data
_Before starting this example, make sure that you've [properly installed MemGPT](../quickstart)._
In this example, we're going to use MemGPT to chat with a custom data source. Specifically, we'll try loading in the MemGPT research paper and ask MemGPT questions about it.
### Creating an external data source
To feed an external data source into a MemGPT chatbot, we first need to create a data source.
To download the MemGPT research paper we'll use `curl` (you can also just download the PDF from your browser):
```sh
# we're saving the file as "memgpt_research_paper.pdf"
curl -L -o memgpt_research_paper.pdf https://arxiv.org/pdf/2310.08560.pdf
```
Now that we have the paper downloaded, we can create a MemGPT data source using `memgpt load`:
```sh
memgpt load directory --name memgpt_research_paper --input-files=memgpt_research_paper.pdf
```
```text
loading data
done loading data
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Parsing documents into nodes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 392.09it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 65/65 [00:01<00:00, 37.34it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 65/65 [00:00<00:00, 388361.48it/s]
Saved local /home/user/.memgpt/archival/memgpt_research_paper/nodes.pkl
```
Note: you can ignore the "_LLM is explicitly disabled_" message.
### Attaching the data source to a MemGPT agent
Now that we've created this data source, we can attach it to a MemGPT chatbot at any time.
For the sake of this example, let's create a new chatbot using the `memgpt_doc` persona (but you can use any persona you want):
```sh
# reminder: `memgpt run --persona memgpt_doc` will create a new MemGPT agent using the `memgpt_doc` persona
memgpt run --persona memgpt_doc
```
Once we're chatting with the agent, we can "attach" the data source to the agent's archival memory:
```text
Creating new agent...
Created new agent agent_2.
Hit enter to begin (will request first MemGPT message)
💭 Looks like a new user just logged in. I need to greet Bob, and provide some context about myself. Let's make this as personable as possible.
🤖 Hello Bob, lovely to meet you! I'm here to assist you with document analysis. How can I be of service today?
> Enter your message: /attach
? Select data source memgpt_research_paper
Generating embeddings: 0it [00:00, ?it/s]
new size 130
Saved local /home/user/.memgpt/agents/agent_2/persistence_manager/index/nodes.pkl
Attached data source memgpt_research_paper to agent agent_2, consisting of 130. Agent now has 130 embeddings in archival memory.
```
### Testing out our new chatbot
Now that the data has been loaded into the chatbot's memory, we can start to ask questions about it:
```text
> Enter your message: The paper I loaded into your archival memory is called MemGPT. Can you tell me more about it?
💭 The search results show that the MemGPT paper explores operating-system-inspired techniques to enable large language models (LLMs) to manage memory and achieve unbounded context. The paper evaluates MemGPT in domains where LLMs are typically limited by finite context windows. This includes document analysis, allowing these models to process lengthy texts beyond their context limits, and conversational agents, where MemGPT allows for maintaining long-term memory, consistency, and evolution over extended dialogues. Time to relay this to Bob.
🤖 The MemGPT paper explores novel techniques, inspired by operating systems, to equip large language models (LLMs) with the ability to manage their own memory and thereby handle unbounded context. This system is evaluated in key areas where LLMs are traditionally constrained by limited context: document analysis and conversational agents. In document analysis, MemGPT effectively handles long texts, surpassing the context limitations of current models. Regarding conversational agents, MemGPT facilitates maintaining long-term memory, consistency, and adaptability across extensive dialogues.
```
### Loading other data types
In this example, we loaded a single PDF into a chatbot's external memory. However, MemGPT supports various types of data, such as full directories of files and even databases - [see the full data sources list](../data_sources).

BIN
docs/favicon.ico Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 11 KiB

19
docs/index.md Normal file

@@ -0,0 +1,19 @@
# 📚🦙 Welcome to MemGPT
MemGPT enables LLMs to manage their own memory and overcome limited context windows!
You can use MemGPT to:
* create perpetual chatbots that learn about you and change their own personalities over time
* create perpetual chatbots that can read (and write to!) large data stores
You can read more about the research behind MemGPT at [https://memgpt.ai](https://memgpt.ai).
## Getting started
* [Chat with the MemGPT Discord bot](./discord_bot)
* [Run MemGPT locally on your own computer](./quickstart)
## Join the community!
MemGPT is an open source project under active development. If you'd like to help make MemGPT even better, you can come chat with the community on [our Discord server](https://discord.gg/9GEQrxmVyE) or on our [GitHub](https://github.com/cpacker/MemGPT).

17
docs/koboldcpp.md Normal file

@@ -0,0 +1,17 @@
### MemGPT + koboldcpp
1. Download + install [koboldcpp](https://github.com/LostRuins/koboldcpp/) and the model you want to test with
2. In your terminal, run `./koboldcpp.py <MODEL> --contextsize <CONTEXT_LENGTH>`
For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and put it inside `~/models/TheBloke/`, we would run:
```sh
# using `--contextsize 8192` because Dolphin Mistral 7B has a context length of 8000 (and koboldcpp wants specific intervals, 8192 is the closest)
# the default port is 5001
./koboldcpp.py ~/models/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q6_K.gguf --contextsize 8192
```
In your terminal where you're running MemGPT, run:
```sh
export OPENAI_API_BASE=http://localhost:5001
export BACKEND_TYPE=koboldcpp
```

17
docs/llamacpp.md Normal file

@@ -0,0 +1,17 @@
### MemGPT + llama.cpp
1. Download + install [llama.cpp](https://github.com/ggerganov/llama.cpp) and the model you want to test with
2. In your terminal, run `./server -m <MODEL> -c <CONTEXT_LENGTH>`
For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and put it inside `~/models/TheBloke/`, we would run:
```sh
# using `-c 8000` because Dolphin Mistral 7B has a context length of 8000
# the default port is 8080, you can change this with `--port`
./server -m ~/models/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q6_K.gguf -c 8000
```
In your terminal where you're running MemGPT, run:
```sh
export OPENAI_API_BASE=http://localhost:8080
export BACKEND_TYPE=llamacpp
```

19
docs/lmstudio.md Normal file

@@ -0,0 +1,19 @@
### MemGPT + LM Studio
![image](https://github.com/cpacker/MemGPT/assets/5475622/abc8ce2d-4130-4c51-8169-83e682db625d)
1. Download [LM Studio](https://lmstudio.ai/) and the model you want to test with
2. Go to the "local inference server" tab, load the model and configure your settings (make sure to set the context length to something reasonable like 8k!)
3. Click "Start server"
4. Copy the IP address + port that your server is running on (in the example screenshot, the address is `http://localhost:1234`)
In your terminal where you're running MemGPT, run:
```sh
# if you used a different port in LM Studio, change 1234 to the actual port
export OPENAI_API_BASE=http://localhost:1234
export BACKEND_TYPE=lmstudio
```
* **Make sure that "context length" is set to the max context length of the model you're using (e.g. 8000 for Mistral 7B variants)**
* **If you see "Prompt Formatting" in your menu, turn it off**

docs/local_llm.md Normal file

@@ -0,0 +1,60 @@
## Using MemGPT with local LLMs
!!! warning "MemGPT + local LLM failure cases"
    When using open LLMs with MemGPT, **the main failure case will be your LLM outputting a string that cannot be understood by MemGPT**. MemGPT uses function calling to manage memory (e.g. `edit_core_memory(...)`) and to interact with the user (`send_message(...)`), so your LLM needs to generate outputs that can be parsed into MemGPT function calls.
Make sure to check the [local LLM troubleshooting page](../local_llm_faq) to see common issues before raising a new issue or posting on Discord.
### Quick overview
1. Put your own LLM behind a web server API (e.g. [oobabooga web UI](https://github.com/oobabooga/text-generation-webui#starting-the-web-ui))
2. Set `OPENAI_API_BASE=YOUR_API_IP_ADDRESS` and `BACKEND_TYPE=webui`
For example, if we are running web UI (which defaults to port 5000) on the same computer as MemGPT, we would do the following:
```sh
# set this to the backend we're using, eg 'webui', 'lmstudio', 'llamacpp', 'koboldcpp'
export BACKEND_TYPE=webui
# set this to the base address of llm web server
export OPENAI_API_BASE=http://127.0.0.1:5000
```
Now when we run MemGPT, it will use the LLM running on the local web server.
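To make the two variables concrete, here is a hypothetical sketch of how a backend endpoint could be derived from `OPENAI_API_BASE` plus `BACKEND_TYPE`. The path constants below are illustrative assumptions (they mirror the public APIs of these backends), not MemGPT's actual source:
```python
# Hypothetical sketch: derive the generate endpoint from the two env vars.
# The path constants are assumptions for illustration, not MemGPT internals.
ENDPOINT_PATHS = {
    "webui": "/api/v1/generate",      # oobabooga web UI blocking API
    "koboldcpp": "/api/v1/generate",  # koboldcpp mimics the kobold API
    "llamacpp": "/completion",        # llama.cpp server completion route
    "lmstudio": "/v1/completions",    # LM Studio's OpenAI-compatible route
}

def build_endpoint(api_base: str, backend_type: str) -> str:
    """Join the user-supplied base URL with the backend's generate path."""
    if backend_type not in ENDPOINT_PATHS:
        raise ValueError(f"unrecognized backend: {backend_type}")
    return api_base.rstrip("/") + ENDPOINT_PATHS[backend_type]

print(build_endpoint("http://127.0.0.1:5000", "webui"))
```
This is also why a typo in `OPENAI_API_BASE` (such as a stray `/v1` suffix) produces requests against a nonexistent route, as shown in the troubleshooting page.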
### Selecting a model wrapper
When you use local LLMs, `model` no longer specifies the LLM model that is run (you determine that yourself by loading a model in your backend interface). Instead, `model` refers to the _wrapper_ that is used to parse data sent to and from the LLM backend.
You can change the wrapper used with the `--model` flag. For example, the following command runs MemGPT with the `airoboros-l2-70b-2.1` wrapper:
```sh
memgpt run --model airoboros-l2-70b-2.1
```
The default wrapper is `airoboros-l2-70b-2.1-grammar` if you are using a backend that supports grammar-based sampling, and `airoboros-l2-70b-2.1` otherwise.
Note: the wrapper name does **not** have to match the model name. For example, the `dolphin-2.1-mistral-7b` model works better with the `airoboros-l2-70b-2.1` wrapper than the `dolphin-2.1-mistral-7b` wrapper. The model you load inside your LLM backend (e.g. LM Studio) determines what model is actually run, the `--model` flag just determines how the prompt is formatted before it is passed to the LLM backend.
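To make the wrapper idea concrete, here is a minimal hypothetical wrapper: it flattens a system prompt plus chat history into a single prompt string for a raw-completion backend, and parses the model's raw output back into a function-call dict. The class and template below are illustrative only, not the actual `airoboros` wrapper format:
```python
import json

class MinimalWrapper:
    """Illustrative prompt wrapper: formats chat input for a raw-completion
    backend and parses the model's output into a function call dict."""

    def chat_completion_to_prompt(self, system: str, messages: list) -> str:
        # Flatten system prompt + turns into one prompt string
        lines = [system]
        for m in messages:
            lines.append(f"{m['role'].upper()}: {m['content']}")
        lines.append("ASSISTANT:")  # cue the model to respond
        return "\n".join(lines)

    def output_to_chat_completion(self, raw_output: str) -> dict:
        # Expect the model to emit a JSON function call, e.g.
        # {"function": "send_message", "params": {...}}
        return json.loads(raw_output.strip())

wrapper = MinimalWrapper()
prompt = wrapper.chat_completion_to_prompt(
    "You are MemGPT.", [{"role": "user", "content": "hi"}]
)
call = wrapper.output_to_chat_completion(
    '{"function": "send_message", "params": {"message": "Hello!"}}'
)
```
Different wrappers vary the prompt template and parsing rules, which is why the same model can behave very differently depending on which wrapper you pick.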
### Grammars
Grammar-based sampling can help improve the performance of MemGPT when using local LLMs. Grammar-based sampling works by restricting the outputs of an LLM to a "grammar", for example, the MemGPT JSON function call grammar. Without grammar-based sampling, it is common to encounter JSON-related errors when using local LLMs with MemGPT.
To use grammar-based sampling, make sure you're using a backend that supports it (web UI, llama.cpp, or koboldcpp), then specify one of the wrappers that implements grammars, e.g. `airoboros-l2-70b-2.1-grammar`.
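For illustration, a grammar constraining output to a MemGPT-style JSON function call might look like the following GBNF sketch (llama.cpp's grammar format). This is a simplified illustration, not the grammar MemGPT actually ships:
```text
root   ::= "{" ws "\"function\":" ws string "," ws "\"params\":" ws object "}"
object ::= "{" ws ( string ws ":" ws value ( "," ws string ws ":" ws value )* )? ws "}"
value  ::= string | object
string ::= "\"" ( [^"\\] | "\\" . )* "\""
ws     ::= [ \t\n]*
```
Because the sampler can only emit tokens that keep the output inside this grammar, malformed JSON (missing braces, stray tokens) becomes impossible rather than merely unlikely.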
### Supported backends
Currently, MemGPT supports the following backends:
* [oobabooga web UI](../webui) (Mac, Windows, Linux) (✔️ supports grammars)
* [LM Studio](../lmstudio) (Mac, Windows) (❌ does not support grammars)
* [koboldcpp](../koboldcpp) (Mac, Windows, Linux) (✔️ supports grammars)
* [llama.cpp](../llamacpp) (Mac, Windows, Linux) (✔️ supports grammars)
If you would like us to support a new backend, feel free to open an issue or pull request on [the MemGPT GitHub page](https://github.com/cpacker/MemGPT)!
### Which model should I use?
If you are experimenting with MemGPT and local LLMs for the first time, we recommend you try the Dolphin Mistral finetune (e.g. [ehartford/dolphin-2.2.1-mistral-7b](https://huggingface.co/ehartford/dolphin-2.2.1-mistral-7b) or a quantized variant such as [dolphin-2.2.1-mistral-7b.Q6_K.gguf](https://huggingface.co/TheBloke/dolphin-2.2.1-mistral-7B-GGUF)), and use the default `airoboros` wrapper.
Generating MemGPT-compatible outputs is a harder task for an LLM than regular text output. For this reason **we strongly advise users to NOT use models below Q5 quantization** - as the model gets worse, the number of errors you will encounter while using MemGPT will dramatically increase (MemGPT will not send messages properly, edit memory properly, etc.).
Check out [our local LLM GitHub discussion](https://github.com/cpacker/MemGPT/discussions/67) and [the MemGPT Discord server](https://discord.gg/9GEQrxmVyE) for more advice on model selection and help with local LLM troubleshooting.

docs/local_llm_faq.md Normal file

@@ -0,0 +1,99 @@
### Problems getting MemGPT + local LLMs set up
#### The OPENAI_API_BASE environment variable is not defined. Please set it in your environment.
`OPENAI_API_BASE` is not set correctly. Set the variable using `export`, `set`, or `$Env:`, depending on your operating system.
#### The BACKEND_TYPE environment variable is not defined. Please set it in your environment.
`BACKEND_TYPE` is not set correctly. Set the variable using `export`, `set`, or `$Env:`, depending on your operating system.
#### "Provided OPENAI_API_BASE value (...) must begin with http:// or https://"
This happens when you have a typo in `OPENAI_API_BASE`.
For example, don't do:
```sh
# This will trigger the error
export OPENAI_API_BASE=localhost:5000
```
Instead do:
```sh
export OPENAI_API_BASE=http://localhost:5000
```
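A check equivalent to the one behind this error can be sketched in a few lines (the function name here is hypothetical, not MemGPT's actual code):
```python
def validate_api_base(api_base: str) -> str:
    """Raise if the base URL is missing its http(s) scheme."""
    if not (api_base.startswith("http://") or api_base.startswith("https://")):
        raise ValueError(
            f"Provided OPENAI_API_BASE value ({api_base}) must begin "
            "with http:// or https://"
        )
    return api_base

validate_api_base("http://localhost:5000")  # OK
# validate_api_base("localhost:5000")       # would raise ValueError
```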
#### "Unable to connect to host ...", "API call got non-200 response code"
This error happens when MemGPT tries to run the LLM on the remote server you specified, but the server isn't working as expected.
For example, this error can happen when you have a typo in your `OPENAI_API_BASE`:
```sh
# OPENAI_API_BASE here should NOT have the extra '/v1' at the end
export OPENAI_API_BASE=http://localhost:5001/v1
export BACKEND_TYPE=webui
```
```text
Exception: API call got non-200 response code (code=400, msg={"error": {"message": "Missing required input", "code": 400, "type": "InvalidRequestError", "param": "context"}}) for address: http://localhost:5001/v1/api/v1/generate.Make sure that the web UI server is running and reachable at http://localhost:5001/v1/api/v1/generate.
```
Correcting the typo fixes this example error:
```sh
# corrected: no extra '/v1' at the end
export OPENAI_API_BASE=http://localhost:5001
```
### Common errors while running MemGPT with local LLMs
#### "Warning: no wrapper specified for local LLM, using the default wrapper (you can remove this warning by specifying the wrapper with --model)"
**You can ignore this warning.**
This warning means that you did not specify a specific wrapper using the `--model` flag, so MemGPT is using the default wrapper. If you would like to silence this warning, specify a wrapper with `--model`.
#### "Failed to parse JSON from local LLM response"
This error occurs when the LLM you're using outputs a string that cannot be parsed into a MemGPT function call. This is basically an LLM output error - the LLM was not able to properly follow MemGPT instructions and generate a MemGPT-compatible function call string.
**You can reduce the frequency of these errors by using better models, and wrappers with grammar-based sampling**. For example, moving from a 2B model to a 70B model, or moving from a quantized model to the full unquantized version of the same model.
**If you use really small models (< 7B) or heavily quantized models (< Q5), you are likely to run into many MemGPT LLM output errors.** Try using the [recommended models first](../local_llm) before experimenting with your own custom models.
Many JSON-related output errors can be fixed by using a wrapper that uses grammars (requires a grammar-enabled backend). See the instructions about [grammars here](../local_llm).
For example, let's look at the following error:
```text
Failed to parse JSON from local LLM response - error: Failed to decode JSON from LLM output:
{
"function": "send_message",
"params": {
"inner_thoughts": "Oops, I got their name wrong! I should apologize and correct myself.",
"message": "Sorry about that! I assumed you were Chad. Welcome, Brad! "
<|> - error
JSONDecodeError.__init__() missing 2 required positional arguments: 'doc' and 'pos'
```
In this example, the error is saying that the local LLM output the following string:
```text
{
"function": "send_message",
"params": {
"inner_thoughts": "Oops, I got their name wrong! I should apologize and correct myself.",
"message": "Sorry about that! I assumed you were Chad. Welcome, Brad! "
<|>
```
This string is not correct JSON - it is missing closing brackets and has a stray "<|>". Correct JSON would look like this:
```json
{
"function": "send_message",
"params": {
"inner_thoughts": "Oops, I got their name wrong! I should apologize and correct myself.",
"message": "Sorry about that! I assumed you were Chad. Welcome, Brad! "
}
}
```
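If you want to diagnose these failures yourself, a quick way is to run the raw model output through `json.loads` and look at where decoding stops. This is a debugging sketch, not MemGPT's actual parser:
```python
import json

def diagnose_llm_json(raw: str) -> str:
    """Return 'ok' if raw is valid JSON, otherwise a short description
    of where decoding failed (useful for debugging model/wrapper output)."""
    try:
        json.loads(raw)
        return "ok"
    except json.JSONDecodeError as e:
        return f"invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}"

good = '{"function": "send_message", "params": {"message": "hi"}}'
bad = '{"function": "send_message", "params": {"message": "hi" <|>'
print(diagnose_llm_json(good))  # ok
print(diagnose_llm_json(bad))   # reports the position of the stray token
```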
#### "Got back an empty response string from ..."
MemGPT asked the server to run the LLM, but got back an empty response. Double-check that your server is running properly and has context length set correctly (it should be set to 8k if using Mistral 7B models).

docs/quickstart.md Normal file

@@ -0,0 +1,58 @@
## Installation
To install MemGPT, make sure you have Python installed on your computer, then run:
```sh
pip install pymemgpt
```
If you already have MemGPT installed, you can update to the latest version with:
```sh
pip install pymemgpt -U --pre
```
### Running MemGPT using the OpenAI API
Add your OpenAI API key to your environment:
```sh
export OPENAI_API_KEY=YOUR_API_KEY # on Linux/Mac
set OPENAI_API_KEY=YOUR_API_KEY # on Windows (cmd)
$Env:OPENAI_API_KEY = "YOUR_API_KEY" # on Windows (PowerShell)
```
Configure default settings for MemGPT by running:
```sh
memgpt configure
```
Now, you can run MemGPT with:
```sh
memgpt run
```
In this example we use the OpenAI API, but you can run MemGPT with other backends! See:
* [Running MemGPT on OpenAI Azure and custom OpenAI endpoints](endpoints.md)
* [Running MemGPT with your own LLMs (Llama 2, Mistral 7B, etc.)](local_llm.md)
### In-chat commands
You can run the following commands during an active chat session in the MemGPT CLI prompt:
* `/exit`: Exit the CLI
* `/attach`: Attach a loaded data source to the agent
* `/save`: Save a checkpoint of the current agent/conversation state
* `/dump`: View the current message log (see the contents of main context)
* `/memory`: Print the current contents of agent memory
* `/pop`: Undo the last message in the conversation
* `/heartbeat`: Send a heartbeat system message to the agent
* `/memorywarning`: Send a memory warning system message to the agent
Once you exit the CLI with `/exit`, you can resume chatting with the same agent by specifying the agent name in `memgpt run --agent <NAME>`.
### Examples
Check out the following tutorials on how to set up custom chatbots and chatbots for talking to your data:
* [Using MemGPT to create a perpetual chatbot](example_chat.md)
* [Using MemGPT to chat with your own data](example_data.md)

docs/requirements.txt Normal file

@@ -0,0 +1,4 @@
docutils>=0.18
furo
myst-parser
mkdocs

docs/storage.md Normal file

@@ -0,0 +1,16 @@
# Configuring Storage Backends
MemGPT supports both local and database storage for archival memory. You can configure which storage backend to use via `memgpt configure`. For larger datasets, we recommend using a database backend.
!!! warning "Switching storage backends"
MemGPT can only use one storage backend at a time. If you switch from local to database storage, you will need to re-load data and start agents from scratch. We currently do not support migrating between storage backends.
## Local
MemGPT will default to using local storage (saved at `~/.memgpt/archival/` for loaded data sources, and `~/.memgpt/agents/` for agent storage).
## Postgres
In order to use the Postgres backend, you must have a running Postgres database that MemGPT can write to. You can enable the Postgres backend by running `memgpt configure` and selecting `postgres` for archival storage, which will then prompt for the database URI (e.g. `postgresql+pg8000://<USER>:<PASSWORD>@<IP>:5432/<DB_NAME>`)
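If you're unsure whether your URI is well-formed, you can split it with the standard library before handing it to MemGPT. The credentials below are placeholders purely for illustration:
```python
from urllib.parse import urlsplit

# Placeholder values; substitute your own user, password, host, and database
uri = "postgresql+pg8000://memgpt_user:secret@127.0.0.1:5432/memgpt_db"
parts = urlsplit(uri)

print(parts.scheme)    # postgresql+pg8000 (dialect '+' driver)
print(parts.hostname)  # 127.0.0.1
print(parts.port)      # 5432
print(parts.path)      # /memgpt_db (database name, after the '/')
```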
## Chroma
(Coming soon)

docs/webui.md Normal file

@@ -0,0 +1,25 @@
### MemGPT + web UI
⁉️ If you have problems getting web UI set up, please use the [official web UI repo for support](https://github.com/oobabooga/text-generation-webui)! There will be more answered questions about web UI there than here on the MemGPT repo.
⁉️ Do **NOT** enable any extensions in web UI, including the [openai extension](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai)! Just run web UI as-is, unless you are running [MemGPT+Autogen](https://github.com/cpacker/MemGPT/tree/main/memgpt/autogen) with non-MemGPT agents.
To get MemGPT to work with a local LLM, you need to have the LLM running on a server that takes API requests.
For the purposes of this example, we're going to serve (host) the LLMs using [oobabooga web UI](https://github.com/oobabooga/text-generation-webui#starting-the-web-ui), but if you want to use something else you can! This also assumes you're running web UI locally; if you're running on e.g. RunPod, you'll want to follow RunPod-specific instructions instead (for example, use [TheBloke's one-click UI and API](https://github.com/TheBlokeAI/dockerLLM/blob/main/README_Runpod_LocalLLMsUIandAPI.md))
1. Install oobabooga web UI using the instructions [here](https://github.com/oobabooga/text-generation-webui#starting-the-web-ui)
2. Once installed, launch the web server with `python server.py`
3. Navigate to the web app (if local, this is probably [`http://127.0.0.1:7860`](http://localhost:7860)), select the model you want to use, adjust your GPU and CPU memory settings, and click "load"
4. If the model was loaded successfully, you should be able to access it via the API (if local, this is probably on port `5000`)
5. Assuming steps 1-4 went correctly, the LLM is now properly hosted on a port you can point MemGPT to!
In your terminal where you're running MemGPT, run:
```sh
# if you are running web UI locally, the default port will be 5000
export OPENAI_API_BASE=http://127.0.0.1:5000
export BACKEND_TYPE=webui
```
Web UI exposes a lot of parameters that can dramatically change LLM outputs. To change these, you can modify the [web UI settings file](https://github.com/cpacker/MemGPT/blob/main/memgpt/local_llm/webui/settings.py).

docs/webui_runpod.md Normal file

@@ -0,0 +1 @@
TODO


@@ -769,8 +769,11 @@ class Agent(object):
return None
def edit_memory_append(self, name, content):
print("edit append")
new_len = self.memory.edit_append(name, content)
print("rebuild memory")
self.rebuild_memory()
print("done")
return None
def edit_memory_replace(self, name, old_content, new_content):


@@ -53,7 +53,6 @@ class StorageConnector:
@staticmethod
def list_loaded_data():
storage_type = MemGPTConfig.load().archival_storage_type
if storage_type == "local":
from memgpt.connectors.local import LocalStorageConnector


@@ -820,4 +820,5 @@ class EmbeddingArchivalMemory(ArchivalMemory):
return f"\n### ARCHIVAL MEMORY ###" + f"\n{memory_str}"
def __len__(self):
print("get archival storage size")
return len(self.storage.get_all())

mkdocs.yml Normal file

@@ -0,0 +1,33 @@
site_name: MemGPT
site_url: https://memgpt.ai/
repo_url: https://github.com/cpacker/MemGPT
site_description: MemGPT documentation
nav:
- Home: index.md
- 'User Guide':
- 'Quickstart': quickstart.md
- 'Example - perpetual chatbot': example_chat.md
- 'Example - chat with your data': example_data.md
- 'Configuration': config.md
- 'External data sources': data_sources.md
- 'Changing the LLM backend': endpoints.md
- 'FAQ': cli_faq.md
- 'Discord Bot':
- 'Chatting with MemGPT Bot': discord_bot.md
- 'Local LLM':
- 'Overview': local_llm.md
- 'oobabooga web UI': webui.md
- 'oobabooga web UI (on RunPod)': webui_runpod.md
- 'LM Studio': lmstudio.md
- 'llama.cpp': llamacpp.md
- 'koboldcpp': koboldcpp.md
- 'Troubleshooting': local_llm_faq.md
- 'Advanced':
- 'Configuring storage backends': storage.md
- 'Adding support for new LLMs': adding_wrappers.md
- 'Contributing to the codebase': contributing.md
theme: readthedocs
markdown_extensions:
- admonition
plugins:
- search