Add autogen example that lets you chat with docs (#342)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Support attach in memgpt autogen agent

* Add docs example

* Add documentation, cleanup
Vivian Fang
2023-11-06 21:06:13 -08:00
committed by GitHub
parent 4fed48f75c
commit 1942ecee76
4 changed files with 176 additions and 16 deletions


@@ -94,3 +94,57 @@ config_list_memgpt = [
If you're using WebUI and want to run the non-MemGPT agents with a local LLM instead of OpenAI, enable the [openai extension](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai) and point `config_list`'s `api_base` to the appropriate URL (usually port 5001).
Then, for MemGPT agents, export `OPENAI_API_BASE` and `BACKEND_TYPE` as described in [Local LLM support](../local_llm) (usually port 5000).
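For example, the exports might look like the following (hypothetical values — substitute the host, port, and backend type for your own setup, per the Local LLM support page):

```sh
# Hypothetical values: point these at your local LLM backend.
# "webui" assumes you are using text-generation-webui; adjust if not.
export OPENAI_API_BASE=http://localhost:5000
export BACKEND_TYPE=webui
```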
## Loading documents
[examples/agent_docs.py](https://github.com/cpacker/MemGPT/blob/main/memgpt/autogen/examples/agent_docs.py) contains an example of a groupchat where the MemGPT autogen agent has access to documents.
First, follow the instructions in [Example - chat with your data - Creating an external data source](../example_data/#creating-an-external-data-source):
To download the MemGPT research paper we'll use `curl` (you can also just download the PDF from your browser):
```sh
# we're saving the file as "memgpt_research_paper.pdf"
curl -L -o memgpt_research_paper.pdf https://arxiv.org/pdf/2310.08560.pdf
```
Now that we have the paper downloaded, we can create a MemGPT data source using `memgpt load`:
```sh
memgpt load directory --name memgpt_research_paper --input-files=memgpt_research_paper.pdf
```
```text
loading data
done loading data
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Parsing documents into nodes: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 392.09it/s]
Generating embeddings: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 65/65 [00:01<00:00, 37.34it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 65/65 [00:00<00:00, 388361.48it/s]
Saved local /home/user/.memgpt/archival/memgpt_research_paper/nodes.pkl
```
Note: you can ignore the "_LLM is explicitly disabled_" message.
Now, you can run `agent_docs.py`, which asks `MemGPT_coder` what a virtual context is:
```
python3 agent_docs.py
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
LLM is explicitly disabled. Using MockLLM.
Generating embeddings: 0it [00:00, ?it/s]
new size 60
Saved local /Users/vivian/.memgpt/agents/agent_25/persistence_manager/index/nodes.pkl
Attached data source memgpt_research_paper to agent agent_25, consisting of 60. Agent now has 60 embeddings in archival memory.
LLM is explicitly disabled. Using MockLLM.
User_proxy (to chat_manager):
Tell me what a virtual context in MemGPT is. Search your archival memory.
--------------------------------------------------------------------------------
GroupChat is underpopulated with 2 agents. Direct communication would be more efficient.
MemGPT_coder (to chat_manager):
Virtual context management is a technique used in large language models like MemGPT. It's used to handle context beyond limited context windows, which is crucial for tasks such as extended conversations and document analysis. The technique was inspired by hierarchical memory systems in traditional operating systems that provide the appearance of large memory resources through data movement between fast and slow memory. This system intelligently manages different memory tiers to effectively provide extended context within the model's limited context window.
--------------------------------------------------------------------------------
...
```


@@ -6,7 +6,7 @@ In this example, we're going to use MemGPT to chat with a custom data source. Sp
### Creating an external data source
-To feed an external data source into a MemGPT chatbot, we first need to create .
+To feed external data into a MemGPT chatbot, we first need to create a data source.
To download the MemGPT research paper we'll use `curl` (you can also just download the PDF from your browser):
```sh


@@ -0,0 +1,88 @@
"""Example of how to add MemGPT into an AutoGen groupchat and chat with docs.

See https://memgpt.readthedocs.io/en/latest/autogen/#loading-documents

Based on the official AutoGen example here: https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat.ipynb

Begin by doing:
  pip install "pyautogen[teachable]"
  pip install pymemgpt
or
  pip install -e . (inside the MemGPT home directory)
"""
import os

import autogen

from memgpt.autogen.memgpt_agent import create_autogen_memgpt_agent, create_memgpt_autogen_agent_from_config
# This config is for autogen agents that are not powered by MemGPT
config_list = [
    {
        "model": "gpt-4",
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
]

# This config is for autogen agents that are powered by MemGPT
config_list_memgpt = [
    {
        "model": "gpt-4",
    },
]

USE_AUTOGEN_WORKFLOW = True

# Set to True if you want to print MemGPT's inner workings.
DEBUG = False

interface_kwargs = {
    "debug": DEBUG,
    "show_inner_thoughts": DEBUG,
    "show_function_outputs": DEBUG,
}

llm_config = {"config_list": config_list, "seed": 42}
llm_config_memgpt = {"config_list": config_list_memgpt, "seed": 42}

# The user agent
user_proxy = autogen.UserProxyAgent(
    name="User_proxy",
    system_message="A human admin.",
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    human_input_mode="TERMINATE",  # needed?
    default_auto_reply="...",  # Set a default auto-reply message here (non-empty auto-reply is required for LM Studio)
)

# In our example, we swap this AutoGen agent with a MemGPT agent
# This MemGPT agent will have all the benefits of MemGPT, i.e., persistent memory, etc.
if not USE_AUTOGEN_WORKFLOW:
    coder = create_autogen_memgpt_agent(
        "MemGPT_coder",
        persona_description="I am a 10x engineer, trained in Python. I was the first engineer at Uber "
        "(which I make sure to tell everyone I work with).",
        user_description=f"You are participating in a group chat with a user ({user_proxy.name}).",
        model=config_list_memgpt[0]["model"],
        interface_kwargs=interface_kwargs,
    )
else:
    coder = create_memgpt_autogen_agent_from_config(
        "MemGPT_coder",
        llm_config=llm_config_memgpt,
        system_message=f"I am a 10x engineer, trained in Python. I was the first engineer at Uber "
        f"(which I make sure to tell everyone I work with).\n"
        f"You are participating in a group chat with a user ({user_proxy.name}).",
        interface_kwargs=interface_kwargs,
    )
coder.attach("memgpt_research_paper")  # See https://memgpt.readthedocs.io/en/latest/autogen/#loading-documents

# Initialize the group chat between the user and the MemGPT agent
groupchat = autogen.GroupChat(agents=[user_proxy, coder], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

# Begin the group chat with a message from the user
user_proxy.initiate_chat(
    manager,
    message="Tell me what a virtual context in MemGPT is. Search your archival memory.",
)


@@ -1,17 +1,18 @@
 from autogen.agentchat import Agent, ConversableAgent, UserProxyAgent, GroupChat, GroupChatManager
-from ..agent import Agent as _Agent
+from memgpt.agent import Agent as _Agent
 import asyncio
 from typing import Callable, Optional, List, Dict, Union, Any, Tuple
-from .interface import AutoGenInterface
-from ..persistence_manager import InMemoryStateManager
-from .. import system
-from .. import constants
-from .. import presets
-from ..personas import personas
-from ..humans import humans
-from ..config import AgentConfig
+from memgpt.autogen.interface import AutoGenInterface
+from memgpt.persistence_manager import LocalStateManager
+import memgpt.system as system
+import memgpt.constants as constants
+import memgpt.presets as presets
+from memgpt.personas import personas
+from memgpt.humans import humans
+from memgpt.config import AgentConfig
+from memgpt.cli.cli import attach
+from memgpt.connectors.storage import StorageConnector
 def create_memgpt_autogen_agent_from_config(
@@ -101,7 +102,7 @@ def create_autogen_memgpt_agent(
     interface=None,
     interface_kwargs={},
     persistence_manager=None,
-    persistence_manager_kwargs={},
+    persistence_manager_kwargs=None,
     is_termination_msg: Optional[Callable[[Dict], bool]] = None,
 ):
"""
@@ -117,17 +118,24 @@ def create_autogen_memgpt_agent(
}
```
"""
-    interface = AutoGenInterface(**interface_kwargs) if interface is None else interface
-    persistence_manager = InMemoryStateManager(**persistence_manager_kwargs) if persistence_manager is None else persistence_manager
     agent_config = AgentConfig(
-        name=autogen_name,
+        # name=autogen_name,
+        # TODO: more gracefully integrate reuse of MemGPT agents. Right now, we are creating a new MemGPT agent for
+        # every call to this function, because those scripts using create_autogen_memgpt_agent may contain calls
+        # to non-idempotent agent functions like `attach`.
         persona=persona_description,
         human=user_description,
         model=model,
         preset=presets.SYNC_CHAT,
     )
+    interface = AutoGenInterface(**interface_kwargs) if interface is None else interface
+    if persistence_manager_kwargs is None:
+        persistence_manager_kwargs = {
+            "agent_config": agent_config,
+        }
+    persistence_manager = LocalStateManager(**persistence_manager_kwargs) if persistence_manager is None else persistence_manager
     memgpt_agent = presets.use_preset(
         preset,
         agent_config,
@@ -164,6 +172,16 @@ class MemGPTAgent(ConversableAgent):
         self._is_termination_msg = is_termination_msg if is_termination_msg is not None else (lambda x: x == "TERMINATE")
+    def attach(self, data_source: str):
+        # attach new data
+        attach(self.agent.config.name, data_source)
+        # update agent config
+        self.agent.config.attach_data_source(data_source)
+        # reload agent with new data source
+        self.agent.persistence_manager.archival_memory.storage = StorageConnector.get_storage_connector(agent_config=self.agent.config)
     def format_other_agent_message(self, msg):
         if "name" in msg:
             user_message = f"{msg['name']}: {msg['content']}"