chore: .gitattributes (#1511)

This commit is contained in:
Charles Packer
2024-07-04 14:45:35 -07:00
committed by GitHub
parent 3473b2f8f6
commit 8b13d195ce
21 changed files with 6943 additions and 6923 deletions

.gitattributes (new file, 20 lines)

@@ -0,0 +1,20 @@
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto
# Explicitly declare text files you want to always be normalized and converted
# to LF on checkout.
*.py text eol=lf
*.txt text eol=lf
*.md text eol=lf
*.json text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
# Declare files that will always have CRLF line endings on checkout.
# (Only if you have specific Windows-only files)
*.bat text eol=crlf
# Denote all files that are truly binary and should not be modified.
*.png binary
*.jpg binary
*.gif binary
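
For intuition, the normalization that `text eol=lf` requests can be simulated outside of git with plain shell (git performs the equivalent conversion internally at checkout/check-in; this snippet only illustrates the byte-level effect):

```shell
# Illustration only: simulate the CRLF -> LF normalization git applies
# to files matched by `text eol=lf`.
printf 'hello\r\nworld\r\n' > crlf.txt   # a file with Windows (CRLF) endings
tr -d '\r' < crlf.txt > lf.txt           # normalized copy with LF endings
od -c lf.txt                             # inspect: no \r bytes remain
```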

.gitignore (2038 lines changed; diff suppressed because it is too large)

README.md (212 lines changed)

@@ -1,106 +1,106 @@
<p align="center">
<a href="https://memgpt.ai"><img src="https://github.com/cpacker/MemGPT/assets/5475622/80f2f418-ef92-4f7a-acab-5d642faa4991" alt="MemGPT logo"></a>
</p>
<div align="center">
<strong>MemGPT allows you to build LLM agents with long term memory & custom tools</strong>
[![Discord](https://img.shields.io/discord/1161736243340640419?label=Discord&logo=discord&logoColor=5865F2&style=flat-square&color=5865F2)](https://discord.gg/9GEQrxmVyE)
[![Twitter Follow](https://img.shields.io/badge/follow-%40MemGPT-1DA1F2?style=flat-square&logo=x&logoColor=white)](https://twitter.com/MemGPT)
[![arxiv 2310.08560](https://img.shields.io/badge/arXiv-2310.08560-B31B1B?logo=arxiv&style=flat-square)](https://arxiv.org/abs/2310.08560)
[![Documentation](https://img.shields.io/github/v/release/cpacker/MemGPT?label=Documentation&logo=readthedocs&style=flat-square)](https://memgpt.readme.io/docs)
</div>
MemGPT makes it easy to build and deploy stateful LLM agents with support for:
* Long term memory/state management
* Connections to [external data sources](https://memgpt.readme.io/docs/data_sources) (e.g. PDF files) for RAG
* Defining and calling [custom tools](https://memgpt.readme.io/docs/functions) (e.g. [google search](https://github.com/cpacker/MemGPT/blob/main/examples/google_search.py))
You can also deploy MemGPT agents as a *service*: a MemGPT server can run a multi-user, multi-agent application on top of supported LLM providers.
<img width="1000" alt="image" src="https://github.com/cpacker/MemGPT/assets/8505980/1096eb91-139a-4bc5-b908-fa585462da09">
## Installation & Setup
Install MemGPT:
```sh
pip install -U pymemgpt
```
To use MemGPT with OpenAI, set the environment variable `OPENAI_API_KEY` to your OpenAI key then run:
```sh
memgpt quickstart --backend openai
```
To use MemGPT with a free hosted endpoint, you can run:
```sh
memgpt quickstart --backend memgpt
```
For more advanced configuration options or to use a different [LLM backend](https://memgpt.readme.io/docs/endpoints) or [local LLMs](https://memgpt.readme.io/docs/local_llm), run `memgpt configure`.
## Quickstart (CLI)
You can create and chat with a MemGPT agent by running `memgpt run` in your CLI. The `run` command supports the following optional flags (see the [CLI documentation](https://memgpt.readme.io/docs/quickstart) for the full list of flags):
* `--agent`: (str) Name of agent to create or to resume chatting with.
* `--first`: (str) Allow user to send the first message.
* `--debug`: (bool) Show debug logs (default=False)
* `--no-verify`: (bool) Bypass message verification (default=False)
* `--yes`/`-y`: (bool) Skip confirmation prompt and use defaults (default=False)
You can view the list of available in-chat commands (e.g. `/memory`, `/exit`) in the [CLI documentation](https://memgpt.readme.io/docs/quickstart).
## Dev portal (alpha build)
MemGPT provides a developer portal that enables you to easily create, edit, monitor, and chat with your MemGPT agents. The easiest way to use the dev portal is to install MemGPT via **docker** (see instructions below).
<img width="1000" alt="image" src="https://github.com/cpacker/MemGPT/assets/5475622/071117c5-46a7-4953-bc9d-d74880e66258">
## Quickstart (Server)
**Option 1 (Recommended)**: Run with docker compose
1. [Install docker on your system](https://docs.docker.com/get-docker/)
2. Clone the repo: `git clone https://github.com/cpacker/MemGPT.git`
3. Copy-paste `.env.example` to `.env` and optionally modify
4. Run `docker compose up`
5. Go to `memgpt.localhost` in the browser to view the developer portal
**Option 2:** Run with the CLI:
1. Run `memgpt server`
2. Go to `localhost:8283` in the browser to view the developer portal
Once the server is running, you can use the [Python client](https://memgpt.readme.io/docs/admin-client) or [REST API](https://memgpt.readme.io/reference/api) to connect to `memgpt.localhost` (if you're running with docker compose) or `localhost:8283` (if you're running with the CLI) to create users, agents, and more. The service requires authentication with a MemGPT admin password; it is the value of `MEMGPT_SERVER_PASS` in `.env`.
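
As a rough sketch of what an authenticated admin request over HTTP looks like using only the standard library (the `/admin/users` path, payload shape, and bearer-token auth scheme here are assumptions for illustration, not taken from the API reference):

```python
import json
import urllib.request

def build_create_user_request(base_url: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a hypothetical authenticated admin request.

    The endpoint path and auth header format are illustrative assumptions;
    consult the REST API reference for the actual routes.
    """
    return urllib.request.Request(
        url=f"{base_url}/admin/users",
        data=json.dumps({}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {password}",  # password = MEMGPT_SERVER_PASS
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_create_user_request("http://localhost:8283", "password")
print(req.full_url)                     # http://localhost:8283/admin/users
print(req.get_header("Authorization"))  # Bearer password
```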
## Supported Endpoints & Backends
MemGPT is designed to be model and provider agnostic. The following LLM and embedding endpoints are supported:
| Provider | LLM Endpoint | Embedding Endpoint |
|---------------------|-----------------|--------------------|
| OpenAI | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ |
| Google AI (Gemini) | ✅ | ❌ |
| Anthropic (Claude) | ✅ | ❌ |
| Groq | ✅ (alpha release) | ❌ |
| Cohere API | ✅ | ❌ |
| vLLM | ✅ | ❌ |
| Ollama | ✅ | ✅ |
| LM Studio | ✅ | ❌ |
| koboldcpp | ✅ | ❌ |
| oobabooga web UI | ✅ | ❌ |
| llama.cpp | ✅ | ❌ |
| HuggingFace TEI | ❌ | ✅ |
When using MemGPT with open LLMs (such as those downloaded from HuggingFace), the performance of MemGPT will be highly dependent on the LLM's function calling ability. You can find a list of LLMs/models that are known to work well with MemGPT on the [#model-chat channel on Discord](https://discord.gg/9GEQrxmVyE), as well as on [this spreadsheet](https://docs.google.com/spreadsheets/d/1fH-FdaO8BltTMa4kXiNCxmBCQ46PRBVp3Vn6WbPgsFs/edit?usp=sharing).
## How to Get Involved
* **Contribute to the Project**: Interested in contributing? Start by reading our [Contribution Guidelines](https://github.com/cpacker/MemGPT/tree/main/CONTRIBUTING.md).
* **Ask a Question**: Join our community on [Discord](https://discord.gg/9GEQrxmVyE) and direct your questions to the `#support` channel.
* **Report Issues or Suggest Features**: Have an issue or a feature request? Please submit them through our [GitHub Issues page](https://github.com/cpacker/MemGPT/issues).
* **Explore the Roadmap**: Curious about future developments? View and comment on our [project roadmap](https://github.com/cpacker/MemGPT/issues/1200).
* **Benchmark the Performance**: Want to benchmark the performance of a model on MemGPT? Follow our [Benchmarking Guidance](#benchmarking-guidance).
* **Join Community Events**: Stay updated with the [MemGPT event calendar](https://lu.ma/berkeley-llm-meetup) or follow our [Twitter account](https://twitter.com/MemGPT).
## Benchmarking Guidance
To evaluate the performance of a model on MemGPT, configure the appropriate model settings using `memgpt configure`, then initiate the benchmark via `memgpt benchmark`. The duration will vary depending on your hardware. The benchmark runs a predefined set of prompts through multiple iterations to test a model's function calling capabilities. You can help track which LLMs work well with MemGPT by contributing your benchmark results via [this form](https://forms.gle/XiBGKEEPFFLNSR348), which will be used to update the spreadsheet.
## Legal notices
By using MemGPT and related MemGPT services (such as the MemGPT endpoint or hosted service), you agree to our [privacy policy](https://github.com/cpacker/MemGPT/tree/main/PRIVACY.md) and [terms of service](https://github.com/cpacker/MemGPT/tree/main/TERMS.md).

File diff suppressed because it is too large


@@ -1,124 +1,124 @@
import os
from logging import CRITICAL, DEBUG, ERROR, INFO, NOTSET, WARN, WARNING
MEMGPT_DIR = os.path.join(os.path.expanduser("~"), ".memgpt")
# OpenAI error message: Invalid 'messages[1].tool_calls[0].id': string too long. Expected a string with maximum length 29, but got a string with length 36 instead.
TOOL_CALL_ID_MAX_LEN = 29
# embeddings
MAX_EMBEDDING_DIM = 4096 # maximum supported embedding size - do NOT change or else DBs will need to be reset
# tokenizers
EMBEDDING_TO_TOKENIZER_MAP = {
"text-embedding-ada-002": "cl100k_base",
}
EMBEDDING_TO_TOKENIZER_DEFAULT = "cl100k_base"
DEFAULT_MEMGPT_MODEL = "gpt-4"
DEFAULT_PERSONA = "sam_pov"
DEFAULT_HUMAN = "basic"
DEFAULT_PRESET = "memgpt_chat"
# Tools
BASE_TOOLS = [
"send_message",
"pause_heartbeats",
"conversation_search",
"conversation_search_date",
"archival_memory_insert",
"archival_memory_search",
]
# LOGGER_LOG_LEVELS maps a log-level name (e.g. from CLI input) to the corresponding logging module level value
LOGGER_LOG_LEVELS = {"CRITICAL": CRITICAL, "ERROR": ERROR, "WARN": WARN, "WARNING": WARNING, "INFO": INFO, "DEBUG": DEBUG, "NOTSET": NOTSET}
FIRST_MESSAGE_ATTEMPTS = 10
INITIAL_BOOT_MESSAGE = "Boot sequence complete. Persona activated."
INITIAL_BOOT_MESSAGE_SEND_MESSAGE_THOUGHT = "Bootup sequence complete. Persona activated. Testing messaging functionality."
STARTUP_QUOTES = [
"I think, therefore I am.",
"All those moments will be lost in time, like tears in rain.",
"More human than human is our motto.",
]
INITIAL_BOOT_MESSAGE_SEND_MESSAGE_FIRST_MSG = STARTUP_QUOTES[2]
CLI_WARNING_PREFIX = "Warning: "
NON_USER_MSG_PREFIX = "[This is an automated system message hidden from the user] "
# Constants to do with summarization / conversation length window
# The max amount of tokens supported by the underlying model (eg 8k for gpt-4 and Mistral 7B)
LLM_MAX_TOKENS = {
"DEFAULT": 8192,
## OpenAI models: https://platform.openai.com/docs/models/overview
# gpt-4
"gpt-4-1106-preview": 128000,
"gpt-4": 8192,
"gpt-4-32k": 32768,
"gpt-4-0613": 8192,
"gpt-4-32k-0613": 32768,
"gpt-4-0314": 8192, # legacy
"gpt-4-32k-0314": 32768, # legacy
# gpt-3.5
"gpt-3.5-turbo-1106": 16385,
"gpt-3.5-turbo": 4096,
"gpt-3.5-turbo-16k": 16385,
"gpt-3.5-turbo-0613": 4096, # legacy
"gpt-3.5-turbo-16k-0613": 16385, # legacy
"gpt-3.5-turbo-0301": 4096, # legacy
}
# The number of tokens before a system warning about upcoming truncation is sent to MemGPT
MESSAGE_SUMMARY_WARNING_FRAC = 0.75
# The error message that MemGPT will receive
# MESSAGE_SUMMARY_WARNING_STR = f"Warning: the conversation history will soon reach its maximum length and be trimmed. Make sure to save any important information from the conversation to your memory before it is removed."
# Much longer and more specific variant of the prompt
MESSAGE_SUMMARY_WARNING_STR = " ".join(
[
f"{NON_USER_MSG_PREFIX}The conversation history will soon reach its maximum length and be trimmed.",
"Do NOT tell the user about this system alert, they should not know that the history is reaching max length.",
"If there is any important new information or general memories about you or the user that you would like to save, you should save that information immediately by calling function core_memory_append, core_memory_replace, or archival_memory_insert.",
# "Remember to pass request_heartbeat = true if you would like to send a message immediately after.",
]
)
# The fraction of tokens we truncate down to
MESSAGE_SUMMARY_TRUNC_TOKEN_FRAC = 0.75
# The acknowledgement message used in the summarize sequence
MESSAGE_SUMMARY_REQUEST_ACK = "Understood, I will respond with a summary of the message (and only the summary, nothing else) once I receive the conversation history. I'm ready."
# Even when summarizing, we want to keep a handful of recent messages
# These serve as in-context examples of how to use functions / what user messages look like
MESSAGE_SUMMARY_TRUNC_KEEP_N_LAST = 3
# Default memory limits
CORE_MEMORY_PERSONA_CHAR_LIMIT = 2000
CORE_MEMORY_HUMAN_CHAR_LIMIT = 2000
# Function return limits
FUNCTION_RETURN_CHAR_LIMIT = 3000 # ~300 words
MAX_PAUSE_HEARTBEATS = 360 # in minutes
MESSAGE_CHATGPT_FUNCTION_MODEL = "gpt-3.5-turbo"
MESSAGE_CHATGPT_FUNCTION_SYSTEM_MESSAGE = "You are a helpful assistant. Keep your responses short and concise."
#### Functions related
# REQ_HEARTBEAT_MESSAGE = f"{NON_USER_MSG_PREFIX}request_heartbeat == true"
REQ_HEARTBEAT_MESSAGE = f"{NON_USER_MSG_PREFIX}Function called using request_heartbeat=true, returning control"
# FUNC_FAILED_HEARTBEAT_MESSAGE = f"{NON_USER_MSG_PREFIX}Function call failed"
FUNC_FAILED_HEARTBEAT_MESSAGE = f"{NON_USER_MSG_PREFIX}Function call failed, returning control"
FUNCTION_PARAM_NAME_REQ_HEARTBEAT = "request_heartbeat"
FUNCTION_PARAM_TYPE_REQ_HEARTBEAT = "boolean"
FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT = "Request an immediate heartbeat after function execution. Set to 'true' if you want to send a follow-up message or run a follow-up function."
RETRIEVAL_QUERY_DEFAULT_PAGE_SIZE = 5
# GLOBAL SETTINGS FOR `json.dumps()`
JSON_ENSURE_ASCII = False
# GLOBAL SETTINGS FOR `json.loads()`
JSON_LOADS_STRICT = False
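
The `LLM_MAX_TOKENS` table above works together with `MESSAGE_SUMMARY_WARNING_FRAC`: the truncation warning fires once the context grows past that fraction of the model's limit. A minimal sketch of that lookup (the `warning_threshold` helper is illustrative and not part of this file):

```python
# Illustrative sketch: warning_threshold is NOT defined in constants.py.
LLM_MAX_TOKENS = {"DEFAULT": 8192, "gpt-4": 8192, "gpt-3.5-turbo": 4096}
MESSAGE_SUMMARY_WARNING_FRAC = 0.75

def warning_threshold(model: str) -> int:
    """Token count at which the summarizer warning would fire for a model."""
    # Unknown models fall back to the DEFAULT context window.
    limit = LLM_MAX_TOKENS.get(model, LLM_MAX_TOKENS["DEFAULT"])
    return int(limit * MESSAGE_SUMMARY_WARNING_FRAC)

print(warning_threshold("gpt-4"))          # 6144
print(warning_threshold("gpt-3.5-turbo"))  # 3072
```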


@@ -1,315 +1,315 @@
import json
import re
from abc import ABC, abstractmethod
from typing import List, Optional
from colorama import Fore, Style, init
from memgpt.constants import CLI_WARNING_PREFIX, JSON_LOADS_STRICT
from memgpt.data_types import Message
from memgpt.utils import printd
init(autoreset=True)
# DEBUG = True # puts full message outputs in the terminal
DEBUG = False # only dumps important messages in the terminal
STRIP_UI = False
class AgentInterface(ABC):
"""Interfaces handle MemGPT-related events (observer pattern)
The 'msg' arg provides the scoped message, and the optional 'msg_obj' Message can provide additional metadata.
"""
@abstractmethod
def user_message(self, msg: str, msg_obj: Optional[Message] = None):
"""MemGPT receives a user message"""
raise NotImplementedError
@abstractmethod
def internal_monologue(self, msg: str, msg_obj: Optional[Message] = None):
"""MemGPT generates some internal monologue"""
raise NotImplementedError
@abstractmethod
def assistant_message(self, msg: str, msg_obj: Optional[Message] = None):
"""MemGPT uses send_message"""
raise NotImplementedError
@abstractmethod
def function_message(self, msg: str, msg_obj: Optional[Message] = None):
"""MemGPT calls a function"""
raise NotImplementedError
# @abstractmethod
# @staticmethod
# def print_messages():
# raise NotImplementedError
# @abstractmethod
# @staticmethod
# def print_messages_raw():
# raise NotImplementedError
# @abstractmethod
# @staticmethod
# def step_yield():
# raise NotImplementedError
class CLIInterface(AgentInterface):
"""Basic interface for dumping agent events to the command-line"""
@staticmethod
def important_message(msg: str):
fstr = f"{Fore.MAGENTA}{Style.BRIGHT}{{msg}}{Style.RESET_ALL}"
if STRIP_UI:
fstr = "{msg}"
print(fstr.format(msg=msg))
@staticmethod
def warning_message(msg: str):
fstr = f"{Fore.RED}{Style.BRIGHT}{{msg}}{Style.RESET_ALL}"
if STRIP_UI:
fstr = "{msg}"
print(fstr.format(msg=msg))
@staticmethod
def internal_monologue(msg: str, msg_obj: Optional[Message] = None):
# ANSI escape code for italic is '\x1B[3m'
fstr = f"\x1B[3m{Fore.LIGHTBLACK_EX}💭 {{msg}}{Style.RESET_ALL}"
if STRIP_UI:
fstr = "{msg}"
print(fstr.format(msg=msg))
@staticmethod
def assistant_message(msg: str, msg_obj: Optional[Message] = None):
fstr = f"{Fore.YELLOW}{Style.BRIGHT}🤖 {Fore.YELLOW}{{msg}}{Style.RESET_ALL}"
if STRIP_UI:
fstr = "{msg}"
print(fstr.format(msg=msg))
@staticmethod
def memory_message(msg: str, msg_obj: Optional[Message] = None):
fstr = f"{Fore.LIGHTMAGENTA_EX}{Style.BRIGHT}🧠 {Fore.LIGHTMAGENTA_EX}{{msg}}{Style.RESET_ALL}"
if STRIP_UI:
fstr = "{msg}"
print(fstr.format(msg=msg))
@staticmethod
def system_message(msg: str, msg_obj: Optional[Message] = None):
fstr = f"{Fore.MAGENTA}{Style.BRIGHT}🖥️ [system] {Fore.MAGENTA}{{msg}}{Style.RESET_ALL}"
if STRIP_UI:
fstr = "{msg}"
print(fstr.format(msg=msg))
@staticmethod
def user_message(msg: str, msg_obj: Optional[Message] = None, raw: bool = False, dump: bool = False, debug: bool = DEBUG):
def print_user_message(icon, msg, printf=print):
if STRIP_UI:
printf(f"{icon} {msg}")
else:
printf(f"{Fore.GREEN}{Style.BRIGHT}{icon} {Fore.GREEN}{msg}{Style.RESET_ALL}")
def printd_user_message(icon, msg):
return print_user_message(icon, msg)
if not (raw or dump or debug):
# we do not want to repeat the message in normal use
return
if isinstance(msg, str):
if raw:
printd_user_message("🧑", msg)
return
else:
try:
msg_json = json.loads(msg, strict=JSON_LOADS_STRICT)
except json.JSONDecodeError:
printd(f"{CLI_WARNING_PREFIX}failed to parse user message into json")
printd_user_message("🧑", msg)
return
if msg_json["type"] == "user_message":
if dump:
print_user_message("🧑", msg_json["message"])
return
msg_json.pop("type")
printd_user_message("🧑", msg_json)
elif msg_json["type"] == "heartbeat":
if debug:
msg_json.pop("type")
printd_user_message("💓", msg_json)
elif dump:
print_user_message("💓", msg_json)
return
elif msg_json["type"] == "system_message":
msg_json.pop("type")
printd_user_message("🖥️", msg_json)
else:
printd_user_message("🧑", msg_json)
@staticmethod
def function_message(msg: str, msg_obj: Optional[Message] = None, debug: bool = DEBUG):
def print_function_message(icon, msg, color=Fore.RED, printf=print):
if STRIP_UI:
printf(f"{icon} [function] {msg}")
else:
printf(f"{color}{Style.BRIGHT}{icon} [function] {color}{msg}{Style.RESET_ALL}")
def printd_function_message(icon, msg, color=Fore.RED):
return print_function_message(icon, msg, color, printf=(print if debug else printd))
if isinstance(msg, dict):
printd_function_message("", msg)
return
if msg.startswith("Success"):
printd_function_message("🟢", msg)
elif msg.startswith("Error: "):
printd_function_message("🔴", msg)
elif msg.startswith("Ran "):
# NOTE: ignore 'ran' messages that come post-execution
return
elif msg.startswith("Running "):
if debug:
printd_function_message("", msg)
else:
match = re.search(r"Running (\w+)\((.*)\)", msg)
if match:
function_name = match.group(1)
function_args = match.group(2)
if function_name in ["archival_memory_insert", "archival_memory_search", "core_memory_replace", "core_memory_append"]:
if function_name in ["archival_memory_insert", "core_memory_append", "core_memory_replace"]:
print_function_message("🧠", f"updating memory with {function_name}")
elif function_name == "archival_memory_search":
print_function_message("🧠", f"searching memory with {function_name}")
try:
msg_dict = eval(function_args)
if function_name == "archival_memory_search":
output = f'\tquery: {msg_dict["query"]}, page: {msg_dict["page"]}'
if STRIP_UI:
print(output)
else:
print(f"{Fore.RED}{output}{Style.RESET_ALL}")
elif function_name == "archival_memory_insert":
output = f'\t{msg_dict["content"]}'
if STRIP_UI:
print(output)
else:
print(f"{Style.BRIGHT}{Fore.RED}{output}{Style.RESET_ALL}")
else:
if STRIP_UI:
print(f'\t {msg_dict["old_content"]}\n\t{msg_dict["new_content"]}')
else:
print(
f'{Style.BRIGHT}\t{Fore.RED} {msg_dict["old_content"]}\n\t{Fore.GREEN}{msg_dict["new_content"]}{Style.RESET_ALL}'
)
except Exception as e:
printd(str(e))
printd(msg_dict)
elif function_name in ["conversation_search", "conversation_search_date"]:
print_function_message("🧠", f"searching memory with {function_name}")
try:
msg_dict = eval(function_args)
output = f'\tquery: {msg_dict["query"]}, page: {msg_dict["page"]}'
if STRIP_UI:
print(output)
else:
print(f"{Fore.RED}{output}{Style.RESET_ALL}")
except Exception as e:
printd(str(e))
printd(msg_dict)
else:
printd(f"{CLI_WARNING_PREFIX}did not recognize function message")
printd_function_message("", msg)
else:
try:
msg_dict = json.loads(msg, strict=JSON_LOADS_STRICT)
if "status" in msg_dict and msg_dict["status"] == "OK":
printd_function_message("", str(msg), color=Fore.GREEN)
else:
printd_function_message("", str(msg), color=Fore.RED)
except Exception:
print(f"{CLI_WARNING_PREFIX}did not recognize function message {type(msg)} {msg}")
printd_function_message("", msg)
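The `function_message` handler above recovers the function name and its arguments by regex-matching the agent's own `Running name({...})` status lines. A minimal, standalone sketch of that parsing step (using `ast.literal_eval` in place of the original's `eval`, so that text embedded in a log line cannot execute as code — a substitution, not the memgpt implementation):

```python
import ast
import re

# A "Running name({kwargs})" status line of the kind the agent emits
line = "Running archival_memory_search({'query': 'cats', 'page': 0})"

match = re.search(r"Running (\w+)\((.*)\)", line)
assert match is not None
function_name, raw_args = match.group(1), match.group(2)

# literal_eval only accepts Python literals (dicts, strings, numbers, ...),
# so a malicious or malformed line raises instead of executing
msg_dict = ast.literal_eval(raw_args)
print(function_name)                        # archival_memory_search
print(msg_dict["query"], msg_dict["page"])  # cats 0
```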
@staticmethod
def print_messages(message_sequence: List[Message], dump=False):
# rewrite to dict format
message_sequence = [msg.to_openai_dict() for msg in message_sequence]
idx = len(message_sequence)
for msg in message_sequence:
if dump:
print(f"[{idx}] ", end="")
idx -= 1
role = msg["role"]
content = msg["content"]
if role == "system":
CLIInterface.system_message(content)
elif role == "assistant":
# Differentiate between internal monologue, function calls, and messages
if msg.get("function_call"):
if content is not None:
CLIInterface.internal_monologue(content)
                    # NOTE: stale legacy path, intentionally left disabled
                    # function_message(msg["function_call"])
args = json.loads(msg["function_call"].get("arguments"), strict=JSON_LOADS_STRICT)
CLIInterface.assistant_message(args.get("message"))
# assistant_message(content)
elif msg.get("tool_calls"):
if content is not None:
CLIInterface.internal_monologue(content)
function_obj = msg["tool_calls"][0].get("function")
if function_obj:
args = json.loads(function_obj.get("arguments"), strict=JSON_LOADS_STRICT)
CLIInterface.assistant_message(args.get("message"))
else:
CLIInterface.internal_monologue(content)
elif role == "user":
CLIInterface.user_message(content, dump=dump)
elif role == "function":
CLIInterface.function_message(content, debug=dump)
elif role == "tool":
CLIInterface.function_message(content, debug=dump)
else:
                print(f"Unknown role '{role}': {content}")
@staticmethod
def print_messages_simple(message_sequence: List[Message]):
# rewrite to dict format
message_sequence = [msg.to_openai_dict() for msg in message_sequence]
for msg in message_sequence:
role = msg["role"]
content = msg["content"]
if role == "system":
CLIInterface.system_message(content)
elif role == "assistant":
CLIInterface.assistant_message(content)
elif role == "user":
CLIInterface.user_message(content, raw=True)
else:
                print(f"Unknown role '{role}': {content}")
@staticmethod
def print_messages_raw(message_sequence: List[Message]):
# rewrite to dict format
message_sequence = [msg.to_openai_dict() for msg in message_sequence]
for msg in message_sequence:
print(msg)
@staticmethod
def step_yield():
pass
@staticmethod
def step_complete():
pass
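`AgentInterface` above is a classic observer: the agent loop emits events (`user_message`, `internal_monologue`, `assistant_message`, `function_message`) and each implementation decides how to surface them — `CLIInterface` prints, but a test harness can just record. A self-contained sketch of that pattern (the class and method names mirror the shapes above but this is an illustration, not the memgpt API):

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class EventInterface(ABC):
    """Observer for agent events, mirroring AgentInterface's four hooks."""

    @abstractmethod
    def user_message(self, msg: str) -> None: ...

    @abstractmethod
    def internal_monologue(self, msg: str) -> None: ...

    @abstractmethod
    def assistant_message(self, msg: str) -> None: ...

    @abstractmethod
    def function_message(self, msg: str) -> None: ...


class CollectingInterface(EventInterface):
    """Records events instead of printing -- handy for tests."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, str]] = []

    def user_message(self, msg: str) -> None:
        self.events.append(("user", msg))

    def internal_monologue(self, msg: str) -> None:
        self.events.append(("monologue", msg))

    def assistant_message(self, msg: str) -> None:
        self.events.append(("assistant", msg))

    def function_message(self, msg: str) -> None:
        self.events.append(("function", msg))


iface = CollectingInterface()
iface.internal_monologue("planning a reply")
iface.assistant_message("hello!")
print(iface.events)
```

Swapping `CLIInterface` for a collector like this is how a streaming UI or test suite can observe the same agent loop without touching stdout.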

View File

@@ -1,358 +1,358 @@
import os
import random
import time
import uuid
from typing import List, Optional, Union
import requests
from memgpt.constants import CLI_WARNING_PREFIX
from memgpt.credentials import MemGPTCredentials
from memgpt.data_types import Message
from memgpt.llm_api.anthropic import anthropic_chat_completions_request
from memgpt.llm_api.azure_openai import (
MODEL_TO_AZURE_ENGINE,
azure_openai_chat_completions_request,
)
from memgpt.llm_api.cohere import cohere_chat_completions_request
from memgpt.llm_api.google_ai import (
convert_tools_to_google_ai_format,
google_ai_chat_completions_request,
)
from memgpt.llm_api.openai import (
openai_chat_completions_process_stream,
openai_chat_completions_request,
)
from memgpt.local_llm.chat_completion_proxy import get_chat_completion
from memgpt.models.chat_completion_request import (
ChatCompletionRequest,
Tool,
cast_message_to_subtype,
)
from memgpt.models.chat_completion_response import ChatCompletionResponse
from memgpt.models.pydantic_models import LLMConfigModel
from memgpt.streaming_interface import (
AgentChunkStreamingInterface,
AgentRefreshStreamingInterface,
)
LLM_API_PROVIDER_OPTIONS = ["openai", "azure", "anthropic", "google_ai", "cohere", "local"]
def is_context_overflow_error(exception: requests.exceptions.RequestException) -> bool:
"""Checks if an exception is due to context overflow (based on common OpenAI response messages)"""
from memgpt.utils import printd
match_string = "maximum context length"
# Backwards compatibility with openai python package/client v0.28 (pre-v1 client migration)
if match_string in str(exception):
printd(f"Found '{match_string}' in str(exception)={(str(exception))}")
return True
# Based on python requests + OpenAI REST API (/v1)
elif isinstance(exception, requests.exceptions.HTTPError):
if exception.response is not None and "application/json" in exception.response.headers.get("Content-Type", ""):
try:
error_details = exception.response.json()
if "error" not in error_details:
printd(f"HTTPError occurred, but couldn't find error field: {error_details}")
return False
else:
error_details = error_details["error"]
# Check for the specific error code
if error_details.get("code") == "context_length_exceeded":
printd(f"HTTPError occurred, caught error code {error_details.get('code')}")
return True
# Soft-check for "maximum context length" inside of the message
elif error_details.get("message") and "maximum context length" in error_details.get("message"):
printd(f"HTTPError occurred, found '{match_string}' in error message contents ({error_details})")
return True
else:
printd(f"HTTPError occurred, but unknown error message: {error_details}")
return False
            except ValueError:
                # JSON decoding failed
                printd(f"HTTPError occurred ({exception}), but no JSON error message.")
        # HTTPError without a recognizable overflow payload is not an overflow
        return False
    # Generic fail
    else:
        return False
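The JSON branch of `is_context_overflow_error` boils down to one decision over the decoded OpenAI-style `{"error": {"code": ..., "message": ...}}` envelope: an explicit `context_length_exceeded` code, or the soft "maximum context length" substring in the message. A standalone sketch of just that payload check (illustrative helper name, not the memgpt function):

```python
def payload_signals_overflow(error_details: dict) -> bool:
    """Return True if an OpenAI-style error body indicates context overflow."""
    err = error_details.get("error")
    if not isinstance(err, dict):
        return False
    # Hard check: the documented error code
    if err.get("code") == "context_length_exceeded":
        return True
    # Soft check: the well-known message substring
    return "maximum context length" in (err.get("message") or "")


print(payload_signals_overflow({"error": {"code": "context_length_exceeded"}}))  # True
print(payload_signals_overflow(
    {"error": {"message": "This model's maximum context length is 8192 tokens"}}
))  # True
print(payload_signals_overflow({"detail": "unrelated failure"}))  # False
```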
def retry_with_exponential_backoff(
func,
initial_delay: float = 1,
exponential_base: float = 2,
jitter: bool = True,
max_retries: int = 20,
# List of OpenAI error codes: https://github.com/openai/openai-python/blob/17ac6779958b2b74999c634c4ea4c7b74906027a/src/openai/_client.py#L227-L250
# 429 = rate limit
error_codes: tuple = (429,),
):
"""Retry a function with exponential backoff."""
    def wrapper(*args, **kwargs):
# Initialize variables
num_retries = 0
delay = initial_delay
# Loop until a successful response or max_retries is hit or an exception is raised
while True:
try:
return func(*args, **kwargs)
except requests.exceptions.HTTPError as http_err:
# Retry on specified errors
if http_err.response.status_code in error_codes:
# Increment retries
num_retries += 1
# Check if max retries has been reached
if num_retries > max_retries:
raise Exception(f"Maximum number of retries ({max_retries}) exceeded.")
# Increment the delay
delay *= exponential_base * (1 + jitter * random.random())
# Sleep for the delay
# printd(f"Got a rate limit error ('{http_err}') on LLM backend request, waiting {int(delay)}s then retrying...")
print(
f"{CLI_WARNING_PREFIX}Got a rate limit error ('{http_err}') on LLM backend request, waiting {int(delay)}s then retrying..."
)
time.sleep(delay)
else:
# For other HTTP errors, re-raise the exception
raise
# Raise exceptions for any errors not specified
        except Exception:
            raise
return wrapper
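The decorator above retries only on the listed HTTP status codes, multiplying the delay by `exponential_base` (plus optional jitter) between attempts. A compact, runnable sketch of the same backoff pattern with the HTTP specifics stripped out (`ConnectionError` stands in for a rate-limit response; the names here are illustrative):

```python
import random
import time


def retry_with_backoff(func, max_retries=3, initial_delay=0.01, base=2.0, jitter=False):
    """Retry func on ConnectionError, sleeping initial_delay * base**attempt."""

    def wrapper(*args, **kwargs):
        delay = initial_delay
        for attempt in range(max_retries + 1):
            try:
                return func(*args, **kwargs)
            except ConnectionError:
                if attempt == max_retries:
                    raise  # out of retries: surface the last error
                time.sleep(delay)
                # grow the delay; jitter spreads retries from concurrent clients
                delay *= base * (1 + (random.random() if jitter else 0.0))

    return wrapper


attempts = []


def flaky():
    # fails twice (simulating HTTP 429), then succeeds
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("rate limited")
    return "ok"


result = retry_with_backoff(flaky)()
print(result, len(attempts))  # ok 3
```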
@retry_with_exponential_backoff
def create(
# agent_state: AgentState,
llm_config: LLMConfigModel,
messages: List[Message],
    user_id: Optional[uuid.UUID] = None,  # optional UUID to associate the request with
functions: list = None,
functions_python: list = None,
function_call: str = "auto",
# hint
first_message: bool = False,
# use tool naming?
# if false, will use deprecated 'functions' style
use_tool_naming: bool = True,
# streaming?
stream: bool = False,
stream_inferface: Optional[Union[AgentRefreshStreamingInterface, AgentChunkStreamingInterface]] = None,
) -> ChatCompletionResponse:
"""Return response to chat completion with backoff"""
from memgpt.utils import printd
printd(f"Using model {llm_config.model_endpoint_type}, endpoint: {llm_config.model_endpoint}")
# TODO eventually refactor so that credentials are passed through
credentials = MemGPTCredentials.load()
if function_call and not functions:
printd("unsetting function_call because functions is None")
function_call = None
# openai
if llm_config.model_endpoint_type == "openai":
# TODO do the same for Azure?
if credentials.openai_key is None and llm_config.model_endpoint == "https://api.openai.com/v1":
# only is a problem if we are *not* using an openai proxy
            raise ValueError("OpenAI key is missing from MemGPT config file")
if use_tool_naming:
data = ChatCompletionRequest(
model=llm_config.model,
messages=[cast_message_to_subtype(m.to_openai_dict()) for m in messages],
tools=[{"type": "function", "function": f} for f in functions] if functions else None,
tool_choice=function_call,
user=str(user_id),
)
else:
data = ChatCompletionRequest(
model=llm_config.model,
messages=[cast_message_to_subtype(m.to_openai_dict()) for m in messages],
functions=functions,
function_call=function_call,
user=str(user_id),
)
# https://platform.openai.com/docs/guides/text-generation/json-mode
# only supported by gpt-4o, gpt-4-turbo, or gpt-3.5-turbo
if "gpt-4o" in llm_config.model or "gpt-4-turbo" in llm_config.model or "gpt-3.5-turbo" in llm_config.model:
data.response_format = {"type": "json_object"}
if stream: # Client requested token streaming
data.stream = True
            assert isinstance(
                stream_inferface, (AgentChunkStreamingInterface, AgentRefreshStreamingInterface)
            ), type(stream_inferface)
return openai_chat_completions_process_stream(
url=llm_config.model_endpoint, # https://api.openai.com/v1 -> https://api.openai.com/v1/chat/completions
api_key=credentials.openai_key,
chat_completion_request=data,
stream_inferface=stream_inferface,
)
else: # Client did not request token streaming (expect a blocking backend response)
data.stream = False
if isinstance(stream_inferface, AgentChunkStreamingInterface):
stream_inferface.stream_start()
try:
response = openai_chat_completions_request(
url=llm_config.model_endpoint, # https://api.openai.com/v1 -> https://api.openai.com/v1/chat/completions
api_key=credentials.openai_key,
chat_completion_request=data,
)
finally:
if isinstance(stream_inferface, AgentChunkStreamingInterface):
stream_inferface.stream_end()
return response
# azure
elif llm_config.model_endpoint_type == "azure":
if stream:
raise NotImplementedError(f"Streaming not yet implemented for {llm_config.model_endpoint_type}")
azure_deployment = (
credentials.azure_deployment if credentials.azure_deployment is not None else MODEL_TO_AZURE_ENGINE[llm_config.model]
)
if use_tool_naming:
data = dict(
# NOTE: don't pass model to Azure calls, that is the deployment_id
# model=agent_config.model,
messages=messages,
tools=[{"type": "function", "function": f} for f in functions] if functions else None,
tool_choice=function_call,
user=str(user_id),
)
else:
data = dict(
# NOTE: don't pass model to Azure calls, that is the deployment_id
# model=agent_config.model,
messages=messages,
functions=functions,
function_call=function_call,
user=str(user_id),
)
return azure_openai_chat_completions_request(
resource_name=credentials.azure_endpoint,
deployment_id=azure_deployment,
api_version=credentials.azure_version,
api_key=credentials.azure_key,
data=data,
)
elif llm_config.model_endpoint_type == "google_ai":
if stream:
raise NotImplementedError(f"Streaming not yet implemented for {llm_config.model_endpoint_type}")
if not use_tool_naming:
raise NotImplementedError("Only tool calling supported on Google AI API requests")
# NOTE: until Google AI supports CoT / text alongside function calls,
# we need to put it in a kwarg (unless we want to split the message into two)
google_ai_inner_thoughts_in_kwarg = True
if functions is not None:
tools = [{"type": "function", "function": f} for f in functions]
tools = [Tool(**t) for t in tools]
tools = convert_tools_to_google_ai_format(tools, inner_thoughts_in_kwargs=google_ai_inner_thoughts_in_kwarg)
else:
tools = None
return google_ai_chat_completions_request(
inner_thoughts_in_kwargs=google_ai_inner_thoughts_in_kwarg,
service_endpoint=credentials.google_ai_service_endpoint,
model=llm_config.model,
api_key=credentials.google_ai_key,
# see structure of payload here: https://ai.google.dev/docs/function_calling
data=dict(
contents=[m.to_google_ai_dict() for m in messages],
tools=tools,
),
)
elif llm_config.model_endpoint_type == "anthropic":
if stream:
raise NotImplementedError(f"Streaming not yet implemented for {llm_config.model_endpoint_type}")
if not use_tool_naming:
raise NotImplementedError("Only tool calling supported on Anthropic API requests")
if functions is not None:
tools = [{"type": "function", "function": f} for f in functions]
tools = [Tool(**t) for t in tools]
else:
tools = None
return anthropic_chat_completions_request(
url=llm_config.model_endpoint,
api_key=credentials.anthropic_key,
data=ChatCompletionRequest(
model=llm_config.model,
messages=[cast_message_to_subtype(m.to_openai_dict()) for m in messages],
tools=[{"type": "function", "function": f} for f in functions] if functions else None,
# tool_choice=function_call,
# user=str(user_id),
# NOTE: max_tokens is required for Anthropic API
max_tokens=1024, # TODO make dynamic
),
)
elif llm_config.model_endpoint_type == "cohere":
if stream:
raise NotImplementedError(f"Streaming not yet implemented for {llm_config.model_endpoint_type}")
if not use_tool_naming:
raise NotImplementedError("Only tool calling supported on Cohere API requests")
if functions is not None:
tools = [{"type": "function", "function": f} for f in functions]
tools = [Tool(**t) for t in tools]
else:
tools = None
return cohere_chat_completions_request(
# url=llm_config.model_endpoint,
url="https://api.cohere.ai/v1", # TODO
api_key=os.getenv("COHERE_API_KEY"), # TODO remove
chat_completion_request=ChatCompletionRequest(
model="command-r-plus", # TODO
messages=[cast_message_to_subtype(m.to_openai_dict()) for m in messages],
tools=[{"type": "function", "function": f} for f in functions] if functions else None,
tool_choice=function_call,
# user=str(user_id),
# NOTE: max_tokens is required for Anthropic API
# max_tokens=1024, # TODO make dynamic
),
)
# local model
else:
if stream:
raise NotImplementedError(f"Streaming not yet implemented for {llm_config.model_endpoint_type}")
return get_chat_completion(
model=llm_config.model,
messages=messages,
functions=functions,
functions_python=functions_python,
function_call=function_call,
context_window=llm_config.context_window,
endpoint=llm_config.model_endpoint,
endpoint_type=llm_config.model_endpoint_type,
wrapper=llm_config.model_wrapper,
user=str(user_id),
# hint
first_message=first_message,
# auth-related
auth_type=credentials.openllm_auth_type,
auth_key=credentials.openllm_key,
)
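`create()` above is essentially a dispatch over `llm_config.model_endpoint_type`: each known provider gets its own request builder, and anything unrecognized falls through to the local-model path. That control flow can be sketched as a lookup table (handler names and payloads here are illustrative stand-ins, not the real memgpt signatures):

```python
from typing import Callable, Dict, Tuple


def call_openai(req: dict) -> Tuple[str, dict]:
    return ("openai", req)


def call_azure(req: dict) -> Tuple[str, dict]:
    return ("azure", req)


def call_local(req: dict) -> Tuple[str, dict]:
    return ("local", req)


PROVIDERS: Dict[str, Callable[[dict], Tuple[str, dict]]] = {
    "openai": call_openai,
    "azure": call_azure,
}


def dispatch(endpoint_type: str, req: dict) -> Tuple[str, dict]:
    # unknown endpoint types use the local backend, mirroring the final else
    handler = PROVIDERS.get(endpoint_type, call_local)
    return handler(req)


print(dispatch("openai", {"model": "gpt-4"})[0])   # openai
print(dispatch("webui", {"model": "dolphin"})[0])  # local
```

A table like this makes adding a provider a one-line registration instead of another `elif` branch, at the cost of pulling shared pre-checks (streaming support, tool naming) out into the handlers.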
api_key=os.getenv("COHERE_API_KEY"), # TODO remove
chat_completion_request=ChatCompletionRequest(
model="command-r-plus", # TODO
messages=[cast_message_to_subtype(m.to_openai_dict()) for m in messages],
tools=[{"type": "function", "function": f} for f in functions] if functions else None,
tool_choice=function_call,
# user=str(user_id),
# max_tokens=1024, # TODO make dynamic
),
)
# local model
else:
if stream:
raise NotImplementedError(f"Streaming not yet implemented for {llm_config.model_endpoint_type}")
return get_chat_completion(
model=llm_config.model,
messages=messages,
functions=functions,
functions_python=functions_python,
function_call=function_call,
context_window=llm_config.context_window,
endpoint=llm_config.model_endpoint,
endpoint_type=llm_config.model_endpoint_type,
wrapper=llm_config.model_wrapper,
user=str(user_id),
# hint
first_message=first_message,
# auth-related
auth_type=credentials.openllm_auth_type,
auth_key=credentials.openllm_key,
)
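The branch ladder above dispatches on `llm_config.model_endpoint_type`. The same control flow can be sketched as a small handler registry; the handler names and return payloads below are illustrative stand-ins, not MemGPT's real functions:

```python
# Minimal sketch of the endpoint-type dispatch above; handlers and payloads
# are illustrative stand-ins, not MemGPT's real request functions.
def _openai_style_request(model, messages):
    return {"backend": "openai", "model": model, "n_messages": len(messages)}

def _local_request(model, messages):
    return {"backend": "local", "model": model, "n_messages": len(messages)}

ENDPOINT_HANDLERS = {
    "openai": _openai_style_request,
    "azure": _openai_style_request,  # Azure reuses the OpenAI-style payload
    "webui": _local_request,
    "ollama": _local_request,
}

def dispatch_chat_completion(endpoint_type, model, messages):
    handler = ENDPOINT_HANDLERS.get(endpoint_type)
    if handler is None:
        # mirrors the fall-through NotImplementedError handling in the real code
        raise NotImplementedError(f"No handler for endpoint type {endpoint_type!r}")
    return handler(model, messages)
```

A registry like this keeps each backend's quirks (tool naming, streaming support, auth) in one place instead of a long `elif` chain.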


@@ -1,3 +1,3 @@
# MemGPT + local LLMs
See [https://memgpt.readme.io/docs/local_llm](https://memgpt.readme.io/docs/local_llm) for documentation on running MemGPT with custom LLM backends.


@@ -1,280 +1,280 @@
"""Key idea: create drop-in replacement for agent's ChatCompletion call that runs on an OpenLLM backend"""
import json
import uuid
import requests
from memgpt.constants import CLI_WARNING_PREFIX, JSON_ENSURE_ASCII
from memgpt.errors import LocalLLMConnectionError, LocalLLMError
from memgpt.local_llm.constants import DEFAULT_WRAPPER
from memgpt.local_llm.function_parser import patch_function
from memgpt.local_llm.grammars.gbnf_grammar_generator import (
create_dynamic_model_from_function,
generate_gbnf_grammar_and_documentation,
)
from memgpt.local_llm.groq.api import get_groq_completion
from memgpt.local_llm.koboldcpp.api import get_koboldcpp_completion
from memgpt.local_llm.llamacpp.api import get_llamacpp_completion
from memgpt.local_llm.llm_chat_completion_wrappers import simple_summary_wrapper
from memgpt.local_llm.lmstudio.api import get_lmstudio_completion
from memgpt.local_llm.ollama.api import get_ollama_completion
from memgpt.local_llm.utils import count_tokens, get_available_wrappers
from memgpt.local_llm.vllm.api import get_vllm_completion
from memgpt.local_llm.webui.api import get_webui_completion
from memgpt.local_llm.webui.legacy_api import (
get_webui_completion as get_webui_completion_legacy,
)
from memgpt.models.chat_completion_response import (
ChatCompletionResponse,
Choice,
Message,
ToolCall,
UsageStatistics,
)
from memgpt.prompts.gpt_summarize import SYSTEM as SUMMARIZE_SYSTEM_MESSAGE
from memgpt.utils import get_tool_call_id, get_utc_time
has_shown_warning = False
grammar_supported_backends = ["koboldcpp", "llamacpp", "webui", "webui-legacy"]
def get_chat_completion(
model,
# no model required (except for Ollama), since the model is fixed to whatever you set in your own backend
messages,
functions=None,
functions_python=None,
function_call="auto",
context_window=None,
user=None,
# required
wrapper=None,
endpoint=None,
endpoint_type=None,
# optional cleanup
function_correction=True,
# extra hints to allow for additional prompt formatting hacks
# TODO this could alternatively be supported via passing function_call="send_message" into the wrapper
first_message=False,
# optional auth headers
auth_type=None,
auth_key=None,
) -> ChatCompletionResponse:
from memgpt.utils import printd
assert context_window is not None, "Local LLM calls need the context length to be explicitly set"
assert endpoint is not None, "Local LLM calls need the endpoint (eg http://localendpoint:1234) to be explicitly set"
assert endpoint_type is not None, "Local LLM calls need the endpoint type (eg webui) to be explicitly set"
global has_shown_warning
grammar = None
# TODO: eventually just process Message object
if not isinstance(messages[0], dict):
messages = [m.to_openai_dict() for m in messages]
if function_call is not None and function_call != "auto":
raise ValueError(f"function_call == {function_call} not supported (auto or None only)")
available_wrappers = get_available_wrappers()
documentation = None
# Special case for if the call we're making is coming from the summarizer
if messages[0]["role"] == "system" and messages[0]["content"].strip() == SUMMARIZE_SYSTEM_MESSAGE.strip():
llm_wrapper = simple_summary_wrapper.SimpleSummaryWrapper()
# Select a default prompt formatter
elif wrapper is None:
# Warn the user that we're using the fallback
if not has_shown_warning:
print(
f"{CLI_WARNING_PREFIX}no wrapper specified for local LLM, using the default wrapper (you can remove this warning by specifying the wrapper with --model-wrapper)"
)
has_shown_warning = True
llm_wrapper = DEFAULT_WRAPPER()
# User provided an incorrect prompt formatter
elif wrapper not in available_wrappers:
        raise ValueError(f"Could not find requested wrapper '{wrapper}' in available wrappers list:\n{', '.join(available_wrappers)}")
# User provided a correct prompt formatter
else:
llm_wrapper = available_wrappers[wrapper]
# If the wrapper uses grammar, generate the grammar using the grammar generating function
# TODO move this to a flag
if wrapper is not None and "grammar" in wrapper:
# When using grammars, we don't want to do any extras output tricks like appending a response prefix
setattr(llm_wrapper, "assistant_prefix_extra_first_message", "")
setattr(llm_wrapper, "assistant_prefix_extra", "")
# TODO find a better way to do this than string matching (eg an attribute)
if "noforce" in wrapper:
# "noforce" means that the prompt formatter expects inner thoughts as a top-level parameter
# this is closer to the OpenAI style since it allows for messages w/o any function calls
# however, with bad LLMs it makes it easier for the LLM to "forget" to call any of the functions
grammar, documentation = generate_grammar_and_documentation(
functions_python=functions_python,
add_inner_thoughts_top_level=True,
add_inner_thoughts_param_level=False,
allow_only_inner_thoughts=True,
)
else:
# otherwise, the other prompt formatters will insert inner thoughts as a function call parameter (by default)
# this means that every response from the LLM will be required to call a function
grammar, documentation = generate_grammar_and_documentation(
functions_python=functions_python,
add_inner_thoughts_top_level=False,
add_inner_thoughts_param_level=True,
allow_only_inner_thoughts=False,
)
printd(grammar)
if grammar is not None and endpoint_type not in grammar_supported_backends:
print(
f"{CLI_WARNING_PREFIX}grammars are currently not supported when using {endpoint_type} as the MemGPT local LLM backend (supported: {', '.join(grammar_supported_backends)})"
)
grammar = None
# First step: turn the message sequence into a prompt that the model expects
try:
# if hasattr(llm_wrapper, "supports_first_message"):
if hasattr(llm_wrapper, "supports_first_message") and llm_wrapper.supports_first_message:
prompt = llm_wrapper.chat_completion_to_prompt(
messages=messages, functions=functions, first_message=first_message, function_documentation=documentation
)
else:
prompt = llm_wrapper.chat_completion_to_prompt(messages=messages, functions=functions, function_documentation=documentation)
printd(prompt)
except Exception as e:
print(e)
raise LocalLLMError(
f"Failed to convert ChatCompletion messages into prompt string with wrapper {str(llm_wrapper)} - error: {str(e)}"
)
try:
if endpoint_type == "webui":
result, usage = get_webui_completion(endpoint, auth_type, auth_key, prompt, context_window, grammar=grammar)
elif endpoint_type == "webui-legacy":
result, usage = get_webui_completion_legacy(endpoint, auth_type, auth_key, prompt, context_window, grammar=grammar)
elif endpoint_type == "lmstudio":
result, usage = get_lmstudio_completion(endpoint, auth_type, auth_key, prompt, context_window, api="completions")
elif endpoint_type == "lmstudio-legacy":
result, usage = get_lmstudio_completion(endpoint, auth_type, auth_key, prompt, context_window, api="chat")
elif endpoint_type == "llamacpp":
result, usage = get_llamacpp_completion(endpoint, auth_type, auth_key, prompt, context_window, grammar=grammar)
elif endpoint_type == "koboldcpp":
result, usage = get_koboldcpp_completion(endpoint, auth_type, auth_key, prompt, context_window, grammar=grammar)
elif endpoint_type == "ollama":
result, usage = get_ollama_completion(endpoint, auth_type, auth_key, model, prompt, context_window)
elif endpoint_type == "vllm":
result, usage = get_vllm_completion(endpoint, auth_type, auth_key, model, prompt, context_window, user)
elif endpoint_type == "groq":
result, usage = get_groq_completion(endpoint, auth_type, auth_key, model, prompt, context_window)
else:
raise LocalLLMError(
f"Invalid endpoint type {endpoint_type}, please set variable depending on your backend (webui, lmstudio, llamacpp, koboldcpp)"
)
    except requests.exceptions.ConnectionError as e:
        raise LocalLLMConnectionError(f"Unable to connect to endpoint {endpoint}") from e
if result is None or result == "":
raise LocalLLMError(f"Got back an empty response string from {endpoint}")
printd(f"Raw LLM output:\n====\n{result}\n====")
try:
if hasattr(llm_wrapper, "supports_first_message") and llm_wrapper.supports_first_message:
chat_completion_result = llm_wrapper.output_to_chat_completion_response(result, first_message=first_message)
else:
chat_completion_result = llm_wrapper.output_to_chat_completion_response(result)
printd(json.dumps(chat_completion_result, indent=2, ensure_ascii=JSON_ENSURE_ASCII))
except Exception as e:
raise LocalLLMError(f"Failed to parse JSON from local LLM response - error: {str(e)}")
# Run through some manual function correction (optional)
if function_correction:
chat_completion_result = patch_function(message_history=messages, new_message=chat_completion_result)
# Fill in potential missing usage information (used for tracking token use)
if not ("prompt_tokens" in usage and "completion_tokens" in usage and "total_tokens" in usage):
raise LocalLLMError(f"usage dict in response was missing fields ({usage})")
if usage["prompt_tokens"] is None:
printd(f"usage dict was missing prompt_tokens, computing on-the-fly...")
usage["prompt_tokens"] = count_tokens(prompt)
# NOTE: we should compute on-the-fly anyways since we might have to correct for errors during JSON parsing
usage["completion_tokens"] = count_tokens(json.dumps(chat_completion_result, ensure_ascii=JSON_ENSURE_ASCII))
"""
if usage["completion_tokens"] is None:
printd(f"usage dict was missing completion_tokens, computing on-the-fly...")
# chat_completion_result is dict with 'role' and 'content'
# token counter wants a string
usage["completion_tokens"] = count_tokens(json.dumps(chat_completion_result, ensure_ascii=JSON_ENSURE_ASCII))
"""
# NOTE: this is the token count that matters most
if usage["total_tokens"] is None:
printd(f"usage dict was missing total_tokens, computing on-the-fly...")
usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
# unpack with response.choices[0].message.content
response = ChatCompletionResponse(
id=str(uuid.uuid4()), # TODO something better?
choices=[
Choice(
finish_reason="stop",
index=0,
message=Message(
role=chat_completion_result["role"],
content=chat_completion_result["content"],
tool_calls=(
[ToolCall(id=get_tool_call_id(), type="function", function=chat_completion_result["function_call"])]
if "function_call" in chat_completion_result
else []
),
),
)
],
created=get_utc_time(),
model=model,
# "This fingerprint represents the backend configuration that the model runs with."
# system_fingerprint=user if user is not None else "null",
system_fingerprint=None,
object="chat.completion",
usage=UsageStatistics(**usage),
)
printd(response)
return response
def generate_grammar_and_documentation(
functions_python: dict,
add_inner_thoughts_top_level: bool,
add_inner_thoughts_param_level: bool,
allow_only_inner_thoughts: bool,
):
from memgpt.utils import printd
assert not (
add_inner_thoughts_top_level and add_inner_thoughts_param_level
), "Can only place inner thoughts in one location in the grammar generator"
grammar_function_models = []
# create_dynamic_model_from_function will add inner thoughts to the function parameters if add_inner_thoughts is True.
# generate_gbnf_grammar_and_documentation will add inner thoughts to the outer object of the function parameters if add_inner_thoughts is True.
for key, func in functions_python.items():
grammar_function_models.append(create_dynamic_model_from_function(func, add_inner_thoughts=add_inner_thoughts_param_level))
grammar, documentation = generate_gbnf_grammar_and_documentation(
grammar_function_models,
outer_object_name="function",
outer_object_content="params",
model_prefix="function",
fields_prefix="params",
add_inner_thoughts=add_inner_thoughts_top_level,
allow_only_inner_thoughts=allow_only_inner_thoughts,
)
printd(grammar)
return grammar, documentation
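The usage back-fill in `get_chat_completion` above (recomputing token counts the backend failed to report) can be sketched in isolation. Here `fake_count_tokens` is a crude whitespace stand-in for memgpt's real `count_tokens`, used only so the sketch is self-contained:

```python
import json

def fake_count_tokens(text: str) -> int:
    # crude stand-in tokenizer; the real code uses memgpt's count_tokens
    return len(text.split())

def backfill_usage(usage: dict, prompt: str, completion_obj: dict) -> dict:
    """Fill in missing prompt/completion/total token counts, as above."""
    usage = dict(usage)  # don't mutate the caller's dict
    if usage.get("prompt_tokens") is None:
        usage["prompt_tokens"] = fake_count_tokens(prompt)
    # completion tokens are always recomputed, mirroring the code above,
    # since JSON-parse corrections may have changed the completion
    usage["completion_tokens"] = fake_count_tokens(json.dumps(completion_obj))
    if usage.get("total_tokens") is None:
        usage["total_tokens"] = usage["prompt_tokens"] + usage["completion_tokens"]
    return usage
```

The total count is the one that matters most downstream (context-window tracking), so it is derived last from whatever prompt/completion figures survive.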


@@ -1,453 +1,453 @@
import json
from ...constants import JSON_ENSURE_ASCII, JSON_LOADS_STRICT
from ...errors import LLMJSONParsingError
from ..json_parser import clean_json
from .wrapper_base import LLMChatCompletionWrapper
class Airoboros21Wrapper(LLMChatCompletionWrapper):
"""Wrapper for Airoboros 70b v2.1: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1
Note: this wrapper formats a prompt that only generates JSON, no inner thoughts
"""
def __init__(
self,
simplify_json_content=True,
clean_function_args=True,
include_assistant_prefix=True,
include_opening_brace_in_prefix=True,
include_section_separators=True,
):
self.simplify_json_content = simplify_json_content
self.clean_func_args = clean_function_args
self.include_assistant_prefix = include_assistant_prefix
self.include_opening_brance_in_prefix = include_opening_brace_in_prefix
self.include_section_separators = include_section_separators
def chat_completion_to_prompt(self, messages, functions, function_documentation=None):
"""Example for airoboros: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1#prompt-format
A chat.
USER: {prompt}
ASSISTANT:
Functions support: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1#agentfunction-calling
As an AI assistant, please select the most suitable function and parameters from the list of available functions below, based on the user's input. Provide your response in JSON format.
Input: I want to know how many times 'Python' is mentioned in my text file.
Available functions:
file_analytics:
description: This tool performs various operations on a text file.
params:
action: The operation we want to perform on the data, such as "count_occurrences", "find_line", etc.
filters:
keyword: The word or phrase we want to search for.
OpenAI functions schema style:
{
"name": "send_message",
"description": "Sends a message to the human user",
"parameters": {
"type": "object",
"properties": {
# https://json-schema.org/understanding-json-schema/reference/array.html
"message": {
"type": "string",
"description": "Message contents. All unicode (including emojis) are supported.",
},
},
"required": ["message"],
}
},
"""
prompt = ""
        # System instructions go first
assert messages[0]["role"] == "system"
prompt += messages[0]["content"]
# Next is the functions preamble
def create_function_description(schema):
            # airoboros style
func_str = ""
func_str += f"{schema['name']}:"
func_str += f"\n description: {schema['description']}"
func_str += f"\n params:"
for param_k, param_v in schema["parameters"]["properties"].items():
# TODO we're ignoring type
func_str += f"\n {param_k}: {param_v['description']}"
# TODO we're ignoring schema['parameters']['required']
return func_str
# prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the user's input. Provide your response in JSON format."
prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the ongoing conversation. Provide your response in JSON format."
prompt += f"\nAvailable functions:"
if function_documentation is not None:
prompt += f"\n{function_documentation}"
else:
for function_dict in functions:
prompt += f"\n{create_function_description(function_dict)}"
def create_function_call(function_call):
"""Go from ChatCompletion to Airoboros style function trace (in prompt)
ChatCompletion data (inside message['function_call']):
"function_call": {
"name": ...
"arguments": {
"arg1": val1,
...
}
Airoboros output:
{
"function": "send_message",
"params": {
"message": "Hello there! I am Sam, an AI developed by Liminal Corp. How can I assist you today?"
}
}
"""
airo_func_call = {
"function": function_call["name"],
"params": json.loads(function_call["arguments"], strict=JSON_LOADS_STRICT),
}
return json.dumps(airo_func_call, indent=2, ensure_ascii=JSON_ENSURE_ASCII)
# Add a sep for the conversation
if self.include_section_separators:
prompt += "\n### INPUT"
# Last are the user/assistant messages
for message in messages[1:]:
assert message["role"] in ["user", "assistant", "function", "tool"], message
if message["role"] == "user":
if self.simplify_json_content:
try:
content_json = json.loads(message["content"], strict=JSON_LOADS_STRICT)
content_simple = content_json["message"]
prompt += f"\nUSER: {content_simple}"
                    except (json.JSONDecodeError, KeyError, TypeError):
prompt += f"\nUSER: {message['content']}"
elif message["role"] == "assistant":
prompt += f"\nASSISTANT: {message['content']}"
# need to add the function call if there was one
if "function_call" in message and message["function_call"]:
prompt += f"\n{create_function_call(message['function_call'])}"
elif message["role"] in ["function", "tool"]:
# TODO find a good way to add this
# prompt += f"\nASSISTANT: (function return) {message['content']}"
prompt += f"\nFUNCTION RETURN: {message['content']}"
continue
else:
raise ValueError(message)
# Add a sep for the response
if self.include_section_separators:
prompt += "\n### RESPONSE"
if self.include_assistant_prefix:
prompt += f"\nASSISTANT:"
if self.include_opening_brance_in_prefix:
prompt += "\n{"
print(prompt)
return prompt
def clean_function_args(self, function_name, function_args):
"""Some basic MemGPT-specific cleaning of function args"""
cleaned_function_name = function_name
cleaned_function_args = function_args.copy() if function_args is not None else {}
if function_name == "send_message":
# strip request_heartbeat
cleaned_function_args.pop("request_heartbeat", None)
# TODO more cleaning to fix errors LLM makes
return cleaned_function_name, cleaned_function_args
def output_to_chat_completion_response(self, raw_llm_output):
"""Turn raw LLM output into a ChatCompletion style response with:
"message" = {
"role": "assistant",
"content": ...,
"function_call": {
"name": ...
"arguments": {
"arg1": val1,
...
}
}
}
"""
if self.include_opening_brance_in_prefix and not raw_llm_output.startswith("{"):
raw_llm_output = "{" + raw_llm_output
try:
function_json_output = clean_json(raw_llm_output)
except Exception as e:
raise Exception(f"Failed to decode JSON from LLM output:\n{raw_llm_output} - error\n{str(e)}")
try:
function_name = function_json_output["function"]
function_parameters = function_json_output["params"]
except KeyError as e:
raise LLMJSONParsingError(f"Received valid JSON from LLM, but JSON was missing fields: {str(e)}")
if self.clean_func_args:
function_name, function_parameters = self.clean_function_args(function_name, function_parameters)
message = {
"role": "assistant",
"content": None,
"function_call": {
"name": function_name,
"arguments": json.dumps(function_parameters, ensure_ascii=JSON_ENSURE_ASCII),
},
}
return message
class Airoboros21InnerMonologueWrapper(Airoboros21Wrapper):
"""Still expect only JSON outputs from model, but add inner monologue as a field"""
def __init__(
self,
simplify_json_content=True,
clean_function_args=True,
include_assistant_prefix=True,
# include_opening_brace_in_prefix=True,
# assistant_prefix_extra="\n{"
# assistant_prefix_extra='\n{\n "function": ',
assistant_prefix_extra='\n{\n "function":',
include_section_separators=True,
):
self.simplify_json_content = simplify_json_content
self.clean_func_args = clean_function_args
self.include_assistant_prefix = include_assistant_prefix
# self.include_opening_brance_in_prefix = include_opening_brace_in_prefix
self.assistant_prefix_extra = assistant_prefix_extra
self.include_section_separators = include_section_separators
def chat_completion_to_prompt(self, messages, functions, function_documentation=None):
"""Example for airoboros: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1#prompt-format
A chat.
USER: {prompt}
ASSISTANT:
Functions support: https://huggingface.co/jondurbin/airoboros-l2-70b-2.1#agentfunction-calling
As an AI assistant, please select the most suitable function and parameters from the list of available functions below, based on the user's input. Provide your response in JSON format.
Input: I want to know how many times 'Python' is mentioned in my text file.
Available functions:
file_analytics:
description: This tool performs various operations on a text file.
params:
action: The operation we want to perform on the data, such as "count_occurrences", "find_line", etc.
filters:
keyword: The word or phrase we want to search for.
OpenAI functions schema style:
{
"name": "send_message",
"description": "Sends a message to the human user",
"parameters": {
"type": "object",
"properties": {
# https://json-schema.org/understanding-json-schema/reference/array.html
"message": {
"type": "string",
"description": "Message contents. All unicode (including emojis) are supported.",
},
},
"required": ["message"],
}
},
"""
prompt = ""
# System instructions go first
assert messages[0]["role"] == "system"
prompt += messages[0]["content"]
# Next is the functions preamble
def create_function_description(schema, add_inner_thoughts=True):
# airoboros style
func_str = ""
func_str += f"{schema['name']}:"
func_str += f"\n description: {schema['description']}"
func_str += f"\n params:"
if add_inner_thoughts:
func_str += f"\n inner_thoughts: Deep inner monologue private to you only."
for param_k, param_v in schema["parameters"]["properties"].items():
# TODO we're ignoring type
func_str += f"\n {param_k}: {param_v['description']}"
# TODO we're ignoring schema['parameters']['required']
return func_str
# prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the user's input. Provide your response in JSON format."
prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the ongoing conversation. Provide your response in JSON format."
prompt += f"\nAvailable functions:"
if function_documentation is not None:
prompt += f"\n{function_documentation}"
else:
for function_dict in functions:
prompt += f"\n{create_function_description(function_dict)}"
def create_function_call(function_call, inner_thoughts=None):
"""Go from ChatCompletion to Airoboros style function trace (in prompt)
ChatCompletion data (inside message['function_call']):
"function_call": {
"name": ...
"arguments": {
"arg1": val1,
...
}
Airoboros output:
{
"function": "send_message",
"params": {
"message": "Hello there! I am Sam, an AI developed by Liminal Corp. How can I assist you today?"
}
}
"""
airo_func_call = {
"function": function_call["name"],
"params": {
"inner_thoughts": inner_thoughts,
**json.loads(function_call["arguments"], strict=JSON_LOADS_STRICT),
},
}
return json.dumps(airo_func_call, indent=2, ensure_ascii=JSON_ENSURE_ASCII)
# Add a sep for the conversation
if self.include_section_separators:
prompt += "\n### INPUT"
# Last are the user/assistant messages
for message in messages[1:]:
assert message["role"] in ["user", "assistant", "function", "tool"], message
if message["role"] == "user":
# Support for AutoGen naming of agents
if "name" in message:
user_prefix = message["name"].strip()
user_prefix = f"USER ({user_prefix})"
else:
user_prefix = "USER"
if self.simplify_json_content:
try:
content_json = json.loads(message["content"], strict=JSON_LOADS_STRICT)
content_simple = content_json["message"]
prompt += f"\n{user_prefix}: {content_simple}"
except (json.JSONDecodeError, TypeError, KeyError):
prompt += f"\n{user_prefix}: {message['content']}"
elif message["role"] == "assistant":
# Support for AutoGen naming of agents
if "name" in message:
assistant_prefix = message["name"].strip()
assistant_prefix = f"ASSISTANT ({assistant_prefix})"
else:
assistant_prefix = "ASSISTANT"
prompt += f"\n{assistant_prefix}:"
# need to add the function call if there was one
inner_thoughts = message["content"]
if "function_call" in message and message["function_call"]:
prompt += f"\n{create_function_call(message['function_call'], inner_thoughts=inner_thoughts)}"
elif message["role"] in ["function", "tool"]:
# TODO find a good way to add this
# prompt += f"\nASSISTANT: (function return) {message['content']}"
prompt += f"\nFUNCTION RETURN: {message['content']}"
continue
else:
raise ValueError(message)
# Add a sep for the response
if self.include_section_separators:
prompt += "\n### RESPONSE"
if self.include_assistant_prefix:
prompt += f"\nASSISTANT:"
if self.assistant_prefix_extra:
prompt += self.assistant_prefix_extra
return prompt
def clean_function_args(self, function_name, function_args):
"""Some basic MemGPT-specific cleaning of function args"""
cleaned_function_name = function_name
cleaned_function_args = function_args.copy() if function_args is not None else {}
if function_name == "send_message":
# strip request_heartbeat
cleaned_function_args.pop("request_heartbeat", None)
inner_thoughts = None
if "inner_thoughts" in function_args:
inner_thoughts = cleaned_function_args.pop("inner_thoughts")
# TODO more cleaning to fix errors LLM makes
return inner_thoughts, cleaned_function_name, cleaned_function_args
def output_to_chat_completion_response(self, raw_llm_output):
"""Turn raw LLM output into a ChatCompletion style response with:
"message" = {
"role": "assistant",
"content": ...,
"function_call": {
"name": ...
"arguments": {
"arg1": val1,
...
}
}
}
"""
# if self.include_opening_brance_in_prefix and raw_llm_output[0] != "{":
# raw_llm_output = "{" + raw_llm_output
if self.assistant_prefix_extra and not raw_llm_output.startswith(self.assistant_prefix_extra):
# print(f"adding prefix back to llm, raw_llm_output=\n{raw_llm_output}")
raw_llm_output = self.assistant_prefix_extra + raw_llm_output
# print(f"->\n{raw_llm_output}")
try:
function_json_output = clean_json(raw_llm_output)
except Exception as e:
raise Exception(f"Failed to decode JSON from LLM output:\n{raw_llm_output} - error\n{str(e)}")
try:
# NOTE: weird bug can happen where 'function' gets nested if the prefix in the prompt isn't abided by
if isinstance(function_json_output["function"], dict):
function_json_output = function_json_output["function"]
function_name = function_json_output["function"]
function_parameters = function_json_output["params"]
except KeyError as e:
raise LLMJSONParsingError(
f"Received valid JSON from LLM, but JSON was missing fields: {str(e)}. JSON result was:\n{function_json_output}"
)
if self.clean_func_args:
(
inner_thoughts,
function_name,
function_parameters,
) = self.clean_function_args(function_name, function_parameters)
message = {
"role": "assistant",
"content": inner_thoughts,
"function_call": {
"name": function_name,
"arguments": json.dumps(function_parameters, ensure_ascii=JSON_ENSURE_ASCII),
},
}
return message
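The round trip these wrappers implement — ChatCompletion `function_call` out to an Airoboros-style JSON trace, and raw model JSON back into an assistant message — can be sketched with stdlib `json` alone. This is a minimal, self-contained illustration of the data format only; `to_airo_trace` and `parse_airo_output` are hypothetical helper names, not functions from this codebase:

```python
import json

def to_airo_trace(function_call, inner_thoughts=None):
    # ChatCompletion-style function_call -> Airoboros-style JSON trace (goes into the prompt)
    params = json.loads(function_call["arguments"])
    if inner_thoughts is not None:
        params = {"inner_thoughts": inner_thoughts, **params}
    return json.dumps({"function": function_call["name"], "params": params}, indent=2)

def parse_airo_output(raw_llm_output):
    # Airoboros-style JSON trace -> ChatCompletion-style assistant message
    data = json.loads(raw_llm_output)
    params = dict(data["params"])
    inner_thoughts = params.pop("inner_thoughts", None)
    return {
        "role": "assistant",
        "content": inner_thoughts,
        "function_call": {"name": data["function"], "arguments": json.dumps(params)},
    }

call = {"name": "send_message", "arguments": '{"message": "Hello there!"}'}
trace = to_airo_trace(call, inner_thoughts="Greeting the user.")
message = parse_airo_output(trace)
```

Because `inner_thoughts` is folded into `params` on the way out and popped back into `content` on the way in, the model only ever sees one flat JSON object per call.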


@@ -1,11 +1,11 @@
from abc import ABC, abstractmethod
class LLMChatCompletionWrapper(ABC):
@abstractmethod
def chat_completion_to_prompt(self, messages, functions, function_documentation=None):
"""Go from ChatCompletion to a single prompt string"""
@abstractmethod
def output_to_chat_completion_response(self, raw_llm_output):
"""Turn the LLM output string into a ChatCompletion response"""


@@ -1,346 +1,346 @@
import json
from ...constants import JSON_ENSURE_ASCII, JSON_LOADS_STRICT
from ...errors import LLMJSONParsingError
from ..json_parser import clean_json
from .wrapper_base import LLMChatCompletionWrapper
class ZephyrMistralWrapper(LLMChatCompletionWrapper):
"""
Wrapper for Zephyr Alpha and Beta, Mistral 7B:
https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha
https://huggingface.co/HuggingFaceH4/zephyr-7b-beta
Note: this wrapper formats a prompt that only generates JSON, no inner thoughts
"""
def __init__(
self,
simplify_json_content=True,
clean_function_args=True,
include_assistant_prefix=True,
include_opening_brace_in_prefix=True,
include_section_separators=False,
):
self.simplify_json_content = simplify_json_content
self.clean_func_args = clean_function_args
self.include_assistant_prefix = include_assistant_prefix
self.include_opening_brance_in_prefix = include_opening_brace_in_prefix
self.include_section_separators = include_section_separators
def chat_completion_to_prompt(self, messages, functions, function_documentation=None):
"""
Zephyr prompt format:
<|system|>
</s>
<|user|>
{prompt}</s>
<|assistant|>
(source: https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF#prompt-template-zephyr)
"""
prompt = ""
IM_END_TOKEN = "</s>"
# System instructions go first
assert messages[0]["role"] == "system"
prompt += f"<|system|>"
prompt += f"\n{messages[0]['content']}"
# Next is the functions preamble
def create_function_description(schema):
# airoboros style
func_str = ""
func_str += f"{schema['name']}:"
func_str += f"\n description: {schema['description']}"
func_str += f"\n params:"
for param_k, param_v in schema["parameters"]["properties"].items():
# TODO we're ignoring type
func_str += f"\n {param_k}: {param_v['description']}"
# TODO we're ignoring schema['parameters']['required']
return func_str
# prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the user's input. Provide your response in JSON format."
prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the ongoing conversation. Provide your response in JSON format."
prompt += f"\nAvailable functions:"
if function_documentation is not None:
prompt += f"\n{function_documentation}"
else:
for function_dict in functions:
prompt += f"\n{create_function_description(function_dict)}"
# Put functions INSIDE system message (TODO experiment with this)
prompt += IM_END_TOKEN
def create_function_call(function_call):
airo_func_call = {
"function": function_call["name"],
"params": json.loads(function_call["arguments"], strict=JSON_LOADS_STRICT),
}
return json.dumps(airo_func_call, indent=2, ensure_ascii=JSON_ENSURE_ASCII)
for message in messages[1:]:
assert message["role"] in ["user", "assistant", "function", "tool"], message
if message["role"] == "user":
if self.simplify_json_content:
try:
content_json = json.loads(message["content"], strict=JSON_LOADS_STRICT)
content_simple = content_json["message"]
prompt += f"\n<|user|>\n{content_simple}{IM_END_TOKEN}"
# prompt += f"\nUSER: {content_simple}"
except (json.JSONDecodeError, TypeError, KeyError):
prompt += f"\n<|user|>\n{message['content']}{IM_END_TOKEN}"
# prompt += f"\nUSER: {message['content']}"
elif message["role"] == "assistant":
prompt += f"\n<|assistant|>"
if message["content"] is not None:
prompt += f"\n{message['content']}"
# prompt += f"\nASSISTANT: {message['content']}"
# need to add the function call if there was one
if "function_call" in message and message["function_call"]:
prompt += f"\n{create_function_call(message['function_call'])}"
prompt += f"{IM_END_TOKEN}"
elif message["role"] in ["function", "tool"]:
# TODO find a good way to add this
# prompt += f"\nASSISTANT: (function return) {message['content']}"
prompt += f"\n<|assistant|>"
prompt += f"\nFUNCTION RETURN: {message['content']}"
# prompt += f"\nFUNCTION RETURN: {message['content']}"
continue
else:
raise ValueError(message)
# Add a sep for the response
# if self.include_section_separators:
# prompt += "\n### RESPONSE"
if self.include_assistant_prefix:
# prompt += f"\nASSISTANT:"
prompt += f"\n<|assistant|>"
if self.include_opening_brance_in_prefix:
prompt += "\n{"
return prompt
def clean_function_args(self, function_name, function_args):
"""Some basic MemGPT-specific cleaning of function args"""
cleaned_function_name = function_name
cleaned_function_args = function_args.copy() if function_args is not None else {}
if function_name == "send_message":
# strip request_heartbeat
cleaned_function_args.pop("request_heartbeat", None)
# TODO more cleaning to fix errors LLM makes
return cleaned_function_name, cleaned_function_args
def output_to_chat_completion_response(self, raw_llm_output):
"""Turn raw LLM output into a ChatCompletion style response with:
"message" = {
"role": "assistant",
"content": ...,
"function_call": {
"name": ...
"arguments": {
"arg1": val1,
...
}
}
}
"""
if self.include_opening_brance_in_prefix and not raw_llm_output.startswith("{"):
raw_llm_output = "{" + raw_llm_output
try:
function_json_output = clean_json(raw_llm_output)
except Exception as e:
raise Exception(f"Failed to decode JSON from LLM output:\n{raw_llm_output} - error\n{str(e)}")
try:
function_name = function_json_output["function"]
function_parameters = function_json_output["params"]
except KeyError as e:
raise LLMJSONParsingError(f"Received valid JSON from LLM, but JSON was missing fields: {str(e)}")
if self.clean_func_args:
function_name, function_parameters = self.clean_function_args(function_name, function_parameters)
message = {
"role": "assistant",
"content": None,
"function_call": {
"name": function_name,
"arguments": json.dumps(function_parameters, ensure_ascii=JSON_ENSURE_ASCII),
},
}
return message
class ZephyrMistralInnerMonologueWrapper(ZephyrMistralWrapper):
"""Still expect only JSON outputs from model, but add inner monologue as a field"""
"""
Wrapper for Zephyr Alpha and Beta, Mistral 7B:
https://huggingface.co/HuggingFaceH4/zephyr-7b-alpha
https://huggingface.co/HuggingFaceH4/zephyr-7b-beta
Note: this wrapper formats a prompt with inner thoughts included
"""
def __init__(
self,
simplify_json_content=True,
clean_function_args=True,
include_assistant_prefix=True,
include_opening_brace_in_prefix=True,
include_section_separators=True,
):
self.simplify_json_content = simplify_json_content
self.clean_func_args = clean_function_args
self.include_assistant_prefix = include_assistant_prefix
self.include_opening_brance_in_prefix = include_opening_brace_in_prefix
self.include_section_separators = include_section_separators
def chat_completion_to_prompt(self, messages, functions, function_documentation=None):
prompt = ""
IM_END_TOKEN = "</s>"
# System instructions go first
assert messages[0]["role"] == "system"
prompt += messages[0]["content"]
# Next is the functions preamble
def create_function_description(schema, add_inner_thoughts=True):
# airoboros style
func_str = ""
func_str += f"{schema['name']}:"
func_str += f"\n description: {schema['description']}"
func_str += f"\n params:"
if add_inner_thoughts:
func_str += f"\n inner_thoughts: Deep inner monologue private to you only."
for param_k, param_v in schema["parameters"]["properties"].items():
# TODO we're ignoring type
func_str += f"\n {param_k}: {param_v['description']}"
# TODO we're ignoring schema['parameters']['required']
return func_str
# prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the user's input. Provide your response in JSON format."
prompt += f"\nPlease select the most suitable function and parameters from the list of available functions below, based on the ongoing conversation. Provide your response in JSON format."
prompt += f"\nAvailable functions:"
if function_documentation is not None:
prompt += f"\n{function_documentation}"
else:
for function_dict in functions:
prompt += f"\n{create_function_description(function_dict)}"
def create_function_call(function_call, inner_thoughts=None):
airo_func_call = {
"function": function_call["name"],
"params": {
"inner_thoughts": inner_thoughts,
**json.loads(function_call["arguments"], strict=JSON_LOADS_STRICT),
},
}
return json.dumps(airo_func_call, indent=2, ensure_ascii=JSON_ENSURE_ASCII)
# Add a sep for the conversation
if self.include_section_separators:
prompt += "\n<|user|>"
# Last are the user/assistant messages
for message in messages[1:]:
assert message["role"] in ["user", "assistant", "function", "tool"], message
if message["role"] == "user":
if self.simplify_json_content:
try:
content_json = json.loads(message["content"], strict=JSON_LOADS_STRICT)
content_simple = content_json["message"]
prompt += f"\n<|user|>\n{content_simple}{IM_END_TOKEN}"
except (json.JSONDecodeError, TypeError, KeyError):
prompt += f"\n<|user|>\n{message['content']}{IM_END_TOKEN}"
elif message["role"] == "assistant":
prompt += f"\n<|assistant|>"
# need to add the function call if there was one
inner_thoughts = message["content"]
if "function_call" in message and message["function_call"]:
prompt += f"\n{create_function_call(message['function_call'], inner_thoughts=inner_thoughts)}"
elif message["role"] in ["function", "tool"]:
# TODO find a good way to add this
# prompt += f"\nASSISTANT: (function return) {message['content']}"
prompt += f"\nFUNCTION RETURN: {message['content']}"
continue
else:
raise ValueError(message)
# Add a sep for the response
# if self.include_section_separators:
# prompt += "\n### RESPONSE"
if self.include_assistant_prefix:
prompt += f"\n<|assistant|>"
if self.include_opening_brance_in_prefix:
prompt += "\n{"
return prompt
def clean_function_args(self, function_name, function_args):
"""Some basic MemGPT-specific cleaning of function args"""
cleaned_function_name = function_name
cleaned_function_args = function_args.copy() if function_args is not None else {}
if function_name == "send_message":
# strip request_heartbeat
cleaned_function_args.pop("request_heartbeat", None)
inner_thoughts = None
if "inner_thoughts" in function_args:
inner_thoughts = cleaned_function_args.pop("inner_thoughts")
# TODO more cleaning to fix errors LLM makes
return inner_thoughts, cleaned_function_name, cleaned_function_args
def output_to_chat_completion_response(self, raw_llm_output):
"""Turn raw LLM output into a ChatCompletion style response with:
"message" = {
"role": "assistant",
"content": ...,
"function_call": {
"name": ...
"arguments": {
"arg1": val1,
...
}
}
}
"""
if self.include_opening_brance_in_prefix and not raw_llm_output.startswith("{"):
raw_llm_output = "{" + raw_llm_output
try:
function_json_output = clean_json(raw_llm_output)
except Exception as e:
raise Exception(f"Failed to decode JSON from LLM output:\n{raw_llm_output} - error\n{str(e)}")
try:
function_name = function_json_output["function"]
function_parameters = function_json_output["params"]
except KeyError as e:
raise LLMJSONParsingError(f"Received valid JSON from LLM, but JSON was missing fields: {str(e)}")
if self.clean_func_args:
(
inner_thoughts,
function_name,
function_parameters,
) = self.clean_function_args(function_name, function_parameters)
message = {
"role": "assistant",
"content": inner_thoughts,
"function_call": {
"name": function_name,
"arguments": json.dumps(function_parameters, ensure_ascii=JSON_ENSURE_ASCII),
},
}
return message
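The `<|system|>` / `<|user|>` / `<|assistant|>` tokens assembled piecewise above follow the Zephyr chat template (see the Zephyr-7B model cards linked in the docstring). A minimal sketch of that template in isolation — the helper name and the `turns` shape here are illustrative, not part of MemGPT:

```python
EOS = "</s>"  # Zephyr's end-of-turn token

def render_zephyr_prompt(system: str, turns: list) -> str:
    # turns is a list of (role, text) pairs, e.g. [("user", "hello")]
    prompt = f"<|system|>\n{system}{EOS}"
    for role, text in turns:
        prompt += f"\n<|{role}|>\n{text}{EOS}"
    # Trailing assistant header acts as the generation prefix,
    # as in the wrappers above.
    prompt += "\n<|assistant|>"
    return prompt

prompt = render_zephyr_prompt("You are MemGPT.", [("user", "hello")])
```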


@@ -1,448 +1,448 @@
import json
import os
import sys
import traceback
import questionary
import requests
import typer
from rich.console import Console
import memgpt.agent as agent
import memgpt.errors as errors
import memgpt.system as system
from memgpt.agent_store.storage import StorageConnector, TableType
# import benchmark
from memgpt.benchmark.benchmark import bench
from memgpt.cli.cli import (
delete_agent,
migrate,
open_folder,
quickstart,
run,
server,
version,
)
from memgpt.cli.cli_config import add, configure, delete, list
from memgpt.cli.cli_load import app as load_app
from memgpt.config import MemGPTConfig
from memgpt.constants import (
FUNC_FAILED_HEARTBEAT_MESSAGE,
JSON_ENSURE_ASCII,
JSON_LOADS_STRICT,
REQ_HEARTBEAT_MESSAGE,
)
from memgpt.metadata import MetadataStore
# from memgpt.interface import CLIInterface as interface # for printing to terminal
from memgpt.streaming_interface import AgentRefreshStreamingInterface
# interface = interface()
app = typer.Typer(pretty_exceptions_enable=False)
app.command(name="run")(run)
app.command(name="version")(version)
app.command(name="configure")(configure)
app.command(name="list")(list)
app.command(name="add")(add)
app.command(name="delete")(delete)
app.command(name="server")(server)
app.command(name="folder")(open_folder)
app.command(name="quickstart")(quickstart)
# load data commands
app.add_typer(load_app, name="load")
# migration command
app.command(name="migrate")(migrate)
# benchmark command
app.command(name="benchmark")(bench)
# delete agents
app.command(name="delete-agent")(delete_agent)
def clear_line(console, strip_ui=False):
if strip_ui:
return
if os.name == "nt": # for windows
console.print("\033[A\033[K", end="")
else: # for linux
sys.stdout.write("\033[2K\033[G")
sys.stdout.flush()
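`clear_line` relies on raw ANSI escape sequences; assembled explicitly they read as follows (the constant names are illustrative — "CSI" is the Control Sequence Introducer that starts each sequence):

```python
CSI = "\x1b["             # ESC [
CURSOR_UP = CSI + "A"     # move cursor up one line
ERASE_TO_EOL = CSI + "K"  # erase from cursor to end of line
ERASE_LINE = CSI + "2K"   # erase the entire current line
CURSOR_COL1 = CSI + "G"   # move cursor to column 1

windows_clear = CURSOR_UP + ERASE_TO_EOL  # what the os.name == "nt" branch prints
posix_clear = ERASE_LINE + CURSOR_COL1    # what the other branch writes
```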
def run_agent_loop(
memgpt_agent: agent.Agent, config: MemGPTConfig, first, ms: MetadataStore, no_verify=False, cfg=None, strip_ui=False, stream=False
):
if isinstance(memgpt_agent.interface, AgentRefreshStreamingInterface):
# memgpt_agent.interface.toggle_streaming(on=stream)
if not stream:
memgpt_agent.interface = memgpt_agent.interface.nonstreaming_interface
if hasattr(memgpt_agent.interface, "console"):
console = memgpt_agent.interface.console
else:
console = Console()
counter = 0
user_input = None
skip_next_user_input = False
user_message = None
USER_GOES_FIRST = first
if not USER_GOES_FIRST:
console.input("[bold cyan]Hit enter to begin (will request first MemGPT message)[/bold cyan]\n")
clear_line(console, strip_ui=strip_ui)
print()
multiline_input = False
ms = MetadataStore(config)
while True:
if not skip_next_user_input and (counter > 0 or USER_GOES_FIRST):
# Ask for user input
if not stream:
print()
user_input = questionary.text(
"Enter your message:",
multiline=multiline_input,
qmark=">",
).ask()
clear_line(console, strip_ui=strip_ui)
if not stream:
print()
# Gracefully exit on Ctrl-C/D
if user_input is None:
user_input = "/exit"
user_input = user_input.rstrip()
if user_input.startswith("!"):
print(f"Commands for CLI begin with '/' not '!'")
continue
if user_input == "":
# no empty messages allowed
print("Empty input received. Try again!")
continue
# Handle CLI commands
# Commands to not get passed as input to MemGPT
if user_input.startswith("/"):
# updated agent save functions
if user_input.lower() == "/exit":
# memgpt_agent.save()
agent.save_agent(memgpt_agent, ms)
break
elif user_input.lower() == "/save" or user_input.lower() == "/savechat":
# memgpt_agent.save()
agent.save_agent(memgpt_agent, ms)
continue
elif user_input.lower() == "/attach":
# TODO: check if agent already has it
# TODO: check to ensure source embedding dimensions/model match the agent's, and disallow attachment if not
# TODO: alternatively, only list sources with compatible embeddings, and print warning about non-compatible sources
data_source_options = ms.list_sources(user_id=memgpt_agent.agent_state.user_id)
if len(data_source_options) == 0:
typer.secho(
'No sources available. You must load a source with "memgpt load ..." before running /attach.',
fg=typer.colors.RED,
bold=True,
)
continue
# determine what sources are valid to be attached to this agent
valid_options = []
invalid_options = []
for source in data_source_options:
if (
source.embedding_model == memgpt_agent.agent_state.embedding_config.embedding_model
and source.embedding_dim == memgpt_agent.agent_state.embedding_config.embedding_dim
):
valid_options.append(source.name)
else:
# print warning about invalid sources
typer.secho(
f"Source {source.name} exists but has embedding dimentions {source.embedding_dim} from model {source.embedding_model}, while the agent uses embedding dimentions {memgpt_agent.agent_state.embedding_config.embedding_dim} and model {memgpt_agent.agent_state.embedding_config.embedding_model}",
fg=typer.colors.YELLOW,
)
invalid_options.append(source.name)
# prompt user for data source selection
data_source = questionary.select("Select data source", choices=valid_options).ask()
# attach new data
# attach(memgpt_agent.agent_state.name, data_source)
source_connector = StorageConnector.get_storage_connector(
TableType.PASSAGES, config, user_id=memgpt_agent.agent_state.user_id
)
memgpt_agent.attach_source(data_source, source_connector, ms)
continue
elif user_input.lower() == "/dump" or user_input.lower().startswith("/dump "):
# Check if there's an additional argument that's an integer
command = user_input.strip().split()
amount = int(command[1]) if len(command) > 1 and command[1].isdigit() else 0
if amount == 0:
memgpt_agent.interface.print_messages(memgpt_agent._messages, dump=True)
else:
memgpt_agent.interface.print_messages(memgpt_agent._messages[-min(amount, len(memgpt_agent._messages)) :], dump=True)
continue
elif user_input.lower() == "/dumpraw":
memgpt_agent.interface.print_messages_raw(memgpt_agent._messages)
continue
elif user_input.lower() == "/memory":
print(f"\nDumping memory contents:\n")
print(f"{str(memgpt_agent.memory)}")
print(f"{str(memgpt_agent.persistence_manager.archival_memory)}")
print(f"{str(memgpt_agent.persistence_manager.recall_memory)}")
continue
elif user_input.lower() == "/model":
if memgpt_agent.model == "gpt-4":
memgpt_agent.model = "gpt-3.5-turbo-16k"
elif memgpt_agent.model == "gpt-3.5-turbo-16k":
memgpt_agent.model = "gpt-4"
print(f"Updated model to:\n{str(memgpt_agent.model)}")
continue
elif user_input.lower() == "/pop" or user_input.lower().startswith("/pop "):
# Check if there's an additional argument that's an integer
command = user_input.strip().split()
pop_amount = int(command[1]) if len(command) > 1 and command[1].isdigit() else 3
n_messages = len(memgpt_agent._messages)
MIN_MESSAGES = 2
if n_messages <= MIN_MESSAGES:
print(f"Agent only has {n_messages} messages in stack, none left to pop")
elif n_messages - pop_amount < MIN_MESSAGES:
print(f"Agent only has {n_messages} messages in stack, cannot pop more than {n_messages - MIN_MESSAGES}")
else:
print(f"Popping last {pop_amount} messages from stack")
for _ in range(min(pop_amount, len(memgpt_agent._messages))):
# remove the message from the internal state of the agent
deleted_message = memgpt_agent._messages.pop()
# then also remove it from recall storage
memgpt_agent.persistence_manager.recall_memory.storage.delete(filters={"id": deleted_message.id})
continue
elif user_input.lower() == "/retry":
print(f"Retrying for another answer")
while len(memgpt_agent._messages) > 0:
if memgpt_agent._messages[-1].role == "user":
# we want to pop up to the last user message and send it again
user_message = memgpt_agent._messages[-1].text
deleted_message = memgpt_agent._messages.pop()
# then also remove it from recall storage
memgpt_agent.persistence_manager.recall_memory.storage.delete(filters={"id": deleted_message.id})
break
deleted_message = memgpt_agent._messages.pop()
# then also remove it from recall storage
memgpt_agent.persistence_manager.recall_memory.storage.delete(filters={"id": deleted_message.id})
elif user_input.lower() == "/rethink" or user_input.lower().startswith("/rethink "):
if len(user_input) < len("/rethink "):
print("Missing text after the command")
continue
for x in range(len(memgpt_agent.messages) - 1, 0, -1):
msg_obj = memgpt_agent._messages[x]
if msg_obj.role == "assistant":
clean_new_text = user_input[len("/rethink ") :].strip()
msg_obj.text = clean_new_text
# To persist to the database, all we need to do is "re-insert" into recall memory
memgpt_agent.persistence_manager.recall_memory.storage.update(record=msg_obj)
break
continue
elif user_input.lower() == "/rewrite" or user_input.lower().startswith("/rewrite "):
if len(user_input) < len("/rewrite "):
print("Missing text after the command")
continue
for x in range(len(memgpt_agent.messages) - 1, 0, -1):
if memgpt_agent.messages[x].get("role") == "assistant":
text = user_input[len("/rewrite ") :].strip()
# Get the current message content
# The rewrite target is the output of send_message
message_obj = memgpt_agent._messages[x]
if message_obj.tool_calls is not None and len(message_obj.tool_calls) > 0:
# Check that we hit an assistant send_message call
name_string = message_obj.tool_calls[0].function.get("name")
if name_string is None or name_string != "send_message":
print("Assistant missing send_message function call")
break # cancel op
args_string = message_obj.tool_calls[0].function.get("arguments")
if args_string is None:
print("Assistant missing send_message function arguments")
break # cancel op
args_json = json.loads(args_string, strict=JSON_LOADS_STRICT)
if "message" not in args_json:
print("Assistant missing send_message message argument")
break # cancel op
# Once we found our target, rewrite it
args_json["message"] = text
new_args_string = json.dumps(args_json, ensure_ascii=JSON_ENSURE_ASCII)
message_obj.tool_calls[0].function["arguments"] = new_args_string
# To persist to the database, all we need to do is "re-insert" into recall memory
memgpt_agent.persistence_manager.recall_memory.storage.update(record=message_obj)
break
continue
elif user_input.lower() == "/summarize":
try:
memgpt_agent.summarize_messages_inplace()
typer.secho(
f"/summarize succeeded",
fg=typer.colors.GREEN,
bold=True,
)
except (errors.LLMError, requests.exceptions.HTTPError) as e:
typer.secho(
f"/summarize failed:\n{e}",
fg=typer.colors.RED,
bold=True,
)
continue
elif user_input.lower().startswith("/add_function"):
try:
if len(user_input) < len("/add_function "):
print("Missing function name after the command")
continue
function_name = user_input[len("/add_function ") :].strip()
result = memgpt_agent.add_function(function_name)
typer.secho(
f"/add_function succeeded: {result}",
fg=typer.colors.GREEN,
bold=True,
)
except ValueError as e:
typer.secho(
f"/add_function failed:\n{e}",
fg=typer.colors.RED,
bold=True,
)
continue
elif user_input.lower().startswith("/remove_function"):
try:
if len(user_input) < len("/remove_function "):
print("Missing function name after the command")
continue
function_name = user_input[len("/remove_function ") :].strip()
result = memgpt_agent.remove_function(function_name)
typer.secho(
f"/remove_function succeeded: {result}",
fg=typer.colors.GREEN,
bold=True,
)
except ValueError as e:
typer.secho(
f"/remove_function failed:\n{e}",
fg=typer.colors.RED,
bold=True,
)
continue
# No skip options
elif user_input.lower() == "/wipe":
memgpt_agent = agent.Agent(memgpt_agent.interface)
user_message = None
elif user_input.lower() == "/heartbeat":
user_message = system.get_heartbeat()
elif user_input.lower() == "/memorywarning":
user_message = system.get_token_limit_warning()
elif user_input.lower() == "//":
multiline_input = not multiline_input
continue
elif user_input.lower() == "/" or user_input.lower() == "/help":
questionary.print("CLI commands", "bold")
for cmd, desc in USER_COMMANDS:
questionary.print(cmd, "bold")
questionary.print(f" {desc}")
continue
else:
print(f"Unrecognized command: {user_input}")
continue
else:
# If message did not begin with command prefix, pass inputs to MemGPT
# Handle user message and append to messages
user_message = system.package_user_message(user_input)
skip_next_user_input = False
def process_agent_step(user_message, no_verify):
new_messages, heartbeat_request, function_failed, token_warning, tokens_accumulated = memgpt_agent.step(
user_message,
first_message=False,
skip_verify=no_verify,
stream=stream,
)
skip_next_user_input = False
if token_warning:
user_message = system.get_token_limit_warning()
skip_next_user_input = True
elif function_failed:
user_message = system.get_heartbeat(FUNC_FAILED_HEARTBEAT_MESSAGE)
skip_next_user_input = True
elif heartbeat_request:
user_message = system.get_heartbeat(REQ_HEARTBEAT_MESSAGE)
skip_next_user_input = True
return new_messages, user_message, skip_next_user_input
while True:
try:
if strip_ui:
new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
break
else:
if stream:
# Don't display the "Thinking..." if streaming
new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
else:
with console.status("[bold cyan]Thinking...") as status:
new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
break
except KeyboardInterrupt:
print("User interrupt occurred.")
retry = questionary.confirm("Retry agent.step()?").ask()
if not retry:
break
except Exception as e:
print("An exception occurred when running agent.step(): ")
traceback.print_exc()
retry = questionary.confirm("Retry agent.step()?").ask()
if not retry:
break
counter += 1
print("Finished.")
USER_COMMANDS = [
("//", "toggle multiline input mode"),
("/exit", "exit the CLI"),
("/save", "save a checkpoint of the current agent/conversation state"),
("/load", "load a saved checkpoint"),
("/dump <count>", "view the last <count> messages (all if <count> is omitted)"),
("/memory", "print the current contents of agent memory"),
("/pop <count>", "undo <count> messages in the conversation (default is 3)"),
("/retry", "pops the last answer and tries to get another one"),
("/rethink <text>", "changes the inner thoughts of the last agent message"),
("/rewrite <text>", "changes the reply of the last agent message"),
("/heartbeat", "send a heartbeat system message to the agent"),
("/memorywarning", "send a memory warning system message to the agent"),
("/attach", "attach data source to agent"),
]
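The `/help` branch above prints these `(command, description)` pairs via `questionary.print`. A small self-contained sketch of the same rendering, returning a string instead of printing (the helper name and exact formatting are illustrative):

```python
# A trimmed copy of the command table above, for demonstration only
USER_COMMANDS = [
    ("//", "toggle multiline input mode"),
    ("/exit", "exit the CLI"),
]

def render_help(commands) -> str:
    # Mirrors the /help loop: header, then each command with an
    # indented description on the following line.
    lines = ["CLI commands"]
    for cmd, desc in commands:
        lines.append(cmd)
        lines.append(f"  {desc}")
    return "\n".join(lines)

help_text = render_help(USER_COMMANDS)
```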
import json
import os
import sys
import traceback
import questionary
import requests
import typer
from rich.console import Console
import memgpt.agent as agent
import memgpt.errors as errors
import memgpt.system as system
from memgpt.agent_store.storage import StorageConnector, TableType
# import benchmark
from memgpt.benchmark.benchmark import bench
from memgpt.cli.cli import (
delete_agent,
migrate,
open_folder,
quickstart,
run,
server,
version,
)
from memgpt.cli.cli_config import add, configure, delete, list
from memgpt.cli.cli_load import app as load_app
from memgpt.config import MemGPTConfig
from memgpt.constants import (
FUNC_FAILED_HEARTBEAT_MESSAGE,
JSON_ENSURE_ASCII,
JSON_LOADS_STRICT,
REQ_HEARTBEAT_MESSAGE,
)
from memgpt.metadata import MetadataStore
# from memgpt.interface import CLIInterface as interface # for printing to terminal
from memgpt.streaming_interface import AgentRefreshStreamingInterface
# interface = interface()
app = typer.Typer(pretty_exceptions_enable=False)
app.command(name="run")(run)
app.command(name="version")(version)
app.command(name="configure")(configure)
app.command(name="list")(list)
app.command(name="add")(add)
app.command(name="delete")(delete)
app.command(name="server")(server)
app.command(name="folder")(open_folder)
app.command(name="quickstart")(quickstart)
# load data commands
app.add_typer(load_app, name="load")
# migration command
app.command(name="migrate")(migrate)
# benchmark command
app.command(name="benchmark")(bench)
# delete agents
app.command(name="delete-agent")(delete_agent)
def clear_line(console, strip_ui=False):
if strip_ui:
return
if os.name == "nt": # for windows
console.print("\033[A\033[K", end="")
else: # for linux
sys.stdout.write("\033[2K\033[G")
sys.stdout.flush()
def run_agent_loop(
memgpt_agent: agent.Agent, config: MemGPTConfig, first, ms: MetadataStore, no_verify=False, cfg=None, strip_ui=False, stream=False
):
if isinstance(memgpt_agent.interface, AgentRefreshStreamingInterface):
# memgpt_agent.interface.toggle_streaming(on=stream)
if not stream:
memgpt_agent.interface = memgpt_agent.interface.nonstreaming_interface
if hasattr(memgpt_agent.interface, "console"):
console = memgpt_agent.interface.console
else:
console = Console()
counter = 0
user_input = None
skip_next_user_input = False
user_message = None
USER_GOES_FIRST = first
if not USER_GOES_FIRST:
console.input("[bold cyan]Hit enter to begin (will request first MemGPT message)[/bold cyan]\n")
clear_line(console, strip_ui=strip_ui)
print()
multiline_input = False
ms = MetadataStore(config)
while True:
if not skip_next_user_input and (counter > 0 or USER_GOES_FIRST):
# Ask for user input
if not stream:
print()
user_input = questionary.text(
"Enter your message:",
multiline=multiline_input,
qmark=">",
).ask()
clear_line(console, strip_ui=strip_ui)
if not stream:
print()
# Gracefully exit on Ctrl-C/D
if user_input is None:
user_input = "/exit"
user_input = user_input.rstrip()
if user_input.startswith("!"):
print("Commands for CLI begin with '/', not '!'")
continue
if user_input == "":
# no empty messages allowed
print("Empty input received. Try again!")
continue
# Handle CLI commands
# Commands to not get passed as input to MemGPT
if user_input.startswith("/"):
# updated agent save functions
if user_input.lower() == "/exit":
# memgpt_agent.save()
agent.save_agent(memgpt_agent, ms)
break
elif user_input.lower() == "/save" or user_input.lower() == "/savechat":
# memgpt_agent.save()
agent.save_agent(memgpt_agent, ms)
continue
elif user_input.lower() == "/attach":
# TODO: check if agent already has it
# TODO: check to ensure source embedding dimensions/model match the agent's, and disallow attachment if not
# TODO: alternatively, only list sources with compatible embeddings, and print warning about non-compatible sources
data_source_options = ms.list_sources(user_id=memgpt_agent.agent_state.user_id)
if len(data_source_options) == 0:
typer.secho(
'No sources available. You must load a source with "memgpt load ..." before running /attach.',
fg=typer.colors.RED,
bold=True,
)
continue
# determine what sources are valid to be attached to this agent
valid_options = []
invalid_options = []
for source in data_source_options:
if (
source.embedding_model == memgpt_agent.agent_state.embedding_config.embedding_model
and source.embedding_dim == memgpt_agent.agent_state.embedding_config.embedding_dim
):
valid_options.append(source.name)
else:
# print warning about invalid sources
typer.secho(
f"Source {source.name} exists but has embedding dimensions {source.embedding_dim} from model {source.embedding_model}, while the agent uses embedding dimensions {memgpt_agent.agent_state.embedding_config.embedding_dim} and model {memgpt_agent.agent_state.embedding_config.embedding_model}",
fg=typer.colors.YELLOW,
)
invalid_options.append(source.name)
# prompt user for data source selection
data_source = questionary.select("Select data source", choices=valid_options).ask()
# attach new data
# attach(memgpt_agent.agent_state.name, data_source)
source_connector = StorageConnector.get_storage_connector(
TableType.PASSAGES, config, user_id=memgpt_agent.agent_state.user_id
)
memgpt_agent.attach_source(data_source, source_connector, ms)
continue
elif user_input.lower() == "/dump" or user_input.lower().startswith("/dump "):
# Check if there's an additional argument that's an integer
command = user_input.strip().split()
amount = int(command[1]) if len(command) > 1 and command[1].isdigit() else 0
if amount == 0:
memgpt_agent.interface.print_messages(memgpt_agent._messages, dump=True)
else:
memgpt_agent.interface.print_messages(memgpt_agent._messages[-min(amount, len(memgpt_agent.messages)) :], dump=True)
continue
elif user_input.lower() == "/dumpraw":
memgpt_agent.interface.print_messages_raw(memgpt_agent._messages)
continue
elif user_input.lower() == "/memory":
print("\nDumping memory contents:\n")
print(str(memgpt_agent.memory))
print(str(memgpt_agent.persistence_manager.archival_memory))
print(str(memgpt_agent.persistence_manager.recall_memory))
continue
elif user_input.lower() == "/model":
if memgpt_agent.model == "gpt-4":
memgpt_agent.model = "gpt-3.5-turbo-16k"
elif memgpt_agent.model == "gpt-3.5-turbo-16k":
memgpt_agent.model = "gpt-4"
print(f"Updated model to:\n{str(memgpt_agent.model)}")
continue
elif user_input.lower() == "/pop" or user_input.lower().startswith("/pop "):
# Check if there's an additional argument that's an integer
command = user_input.strip().split()
pop_amount = int(command[1]) if len(command) > 1 and command[1].isdigit() else 3
n_messages = len(memgpt_agent._messages)
MIN_MESSAGES = 2
if n_messages <= MIN_MESSAGES:
print(f"Agent only has {n_messages} messages in stack, none left to pop")
elif n_messages - pop_amount < MIN_MESSAGES:
print(f"Agent only has {n_messages} messages in stack, cannot pop more than {n_messages - MIN_MESSAGES}")
else:
print(f"Popping last {pop_amount} messages from stack")
for _ in range(min(pop_amount, len(memgpt_agent._messages))):
# remove the message from the internal state of the agent
deleted_message = memgpt_agent._messages.pop()
# then also remove it from recall storage
memgpt_agent.persistence_manager.recall_memory.storage.delete(filters={"id": deleted_message.id})
continue
elif user_input.lower() == "/retry":
print("Retrying for another answer")
while len(memgpt_agent._messages) > 0:
if memgpt_agent._messages[-1].role == "user":
# we want to pop up to the last user message and send it again
user_message = memgpt_agent._messages[-1].text
deleted_message = memgpt_agent._messages.pop()
# then also remove it from recall storage
memgpt_agent.persistence_manager.recall_memory.storage.delete(filters={"id": deleted_message.id})
break
deleted_message = memgpt_agent._messages.pop()
# then also remove it from recall storage
memgpt_agent.persistence_manager.recall_memory.storage.delete(filters={"id": deleted_message.id})
elif user_input.lower() == "/rethink" or user_input.lower().startswith("/rethink "):
if len(user_input) < len("/rethink "):
print("Missing text after the command")
continue
for x in range(len(memgpt_agent.messages) - 1, 0, -1):
msg_obj = memgpt_agent._messages[x]
if msg_obj.role == "assistant":
clean_new_text = user_input[len("/rethink ") :].strip()
msg_obj.text = clean_new_text
# To persist to the database, all we need to do is "re-insert" into recall memory
memgpt_agent.persistence_manager.recall_memory.storage.update(record=msg_obj)
break
continue
elif user_input.lower() == "/rewrite" or user_input.lower().startswith("/rewrite "):
if len(user_input) < len("/rewrite "):
print("Missing text after the command")
continue
for x in range(len(memgpt_agent.messages) - 1, 0, -1):
if memgpt_agent.messages[x].get("role") == "assistant":
text = user_input[len("/rewrite ") :].strip()
# Get the current message content
# The rewrite target is the output of send_message
message_obj = memgpt_agent._messages[x]
if message_obj.tool_calls is not None and len(message_obj.tool_calls) > 0:
# Check that we hit an assistant send_message call
name_string = message_obj.tool_calls[0].function.get("name")
if name_string is None or name_string != "send_message":
print("Assistant missing send_message function call")
break # cancel op
args_string = message_obj.tool_calls[0].function.get("arguments")
if args_string is None:
print("Assistant missing send_message function arguments")
break # cancel op
args_json = json.loads(args_string, strict=JSON_LOADS_STRICT)
if "message" not in args_json:
print("Assistant missing send_message message argument")
break # cancel op
# Once we found our target, rewrite it
args_json["message"] = text
new_args_string = json.dumps(args_json, ensure_ascii=JSON_ENSURE_ASCII)
message_obj.tool_calls[0].function["arguments"] = new_args_string
# To persist to the database, all we need to do is "re-insert" into recall memory
memgpt_agent.persistence_manager.recall_memory.storage.update(record=message_obj)
break
continue
elif user_input.lower() == "/summarize":
try:
memgpt_agent.summarize_messages_inplace()
typer.secho(
"/summarize succeeded",
fg=typer.colors.GREEN,
bold=True,
)
except (errors.LLMError, requests.exceptions.HTTPError) as e:
typer.secho(
f"/summarize failed:\n{e}",
fg=typer.colors.RED,
bold=True,
)
continue
elif user_input.lower().startswith("/add_function"):
try:
if len(user_input) < len("/add_function "):
print("Missing function name after the command")
continue
function_name = user_input[len("/add_function ") :].strip()
result = memgpt_agent.add_function(function_name)
typer.secho(
f"/add_function succeeded: {result}",
fg=typer.colors.GREEN,
bold=True,
)
except ValueError as e:
typer.secho(
f"/add_function failed:\n{e}",
fg=typer.colors.RED,
bold=True,
)
continue
elif user_input.lower().startswith("/remove_function"):
try:
if len(user_input) < len("/remove_function "):
print("Missing function name after the command")
continue
function_name = user_input[len("/remove_function ") :].strip()
result = memgpt_agent.remove_function(function_name)
typer.secho(
f"/remove_function succeeded: {result}",
fg=typer.colors.GREEN,
bold=True,
)
except ValueError as e:
typer.secho(
f"/remove_function failed:\n{e}",
fg=typer.colors.RED,
bold=True,
)
continue
# No skip options
elif user_input.lower() == "/wipe":
memgpt_agent = agent.Agent(memgpt_agent.interface)
user_message = None
elif user_input.lower() == "/heartbeat":
user_message = system.get_heartbeat()
elif user_input.lower() == "/memorywarning":
user_message = system.get_token_limit_warning()
elif user_input.lower() == "//":
multiline_input = not multiline_input
continue
elif user_input.lower() == "/" or user_input.lower() == "/help":
questionary.print("CLI commands", "bold")
for cmd, desc in USER_COMMANDS:
questionary.print(cmd, "bold")
questionary.print(f" {desc}")
continue
else:
print(f"Unrecognized command: {user_input}")
continue
else:
# If message did not begin with command prefix, pass inputs to MemGPT
# Handle user message and append to messages
user_message = system.package_user_message(user_input)
skip_next_user_input = False
def process_agent_step(user_message, no_verify):
new_messages, heartbeat_request, function_failed, token_warning, tokens_accumulated = memgpt_agent.step(
user_message,
first_message=False,
skip_verify=no_verify,
stream=stream,
)
skip_next_user_input = False
if token_warning:
user_message = system.get_token_limit_warning()
skip_next_user_input = True
elif function_failed:
user_message = system.get_heartbeat(FUNC_FAILED_HEARTBEAT_MESSAGE)
skip_next_user_input = True
elif heartbeat_request:
user_message = system.get_heartbeat(REQ_HEARTBEAT_MESSAGE)
skip_next_user_input = True
return new_messages, user_message, skip_next_user_input
while True:
try:
if strip_ui:
new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
break
else:
if stream:
# Don't display the "Thinking..." if streaming
new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
else:
with console.status("[bold cyan]Thinking...") as status:
new_messages, user_message, skip_next_user_input = process_agent_step(user_message, no_verify)
break
except KeyboardInterrupt:
print("User interrupt occurred.")
retry = questionary.confirm("Retry agent.step()?").ask()
if not retry:
break
except Exception as e:
print("An exception occurred when running agent.step(): ")
traceback.print_exc()
retry = questionary.confirm("Retry agent.step()?").ask()
if not retry:
break
counter += 1
print("Finished.")
USER_COMMANDS = [
("//", "toggle multiline input mode"),
("/exit", "exit the CLI"),
("/save", "save a checkpoint of the current agent/conversation state"),
("/load", "load a saved checkpoint"),
("/dump <count>", "view the last <count> messages (all if <count> is omitted)"),
("/memory", "print the current contents of agent memory"),
("/pop <count>", "undo <count> messages in the conversation (default is 3)"),
("/retry", "pops the last answer and tries to get another one"),
("/rethink <text>", "changes the inner thoughts of the last agent message"),
("/rewrite <text>", "changes the reply of the last agent message"),
("/heartbeat", "send a heartbeat system message to the agent"),
("/memorywarning", "send a memory warning system message to the agent"),
("/attach", "attach data source to agent"),
]
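The `/rewrite` handler above patches the `message` field inside a tool call's JSON-encoded arguments string; the round-trip it relies on is just the stdlib `json` module. A sketch with a made-up arguments payload standing in for `message_obj.tool_calls[0].function["arguments"]`:

```python
import json

# stand-in for the tool call's arguments string
args_string = '{"message": "old reply", "request_heartbeat": false}'

args_json = json.loads(args_string)
args_json["message"] = "new reply"  # overwrite only the message field
new_args_string = json.dumps(args_json, ensure_ascii=False)

print(new_args_string)  # → {"message": "new reply", "request_heartbeat": false}
```

Writing `new_args_string` back into the tool call and re-inserting the record into recall storage is all the persistence step requires.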


@@ -1,155 +1,155 @@
from abc import ABC, abstractmethod
from datetime import datetime
from typing import List
from memgpt.data_types import AgentState, Message
from memgpt.memory import BaseRecallMemory, EmbeddingArchivalMemory
from memgpt.utils import printd
def parse_formatted_time(formatted_time: str):
# parse times returned by memgpt.utils.get_formatted_time()
try:
return datetime.strptime(formatted_time.strip(), "%Y-%m-%d %I:%M:%S %p %Z%z")
except ValueError:
return datetime.strptime(formatted_time.strip(), "%Y-%m-%d %I:%M:%S %p")
class PersistenceManager(ABC):
@abstractmethod
def trim_messages(self, num):
pass
@abstractmethod
def prepend_to_messages(self, added_messages):
pass
@abstractmethod
def append_to_messages(self, added_messages):
pass
@abstractmethod
def swap_system_message(self, new_system_message):
pass
@abstractmethod
def update_memory(self, new_memory):
pass
class LocalStateManager(PersistenceManager):
"""In-memory state manager has nothing to manage, all agents are held in-memory"""
recall_memory_cls = BaseRecallMemory
archival_memory_cls = EmbeddingArchivalMemory
def __init__(self, agent_state: AgentState):
# Memory held in-state useful for debugging stateful versions
self.memory = None
# self.messages = [] # current in-context messages
# self.all_messages = [] # all messages seen in current session (needed if lazily synchronizing state with DB)
self.archival_memory = EmbeddingArchivalMemory(agent_state)
self.recall_memory = BaseRecallMemory(agent_state)
# self.agent_state = agent_state
def save(self):
"""Ensure storage connectors save data"""
self.archival_memory.save()
self.recall_memory.save()
def init(self, agent):
"""Connect persistent state manager to agent"""
printd(f"Initializing {self.__class__.__name__} with agent object")
# self.all_messages = [{"timestamp": get_local_time(), "message": msg} for msg in agent.messages.copy()]
# self.messages = [{"timestamp": get_local_time(), "message": msg} for msg in agent.messages.copy()]
self.memory = agent.memory
# printd(f"{self.__class__.__name__}.all_messages.len = {len(self.all_messages)}")
printd(f"{self.__class__.__name__}.messages.len = {len(self.messages)}")
'''
def json_to_message(self, message_json) -> Message:
"""Convert agent message JSON into Message object"""
# get message
if "message" in message_json:
message = message_json["message"]
else:
message = message_json
# get timestamp
if "timestamp" in message_json:
timestamp = parse_formatted_time(message_json["timestamp"])
else:
timestamp = get_local_time()
# TODO: change this when we fully migrate to tool calls API
if "function_call" in message:
tool_calls = [
ToolCall(
id=message["tool_call_id"],
tool_call_type="function",
function={
"name": message["function_call"]["name"],
"arguments": message["function_call"]["arguments"],
},
)
]
printd(f"Saving tool calls {[vars(tc) for tc in tool_calls]}")
else:
tool_calls = None
# if message["role"] == "function":
# message["role"] = "tool"
return Message(
user_id=self.agent_state.user_id,
agent_id=self.agent_state.id,
role=message["role"],
text=message["content"],
name=message["name"] if "name" in message else None,
model=self.agent_state.llm_config.model,
created_at=timestamp,
tool_calls=tool_calls,
tool_call_id=message["tool_call_id"] if "tool_call_id" in message else None,
id=message["id"] if "id" in message else None,
)
'''
def trim_messages(self, num):
# printd(f"InMemoryStateManager.trim_messages")
# self.messages = [self.messages[0]] + self.messages[num:]
pass
def prepend_to_messages(self, added_messages: List[Message]):
# first tag with timestamps
# added_messages = [{"timestamp": get_local_time(), "message": msg} for msg in added_messages]
printd(f"{self.__class__.__name__}.prepend_to_message")
# self.messages = [self.messages[0]] + added_messages + self.messages[1:]
# add to recall memory
self.recall_memory.insert_many([m for m in added_messages])
def append_to_messages(self, added_messages: List[Message]):
# first tag with timestamps
# added_messages = [{"timestamp": get_local_time(), "message": msg} for msg in added_messages]
printd(f"{self.__class__.__name__}.append_to_messages")
# self.messages = self.messages + added_messages
# add to recall memory
self.recall_memory.insert_many([m for m in added_messages])
def swap_system_message(self, new_system_message: Message):
# first tag with timestamps
# new_system_message = {"timestamp": get_local_time(), "message": new_system_message}
printd(f"{self.__class__.__name__}.swap_system_message")
# self.messages[0] = new_system_message
# add to recall memory
self.recall_memory.insert(new_system_message)
def update_memory(self, new_memory):
printd(f"{self.__class__.__name__}.update_memory")
self.memory = new_memory
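`parse_formatted_time` above tries a timezone-aware format first and falls back to a naive one; the fallback path is a single stdlib `datetime.strptime` call (the timestamp value here is illustrative):

```python
from datetime import datetime

formatted_time = "2024-07-04 02:45:35 PM"  # naive form, no %Z%z suffix

# same format string as the except branch of parse_formatted_time
parsed = datetime.strptime(formatted_time.strip(), "%Y-%m-%d %I:%M:%S %p")
print(parsed)  # → 2024-07-04 14:45:35
```

`%I`/`%p` handle the 12-hour clock, so the parsed result comes back in 24-hour form.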


@@ -1,91 +1,91 @@
import importlib
import inspect
import os
import uuid
from memgpt.data_types import AgentState, Preset
from memgpt.functions.functions import load_function_set
from memgpt.interface import AgentInterface
from memgpt.metadata import MetadataStore
from memgpt.models.pydantic_models import HumanModel, PersonaModel, ToolModel
from memgpt.presets.utils import load_all_presets
from memgpt.utils import list_human_files, list_persona_files, printd
available_presets = load_all_presets()
preset_options = list(available_presets.keys())
def load_module_tools(module_name="base"):
# return List[ToolModel] from base.py tools
full_module_name = f"memgpt.functions.function_sets.{module_name}"
try:
module = importlib.import_module(full_module_name)
except Exception as e:
# Handle other general exceptions
raise e
# function tags
try:
# Load the function set
functions_to_schema = load_function_set(module)
except ValueError as e:
err = f"Error loading function set '{module_name}': {e}"
printd(err)
raise e  # functions_to_schema would otherwise be unbound below
# create tool in db
tools = []
for name, schema in functions_to_schema.items():
# print([str(inspect.getsource(line)) for line in schema["imports"]])
source_code = inspect.getsource(schema["python_function"])
tags = [module_name]
if module_name == "base":
tags.append("memgpt-base")
tools.append(
ToolModel(
name=name,
tags=tags,
source_type="python",
module=schema["module"],
source_code=source_code,
json_schema=schema["json_schema"],
)
)
return tools
def add_default_tools(user_id: uuid.UUID, ms: MetadataStore):
module_name = "base"
for tool in load_module_tools(module_name=module_name):
existing_tool = ms.get_tool(tool.name)
if not existing_tool:
ms.add_tool(tool)
def add_default_humans_and_personas(user_id: uuid.UUID, ms: MetadataStore):
for persona_file in list_persona_files():
text = open(persona_file, "r", encoding="utf-8").read()
name = os.path.basename(persona_file).replace(".txt", "")
if ms.get_persona(user_id=user_id, name=name) is not None:
printd(f"Persona '{name}' already exists for user '{user_id}'")
continue
persona = PersonaModel(name=name, text=text, user_id=user_id)
ms.add_persona(persona)
for human_file in list_human_files():
text = open(human_file, "r", encoding="utf-8").read()
name = os.path.basename(human_file).replace(".txt", "")
if ms.get_human(user_id=user_id, name=name) is not None:
printd(f"Human '{name}' already exists for user '{user_id}'")
continue
human = HumanModel(name=name, text=text, user_id=user_id)
print(human, user_id)
ms.add_human(human)
# def create_agent_from_preset(preset_name, agent_config, model, persona, human, interface, persistence_manager):
def create_agent_from_preset(
agent_state: AgentState, preset: Preset, interface: AgentInterface, persona_is_file: bool = True, human_is_file: bool = True
):
"""Initialize a new agent from a preset (combination of system + function)"""
raise DeprecationWarning("Function no longer supported - pass a Preset object to Agent.__init__ instead")
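`load_module_tools` above imports a function-set module by dotted name and stores each function's source text via `inspect.getsource`. The same two stdlib calls in isolation, using `json.loads` as a conveniently importable target in place of a function-set module:

```python
import importlib
import inspect

# dynamic import by dotted name, as with memgpt.functions.function_sets.base
module = importlib.import_module("json")

# recover a function's source text, as stored in ToolModel.source_code
source_code = inspect.getsource(module.loads)
print(source_code.splitlines()[0])  # first line of the definition
```

`inspect.getsource` only works for objects backed by a Python source file, which holds for the function-set modules above but not for C-implemented builtins.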


@@ -1,312 +1,312 @@
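The `FUNCTIONS_CHAINING` table that follows registers each tool as a plain JSON-Schema dictionary. A minimal sketch of checking a tool call's arguments against such a schema using only the stdlib (the schema below is a trimmed copy of the `send_message` entry; the helper name is made up):

```python
import json

# trimmed copy of the send_message entry from FUNCTIONS_CHAINING
send_message_schema = {
    "name": "send_message",
    "parameters": {
        "type": "object",
        "properties": {"message": {"type": "string"}},
        "required": ["message"],
    },
}

def missing_required(schema: dict, args_string: str) -> list:
    """Names of required parameters absent from a JSON-encoded arguments string."""
    args = json.loads(args_string)
    return [key for key in schema["parameters"]["required"] if key not in args]

print(missing_required(send_message_schema, '{"message": "hi"}'))  # → []
print(missing_required(send_message_schema, "{}"))                 # → ['message']
```

A full JSON-Schema validator would also check types, but presence of required keys is the failure mode the CLI's `/rewrite` handler guards against.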
from ..constants import FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT, MAX_PAUSE_HEARTBEATS
# FUNCTIONS_PROMPT_MULTISTEP_NO_HEARTBEATS = FUNCTIONS_PROMPT_MULTISTEP[:-1]
FUNCTIONS_CHAINING = {
"send_message": {
"name": "send_message",
"description": "Sends a message to the human user.",
"parameters": {
"type": "object",
"properties": {
# https://json-schema.org/understanding-json-schema/reference/array.html
"message": {
"type": "string",
"description": "Message contents. All unicode (including emojis) are supported.",
},
},
"required": ["message"],
},
},
"pause_heartbeats": {
"name": "pause_heartbeats",
"description": "Temporarily ignore timed heartbeats. You may still receive messages from manual heartbeats and other events.",
"parameters": {
"type": "object",
"properties": {
# https://json-schema.org/understanding-json-schema/reference/array.html
"minutes": {
"type": "integer",
"description": f"Number of minutes to ignore heartbeats for. Max value of {MAX_PAUSE_HEARTBEATS} minutes ({MAX_PAUSE_HEARTBEATS//60} hours).",
},
},
"required": ["minutes"],
},
},
"message_chatgpt": {
"name": "message_chatgpt",
"description": "Send a message to a more basic AI, ChatGPT. A useful resource for asking questions. ChatGPT does not retain memory of previous interactions.",
"parameters": {
"type": "object",
"properties": {
# https://json-schema.org/understanding-json-schema/reference/array.html
"message": {
"type": "string",
"description": "Message to send ChatGPT. Phrase your message as a full English sentence.",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["message", "request_heartbeat"],
},
},
"core_memory_append": {
"name": "core_memory_append",
"description": "Append to the contents of core memory.",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Section of the memory to be edited (persona or human).",
},
"content": {
"type": "string",
"description": "Content to write to the memory. All unicode (including emojis) are supported.",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["name", "content", "request_heartbeat"],
},
},
"core_memory_replace": {
"name": "core_memory_replace",
"description": "Replace the contents of core memory. To delete memories, use an empty string for new_content.",
"parameters": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Section of the memory to be edited (persona or human).",
},
"old_content": {
"type": "string",
"description": "String to replace. Must be an exact match.",
},
"new_content": {
"type": "string",
"description": "Content to write to the memory. All unicode (including emojis) are supported.",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["name", "old_content", "new_content", "request_heartbeat"],
},
},
"recall_memory_search": {
"name": "recall_memory_search",
"description": "Search prior conversation history using a string.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "String to search for.",
},
"page": {
"type": "integer",
"description": "Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["query", "page", "request_heartbeat"],
},
},
"conversation_search": {
"name": "conversation_search",
"description": "Search prior conversation history using case-insensitive string matching.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "String to search for.",
},
"page": {
"type": "integer",
"description": "Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["query", "request_heartbeat"],
},
},
"recall_memory_search_date": {
"name": "recall_memory_search_date",
"description": "Search prior conversation history using a date range.",
"parameters": {
"type": "object",
"properties": {
"start_date": {
"type": "string",
"description": "The start of the date range to search, in the format 'YYYY-MM-DD'.",
},
"end_date": {
"type": "string",
"description": "The end of the date range to search, in the format 'YYYY-MM-DD'.",
},
"page": {
"type": "integer",
"description": "Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["start_date", "end_date", "page", "request_heartbeat"],
},
},
"conversation_search_date": {
"name": "conversation_search_date",
"description": "Search prior conversation history using a date range.",
"parameters": {
"type": "object",
"properties": {
"start_date": {
"type": "string",
"description": "The start of the date range to search, in the format 'YYYY-MM-DD'.",
},
"end_date": {
"type": "string",
"description": "The end of the date range to search, in the format 'YYYY-MM-DD'.",
},
"page": {
"type": "integer",
"description": "Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["start_date", "end_date", "request_heartbeat"],
},
},
"archival_memory_insert": {
"name": "archival_memory_insert",
"description": "Add to archival memory. Make sure to phrase the memory contents such that it can be easily queried later.",
"parameters": {
"type": "object",
"properties": {
"content": {
"type": "string",
"description": "Content to write to the memory. All unicode (including emojis) are supported.",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["content", "request_heartbeat"],
},
},
"archival_memory_search": {
"name": "archival_memory_search",
"description": "Search archival memory using semantic (embedding-based) search.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "String to search for.",
},
"page": {
"type": "integer",
"description": "Allows you to page through results. Only use on a follow-up query. Defaults to 0 (first page).",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["query", "request_heartbeat"],
},
},
"read_from_text_file": {
"name": "read_from_text_file",
"description": "Read lines from a text file.",
"parameters": {
"type": "object",
"properties": {
"filename": {
"type": "string",
"description": "The name of the file to read.",
},
"line_start": {
"type": "integer",
"description": "Line to start reading from.",
},
"num_lines": {
"type": "integer",
"description": "How many lines to read (defaults to 1).",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["filename", "line_start", "request_heartbeat"],
},
},
"append_to_text_file": {
"name": "append_to_text_file",
"description": "Append to a text file.",
"parameters": {
"type": "object",
"properties": {
"filename": {
"type": "string",
"description": "The name of the file to append to.",
},
"content": {
"type": "string",
"description": "Content to append to the file.",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["filename", "content", "request_heartbeat"],
},
},
"http_request": {
"name": "http_request",
"description": "Generates an HTTP request and returns the response.",
"parameters": {
"type": "object",
"properties": {
"method": {
"type": "string",
"description": "The HTTP method (e.g., 'GET', 'POST').",
},
"url": {
"type": "string",
"description": "The URL for the request.",
},
"payload_json": {
"type": "string",
"description": "A JSON string representing the request payload.",
},
"request_heartbeat": {
"type": "boolean",
"description": FUNCTION_PARAM_DESCRIPTION_REQ_HEARTBEAT,
},
},
"required": ["method", "url", "request_heartbeat"],
},
},
}
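Each entry in the dictionary above is a bare JSON-schema function definition. A minimal sketch (assuming the OpenAI-style tool-calling envelope, not this project's actual wiring) of how such an entry could be wrapped for a `tools` parameter:

```python
# Hypothetical illustration: wrap one FUNCTIONS_CHAINING-style schema entry
# (copied from above, trimmed) in the {"type": "function", ...} tool envelope.
schema = {
    "name": "http_request",
    "description": "Generates an HTTP request and returns the response.",
    "parameters": {
        "type": "object",
        "properties": {
            "method": {"type": "string", "description": "The HTTP method (e.g., 'GET', 'POST')."},
            "url": {"type": "string", "description": "The URL for the request."},
        },
        "required": ["method", "url"],
    },
}

def to_tool(entry: dict) -> dict:
    # Wrap a bare function schema in the tool-call envelope.
    return {"type": "function", "function": entry}

tools = [to_tool(schema)]
print(tools[0]["function"]["name"])  # → http_request
```

The same wrapper applies uniformly to every entry, which is why the dictionary stores only the bare schemas.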
WORD_LIMIT = 100
SYSTEM = f"""
Your job is to summarize a history of previous messages in a conversation between an AI persona and a human.
The conversation you are given is from a fixed context window and may not be complete.
Messages sent by the AI are marked with the 'assistant' role.
The AI 'assistant' can also make calls to functions, whose outputs can be seen in messages with the 'function' role.
Things the AI says in the message content are considered inner monologue and are not seen by the user.
The only AI messages seen by the user are from when the AI uses 'send_message'.
Messages the user sends are in the 'user' role.
The 'user' role is also used for important system events, such as login events and heartbeat events (heartbeats run the AI's program without user action, allowing the AI to act without prompting from the user sending them a message).
Summarize what happened in the conversation from the perspective of the AI (use the first person).
Keep your summary less than {WORD_LIMIT} words, do NOT exceed this word limit.
Only output the summary, do NOT include anything else in your output.
"""
import os
from memgpt.constants import MEMGPT_DIR
def get_system_text(key):
filename = f"{key}.txt"
file_path = os.path.join(os.path.dirname(__file__), "system", filename)
# first look in prompts/system/*.txt
if os.path.exists(file_path):
with open(file_path, "r", encoding="utf-8") as file:
return file.read().strip()
else:
# try looking in ~/.memgpt/system_prompts/*.txt
user_system_prompts_dir = os.path.join(MEMGPT_DIR, "system_prompts")
# create directory if it doesn't exist
if not os.path.exists(user_system_prompts_dir):
os.makedirs(user_system_prompts_dir)
# look inside for a matching system prompt
file_path = os.path.join(user_system_prompts_dir, filename)
if os.path.exists(file_path):
with open(file_path, "r", encoding="utf-8") as file:
return file.read().strip()
else:
raise FileNotFoundError(f"No file found for key {key}, path={file_path}")
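The lookup above checks a packaged `system/` directory first, then falls back to a user directory that it creates on demand. A self-contained sketch of the same two-step pattern (the temporary paths here are stand-ins, not MemGPT's real locations):

```python
# Illustration of the packaged-dir-then-user-dir lookup, using throwaway
# temporary directories in place of prompts/system and ~/.memgpt/system_prompts.
import os
import tempfile

def get_text(key: str, builtin_dir: str, user_dir: str) -> str:
    filename = f"{key}.txt"
    path = os.path.join(builtin_dir, filename)
    if not os.path.exists(path):
        # Fall back to the user directory, creating it if needed.
        os.makedirs(user_dir, exist_ok=True)
        path = os.path.join(user_dir, filename)
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return f.read().strip()
    raise FileNotFoundError(f"No file found for key {key}, path={path}")

with tempfile.TemporaryDirectory() as builtin_dir, tempfile.TemporaryDirectory() as user_dir:
    # Only the user directory contains the prompt, so the fallback is exercised.
    with open(os.path.join(user_dir, "custom.txt"), "w", encoding="utf-8") as f:
        f.write("You are a helpful agent.\n")
    prompt = get_text("custom", builtin_dir, user_dir)
print(prompt)  # → You are a helpful agent.
```

This ordering lets users override or extend the built-in system prompts without touching the package itself.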
import json
import uuid
from typing import Optional
from .constants import (
INITIAL_BOOT_MESSAGE,
INITIAL_BOOT_MESSAGE_SEND_MESSAGE_FIRST_MSG,
INITIAL_BOOT_MESSAGE_SEND_MESSAGE_THOUGHT,
JSON_ENSURE_ASCII,
MESSAGE_SUMMARY_WARNING_STR,
)
from .utils import get_local_time
def get_initial_boot_messages(version="startup"):
if version == "startup":
initial_boot_message = INITIAL_BOOT_MESSAGE
messages = [
{"role": "assistant", "content": initial_boot_message},
]
elif version == "startup_with_send_message":
tool_call_id = str(uuid.uuid4())
messages = [
# first message includes both inner monologue and function call to send_message
{
"role": "assistant",
"content": INITIAL_BOOT_MESSAGE_SEND_MESSAGE_THOUGHT,
# "function_call": {
# "name": "send_message",
# "arguments": '{\n "message": "' + f"{INITIAL_BOOT_MESSAGE_SEND_MESSAGE_FIRST_MSG}" + '"\n}',
# },
"tool_calls": [
{
"id": tool_call_id,
"type": "function",
"function": {
"name": "send_message",
"arguments": '{\n "message": "' + f"{INITIAL_BOOT_MESSAGE_SEND_MESSAGE_FIRST_MSG}" + '"\n}',
},
}
],
},
# obligatory function return message
{
# "role": "function",
"role": "tool",
"name": "send_message", # NOTE: technically not up to spec, this is old functions style
"content": package_function_response(True, None),
"tool_call_id": tool_call_id,
},
]
elif version == "startup_with_send_message_gpt35":
tool_call_id = str(uuid.uuid4())
messages = [
# first message includes both inner monologue and function call to send_message
{
"role": "assistant",
"content": "*inner thoughts* Still waiting on the user. Sending a message with function.",
# "function_call": {"name": "send_message", "arguments": '{\n "message": "' + f"Hi, is anyone there?" + '"\n}'},
"tool_calls": [
{
"id": tool_call_id,
"type": "function",
"function": {
"name": "send_message",
"arguments": '{\n "message": "' + f"Hi, is anyone there?" + '"\n}',
},
}
],
},
# obligatory function return message
{
# "role": "function",
"role": "tool",
"name": "send_message",
"content": package_function_response(True, None),
"tool_call_id": tool_call_id,
},
]
else:
raise ValueError(version)
return messages
def get_heartbeat(reason="Automated timer", include_location=False, location_name="San Francisco, CA, USA"):
# Package the message with time and location
formatted_time = get_local_time()
packaged_message = {
"type": "heartbeat",
"reason": reason,
"time": formatted_time,
}
if include_location:
packaged_message["location"] = location_name
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def get_login_event(last_login="Never (first login)", include_location=False, location_name="San Francisco, CA, USA"):
# Package the message with time and location
formatted_time = get_local_time()
packaged_message = {
"type": "login",
"last_login": last_login,
"time": formatted_time,
}
if include_location:
packaged_message["location"] = location_name
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def package_user_message(
user_message: str,
time: Optional[str] = None,
include_location: bool = False,
location_name: Optional[str] = "San Francisco, CA, USA",
name: Optional[str] = None,
):
# Package the message with time and location
formatted_time = time if time else get_local_time()
packaged_message = {
"type": "user_message",
"message": user_message,
"time": formatted_time,
}
if include_location:
packaged_message["location"] = location_name
if name:
packaged_message["name"] = name
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def package_function_response(was_success, response_string, timestamp=None):
formatted_time = get_local_time() if timestamp is None else timestamp
packaged_message = {
"status": "OK" if was_success else "Failed",
"message": response_string,
"time": formatted_time,
}
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def package_system_message(system_message, message_type="system_alert", time=None):
formatted_time = time if time else get_local_time()
packaged_message = {
"type": message_type,
"message": system_message,
"time": formatted_time,
}
    return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def package_summarize_message(summary, summary_length, hidden_message_count, total_message_count, timestamp=None):
context_message = (
f"Note: prior messages ({hidden_message_count} of {total_message_count} total messages) have been hidden from view due to conversation memory constraints.\n"
+ f"The following is a summary of the previous {summary_length} messages:\n {summary}"
)
formatted_time = get_local_time() if timestamp is None else timestamp
packaged_message = {
"type": "system_alert",
"message": context_message,
"time": formatted_time,
}
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def package_summarize_message_no_summary(hidden_message_count, timestamp=None, message=None):
"""Add useful metadata to the summary message"""
# Package the message with time and location
formatted_time = get_local_time() if timestamp is None else timestamp
context_message = (
message
if message
else f"Note: {hidden_message_count} prior messages with the user have been hidden from view due to conversation memory constraints. Older messages are stored in Recall Memory and can be viewed using functions."
)
packaged_message = {
"type": "system_alert",
"message": context_message,
"time": formatted_time,
}
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
def get_token_limit_warning():
formatted_time = get_local_time()
packaged_message = {
"type": "system_alert",
"message": MESSAGE_SUMMARY_WARNING_STR,
"time": formatted_time,
}
return json.dumps(packaged_message, ensure_ascii=JSON_ENSURE_ASCII)
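All of the packaging helpers above follow one pattern: build a small dict with a type/status, a payload, and a timestamp, then serialize it to JSON. A minimal sketch (timestamp passed in explicitly for reproducibility; the real code calls `get_local_time()`):

```python
# Sketch of the message-packaging pattern: status + payload + timestamp,
# serialized with json.dumps. ensure_ascii=False mirrors the JSON_ENSURE_ASCII
# setting so unicode survives serialization unescaped.
import json

def package_function_response(was_success: bool, response_string, timestamp: str) -> str:
    packaged = {
        "status": "OK" if was_success else "Failed",
        "message": response_string,
        "time": timestamp,
    }
    return json.dumps(packaged, ensure_ascii=False)

result = package_function_response(True, None, "2024-07-04 02:25:00 PM PDT-0700")
print(result)
```

Keeping every inbound message in this uniform JSON shape is what lets the agent distinguish heartbeats, login events, user messages, and function results by their `type`/`status` fields alone.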