diff --git a/assets/img/memgpt-system-diagram.png b/assets/img/memgpt-system-diagram.png index 6a49d91f..6bff4729 100644 Binary files a/assets/img/memgpt-system-diagram.png and b/assets/img/memgpt-system-diagram.png differ diff --git a/index.html b/index.html index f5ebd4ee..bfcba79f 100644 --- a/index.html +++ b/index.html @@ -36,7 +36,7 @@ Vivian Fang, Shishir G. Patil,
Kevin Lin, - Sarah Wooders, + Sarah Wooders, Joseph E. Gonzalez @@ -53,7 +53,7 @@

-
+

Teach LLMs to manage their own memory for unbounded context!

@@ -67,12 +67,12 @@
MemGPT system overview - In MemGPT, a fixed-context LLM is augmented with a tiered memory system and a set of functions that allow it to manage its own memory. - The LLM inputs text in it's main context (capped at the size of the standard LLM context window), and outputs text that - is interpreted by a parser, resulting either in a yield or a function call. MemGPT uses functions to move data between main - context and external context. When the LLM generates a function call, it can request immediate return of - execution to chain together functions. In the case of a yield, the LLM will not be run again until the next external - event trigger (e.g. a user message or scheduled interrupt). + + In MemGPT, a fixed-context LLM processor is augmented with a tiered memory system and a set of functions that allow it to manage its own memory. + Main context is the (fixed-length) LLM input. + MemGPT parses the LLM text outputs at each processing cycle, and either yields control or executes a function call, which can be used to move data between main and external context. + When the LLM generates a function call, it can request immediate return of execution to chain together functions. + In the case of a yield, the LLM will not be run again until the next external event trigger (e.g. a user message or scheduled interrupt).
@@ -85,10 +85,9 @@

Overview

    -
  • LLMs are increasingly being used for perpetual chats.
  • -
  • Challenge: Limited context length makes perpetual chat challenging.
  • -
  • Inspired by memory management in OS, MemGPT uses RAM (in-context), and Disk (Archival) memory to effeciently manage - the limited context length.
  • +
  • LLMs are increasingly being used for perpetual chats
  • +
  • Limited context lengths make perpetual chat challenging
  • +
  • MemGPT manages a virtual context (inspired by virtual memory in operating systems) to create unbounded LLM context
  • With MemGPT, we demonstrate that LLMs can be taught to manage their own memory!
@@ -101,20 +100,7 @@

Abstract

- Large language models (LLMs) have revolutionized AI but are constrained by - limited context windows, hindering their utility in tasks like extended conversa- - tions and document analysis. Drawing inspiration from the hierarchical memory - systems in traditional operating systems, we introduce MemGPT (Memory-GPT). - Similar to how an operating system (OS) provides the illusion of large memory - resources through data movement between fast and slow memory, MemGPT man- - ages tiers of memory to effectively manage extended context within the language - model's limited context window, and utilizes interrupts to manage control flow - between itself and the user. We evaluate our OS-inspired design in two domains - where the limited context windows of modern LLMs severely handicaps their per- - formance: document analysis, where MemGPT is able to analyze large documents - that far exceed the underlying LLM's context limit, and multi-session chat, where - MemGPT enables conversational agents to remember, reflect, and evolve dynam- - ically through long-term interactions with their users. Code and Data is available at https://memgpt.ai + Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows, hindering their utility in tasks like extended conversations and document analysis. To enable using context beyond limited context windows, we propose virtual context management, a technique drawing inspiration from hierarchical memory systems in traditional operating systems that provide the appearance of large memory resources through data movement between fast and slow memory. Using this technique, we introduce MemGPT (Memory-GPT), a system that intelligently manages different memory tiers in order to effectively provide extended context within the LLM's limited context window, and utilizes interrupts to manage control flow between itself and the user. 
We evaluate our OS-inspired design in two domains where the limited context windows of modern LLMs severely handicap their performance: document analysis, where MemGPT is able to analyze large documents that far exceed the underlying LLM's context window, and multi-session chat, where MemGPT can create conversational agents that remember, reflect, and evolve dynamically through long-term interactions with their users. We release MemGPT code and data for our experiments at https://memgpt.ai.
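The processing cycle the updated figure caption describes (parse each LLM output, then either yield control or execute a function call, optionally chaining via an immediate-return request) can be sketched roughly as below. All names here — `process_cycle`, `FunctionCall`, `request_heartbeat`, and the toy `CALL ... chain` output format — are illustrative assumptions for this sketch, not MemGPT's actual implementation.

```python
# Rough sketch of the MemGPT processing cycle: one external event triggers
# a cycle; the processor loops only while the LLM requests chained calls.
from dataclasses import dataclass


@dataclass
class FunctionCall:
    name: str
    args: dict
    request_heartbeat: bool  # True = request immediate return of execution (chaining)


@dataclass
class Yield:
    pass  # processor pauses until the next external event trigger


def parse_output(text):
    """Toy parser: 'CALL <fn> <arg>[ chain]' becomes a FunctionCall; anything else yields."""
    if text.startswith("CALL "):
        _, name, arg = text.split(" ", 2)
        chained = arg.endswith(" chain")
        return FunctionCall(name=name,
                            args={"value": arg.removesuffix(" chain")},
                            request_heartbeat=chained)
    return Yield()


def process_cycle(llm, main_context, functions):
    """Run the LLM on main context, act on its output, and loop only on heartbeat requests."""
    while True:
        output = parse_output(llm(main_context))
        if isinstance(output, Yield):
            return  # wait for next external event (e.g. user message, scheduled interrupt)
        # Function calls are how data moves between main and external context;
        # results are appended back into main context for the next cycle.
        result = functions[output.name](**output.args)
        main_context.append(str(result))
        if not output.request_heartbeat:
            return
```

The heartbeat flag is the key control-flow detail: without it, every function call would end the cycle, so the processor could never chain a retrieval with the response that uses it.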