letta-server/index.html

<!DOCTYPE html>
<html>
<head>
    <!-- Global site tag (gtag.js) - Google Analytics -->

    <script async src="https://www.googletagmanager.com/gtag/js?id=G-BFMDYSTN1L"></script>
    <script>
        window.dataLayer = window.dataLayer || [];
        function gtag(){dataLayer.push(arguments);}
        gtag('js', new Date());

        gtag('config', 'G-BFMDYSTN1L');
    </script>

    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, shrink-to-fit=no">
    <title>MemGPT</title>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/4.5.0/css/bootstrap.min.css">
    <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Source+Sans+Pro">
    <link rel="stylesheet" href="assets/css/Highlight-Clean.css">
    <link rel="stylesheet" href="assets/css/styles.css">
    <link rel="stylesheet" href="assets/css/Team-Clean.css">
</head>

<body>
    <div class="highlight-clean" style="padding-bottom: 10px;">
        <div class="container">
            <h1 class="text-center">MemGPT: Towards LLMs as Operating Systems</h1>
        </div>
        <div><p></p></div>
        <div class="container" style="max-width: 768px;">
            <div class="row">
                <div class="col-md-12">
                    <h4 class="text-center" style="margin: 0px;">
                        <a class="text-center" href="https://people.eecs.berkeley.edu/~cpacker/">Charles Packer</a>,
                        <a class="text-center" href="https://www.vivi.sh/">Vivian Fang</a>,
                        <a class="text-center" href="https://shishirpatil.github.io/">Shishir G. Patil</a>,
                        <a class="text-center" href="https://people.eecs.berkeley.edu/~kevinlin/">Kevin Lin</a>,
                        <a class="text-center" href="http://sarahwooders.com">Sarah Wooders</a>,
                        <a class="text-center" href="https://people.eecs.berkeley.edu/~jegonzal/">Joseph E. Gonzalez</a>

                    </h4>
                    <h4></h4>
                    <h5 class="text-center">UC Berkeley</h5>
                </div>
            </div>
        </div>
        <div><p></p></div>
        <div class="buttons" style="margin-bottom: 8px;">
            <a class="btn btn-primary" role="button" href="https://memgpt.ai">Paper</a>
            <a class="btn btn-primary" role="button" href="https://discord.gg/9GEQrxmVyE">Discord</a>
            <a class="btn btn-primary" role="button" href="https://github.com/cpacker/MemGPT">GitHub</a>
        </div>
        <div><p></p></div>
        <div>
            <div class="container" style="max-width: 768px;">
                <div class="row">
                    <div class="col-md-12">
                        <h3 class="text-center">Teach LLMs to manage their own memory for unbounded context!</h3>
                    </div>
                </div>

            </div>
        </div>
        <div><p></p></div>
        <div class="container" style="max-width: 768px;">
            <div class="row">
                <div class="col-md-12 text-center"><img src="assets/img/memgpt-system-diagram.png"
                    style="width: 100%;margin-bottom: 8px;" alt="MemGPT system overview">
                    <em>In MemGPT, a fixed-context LLM is augmented with a tiered memory system and a set of functions that allow it to manage its own memory.
                        The LLM takes as input the text in its main context (capped at the size of the standard LLM context window) and outputs text that
                        is interpreted by a parser, resulting in either a yield or a function call. MemGPT uses functions to move data between main
                        context and external context. When the LLM generates a function call, it can request immediate return of
                        execution to chain together functions. In the case of a yield, the LLM will not be run again until the next external
                        event trigger (e.g. a user message or scheduled interrupt).
                    </em>
                </div>
            </div>
        </div>
    </div>
    <div><p></p></div>
    <hr style="max-width: 768px;">
    <div class="container" style="max-width: 768px;">
        <div class="row">
            <div class="col-md-12">
                <h2>Overview</h2>
                <ul>
                    <li>LLMs are increasingly being used for perpetual chats.</li>
                    <li>Challenge: limited context length makes perpetual chat difficult.</li>
                    <li>Inspired by memory management in operating systems, MemGPT uses RAM-like (in-context) memory and disk-like (archival) memory to efficiently manage
                        the limited context length.</li>
                   <li><mark>With MemGPT, we demonstrate that LLMs can be taught to manage their own memory!</mark>
                    </li>
                </ul>
            </div>
        </div>
    </div>
    <div><p></p></div>
    <hr style="max-width: 768px;">
    <div class="container" style="max-width: 768px;">
        <div class="row">
            <div class="col-md-12">
                <h2>Abstract</h2>
                <p>
                    Large language models (LLMs) have revolutionized AI but are constrained by
                    limited context windows, hindering their utility in tasks like extended conversations
                    and document analysis. Drawing inspiration from the hierarchical memory
                    systems in traditional operating systems, we introduce MemGPT (Memory-GPT).
                    Similar to how an operating system (OS) provides the illusion of large memory
                    resources through data movement between fast and slow memory, MemGPT manages
                    tiers of memory to effectively provide extended context within the language
                    model's limited context window, and utilizes interrupts to manage control flow
                    between itself and the user. We evaluate our OS-inspired design in two domains
                    where the limited context windows of modern LLMs severely handicap their performance:
                    document analysis, where MemGPT is able to analyze large documents
                    that far exceed the underlying LLM's context limit, and multi-session chat, where
                    MemGPT enables conversational agents to remember, reflect, and evolve dynamically
                    through long-term interactions with their users. Code and data are available at https://memgpt.ai.
                <br>
                </p>
            </div>
        </div>
    </div>

    <hr style="max-width: 768px;">
    <div class="container" style="max-width: 768px;">
        <div class="row">
            <div class="col-md-12">
                <h2>Citation</h2>
                <code>
                    @inproceedings{packer2023memgpt,<br>
                        &nbsp; 	title={{MemGPT}: Towards LLMs as Operating Systems}, <br>
                        &nbsp; 	author={Packer, Charles and Fang, Vivian and Patil, Shishir G. <br>
                        &nbsp;  and Lin, Kevin and Wooders, Sarah and Gonzalez, Joseph E.}, <br>
                        &nbsp; 	year={2023}<br>
                    }
                </code></div>
        </div>
    </div>
</body>

</html>