
# Loading External Data

MemGPT supports pre-loading data into archival memory. To make data accessible to your agent, you must first load it with `memgpt load`, then attach the data source to your agent. You can configure where archival memory is stored by configuring the storage backend.

## Viewing available data sources

You can view available data sources with:

```sh
memgpt list sources
```

```text
+----------------+----------+----------+
|      Name      | Location | Agents   |
+----------------+----------+----------+
| short-stories  |  local   |  agent_1 |
|      arxiv     |  local   |          |
|  memgpt-docs   |  local   |  agent_1 |
+----------------+----------+----------+
```

The `Agents` column indicates which agents have access to the data, while `Location` indicates which storage backend the data has been loaded into.

## Attaching data to agents

Attaching a data source to your agent loads the data into your agent's archival memory. You can attach data to your agent in two ways:

[Option 1] From the CLI, run:

```sh
memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
```

[Option 2] While chatting with the agent, enter the `/attach` command and select the data source:

```text
> Enter your message: /attach
? Select data source (Use arrow keys)
 » short-stories
   arxiv
   memgpt-docs
```

!!! note "Hint"

    To encourage your agent to reference its archival memory, we recommend adding phrases like "search your archival memory..." for the best results.

## Loading a file or directory

You can load a file, list of files, or directory into MemGPT with the following command:

```sh
memgpt load directory --name <NAME> \
    [--input-dir <DIRECTORY>] [--input-files <FILE1> <FILE2> ...] [--recursive]
```
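As a rough sketch of what the `--recursive` flag implies (this is illustrative, not MemGPT's actual loader), a directory walk might collect files like so; the `suffixes` filter here is a hypothetical parameter:

```python
from pathlib import Path

def collect_files(input_dir, recursive=False, suffixes=(".txt", ".md")):
    """Gather files the way an --input-dir/--recursive pair might.

    `suffixes` is a hypothetical filter; the real loader decides
    which file types it can ingest.
    """
    pattern = "**/*" if recursive else "*"
    return sorted(p for p in Path(input_dir).glob(pattern)
                  if p.is_file() and p.suffix in suffixes)
```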

## Loading a database dump

You can load a database into MemGPT, either from a database dump or a database connection, with the following command:

```sh
memgpt load database --name <NAME>  \
    --query <QUERY> \ # Query to run on database to get data
    --dump-path <PATH> \ # Path to dump file
    --scheme <SCHEME> \ # Database scheme
    --host <HOST> \ # Database host
    --port <PORT> \ # Database port
    --user <USER> \ # Database user
    --password <PASSWORD> \ # Database password
    --dbname <DB_NAME> # Database name
```
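The connection flags correspond to the components of a standard database URI. As a purely illustrative sketch (`build_db_uri` is a hypothetical helper, not a MemGPT internal), they combine like this:

```python
def build_db_uri(scheme, host, port, user, password, dbname):
    """Assemble a database URI from the `memgpt load database` connection flags."""
    return f"{scheme}://{user}:{password}@{host}:{port}/{dbname}"

uri = build_db_uri("postgresql", "localhost", 5432, "memgpt", "secret", "docs")
```

The call above yields `postgresql://memgpt:secret@localhost:5432/docs`.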

## Loading a vector database

If you already have a vector database containing passages and embeddings, you can load them into MemGPT by specifying the table name, database URI, and the columns containing the passage text and embeddings.

```sh
memgpt load vector-database --name <NAME> \
    --uri <URI> \ # Database URI
    --table_name <TABLE-NAME> \ # Name of table containing data
    --text_column <TEXT-COL> \ # Name of column containing text
    --embedding_column <EMBEDDING-COL> # Name of column containing embedding
```
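To make the expected table shape concrete, here is a minimal sketch using SQLite with JSON-encoded embeddings (purely for illustration; a real deployment would more likely use Postgres with pgvector, and the `passages` table and its columns are hypothetical names):

```python
import json
import sqlite3

# A toy table matching --table_name, --text_column, and --embedding_column
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passages (text TEXT, embedding TEXT)")
conn.execute(
    "INSERT INTO passages VALUES (?, ?)",
    ("MemGPT supports external data.", json.dumps([0.1, 0.2, 0.3])),
)

# The loader reads ready-made (passage, embedding) pairs from the table
rows = [(text, json.loads(emb))
        for text, emb in conn.execute("SELECT text, embedding FROM passages")]
```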

Since embeddings are already provided, MemGPT will not re-compute the embeddings.

## Loading a LlamaIndex dump

If you have a LlamaIndex `VectorIndex` that was saved to disk, you can load it into MemGPT by specifying the directory the index was saved to:

```sh
memgpt load index --name <NAME> --dir <INDEX-DIR>
```

Since LlamaIndex will have already computed the embeddings, MemGPT will not re-compute them.

## Loading other types of data

We highly encourage contributions for new data sources, which can be added as new CLI data load commands. We recommend checking for LlamaIndex connectors that may support ingesting the data you're interested in loading.