letta-server/docs/data_sources.md
Charles Packer 354bd520e0 softpass test when keys are missing (#369)
2023-11-09 12:11:19 -08:00

## Loading External Data
MemGPT supports pre-loading data into archival memory. To make data accessible to your agent, first load it with `memgpt load`, then attach the resulting data source to your agent. You can choose where archival memory is stored by configuring the [storage backend](storage.md).
### Viewing available data sources
You can view available data sources with:
```
memgpt list sources
```
```
+----------------+----------+----------+
| Name           | Location | Agents   |
+----------------+----------+----------+
| short-stories  | local    | agent_1  |
| arxiv          | local    |          |
| memgpt-docs    | local    | agent_1  |
+----------------+----------+----------+
```
The `Agents` column indicates which agents have access to the data, while `Location` indicates which storage backend the data has been loaded into.
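If you want to script against this listing, the table can be parsed into records. The following is a minimal sketch, assuming the output keeps the ASCII-table layout of the sample above (the real layout may differ between MemGPT versions). `parse_sources_table` is a hypothetical helper, not part of MemGPT.

```python
def parse_sources_table(output: str) -> list[dict[str, str]]:
    """Parse an ASCII table like the `memgpt list sources` sample into dicts."""
    rows = []
    for line in output.splitlines():
        line = line.strip()
        # Skip border lines such as +-----+ and anything that is not a row.
        if not line.startswith("|"):
            continue
        rows.append([cell.strip() for cell in line.strip("|").split("|")])
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Sample text copied from the listing above (assumed format).
sample = """\
+----------------+----------+----------+
| Name           | Location | Agents   |
+----------------+----------+----------+
| short-stories  | local    | agent_1  |
| arxiv          | local    |          |
| memgpt-docs    | local    | agent_1  |
+----------------+----------+----------+
"""

sources = parse_sources_table(sample)
# Data sources that no agent has attached yet.
unattached = [s["Name"] for s in sources if not s["Agents"]]
print(unattached)  # ['arxiv']
```

An empty `Agents` cell is treated as "not attached to any agent", which matches how the sample table shows `arxiv`.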
### Attaching data to agents
Attaching a data source to your agent loads its data into the agent's archival memory, where the agent can search it. You can attach data to your agent in two ways:
*[Option 1]* From the CLI, run:
```
memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
```
*[Option 2]* While chatting with the agent, enter the `/attach` command and select the data source:
```
> Enter your message: /attach
? Select data source (Use arrow keys)
 » short-stories
   arxiv
   memgpt-docs
```
!!! note "Hint"
    To encourage your agent to reference its archival memory, we recommend adding phrases like "_search your archival memory..._" to your messages.
### Loading a file or directory
You can load a file, a list of files, or a directory into MemGPT with the following command:
```sh
memgpt load directory --name <NAME> \
[--input-dir <DIRECTORY>] [--input-files <FILE1> <FILE2>...] [--recursive]
```
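When loading from a script, it can help to assemble the argument list programmatically. The sketch below uses only the flags shown in the synopsis above. `load_directory_command` is a hypothetical helper, and the rule that at least one of `--input-dir` or `--input-files` must be given is an assumption, not something the CLI is documented to enforce.

```python
import shlex
from typing import Optional, Sequence


def load_directory_command(
    name: str,
    input_dir: Optional[str] = None,
    input_files: Sequence[str] = (),
    recursive: bool = False,
) -> list[str]:
    """Build the argv for `memgpt load directory` from the documented flags."""
    # Assumption: loading needs at least one input location.
    if input_dir is None and not input_files:
        raise ValueError("provide input_dir or input_files")
    cmd = ["memgpt", "load", "directory", "--name", name]
    if input_dir is not None:
        cmd += ["--input-dir", input_dir]
    if input_files:
        cmd += ["--input-files", *input_files]
    if recursive:
        cmd.append("--recursive")
    return cmd


# Hypothetical source name and path; replace with your own.
cmd = load_directory_command("short-stories", input_dir="./stories", recursive=True)
print(shlex.join(cmd))
# To run it for real: subprocess.run(cmd, check=True)
```

Building a list rather than a single string avoids shell-quoting problems when file names contain spaces.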
### Loading a database dump
You can load a database into MemGPT, either from a database dump or through a database connection, with the following command:
```sh
#   --query      query to run on the database to get data
#   --dump-path  path to the dump file
#   --scheme     database scheme
#   --host       database host
#   --port       database port
#   --user       database user
#   --password   database password
#   --dbname     database name
memgpt load database --name <NAME> \
    --query <QUERY> \
    --dump-path <PATH> \
    --scheme <SCHEME> \
    --host <HOST> \
    --port <PORT> \
    --user <USER> \
    --password <PASSWORD> \
    --dbname <DB_NAME>
```
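The two modes (dump file versus live connection) can be sketched as a small command builder. This is an illustration under assumptions: the source does not say which flags are required or mutually exclusive, so the helper treats `--dump-path` and the connection flags as alternatives. `load_database_command` and the `DB_PASSWORD` environment variable are hypothetical names.

```python
import os
from typing import Optional

# Connection flags listed in the synopsis above, in the same order.
CONNECTION_FLAGS = ("scheme", "host", "port", "user", "password", "dbname")


def load_database_command(
    name: str, query: str, dump_path: Optional[str] = None, **connection
) -> list[str]:
    """Build the argv for `memgpt load database` from the documented flags."""
    cmd = ["memgpt", "load", "database", "--name", name, "--query", query]
    if dump_path is not None:
        # Assumption: a dump file replaces the connection parameters.
        return cmd + ["--dump-path", dump_path]
    for flag in CONNECTION_FLAGS:
        value = connection.get(flag)
        if value is not None:
            cmd += [f"--{flag}", str(value)]
    return cmd


# Hypothetical source name, query, and paths.
from_dump = load_database_command("orders", "SELECT * FROM orders", dump_path="orders.db")

from_connection = load_database_command(
    "orders",
    "SELECT * FROM orders",
    scheme="postgresql",
    host="localhost",
    port=5432,
    user="reader",
    # Read the password from the environment instead of hard-coding it.
    password=os.environ.get("DB_PASSWORD"),
    dbname="shop",
)
print(from_dump)
```

Flags whose value is `None` are skipped, so the password flag is simply omitted when the environment variable is not set.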
### Loading a vector database
If you already have a vector database containing passages and embeddings, you can load them into MemGPT by specifying the table name, database URI, and the columns containing the passage text and embeddings.
```sh
#   --uri               database URI
#   --table_name        name of the table containing the data
#   --text_column       name of the column containing the passage text
#   --embedding_column  name of the column containing the embeddings
memgpt load vector-database --name <NAME> \
    --uri <URI> \
    --table_name <TABLE-NAME> \
    --text_column <TEXT-COL> \
    --embedding_column <EMBEDDING-COL>
```
Since embeddings are already provided, MemGPT will not re-compute the embeddings.
### Loading a LlamaIndex dump
If you have a LlamaIndex `VectorIndex` that was saved to disk, you can load it into MemGPT by specifying the directory the index was saved to:
```sh
memgpt load index --name <NAME> --dir <INDEX-DIR>
```
Since LlamaIndex has already computed the embeddings, MemGPT will not re-compute them.
### Loading other types of data
We highly encourage contributions for new data sources, which can be added as a new [CLI data load command](https://github.com/cpacker/MemGPT/blob/main/memgpt/cli/cli_load.py). Before writing a loader from scratch, we recommend checking whether an existing [LlamaIndex connector](https://gpt-index.readthedocs.io/en/v0.6.3/how_to/data_connectors.html) already supports ingesting the data you're interested in loading.