Files
letta-server/docs/llamacpp.md
Charles Packer caba2f468c Create docs pages (#328)
* Create docs  (#323)

* Create .readthedocs.yaml

* Update mkdocs.yml

* update

* revise

* syntax

* syntax

* syntax

* syntax

* revise

* revise

* spacing

* Docs (#327)

* add stuff

* patch homepage

* more docs

* updated

* updated

* refresh

* refresh

* refresh

* update

* refresh

* refresh

* refresh

* refresh

* missing file

* refresh

* refresh

* refresh

* refresh

* fix black

* refresh

* refresh

* refresh

* refresh

* add readme for just the docs

* Update README.md

* add more data loading docs

* cleanup data sources

* refresh

* revised

* add search

* make prettier

* revised

* updated

* refresh

* favi

* updated

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
2023-11-06 12:38:49 -08:00

18 lines
718 B
Markdown

### MemGPT + llama.cpp
1. Download + install [llama.cpp](https://github.com/ggerganov/llama.cpp) and the model you want to test with
2. In your terminal, run `./server -m <MODEL> -c <CONTEXT_LENGTH>`
For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and put it inside `~/models/TheBloke/`, we would run:
```sh
# using `-c 8000` because Dolphin Mistral 7B has a context length of 8000
# the default port is 8080, you can change this with `--port`
./server -m ~/models/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q6_K.gguf -c 8000
```
In your terminal where you're running MemGPT, run:
```sh
export OPENAI_API_BASE=http://localhost:8080
export BACKEND_TYPE=llamacpp
```