# MemGPT + llama.cpp
- Download + install llama.cpp and the model you want to test with
- In your terminal, run:

  ```sh
  ./server -m <MODEL> -c <CONTEXT_LENGTH>
  ```
For example, if we downloaded the model `dolphin-2.2.1-mistral-7b.Q6_K.gguf` and placed it inside `~/models/TheBloke/`, we would run:

```sh
# using `-c 8000` because Dolphin Mistral 7B has a context length of 8000
# the default port is 8080; you can change this with `--port`
./server -m ~/models/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q6_K.gguf -c 8000
```
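Before pointing MemGPT at the server, it can help to confirm that it is actually responding. A minimal sketch (assumptions: the server is on the default port 8080, and llama.cpp's built-in server exposes its native `POST /completion` endpoint; the helper names here are illustrative, not part of MemGPT):

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8080"  # llama.cpp server's default port

def build_completion_request(prompt: str, n_predict: int = 16) -> dict:
    # llama.cpp's server accepts a JSON body with the prompt
    # and the number of tokens to generate
    return {"prompt": prompt, "n_predict": n_predict}

def check_server(base_url: str = BASE_URL) -> bool:
    # send a tiny completion request and report whether it succeeded
    payload = json.dumps(build_completion_request("Hello")).encode()
    req = urllib.request.Request(
        f"{base_url}/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("server up" if check_server() else "server not reachable")
```

If this prints `server not reachable`, double-check the port and that `./server` is still running before moving on.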
Then, in the terminal where you're running MemGPT, set:

```sh
export OPENAI_API_BASE=http://localhost:8080
export BACKEND_TYPE=llamacpp
```
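To illustrate what these variables do, here is a small sketch of how a client could derive its endpoint from them (assumption: the `resolve_endpoint` helper is hypothetical, not MemGPT's actual code; MemGPT reads `OPENAI_API_BASE` and `BACKEND_TYPE` from the environment as set above):

```python
import os

# mirror the shell exports above
os.environ["OPENAI_API_BASE"] = "http://localhost:8080"
os.environ["BACKEND_TYPE"] = "llamacpp"

def resolve_endpoint() -> str:
    # a client reading these variables would target the llama.cpp server,
    # hitting its native /completion endpoint
    base = os.environ["OPENAI_API_BASE"].rstrip("/")
    return f"{base}/completion"

print(resolve_endpoint())
```

If you later move the server to another host or port, only `OPENAI_API_BASE` needs to change.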