| title | excerpt | category |
|---|---|---|
| Configuring embedding backends | Connecting Letta to various endpoint backends | 6580d34ee5e4d00068bf2a1d |
Letta uses embedding models for retrieval search over archival memory. You can use embeddings provided by OpenAI, Azure, or any model on Hugging Face.
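To illustrate what embedding-based retrieval does conceptually, here is a minimal sketch of ranking stored passages by cosine similarity to a query embedding. This is a toy illustration with made-up 3-dimensional vectors, not Letta's actual archival-memory implementation:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, passages):
    # passages: list of (text, embedding) pairs; best match first
    return sorted(passages,
                  key=lambda p: cosine_similarity(query_vec, p[1]),
                  reverse=True)

# Toy 3-dimensional "embeddings" for illustration only; real models
# like text-embedding-ada-002 produce vectors with 1536 dimensions.
passages = [
    ("note about cats", [0.9, 0.1, 0.0]),
    ("note about stocks", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]
ranked = retrieve(query, passages)
print(ranked[0][0])  # → note about cats
```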
OpenAI
To use OpenAI, make sure your OPENAI_API_KEY environment variable is set.
export OPENAI_API_KEY=YOUR_API_KEY # on Linux/Mac
Then, configure Letta and select openai as the embedding provider:
> letta configure
...
? Select embedding provider: openai
...
Azure
To use Azure, set environment variables for Azure and an additional variable specifying your embedding deployment:
# see https://github.com/openai/openai-python#microsoft-azure-endpoints
export AZURE_OPENAI_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_VERSION=...
# set the below if you are using deployment ids
export AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT=...
Then, configure Letta and select azure as the embedding provider:
> letta configure
...
? Select embedding provider: azure
...
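A common failure mode is launching Letta with one of the Azure variables unset. The following sketch (the `missing_azure_vars` helper is hypothetical, not part of Letta) checks the three always-required variables from the setup above before you run `letta configure`:

```python
import os

# Variable names taken from the Azure setup above;
# AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT is only needed with deployment ids.
REQUIRED = ["AZURE_OPENAI_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_VERSION"]

def missing_azure_vars(env=os.environ):
    # Return the names of required Azure variables that are unset or empty
    return [name for name in REQUIRED if not env.get(name)]

missing = missing_azure_vars()
if missing:
    print("Set these before running letta configure:", ", ".join(missing))
```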
Custom Endpoint
Letta supports running embeddings with any Hugging Face model via the Text Embeddings Inference (TEI) library. To get started, follow TEI's setup instructions to stand up an endpoint. Once the endpoint is running, configure Letta to use it:
> letta configure
...
? Select embedding provider: hugging-face
? Enter default endpoint: http://localhost:8080
? Enter HuggingFace model tag (e.g. BAAI/bge-large-en-v1.5): BAAI/bge-large-en-v1.5
? Enter embedding model dimensions (e.g. 1024): 1024
...
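You can sanity-check the endpoint directly before pointing Letta at it. This sketch builds a request for TEI's `/embed` route with an `"inputs"` field (the route and field name follow the TEI API, but verify against your TEI version); the actual network call is left commented out since it needs a running server:

```python
import json

def build_embed_request(base_url, text):
    # Construct the URL and JSON payload for a TEI embedding request
    url = base_url.rstrip("/") + "/embed"
    payload = json.dumps({"inputs": text})
    return url, payload

url, payload = build_embed_request("http://localhost:8080", "hello archival memory")
print(url)  # → http://localhost:8080/embed

# To actually send it against a running endpoint:
#   import urllib.request
#   req = urllib.request.Request(url, data=payload.encode(),
#                                headers={"Content-Type": "application/json"})
#   embedding = json.loads(urllib.request.urlopen(req).read())[0]
```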
Local Embeddings
Letta can compute embeddings locally using the lightweight embedding model BAAI/bge-small-en-v1.5.
🚧 Local LLM Performance
The BAAI/bge-small-en-v1.5 model was chosen for its small size, so you may notice degraded performance with embedding-based retrieval when using this option.
To compute embeddings locally, install dependencies with:
pip install 'pyletta[local]'
Then, select the local option during configuration:
> letta configure
...
? Select embedding provider: local
...