From cdfb7e1cd51aaa6359be2208cbfa8c8bcee26450 Mon Sep 17 00:00:00 2001
From: Cameron
Date: Wed, 25 Feb 2026 10:40:41 -0800
Subject: [PATCH] docs: add OpenAI-compatible API documentation (#395)

---
 README.md             |  14 ++++
 docs/configuration.md |   4 +
 docs/openai-compat.md | 174 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 192 insertions(+)
 create mode 100644 docs/openai-compat.md

diff --git a/README.md b/README.md
index 0830d8e..26e3e21 100644
--- a/README.md
+++ b/README.md
@@ -338,6 +338,19 @@ Any LettaBot agent can also be directly chatted with through [Letta Code](https:
 letta --agent
 ```
 
+## OpenAI-Compatible API
+
+LettaBot exposes `/v1/chat/completions` and `/v1/models` endpoints, so you can use any OpenAI SDK or compatible frontend (like [Open WebUI](https://github.com/open-webui/open-webui)) to chat with your agents:
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer YOUR_API_KEY" \
+  -d '{"model": "lettabot", "messages": [{"role": "user", "content": "Hello!"}]}'
+```
+
+Supports sync and streaming responses. See the [full documentation](docs/openai-compat.md) for Python/Node SDK examples and details.
+
 ## Security
 
 ### Network Architecture
@@ -448,6 +461,7 @@ Check the [ADE](https://app.letta.com) to see if your agent is attempting to use
 - [Getting Started](docs/getting-started.md)
 - [Docker Server Setup](docs/selfhosted-setup.md) - Run with your own Letta server
 - [Configuration Reference](docs/configuration.md)
+- [OpenAI-Compatible API](docs/openai-compat.md)
 - [Slack Setup](docs/slack-setup.md)
 - [Discord Setup](docs/discord-setup.md)
 - [WhatsApp Setup](docs/whatsapp-setup.md)
diff --git a/docs/configuration.md b/docs/configuration.md
index c6bd549..5fc0e3c 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -777,6 +777,10 @@ data: {"type":"result","success":true}
 
 **Multi-agent:** In multi-agent configs, use the `agent` field to target a specific agent by name. Omit it to use the first agent. A 404 is returned if the agent name doesn't match any configured agent.
 
+### OpenAI-Compatible Endpoint
+
+The API server also exposes `/v1/chat/completions` and `/v1/models` -- a drop-in OpenAI-compatible API. Use it with the OpenAI Python/Node SDK, Open WebUI, or any compatible client. See the [OpenAI-Compatible API docs](openai-compat.md) for details.
+
 ## Environment Variables
 
 Environment variables override config file values:
diff --git a/docs/openai-compat.md b/docs/openai-compat.md
new file mode 100644
index 0000000..6e30c4a
--- /dev/null
+++ b/docs/openai-compat.md
@@ -0,0 +1,174 @@
+# OpenAI-Compatible API
+
+LettaBot exposes an OpenAI-compatible API so you can point any OpenAI SDK or tool at your LettaBot server and interact with your agents directly.
+
+## Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `GET` | `/v1/models` | List available agents (as "models") |
+| `POST` | `/v1/chat/completions` | Send a message and get a response (sync or streaming) |
+
+Both endpoints run on the same API server as the rest of LettaBot (default port `8080`, configurable via `server.api.port`).
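The endpoint layout above can be sketched as a tiny client-side helper -- the function name is hypothetical and not part of LettaBot; it only illustrates how both routes hang off the same host and configurable port:

```python
def endpoint_urls(host: str = "localhost", port: int = 8080) -> dict:
    """Build the two OpenAI-compatible endpoint URLs for a LettaBot server."""
    base = f"http://{host}:{port}/v1"
    return {
        "models": f"{base}/models",          # GET: list agents as "models"
        "chat": f"{base}/chat/completions",  # POST: sync or streaming chat
    }

print(endpoint_urls()["chat"])  # http://localhost:8080/v1/chat/completions
```

Changing `server.api.port` in the config simply shifts the `port` argument; the paths themselves stay fixed.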
+
+## Authentication
+
+All requests require an API key, passed as either:
+
+- `Authorization: Bearer <key>`
+- `X-Api-Key: <key>`
+
+The API key is auto-generated on first run and saved to `lettabot-api.json`, or set via the `LETTABOT_API_KEY` environment variable. This is the same key used by the `/api/v1/chat` endpoint.
+
+## Quick Start
+
+### Python
+
+```python
+from openai import OpenAI
+
+client = OpenAI(
+    base_url="http://localhost:8080/v1",
+    api_key="YOUR_API_KEY",
+)
+
+# Sync
+response = client.chat.completions.create(
+    model="lettabot",  # your agent name
+    messages=[{"role": "user", "content": "What's on my todo list?"}],
+)
+print(response.choices[0].message.content)
+
+# Streaming
+stream = client.chat.completions.create(
+    model="lettabot",
+    messages=[{"role": "user", "content": "What's on my todo list?"}],
+    stream=True,
+)
+for chunk in stream:
+    delta = chunk.choices[0].delta
+    if delta.content:
+        print(delta.content, end="", flush=True)
+```
+
+### Node / TypeScript
+
+```typescript
+import OpenAI from "openai";
+
+const client = new OpenAI({
+  baseURL: "http://localhost:8080/v1",
+  apiKey: "YOUR_API_KEY",
+});
+
+// Sync
+const response = await client.chat.completions.create({
+  model: "lettabot",
+  messages: [{ role: "user", content: "What's on my todo list?" }],
+});
+console.log(response.choices[0].message.content);
+
+// Streaming
+const stream = await client.chat.completions.create({
+  model: "lettabot",
+  messages: [{ role: "user", content: "What's on my todo list?" }],
+  stream: true,
+});
+for await (const chunk of stream) {
+  const delta = chunk.choices[0].delta;
+  if (delta.content) process.stdout.write(delta.content);
+}
+```
+
+### curl
+
+**Sync:**
+
+```bash
+curl -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer YOUR_API_KEY" \
+  -d '{
+    "model": "lettabot",
+    "messages": [{"role": "user", "content": "What is on my todo list?"}]
+  }'
+```
+
+**Streaming:**
+
+```bash
+curl -N -X POST http://localhost:8080/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer YOUR_API_KEY" \
+  -d '{
+    "model": "lettabot",
+    "messages": [{"role": "user", "content": "What is on my todo list?"}],
+    "stream": true
+  }'
+```
+
+## Model Mapping
+
+The `model` field maps to your agent names. Use `GET /v1/models` to list them:
+
+```bash
+curl http://localhost:8080/v1/models \
+  -H "Authorization: Bearer YOUR_API_KEY"
+```
+
+```json
+{
+  "object": "list",
+  "data": [
+    { "id": "lettabot", "object": "model", "created": 1740000000, "owned_by": "lettabot" },
+    { "id": "helper-bot", "object": "model", "created": 1740000000, "owned_by": "lettabot" }
+  ]
+}
+```
+
+If you omit `model` in a chat request, the first configured agent is used.
+
+## Streaming
+
+When `stream: true`, responses arrive as Server-Sent Events (SSE). The stream includes:
+
+- **Content deltas** -- incremental text from the assistant
+- **Tool call deltas** -- tool invocations with name and arguments
+- **Finish chunk** -- `finish_reason: "stop"`
+- **`[DONE]` sentinel** -- end of stream
+
+Internal events (reasoning, tool results) are filtered and not included in the stream.
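For clients without an SDK, the stream above can be consumed by hand. A minimal sketch, assuming each event is a `data:` line carrying an OpenAI-style chunk -- the sample chunks below are illustrative, not captured from a real server:

```python
import json

def collect_content(sse_lines):
    """Accumulate content deltas from OpenAI-style SSE lines until [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separators / keep-alives
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Illustrative stream, shaped like the chunk types listed above
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{"content":"lo!"},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]
print(collect_content(sample))  # Hello!
```

The OpenAI SDKs do exactly this parsing for you; the sketch is only useful when wiring up a raw HTTP client.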
+
+## Supported Parameters
+
+| Parameter | Supported | Notes |
+|-----------|-----------|-------|
+| `model` | Yes | Maps to agent name |
+| `messages` | Yes | Only the last user message is extracted (see below) |
+| `stream` | Yes | `true` for SSE streaming, `false`/omitted for sync |
+| `temperature` | Ignored | Accepted but has no effect |
+| `max_tokens` | Ignored | Accepted but has no effect |
+| `tools` | Ignored | Agent tools are configured server-side |
+| `top_p` | Ignored | Accepted but has no effect |
+| All others | Ignored | Silently accepted for compatibility |
+
+## How Messages Are Handled
+
+The OpenAI API lets you send a full conversation in the `messages` array. LettaBot handles this differently:
+
+- **Only the last user message is extracted** and sent to the agent
+- Multi-turn context is managed by Letta's built-in memory and conversation history, not by the messages array
+- System messages, assistant messages, and tool messages in the array are ignored
+
+This means you don't need to manage conversation history client-side -- the agent remembers everything on its own.
+
+## Limitations
+
+- **`usage` is always `null`** -- token counts are not tracked
+- **No multi-turn passthrough** -- only the last user message is used (see above)
+- **Tool definitions ignored** -- tools are configured on the agent, not per-request
+- **Reasoning events filtered** -- the agent's internal reasoning is not exposed in the stream
+
+## Use with Open WebUI
+
+Since the endpoint is OpenAI-compatible, you can connect it to [Open WebUI](https://github.com/open-webui/open-webui) or any other OpenAI-compatible frontend. Point the frontend at `http://localhost:8080/v1` with your API key.
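The message-handling rules described above can be modeled roughly as follows -- a sketch of the documented behavior, not LettaBot's actual code (the helper name is hypothetical):

```python
def extract_last_user_message(messages):
    """Return the content of the last user message, mirroring the documented
    behavior: system/assistant/tool messages in the array are ignored."""
    for msg in reversed(messages):
        if msg.get("role") == "user":
            return msg.get("content")
    return None

history = [
    {"role": "system", "content": "You are helpful."},       # ignored
    {"role": "user", "content": "Add milk to my list."},     # ignored (not last)
    {"role": "assistant", "content": "Done!"},               # ignored
    {"role": "user", "content": "What's on my todo list?"},  # sent to the agent
]
print(extract_last_user_message(history))  # What's on my todo list?
```

This is why resending the full history on every request is harmless but unnecessary: everything except the final user turn is dropped, and continuity comes from the agent's own memory.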