---
title: "Multi-modal (image inputs)"
subtitle: "Send images to your agents"
slug: "multimodal"
---

Multi-modal features require compatible language models. Ensure your agent is configured with a multi-modal capable model.

Letta agents support image inputs, enabling richer conversations and more powerful agent capabilities.

## Model Support

Multi-modal capabilities depend on the underlying language model. To see which models support image inputs, check each API provider's model page:

- **[OpenAI](https://platform.openai.com/docs/models)**: GPT-4.1, o1/3/4, GPT-4o
- **[Anthropic](https://docs.anthropic.com/en/docs/about-claude/models/overview)**: Claude Opus 4, Claude Sonnet 4
- **[Gemini](https://ai.google.dev/gemini-api/docs/models)**: Gemini 2.5 Pro, Gemini 2.5 Flash

If the provider you're using doesn't support image inputs, your images will still appear in the context window, but only as a text message telling the agent that an image exists.

## ADE Support

You can pass images to your agents by dragging and dropping them into the chat window, or by clicking the image icon to upload a file manually.

## Usage Examples (SDK)

### Sending an Image via URL

```python title="python" maxLines=100
from letta_client import Letta

# Replace the placeholder with your Letta API key.
client = Letta(token="LETTA_API_KEY")

# `agent_state` is assumed to be an agent you created or retrieved earlier.
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
```

```typescript title="node.js" maxLines=100
import { LettaClient } from '@letta-ai/letta-client';

// Replace the placeholder with your Letta API key.
const client = new LettaClient({ token: "LETTA_API_KEY" });

// `agentState` is assumed to be an agent you created or retrieved earlier.
const response = await client.agents.messages.create(
    agentState.id, {
        messages: [
            {
                role: "user",
                content: [
                    {
                        type: "image",
                        source: {
                            type: "url",
                            url: "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg",
                        },
                    },
                    {
                        type: "text",
                        text: "Describe this image."
                    }
                ],
            }
        ],
    }
);
```

### Sending an Image via Base64

```python title="python" maxLines=100
import base64
import httpx
from letta_client import Letta

# Replace the placeholder with your Letta API key.
client = Letta(token="LETTA_API_KEY")

# Download the image and encode its bytes as a base64 string.
image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image_data = base64.standard_b64encode(httpx.get(image_url).content).decode("utf-8")

# `agent_state` is assumed to be an agent you created or retrieved earlier.
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
```

```typescript title="node.js" maxLines=100
import { LettaClient } from '@letta-ai/letta-client';

// Replace the placeholder with your Letta API key.
const client = new LettaClient({ token: "LETTA_API_KEY" });

// Download the image and encode its bytes as a base64 string.
const imageUrl = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg";
const imageResponse = await fetch(imageUrl);
const imageBuffer = await imageResponse.arrayBuffer();
const imageData = Buffer.from(imageBuffer).toString('base64');

// `agentState` is assumed to be an agent you created or retrieved earlier.
const response = await client.agents.messages.create(
    agentState.id, {
        messages: [
            {
                role: "user",
                content: [
                    {
                        type: "image",
                        source: {
                            type: "base64",
                            media_type: "image/jpeg",
                            data: imageData,
                        },
                    },
                    {
                        type: "text",
                        text: "Describe this image."
                    }
                ],
            }
        ],
    }
);
```
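
The base64 example above downloads its image from a URL. For an image that already sits on disk, the same content part could be built with a small helper like the sketch below. The function names `image_part_from_file` and `user_message_with_image` are hypothetical and not part of the Letta SDK; the sketch only assumes the message shape shown in the examples above and infers `media_type` from the file extension using Python's standard `mimetypes` module.

```python
import base64
import mimetypes
from pathlib import Path


def image_part_from_file(path: str) -> dict:
    """Build a base64 image content part from a local file.

    Hypothetical helper (not part of the Letta SDK); it mirrors the
    "image" content shape used in the base64 example above.
    """
    # Infer the media type from the file extension, e.g. ".jpg" -> "image/jpeg".
    media_type, _ = mimetypes.guess_type(path)
    if media_type is None or not media_type.startswith("image/"):
        raise ValueError(f"Cannot infer an image media type from {path!r}")

    # Read the raw bytes and encode them as a base64 string.
    data = base64.standard_b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def user_message_with_image(path: str, prompt: str) -> dict:
    """Combine a local image and a text prompt into one user message (hypothetical)."""
    return {
        "role": "user",
        "content": [image_part_from_file(path), {"type": "text", "text": prompt}],
    }
```

The resulting dictionary could then be passed as an element of `messages` in `client.agents.messages.create(...)`, for example `messages=[user_message_with_image("ant.jpg", "Describe this image.")]`, where `ant.jpg` is a placeholder path. Guessing the type from the extension keeps the sketch dependency-free, but it does not inspect the file contents, so a mislabeled file would be sent with the wrong `media_type`.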