Files
letta-server/fern/pages/agents/fetch_webpage.mdx
Cameron Pfiffer abfddfe7b9 Add comprehensive documentation for built-in tools (#5519)
* feat: update documentation and add new tutorials for memory blocks and agent collaboration

- Updated navigation paths in docs.yml to reflect new tutorial locations.
- Added comprehensive guides on shared memory blocks and attaching/detaching memory blocks.
- Enhanced existing documentation for memory blocks with examples and best practices.
- Corrected API key references in prebuilt tools documentation.

These changes aim to improve user understanding and facilitate multi-agent collaboration through shared memory systems.

* Add comprehensive documentation for built-in tools

- Restructure Tool Use section with Base Tools and Utilities
- Add Base Tools page documenting memory management and communication tools
- Add Web Search documentation with Exa integration details
- Add Code Interpreter documentation covering supported languages and limitations
- Add Fetch Webpage documentation with fallback extraction methods
- Update Utilities overview page to link to detailed tool docs
2025-10-24 15:13:05 -07:00

167 lines
4.0 KiB
Plaintext

---
title: Fetch Webpage
subtitle: Convert webpages to readable text/markdown
slug: guides/agents/fetch-webpage
---
The `fetch_webpage` tool enables Letta agents to fetch and convert webpages into readable text or markdown format. Useful for reading documentation, articles, and web content.
<Info>
On [Letta Cloud](/guides/cloud/overview), this tool works out of the box. For self-hosted deployments with an Exa API key, fetching is enhanced. Without a key, it falls back to open-source extraction tools.
</Info>
## Quick Start
<CodeGroup>
```python Python
from letta import Letta
client = Letta(token="LETTA_API_KEY")
agent = client.agents.create(
model="openai/gpt-4o",
tools=["fetch_webpage"],
memory_blocks=[{
"label": "persona",
"value": "I can fetch and read webpages to answer questions about online content."
}]
)
```
```typescript TypeScript
import { LettaClient } from '@letta-ai/letta-client';
const client = new LettaClient({ token: "LETTA_API_KEY" });
const agent = await client.agents.create({
model: "openai/gpt-4o",
tools: ["fetch_webpage"],
memoryBlocks: [{
label: "persona",
value: "I can fetch and read webpages to answer questions about online content."
}]
});
```
</CodeGroup>
## Tool Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `url` | `str` | The URL of the webpage to fetch |
## Return Format
The tool returns webpage content as text/markdown.
**With Exa API (if configured):**
```json
{
"title": "Page title",
"published_date": "2025-01-15",
"author": "Author name",
"text": "Full page content in markdown"
}
```
**Fallback (without Exa):**
Returns markdown-formatted text extracted from the HTML.
## How It Works
The tool uses a multi-tier approach:
1. **Exa API** (if `EXA_API_KEY` is configured): Uses Exa's content extraction
2. **Trafilatura** (fallback): Open-source text extraction to markdown
3. **Readability + html2text** (final fallback): HTML cleaning and conversion
## Self-Hosted Setup
For enhanced fetching on self-hosted servers, optionally configure an Exa API key. Without it, the tool still works using open-source extraction.
### Optional: Configure Exa
<CodeGroup>
```bash Docker
docker run \
-e EXA_API_KEY="your_exa_api_key" \
letta/letta:latest
```
```yaml Docker Compose
services:
letta:
environment:
- EXA_API_KEY=your_exa_api_key
```
```bash Server
export EXA_API_KEY="your_exa_api_key"
letta server
```
```python Per-Agent
agent = client.agents.create(
tools=["fetch_webpage"],
tool_env_vars={
"EXA_API_KEY": "your_exa_api_key"
}
)
```
</CodeGroup>
## Common Patterns
### Documentation Reader
```python
agent = client.agents.create(
model="openai/gpt-4o",
tools=["fetch_webpage", "web_search"],
memory_blocks=[{
"label": "persona",
"value": "I search for documentation with web_search and read it with fetch_webpage."
}]
)
```
### Research Assistant
```python
agent = client.agents.create(
model="openai/gpt-4o",
tools=["fetch_webpage", "archival_memory_insert"],
memory_blocks=[{
"label": "persona",
"value": "I fetch articles and store key insights in archival memory for later reference."
}]
)
```
### Content Summarizer
```python
agent = client.agents.create(
model="openai/gpt-4o",
tools=["fetch_webpage"],
memory_blocks=[{
"label": "persona",
"value": "I fetch webpages and provide summaries of their content."
}]
)
```
## When to Use
| Use Case | Tool | Why |
|----------|------|-----|
| Read specific webpage | `fetch_webpage` | Direct URL access |
| Find webpages to read | `web_search` | Discovery first |
| Read + search in one | `web_search` with `include_text=true` | Combined operation |
| Multiple pages | `fetch_webpage` | Iterate over URLs |
## Related Documentation
- [Utilities Overview](/guides/agents/prebuilt-tools)
- [Web Search](/guides/agents/web-search)
- [Run Code](/guides/agents/run-code)
- [Custom Tools](/guides/agents/custom-tools)
- [Tool Variables](/guides/agents/tool-variables)