---
title: Fetch Webpage
subtitle: Convert webpages to readable text/markdown
slug: guides/agents/fetch-webpage
---

The `fetch_webpage` tool enables Letta agents to fetch and convert webpages into readable text or markdown format. It is useful for reading documentation, articles, and other web content.

<Info>
On [Letta Cloud](/guides/cloud/overview), this tool works out of the box. On self-hosted deployments, configuring an Exa API key enhances fetching; without a key, the tool falls back to open-source extraction libraries.
</Info>

## Quick Start
<CodeGroup>
```python Python
from letta import Letta

client = Letta(token="LETTA_API_KEY")

agent = client.agents.create(
    model="openai/gpt-4o",
    tools=["fetch_webpage"],
    memory_blocks=[{
        "label": "persona",
        "value": "I can fetch and read webpages to answer questions about online content."
    }]
)
```

```typescript TypeScript
import { LettaClient } from '@letta-ai/letta-client';

const client = new LettaClient({ token: "LETTA_API_KEY" });

const agent = await client.agents.create({
    model: "openai/gpt-4o",
    tools: ["fetch_webpage"],
    memoryBlocks: [{
        label: "persona",
        value: "I can fetch and read webpages to answer questions about online content."
    }]
});
```
</CodeGroup>

## Tool Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `url` | `str` | The URL of the webpage to fetch |

## Return Format

The tool returns webpage content as text/markdown.

**With Exa API (if configured):**

```json
{
  "title": "Page title",
  "published_date": "2025-01-15",
  "author": "Author name",
  "text": "Full page content in markdown"
}
```

**Fallback (without Exa):**

Returns markdown-formatted text extracted from the HTML.

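Because the return shape depends on whether Exa is configured, code that post-processes tool output may want to normalize both cases. A minimal sketch, assuming this handling happens in your own application code (the `normalize_fetch_result` helper is hypothetical, not part of the Letta SDK):

```python
import json

def normalize_fetch_result(raw: str) -> dict:
    """Normalize fetch_webpage output to {"title", "text"}.

    Exa-backed results arrive as a JSON object; the open-source
    fallback returns plain markdown text.
    """
    try:
        data = json.loads(raw)
        if isinstance(data, dict) and "text" in data:
            return {"title": data.get("title", ""), "text": data["text"]}
    except json.JSONDecodeError:
        pass
    # Fallback path: treat the whole payload as markdown text.
    return {"title": "", "text": raw}

exa_style = '{"title": "Page title", "text": "Full page content"}'
print(normalize_fetch_result(exa_style)["title"])  # Page title
print(normalize_fetch_result("# Heading\nBody")["text"])  # # Heading\nBody
```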
## How It Works

The tool uses a multi-tier approach:

1. **Exa API** (if `EXA_API_KEY` is configured): Uses Exa's content extraction
2. **Trafilatura** (fallback): Open-source text extraction to markdown
3. **Readability + html2text** (final fallback): HTML cleaning and conversion

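The tiered behavior above can be sketched as a chain that tries each extractor in priority order and returns the first usable result. This is illustrative only — the extractor functions here are stubs standing in for Exa, trafilatura, and readability + html2text, not the actual Letta implementation:

```python
from typing import Callable, Optional

def extract_with_fallbacks(html: str,
                           extractors: list[Callable[[str], Optional[str]]]) -> str:
    """Try each extractor in priority order; return the first non-empty result."""
    for extract in extractors:
        try:
            text = extract(html)
        except Exception:
            continue  # a failed tier falls through to the next one
        if text:
            return text
    return ""  # no tier could extract the page

# Stub tiers for illustration.
def exa_stub(html: str) -> Optional[str]:
    return None  # e.g. no EXA_API_KEY configured

def trafilatura_stub(html: str) -> Optional[str]:
    return "extracted markdown"

result = extract_with_fallbacks("<html>...</html>", [exa_stub, trafilatura_stub])
print(result)  # extracted markdown
```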
## Self-Hosted Setup

For enhanced fetching on self-hosted servers, optionally configure an Exa API key. Without it, the tool still works using open-source extraction.

### Optional: Configure Exa

<CodeGroup>
```bash Docker
docker run \
  -e EXA_API_KEY="your_exa_api_key" \
  letta/letta:latest
```

```yaml Docker Compose
services:
  letta:
    environment:
      - EXA_API_KEY=your_exa_api_key
```

```bash Server
export EXA_API_KEY="your_exa_api_key"
letta server
```

```python Per-Agent
agent = client.agents.create(
    tools=["fetch_webpage"],
    tool_env_vars={
        "EXA_API_KEY": "your_exa_api_key"
    }
)
```
</CodeGroup>

## Common Patterns

### Documentation Reader

```python
agent = client.agents.create(
    model="openai/gpt-4o",
    tools=["fetch_webpage", "web_search"],
    memory_blocks=[{
        "label": "persona",
        "value": "I search for documentation with web_search and read it with fetch_webpage."
    }]
)
```

### Research Assistant

```python
agent = client.agents.create(
    model="openai/gpt-4o",
    tools=["fetch_webpage", "archival_memory_insert"],
    memory_blocks=[{
        "label": "persona",
        "value": "I fetch articles and store key insights in archival memory for later reference."
    }]
)
```

### Content Summarizer

```python
agent = client.agents.create(
    model="openai/gpt-4o",
    tools=["fetch_webpage"],
    memory_blocks=[{
        "label": "persona",
        "value": "I fetch webpages and provide summaries of their content."
    }]
)
```

## When to Use

| Use Case | Tool | Why |
|----------|------|-----|
| Read specific webpage | `fetch_webpage` | Direct URL access |
| Find webpages to read | `web_search` | Discovery first |
| Read + search in one | `web_search` with `include_text=true` | Combined operation |
| Multiple pages | `fetch_webpage` | Iterate over URLs |

## Related Documentation

- [Utilities Overview](/guides/agents/prebuilt-tools)
- [Web Search](/guides/agents/web-search)
- [Run Code](/guides/agents/run-code)
- [Custom Tools](/guides/agents/custom-tools)
- [Tool Variables](/guides/agents/tool-variables)