matrix-bridge-legacy/BRIDGE_DESIGN.md
2026-03-28 23:50:54 -04:00

# Matrix-Letta Bridge Design Plan
## Current State
The bridge is **dumb transport**. It:
- Connects to Matrix (E2EE)
- Receives messages from rooms
- Sends to ONE Letta agent per room
- Receives response
- Displays to user
That's it.
**What it DOES NOT do (yet)**:
- Show tool execution progress
- Parse anything but `assistant_message` from Letta responses
- Handle multi-agent orchestration (that's Letta's job)
- Store task state (that's Letta's job)
- Provide confirmations (that's Letta's job, if ever)
---
## What Letta Handles
| Feature | Owner |
|---------|-------|
| Multi-agent coordination | ✅ Letta |
| Task persistence | ✅ Letta |
| Context switching | ✅ Letta |
| Tool execution | ✅ Letta |
| Agent routing logic | ✅ Letta |
| State resumption | ✅ Letta |
**Bridge role**: Just get the message there and back.
---
## What We Want to Add to Bridge
**Single focus**: Tool execution visibility via emojis.
Letta runs tools (search, read mail, etc.), but the bridge ignores all of that. The user sees silence, then a result appears. We want to **surface what's happening** through reactions.
---
## The Problem
Current Letta response parsing:
```python
# bridge-e2ee.py:1149 - CURRENT CODE
for msg in data.get("messages", []):
    if msg.get("message_type") == "assistant_message":  # ❌ ONLY captures text
        content = msg.get("content", "")
        if content:
            assistant_messages.append(content)
    # Everything else is dropped:
    # - tool_call (ignored)
    # - reasoning_step (ignored)
    # - tool_result (ignored)
    # - error_message (ignored)
```
**User experience**:
```
User: "Curate my unread emails"
(silence for 30 seconds)
Ani: "Found 3 actionable emails..."
```
**Desired experience**:
```
User: "Curate my unread emails"
Ani: 🧠 Working...
🔍📖📋 (reactions appear as tools run)
Ani: "Found 3 actionable emails..."
```
---
## Solution: Tool Indication via Reactions
### Simple Implementation
```python
# 1. Parse ALL message types
for msg in data.get("messages", []):
    msg_type = msg.get("message_type", "unknown")
    if msg_type == "assistant_message":
        # Capture text response
        assistant_messages.append(msg.get("content", ""))
    elif msg_type == "tool_call":
        # Post progress message with emoji
        tool_name = msg.get("tool_name")
        emoji = get_emoji_for_tool(tool_name)
        await self.post_progress_message(room_id, f"🧠 Working... {emoji}")
# 2. Delete/edit progress when response arrives
await self.finalize_progress(room_id, final_response)
```
### Tool → Emoji Mapping
```python
EMOJI_MAP = {
    # Read operations
    "read_mail": "📖",
    "read_file": "📖",
    "retrieve_memory": "📖",
    "get_calendar": "📖",
    # Write operations
    "send_email": "✍️",
    "save_note": "✍️",
    "write_file": "✍️",
    # Search operations
    "search_web": "🔍",
    "google_search": "🔍",
    "web_search": "🔍",
    # Compute/Process
    "calculate": "🔧",
    "process_image": "🔧",
    "analyze": "🔧",
    # List/Browse
    "list_emails": "📋",
    "list_files": "📋",
    "list": "📋",
    # Default
    "default": "⚙️",
    "error": "⚠️",
}
```
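The earlier snippet calls `get_emoji_for_tool()` without defining it. A minimal sketch of that lookup, assuming unknown or missing tool names should fall back to the `"default"` entry (the fallback behaviour is an assumption, and the map here is a trimmed copy so the sketch runs on its own):

```python
from typing import Optional

# Trimmed copy of the mapping above so this sketch is self-contained.
EMOJI_MAP = {
    "read_mail": "📖",
    "send_email": "✍️",
    "search_web": "🔍",
    "list_emails": "📋",
    "default": "⚙️",
    "error": "⚠️",
}


def get_emoji_for_tool(tool_name: Optional[str]) -> str:
    """Return the emoji for a tool name, falling back to the default entry."""
    if not tool_name:
        return EMOJI_MAP["default"]
    return EMOJI_MAP.get(tool_name, EMOJI_MAP["default"])
```

With this shape, a tool Letta adds later still produces a reaction (the gear) instead of raising a `KeyError`.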
---
## Implementation Approach
### Option A: Separate Progress Message
```
[User]: "Curate my emails"
[Ani]: 🧠 Working...
🔍 (reaction added after search_web)
📖 (reaction added after read_mail)
✍️ (reaction added after save_note)
[Ani]: "Found 3 actionable emails..."
```
### Option B: Inline Update (Edit)
```
[User]: "Curate my emails"
[Ani]: 🧠 Working...
[edit] 🔍 Searching...
[edit] 📖 Reading emails...
[edit] "Found 3 actionable emails..."
```
**Recommendation**: Option A (more compatible with E2EE)
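For Option A, each reaction is a Matrix `m.reaction` event that annotates the progress message. A minimal sketch of the event content follows; `build_reaction_content` is a hypothetical helper name and the event ID is a placeholder. How the bridge actually sends the event depends on its Matrix client and is not shown.

```python
def build_reaction_content(target_event_id: str, emoji: str) -> dict:
    """Build the content of an m.reaction event that annotates target_event_id."""
    return {
        "m.relates_to": {
            "rel_type": "m.annotation",   # reactions are annotation relations
            "event_id": target_event_id,  # the "Working..." progress message
            "key": emoji,                 # the emoji shown as the reaction
        }
    }


# Placeholder event ID; the real one comes back from sending the progress message.
content = build_reaction_content("$progress_event_id", "🔍")
```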
---
## Files to Modify
| File | Change |
|------|--------|
| `bridge-e2ee.py` | `send_to_letta()` - parse all message types |
| `bridge-e2ee.py` | Add `tool_categories` and `EMOJI_MAP` |
| `bridge-e2ee.py` | Add `post_progress_message()` and `finalize_progress()` |
| `bridge-e2ee.py` | Optional: store tool executions in DB for audit |
---
## Minimal Implementation (First Pass)
```python
# In send_to_letta(), after parsing all messages:
messages = data.get("messages", [])
# Check if any non-assistant messages exist (tools, reasoning, etc.)
non_assistant = [m for m in messages if m.get("message_type") != "assistant_message"]
if non_assistant:
    # Post working message
    log.info(f"[Letta] Agent is working: {[m.get('message_type') for m in non_assistant]}")
    # For now, just log it. Emojis require bridge to post to Matrix during Letta call,
    # which means we need to handle streaming or callback pattern.
```
**Reality check**: The current `send_to_letta()` blocks until Letta finishes. To show progress DURING execution, we'd need:
- Streaming responses from Letta, OR
- A callback/event pattern, OR
- Polling mechanism
**Simplest first step**: Just log what tools are being executed. We can add the Matrix progress in V2.
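To make the streaming/callback option concrete, here is a minimal sketch under stated assumptions: `fake_letta_stream()` is a hypothetical stand-in for a streamed Letta response, and the message fields (`message_type`, `tool_name`, `content`) follow the names used in this document rather than a verified Letta schema. The point is only the shape: the callback fires as each tool call arrives, before the final text exists.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def fake_letta_stream() -> AsyncIterator[dict]:
    """Hypothetical stand-in for a streamed Letta response."""
    for msg in (
        {"message_type": "tool_call", "tool_name": "search_web"},
        {"message_type": "tool_call", "tool_name": "read_mail"},
        {"message_type": "assistant_message", "content": "Found 3 actionable emails..."},
    ):
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield msg


async def consume_stream(
    stream: AsyncIterator[dict],
    on_tool_call: Callable[[str], Awaitable[None]],
) -> str:
    """Call on_tool_call as each tool call arrives; return the joined reply text."""
    parts: list[str] = []
    async for msg in stream:
        msg_type = msg.get("message_type")
        if msg_type == "tool_call":
            await on_tool_call(msg.get("tool_name") or "unknown")
        elif msg_type == "assistant_message" and msg.get("content"):
            parts.append(msg["content"])
    return "\n".join(parts)


async def demo() -> tuple[list[str], str]:
    seen: list[str] = []

    async def record(tool_name: str) -> None:
        seen.append(tool_name)  # the bridge would add a Matrix reaction here

    reply = await consume_stream(fake_letta_stream(), record)
    return seen, reply


seen_tools, reply_text = asyncio.run(demo())
```

In the bridge, `record` would be replaced by a coroutine that posts the reaction, which keeps Matrix I/O out of the parsing loop.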
---
## Simplified Roadmap
| Phase | What | Effort |
|-------|------|--------|
| V0 | Log all message types from Letta (debug) | 30m |
| V1 | Parse tool_call, log tool names | 1h |
| V2 | Post "Working..." message when tools detected | 1h |
| V3 | Add emoji reactions per tool type | 1h |
| V4 | Delete progress message on completion | 30m |
**Total**: ~4 hours for basic tool visibility
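The V0 step can be as small as a tally of message types per response. A minimal sketch, assuming the response is a dict with a `messages` list as in the snippets above (`summarize_message_types` is a hypothetical helper name and the sample payload is made up for illustration):

```python
from collections import Counter


def summarize_message_types(data: dict) -> Counter:
    """Count how many messages of each message_type a response contains."""
    return Counter(
        msg.get("message_type", "unknown") for msg in data.get("messages", [])
    )


# Made-up payload for illustration only.
sample = {
    "messages": [
        {"message_type": "tool_call"},
        {"message_type": "tool_result"},
        {"message_type": "tool_call"},
        {"message_type": "assistant_message", "content": "done"},
    ]
}
summary = summarize_message_types(sample)
```

Logging `dict(summary)` once per response shows which message types Letta actually emits before any parsing logic is built on them.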
---
## What We're NOT Adding
For now, NO:
- ✅/❌ confirmations (Letta's domain if ever needed)
- Multi-agent routing in bridge (Letta's domain)
- Task persistence in bridge (Letta's domain)
- Context switching in bridge (Letta's domain)
The bridge stays dumb. We just add a little peek into what Letta's doing.
---
## Notes on Letta Multi-Agent
If Letta ever has multiple agents conversing in one room, the bridge just:
1. Receives messages
2. Sends to room
3. Tags sender (if Letta provides agent identity)
Example:
```
[Curator]: "Found 3 emails worth reviewing."
[Ani]: "Thanks, can you summarize them?"
[Curator]: "Sure: 1. ..."
```
Bridge just passes through. Agent identity comes from Letta (in `source` or similar field in message).
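A minimal sketch of the sender tagging, assuming the agent name arrives in a `source` field as suggested above (the field name is a guess until Letta's payload is confirmed, and `tag_sender` is a hypothetical helper):

```python
def tag_sender(msg: dict, default_name: str = "Ani") -> str:
    """Prefix a message with its agent name, falling back to the room's default agent."""
    name = msg.get("source") or default_name  # "source" is an assumed field name
    return f"[{name}]: {msg.get('content', '')}"


tagged = tag_sender({"source": "Curator", "content": "Found 3 emails worth reviewing."})
untagged = tag_sender({"content": "Thanks, can you summarize them?"})
```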
---
## Summary
**Bridge role**: Transport + visibility
- Current transport: ✅ Working
- Current visibility: ❌ Blind (only sees final text)
- Add visibility: Tool indication via reactions
- Complexity: Low (just parse and display)
- Multi-agent: Not our problem (Letta's domain)
---
# MAJOR REFACTOR PLAN: mautrix.util.formatter Migration
## Problem Statement
**Current Issues (Patches Maintained):**
1. **Manual markdown** - ~100 lines of regex patches for false code blocks
2. **Manual color syntax** - `{red|text}` → `<font color="..." data-mx-color="...">`
3. **Manual spoiler syntax** - `||text||` → `<span data-mx-spoiler>`
4. **Manual HTML→text** - html.unescape() and strip_tags()
5. **Emoji shortcode** - custom normalize_emoji_shortcodes() + emoji.emojize()
**Maintenance Burden:**
- Each patch edge case → new regex
- False code block detection keeps breaking
- Color palette manually maintained (MATRIX_COLORS)
- No built-in mention/@user or room pill handling
**Solution:** Migrate to `mautrix.util.formatter` (native Matrix formatting)
---
## API Analysis (from mautrix-python docs)
```python
# Verify each call against the installed mautrix version before relying on it.
from mautrix.util import markdown
from mautrix.util.formatter import EntityType, MarkdownString, MatrixParser, parse_html
# HTML → Plain Text (ASYNC)
plain_text = await parse_html(html_input)
# Returns the text form of the HTML (the default formatter emits Markdown-style markers)
# Markdown → HTML (SYNC): lives in mautrix.util.markdown, not in the formatter.
# It is CommonMark-based, so ||spoiler|| and {color|text} are not rendered for us.
html_output = markdown.render("**Bold** text")
# MarkdownString.format() (SYNC) takes an EntityType and wraps the text in
# Markdown markers; it returns the MarkdownString itself, not HTML.
bold = MarkdownString("Bold").format(EntityType.BOLD)
# Mentions and pills (ASYNC): the parser reads HTML, so pills arrive as matrix.to links
parser = MatrixParser()
formatted = await parser.parse('<a href="https://matrix.to/#/@user:example.com">User</a>')
formatted.text  # plain-text rendering of the mention
```
**Key Finding:**
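- `parse_html()` is **async** (coroutine)
- `MarkdownString.format()` is **sync**, takes an `EntityType` (not an `EventType`), and returns the `MarkdownString` rather than HTML
- Markdown → HTML rendering is a separate **sync** helper, `mautrix.util.markdown.render()`
- `MatrixParser.parse()` is **async** and takes HTML input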
---
## Impact Analysis
### Functions Requiring Async Conversion
| Function | Current | New | Impact |
|----------|---------|-----|--------|
| `format_html()` | sync | async | **HIGH** - called everywhere |
| `send_message()` | sync | async | **MEDIUM** - many call sites |
| `on_message()` | async | async | **LOW** - already async |
| `on_image()` | async | async | **LOW** - already async |
| `on_audio()` | async | async | **LOW** - already async |
| `process_queue()` | async | async | **LOW** - already async |
### Call Sites Count
```text
format_html() called in:
- send_message() → 20+ callers
- Various message handlers → scattered throughout
All send_message() callers must be updated to await the result.
```
**Estimated:** ~25 call site updates needed
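Rather than counting call sites by hand, the un-awaited calls can be located with the standard-library `ast` module. The sketch below is one possible approach, not part of the bridge: `find_unawaited_calls` is a hypothetical helper and the sample source is made up. It only recognises a call as awaited when it is the direct operand of `await`, so a call wrapped in something like `asyncio.gather(...)` would be reported and needs a manual look.

```python
import ast
from typing import Optional

ASYNC_TARGETS = {"format_html", "send_message"}


def _call_name(call: ast.Call) -> Optional[str]:
    """Return the bare function or method name of a call, if it has one."""
    func = call.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None


def find_unawaited_calls(source: str, targets=ASYNC_TARGETS) -> list:
    """Return sorted (line, name) pairs for target calls not directly awaited."""
    tree = ast.parse(source)
    # Calls that appear as the direct operand of an `await` expression.
    awaited = {id(n.value) for n in ast.walk(tree) if isinstance(n, ast.Await)}
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and id(node) not in awaited:
            name = _call_name(node)
            if name in targets:
                hits.append((node.lineno, name))
    return sorted(hits)


# Made-up source for illustration: the first call is missing its await.
SAMPLE = '''
async def on_message(self, room_id, text):
    plain, html = format_html(text)
    await self.send_message(room_id, text)
'''
hits = find_unawaited_calls(SAMPLE)
```

Running it over `bridge-e2ee.py` after Phase 1 gives a checklist for Phase 2.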
---
## Refactor Implementation Plan
### Phase 1: Infrastructure (2h)
**1.1 Update imports**
```python
from mautrix.util import markdown  # Markdown → HTML renderer
from mautrix.util.formatter import parse_html, MarkdownString, MatrixParser, EntityType
```
**1.2 Create async format_html()**
```python
async def format_html(text: str) -> tuple[str, str]:
    """
    Format text using mautrix utilities.

    Args:
        text: Response from Letta (markdown or HTML)
    Returns:
        (plain_text, html_body) tuple
    """
    # Needs: from mautrix.util import markdown
    try:
        # Strip whitespace
        text = text.strip()
        # Convert emoji shortcodes (keep existing behavior)
        text = normalize_emoji_shortcodes(text)
        text = emoji.emojize(text, language='en')
        # Pre-process {color|text}; mautrix does not know this custom syntax
        text = _apply_color_syntax(text)
        # HTML path → parse to plain (ASYNC)
        if text.startswith('<') and '>' in text:
            plain = await parse_html(text)
            return plain, text
        # Markdown path → render to HTML (SYNC).
        # MarkdownString.format() wraps text in Markdown markers rather than
        # producing HTML, so this sketch assumes mautrix.util.markdown.render();
        # allow_html=True keeps the <font> tags emitted above.
        html = markdown.render(text, allow_html=True)
        # Generate plain text (ASYNC)
        plain = await parse_html(html)
        return plain, html
    except Exception as e:
        log.warning(f"HTML formatting failed: {e}")
        fallback = emoji.emojize(text)
        return fallback, fallback
```
**1.3 Color syntax helper**
```python
import re

def _apply_color_syntax(text: str) -> str:
    """Convert {color|text} to Matrix color markup (<font> tags)."""
    def replace_color(match: re.Match) -> str:
        color = match.group(1)
        content = match.group(2)
        # Resolve color name; hex codes and unknown names pass through unchanged
        hex_color = MATRIX_COLORS.get(color, color)
        # Emit both color and data-mx-color, matching the current bridge output
        return f'<font color="{hex_color}" data-mx-color="{hex_color}">{content}</font>'
    return re.sub(r'\{([a-zA-Z0-9_#]+)\|([^}]+)\}', replace_color, text)
```
---
### Phase 2: Update Call Sites (1h)
**2.1 Update send_message()**
```python
# BEFORE
def format_html(text: str) -> tuple[str, str]:  # SYNC
    ...

async def send_message(self, room_id: RoomID, text: str) -> str | None:
    plain_text, html_body = format_html(text)  # SYNC call
    ...

# AFTER
async def format_html(text: str) -> tuple[str, str]:  # ASYNC
    ...

async def send_message(self, room_id: RoomID, text: str) -> str | None:
    plain_text, html_body = await format_html(text)  # AWAIT
    ...
```
**2.2 Scattered callers**
- `on_message()` → already async, just add await
- `on_image()` → already async, just add await
- `on_audio()` → already async, just add await
- `process_queue()` → already async, just add await
- Status handlers → already async, just add await
---
### Phase 3: Remove Dead Code (30m)
**Delete these functions (now handled by mautrix):**
```python
# ~~apply_matrix_extensions()~~ - delete only once spoilers are confirmed to render without it
# ~~enhance_html()~~ - mautrix generates proper HTML
# ~~apply_chromatophores()~~ - could keep if desired, optional
# ~~False code block detection~~ - mautrix doesn't have this bug
# ~~normalize_emoji_shortcodes()~~ - mautrix may handle, keep if needed
```
**Delete MATRIX_COLORS** (or move to config):
- mautrix handles hex colors directly
- Named colors can stay if needed, but resolve to hex first
---
### Phase 4: Testing (1h)
**Test Cases:**
1. Bold/italic markdown → `<strong>`/`<em>` tags
2. Code blocks → proper `<pre><code>`
3. Links → `<a href="...">`
4. Spoilers `||text||` → `<span data-mx-spoiler>`
5. Colors `{red|text}` → `<font color="..." data-mx-color="...">`
6. Emoji shortcodes `:heart:` → ❤️
7. Mentions `@user:server` → Matrix pills (if using MatrixParser)
8. Room pills `#room:server` → Matrix pills (if using MatrixParser)
**Regression Tests:**
- Existing messages render correctly
- Tool reactions still work
- TTS still works
- Queue still works
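The spoiler and color cases lend themselves to a small table-driven check that can run against any `format_html()` implementation. The sketch below is an illustration only: `stub_format_html` is a hypothetical regex-based stand-in (not the real function), and the red hex value is an assumed palette entry. Swapping in the real coroutine turns this into the Phase 4 regression check.

```python
import asyncio
import re

# (input, substring expected in the HTML body), taken from the test cases above.
CASES = [
    ("||secret||", "<span data-mx-spoiler>secret</span>"),
    ("{red|alert}", 'data-mx-color="#ff0000"'),  # assumed hex value for "red"
]


async def stub_format_html(text: str) -> tuple:
    """Hypothetical stand-in for format_html(); returns (plain_text, html_body)."""
    html_body = re.sub(r"\|\|(.+?)\|\|", r"<span data-mx-spoiler>\1</span>", text)
    html_body = re.sub(
        r"\{red\|([^}]+)\}",
        r'<font color="#ff0000" data-mx-color="#ff0000">\1</font>',
        html_body,
    )
    plain = re.sub(r"<[^>]+>", "", html_body)  # crude tag strip, for the stub only
    return plain, html_body


async def run_cases(fmt) -> list:
    """Return the inputs whose HTML output lacks the expected substring."""
    failures = []
    for source, expected in CASES:
        _plain, html_body = await fmt(source)
        if expected not in html_body:
            failures.append(source)
    return failures


failures = asyncio.run(run_cases(stub_format_html))
```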
---
## Risks and Mitigations
| Risk | Impact | Mitigation |
|------|--------|------------|
| Async conversion bugs | HIGH | Work on copy `ani_e2ee_bridge.py`, keep original |
| Color syntax breaking | MEDIUM | Keep `_apply_color_syntax()` as adapter |
| Spoiler syntax change | MEDIUM | Test that `\|\|text\|\|` still renders as a spoiler |
| False code blocks returning | LOW | mautrix shouldn't have this bug, but monitor |
---
## Rollback Plan
If refactor fails:
1. Original `bridge-e2ee.py` is preserved intact
2. `ani_e2ee_bridge.py` is the experimental branch
3. Rename/revert as needed:
```bash
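# Set the current bridge aside as a backup
mv bridge-e2ee.py bridge-e2ee.backup.py
# Then EITHER promote the experimental copy...
cp ani_e2ee_bridge.py bridge-e2ee.py
# ...OR restore the original
cp bridge-e2ee.backup.py bridge-e2ee.py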
```
---
## Estimated Total Effort
| Phase | Time |
|-------|------|
| Phase 1: Infrastructure | 2h |
| Phase 2: Update Call Sites | 1h |
| Phase 3: Remove Dead Code | 30m |
| Phase 4: Testing | 1h |
| **Total** | **~4.5h** |
---
## Benefits After Migration
- **~100 fewer lines** of manual patch code
- **Native spoiler support** - no regex
- **Native color formatting** via EntityType.COLOR
- **Built-in mention/@user support** (EntityType.USER_MENTION)
- **Built-in room pill support** (EntityType.ROOM_MENTION)
- **Better emoji handling** - less custom code
- **Future-proof** - mautrix evolves, we get updates
- **Less maintenance** - fewer edge cases to patch