matrix-bridge-legacy/BRIDGE_DESIGN.md

# Matrix-Letta Bridge Design Plan

## Current State

The bridge is **dumb transport**. It:
- Connects to Matrix (E2EE)
- Receives messages from rooms
- Sends to ONE Letta agent per room
- Receives response
- Displays to user

That's it.

**What it DOES NOT do (yet)**:
- Show tool execution progress
- Parse anything but `assistant_message` from Letta responses
- Handle multi-agent orchestration (that's Letta's job)
- Store task state (that's Letta's job)
- Provide confirmations (that's Letta's job, if ever)

---

## What Letta Handles

| Feature | Owner |
|---------|-------|
| Multi-agent coordination | ✅ Letta |
| Task persistence | ✅ Letta |
| Context switching | ✅ Letta |
| Tool execution | ✅ Letta |
| Agent routing logic | ✅ Letta |
| State resumption | ✅ Letta |

**Bridge role**: Just get the message there and back.

---

## What We Want to Add to Bridge

**Single focus**: Tool execution visibility via emojis.

Letta runs tools (search, read mail, etc.) but the bridge ignores all that. User just sees silence, then a result appears. We want to **surface what's happening** through reactions.

---

## The Problem

Current Letta response parsing:

```python
# bridge-e2ee.py:1149 - CURRENT CODE
for msg in data.get("messages", []):
    if msg.get("message_type") == "assistant_message":  # ❌ ONLY captures text
        content = msg.get("content", "")
        if content:
            assistant_messages.append(content)

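# Everything else is dropped. The type names below are this plan's working
# assumption; confirm them in the V0 logging pass (Letta's API may use names
# such as tool_call_message / tool_return_message / reasoning_message):
# - tool_call (ignored)
# - reasoning_step (ignored)
# - tool_result (ignored)
# - error_message (ignored)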
```

**User experience**:
```
User: "Curate my unread emails"
       (silence for 30 seconds)
Ani:  "Found 3 actionable emails..."
```

**Desired experience**:
```
User: "Curate my unread emails"
Ani:  🧠 Working...
      🔍📖📋 (reactions appear as tools run)
Ani:  "Found 3 actionable emails..."
```

---

## Solution: Tool Indication via Reactions

### Simple Implementation

```python
# 1. Parse ALL message types
for msg in data.get("messages", []):
    msg_type = msg.get("message_type", "unknown")

    if msg_type == "assistant_message":
        # Capture text response (skip empty content, as the current code does)
        content = msg.get("content", "")
        if content:
            assistant_messages.append(content)

    elif msg_type == "tool_call":
        # Post progress message with emoji
        tool_name = msg.get("tool_name")
        emoji = get_emoji_for_tool(tool_name)
        await self.post_progress_message(room_id, f"🧠 Working... {emoji}")

# 2. Delete/edit progress when response arrives
await self.finalize_progress(room_id, final_response)
```

### Tool → Emoji Mapping

```python
EMOJI_MAP = {
    # Read operations
    "read_mail": "📖",
    "read_file": "📖",
    "retrieve_memory": "📖",
    "get_calendar": "📖",

    # Write operations
    "send_email": "✍️",
    "save_note": "✍️",
    "write_file": "✍️",

    # Search operations
    "search_web": "🔍",
    "google_search": "🔍",
    "web_search": "🔍",

    # Compute/Process
    "calculate": "🔧",
    "process_image": "🔧",
    "analyze": "🔧",

    # List/Browse
    "list_emails": "📋",
    "list_files": "📋",
    "list": "📋",

    # Default
    "default": "⚙️",
    "error": "⚠️"
}
```
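The simple implementation above calls `get_emoji_for_tool()`, which this plan does not define. A minimal sketch of that lookup, assuming unknown or missing tool names should fall back to the `"default"` entry (the `SAMPLE_EMOJI_MAP` subset and the case-insensitive matching are illustrative assumptions, not existing bridge code):

```python
from typing import Mapping, Optional

# Small stand-in for EMOJI_MAP so the sketch runs on its own.
SAMPLE_EMOJI_MAP = {
    "read_mail": "📖",
    "search_web": "🔍",
    "list": "📋",
    "default": "⚙️",
}


def get_emoji_for_tool(
    tool_name: Optional[str],
    emoji_map: Mapping[str, str] = SAMPLE_EMOJI_MAP,
) -> str:
    """Return the emoji for a tool, falling back to the default entry."""
    if not tool_name:
        return emoji_map["default"]
    # Normalise so "Search_Web " and "search_web" resolve to the same entry.
    return emoji_map.get(tool_name.strip().lower(), emoji_map["default"])
```

Keeping the fallback inside the helper means callers never have to handle a missing key when Letta introduces a tool the map does not know yet.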

---

## Implementation Approach

### Option A: Separate Progress Message

```
[User]: "Curate my emails"

[Ani]: 🧠 Working...
        🔍 (reaction added after search_web)
        📖 (reaction added after read_mail)
        ✍️ (reaction added after save_note)

[Ani]: "Found 3 actionable emails..."
```

### Option B: Inline Update (Edit)

```
[User]: "Curate my emails"

[Ani]: 🧠 Working...
        [edit] 🔍 Searching...
        [edit] 📖 Reading emails...
        [edit] "Found 3 actionable emails..."
```

**Recommendation**: Option A (more compatible with E2EE)
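For Option A, each reaction is an `m.reaction` event that annotates the progress message. The sketch below only builds the event content defined by the Matrix spec; how the bridge sends it (and the `ReactionTracker` name) is an assumption. It also skips repeats, since homeservers typically reject the same reaction from the same sender twice.

```python
from typing import Optional


def build_reaction_content(target_event_id: str, emoji: str) -> dict:
    """Build the content of an m.reaction event annotating target_event_id."""
    return {
        "m.relates_to": {
            "rel_type": "m.annotation",
            "event_id": target_event_id,
            "key": emoji,
        }
    }


class ReactionTracker:
    """Remembers which emojis were already added to one progress message."""

    def __init__(self, progress_event_id: str) -> None:
        self.progress_event_id = progress_event_id
        self._sent: set[str] = set()

    def next_reaction(self, emoji: str) -> Optional[dict]:
        """Return reaction content for a new emoji, or None for a repeat."""
        if emoji in self._sent:
            return None
        self._sent.add(emoji)
        return build_reaction_content(self.progress_event_id, emoji)
```

With this shape, two `read_mail` calls in a row produce a single 📖 reaction rather than an error on the second send.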

---

## Files to Modify

| File | Change |
|------|--------|
| `bridge-e2ee.py` | `send_to_letta()` - parse all message types |
| `bridge-e2ee.py` | Add `tool_categories` and `EMOJI_MAP` |
| `bridge-e2ee.py` | Add `post_progress_message()` and `finalize_progress()` |
| `bridge-e2ee.py` | Optional: store tool executions in DB for audit |

---

## Minimal Implementation (First Pass)

```python
# In send_to_letta(), after parsing all messages:

# Check if any non-assistant messages exist (tools, reasoning, etc.)
messages = data.get("messages", [])
non_assistant = [m for m in messages if m.get("message_type") != "assistant_message"]

if non_assistant:
    # First pass: log only. Posting emojis to Matrix *during* the Letta call
    # needs streaming or a callback pattern (see the reality check below).
    log.info("[Letta] Agent is working: %s", [m.get("message_type") for m in non_assistant])
```

**Reality check**: The current `send_to_letta()` blocks until Letta finishes. To show progress DURING execution, we'd need:
- Streaming responses from Letta, OR
- A callback/event pattern, OR
- Polling mechanism

**Simplest first step**: Just log which tools ran once the response comes back. We can add the Matrix progress in V2.

---

## Simplified Roadmap

| Phase | What | Effort |
|-------|------|--------|
| V0 | Log all message types from Letta (debug) | 30m |
| V1 | Parse tool_call, log tool names | 1h |
| V2 | Post "Working..." message when tools detected | 1h |
| V3 | Add emoji reactions per tool type | 1h |
| V4 | Delete progress message on completion | 30m |

**Total**: ~4 hours for basic tool visibility
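V0 can be as small as a tally of `message_type` values per response. A minimal sketch, assuming the response is the same `data` dict with a `messages` list that the current parsing code reads (the helper names are illustrative):

```python
import logging
from collections import Counter

log = logging.getLogger("bridge")


def summarize_message_types(data: dict) -> Counter:
    """Count how often each message_type appears in one Letta response."""
    return Counter(
        msg.get("message_type", "unknown") for msg in data.get("messages", [])
    )


def log_message_types(data: dict) -> None:
    """Emit one debug line per response with the tally."""
    log.debug("[Letta] message types: %s", dict(summarize_message_types(data)))
```

Running this against real traffic for a while also settles which type and field names the later phases should match on.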

---

## What We're NOT Adding

For now, NO:
- ✅/❌ confirmations (Letta's domain if ever needed)
- Multi-agent routing in bridge (Letta's domain)
- Task persistence in bridge (Letta's domain)
- Context switching in bridge (Letta's domain)

The bridge stays dumb. We just add a little peek into what Letta's doing.

---

## Notes on Letta Multi-Agent

If Letta ever has multiple agents conversing in one room, the bridge just:
1. Receives messages
2. Sends to room
3. Tags sender (if Letta provides agent identity)

Example:
```
[Curator]: "Found 3 emails worth reviewing."
[Ani]:      "Thanks, can you summarize them?"
[Curator]:  "Sure: 1. ..."
```

Bridge just passes through. Agent identity comes from Letta (in `source` or similar field in message).
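Tagging can stay a pure string operation. The sketch below assumes the agent identity arrives in a `source` field, which is only this plan's guess at the field name, and leaves untagged messages unchanged:

```python
def tag_sender(msg: dict) -> str:
    """Prefix the message text with the agent name when Letta provides one."""
    content = msg.get("content", "")
    sender = msg.get("source")  # hypothetical field name, see note above
    if not sender:
        return content
    return f"[{sender}]: {content}"
```

A message without `source` passes through untouched, so single-agent rooms look exactly as they do today.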

---

## Summary

**Bridge role**: Transport + visibility

Current transport: ✅ Working
Current visibility: ❌ Blind (only sees final text)

Add visibility: Tool indication via reactions
Complexity: Low (just parse and display)
Multi-agent: Not our problem (Letta's domain)

---

# MAJOR REFACTOR PLAN: mautrix.util.formatter Migration

## Problem Statement

**Current Issues (Patches Maintained):**
1. **Manual markdown** - ~100 lines of regex patches for false code blocks
2. **Manual color syntax** - `{red|text}` → `<font color="..." data-mx-color="...">`
3. **Manual spoiler syntax** - `||text||` → `<span data-mx-spoiler>`
4. **Manual HTML→text** - html.unescape() and strip_tags()
5. **Emoji shortcode** - custom normalize_emoji_shortcodes() + emoji.emojize()

**Maintenance Burden:**
- Each patch edge case → new regex
- False code block detection keeps breaking
- Color palette manually maintained (MATRIX_COLORS)
- No built-in mention/@user or room pill handling

**Solution:** Migrate to `mautrix.util.formatter` (native Matrix formatting)

---

## API Analysis (from mautrix-python docs)

```python
from mautrix.util import markdown
from mautrix.util.formatter import parse_html, MarkdownString, MatrixParser, EntityType

# HTML → Plain Text (ASYNC)
plain_text = await parse_html(html_input)
# Returns: "Hello world!\n• Item 1\n• Item 2"

# Markdown → HTML (SYNC) lives in mautrix.util.markdown, not in the formatter
html_output = markdown.render("**Bold** text", allow_html=True)
# Returns HTML containing <strong>Bold</strong>

# MarkdownString.format() (SYNC) wraps text in Markdown markers for an
# EntityType; it does NOT render HTML
bold = MarkdownString("Bold").format(EntityType.BOLD)
bold.text  # '**Bold**'

# Mentions and pills (ASYNC): MatrixParser reads Matrix HTML, not plain text
parser = MatrixParser()
formatted = await parser.parse('<a href="https://matrix.to/#/@user:example.com">User</a>')
formatted.text  # plain-text rendering of the pill
```

**Key Finding** (from reading the mautrix-python source; re-check against the installed version):
- `parse_html()` is **async** (coroutine) and converts HTML → plain text
- `MarkdownString.format()` is **sync**, takes an `EntityType` (not an `EventType`), and produces Markdown markers rather than HTML
- Markdown → HTML is `mautrix.util.markdown.render()`, which is **sync**
- `MatrixParser.parse()` is **async** and expects HTML input

---

## Impact Analysis

### Functions Requiring Async Conversion

| Function | Current | New | Impact |
|----------|---------|-----|--------|
| `format_html()` | sync | async | **HIGH** - called everywhere |
| `send_message()` | async | async | **MEDIUM** - must now `await format_html()` |
| `on_message()` | async | async | **LOW** - already async |
| `on_image()` | async | async | **LOW** - already async |
| `on_audio()` | async | async | **LOW** - already async |
| `process_queue()` | async | async | **LOW** - already async |

### Call Sites Count

```
format_html() called in:
- send_message()          → 20+ callers
- Various message handlers → scattered throughout

Every direct format_html() call must gain an `await`.
```

**Estimated:** ~25 call site updates needed
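The main hazard in these updates is a call site that is missed: calling an `async def` without `await` returns a coroutine object instead of the `(plain, html)` tuple. The sketch below uses a stub `format_html()` (not the real one) to show both the correct call and how the mistake looks:

```python
import asyncio


async def format_html(text: str) -> tuple[str, str]:
    """Stub standing in for the real async formatter."""
    return text, f"<p>{text}</p>"


async def send_message_fixed(text: str) -> tuple[str, str]:
    # Correct: await the coroutine before unpacking.
    plain, html = await format_html(text)
    return plain, html


def forgot_await(text: str) -> bool:
    # Missed call site: this is a coroutine object, not a tuple.
    pending = format_html(text)
    is_coroutine = asyncio.iscoroutine(pending)
    pending.close()  # avoid the "never awaited" RuntimeWarning in this demo
    return is_coroutine
```

In the real bridge a missed site usually surfaces as a `TypeError` on tuple unpacking plus a "coroutine was never awaited" warning, which makes the logs a useful checklist during Phase 2.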

---

## Refactor Implementation Plan

### Phase 1: Infrastructure (2h)

**1.1 Update imports**
```python
from mautrix.util import markdown                # Markdown → HTML (sync)
from mautrix.util.formatter import parse_html    # HTML → plain text (async)
```

**1.2 Create async format_html()**
```python
async def format_html(text: str) -> tuple[str, str]:
    """
    Format text using mautrix's formatting utilities.

    Args:
        text: Response from Letta (markdown or HTML)

    Returns:
        (plain_text, html_body) tuple
    """
    try:
        # Strip whitespace
        text = text.strip()

        # Convert emoji shortcodes (keep existing behavior)
        text = normalize_emoji_shortcodes(text)
        text = emoji.emojize(text, language='en')

        # Decide the path before color expansion injects <font> tags
        is_html = text.startswith('<') and '>' in text

        # mautrix does not know our {color|text} syntax, so expand it first
        text = _apply_color_syntax(text)

        if is_html:
            # HTML path → use as-is
            html = text
        else:
            # Markdown path → render to HTML (SYNC). allow_html keeps the
            # <font> tags produced above.
            # NOTE: CommonMark has no ||spoiler|| syntax; confirm spoilers
            # in Phase 4 before removing the existing pre-processing.
            html = markdown.render(text, allow_html=True)

        # Generate plain text (ASYNC)
        plain = await parse_html(html)

        return plain, html

    except Exception as e:
        log.warning(f"HTML formatting failed: {e}")
        fallback = emoji.emojize(text, language='en')
        return fallback, fallback
```

**1.3 Color syntax helper**
```python
def _apply_color_syntax(text: str) -> str:
    """Convert {color|text} to <font> tags with Matrix color attributes."""
    def replace_color(match):
        color = match.group(1)
        content = match.group(2)

        # Resolve color name to hex; unknown names pass through unchanged
        hex_color = MATRIX_COLORS.get(color, color)

        # Matrix clients read data-mx-color; the plain color attribute is
        # kept as a fallback for other HTML renderers
        return f'<font color="{hex_color}" data-mx-color="{hex_color}">{content}</font>'

    # Requires `import re` at module level
    return re.sub(r'\{([a-zA-Z0-9_#]+)\|([^}]+)\}', replace_color, text)
```

---

### Phase 2: Update Call Sites (1h)

**2.1 Update send_message()**
```python
# BEFORE
def format_html(text: str) -> tuple[str, str]:  # SYNC
    ...

async def send_message(self, room_id: RoomID, text: str) -> str | None:
    plain_text, html_body = format_html(text)  # SYNC call
    ...

# AFTER
async def format_html(text: str) -> tuple[str, str]:  # ASYNC
    ...

async def send_message(self, room_id: RoomID, text: str) -> str | None:
    plain_text, html_body = await format_html(text)  # AWAIT
    ...
```

**2.2 Scattered callers**
- `on_message()` → already async, just add await
- `on_image()` → already async, just add await
- `on_audio()` → already async, just add await
- `process_queue()` → already async, just add await
- Status handlers → already async, just add await

---

### Phase 3: Remove Dead Code (30m)

**Delete these functions (now handled by mautrix):**

```python
# apply_matrix_extensions()     - delete once spoilers are confirmed to render without it
# enhance_html()                - mautrix generates proper HTML
# apply_chromatophores()        - could keep if desired, optional
# False code block detection    - should no longer be needed; confirm in Phase 4
# normalize_emoji_shortcodes()  - still called by format_html(); keep unless proven redundant
```

**Delete MATRIX_COLORS** (or move to config):
- mautrix handles hex colors directly
- Named colors can stay if needed, but resolve to hex first
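Resolving names to hex up front can also act as validation, so arbitrary strings never reach the HTML attributes. A minimal sketch, assuming six-digit `#rrggbb` values are what should be emitted (the `SAMPLE_COLORS` values and the `resolve_color` name are illustrative, not the bridge's real palette):

```python
import re
from typing import Mapping, Optional

SAMPLE_COLORS = {"red": "#ff0000", "green": "#00ff00"}  # hypothetical values
_HEX_RE = re.compile(r"#[0-9a-fA-F]{6}")


def resolve_color(name: str, palette: Mapping[str, str] = SAMPLE_COLORS) -> Optional[str]:
    """Return a lowercase #rrggbb code for a name or hex string, else None."""
    candidate = palette.get(name.lower(), name)
    if _HEX_RE.fullmatch(candidate):
        return candidate.lower()
    return None
```

A caller can then leave `{blurple|text}` untouched when `resolve_color()` returns `None`, instead of emitting an invalid color attribute.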

---

### Phase 4: Testing (1h)

**Test Cases:**
1. Bold/italic markdown → `<strong>`/`<em>` tags
2. Code blocks → proper `<pre><code>`
3. Links → `<a href="...">`
4. Spoilers `||text||` → `<span data-mx-spoiler>`
5. Colors `{red|text}` → `<font color="..." data-mx-color="...">`
6. Emoji shortcodes `:heart:` → ❤️
7. Mentions `@user:server` → Matrix pills (if using MatrixParser)
8. Room pills `#room:server` → Matrix pills (if using MatrixParser)

**Regression Tests:**
- Existing messages render correctly
- Tool reactions still work
- TTS still works
- Queue still works

---

## Risks and Mitigations

| Risk | Impact | Mitigation |
|------|--------|------------|
| Async conversion bugs | HIGH | Work on copy `ani_e2ee_bridge.py`, keep original |
| Color syntax breaking | MEDIUM | Keep `_apply_color_syntax()` as adapter |
| Spoiler syntax change | MEDIUM | Test that spoilers (`\|\|text\|\|`) still work |
| False code blocks returning | LOW | mautrix shouldn't have this bug, but monitor |

---

## Rollback Plan

If the refactor fails:

1. The original `bridge-e2ee.py` is preserved intact
2. `ani_e2ee_bridge.py` is the experimental copy
3. Promote or revert as needed:
   ```bash
   # Promote the experimental copy (back up the original first)
   cp bridge-e2ee.py bridge-e2ee.backup.py
   cp ani_e2ee_bridge.py bridge-e2ee.py

   # OR roll back to the original
   cp bridge-e2ee.backup.py bridge-e2ee.py
   ```

---

## Estimated Total Effort

| Phase | Time |
|-------|------|
| Phase 1: Infrastructure | 2h |
| Phase 2: Update Call Sites | 1h |
| Phase 3: Remove Dead Code | 30m |
| Phase 4: Testing | 1h |
| **Total** | **~4.5h** |

---

## Benefits After Migration

- **~100 fewer lines** of manual patch code
- **Native spoiler support** - no regex
- **Native color formatting** via EntityType.COLOR
- **Built-in mention/@user support** (EntityType.USER_MENTION)
- **Built-in room pill support** (EntityType.ROOM_MENTION)
- **Better emoji handling** - less custom code
- **Future-proof** - mautrix evolves, we get updates
- **Less maintenance** - fewer edge cases to patch