matrix-bridge-legacy/BRIDGE_DESIGN.md
2026-03-28 23:50:54 -04:00

Matrix-Letta Bridge Design Plan

Current State

The bridge is dumb transport. It:

  • Connects to Matrix (E2EE)
  • Receives messages from rooms
  • Sends to ONE Letta agent per room
  • Receives response
  • Displays to user

That's it.

What it DOES NOT do (yet):

  • Show tool execution progress
  • Parse anything but assistant_message from Letta responses
  • Handle multi-agent orchestration (that's Letta's job)
  • Store task state (that's Letta's job)
  • Provide confirmations (that's Letta's job, if ever)

What Letta Handles

| Feature | Owner |
|---|---|
| Multi-agent coordination | Letta |
| Task persistence | Letta |
| Context switching | Letta |
| Tool execution | Letta |
| Agent routing logic | Letta |
| State resumption | Letta |

Bridge role: Just get the message there and back.


What We Want to Add to Bridge

Single focus: Tool execution visibility via emojis.

Letta runs tools (search, read mail, etc.), but the bridge ignores all of that. The user sees silence, then a result appears. We want to surface what is happening through reactions.


The Problem

Current Letta response parsing:

# bridge-e2ee.py:1149 - CURRENT CODE
for msg in data.get("messages", []):
    if msg.get("message_type") == "assistant_message":  # ❌ ONLY captures text
        content = msg.get("content", "")
        if content:
            assistant_messages.append(content)

# Everything else is dropped:
# - tool_call (ignored)
# - reasoning_step (ignored)
# - tool_result (ignored)
# - error_message (ignored)

User experience:

User: "Curate my unread emails"
       (silence for 30 seconds)
Ani:  "Found 3 actionable emails..."

Desired experience:

User: "Curate my unread emails"
Ani:  🧠 Working...
      🔍📖📋 (reactions appear as tools run)
Ani:  "Found 3 actionable emails..."

Solution: Tool Indication via Reactions

Simple Implementation

# 1. Parse ALL message types
for msg in data.get("messages", []):
    msg_type = msg.get("message_type", "unknown")

    if msg_type == "assistant_message":
        # Capture text response, skipping empty content as the current code does
        content = msg.get("content", "")
        if content:
            assistant_messages.append(content)

    elif msg_type == "tool_call":
        # Post progress message with the emoji for this tool
        tool_name = msg.get("tool_name", "default")
        emoji = get_emoji_for_tool(tool_name)
        await self.post_progress_message(room_id, f"🧠 Working... {emoji}")

# 2. Delete/edit progress when response arrives
await self.finalize_progress(room_id, final_response)

Tool → Emoji Mapping

EMOJI_MAP = {
    # Read operations
    "read_mail": "📖",
    "read_file": "📖",
    "retrieve_memory": "📖",
    "get_calendar": "📖",

    # Write operations
    "send_email": "✍️",
    "save_note": "✍️",
    "write_file": "✍️",

    # Search operations
    "search_web": "🔍",
    "google_search": "🔍",
    "web_search": "🔍",

    # Compute/Process
    "calculate": "🔧",
    "process_image": "🔧",
    "analyze": "🔧",

    # List/Browse
    "list_emails": "📋",
    "list_files": "📋",
    "list": "📋",

    # Default
    "default": "⚙️",
    "error": "⚠️"
}
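
The parsing snippet calls get_emoji_for_tool() without defining it. A minimal sketch of that lookup follows, assuming unknown or missing tool names should fall back to the "default" entry. The mapping here is a trimmed copy of the one above so the block runs on its own.

```python
from typing import Optional

# Trimmed copy of EMOJI_MAP from the design above, repeated for self-containment.
EMOJI_MAP = {
    "read_mail": "📖",
    "send_email": "✍️",
    "search_web": "🔍",
    "list_emails": "📋",
    "default": "⚙️",
    "error": "⚠️",
}


def get_emoji_for_tool(tool_name: Optional[str]) -> str:
    """Return the emoji for a tool name, or the default gear if unmapped."""
    if not tool_name:
        return EMOJI_MAP["default"]
    return EMOJI_MAP.get(tool_name, EMOJI_MAP["default"])


print(get_emoji_for_tool("search_web"))
print(get_emoji_for_tool("some_new_tool"))
```

Falling back to the default keeps the bridge from failing when Letta gains a tool the map does not know about yet.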

Implementation Approach

Option A: Separate Progress Message

[User]: "Curate my emails"

[Ani]: 🧠 Working...
        🔍 (reaction added after search_web)
        📖 (reaction added after read_mail)
        ✍️ (reaction added after save_note)

[Ani]: "Found 3 actionable emails..."

Option B: Inline Update (Edit)

[User]: "Curate my emails"

[Ani]: 🧠 Working...
        [edit] 🔍 Searching...
        [edit] 📖 Reading emails...
        [edit] "Found 3 actionable emails..."

Recommendation: Option A (more compatible with E2EE)
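
To make the two options concrete, here is a sketch of the event content each one would send. It follows the Matrix relation types m.annotation (reactions) and m.replace (edits). The function names and the example event ID are hypothetical, and how the content is actually sent depends on the client library the bridge uses.

```python
def build_reaction_content(target_event_id: str, emoji: str) -> dict:
    """Content for an m.reaction event annotating target_event_id (Option A)."""
    return {
        "m.relates_to": {
            "rel_type": "m.annotation",
            "event_id": target_event_id,
            "key": emoji,
        }
    }


def build_edit_content(target_event_id: str, new_body: str) -> dict:
    """Content for an m.room.message event replacing target_event_id (Option B)."""
    return {
        "msgtype": "m.text",
        # Fallback body shown by clients that do not understand edits.
        "body": f"* {new_body}",
        "m.new_content": {"msgtype": "m.text", "body": new_body},
        "m.relates_to": {"rel_type": "m.replace", "event_id": target_event_id},
    }


# "$progress_event" is a made-up event ID for illustration.
reaction = build_reaction_content("$progress_event", "🔍")
edit = build_edit_content("$progress_event", "🔍 Searching...")
```

Option A sends one small event per tool and never rewrites the progress message, while Option B replaces the whole message body on every step.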


Files to Modify

| File | Change |
|---|---|
| bridge-e2ee.py | send_to_letta() - parse all message types |
| bridge-e2ee.py | Add tool_categories and EMOJI_MAP |
| bridge-e2ee.py | Add post_progress_message() and finalize_progress() |
| bridge-e2ee.py | Optional: store tool executions in DB for audit |

Minimal Implementation (First Pass)

# In send_to_letta(), after parsing all messages:

# Check if any non-assistant messages exist (tools, reasoning, etc.)
messages = data.get("messages", [])
non_assistant = [m for m in messages if m.get("message_type") != "assistant_message"]

if non_assistant:
    # Post working message
    log.info(f"[Letta] Agent is working: {[m.get('message_type') for m in non_assistant]}")

    # For now, just log it. Emojis require bridge to post to Matrix during Letta call,
    # which means we need to handle streaming or callback pattern.

Reality check: The current send_to_letta() blocks until Letta finishes. To show progress DURING execution, we'd need:

  • Streaming responses from Letta, OR
  • A callback/event pattern, OR
  • Polling mechanism
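
The streaming and callback options can be sketched together: consume messages as they arrive and fire a callback per tool call, so progress is posted before the final text exists. This is only an illustration. fake_letta_stream() stands in for whatever streaming interface Letta exposes, and the message shapes reuse the type names from this document rather than confirmed API fields.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Tuple


async def fake_letta_stream() -> AsyncIterator[dict]:
    """Hypothetical stand-in for a streaming Letta response."""
    for msg in (
        {"message_type": "tool_call", "tool_name": "search_web"},
        {"message_type": "tool_call", "tool_name": "read_mail"},
        {"message_type": "assistant_message", "content": "Found 3 actionable emails..."},
    ):
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield msg


async def consume(
    stream: AsyncIterator[dict],
    on_tool_call: Callable[[str], Awaitable[None]],
) -> str:
    """Invoke on_tool_call as tools appear; return the joined assistant text."""
    parts: List[str] = []
    async for msg in stream:
        msg_type = msg.get("message_type")
        if msg_type == "tool_call":
            await on_tool_call(msg.get("tool_name", "unknown"))
        elif msg_type == "assistant_message" and msg.get("content"):
            parts.append(msg["content"])
    return "\n".join(parts)


async def demo() -> Tuple[List[str], str]:
    seen: List[str] = []

    async def record(tool_name: str) -> None:
        # In the bridge this is where a reaction would be posted to Matrix.
        seen.append(tool_name)

    reply = await consume(fake_letta_stream(), record)
    return seen, reply
```

Because the callback is awaited inside the loop, each tool is surfaced in order before the next message is read.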

Simplest first step: Just log what tools are being executed. We can add the Matrix progress in V2.
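
A sketch of that logging step, assuming the response is a dict with a "messages" list as in the current parsing code. The helper name and logger name are hypothetical.

```python
import logging
from collections import Counter

log = logging.getLogger("bridge")


def summarize_message_types(data: dict) -> Counter:
    """Count message types in a Letta response and log the summary."""
    counts = Counter(
        msg.get("message_type", "unknown") for msg in data.get("messages", [])
    )
    log.info("[Letta] message types: %s", dict(counts))
    return counts


sample = {
    "messages": [
        {"message_type": "tool_call", "tool_name": "search_web"},
        {"message_type": "tool_call", "tool_name": "read_mail"},
        {"message_type": "assistant_message", "content": "Done."},
        {},  # a message with no type is counted as "unknown"
    ]
}
counts = summarize_message_types(sample)
```

Running this against real responses also confirms which message_type values Letta actually emits before any emoji logic depends on them.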


Simplified Roadmap

| Phase | What | Effort |
|---|---|---|
| V0 | Log all message types from Letta (debug) | 30m |
| V1 | Parse tool_call, log tool names | 1h |
| V2 | Post "Working..." message when tools detected | 1h |
| V3 | Add emoji reactions per tool type | 1h |
| V4 | Delete progress message on completion | 30m |
Total: ~4 hours for basic tool visibility


What We're NOT Adding

For now, NO:

  • Confirmations (Letta's domain if ever needed)
  • Multi-agent routing in bridge (Letta's domain)
  • Task persistence in bridge (Letta's domain)
  • Context switching in bridge (Letta's domain)

The bridge stays dumb. We just add a little peek into what Letta's doing.


Notes on Letta Multi-Agent

If Letta ever has multiple agents conversing in one room, the bridge just:

  1. Receives messages
  2. Sends to room
  3. Tags sender (if Letta provides agent identity)

Example:

[Curator]: "Found 3 emails worth reviewing."
[Ani]:      "Thanks, can you summarize them?"
[Curator]:  "Sure: 1. ..."

Bridge just passes through. Agent identity comes from Letta (in source or similar field in message).
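
A minimal sketch of the tagging step, assuming the agent name arrives in a "source" field. That field name is a guess taken from the sentence above, and the fallback name is hypothetical.

```python
def tag_sender(msg: dict, default_name: str = "Ani") -> str:
    """Prefix message text with the agent name, if Letta provides one."""
    # "source" is an assumed field name; adjust once the real field is known.
    name = msg.get("source") or default_name
    return f"[{name}]: {msg.get('content', '')}"


print(tag_sender({"source": "Curator", "content": "Found 3 emails worth reviewing."}))
print(tag_sender({"content": "Thanks, can you summarize them?"}))
```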


Summary

Bridge role: Transport + visibility

Current transport: Working
Current visibility: Blind (only sees final text)

Add visibility: Tool indication via reactions
Complexity: Low (just parse and display)
Multi-agent: Not our problem (Letta's domain)


MAJOR REFACTOR PLAN: mautrix.util.formatter Migration

Problem Statement

Current Issues (Patches Maintained):

  1. Manual markdown - ~100 lines of regex patches for false code blocks
  2. Manual color syntax - {red|text} → <font color="..." data-mx-color="...">
  3. Manual spoiler syntax - ||text|| → <span data-mx-spoiler>
  4. Manual HTML→text - html.unescape() and strip_tags()
  5. Emoji shortcode - custom normalize_emoji_shortcodes() + emoji.emojize()

Maintenance Burden:

  • Each patch edge case → new regex
  • False code block detection keeps breaking
  • Color palette manually maintained (MATRIX_COLORS)
  • No built-in mention/@user or room pill handling

Solution: Migrate to mautrix.util.formatter (native Matrix formatting)


API Analysis (from mautrix-python docs)

from mautrix.util.formatter import parse_html, MarkdownString, MatrixParser
from mautrix.types import EventType

# HTML → Plain Text (ASYNC)
plain_text = await parse_html(html_input)
# Returns: "Hello world!\n• Item 1\n• Item 2"

# Markdown → HTML (SYNC)
markdown = MarkdownString("**Bold** and ||spoiler||")
html_output = markdown.format(EventType.ROOM_MESSAGE)
# Returns: '<strong>Bold</strong> and <span data-mx-spoiler>spoiler</span>'

# Mentions and pills (ASYNC)
parser = MatrixParser()
formatted = await parser.parse("Hello @user:example.com")
# Returns: MarkdownString with proper Matrix pills
formatted.html  # <a href="https://matrix.to/#/@user:example.com">@user:example.com</a>

Key Finding:

  • parse_html() is async (coroutine)
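  • MarkdownString.format() is sync but requires an EntityType argument (note the snippet above passes EventType.ROOM_MESSAGE instead; verify against the installed mautrix version)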
  • MatrixParser.parse() is async

Impact Analysis

Functions Requiring Async Conversion

| Function | Current | New | Impact |
|---|---|---|---|
| format_html() | sync | async | HIGH - called everywhere |
| send_message() | sync | async | MEDIUM - many call sites |
| on_message() | async | async | LOW - already async |
| on_image() | async | async | LOW - already async |
| on_audio() | async | async | LOW - already async |
| process_queue() | async | async | LOW - already async |

Call Sites Count

format_html() called in:
- send_message()          → 20+ callers
- Various message handlers → scattered throughout

All send_message() callers must be updated to await the result.

Estimated: ~25 call site updates needed
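
The failure mode at a missed call site is worth seeing once: calling an async function without await returns a coroutine object, not the (plain_text, html_body) tuple. The sketch below uses a stand-in format_html() with the same return shape. The caller names are hypothetical.

```python
import asyncio
import inspect
from typing import Tuple


async def format_html(text: str) -> Tuple[str, str]:
    """Stand-in for the async formatter; returns (plain_text, html_body)."""
    return text, f"<p>{text}</p>"


def stale_sync_caller(text: str):
    # Bug: no await, so this returns a coroutine object instead of a tuple.
    return format_html(text)


async def updated_caller(text: str) -> Tuple[str, str]:
    # Fixed call site: await the coroutine to get the tuple.
    return await format_html(text)


pending = stale_sync_caller("hi")
was_coroutine = inspect.iscoroutine(pending)
pending.close()  # suppress the "never awaited" warning in this demo

result = asyncio.run(updated_caller("hi"))
```

A call site that unpacks the result directly (plain, html = format_html(text)) would raise a TypeError, so a quick search for format_html( without a preceding await is a cheap pre-flight check.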


Refactor Implementation Plan

Phase 1: Infrastructure (2h)

1.1 Update imports

from mautrix.util.formatter import parse_html, MarkdownString, MatrixParser
from mautrix.types import EventType

1.2 Create async format_html()

async def format_html(text: str) -> tuple[str, str]:
    """
    Format text using mautrix native formatter.

    Args:
        text: Response from Letta (markdown or HTML)

    Returns:
        (plain_text, html_body) tuple
    """
    try:
        # Strip whitespace
        text = text.strip()

        # Convert emoji shortcodes (keep existing behavior)
        text = normalize_emoji_shortcodes(text)
        text = emoji.emojize(text, language='en')

        # HTML path → parse to plain (ASYNC)
        if text.startswith('<') and '>' in text:
            # mautrix handles HTML parsing with Matrix extensions
            # Note: our {color|text} syntax needs pre-processing
            text = _apply_color_syntax(text.strip())
            plain = await parse_html(text)
            return plain, text

        # Markdown path → use MarkdownString (SYNC)
        md = MarkdownString(text)
        # Pre-process {color|text} - MarkdownString doesn't handle this
        processed_md = _apply_color_syntax(md.text)
        md.text = processed_md

        # Format to HTML (SYNC)
        html = md.format(EventType.ROOM_MESSAGE)

        # Generate plain text (ASYNC)
        plain = await parse_html(html)

        return plain, html

    except Exception as e:
        log.warning(f"HTML formatting failed: {e}")
        return emoji.emojize(text), emoji.emojize(text)

1.3 Color syntax helper

def _apply_color_syntax(text: str) -> str:
    """Convert {color|text} to HTML spans."""
    def replace_color(match):
        color = match.group(1)
        content = match.group(2)

        # Resolve color name
        hex_color = MATRIX_COLORS.get(color, color)

        # Convert to mautrix MarkdownString color syntax
        # mautrix uses <font color="..."> internally
        return f'<font color="{hex_color}" data-mx-color="{hex_color}">{content}</font>'

    return re.sub(r'\{([a-zA-Z0-9_#]+)\|([^}]+)\}', replace_color, text)
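
One thing the helper above does not do is escape the wrapped content, so markup inside {color|...} passes through unchanged. Below is a sketch of an escaping variant with a few checks. This is a possible variation rather than the planned helper, MATRIX_COLORS is a one-entry stand-in for the real palette, and escaping would be wrong on the HTML path where markup in the content is intentional.

```python
import html
import re

# One-entry stand-in for the bridge's real palette.
MATRIX_COLORS = {"red": "#ff0000"}

# Same pattern as the helper above: {name_or_hex|content}
_COLOR_RE = re.compile(r"\{([a-zA-Z0-9_#]+)\|([^}]+)\}")


def apply_color_syntax_escaped(text: str) -> str:
    """Convert {color|text} to <font> tags, HTML-escaping the content."""

    def replace_color(match: re.Match) -> str:
        color = match.group(1)
        hex_color = MATRIX_COLORS.get(color, color)  # unknown names pass through
        content = html.escape(match.group(2))
        return f'<font color="{hex_color}" data-mx-color="{hex_color}">{content}</font>'

    return _COLOR_RE.sub(replace_color, text)


named = apply_color_syntax_escaped("{red|hi}")
raw_hex = apply_color_syntax_escaped("{#00ff00|<b>x</b>}")
untouched = apply_color_syntax_escaped("no color here")
```

The color capture group only allows letters, digits, underscore, and #, so the attribute values themselves cannot carry quotes or angle brackets.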

Phase 2: Update Call Sites (1h)

2.1 Update send_message()

# BEFORE
def format_html(text: str) -> tuple[str, str]:  # SYNC
    ...

async def send_message(self, room_id: RoomID, text: str) -> str | None:
    plain_text, html_body = format_html(text)  # SYNC call
    ...

# AFTER
async def format_html(text: str) -> tuple[str, str]:  # ASYNC
    ...

async def send_message(self, room_id: RoomID, text: str) -> str | None:
    plain_text, html_body = await format_html(text)  # AWAIT
    ...

2.2 Scattered callers

  • on_message() → already async, just add await
  • on_image() → already async, just add await
  • on_audio() → already async, just add await
  • process_queue() → already async, just add await
  • Status handlers → already async, just add await

Phase 3: Remove Dead Code (30m)

Delete these functions (now handled by mautrix):

# ~~apply_matrix_extensions()~~  - mautrix handles spoilers natively
# ~~enhance_html()~~             - mautrix generates proper HTML
# ~~apply_chromatophores()~~     - could keep if desired, optional
# ~~False code block detection~~ - mautrix doesn't have this bug
# ~~normalize_emoji_shortcodes()~~ - mautrix may handle, keep if needed

Delete MATRIX_COLORS (or move to config):

  • mautrix handles hex colors directly
  • Named colors can stay if needed, but resolve to hex first

Phase 4: Testing (1h)

Test Cases:

  1. Bold/italic markdown → strong/em tags
  2. Code blocks → proper <pre><code>
  3. Links → <a href="...">
  4. Spoilers ||text|| → <span data-mx-spoiler>
  5. Colors {red|text} → <font color="..." data-mx-color="...">
  6. Emoji shortcodes :heart: → ❤️
  7. Mentions @user:server → Matrix pills (if using MatrixParser)
  8. Room pills #room:server → Matrix pills (if using MatrixParser)

Regression Tests:

  • Existing messages render correctly
  • Tool reactions still work
  • TTS still works
  • Queue still works

Risks and Mitigations

| Risk | Impact | Mitigation |
|---|---|---|
| Async conversion bugs | HIGH | Work on the copy ani_e2ee_bridge.py, keep the original |
| Color syntax breaking | MEDIUM | Keep _apply_color_syntax() as adapter |
| Spoiler syntax change | MEDIUM | Test spoilers |
| False code blocks returning | LOW | mautrix shouldn't have this bug, but monitor |

Rollback Plan

If refactor fails:

  1. Original bridge-e2ee.py is preserved intact
  2. ani_e2ee_bridge.py is the experimental branch
  3. Rename/revert as needed:
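    # Promote the experimental bridge, keeping a backup of the original:
    mv bridge-e2ee.py bridge-e2ee.backup.py
    cp ani_e2ee_bridge.py bridge-e2ee.py
    # OR restore the original from the backup:
    cp bridge-e2ee.backup.py bridge-e2ee.py
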

Estimated Total Effort

| Phase | Time |
|---|---|
| Phase 1: Infrastructure | 2h |
| Phase 2: Update Call Sites | 1h |
| Phase 3: Remove Dead Code | 30m |
| Phase 4: Testing | 1h |
| Total | ~4.5h |

Benefits After Migration

  • ~100 fewer lines of manual patch code
  • Native spoiler support - no regex
  • Native color formatting via EntityType.COLOR
  • Built-in mention/@user support (EntityType.USER_MENTION)
  • Built-in room pill support (EntityType.ROOM_MENTION)
  • Better emoji handling - less custom code
  • Future-proof - mautrix evolves, we get updates
  • Less maintenance - fewer edge cases to patch