Initial commit: Community ADE foundation

- Project structure: docs/, src/, tests/, proto/
- Research synthesis: Letta vs commercial ADEs
- Architecture: Redis Streams queue design
- Phase 1 orchestration design
- Execution plan and project state tracking
- Working subagent system (manager.ts fixes)

This is the foundation for a Community ADE built on Letta's
stateful agent architecture with git-native MemFS.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta Code <noreply@letta.com>
Author: Ani (Annie Tunturi)
Date: 2026-03-18 10:30:20 -04:00
Commit: 00382055c6
9 changed files with 2970 additions and 0 deletions

.gitignore (new file, 46 lines)
@@ -0,0 +1,46 @@
# Dependencies
node_modules/
package-lock.json
yarn.lock
pnpm-lock.yaml
# Build outputs
dist/
build/
*.tsbuildinfo
# Environment
.env
.env.local
.env.*.local
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# OS
.DS_Store
Thumbs.db
# Logs
logs/
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# Testing
coverage/
.nyc_output/
# Redis
dump.rdb
*.rdb
# Temporary
tmp/
temp/
*.tmp

README.md (new file, 57 lines)
@@ -0,0 +1,57 @@
# Community ADE (Agentic Development Environment)
A community-driven, open-source agentic development environment built on Letta's stateful agent architecture.
## Vision
Build an open-source ADE that combines:
- **Stateful agents** with hierarchical memory (Letta's unique strength)
- **Git-native persistence** with MemFS versioning
- **Persistent task queues** for durable subagent execution
- **Web dashboard** for real-time monitoring and control
- **Computer Use** integration for browser automation
## Differentiation
Unlike commercial alternatives (Warp, Intent), Community ADE is:
- **Open source** and self-hostable
- **Stateful by design** - agents remember across sessions
- **Model agnostic** - use any OpenAI-compatible API
- **Git-native** - version control for agent memory
## Project Structure
```
├── src/ # Queue implementation and worker pool
├── tests/ # Test suite
├── docs/ # Architecture and design documents
├── proto/ # Prototypes and experiments
└── README.md # This file
```
## Documentation
- [Project State](docs/community-ade-project-state.md) - Current status and active subagents
- [Phase 1 Design](docs/ade-phase1-orchestration-design.md) - Task queue architecture
- [Redis Queue Design](docs/ade-redis-queue-design.md) - Detailed Redis implementation spec
- [Research Synthesis](docs/community-ade-research-synthesis-2026-03-18.md) - Competitive analysis
## Phase 1: Orchestration Layer (In Progress)
Goals:
1. ✅ Research and design complete
2. 🔄 Redis task queue implementation
3. ⏳ Worker pool with heartbeat
4. ⏳ Integration with Letta Task tool
## Quick Start
Coming soon - queue prototype implementation.
## License
MIT - Community contributions welcome.
---
*Project orchestrated by Ani, with research and design by specialized subagents.*


@@ -0,0 +1,525 @@
# Phase 1 Execution Plan: Orchestration Layer
**Date:** March 18, 2026
**Status:** Ready for Implementation
**Estimated Duration:** 6 weeks
**Owner:** TBD
---
## Overview
This document provides actionable implementation guidance for Phase 1 of the Community ADE, based on synthesized research from commercial tools (Intent, Warp) and open-source alternatives (Aider, Cline, Agno).
---
## Key Research Insights
### 1. Letta's Competitive Position
**✅ Strongest Open-Source Position:**
- No competitor combines: stateful agents + hierarchical memory + git-native persistence + subagent orchestration
- Aider has git integration but no agent memory
- Cline is session-based with no persistence
- Agno lacks Letta's memory architecture
**⚠️ Commercial Tools Lead in UX:**
- Warp: Terminal-native with rich context (@file, images)
- Intent: Specification-driven development
- Both have web dashboards; Letta needs one
### 2. Technical Pattern Validation
**Redis + Workers (Selected for Phase 1):**
- ✅ Proven pattern (Celery uses Redis under the hood)
- ✅ Simpler than Temporal for our use case
- ✅ More control over data model
- ⚠️ Temporal deferred to Phase 2 evaluation
**React + FastAPI (Selected for Phase 2):**
- ✅ Industry standard
- ✅ shadcn/ui provides accessible components
- ✅ TanStack Query for caching/real-time sync
---
## Phase 1 Scope
### Goals
1. Replace in-process Task execution with persistent queue
2. Ensure tasks survive agent restarts
3. Support 5+ concurrent workers
4. Maintain backward compatibility
### Out of Scope (Phase 2+)
- Web dashboard (Phase 2)
- Temporal workflows (Phase 2 evaluation)
- GitHub integration (Phase 3)
- Computer Use (Phase 4)
---
## Implementation Breakdown
### Week 1: In-Memory Prototype
**Deliverables:**
- [ ] `TaskQueue` class with asyncio.Queue
- [ ] Task dataclass with all fields
- [ ] Worker process skeleton
- [ ] Basic enqueue/dequeue/complete/fail operations
**Testing:**
```python
# Test: Task survives worker crash
# Test: Concurrent task execution
# Test: Priority ordering
```
**Code Structure:**
```
letta_ade/
├── __init__.py
├── queue/
│ ├── __init__.py
│ ├── models.py # Task dataclass, enums
│ ├── memory_queue.py # Week 1 implementation
│ └── base.py # Abstract base class
└── worker/
├── __init__.py
└── runner.py # Worker process logic
```
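The Week 1 deliverables can be sketched as a minimal in-memory queue (the class name matches the deliverable above; the field layout and locking details are assumptions for illustration):

```python
import asyncio
import heapq
import uuid
from dataclasses import dataclass, field

@dataclass(order=True)
class _Entry:
    priority: int
    task_id: str = field(compare=False)  # heap orders by priority only

class MemoryTaskQueue:
    """Week 1 prototype: priority ordering, single process, no persistence."""

    def __init__(self) -> None:
        self._heap: list[_Entry] = []
        self._tasks: dict[str, dict] = {}
        self._available = asyncio.Condition()

    async def enqueue(self, prompt: str, priority: int = 100) -> str:
        task_id = str(uuid.uuid4())
        self._tasks[task_id] = {"prompt": prompt, "status": "pending"}
        async with self._available:
            heapq.heappush(self._heap, _Entry(priority, task_id))
            self._available.notify()
        return task_id

    async def dequeue(self) -> dict:
        """Block until a task is available, then claim the highest-priority one."""
        async with self._available:
            while not self._heap:
                await self._available.wait()
            entry = heapq.heappop(self._heap)
        task = self._tasks[entry.task_id]
        task.update(status="running", id=entry.task_id)
        return task

    def complete(self, task_id: str, result: dict) -> None:
        self._tasks[task_id].update(status="completed", result=result)
```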
### Week 2: Redis Integration
**Deliverables:**
- [ ] Redis connection manager
- [ ] Task serialization (JSON/pickle)
- [ ] Atomic dequeue with WATCH/MULTI/EXEC
- [ ] Status tracking (Sets per status)
**Redis Schema:**
```redis
# Task storage
HSET task:{uuid} field value ...
# Priority queue (pending)
ZADD queue:pending {priority} {task_id}
# Running tasks
ZADD queue:running {started_at} {task_id}
# Status index
SADD status:pending {task_id}
SADD status:running {task_id}
SADD status:completed {task_id}
SADD status:failed {task_id}
# User index
SADD user:{user_id}:tasks {task_id}
```
**Dependencies:**
```toml
[tool.poetry.dependencies]
redis = { version = "^5.0", extras = ["hiredis"] }
```
### Week 3-4: Worker Pool + Heartbeat
**Deliverables:**
- [ ] Multiple worker processes
- [ ] Worker heartbeat (every 30s)
- [ ] Stall detection (2x heartbeat timeout)
- [ ] Graceful shutdown handling
- [ ] Worker capacity management
**Worker Logic:**
```python
async def worker_loop(agent_id: UUID, queue: TaskQueue):
    # `running` is toggled by shutdown signal handling (not shown)
    while running:
        # Send heartbeat
        await queue.heartbeat(agent_id)
        # Try to get task (5s timeout)
        task = await queue.dequeue(agent_id, timeout_ms=5000)
        if task:
            # Spawn subagent process
            proc = await asyncio.create_subprocess_exec(
                "letta", "run-agent",
                f"--task-id={task.id}",
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE
            )
            # Wait for completion
            stdout, stderr = await proc.communicate()
            # Update queue
            if proc.returncode == 0:
                await queue.complete(task.id, parse_result(stdout))
            else:
                await queue.fail(task.id, stderr.decode())
        # Brief pause to prevent tight loop
        await asyncio.sleep(0.1)
```
**Stall Recovery (Cron job):**
```python
async def recover_stalled_tasks(queue: TaskQueue, max_age: timedelta):
    """Requeue tasks from crashed workers."""
    stalled = await queue.find_stalled(max_age)
    for task_id in stalled:
        await queue.requeue(task_id)
```
### Week 5: API Layer
**Deliverables:**
- [ ] FastAPI application structure
- [ ] REST endpoints (CRUD for tasks)
- [ ] WebSocket endpoint for real-time updates
- [ ] Authentication middleware
**REST Endpoints:**
```python
@app.post("/tasks")
async def create_task(task: TaskCreate) -> TaskResponse:
    """Enqueue a new task."""
    task_id = await queue.enqueue(task)
    return TaskResponse(task_id=task_id, status="pending")

@app.get("/tasks/{task_id}")
async def get_task(task_id: UUID) -> Task:
    """Get task status and result."""
    return await queue.get(task_id)

@app.get("/tasks")
async def list_tasks(
    user_id: str,
    status: Optional[TaskStatus] = None
) -> List[TaskSummary]:
    """List tasks with optional filtering."""
    return await queue.list_by_user(user_id, status)

@app.post("/tasks/{task_id}/cancel")
async def cancel_task(task_id: UUID):
    """Cancel a pending or running task."""
    await queue.cancel(task_id)

@app.post("/tasks/{task_id}/retry")
async def retry_task(task_id: UUID):
    """Retry a failed task."""
    await queue.retry(task_id)
```
**WebSocket:**
```python
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
await websocket.accept()
# Subscribe to Redis pub/sub for updates
pubsub = redis.pubsub()
pubsub.subscribe("task_updates")
async for message in pubsub.listen():
if message["type"] == "message":
await websocket.send_json(message["data"])
```
### Week 6: Task Tool Integration
**Deliverables:**
- [ ] Modify existing Task tool to use queue
- [ ] `persist` flag for backward compatibility
- [ ] Polling support for task completion
- [ ] Migration guide for existing code
**Modified Task Tool:**
```python
class TaskTool:
    async def run(
        self,
        prompt: str,
        subagent_type: str,
        # ... existing args ...
        persist: bool = False,   # NEW
        priority: int = 100,     # NEW
        wait: bool = False,      # NEW
        timeout: int = 300,      # NEW
    ) -> TaskResult:
        if persist:
            # Enqueue and optionally wait
            task_id = await self.queue.enqueue(...)
            if wait:
                # Poll for completion
                result = await self._wait_for_task(task_id, timeout)
                return result
            else:
                # Return immediately with task_id
                return TaskResult(task_id=task_id, status="pending")
        else:
            # Legacy immediate execution
            return await self._execute_immediately(...)
```
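The `_wait_for_task` helper is referenced above but not specified; a simple polling sketch (the standalone function name, poll interval, and timeout exception are assumptions) might look like:

```python
import asyncio

class TaskTimeoutError(Exception):
    """Raised when a persisted task does not finish within `timeout` seconds."""

async def wait_for_task(queue, task_id, timeout: int = 300,
                        poll_interval: float = 1.0):
    """Poll the queue until the task reaches a terminal status or times out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        task = await queue.get(task_id)
        if task.status in ("completed", "failed", "cancelled"):
            return task
        await asyncio.sleep(poll_interval)
    raise TaskTimeoutError(f"task {task_id} still running after {timeout}s")
```

A production version would likely subscribe to the `task_updates` pub/sub channel instead of polling.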
---
## Technical Specifications
### Task Data Model
```python
@dataclass
class Task:
    id: UUID
    subagent_type: str
    prompt: str
    user_id: str
    system_prompt: Optional[str] = None
    model: Optional[str] = None
    # State
    status: TaskStatus = TaskStatus.PENDING
    priority: int = 100
    created_at: datetime = field(default_factory=datetime.utcnow)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    # Execution
    agent_id: Optional[UUID] = None
    retry_count: int = 0
    max_retries: int = 3
    # Results
    result: Optional[dict] = None
    error: Optional[str] = None
    exit_code: Optional[int] = None
    # Metadata
    tags: List[str] = field(default_factory=list)
    parent_task: Optional[UUID] = None
    # Cost tracking (NEW)
    input_tokens: int = 0
    output_tokens: int = 0
    estimated_cost: float = 0.0
```
### Retry Logic
```python
async def retry_with_backoff(task: Task) -> bool:
    if task.retry_count >= task.max_retries:
        return False  # Permanent failure
    # Exponential backoff: 2^retry_count seconds
    delay = min(2 ** task.retry_count, 300)  # Cap at 5 min
    await asyncio.sleep(delay)
    task.retry_count += 1
    # Re-enqueue with same priority
    await queue.enqueue(task, priority=task.priority)
    return True
```
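Plugging retry counts into the backoff formula above gives the concrete schedule:

```python
def backoff_delay(retry_count: int, cap: int = 300) -> int:
    """Delay in seconds before re-enqueueing, per the retry logic above."""
    return min(2 ** retry_count, cap)

# Retries 0-8 double each time; from retry 9 (512s) onward the 300s cap applies
print([backoff_delay(n) for n in range(10)])
# → [1, 2, 4, 8, 16, 32, 64, 128, 256, 300]
```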
### Error Classification
| Error | Retry? | Action |
|-------|--------|--------|
| Subagent crash | Yes | Requeue with backoff |
| Syntax error | No | Fail immediately |
| API rate limit | Yes | Exponential backoff |
| Out of memory | No | Alert admin, fail |
| Redis connection | Yes | Reconnect, retry |
| Timeout | Yes | Retry with longer timeout |
---
## Testing Strategy
### Unit Tests
```python
# test_queue.py
def test_enqueue_creates_pending_task(): ...
def test_dequeue_removes_from_pending(): ...
def test_complete_moves_to_completed(): ...
def test_fail_triggers_retry(): ...
def test_max_retries_exceeded(): ...
def test_cancel_stops_running_task(): ...
```
### Integration Tests
```python
# test_worker.py
async def test_worker_processes_task(): ...
async def test_worker_handles_failure(): ...
async def test_worker_heartbeat(): ...
async def test_stall_recovery(): ...
```
### Durability Tests
```python
# test_durability.py
async def test_tasks_survive_restart():
    """Enqueue tasks, restart Redis, verify tasks persist."""

async def test_worker_crash_recovery():
    """Kill worker mid-task, verify task requeued."""

async def test_concurrent_workers():
    """5 workers, 20 tasks, verify all complete."""
```
---
## Dependencies
### Required
```toml
[tool.poetry.dependencies]
redis = { version = "^5.0", extras = ["hiredis"] }
fastapi = "^0.115"
websockets = "^13.0"
pydantic = "^2.0"
```
### Development
```toml
[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
pytest-asyncio = "^0.24"
httpx = "^0.27" # For FastAPI test client
```
### Infrastructure
- Redis 7.0+ (local or cloud)
- Python 3.11+
---
## Migration Guide
### For Existing Task Tool Users
**Before:**
```python
result = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder"
)  # Blocks until complete
```
**After (backward compatible):**
```python
# Same behavior (immediate execution)
result = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder",
    persist=False  # default
)
```
**New (persistent):**
```python
# Fire-and-forget
task_id = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder",
    persist=True
)

# Wait for completion
result = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder",
    persist=True,
    wait=True,
    timeout=600
)
```
---
## Success Criteria
| Metric | Target | Measurement |
|--------|--------|-------------|
| Task durability | 100% | Tasks never lost on restart |
| Throughput | 10 tasks/min | With 3 workers |
| Latency | <100ms | Enqueue → pending |
| Recovery time | <60s | Worker crash → requeue |
| API uptime | 99.9% | Health check endpoint |
| Backward compat | 100% | Existing tests pass |
---
## Risk Mitigation
| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Redis complexity | Low | Medium | Start with simple ops |
| Worker pool bugs | Medium | High | Extensive testing |
| Performance issues | Low | Medium | Load testing Week 5 |
| Migration breakage | Low | High | Full test suite |
---
## Handoff to Phase 2
**Phase 2 Prereqs:**
- [ ] All Phase 1 success criteria met
- [ ] API documentation complete
- [ ] WebSocket tested with simple client
- [ ] Cost tracking working
**Phase 2 Inputs:**
- Task queue API (REST + WebSocket)
- Task data model
- Worker management API
- Redis schema
---
## Appendix: Quick Reference
### Redis Commands Cheat Sheet
```bash
# Start Redis
docker run -d -p 6379:6379 redis:7-alpine
# Monitor
redis-cli monitor
# Inspect keys
redis-cli KEYS "task:*"
redis-cli HGETALL task:abc-123
# Clear queue (WARNING: flushes every key in the current DB)
redis-cli FLUSHDB
```
### Development Commands
```bash
# Start worker
python -m letta_ade.worker.runner --agent-id worker-1
# Start API
uvicorn letta_ade.api:app --reload
# Run tests
pytest tests/ -v --tb=short
# Integration test
pytest tests/integration/ -v
```
---
*Ready for implementation. Questions? See community-ade-research-synthesis-2026-03-18.md for full context.*


@@ -0,0 +1,307 @@
# Phase 1: Orchestration Layer Design
**Date:** March 18, 2026
**Architect:** Researcher subagent
**Goal:** Design persistent task queue system for Community ADE
---
## 1. Core Data Model
```python
@dataclass
class Task:
    id: UUID                                 # Unique task identifier
    subagent_type: str                       # "researcher", "coder", etc.
    prompt: str                              # User prompt to subagent
    user_id: str                             # Task owner
    system_prompt: Optional[str] = None      # Override default system prompt
    model: Optional[str] = None              # Override default model
    # State tracking
    status: TaskStatus = TaskStatus.PENDING  # pending/running/completed/failed/cancelled
    priority: int = 100                      # Lower = higher priority
    created_at: datetime = field(default_factory=datetime.utcnow)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    # Execution tracking
    agent_id: Optional[UUID] = None          # Assigned worker agent
    retry_count: int = 0
    max_retries: int = 3
    # Results
    result: Optional[dict] = None            # Success result
    error: Optional[str] = None              # Failure message
    exit_code: Optional[int] = None          # Subprocess exit code
    # Metadata
    tags: List[str] = field(default_factory=list)  # For filtering/grouping
    parent_task: Optional[UUID] = None       # For task chains
```
### TaskStatus Enum
```python
class TaskStatus(Enum):
    PENDING = "pending"        # Waiting for worker
    RUNNING = "running"        # Assigned to worker
    COMPLETED = "completed"    # Success
    FAILED = "failed"          # Permanent failure (max retries)
    CANCELLED = "cancelled"    # User cancelled
    STALLED = "stalled"        # Worker crashed, needs recovery
```
---
## 2. State Machine
```
PENDING   --dequeue()------------------->  RUNNING
RUNNING   --success--------------------->  COMPLETED
RUNNING   --fail(), max retries reached->  FAILED
RUNNING   --heartbeat timeout----------->  STALLED
STALLED   --reassigned to new worker---->  RUNNING
FAILED    --retry() (manual)------------>  RUNNING
any state --cancel()-------------------->  CANCELLED
```
### Transitions
- `PENDING → RUNNING`: Worker dequeues task
- `RUNNING → COMPLETED`: Subagent succeeds
- `RUNNING → FAILED`: Subagent fails, max retries reached
- `RUNNING → STALLED`: Worker heartbeat timeout
- `STALLED → RUNNING`: Reassigned to new worker
- `FAILED → RUNNING`: Manual retry triggered
- Any → CANCELLED: User cancellation
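The transition list can be enforced with a small lookup table; a sketch (the helper name is an assumption; status strings follow the TaskStatus enum above):

```python
# Legal transitions per the list above; cancel() is special-cased below
TRANSITIONS: dict[str, set[str]] = {
    "pending":   {"running"},
    "running":   {"completed", "failed", "stalled"},
    "stalled":   {"running"},
    "failed":    {"running"},   # manual retry
    "completed": set(),
    "cancelled": set(),
}

def can_transition(current: str, new: str) -> bool:
    """Return True if the state change is legal; any live state may be cancelled."""
    if new == "cancelled":
        return current != "cancelled"
    return new in TRANSITIONS[current]
```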
---
## 3. Redis Data Structures
| Purpose | Structure | Key Pattern |
|---------|-----------|-------------|
| Task payload | Hash | `task:{task_id}` |
| Pending queue | Sorted Set (by priority) | `queue:pending` |
| Running set | Set | `queue:running` |
| Worker registry | Hash | `worker:{agent_id}` |
| Status index | Set per status | `status:{status}` |
| User tasks | Set | `user:{user_id}:tasks` |
### Example Redis Operations
```redis
# Enqueue (pending)
ZADD queue:pending {priority} {task_id}
HSET task:{task_id} status pending created_at {timestamp} ...
SADD status:pending {task_id}
# Dequeue (atomic)
WATCH queue:pending
task_id = ZPOPMIN queue:pending
MULTI
ZADD queue:running {now} {task_id}
HSET task:{task_id} status running agent_id {worker} started_at {now}
SMOVE status:pending status:running {task_id}
EXEC
# Complete
ZREM queue:running {task_id}
SADD status:completed {task_id}
HSET task:{task_id} status completed result {...} completed_at {now}
# Fail with retry
HINCRBY task:{task_id} retry_count 1
ZADD queue:pending {priority} {task_id} # Re-enqueue
SMOVE status:running status:pending {task_id}
HSET task:{task_id} status pending error {...}
# Stall recovery (cron job)
SMEMBERS queue:running
# For each task where worker heartbeat > threshold:
ZREM queue:running {task_id}
SADD status:stalled {task_id}
ZADD queue:pending {priority} {task_id}
```
---
## 4. Key API Methods
```python
class TaskQueue:
    # Core operations
    async def enqueue(self, task: Task) -> UUID: ...
    async def dequeue(self, worker_id: UUID, timeout_ms: int = 5000) -> Optional[Task]: ...
    async def complete(self, task_id: UUID, result: dict) -> None: ...
    async def fail(self, task_id: UUID, error: str, retryable: bool = True) -> None: ...
    async def cancel(self, task_id: UUID) -> None: ...

    # Management
    async def retry(self, task_id: UUID) -> None: ...                    # Manual retry
    async def requeue_stalled(self, max_age_ms: int = 60000) -> int: ... # Recover crashed
    async def get_status(self, task_id: UUID) -> TaskStatus: ...
    async def list_by_user(self, user_id: str, status: Optional[str] = None) -> List[TaskSummary]: ...

    # Worker management
    async def register_worker(self, agent_id: UUID, capacity: int) -> None: ...
    async def heartbeat(self, agent_id: UUID) -> None: ...
    async def unregister_worker(self, agent_id: UUID, reason: str) -> None: ...
```
---
## 5. Integration with Existing Task Tool
### Current Flow
```
Task tool → immediate subprocess spawn → wait → return result
```
### New Flow (with persistence)
```
Task tool → enqueue() → return task_id (immediate)
Background worker → dequeue() → spawn subprocess → complete()/fail()
Caller polls/gets notification when task completes
```
### Changes to Task Tool Schema
```python
class TaskTool:
    async def run(
        self,
        prompt: str,
        subagent_type: str,
        # ... existing args ...
        persist: bool = False,            # NEW: enqueue instead of immediate run
        priority: int = 100,              # NEW
        tags: Optional[List[str]] = None  # NEW
    ) -> TaskResult:
        if persist:
            task_id = await self.queue.enqueue(...)
            return TaskResult(task_id=task_id, status="pending")
        else:
            # Legacy: immediate execution
            ...
```
### Worker Agent Integration
**Worker subscribes to queue:**
```python
async def worker_loop(agent_id: UUID):
    # `running` is toggled by shutdown signal handling (not shown)
    while running:
        task = await queue.dequeue(agent_id, timeout_ms=5000)
        if task:
            # Spawn subprocess
            proc = await asyncio.create_subprocess_exec(
                "letta", "run-agent", f"--task-id={task.id}",
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE
            )
            # Monitor and wait
            stdout, stderr = await proc.communicate()
            # Update queue based on result
            if proc.returncode == 0:
                await queue.complete(task.id, parse_result(stdout))
            else:
                await queue.fail(task.id, stderr.decode(), retryable=True)
```
---
## 6. Implementation Phases
### Phase 1a: In-Memory Prototype (Week 1)
- Python `asyncio.Queue` for pending tasks
- In-memory dict for task storage
- Single worker process
- No Redis dependency
### Phase 1b: Redis Integration (Week 2)
- Replace queue with Redis
- Add task persistence
- Implement retry logic
- Add stall recovery
### Phase 1c: Worker Pool (Week 3-4)
- Multiple worker processes
- Worker heartbeat monitoring
- Task assignment logic
- Graceful shutdown handling
### Phase 1d: API & CLI (Week 5-6)
- REST API for task management
- CLI commands for queue inspection
- Task status dashboard endpoint
- Webhook notifications
### Phase 1e: Integration (Week 7-8)
- Modify Task tool to use queue
- Add persistence flag
- Maintain backward compatibility
- Migration path for existing code
---
## 7. Retry Logic with Exponential Backoff
```python
async def retry_with_backoff(task_id: UUID):
    task = await queue.get(task_id)
    if task.retry_count >= task.max_retries:
        await queue.fail(task_id, "Max retries exceeded", retryable=False)
        return
    # Exponential backoff: 2^retry_count seconds
    delay = min(2 ** task.retry_count, 300)  # Cap at 5 minutes
    await asyncio.sleep(delay)
    # Re-enqueue with same priority
    await queue.enqueue(task, priority=task.priority)
```
---
## 8. Error Handling Strategy
| Error Type | Retry? | Action |
|------------|--------|--------|
| Subagent crash | Yes | Increment retry, requeue |
| Syntax error in code | No | Fail immediately |
| Timeout | Yes | Retry with longer timeout |
| API rate limit | Yes | Retry with exponential backoff |
| Out of memory | No | Fail, alert admin |
| Redis connection lost | Yes | Reconnect, retry operation |
---
## Next Steps
1. **Implement in-memory prototype** (Week 1)
2. **Add Redis persistence** (Week 2)
3. **Build worker pool** (Week 3-4)
4. **Integrate with Task tool** (Week 7-8)
5. **Write tests for queue durability** (ongoing)
---
*Design by Researcher subagent, March 18, 2026*


@@ -0,0 +1,835 @@
# Redis Task Queue Architecture for Letta Community ADE
## Executive Summary
This document outlines the architecture for replacing the in-memory `QueueRuntime` with a Redis-backed persistent task queue. The design prioritizes durability, horizontal scalability, and reliable task execution while maintaining compatibility with the existing Task tool and subagent spawning workflows.
**Key Decisions:**
- Use **Redis Streams** (not Sorted Sets) for the primary task queue to leverage consumer groups and at-least-once delivery guarantees
- Hybrid approach: Streams for queue semantics, Sorted Sets for scheduling/delays, Hashes for task state
- Stateless workers with heartbeat-based liveness detection
- Exponential backoff with jitter for retry logic
---
## 1. Redis Data Structures
### 1.1 Primary Queue: Redis Stream
```
Key: ade:queue:tasks
Type: Stream
Purpose: Main task ingestion and distribution
```
**Why Streams over Sorted Sets?**
| Feature | Sorted Sets | Redis Streams |
|---------|-------------|---------------|
| Ordering | Score-based (can have ties) | Strict temporal (millisecond ID) |
| Consumer Groups | Manual implementation | Built-in XREADGROUP |
| Delivery Semantics | At-most-once (easy) / At-least-once (complex) | At-least-once with ACK |
| Pending Tracking | Manual | Built-in XPENDING |
| Claim/Retry | Custom Lua scripts | Built-in XCLAIM/XAUTOCLAIM |
| Message Visibility | Immediate to all | Consumer-group isolated |
Streams provide the exact semantics needed for reliable task processing without custom Lua scripting.
**Stream Entries:**
```
XADD ade:queue:tasks * taskId <uuid> payload <json> priority <int>
```
### 1.2 Delayed Tasks: Sorted Set
```
Key: ade:queue:delayed
Type: Sorted Set (ZSET)
Score: scheduled execution timestamp (ms)
Member: taskId
```
Used for:
- Tasks with explicit `runAfter` timestamps
- Retry scheduling with exponential backoff
- Rate-limited task release
### 1.3 Task State Storage: Redis Hash
```
Key: ade:task:{taskId}
Type: Hash
Fields:
- id: string (UUID v4)
- status: pending|running|completed|failed
- payload: JSON (task arguments)
- createdAt: timestamp (ms)
- startedAt: timestamp (ms)
- completedAt: timestamp (ms)
- workerId: string (nullable)
- attemptCount: integer
- maxAttempts: integer (default: 3)
- error: string (last error message)
- result: JSON (completed task result)
- parentTaskId: string (nullable, for task chains)
- subagentId: string (link to subagent state)
- priority: integer (0-9, default 5)
- kind: message|task_notification|approval_result|overlay_action
TTL: 7 days (configurable cleanup for completed/failed tasks)
```
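Since Redis hashes store only flat string fields, task state needs an explicit flatten/parse step. A sketch in Python (field names follow the schema above; the helper names are assumptions):

```python
import json

# Fields the schema declares as numeric or JSON-valued
INT_FIELDS = ("createdAt", "startedAt", "completedAt",
              "attemptCount", "maxAttempts", "priority")
JSON_FIELDS = ("payload", "result")

def task_to_hash(task: dict) -> dict:
    """Flatten a task dict into string fields suitable for HSET."""
    out = {}
    for key, value in task.items():
        if value is None:
            continue  # omit nullable fields instead of storing "None"
        out[key] = json.dumps(value) if isinstance(value, (dict, list)) else str(value)
    return out

def hash_to_task(fields: dict) -> dict:
    """Rebuild typed values from the flat string fields HGETALL returns."""
    task = dict(fields)
    for name in INT_FIELDS:
        if name in task:
            task[name] = int(task[name])
    for name in JSON_FIELDS:
        if name in task:
            task[name] = json.loads(task[name])
    return task
```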
### 1.4 Worker Registry: Redis Hash + Sorted Set
```
Key: ade:workers:active
Type: Hash
Fields per worker:
- {workerId}: JSON { hostname, pid, startedAt, lastHeartbeat, version }
Key: ade:workers:heartbeat
Type: Sorted Set
Score: last heartbeat timestamp
Member: workerId
```
### 1.5 Consumer Group State
```
Stream Consumer Group: ade:queue:tasks
Group Name: ade-workers
Consumer Name: {workerId} (unique per process)
```
Redis Streams automatically track:
- Pending messages per consumer (XPENDING)
- Delivery count per message
- Idle time since last read
---
## 2. Task Entity Schema
### 2.1 TypeScript Interface
```typescript
// src/queue/redis/types.ts
export type TaskStatus =
| "pending" // Enqueued, not yet claimed
| "running" // Claimed by worker, processing
| "completed" // Successfully finished
| "failed" // Exhausted all retries
| "cancelled"; // Explicitly cancelled
export type TaskKind =
| "message"
| "task_notification"
| "approval_result"
| "overlay_action";
export interface TaskPayload {
// Task identification
id: string; // UUID v4
kind: TaskKind;
// Execution context
agentId?: string;
conversationId?: string;
clientMessageId?: string;
// Content (varies by kind)
content?: unknown; // For "message" kind
text?: string; // For notification/approval/overlay
// Subagent execution params (for task_notification)
subagentType?: string;
prompt?: string;
model?: string;
existingAgentId?: string;
existingConversationId?: string;
maxTurns?: number;
// Scheduling
priority: number; // 0-9, lower = higher priority
runAfter?: number; // Timestamp ms (for delayed tasks)
// Retry configuration
maxAttempts: number;
backoffMultiplier: number; // Default: 2
maxBackoffMs: number; // Default: 300000 (5 min)
// Metadata
enqueuedAt: number;
source: "user" | "system" | "hook";
}
export interface TaskState extends TaskPayload {
status: TaskStatus;
workerId?: string;
attemptCount: number;
startedAt?: number;
completedAt?: number;
error?: string;
result?: unknown;
// Coalescing support (from QueueRuntime)
isCoalescable: boolean;
scopeKey?: string; // For grouping coalescable items
}
```
### 2.2 State Transitions
```
PENDING --claim-----------------------> RUNNING
RUNNING --complete--------------------> COMPLETED
RUNNING --fail (retryable)------------> PENDING   (re-queued after backoff delay)
RUNNING --fail (final)----------------> FAILED    (retries exhausted)
```
---
## 3. Worker Pool Registration and Heartbeat
### 3.1 Worker Lifecycle
```typescript
// src/queue/redis/worker.ts
class TaskWorker {
private workerId: string;
private redis: RedisClient;
private isRunning: boolean = false;
private heartbeatInterval?: NodeJS.Timeout;
private claimInterval?: NodeJS.Timeout;
// Config
private readonly HEARTBEAT_INTERVAL_MS = 5000;
private readonly HEARTBEAT_TIMEOUT_MS = 30000;
private readonly CLAIM_BATCH_SIZE = 10;
private readonly PROCESSING_TIMEOUT_MS = 300000; // 5 min
async start(): Promise<void> {
this.workerId = generateWorkerId(); // {hostname}:{pid}:{uuid}
// Register in worker registry
await this.redis.hSet("ade:workers:active", this.workerId, JSON.stringify({
hostname: os.hostname(),
pid: process.pid,
startedAt: Date.now(),
lastHeartbeat: Date.now(),
version: process.env.npm_package_version || "unknown"
}));
// Create consumer in stream group (idempotent)
try {
await this.redis.xGroupCreate("ade:queue:tasks", "ade-workers", "$", {
MKSTREAM: true
});
} catch (err) {
// Group already exists - ignore
}
this.isRunning = true;
this.startHeartbeat();
this.startClaimLoop();
}
async stop(): Promise<void> {
this.isRunning = false;
clearInterval(this.heartbeatInterval);
clearInterval(this.claimInterval);
// Release pending tasks back to queue
await this.releasePendingTasks();
// Deregister
await this.redis.hDel("ade:workers:active", this.workerId);
await this.redis.zRem("ade:workers:heartbeat", this.workerId);
}
private startHeartbeat(): void {
this.heartbeatInterval = setInterval(async () => {
await this.redis.zAdd("ade:workers:heartbeat", {
score: Date.now(),
value: this.workerId
});
// Read the current registry entry so hostname/pid/version are preserved
const raw = await this.redis.hGet("ade:workers:active", this.workerId);
const currentInfo = raw ? JSON.parse(raw) : {};
await this.redis.hSet("ade:workers:active", this.workerId, JSON.stringify({
...currentInfo,
lastHeartbeat: Date.now()
}));
}, this.HEARTBEAT_INTERVAL_MS);
}
}
```
### 3.2 Dead Worker Detection
```typescript
// src/queue/redis/orchestrator.ts (singleton, per-deployment)
class QueueOrchestrator {
async detectAndReclaimDeadWorkerTasks(): Promise<number> {
const now = Date.now();
const cutoff = now - this.HEARTBEAT_TIMEOUT_MS;
// Find dead workers
const deadWorkers = await this.redis.zRangeByScore(
"ade:workers:heartbeat",
"-inf",
cutoff
);
let reclaimedCount = 0;
for (const workerId of deadWorkers) {
// Find pending tasks for this worker using XPENDING
const pending = await this.redis.xPendingRange(
"ade:queue:tasks",
"ade-workers",
"-",
"+",
this.CLAIM_BATCH_SIZE
);
for (const item of pending) {
if (item.consumer === workerId && item.idle > this.PROCESSING_TIMEOUT_MS) {
// Use XAUTOCLAIM to atomically claim and retry
const [nextId, claimed] = await this.redis.xAutoClaim(
"ade:queue:tasks",
"ade-workers",
"orchestrator", // consumer name for cleanup
this.PROCESSING_TIMEOUT_MS,
item.id,
{ COUNT: 1 }
);
// Release back to pending by ACKing (removes from pending list)
// The orchestrator will re-add to delayed queue for retry
await this.redis.xAck("ade:queue:tasks", "ade-workers", item.id);
await this.scheduleRetry(item.id);
reclaimedCount++;
}
}
// Clean up dead worker registration
await this.redis.hDel("ade:workers:active", workerId);
await this.redis.zRem("ade:workers:heartbeat", workerId);
}
return reclaimedCount;
}
}
```
---
## 4. Retry Logic with Exponential Backoff
### 4.1 Backoff Calculation
```typescript
// src/queue/redis/retry.ts
interface RetryConfig {
attempt: number; // 0-indexed (0 = first retry)
baseDelayMs: number; // Default: 1000
multiplier: number; // Default: 2
maxDelayMs: number; // Default: 300000 (5 min)
jitterFactor: number; // Default: 0.1 (10% randomization)
}
function calculateRetryDelay(config: RetryConfig): number {
// Exponential backoff: base * (multiplier ^ attempt)
const exponentialDelay = config.baseDelayMs *
Math.pow(config.multiplier, config.attempt);
// Cap at max
const cappedDelay = Math.min(exponentialDelay, config.maxDelayMs);
// Add jitter to prevent thundering herd: ±jitterFactor
const jitter = cappedDelay * config.jitterFactor * (Math.random() * 2 - 1);
return Math.floor(cappedDelay + jitter);
}
// Examples with defaults:
// Attempt 0 (first retry): ~1000ms ±100ms
// Attempt 1: ~2000ms ±200ms
// Attempt 2: ~4000ms ±400ms
// Attempt 3: ~8000ms ±800ms
// Attempt 4: ~16000ms ±1600ms
// ...up to max 300000ms (5 min)
```
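The same formula ported to Python, useful for sanity-checking the jitter bounds (a direct translation for illustration, not part of the spec):

```python
import math
import random

def calculate_retry_delay(attempt: int, base_ms: int = 1000, multiplier: float = 2,
                          max_ms: int = 300_000, jitter: float = 0.1) -> int:
    """Exponential backoff capped at max_ms, with +/-10% jitter by default."""
    capped = min(base_ms * multiplier ** attempt, max_ms)
    return math.floor(capped + capped * jitter * (random.random() * 2 - 1))
```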
### 4.2 Retry Flow
```typescript
async function handleTaskFailure(
taskId: string,
error: Error,
workerId: string
): Promise<void> {
const taskKey = `ade:task:${taskId}`;
const task = await redis.hGetAll(taskKey);
const attemptCount = parseInt(task.attemptCount) + 1;
const maxAttempts = parseInt(task.maxAttempts);
if (attemptCount >= maxAttempts) {
// Final failure - mark as failed
await redis.hSet(taskKey, {
status: "failed",
error: error.message,
completedAt: Date.now(),
attemptCount: attemptCount.toString()
});
// Publish failure event for observers
await redis.publish("ade:events:task-failed", JSON.stringify({
taskId,
error: error.message,
totalAttempts: attemptCount
}));
// ACK to remove from pending
await redis.xAck("ade:queue:tasks", "ade-workers", taskId);
} else {
// Schedule retry
    const delay = calculateRetryDelay({
      attempt: attemptCount - 1, // attemptCount is 1-based by this point; RetryConfig.attempt is 0-indexed
baseDelayMs: 1000,
multiplier: 2,
maxDelayMs: 300000,
jitterFactor: 0.1
});
const runAfter = Date.now() + delay;
// Update task state
await redis.hSet(taskKey, {
status: "pending",
attemptCount: attemptCount.toString(),
error: error.message,
workerId: "" // Clear worker assignment
});
// Add to delayed queue
await redis.zAdd("ade:queue:delayed", {
score: runAfter,
value: taskId
});
// ACK to remove from stream pending
await redis.xAck("ade:queue:tasks", "ade-workers", taskId);
}
}
```
### 4.3 Delayed Task Promoter
```typescript
// Runs periodically (every 1 second) to move due tasks from delayed set to stream
async function promoteDelayedTasks(): Promise<number> {
const now = Date.now();
  // Fetch due tasks (note: this read-then-remove pair is not atomic;
  // wrap it in a Lua script or MULTI if multiple promoters can run concurrently)
const dueTasks = await redis.zRangeByScore(
"ade:queue:delayed",
"-inf",
now,
{ LIMIT: { offset: 0, count: 100 } }
);
if (dueTasks.length === 0) return 0;
// Remove from delayed queue
await redis.zRem("ade:queue:delayed", dueTasks);
// Re-add to stream for processing
for (const taskId of dueTasks) {
const task = await redis.hGetAll(`ade:task:${taskId}`);
await redis.xAdd("ade:queue:tasks", "*", {
taskId,
payload: task.payload,
priority: task.priority
});
}
return dueTasks.length;
}
```
---
## 5. Integration with Existing Task.ts
### 5.1 Adapter Pattern
```typescript
// src/queue/redis/adapter.ts
import { QueueRuntime, QueueRuntimeOptions, QueueBlockedReason, QueueItem, DequeuedBatch } from "../queueRuntime";
import { RedisQueue } from "./queue";
/**
* Redis-backed implementation of QueueRuntime interface.
* Allows drop-in replacement of in-memory queue.
*/
export class RedisQueueAdapter implements QueueRuntime {
private redisQueue: RedisQueue;
private localBatchBuffer: Map<string, QueueItem> = new Map();
constructor(redisUrl: string, options?: QueueRuntimeOptions) {
this.redisQueue = new RedisQueue(redisUrl, {
...options,
onTaskCompleted: this.handleTaskCompleted.bind(this),
onTaskFailed: this.handleTaskFailed.bind(this)
});
}
async enqueue(input: Omit<QueueItem, "id" | "enqueuedAt">): Promise<QueueItem | null> {
// Map QueueItem to TaskPayload
const taskId = generateUUID();
const enqueuedAt = Date.now();
const payload: TaskPayload = {
id: taskId,
kind: input.kind,
agentId: input.agentId,
conversationId: input.conversationId,
clientMessageId: input.clientMessageId,
text: (input as any).text,
content: (input as any).content,
priority: 5, // Default priority
maxAttempts: 3,
backoffMultiplier: 2,
maxBackoffMs: 300000,
enqueuedAt,
source: "user",
isCoalescable: isCoalescable(input.kind)
};
const success = await this.redisQueue.enqueue(payload);
if (!success) return null;
return {
...input,
id: taskId,
enqueuedAt
} as QueueItem;
}
async tryDequeue(blockedReason: QueueBlockedReason | null): Promise<DequeuedBatch | null> {
if (blockedReason !== null) {
// Emit blocked event if needed (preserving QueueRuntime behavior)
return null;
}
// Claim batch from Redis
const batch = await this.redisQueue.claimBatch({
consumerId: this.workerId,
batchSize: this.getCoalescingBatchSize(),
coalescingWindowMs: 50 // Small window for coalescing
});
if (!batch || batch.length === 0) return null;
// Map back to QueueItem format
const items: QueueItem[] = batch.map(task => this.mapTaskToQueueItem(task));
return {
batchId: generateBatchId(),
items,
mergedCount: items.length,
queueLenAfter: await this.redisQueue.getQueueLength()
};
}
// ... other QueueRuntime methods
}
```
### 5.2 Task.ts Integration Points
**Current Flow (Task.ts line 403+):**
```typescript
// Background task spawning
const { taskId, outputFile, subagentId } = spawnBackgroundSubagentTask({
subagentType: subagent_type,
prompt,
description,
model,
toolCallId,
existingAgentId: args.agent_id,
existingConversationId: args.conversation_id,
maxTurns: args.max_turns,
});
```
**Proposed Redis Integration:**
```typescript
// New: Redis-backed task queue integration
interface TaskQueueEnqueueOptions {
subagentType: string;
prompt: string;
description: string;
model?: string;
toolCallId?: string;
existingAgentId?: string;
existingConversationId?: string;
maxTurns?: number;
priority?: number;
runInBackground?: boolean;
}
// In Task.ts - replace spawnBackgroundSubagentTask with:
export async function enqueueSubagentTask(
args: TaskQueueEnqueueOptions,
queue: RedisQueue
): Promise<TaskEnqueueResult> {
const taskId = generateTaskId();
const subagentId = generateSubagentId();
// Register in subagent state store (for UI)
registerSubagent(subagentId, args.subagentType, args.description, args.toolCallId, true);
const outputFile = createBackgroundOutputFile(taskId);
// Create task payload
const payload: TaskPayload = {
id: taskId,
kind: "task_notification",
subagentType: args.subagentType,
prompt: args.prompt,
description: args.description,
model: args.model,
existingAgentId: args.existingAgentId,
existingConversationId: args.existingConversationId,
maxTurns: args.maxTurns,
subagentId,
outputFile,
priority: args.priority ?? 5,
maxAttempts: 3,
backoffMultiplier: 2,
maxBackoffMs: 300000,
enqueuedAt: Date.now(),
source: "user",
isCoalescable: false // Task notifications are not coalescable
};
// Enqueue to Redis
await queue.enqueue(payload);
return { taskId, outputFile, subagentId };
}
```
### 5.3 Worker Implementation for Subagents
```typescript
// src/queue/redis/subagent-worker.ts
class SubagentTaskWorker extends TaskWorker {
protected async processTask(task: TaskState): Promise<void> {
// Update subagent state to "running"
updateSubagent(task.subagentId!, { status: "running" });
try {
// Execute subagent (existing manager.ts logic)
const result = await spawnSubagent(
task.subagentType!,
task.prompt!,
task.model,
task.subagentId!,
undefined, // signal - handled via task cancellation
task.existingAgentId,
task.existingConversationId,
task.maxTurns
);
// Write transcript
writeTaskTranscriptResult(task.outputFile!, result, "");
// Complete subagent state
completeSubagent(task.subagentId!, {
success: result.success,
error: result.error,
totalTokens: result.totalTokens
});
// Send notification if not silent
if (!task.silent) {
const notification = formatTaskNotification({
taskId: task.id,
status: result.success ? "completed" : "failed",
summary: `Agent "${task.description}" ${result.success ? "completed" : "failed"}`,
result: result.success ? result.report : result.error,
outputFile: task.outputFile!
});
// Add to message queue for parent agent
addToMessageQueue({
kind: "task_notification",
text: notification
});
}
// Mark task completed
await this.completeTask(task.id, result);
} catch (error) {
const errorMessage = error instanceof Error ? error.message : String(error);
// Update subagent state
completeSubagent(task.subagentId!, { success: false, error: errorMessage });
// Fail task (triggers retry logic)
await this.failTask(task.id, new Error(errorMessage));
}
}
}
```
---
## 6. Operational Considerations
### 6.1 Redis Configuration
```yaml
# Recommended Redis config for task queue
maxmemory: 1gb
maxmemory-policy: allkeys-lru # Evict old completed tasks first
# Persistence (for durability)
appendonly: yes
appendfsync: everysec
# Stream trimming (prevent unbounded growth)
# Set via XTRIM or MAXLEN on XADD
```
### 6.2 Key Patterns and Cleanup
| Key Pattern | Type | TTL | Cleanup Strategy |
|-------------|------|-----|------------------|
| `ade:queue:tasks` | Stream | - | XTRIM by MAXLEN (keep 100k) |
| `ade:queue:delayed` | ZSET | - | Processed by promoter |
| `ade:task:{id}` | Hash | 7 days | Expire completed/failed |
| `ade:workers:active` | Hash | - | On worker deregistration |
| `ade:workers:heartbeat` | ZSET | - | On worker timeout |
### 6.3 Monitoring Metrics
```typescript
// Metrics to expose via Prometheus/StatsD
interface QueueMetrics {
// Queue depth
"ade_queue_pending_total": number; // XPENDING count
"ade_queue_delayed_total": number; // ZCARD ade:queue:delayed
"ade_queue_stream_length": number; // XLEN ade:queue:tasks
// Throughput
"ade_tasks_enqueued_rate": number; // XADD rate
"ade_tasks_completed_rate": number; // Completion rate
"ade_tasks_failed_rate": number; // Failure rate
// Worker health
"ade_workers_active_total": number; // HLEN ade:workers:active
"ade_workers_dead_total": number; // Detected dead workers
// Processing
"ade_task_duration_ms": Histogram; // Time from claim to complete
"ade_task_wait_ms": Histogram; // Time from enqueue to claim
"ade_task_attempts": Histogram; // Distribution of retry counts
}
```
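As a sketch of how `ade_task_wait_ms` could be derived from task timestamps and bucketed Prometheus-style; the bucket boundaries here are illustrative, not part of the design.

```typescript
// Illustrative histogram buckets for ade_task_wait_ms (upper bounds in ms)
const WAIT_BUCKETS_MS = [10, 50, 100, 500, 1_000, 5_000, 30_000, Infinity];

// Cumulative counts per bucket (Prometheus "le" semantics): each entry
// counts samples at or below that bucket's upper bound
function toCumulativeHistogram(samples: number[], buckets: number[]): number[] {
  return buckets.map(upper => samples.filter(s => s <= upper).length);
}

// Wait time = time from enqueue to claim, per the metrics table above
function waitMs(enqueuedAt: number, claimedAt: number): number {
  return claimedAt - enqueuedAt;
}

const samples = [
  waitMs(1000, 1005),  // 5ms
  waitMs(1000, 1080),  // 80ms
  waitMs(1000, 3500),  // 2500ms
];
console.log(toCumulativeHistogram(samples, WAIT_BUCKETS_MS)); // [1, 1, 2, 2, 2, 3, 3, 3]
```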
### 6.4 Failure Modes
| Scenario | Handling |
|----------|----------|
| Redis unavailable | Tasks fail immediately; caller responsible for retry |
| Worker crash | Tasks reclaimed via heartbeat timeout (30s) |
| Poison message | Max retries (3) then moved to DLQ |
| Slow task | Processing timeout (5 min) triggers requeue |
| Duplicate task | Idempotent task IDs (UUID) prevent double execution |
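The duplicate-task row can be sketched as an enqueue-side guard. This in-memory version is hypothetical; a production implementation would use an atomic Redis operation such as `HSETNX` on `ade:task:{id}` so the check survives restarts and is shared across workers.

```typescript
// Tracks task IDs that have already been accepted, rejecting duplicates.
class IdempotentEnqueueGuard {
  private seen = new Set<string>();

  // Returns true if the task is new and should be enqueued,
  // false if this ID was already accepted earlier.
  accept(taskId: string): boolean {
    if (this.seen.has(taskId)) return false;
    this.seen.add(taskId);
    return true;
  }
}

const guard = new IdempotentEnqueueGuard();
console.log(guard.accept("task-123")); // true  (first time)
console.log(guard.accept("task-123")); // false (duplicate suppressed)
```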
---
## 7. Migration Strategy
### Phase 1: Dual-Write (Week 1)
- Implement RedisQueueAdapter
- Write to both in-memory and Redis queues
- Read from in-memory only (Redis for validation)
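The dual-write phase can be captured in a thin wrapper. `SimpleQueue` and `ArrayQueue` below are hypothetical stand-ins for the in-memory queue and the Redis adapter, not types from the codebase.

```typescript
interface SimpleQueue {
  enqueue(taskId: string, payload: string): Promise<void>;
  dequeue(): Promise<{ taskId: string; payload: string } | null>;
}

// Dual-write wrapper: writes go to both backends, reads come from the
// primary (in-memory) queue only; the secondary (Redis) copy exists
// purely for validation during Phase 1.
class DualWriteQueue implements SimpleQueue {
  constructor(
    private primary: SimpleQueue,   // in-memory, still authoritative
    private secondary: SimpleQueue, // Redis, validation only
  ) {}

  async enqueue(taskId: string, payload: string): Promise<void> {
    await this.primary.enqueue(taskId, payload);
    // A secondary failure must not break the live path
    try {
      await this.secondary.enqueue(taskId, payload);
    } catch (err) {
      console.warn("dual-write to Redis failed (validation copy only)", err);
    }
  }

  async dequeue() {
    return this.primary.dequeue();
  }
}

// Trivial in-memory backend used to exercise the wrapper
class ArrayQueue implements SimpleQueue {
  private items: { taskId: string; payload: string }[] = [];
  async enqueue(taskId: string, payload: string) { this.items.push({ taskId, payload }); }
  async dequeue() { return this.items.shift() ?? null; }
}
```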
### Phase 2: Shadow Mode (Week 2)
- Read from both queues
- Compare results, log discrepancies
- Fix any edge cases
### Phase 3: Cutover (Week 3)
- Switch reads to Redis
- Keep in-memory as fallback
- Monitor for 1 week
### Phase 4: Cleanup (Week 4)
- Remove in-memory queue code
- Full Redis dependency
---
## 8. Implementation Checklist
- [ ] Redis client configuration (ioredis or node-redis)
- [ ] Task entity schema and serialization
- [ ] Stream consumer group setup
- [ ] Worker registration and heartbeat
- [ ] Task claim and processing loop
- [ ] Retry logic with exponential backoff
- [ ] Delayed task promotion
- [ ] Dead worker detection and reclamation
- [ ] QueueRuntime adapter implementation
- [ ] Task.ts integration
- [ ] Subagent state synchronization
- [ ] Metrics and monitoring
- [ ] Error handling and DLQ
- [ ] Tests (unit, integration, load)
- [ ] Documentation
---
## 9. Appendix: Redis Commands Reference
| Operation | Command | Complexity |
|-----------|---------|------------|
| Enqueue task | `XADD` | O(1) |
| Claim tasks | `XREADGROUP` | O(N) N=count |
| Ack completion | `XACK` | O(1) |
| Get pending | `XPENDING` | O(1) |
| Claim pending | `XCLAIM` / `XAUTOCLAIM` | O(log N) |
| Delay task | `ZADD` delayed | O(log N) |
| Promote delayed | `ZRANGEBYSCORE` + `ZREM` + `XADD` | O(log N + M) |
| Register worker | `HSET` + `ZADD` | O(1) |
| Heartbeat | `ZADD` | O(log N) |
| Detect dead | `ZRANGEBYSCORE` | O(log N + M) |

---
**File:** `docs/ade-research.md`
# Agent Development Environment (ADE) Research
**Date:** March 17, 2026
**Purpose:** Compare existing ADE solutions to inform Letta Community ADE development
---
## Executive Summary
The ADE category emerged in 2025 as agentic AI proved too complex for traditional IDE/CLI tooling. Three primary architectures exist:
1. **Letta ADE** - Memory-first, context window transparency, multi-model
2. **Intent (Augment)** - Spec-driven with coordinator/specialist/verifier pattern
3. **Warp Oz** - Terminal-native with cloud orchestration
Each approaches multi-agent orchestration differently, offering distinct tradeoffs for community implementation.
---
## 1. Letta ADE (Our Foundation)
### Core Philosophy
> "Designing great agents is all about designing great context windows"
Letta ADE makes the opaque world of context windows and agent reasoning **visible and manageable**.
### Key Features
| Feature | Implementation |
|---------|---------------|
| **State & Memory** | Stateful agents that learn from interactions vs stateless LLMs |
| **Context Management** | Editable memory blocks, tools, system prompts with character limits |
| **Memory Architecture** | Core Memory (in-context blocks) + Archival/Recall Memory (vector DB) |
| **Transparent Reasoning** | All agents must show their work - reasoning separated from user communication |
| **Tool Integration** | 7,000+ tools via Composio, custom Python tool editor |
| **Production Modes** | Simple/Interactive/Debug modes for different use cases |
### Architecture Highlights
- **Core Memory**: Editable in-context blocks (`core_memory_append`, `core_memory_replace`)
- **Archival Memory**: Vector database for free-form storage (`archival_memory_insert`, `archival_memory_search`)
- **Recall Memory**: Automatic conversation history tracking (`conversation_search`)
- **Context Pruning**: Recursive summarization + message pruning to manage window size
### Strengths
✅ Memory-first design (MemGPT heritage)
✅ Transparent reasoning by design
✅ Context window controls
✅ Real-time tool execution in ADE
✅ Production deployment ready
### Letta Code CLI Features
- Client-side tool execution (Bash, Read, Write execute locally)
- Streaming API with background mode for long operations
- Conversations API for parallel sessions with shared memory
- Subagent spawning via Task tool
- Memory-first coding with persistent context
---
## 2. Intent by Augment Code
### Core Philosophy
> "Spec-Driven Development puts the spec at the center of your workflow"
Intent uses **living specifications** that update as agents work, preventing the "outdated PRD" problem.
### Key Features
| Feature | Implementation |
|---------|---------------|
| **Spec-Driven** | Living spec as source of truth - updates as code changes |
| **Coordinator Pattern** | Coordinator → Specialists → Verifier pipeline |
| **Parallel Work** | Isolated git worktrees for concurrent agent execution |
| **Specialist Agents** | Investigate, Implement, Verify, Critique, Debug, Code Review |
| **BYOA** | Bring Your Own Agent (Claude Code, Codex, OpenCode supported) |
| **Context Engine** | Semantic dependency analysis across 400,000+ files |
### Architecture: Coordinator/Specialist/Verifier
```
Coordinator Agent
↓ analyzes codebase, drafts spec, generates tasks
Specialist Agents (parallel in isolated worktrees)
↓ execute scoped tasks
Verifier Agent
↓ validates against spec before merge
Changes Tab
↓ human review, merge/stage/create PR
```
### Specialist Roles
- **Investigate** - Explore codebase, assess feasibility
- **Implement** - Execute implementation plans
- **Verify** - Check implementations match specs
- **Critique** - Review specs for feasibility
- **Debug** - Analyze and fix issues
- **Code Review** - Automated reviews with severity
### Unique Features
- **Git Worktree Isolation**: Each agent runs in independent working directory
- **WARP.md**: Compatible with agents.md, claude.md for agent behavior
- **Context Engine**: Call-graph and dependency-chain understanding
- **Verifier Agent**: Catches misalignment before human review
### Compliance
- SOC 2 Type II (zero deviations, Coalfire audited)
- ISO/IEC 42001 (AI governance certification)
- Customer-Managed Encryption Keys (CMEK)
- Air-gapped deployment options
### Strengths
✅ Living specs prevent drift
✅ Verifier catches misalignment
✅ Enterprise compliance (dual certification)
✅ BYOA prevents lock-in
✅ Context Engine handles massive codebases
---
## 3. Warp Oz (Terminal-Native ADE)
### Core Philosophy
> "Break out of your shell" - Terminal as the primary surface for agentic development
Warp reimagines the terminal as an agent platform with **Oz orchestration**.
### Key Features
| Feature | Implementation |
|---------|---------------|
| **Full Terminal Use** | Agents can run interactive CLI apps (REPLs, debuggers, top) |
| **Cloud Agents** | Background agents on Warp infrastructure or self-hosted |
| **Local Agents** | Real-time interactive coding in Warp terminal |
| **Auto-Tracking** | Every agent produces link + audit trail |
| **Multi-Model** | Mixed-model approach with fallback chains |
| **Skills** | Reusable instructions (compatible with Claude Code, Codex) |
### Architecture: Oz Platform
**Local Mode:**
- Run directly in Warp app
- Real-time, interactive assistance
- Multi-step planning, debugging, fixing
**Cloud Mode:**
- Run on Warp infrastructure (or self-hosted)
- Scheduled agents (cron-like)
- Event triggers (Slack, GitHub, webhooks)
- Parallel execution across repos
### Oz Capabilities
- **Environments**: Docker containers + git repos + startup commands
- **Session Sharing**: Links to track and steer agents
- **Artifacts**: PRs, branches, plans automatically tracked
- **Skills**: Any Skill can become an agent automation
- **API/SDK/CLI**: Fully programmable agent stack
### Unique Features
- **Multi-Repo Changes**: One agent can work across repos
- **Computer Use**: Visual verification via screenshots
- **Agent Session Sharing**: Hop into any running agent
- **Cloud Mode**: Background automation with full visibility
### Performance Claims
- Terminal-Bench: #1 ranked (52% → 61.2%)
- SWE-bench Verified: 71%
- 60%+ merged PRs created by Oz
- 700K+ active developers
### Security
- SOC 2 Type 2 certified
- Contractual Zero Data Retention (ZDR) with Anthropic, OpenAI, Fireworks, Google
- Configurable permissions (Never/Always allow/Prompt/Let agent decide)
- Agent Profiles (Prod mode/YOLO mode)
### Strengths
✅ Full terminal control (unique in market)
✅ Cloud agent infrastructure
✅ Multi-repo changes
✅ Contractual ZDR across all providers
✅ Terminal-native workflow
---
## 4. Feature Comparison Matrix
| Feature | Letta ADE | Intent | Warp Oz |
|---------|-----------|--------|---------|
| **Orchestration Model** | Memory-driven | Coordinator/Specialist/Verifier | Local + Cloud agents |
| **Core Abstraction** | Context windows + Memory | Living specs + Git worktrees | Terminal + Environments |
| **Multi-Agent** | Subagents via Task | Parallel specialists | Cloud agent pool |
| **Isolation** | Memory blocks | Git worktrees | Docker environments |
| **Context Strategy** | Hierarchical memory | Semantic Context Engine | Codebase indexing + MCP |
| **Verification** | Tool return validation | Verifier agent | Human-in-the-loop |
| **BYOA** | Open source, BYOK | Claude/Codex/OpenCode | Multi-model, BYOK |
| **Compliance** | SOC 2 | SOC 2 + ISO 42001 | SOC 2 + ZDR |
| **Scale** | — | 400K+ files | 700K+ developers, Terminal-Bench #1 |
| **Unique** | Memory-first | Spec-driven | Terminal-native |
---
## 5. Community ADE Recommendations
Based on this research, here's what a **Letta Community ADE** should prioritize:
### Phase 1: Foundation (Letta Already Has)
- ✅ Memory-first architecture (Core/Archival/Recall)
- ✅ Context window transparency
- ✅ Subagent spawning (Task tool)
- ✅ Real-time tool execution
- ✅ Multi-model support
### Phase 2: Enhanced Orchestration (From Intent)
- **Git Worktree Isolation**: Execute subagents in isolated branches
- **Coordinator Pattern**: Formal coordinator/specialist/verifier roles
- **Approval Queue Enhancement**: Structured task delegation
- **Spec Tracking**: Document what was planned vs executed
### Phase 3: Scale Features (From Warp)
- **Cloud Agent Mode**: Background agents with session tracking
- **Multi-Repo Support**: Cross-repository changes
- **Skills System**: Reusable agent instructions
- **Session Sharing**: Links to share agent runs
### Phase 4: Advanced Features
- **Verification Layer**: Automated spec compliance checking
- **Context Engine**: Semantic dependency analysis
- **Scheduling**: Recurring agent tasks
- **Event Triggers**: React to GitHub/Slack events
---
## 6. Key Implementation Insights
### From Intent: Spec-Driven Works
The "living spec" concept prevents the most common agent failure mode: drift between intent and implementation. Letta's memory blocks could serve this purpose with explicit "plan" vs "execution" blocks.
### From Warp: Terminal is Underrated
Full terminal control enables agents to use the same tools developers use (REPLs, debuggers, etc.). Letta Code's Bash tool already supports this, but could be enhanced with "terminal session" preservation.
### From Letta: Memory is Differentiating
Neither Intent nor Warp have Letta's tiered memory architecture. This is a unique strength to build upon - memory as the coordination layer, not just context.
---
## 7. Sources
1. [Letta ADE Blog](https://www.letta.com/blog/introducing-the-agent-development-environment)
2. [Letta ADE Docs](https://docs.letta.com/guides/ade/overview/)
3. [Intent by Augment](https://www.augmentcode.com/product/intent)
4. [Intent ADE Guide](https://www.augmentcode.com/guides/what-is-an-agentic-development-environment)
5. [Warp Oz Platform](https://www.warp.dev/oz)
6. [Warp Oz Launch](https://www.warp.dev/blog/oz-orchestration-platform-cloud-agents)
---
*Generated by Ani (Letta agent) - March 17, 2026*

---
# Community ADE Project - State Management
**Project:** Letta Community Agentic Development Environment
**Orchestrator:** Ani (Annie Tunturi)
**Created:** March 18, 2026
**Status:** Phase 1 - Orchestration Layer
---
## Active Subagents
| Subagent | Type | Status | Assigned Task | Output Location |
|----------|------|--------|---------------|-----------------|
| explorer-1 | explore | PENDING | Codebase exploration - task queue patterns | /tmp/ade-explorer-1/ |
| architect-1 | feature-architect | PENDING | Design Redis queue integration | /tmp/ade-architect-1/ |
| researcher-1 | researcher | COMPLETED | ADE competitive analysis | docs/community-ade-research-synthesis-2026-03-18.md |
---
## Document Registry
### Research Documents
- [x] `community-ade-research-2026-03-18.md` - Initial research
- [x] `ade-phase1-orchestration-design.md` - Phase 1 technical design
- [x] `community-ade-research-synthesis-2026-03-18.md` - Web research synthesis
- [x] `ade-phase1-execution-plan.md` - 6-week execution plan
### Design Documents
- [x] `ade-redis-queue-design.md` - Redis queue architecture (COMPLETED by researcher-2)
- [ ] `ade-task-queue-spec.md` - Detailed task queue specification (IN PROGRESS)
- [ ] `ade-worker-pool-design.md` - Worker pool architecture (PENDING)
- [ ] `ade-dashboard-wireframes.md` - Dashboard UI design (PENDING)
### Implementation
- [ ] `ade-queue-prototype/` - In-memory prototype (NOT STARTED)
- [ ] `ade-redis-queue/` - Redis-backed implementation (NOT STARTED)
- [ ] `ade-worker-process/` - Worker daemon (NOT STARTED)
---
## Current Phase: Phase 1 - Orchestration Layer
### Goals
1. Build persistent task queue system
2. Implement worker pool for subagent execution
3. Add retry logic with exponential backoff
4. Integrate with existing Task tool
### Decisions Made
- Use Redis (not Celery) for direct control
- In-memory prototype first, then Redis
- Worker pool with heartbeat monitoring
- Defer Temporal to Phase 2 evaluation
### Open Questions
- Should we use Redis Streams or Sorted Sets?
- Worker count: Fixed or dynamic?
- Task priority levels: Simple (high/normal) or granular?
---
## Subagent Work Queue
### Ready to Assign
1. **Explore task queue patterns in codebase**
- Type: explore
- Focus: Find existing queue/spawning code
- Output: File locations and patterns
2. **Design Redis queue architecture**
- Type: architect
- Focus: Data models, operations, integration points
- Output: Architecture spec document
3. **Research Playwright Computer Use**
- Type: researcher
- Focus: Browser automation for agentic coding
- Output: Integration approach
### Blocked
- None currently
### Completed
- [x] ADE competitive analysis (researcher-1)
---
## State Updates Log
**2026-03-18 09:23 EDT** - Project initiated, research documents created
**2026-03-18 10:01 EDT** - Attempting to spawn parallel subagents
**2026-03-18 10:03 EDT** - explorer-1 completed: Found Task.ts (line 403), manager.ts (spawnSubagent at line 883), in-memory QueueRuntime class. No Redis currently exists.
**2026-03-18 10:07 EDT** - researcher-2 completed: Redis queue architecture design. Key decisions: Redis Streams (consumer groups), Hash per task, 5s worker heartbeat, exponential backoff with jitter, adapter pattern integration.
---
*This file is maintained by Ani. Update when subagents report progress.*

---
# Community ADE Research - Letta vs Commercial Alternatives
**Date:** March 18, 2026
**Researcher:** Ani (researcher subagent)
**Goal:** Analyze Letta ADE capabilities and gaps vs Intent, Warp, and other commercial alternatives
---
## Executive Summary
Letta has a **solid foundation** for an ADE (Agentic Development Environment) with best-in-class memory management and subagent orchestration. The gaps are primarily in **orchestration polish, web UI, and team collaboration features** rather than core agent capabilities.
**Estimated effort to full community ADE:** 6-9 months for a small team
---
## 1. Current Letta ADE Capabilities
### ✅ Subagent Orchestration (MATURE)
- Built-in approval queues with ACCEPT/REJECT flow
- Multi-agent fan-out (parallel subagent spawning)
- Result aggregation and error handling
- Agent lifecycle management (create/deploy/destroy)
- Conversation threading for complex workflows
### ✅ Git Integration (UNIQUE STRENGTH)
- MemFS with git versioning
- Worktree isolation for parallel execution
- Automatic checkpointing
- Branch/merge semantics for agent memory
- Diff-based memory updates
### ✅ CLI Experience (MATURE)
- Full headless mode (`--headless`)
- JSON streaming output (`--output-format stream-json`)
- System prompt customization (`--system-custom`)
- Tool filtering (`--tools`, `--allowedTools`)
- Session persistence
### ✅ Memory System (BEST-IN-CLASS)
- Hierarchical memory blocks
- Resident/On-Demand archival tiers
- Clear-immune sacred memory
- Memory consolidation (Aster background agent)
- Git-backed durability
### ✅ Skills System (SOLID)
- Portable, reusable tool packages
- MCP (Model Context Protocol) integration
- Custom skill development
- Tool registry and discovery
### ✅ Model Agnostic
- Works with any OpenAI-compatible API
- Synthetic API integration
- Local Ollama support
- Multi-provider fallback
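Multi-provider fallback can be sketched as an ordered chain; the `Provider` interface below is a hypothetical simplification, not Letta's actual API.

```typescript
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

// Try each OpenAI-compatible provider in order; the first success wins,
// and the aggregated errors are surfaced only if every provider fails.
async function completeWithFallback(
  providers: Provider[],
  prompt: string,
): Promise<string> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      return await provider.complete(prompt);
    } catch (err) {
      errors.push(`${provider.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

In practice each `Provider` would wrap a base URL and API key; ordering the chain by cost or latency is a policy decision left to configuration.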
---
## 2. Gaps vs Commercial ADEs
| Feature | Letta Status | Intent | Warp | Notes |
|---------|-------------|--------|------|-------|
| **Web Dashboard** | ❌ Missing | ✅ Full UI | ✅ Full UI | CLI-only currently |
| **Triggers/Schedules** | ❌ Missing | ✅ GitHub webhooks | ✅ Built-in | No automated triggers |
| **GitHub Integration** | ⚠️ Manual | ✅ Native PR reviews | ✅ PR comments | No native GitHub app |
| **Persistent Task Queue** | ⚠️ In-process | ✅ Temporal/Cadence | ✅ Durable | Tasks lost on restart |
| **Spec-Driven Dev** | ⚠️ Basic | ✅ Full PRD support | ✅ Constitution | No formal spec layer |
| **Team Collaboration** | ❌ Missing | ✅ Shared workspaces | ✅ Org features | Single-user focus |
| **Observability** | ⚠️ Logs only | ✅ Full traces | ✅ Metrics | No dashboard metrics |
| **RBAC/Permissions** | ⚠️ Tool-level only | ✅ Role-based | ✅ Enterprise auth | No user roles |
---
## 3. Community ADE Implementation Roadmap
### Phase 1: Orchestration Layer (6-8 weeks)
**Goal:** Persistent, durable task execution
**Components:**
- Task queue (Redis/RabbitMQ)
- Durable execution (Temporal.io or Cadence)
- Retry logic with exponential backoff
- Task state persistence
- Failure recovery
**Key Decisions:**
- Use existing Letta subagent system
- Add queue abstraction layer
- Maintain CLI compatibility
### Phase 2: Web Dashboard (4-6 weeks)
**Goal:** Real-time visibility and control
**Features:**
- Agent grid view (running/pending/failed)
- Real-time logs (WebSocket streaming)
- Approval queue UI
- Memory browser
- Task history
**Tech Stack:**
- React + Vite
- FastAPI backend
- WebSocket for real-time updates
- PostgreSQL for metadata
### Phase 3: Integration Ecosystem (3-4 weeks)
**Goal:** Connect to external tools
**Integrations:**
- GitHub App (PR reviews, issue comments)
- Slack/Discord notifications
- Webhook triggers
- CI/CD pipeline hooks
- Linear/Jira ticket creation
### Phase 4: Specification Layer (8-10 weeks)
**Goal:** PRD-driven development
**Features:**
- Executable specifications (Zod schemas)
- Constitution enforcement
- Architectural guardrails
- Test-first enforcement
- Spec versioning
**Example Workflow:**
```yaml
specification:
prd: "./docs/prd.md"
constitution:
- "library-first: prefer existing packages"
- "test-first: all code must have tests"
- "simplicity: minimize abstraction layers"
execution:
generate_plan: true
auto_implement: false
review_checkpoints: true
```
### Phase 5: Team Collaboration (4-6 weeks)
**Goal:** Multi-user, organization support
**Features:**
- Shared agent registry
- Organization memory
- Agent permissions/roles
- Session sharing
- Comment/annotation system
---
## 4. Technical Architecture Recommendations
### Recommended Stack
| Layer | Technology | Rationale |
|-------|------------|-----------|
| **Orchestration** | Temporal.io | Durable execution, retries, observability |
| **Queue** | Redis | Reliable, fast, pub/sub support |
| **API** | FastAPI | Async native, easy WebSockets |
| **Dashboard** | React + Vite | Modern, good DX, fast builds |
| **Database** | PostgreSQL | ACID, JSON support, mature |
| **Memory** | Git + Letta MemFS | Existing infrastructure |
| **Auth** | OAuth 2.0 / SSO | Enterprise compatibility |
### Integration Points
```typescript
// Letta API Integration
interface LettaCloudConfig {
lettaBaseUrl: string;
apiKey: string;
// Cloud features
triggers: WebhookConfig[];
schedules: ScheduleConfig[];
// Team features
organizationId: string;
sharedAgents: string[];
}
// Specification Layer
interface SpecificationConfig {
prdPath: string;
constitution: string[];
guardrails: GuardrailRule[];
}
```
---
## 5. Letta's Unique Advantages
1. **Stateful by Design**: Unlike stateless alternatives, Letta's MemFS provides true persistence
2. **Git-Native**: Version control for agent memory is unique to Letta
3. **Model Agnostic**: Not locked into single provider
4. **Open Source**: Full transparency, community extensible
5. **CLI-First**: Developers love the terminal experience
---
## 6. Critical Path to MVP
**Minimum Viable Community ADE:**
1. ✅ **Already have**: Subagent orchestration, memory system, CLI
2. 🔄 **Need soon**: Persistent task queue, basic dashboard
3. 📋 **Next phase**: GitHub integration, triggers
4. 📋 **Future**: Full spec layer, team features
**Priorities:**
1. Fix Task tool reliability (persistent queue)
2. Build minimal dashboard (agent status, approvals)
3. Add GitHub webhook support
4. Implement basic spec validation
---
## 7. Conclusion
### Letta's Position
- **Best memory system** in the open-source ADE space
- **Solid foundation** for enterprise-grade ADE
- **Unique git-native approach** provides durability others lack
- **Gaps are solvable** with focused engineering effort
### Recommendation
**Build the community ADE on Letta**. The core architecture is superior to many commercial alternatives. The missing pieces (dashboard, queue durability, integrations) are well-understood engineering problems.
**Start with:**
1. Persistent task queue (fixes current Task reliability issues)
2. Minimal web dashboard (real-time visibility)
3. GitHub webhook integration (proves external integration model)
**Total effort to MVP:** 3-4 months focused work
**Total effort to full ADE:** 6-9 months
---
*Research conducted by Ani (researcher subagent) on March 18, 2026*

---
# Community ADE Research Synthesis
**Date:** March 18, 2026
**Researcher:** Research Agent
**Goal:** Synthesize web research findings with existing documents and recommend technical next steps
---
## Executive Summary
Based on web research and analysis of existing documents, the Letta community ADE opportunity is **stronger than initially assessed**. The research confirms:
1. **Letta's core architecture is genuinely differentiated** - No open-source competitor combines stateful agents, hierarchical memory, and git-native persistence
2. **Commercial ADEs (Warp, Intent) validate the market** but leave gaps Letta can fill
3. **Technical patterns are well-established** - Redis queues, Temporal workflows, FastAPI + React dashboards are proven patterns
4. **Community momentum exists** - Multiple open-source tools (Aider, Cline, Agno) show appetite for agentic development tools
**Updated recommendation:** The 6-9 month estimate is realistic. The critical path is **orchestration layer + dashboard MVP**.
---
## 1. Letta Position Analysis (Updated)
### Current Strengths Confirmed
From [docs.letta.com](https://docs.letta.com) and [github.com/letta-ai/letta](https://github.com/letta-ai/letta):
| Feature | Letta | Competitive Landscape |
|---------|-------|----------------------|
| **Stateful Agents** | ✅ Core design | ❌ Most tools (Cline, Aider) are stateless |
| **Memory Blocks** | ✅ Hierarchical, tiered | ❌ Simple conversation history only |
| **Git-Native** | ✅ MemFS with versioning | ⚠️ Aider has git commits, no MemFS |
| **Model Agnostic** | ✅ Any OpenAI-compatible | ⚠️ Many lock to specific providers |
| **Subagents** | ✅ Built-in orchestration | ⚠️ Limited or external |
| **Skills/MCP** | ✅ Portable tool packages | ✅ Emerging standard |
### Key Differentiator: Stateful + Memory
Letta's **memory-first architecture** is unique in the open-source space:
- **Agno** ([docs.agno.com](https://docs.agno.com)): Agent framework, but no persistent memory architecture
- **Cline** ([github.com/cline/cline](https://github.com/cline/cline)): VS Code extension, session-only state
- **Aider** ([github.com/Aider-AI/aider](https://github.com/Aider-AI/aider)): Git-aware but no agent memory
This positions Letta as the **only open-source option** for long-running, learning agents that persist knowledge across sessions.
---
## 2. Commercial ADE Analysis
### Intent (intent.dev)
**Core Value Proposition:** Specification-Driven Development (SDD)
From [intent.dev](https://intent.dev):
- **The Specification as the Asset**: PRD is source of truth
- **15-Minute Workflows**: Automated documentation generation
- **Architectural Guardrails**: "Development Constitution" for enforcing principles
- **Total Pivotability**: Change spec → regenerate implementation
**Key Insight for Letta:**
Intent focuses on the *specification layer* - the "what" before the "how". This is Phase 4 in our roadmap. Intent's approach validates that **executable specifications are valuable**, but they don't provide the underlying agent infrastructure. Letta could integrate Intent-style spec validation without building the spec layer from scratch.
**Gap Letta Can Fill:**
- Intent appears to be a platform/service; Letta is open-source and self-hostable
- Intent doesn't mention stateful/memory-enabled agents
- No evidence of subagent orchestration
### Warp (warp.dev)
**Core Value Proposition:** Terminal-native ADE with "Oz" agent
From [warp.dev](https://warp.dev):
- **Multi-model by default**: OpenAI, Anthropic, Google models
- **Full Terminal Use**: Interactive terminal commands
- **Computer Use**: Browser automation for verification
- **MCP Support**: Linear, Figma, Slack, Sentry integration
- **WARP.md**: Configuration files compatible with agents.md, claude.md
**Key Features:**
```
- IDE + CLI hybrid experience
- Agent code review interface (line-level comments)
- Universal Input: @file, image upload, URL attachment
- Snapshot/restore for workspace exploration
- Enterprise: SSO, audit trails, VPC support
```
**Key Insight for Letta:**
Warp validates the **terminal-native workflow** but extends it with:
1. **Rich prompt context** (@mentions, images, URLs)
2. **Code review UI** in terminal (Letta could add approval queue UI)
3. **MCP ecosystem** (Letta already supports skills, should expand MCP)
4. **Workspace snapshots** (Letta's git worktrees provide similar isolation)
**Critical Gap Warp Leaves:**
- No mention of durable execution or task persistence
- Appears to be single-session focused
- No subagent orchestration (single agent model)
- No persistent memory across sessions
### Antigravity
**Status:** No functional website found. May be early-stage or rebranded.
---
## 3. Open-Source Community Landscape
### Aider (Most Direct Comparison)
From [github.com/Aider-AI/aider](https://github.com/Aider-AI/aider):
**Strengths:**
- Multi-file editing with diff view
- Automatic git commits with sensible messages
- Repository mapping for large codebases
- Voice support, image input
- Works with almost any LLM (including local)
- Lint/test integration (auto-fix)
- Strong community (28k+ GitHub stars)
**Architecture:**
- Git-repo centered (not stateful agents)
- Edits files directly (no MemFS)
- No subagent orchestration
- Session-based (no persistent memory)
**Lessons for Letta:**
- Aider's git integration patterns are excellent (auto-commit, sensible messages)
- Codebase mapping is crucial for large projects
- Diff-based editing is user-friendly
- Multi-model support is table stakes
### Cline (VS Code Extension)
From [github.com/cline/cline](https://github.com/cline/cline):
**Strengths:**
- Human-in-the-loop GUI (approve every change)
- VS Code native integration
- Computer Use (browser automation)
- MCP support for custom tools
- Workspace snapshots and restore
- Token/cost tracking
- Terminal integration in editor
**Architecture:**
- Extension-based (VS Code only)
- Session-based state
- No subagent orchestration
- No persistent memory
**Lessons for Letta:**
- Human-in-the-loop approval is important for trust
- Workspace snapshots enable safe exploration
- Cost transparency (token tracking) is valuable
- Browser automation (Computer Use) is becoming standard
### Agno
From [docs.agno.com](https://docs.agno.com):
**Positioning:** "AgentOS" - platform for building agents
**Observations:**
- Focus on production deployment patterns
- Less mature than Letta in documentation
- No clear differentiation on memory/state
---
## 4. Technical Pattern Research
### 4.1 Task Queues: Redis + Python
From [redis.io/docs](https://redis.io/docs/latest/develop/clients/redis-py/):
**Redis-py patterns for Letta:**
```python
import time
import redis

# Connection with auto-decode (strings in and out, no manual byte handling)
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Example values; in the worker these come from the Task tool call
task_id, prompt, subagent_type, priority = 'task-1', 'research queues', 'researcher', 1

# Hash for task storage (matches our design)
r.hset(f'task:{task_id}', mapping={
    'status': 'pending',
    'prompt': prompt,
    'subagent_type': subagent_type,
    'created_at': time.time(),
})

# Sorted Set for the priority queue
r.zadd('queue:pending', {task_id: priority})

# Atomic dequeue with WATCH/MULTI/EXEC (retry on redis.WatchError in real code)
with r.pipeline() as pipe:
    pipe.watch('queue:pending')
    head = pipe.zrange('queue:pending', 0, 0)  # peek head of the queue (lowest score first)
    if head:
        pipe.multi()
        pipe.zrem('queue:pending', head[0])
        pipe.hset(f'task:{head[0]}', 'status', 'running')
        pipe.execute()
```
**Key Insights:**
- `decode_responses=True` eliminates manual byte decoding
- `hset` with `mapping=` is clean for task storage
- Redis transactions (MULTI/EXEC) provide atomic queue operations
- Consider `redis[hiredis]` for performance
**Alternative: Celery**
From [docs.celeryq.dev](https://docs.celeryq.dev):
Celery provides:
- Distributed task processing
- Real-time + scheduling
- Worker management
- Built-in retry logic
**Recommendation:** For Phase 1, use **raw Redis** (not Celery):
- Celery adds abstraction layer that may conflict with Letta's specific needs
- Our task queue has unique requirements (subagent spawning, git worktrees)
- Raw Redis gives more control over the data model
- Can migrate to Celery later if needed
### 4.2 Durable Execution: Temporal
From [temporal.io](https://temporal.io):
**Core Value:** "Write code as if failure doesn't exist"
**Key Concepts:**
- **Workflows**: Durable, fault-tolerant business logic
- **Activities**: Retryable, failure-prone operations
- **State persistence**: Automatic checkpointing
- **Signals**: External events to running workflows
**Temporal for Letta ADE:**
```python
# Potential Workflow structure (sketch; activity functions defined elsewhere)
from datetime import timedelta
from temporalio import workflow
from temporalio.common import RetryPolicy

@workflow.defn
class SubagentWorkflow:
    @workflow.run
    async def run(self, task_id: str):
        # Activity: Spawn subagent (may fail; retried up to 3 times)
        result = await workflow.execute_activity(
            spawn_subagent,
            task_id,
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )
        # Activity: Wait for completion (long-running)
        final_result = await workflow.execute_activity(
            poll_subagent_completion,
            result.agent_id,
            start_to_close_timeout=timedelta(hours=1),
        )
        return final_result
```
**Decision Matrix:**
| Approach | Complexity | Durability | Letta Fit |
|----------|-----------|------------|-----------|
| Raw Redis + Workers | Low | Medium | ✅ Good fit |
| Temporal | Medium | High | ⚠️ Overkill? |
| Celery | Low | Medium | ⚠️ Too abstract |
**Recommendation:**
- **Phase 1**: Use Redis + custom workers (existing design)
- **Phase 2**: Evaluate Temporal for complex multi-step workflows
- Temporal shines for **long-running, multi-step** workflows with human-in-the-loop
- Letta's subagent tasks are relatively simple (spawn → run → complete)
### 4.3 Web Dashboard: React + FastAPI Patterns
From [ui.shadcn.com](https://ui.shadcn.com):
**Shadcn/ui** provides:
- Unstyled, accessible components
- Tailwind CSS integration
- Customizable design system
- Modern React patterns
**Recommended Dashboard Stack:**
```
Frontend:
- React + Vite (existing plan)
- shadcn/ui for components
- TanStack Query for data fetching
- WebSocket client for real-time updates
Backend:
- FastAPI (async native)
- WebSocket support built-in
- Redis for pub/sub
- PostgreSQL for metadata
```
**Dashboard Features (Prioritized):**
| Priority | Feature | Tech |
|----------|---------|------|
| P0 | Agent grid view | React + TanStack Query |
| P0 | Real-time logs | WebSocket |
| P1 | Approval queue UI | shadcn Dialog + Table |
| P1 | Task history | TanStack Query + Pagination |
| P2 | Memory browser | Tree view component |
| P2 | Metrics dashboard | Recharts or Tremor |
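The P0 "real-time logs" row hinges on a fan-out step between the event source (Redis pub/sub) and connected WebSocket clients. A transport-agnostic sketch of that hub using asyncio queues — the Redis subscription and the FastAPI WebSocket wiring are assumed, not shown:

```python
import asyncio

class LogHub:
    """Fan one stream of events out to many subscribers (one queue per WebSocket)."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    def publish(self, event: dict) -> None:
        # In the dashboard this would be called from the Redis pub/sub listener.
        for q in self._subscribers:
            q.put_nowait(event)

async def demo() -> list[dict]:
    hub = LogHub()
    a, b = hub.subscribe(), hub.subscribe()  # two connected clients
    hub.publish({'task': 'task-1', 'line': 'spawned subagent'})
    return [await a.get(), await b.get()]

events = asyncio.run(demo())
```

Each WebSocket handler would `subscribe()` on connect, forward queue items to the socket, and `unsubscribe()` on disconnect.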
---
## 5. Updated Gap Analysis
### Revised Comparison Table
| Feature | Letta | Intent | Warp | Aider | Cline |
|---------|-------|--------|------|-------|-------|
| **Web Dashboard** | ❌ | ✅ | ✅ | ❌ | ❌ |
| **Persistent Tasks** | ⚠️ | ✅ | ⚠️ | ❌ | ❌ |
| **Stateful Agents** | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| **Subagent Orchestration** | ✅ | ❌ | ❌ | ❌ | ❌ |
| **Git-Native Memory** | ✅ | ❌ | ❌ | ⚠️ | ❌ |
| **MCP/Skills** | ✅ | ❌ | ✅ | ❌ | ✅ |
| **Approval Queues** | ✅ CLI | ✅ | ✅ | ❌ | ✅ |
| **Multi-Model** | ✅ | ? | ✅ | ✅ | ✅ |
| **Computer Use** | ❌ | ? | ✅ | ❌ | ✅ |
| **Spec-Driven Dev** | ⚠️ | ✅ | ❌ | ❌ | ❌ |
**Key Insights:**
1. **Letta leads in agent infrastructure** (state, memory, orchestration)
2. **Commercial tools lead in UX** (dashboards, IDE integration)
3. **No competitor has Letta's core combination** (stateful + memory + git + subagents)
4. **Computer Use** is becoming standard (Warp, Cline have it)
---
## 6. Critical Gaps Identified
### High Priority (MVP Blockers)
1. **Computer Use / Browser Automation**
- Warp and Cline both offer browser automation
- Letta agents should be able to launch browsers, click, screenshot
- Enables web testing, visual verification
2. **Rich Context in Prompts**
- Warp's "Universal Input" (@file, images, URLs)
- Currently requires manual file reading
- Should support: `@path/to/file`, `@https://example.com`, drag-drop images
3. **Workspace Snapshots**
- Cline's snapshot/restore feature
- Letta git worktrees provide foundation
- Need UI for "save point" and "restore to point"
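The "rich context" gap above is mostly a parsing problem. A minimal sketch of a mention parser — the `@file` / `@URL` syntax is borrowed from Warp's Universal Input, and the exact grammar here is an assumption, not a spec:

```python
import re

# Matches @https://... URLs, or @relative/absolute file paths.
MENTION_RE = re.compile(r'@(https?://\S+|[\w./~-]+)')

def parse_mentions(prompt: str) -> dict:
    """Split a prompt into plain text plus file/URL attachments to pre-load."""
    files, urls = [], []
    for token in MENTION_RE.findall(prompt):
        (urls if token.startswith(('http://', 'https://')) else files).append(token)
    text = MENTION_RE.sub('', prompt).strip()
    return {'text': text, 'files': files, 'urls': urls}

result = parse_mentions('Fix the bug in @src/queue.py, docs at @https://redis.io/docs')
```

The orchestrator would then read each file and fetch each URL before handing the expanded context to the agent.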
### Medium Priority (Competitive Parity)
4. **Cost Tracking**
- Cline shows token usage and cost per request
- Important for agent transparency
5. **Voice Input Flow**
- Warp integrates with Wispr
- Nice-to-have, not MVP critical
6. **MCP Ecosystem Expansion**
- Expand skills to full MCP server support
- Community MCP marketplace
---
## 7. Revised Implementation Recommendations
### Phase 1: Orchestration Layer (Refined)
**Duration:** 6 weeks (was 8)
**Changes from original design:**
1. **Week 1-2**: In-memory prototype (unchanged)
2. **Week 3-4**: Redis integration + worker pool (unchanged)
3. **Week 5**: API endpoints (REST + WebSocket)
4. **Week 6**: Task tool integration + testing
**Additions:**
- WebSocket endpoint for real-time task updates
- Cost tracking (token counting)
- Rich context parsing (@file, URLs)
**Deferred:**
- Temporal integration (evaluate in Phase 2)
- Advanced retry policies (basic exponential backoff sufficient)
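The "basic exponential backoff" kept for Phase 1 can be as small as this sketch; the attempt count and delay bounds are placeholder values:

```python
import random
import time

def retry_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on exception with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))  # jitter avoids thundering herd

# Example: a task that succeeds on the third try.
calls = {'n': 0}
def flaky():
    calls['n'] += 1
    if calls['n'] < 3:
        raise RuntimeError('transient')
    return 'ok'

result = retry_with_backoff(flaky, sleep=lambda _: None)  # no real sleeping in the demo
```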
### Phase 2: Dashboard MVP (Refined)
**Duration:** 4 weeks
**Week 1**: Project setup + Agent grid view
- Vite + React + shadcn/ui setup
- TanStack Query integration
- Basic agent status display
**Week 2**: Real-time features
- WebSocket connection
- Live log streaming
- Status updates
**Week 3**: Approval queue UI
- Pending approvals list
- Accept/Reject buttons
- Comment/feedback input
**Week 4**: Task history + polish
- Task list with filtering
- Detail view
- Error display
**Additions based on research:**
- Cost display (per task, total)
- Workspace snapshot indicator
- @mention support in prompts
### Phase 3: Integration Ecosystem (Unchanged)
**Add priority:**
1. GitHub App (highest - matches Aider/Warp)
2. Slack notifications
3. Linear/Jira (MCP-based)
4. Webhook triggers
### Phase 4: Computer Use (NEW PHASE)
**Duration:** 4 weeks
**Rationale:** Computer Use is becoming table stakes (Warp, Cline have it)
**Scope:**
- Browser automation (Playwright integration)
- Screenshot capture
- Click/type/scroll actions
- Visual verification workflows
**Integration:**
- New skill: `computer_use`
- Subagent can launch browser
- Screenshots stored in MemFS
### Phase 5: Specification Layer (Refined)
**Duration:** 6 weeks (was 8-10)
**Scope reduction:**
- Start with PRD validation (Zod schemas)
- Basic constitution enforcement (regex + AST rules)
- No full natural language spec parsing yet
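"Basic constitution enforcement (regex + AST rules)" can start as a list of forbidden-pattern rules run over changed files. A sketch — the rule names and patterns are illustrative, not a proposed standard:

```python
import re

# Each rule: (name, compiled pattern, message). Patterns here are examples only.
CONSTITUTION = [
    ('no-print-debug', re.compile(r'^\s*print\(', re.M),
     'use the logger, not print()'),
    ('no-hardcoded-secret', re.compile(r'(api_key|password)\s*=\s*["\']\w+', re.I),
     'secrets must come from the environment'),
]

def check_constitution(source: str) -> list[dict]:
    """Return one violation record per rule that matches the source text."""
    return [
        {'rule': name, 'message': msg}
        for name, pattern, msg in CONSTITUTION
        if pattern.search(source)
    ]

violations = check_constitution('print("debug")\napi_key = "abc123"\n')
```

AST-based rules would slot in beside the regex ones as callables taking a parsed module instead of raw text.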
**Deferred:**
- Full spec regeneration (Intent-level functionality)
- Architectural diagram generation
### Phase 6: Team Collaboration (Unchanged)
---
## 8. Technical Stack Recommendations (Updated)
### Orchestration Layer
| Component | Original | Updated | Rationale |
|-----------|----------|---------|-----------|
| Queue | Redis | Redis ✅ | Proven, matches research |
| Durable Execution | Temporal | Redis + Workers | Temporal overkill for Phase 1 |
| Workers | Python asyncio | Python asyncio ✅ | Good fit |
| API | FastAPI | FastAPI ✅ | Async native, WebSocket support |
### Dashboard
| Component | Original | Updated | Rationale |
|-----------|----------|---------|-----------|
| Framework | React | React ✅ | Standard |
| Build Tool | Vite | Vite ✅ | Fast, modern |
| UI Library | - | shadcn/ui | Accessible, customizable |
| Styling | - | Tailwind CSS | Standard with shadcn |
| Data Fetching | - | TanStack Query | Caching, real-time sync |
| Charts | - | Tremor/Recharts | Dashboard metrics |
### Additional Components
| Component | Recommendation |
|-----------|----------------|
| Browser Automation | Playwright |
| Rich Context Parsing | Custom parser (@file, URL regex) |
| Cost Tracking | Token counting in subagent wrapper |
| WebSocket | FastAPI native + Redis pub/sub |
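The "token counting in subagent wrapper" row can be sketched as a small accumulator. The per-1K-token prices below are placeholders; a real wrapper would take usage numbers and pricing from the provider's API response and price sheet:

```python
class CostTracker:
    """Accumulate token usage per task; prices are illustrative, per 1K tokens."""

    def __init__(self, input_price=0.003, output_price=0.015):
        self.input_price = input_price
        self.output_price = output_price
        self.usage: dict[str, dict] = {}

    def record(self, task_id: str, input_tokens: int, output_tokens: int) -> None:
        u = self.usage.setdefault(task_id, {'input': 0, 'output': 0})
        u['input'] += input_tokens
        u['output'] += output_tokens

    def cost(self, task_id: str) -> float:
        u = self.usage.get(task_id, {'input': 0, 'output': 0})
        return (u['input'] * self.input_price + u['output'] * self.output_price) / 1000

tracker = CostTracker()
tracker.record('task-1', input_tokens=1200, output_tokens=400)
tracker.record('task-1', input_tokens=800, output_tokens=600)
```

The dashboard's cost display (per task, total) then reads straight out of `usage` and `cost()`.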
---
## 9. Risks and Mitigations
### Identified Risks
| Risk | Impact | Mitigation |
|------|--------|------------|
| Web search unavailable for research | Medium | Use fetch_webpage for known URLs |
| Temporal overengineering | High | Defer to Phase 2 evaluation |
| Dashboard scope creep | High | Strict MVP definition (4 weeks) |
| Computer Use complexity | Medium | Use Playwright, limit scope |
| Competition pace | Medium | Focus on Letta differentiators |
### Competitive Response
**If Warp releases open-source:**
- Warp is terminal + IDE hybrid; Letta is agent infrastructure
- Different target users (Warp = developers, Letta = agent builders)
- Letta's stateful/memory approach still differentiated
**If Intent releases spec layer as open standard:**
- Letta could adopt Intent spec format
- Focus on execution infrastructure
- Potential collaboration opportunity
---
## 10. Next Steps (Prioritized)
### Immediate (This Week)
1. **Decision:** Confirm Redis-only vs Temporal evaluation
2. **Prototype:** Build in-memory task queue (Week 1)
3. **Research:** Playwright integration for Computer Use
4. **Design:** Dashboard wireframes (shadcn components)
### Week 2-3
5. **Implement:** Redis integration
6. **Test:** Worker pool with 3+ concurrent workers
7. **API:** REST endpoints for task management
### Week 4-6
8. **Integrate:** Modify Task tool to use queue
9. **WebSocket:** Real-time updates endpoint
10. **Dashboard:** Start React project setup
### Success Metrics
| Metric | Target |
|--------|--------|
| Task durability | 0% loss on restart |
| Worker concurrency | 5+ parallel tasks |
| Dashboard load time | <2 seconds |
| Approval latency | <1 second from event |
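The "task durability" target implies a recovery pass on worker startup: any task left in `running` when the process died goes back into the pending queue. A minimal in-memory sketch of that pass, with plain dicts standing in for the Redis hashes and sorted set from section 4.1:

```python
def recover_orphaned_tasks(tasks: dict, pending: dict) -> list:
    """Requeue tasks stuck in 'running' after a crash; return the requeued ids."""
    requeued = []
    for task_id, task in tasks.items():
        if task['status'] == 'running':
            task['status'] = 'pending'
            pending[task_id] = task.get('priority', 0)  # back into the priority queue
            requeued.append(task_id)
    return requeued

tasks = {
    'task-1': {'status': 'running', 'priority': 2},   # worker died mid-task
    'task-2': {'status': 'complete'},
}
pending = {}
requeued = recover_orphaned_tasks(tasks, pending)
```

In the Redis version the same loop becomes a scan over `task:*` hashes plus a `zadd` back onto `queue:pending`, ideally inside a transaction.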
---
## 11. Conclusion
### Updated Assessment
**Letta's position is stronger than initially thought:**
1. **No open-source competitor** has the combination of:
- Stateful agents with hierarchical memory
- Git-native persistence
- Subagent orchestration
- Model agnostic design
2. **Commercial tools validate the market** but focus on different layers:
- Intent: Specification layer (Letta can integrate)
- Warp: Terminal UX (Letta can offer alternative)
3. **Technical patterns are well-understood**:
- Redis queues: Proven, simple
- Temporal: Powerful but may be overkill
- React + FastAPI: Standard, well-supported
### Final Recommendation
**Proceed with Phase 1 (Orchestration) immediately.**
The research confirms:
- The orchestration layer design is sound
- Redis is the right choice for Phase 1
- The dashboard stack (React + shadcn + TanStack Query) is industry standard
- Competitive pressure is real but Letta has unique advantages
**Revised Timeline:**
- Phase 1 (Orchestration): 6 weeks
- Phase 2 (Dashboard): 4 weeks
- Phase 3 (Integrations): 4 weeks
- Phase 4 (Computer Use): 4 weeks
- Phase 5 (Specifications): 6 weeks
- Phase 6 (Team): 4 weeks
**Total to full ADE:** 7 months (was 6-9)
**MVP (Phases 1-2):** 10 weeks (was 3-4 months)
---
*Research synthesis conducted on March 18, 2026*
*Sources: Letta docs, Intent.dev, Warp.dev, Temporal.io, Redis docs, Celery docs, GitHub (Aider, Cline)*