Initial commit: Community ADE foundation

- Project structure: docs/, src/, tests/, proto/
- Research synthesis: Letta vs commercial ADEs
- Architecture: Redis Streams queue design
- Phase 1 orchestration design
- Execution plan and project state tracking
- Working subagent system (manager.ts fixes)

This is the foundation for a Community ADE built on Letta's stateful agent architecture with git-native MemFS.

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta Code <noreply@letta.com>
.gitignore (vendored, new file)
@@ -0,0 +1,46 @@
# Dependencies
node_modules/
package-lock.json
yarn.lock
pnpm-lock.yaml

# Build outputs
dist/
build/
*.tsbuildinfo

# Environment
.env
.env.local
.env.*.local

# IDE
.vscode/
.idea/
*.swp
*.swo
*~

# OS
.DS_Store
Thumbs.db

# Logs
logs/
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Testing
coverage/
.nyc_output/

# Redis
dump.rdb
*.rdb

# Temporary
tmp/
temp/
*.tmp
README.md (new file)
@@ -0,0 +1,57 @@
# Community ADE (Agentic Development Environment)

A community-driven, open-source agentic development environment built on Letta's stateful agent architecture.

## Vision

Build an open-source ADE that combines:
- **Stateful agents** with hierarchical memory (Letta's unique strength)
- **Git-native persistence** with MemFS versioning
- **Persistent task queues** for durable subagent execution
- **Web dashboard** for real-time monitoring and control
- **Computer Use** integration for browser automation

## Differentiation

Unlike commercial alternatives (Warp, Intent), Community ADE is:
- **Open source** and self-hostable
- **Stateful by design** - agents remember across sessions
- **Model agnostic** - use any OpenAI-compatible API
- **Git-native** - version control for agent memory

## Project Structure

```
├── src/        # Queue implementation and worker pool
├── tests/      # Test suite
├── docs/       # Architecture and design documents
├── proto/      # Prototypes and experiments
└── README.md   # This file
```

## Documentation

- [Project State](docs/community-ade-project-state.md) - Current status and active subagents
- [Phase 1 Design](docs/ade-phase1-orchestration-design.md) - Task queue architecture
- [Redis Queue Design](docs/ade-redis-queue-design.md) - Detailed Redis implementation spec
- [Research Synthesis](docs/community-ade-research-synthesis-2026-03-18.md) - Competitive analysis

## Phase 1: Orchestration Layer (In Progress)

Goals:
1. ✅ Research and design complete
2. 🔄 Redis task queue implementation
3. ⏳ Worker pool with heartbeat
4. ⏳ Integration with Letta Task tool

## Quick Start

Coming soon - queue prototype implementation.

## License

MIT. Community contributions welcome.

---

*Project orchestrated by Ani, with research and design by specialized subagents.*
docs/ade-phase1-execution-plan.md (new file)
@@ -0,0 +1,525 @@
# Phase 1 Execution Plan: Orchestration Layer

**Date:** March 18, 2026
**Status:** Ready for Implementation
**Estimated Duration:** 6 weeks
**Owner:** TBD

---

## Overview

This document provides actionable implementation guidance for Phase 1 of the Community ADE, based on synthesized research from commercial tools (Intent, Warp) and open-source alternatives (Aider, Cline, Agno).

---

## Key Research Insights

### 1. Letta's Competitive Position

**✅ Strongest Open-Source Position:**
- No competitor combines: stateful agents + hierarchical memory + git-native persistence + subagent orchestration
- Aider has git integration but no agent memory
- Cline is session-based with no persistence
- Agno lacks Letta's memory architecture

**⚠️ Commercial Tools Lead in UX:**
- Warp: Terminal-native with rich context (@file, images)
- Intent: Specification-driven development
- Both have web dashboards; Letta needs one

### 2. Technical Pattern Validation

**Redis + Workers (Selected for Phase 1):**
- ✅ Proven pattern (Celery uses Redis under the hood)
- ✅ Simpler than Temporal for our use case
- ✅ More control over the data model
- ⚠️ Temporal deferred to Phase 2 evaluation

**React + FastAPI (Selected for Phase 2):**
- ✅ Industry standard
- ✅ shadcn/ui provides accessible components
- ✅ TanStack Query for caching/real-time sync

---

## Phase 1 Scope

### Goals
1. Replace in-process Task execution with a persistent queue
2. Ensure tasks survive agent restarts
3. Support 5+ concurrent workers
4. Maintain backward compatibility

### Out of Scope (Phase 2+)
- Web dashboard (Phase 2)
- Temporal workflows (Phase 2 evaluation)
- GitHub integration (Phase 3)
- Computer Use (Phase 4)

---

## Implementation Breakdown

### Week 1: In-Memory Prototype

**Deliverables:**
- [ ] `TaskQueue` class with asyncio.Queue
- [ ] Task dataclass with all fields
- [ ] Worker process skeleton
- [ ] Basic enqueue/dequeue/complete/fail operations

**Testing:**
```python
# Test: Task survives worker crash
# Test: Concurrent task execution
# Test: Priority ordering
```

**Code Structure:**
```
letta_ade/
├── __init__.py
├── queue/
│   ├── __init__.py
│   ├── models.py        # Task dataclass, enums
│   ├── memory_queue.py  # Week 1 implementation
│   └── base.py          # Abstract base class
└── worker/
    ├── __init__.py
    └── runner.py        # Worker process logic
```
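The Week 1 deliverables above can be sketched as a minimal in-memory queue. This is an illustrative prototype only: the class and method names (`MemoryTaskQueue`, `enqueue`, `dequeue`, `complete`, `fail`) follow the plan's vocabulary but are not the final API, and nothing here persists across restarts.

```python
import asyncio
import itertools
import uuid
from dataclasses import dataclass, field

@dataclass(order=True)
class _Entry:
    priority: int
    seq: int                           # tie-breaker: equal priorities stay FIFO
    task_id: str = field(compare=False)

class MemoryTaskQueue:
    """Week 1 prototype: single process, no persistence."""

    def __init__(self) -> None:
        self._pq: asyncio.PriorityQueue = asyncio.PriorityQueue()
        self._tasks: dict[str, dict] = {}  # task_id -> task record
        self._seq = itertools.count()

    async def enqueue(self, prompt: str, priority: int = 100) -> str:
        task_id = str(uuid.uuid4())
        self._tasks[task_id] = {"prompt": prompt, "status": "pending",
                                "priority": priority, "error": None}
        await self._pq.put(_Entry(priority, next(self._seq), task_id))
        return task_id

    async def dequeue(self) -> dict:
        entry = await self._pq.get()   # lowest priority value wins
        task = self._tasks[entry.task_id]
        task["status"] = "running"
        return {"id": entry.task_id, **task}

    async def complete(self, task_id: str, result: dict) -> None:
        self._tasks[task_id].update(status="completed", result=result)

    async def fail(self, task_id: str, error: str) -> None:
        self._tasks[task_id].update(status="failed", error=error)

    def status(self, task_id: str) -> str:
        return self._tasks[task_id]["status"]

async def demo() -> tuple[str, str]:
    q = MemoryTaskQueue()
    low = await q.enqueue("summarize docs", priority=200)
    await q.enqueue("fix failing test", priority=10)
    first = await q.dequeue()
    return first["prompt"], q.status(low)
```

Swapping this class for a Redis-backed implementation behind the same abstract base (`base.py` above) is what makes Week 2 a drop-in change.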
### Week 2: Redis Integration

**Deliverables:**
- [ ] Redis connection manager
- [ ] Task serialization (JSON/pickle)
- [ ] Atomic dequeue with WATCH/MULTI/EXEC
- [ ] Status tracking (Sets per status)

**Redis Schema:**
```redis
# Task storage
HSET task:{uuid} field value ...

# Priority queue (pending)
ZADD queue:pending {priority} {task_id}

# Running tasks
ZADD queue:running {started_at} {task_id}

# Status index
SADD status:pending {task_id}
SADD status:running {task_id}
SADD status:completed {task_id}
SADD status:failed {task_id}

# User index
SADD user:{user_id}:tasks {task_id}
```
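Because HSET stores flat string fields, the serialization deliverable amounts to flattening a task record into strings and parsing it back. A JSON-based sketch follows; the helper names (`to_redis_hash`, `from_redis_hash`) are hypothetical, not part of the plan's API.

```python
import json
from datetime import datetime

def to_redis_hash(task: dict) -> dict[str, str]:
    """Flatten a task record into the string fields HSET expects."""
    out: dict[str, str] = {}
    for key, value in task.items():
        if value is None:
            continue                      # omit unset optional fields
        if isinstance(value, datetime):
            out[key] = value.isoformat()
        elif isinstance(value, (dict, list)):
            out[key] = json.dumps(value)  # nested payloads as JSON strings
        else:
            out[key] = str(value)
    return out

def from_redis_hash(fields: dict[str, str]) -> dict:
    """Inverse mapping; coerce the known typed fields back."""
    task = dict(fields)
    if "priority" in task:
        task["priority"] = int(task["priority"])
    if "result" in task:
        task["result"] = json.loads(task["result"])
    if "created_at" in task:
        task["created_at"] = datetime.fromisoformat(task["created_at"])
    return task
```

JSON rather than pickle keeps the hash fields readable from `redis-cli` and avoids executing untrusted bytes on deserialization.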
**Dependencies:**
```toml
[tool.poetry.dependencies]
redis = { version = "^5.0", extras = ["hiredis"] }
```

### Week 3-4: Worker Pool + Heartbeat

**Deliverables:**
- [ ] Multiple worker processes
- [ ] Worker heartbeat (every 30s)
- [ ] Stall detection (2x heartbeat timeout)
- [ ] Graceful shutdown handling
- [ ] Worker capacity management

**Worker Logic:**
```python
async def worker_loop(agent_id: UUID, queue: TaskQueue):
    while running:  # flag cleared by the graceful-shutdown handler
        # Send heartbeat
        await queue.heartbeat(agent_id)

        # Try to get a task (5s timeout)
        task = await queue.dequeue(agent_id, timeout_ms=5000)

        if task:
            # Spawn subagent process
            proc = await asyncio.create_subprocess_exec(
                "letta", "run-agent",
                f"--task-id={task.id}",
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE
            )

            # Wait for completion
            stdout, stderr = await proc.communicate()

            # Update queue
            if proc.returncode == 0:
                await queue.complete(task.id, parse_result(stdout))
            else:
                await queue.fail(task.id, stderr.decode())

        # Brief pause to prevent a tight loop
        await asyncio.sleep(0.1)
```

**Stall Recovery (Cron job):**
```python
async def recover_stalled_tasks(queue: TaskQueue, max_age: timedelta):
    """Requeue tasks from crashed workers."""
    stalled = await queue.find_stalled(max_age)
    for task_id in stalled:
        await queue.requeue(task_id)
```
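The stall-detection deliverable (2x heartbeat timeout) reduces to comparing each worker's last heartbeat against a cutoff. A minimal sketch; `find_stalled_workers` is an illustrative helper, not a method the plan defines.

```python
HEARTBEAT_INTERVAL_S = 30
STALL_TIMEOUT_S = 2 * HEARTBEAT_INTERVAL_S  # 2x heartbeat, per the deliverables

def find_stalled_workers(last_heartbeats: dict[str, float],
                         now: float) -> list[str]:
    """Return worker ids whose last heartbeat is older than the stall timeout."""
    cutoff = now - STALL_TIMEOUT_S
    return [wid for wid, ts in last_heartbeats.items() if ts < cutoff]
```

The cron job above would then requeue every task currently assigned to a returned worker id.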
### Week 5: API Layer

**Deliverables:**
- [ ] FastAPI application structure
- [ ] REST endpoints (CRUD for tasks)
- [ ] WebSocket endpoint for real-time updates
- [ ] Authentication middleware

**REST Endpoints:**
```python
@app.post("/tasks")
async def create_task(task: TaskCreate) -> TaskResponse:
    """Enqueue a new task."""
    task_id = await queue.enqueue(task)
    return TaskResponse(task_id=task_id, status="pending")

@app.get("/tasks/{task_id}")
async def get_task(task_id: UUID) -> Task:
    """Get task status and result."""
    return await queue.get(task_id)

@app.get("/tasks")
async def list_tasks(
    user_id: str,
    status: Optional[TaskStatus] = None
) -> List[TaskSummary]:
    """List tasks with optional filtering."""
    return await queue.list_by_user(user_id, status)

@app.post("/tasks/{task_id}/cancel")
async def cancel_task(task_id: UUID):
    """Cancel a pending or running task."""
    await queue.cancel(task_id)

@app.post("/tasks/{task_id}/retry")
async def retry_task(task_id: UUID):
    """Retry a failed task."""
    await queue.retry(task_id)
```

**WebSocket:**
```python
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()

    # Subscribe to Redis pub/sub for updates (redis.asyncio client)
    pubsub = redis.pubsub()
    await pubsub.subscribe("task_updates")

    async for message in pubsub.listen():
        if message["type"] == "message":
            await websocket.send_json(message["data"])
```

### Week 6: Task Tool Integration

**Deliverables:**
- [ ] Modify existing Task tool to use the queue
- [ ] `persist` flag for backward compatibility
- [ ] Polling support for task completion
- [ ] Migration guide for existing code

**Modified Task Tool:**
```python
class TaskTool:
    async def run(
        self,
        prompt: str,
        subagent_type: str,
        # ... existing args ...
        persist: bool = False,   # NEW
        priority: int = 100,     # NEW
        wait: bool = False,      # NEW
        timeout: int = 300,      # NEW
    ) -> TaskResult:

        if persist:
            # Enqueue and optionally wait
            task_id = await self.queue.enqueue(...)

            if wait:
                # Poll for completion
                result = await self._wait_for_task(task_id, timeout)
                return result
            else:
                # Return immediately with task_id
                return TaskResult(task_id=task_id, status="pending")
        else:
            # Legacy immediate execution
            return await self._execute_immediately(...)
```
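The `_wait_for_task` helper referenced above is not specified; a plausible polling implementation is sketched below, assuming the queue exposes the `get_status` and `get` operations defined in the orchestration design. `TaskTimeout`, the standalone function name, and the poll interval are all illustrative.

```python
import asyncio
import time

class TaskTimeout(Exception):
    """Raised when a persisted task does not finish within the deadline."""

async def wait_for_task(queue, task_id, timeout: int,
                        poll_interval: float = 1.0):
    """Poll the queue until the task reaches a terminal status."""
    status = "unknown"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = await queue.get_status(task_id)
        if status in ("completed", "failed", "cancelled"):
            return await queue.get(task_id)   # full record, incl. result/error
        await asyncio.sleep(poll_interval)
    raise TaskTimeout(f"task {task_id} still {status!r} after {timeout}s")
```

Polling keeps the caller decoupled from the worker; a Phase 2 refinement could replace it with the Redis pub/sub channel the WebSocket endpoint already consumes.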
---

## Technical Specifications

### Task Data Model

```python
@dataclass
class Task:
    id: UUID
    subagent_type: str
    prompt: str
    system_prompt: Optional[str]
    model: Optional[str]

    # State
    status: TaskStatus
    created_at: datetime
    started_at: Optional[datetime]
    completed_at: Optional[datetime]

    # Execution
    agent_id: Optional[UUID]

    # Results
    result: Optional[dict]
    error: Optional[str]
    exit_code: Optional[int]

    # Metadata
    tags: List[str]
    user_id: str
    parent_task: Optional[UUID]

    # Defaulted fields last, so the dataclass definition is valid
    priority: int = 100
    retry_count: int = 0
    max_retries: int = 3

    # Cost tracking (NEW)
    input_tokens: int = 0
    output_tokens: int = 0
    estimated_cost: float = 0.0
```

### Retry Logic

```python
async def retry_with_backoff(task: Task) -> bool:
    if task.retry_count >= task.max_retries:
        return False  # Permanent failure

    # Exponential backoff: 2^retry_count seconds
    delay = min(2 ** task.retry_count, 300)  # Cap at 5 min

    await asyncio.sleep(delay)
    task.retry_count += 1

    # Re-enqueue with same priority
    await queue.enqueue(task, priority=task.priority)
    return True
```
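The retry helper above uses a deterministic delay; the Redis queue design calls for backoff *with jitter*, which avoids synchronized retry storms when many tasks fail at once. A full-jitter delay calculation is sketched below (the helper name and base/cap parameters are illustrative; only the 2^n growth and 5-minute cap come from the plan).

```python
import random

def backoff_delay(retry_count: int, base: float = 1.0,
                  cap: float = 300.0) -> float:
    """Exponential backoff capped at 5 minutes, with full jitter."""
    upper = min(base * (2 ** retry_count), cap)
    # Full jitter: draw uniformly in [0, upper] to de-synchronize workers
    return random.uniform(0.0, upper)
```

`retry_with_backoff` would then call `await asyncio.sleep(backoff_delay(task.retry_count))` instead of sleeping for the exact power of two.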
### Error Classification

| Error | Retry? | Action |
|-------|--------|--------|
| Subagent crash | Yes | Requeue with backoff |
| Syntax error | No | Fail immediately |
| API rate limit | Yes | Exponential backoff |
| Out of memory | No | Alert admin, fail |
| Redis connection | Yes | Reconnect, retry |
| Timeout | Yes | Retry with longer timeout |
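In code, the table above collapses to a retryability predicate. The error-kind identifiers below are illustrative labels for the table rows, not names the plan defines.

```python
RETRYABLE = {"subagent_crash", "api_rate_limit", "redis_connection", "timeout"}
PERMANENT = {"syntax_error", "out_of_memory"}

def is_retryable(error_kind: str) -> bool:
    """Mirror the error-classification table: retry transient failures only."""
    if error_kind in RETRYABLE:
        return True
    if error_kind in PERMANENT:
        return False
    # Unknown failure modes should be triaged, not silently retried
    raise ValueError(f"unclassified error kind: {error_kind}")
```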
---

## Testing Strategy

### Unit Tests
```python
# test_queue.py
def test_enqueue_creates_pending_task(): ...
def test_dequeue_removes_from_pending(): ...
def test_complete_moves_to_completed(): ...
def test_fail_triggers_retry(): ...
def test_max_retries_exceeded(): ...
def test_cancel_stops_running_task(): ...
```

### Integration Tests
```python
# test_worker.py
async def test_worker_processes_task(): ...
async def test_worker_handles_failure(): ...
async def test_worker_heartbeat(): ...
async def test_stall_recovery(): ...
```

### Durability Tests
```python
# test_durability.py
async def test_tasks_survive_restart():
    """Enqueue tasks, restart Redis, verify tasks persist."""

async def test_worker_crash_recovery():
    """Kill worker mid-task, verify task requeued."""

async def test_concurrent_workers():
    """5 workers, 20 tasks, verify all complete."""
```

---

## Dependencies

### Required
```toml
redis = { version = "^5.0", extras = ["hiredis"] }
fastapi = "^0.115"
websockets = "^13.0"
pydantic = "^2.0"
```

### Development
```toml
pytest = "^8.0"
pytest-asyncio = "^0.24"
httpx = "^0.27"  # For FastAPI test client
```

### Infrastructure
- Redis 7.0+ (local or cloud)
- Python 3.11+

---

## Migration Guide

### For Existing Task Tool Users

**Before:**
```python
result = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder"
)  # Blocks until complete
```

**After (backward compatible):**
```python
# Same behavior (immediate execution)
result = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder",
    persist=False  # default
)
```

**New (persistent):**
```python
# Fire-and-forget
task_id = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder",
    persist=True
)

# Wait for completion
result = await task_tool.run(
    prompt="Create a React component",
    subagent_type="coder",
    persist=True,
    wait=True,
    timeout=600
)
```

---

## Success Criteria

| Metric | Target | Measurement |
|--------|--------|-------------|
| Task durability | 100% | Tasks never lost on restart |
| Throughput | 10 tasks/min | With 3 workers |
| Latency | <100ms | Enqueue → pending |
| Recovery time | <60s | Worker crash → requeue |
| API uptime | 99.9% | Health check endpoint |
| Backward compat | 100% | Existing tests pass |

---

## Risk Mitigation

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| Redis complexity | Low | Medium | Start with simple ops |
| Worker pool bugs | Medium | High | Extensive testing |
| Performance issues | Low | Medium | Load testing in Week 5 |
| Migration breakage | Low | High | Full test suite |

---

## Handoff to Phase 2

**Phase 2 Prereqs:**
- [ ] All Phase 1 success criteria met
- [ ] API documentation complete
- [ ] WebSocket tested with a simple client
- [ ] Cost tracking working

**Phase 2 Inputs:**
- Task queue API (REST + WebSocket)
- Task data model
- Worker management API
- Redis schema

---

## Appendix: Quick Reference

### Redis Commands Cheat Sheet

```bash
# Start Redis
docker run -d -p 6379:6379 redis:7-alpine

# Monitor
redis-cli monitor

# Inspect keys
redis-cli KEYS "task:*"
redis-cli HGETALL task:abc-123

# Clear queue (development only - deletes every key in the DB)
redis-cli FLUSHDB
```

### Development Commands

```bash
# Start worker
python -m letta_ade.worker.runner --agent-id worker-1

# Start API
uvicorn letta_ade.api:app --reload

# Run tests
pytest tests/ -v --tb=short

# Integration test
pytest tests/integration/ -v
```

---

*Ready for implementation. Questions? See community-ade-research-synthesis-2026-03-18.md for full context.*
docs/ade-phase1-orchestration-design.md (new file)
@@ -0,0 +1,307 @@
# Phase 1: Orchestration Layer Design

**Date:** March 18, 2026
**Architect:** Researcher subagent
**Goal:** Design a persistent task queue system for Community ADE

---

## 1. Core Data Model

```python
@dataclass
class Task:
    id: UUID                       # Unique task identifier
    subagent_type: str             # "researcher", "coder", etc.
    prompt: str                    # User prompt to subagent
    system_prompt: Optional[str]   # Override default system prompt
    model: Optional[str]           # Override default model

    # State tracking
    status: TaskStatus             # pending/running/completed/failed/cancelled
    created_at: datetime
    started_at: Optional[datetime]
    completed_at: Optional[datetime]

    # Execution tracking
    agent_id: Optional[UUID]       # Assigned worker agent

    # Results
    result: Optional[dict]         # Success result
    error: Optional[str]           # Failure message
    exit_code: Optional[int]       # Subprocess exit code

    # Metadata
    tags: List[str]                # For filtering/grouping
    user_id: str                   # Task owner
    parent_task: Optional[UUID]    # For task chains

    # Defaulted fields last, so the dataclass definition is valid
    priority: int = 100            # Lower = higher priority
    retry_count: int = 0
    max_retries: int = 3
```

### TaskStatus Enum
```python
class TaskStatus(Enum):
    PENDING = "pending"        # Waiting for worker
    RUNNING = "running"        # Assigned to worker
    COMPLETED = "completed"    # Success
    FAILED = "failed"          # Permanent failure (max retries)
    CANCELLED = "cancelled"    # User cancelled
    STALLED = "stalled"        # Worker crashed, needs recovery
```

---

## 2. State Machine

```
                      +-----------+
                      |  PENDING  |
                      +-----+-----+
                            | dequeue()
                            v
+--------+           +------+------+           +-----------+
| FAILED |<----------+   RUNNING   +---------->| COMPLETED |
+--------+  fail()   +---+-----+---+  success  +-----------+
    ^      (max          |     |
    |      retries)      |     | stall detected (down)
    +--------------------+     | reassigned (up)
          retry()        |     |
                cancel() |  +--+-------+
                         |  | STALLED  |
                         v  +----------+
                  +-----------+
                  | CANCELLED |
                  +-----------+
```

### Transitions
- `PENDING → RUNNING`: Worker dequeues task
- `RUNNING → COMPLETED`: Subagent succeeds
- `RUNNING → FAILED`: Subagent fails, max retries reached
- `RUNNING → STALLED`: Worker heartbeat timeout
- `STALLED → RUNNING`: Reassigned to new worker
- `FAILED → RUNNING`: Manual retry triggered
- `Any → CANCELLED`: User cancellation
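A guard over the transition table keeps illegal moves out of the queue layer. A minimal sketch, assuming statuses are the plain-string enum values; the `ALLOWED` mapping and `transition` helper are illustrative, not part of the design's API.

```python
ALLOWED = {
    ("pending", "running"),    # worker dequeues task
    ("running", "completed"),  # subagent succeeds
    ("running", "failed"),     # max retries reached
    ("running", "stalled"),    # worker heartbeat timeout
    ("stalled", "running"),    # reassigned to new worker
    ("failed", "running"),     # manual retry triggered
}

def transition(current: str, new: str) -> str:
    """Validate a status change against the state machine above."""
    if new == "cancelled":     # Any → CANCELLED: user cancellation
        return new
    if (current, new) not in ALLOWED:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```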
---

## 3. Redis Data Structures

| Purpose | Structure | Key Pattern |
|---------|-----------|-------------|
| Task payload | Hash | `task:{task_id}` |
| Pending queue | Sorted Set (by priority) | `queue:pending` |
| Running tasks | Sorted Set (by start time) | `queue:running` |
| Worker registry | Hash | `worker:{agent_id}` |
| Status index | Set per status | `status:{status}` |
| User tasks | Set | `user:{user_id}:tasks` |

### Example Redis Operations

```redis
# Enqueue (pending)
ZADD queue:pending {priority} {task_id}
HSET task:{task_id} status pending created_at {timestamp} ...
SADD status:pending {task_id}

# Dequeue (atomic)
WATCH queue:pending
task_id = ZPOPMIN queue:pending
MULTI
ZADD queue:running {now} {task_id}
HSET task:{task_id} status running agent_id {worker} started_at {now}
SMOVE status:pending status:running {task_id}
EXEC

# Complete
ZREM queue:running {task_id}
SADD status:completed {task_id}
HSET task:{task_id} status completed result {...} completed_at {now}

# Fail with retry
HINCRBY task:{task_id} retry_count 1
ZADD queue:pending {priority} {task_id}  # Re-enqueue
SMOVE status:running status:pending {task_id}
HSET task:{task_id} status pending error {...}

# Stall recovery (cron job)
SMEMBERS queue:running
# For each task where worker heartbeat > threshold:
ZREM queue:running {task_id}
SADD status:stalled {task_id}
ZADD queue:pending {priority} {task_id}
```

---

## 4. Key API Methods

```python
class TaskQueue:
    # Core operations
    async def enqueue(task: Task) -> UUID
    async def dequeue(worker_id: UUID, timeout_ms: int = 5000) -> Optional[Task]
    async def complete(task_id: UUID, result: dict) -> None
    async def fail(task_id: UUID, error: str, retryable: bool = True) -> None
    async def cancel(task_id: UUID) -> None

    # Management
    async def retry(task_id: UUID) -> None                     # Manual retry
    async def requeue_stalled(max_age_ms: int = 60000) -> int  # Recover crashed
    async def get_status(task_id: UUID) -> TaskStatus
    async def list_by_user(user_id: str, status: Optional[str]) -> List[TaskSummary]

    # Worker management
    async def register_worker(agent_id: UUID, capacity: int) -> None
    async def heartbeat(agent_id: UUID) -> None
    async def unregister_worker(agent_id: UUID, reason: str) -> None
```

---

## 5. Integration with Existing Task Tool

### Current Flow
```
Task tool → immediate subprocess spawn → wait → return result
```

### New Flow (with persistence)
```
Task tool → enqueue() → return task_id (immediate)
                ↓
Background worker → dequeue() → spawn subprocess → complete()/fail()
                ↓
Caller polls/gets notification when task completes
```

### Changes to Task Tool Schema
```python
class TaskTool:
    async def run(
        self,
        prompt: str,
        subagent_type: str,
        # ... existing args ...
        persist: bool = False,            # NEW: enqueue instead of immediate run
        priority: int = 100,              # NEW
        tags: Optional[List[str]] = None  # NEW
    ) -> TaskResult:
        if persist:
            task_id = await self.queue.enqueue(...)
            return TaskResult(task_id=task_id, status="pending")
        else:
            # Legacy: immediate execution
            ...
```

### Worker Agent Integration

**Worker subscribes to queue:**
```python
async def worker_loop(agent_id: UUID):
    while running:
        task = await queue.dequeue(agent_id, timeout_ms=5000)
        if task:
            # Spawn subprocess
            proc = await asyncio.create_subprocess_exec(
                "letta", "run-agent", f"--task-id={task.id}",
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE
            )

            # Monitor and wait
            stdout, stderr = await proc.communicate()

            # Update queue based on result
            if proc.returncode == 0:
                await queue.complete(task.id, parse_result(stdout))
            else:
                await queue.fail(task.id, stderr.decode(), retryable=True)
```

---

## 6. Implementation Phases

### Phase 1a: In-Memory Prototype (Week 1)
- Python `asyncio.Queue` for pending tasks
- In-memory dict for task storage
- Single worker process
- No Redis dependency

### Phase 1b: Redis Integration (Week 2)
- Replace queue with Redis
- Add task persistence
- Implement retry logic
- Add stall recovery

### Phase 1c: Worker Pool (Week 3-4)
- Multiple worker processes
- Worker heartbeat monitoring
- Task assignment logic
- Graceful shutdown handling

### Phase 1d: API & CLI (Week 5-6)
- REST API for task management
- CLI commands for queue inspection
- Task status dashboard endpoint
- Webhook notifications

### Phase 1e: Integration (Week 7-8)
- Modify Task tool to use the queue
- Add persistence flag
- Maintain backward compatibility
- Migration path for existing code

---

## 7. Retry Logic with Exponential Backoff

```python
async def retry_with_backoff(task_id: UUID):
    task = await queue.get(task_id)

    if task.retry_count >= task.max_retries:
        await queue.fail(task_id, "Max retries exceeded", retryable=False)
        return

    # Exponential backoff: 2^retry_count seconds
    delay = min(2 ** task.retry_count, 300)  # Cap at 5 minutes

    await asyncio.sleep(delay)

    # Re-enqueue with same priority
    await queue.enqueue(task, priority=task.priority)
```

---

## 8. Error Handling Strategy

| Error Type | Retry? | Action |
|------------|--------|--------|
| Subagent crash | Yes | Increment retry, requeue |
| Syntax error in code | No | Fail immediately |
| Timeout | Yes | Retry with longer timeout |
| API rate limit | Yes | Retry with exponential backoff |
| Out of memory | No | Fail, alert admin |
| Redis connection lost | Yes | Reconnect, retry operation |

---

## Next Steps

1. **Implement in-memory prototype** (Week 1)
2. **Add Redis persistence** (Week 2)
3. **Build worker pool** (Week 3-4)
4. **Integrate with Task tool** (Week 7-8)
5. **Write tests for queue durability** (ongoing)

---

*Design by Researcher subagent, March 18, 2026*
docs/ade-redis-queue-design.md (new file)
@@ -0,0 +1,835 @@
# Redis Task Queue Architecture for Letta Community ADE

## Executive Summary

This document outlines the architecture for replacing the in-memory `QueueRuntime` with a Redis-backed persistent task queue. The design prioritizes durability, horizontal scalability, and reliable task execution while maintaining compatibility with the existing Task tool and subagent spawning workflows.

**Key Decisions:**
- Use **Redis Streams** (not Sorted Sets) for the primary task queue to leverage consumer groups and at-least-once delivery guarantees
- Hybrid approach: Streams for queue semantics, Sorted Sets for scheduling/delays, Hashes for task state
- Stateless workers with heartbeat-based liveness detection
- Exponential backoff with jitter for retry logic

---

## 1. Redis Data Structures

### 1.1 Primary Queue: Redis Stream

```
Key:     ade:queue:tasks
Type:    Stream
Purpose: Main task ingestion and distribution
```

**Why Streams over Sorted Sets?**

| Feature | Sorted Sets | Redis Streams |
|---------|-------------|---------------|
| Ordering | Score-based (can have ties) | Strict temporal (millisecond ID) |
| Consumer Groups | Manual implementation | Built-in XREADGROUP |
| Delivery Semantics | At-most-once (easy) / At-least-once (complex) | At-least-once with ACK |
| Pending Tracking | Manual | Built-in XPENDING |
| Claim/Retry | Custom Lua scripts | Built-in XCLAIM/XAUTOCLAIM |
| Message Visibility | Immediate to all | Consumer-group isolated |

Streams provide the exact semantics needed for reliable task processing without custom Lua scripting.

**Stream Entries:**
```
XADD ade:queue:tasks * taskId <uuid> payload <json> priority <int>
```
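The consumer-group lifecycle implied by the table above looks roughly like this, using the key and group names from section 1.5 (the `>` cursor means "only new, never-delivered entries"; the 60s reclaim threshold is an illustrative value, not one this document fixes):

```redis
# One-time setup: create the consumer group (MKSTREAM creates the stream if absent)
XGROUP CREATE ade:queue:tasks ade-workers $ MKSTREAM

# Worker read loop: claim up to 1 new entry, blocking for 5s
XREADGROUP GROUP ade-workers {workerId} COUNT 1 BLOCK 5000 STREAMS ade:queue:tasks >

# On success: acknowledge so the entry leaves the pending entries list
XACK ade:queue:tasks ade-workers {entryId}

# Recovery: reclaim entries idle longer than 60s from crashed consumers
XAUTOCLAIM ade:queue:tasks ade-workers {workerId} 60000 0
```

An entry that is read but never XACKed stays in the group's pending list, which is exactly the at-least-once guarantee the key decisions rely on.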
### 1.2 Delayed Tasks: Sorted Set

```
Key:    ade:queue:delayed
Type:   Sorted Set (ZSET)
Score:  scheduled execution timestamp (ms)
Member: taskId
```

Used for:
- Tasks with explicit `runAfter` timestamps
- Retry scheduling with exponential backoff
- Rate-limited task release
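A scheduler periodically promotes due members of this sorted set into the stream (`ZRANGEBYSCORE ade:queue:delayed 0 {now}` followed by `XADD` and `ZREM`). The selection logic is modeled below over a plain dict standing in for the ZSET; the helper name is illustrative.

```python
def due_task_ids(delayed: dict[str, int], now_ms: int) -> list[str]:
    """Tasks whose scheduled timestamp (the ZSET score) has passed,
    returned in score order - equivalent to
    ZRANGEBYSCORE ade:queue:delayed 0 {now_ms}."""
    due = [(score, task_id) for task_id, score in delayed.items()
           if score <= now_ms]
    return [task_id for _, task_id in sorted(due)]
```

In production this select-then-move step should be a single Lua script or MULTI block so two scheduler instances cannot promote the same task twice.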
### 1.3 Task State Storage: Redis Hash
|
||||
|
||||
```
|
||||
Key: ade:task:{taskId}
|
||||
Type: Hash
|
||||
Fields:
|
||||
- id: string (UUID v4)
|
||||
- status: pending|running|completed|failed
|
||||
- payload: JSON (task arguments)
|
||||
- createdAt: timestamp (ms)
|
||||
- startedAt: timestamp (ms)
|
||||
- completedAt: timestamp (ms)
|
||||
- workerId: string (nullable)
|
||||
- attemptCount: integer
|
||||
- maxAttempts: integer (default: 3)
|
||||
- error: string (last error message)
|
||||
- result: JSON (completed task result)
|
||||
- parentTaskId: string (nullable, for task chains)
|
||||
- subagentId: string (link to subagent state)
|
||||
- priority: integer (0-9, default 5)
|
||||
- kind: message|task_notification|approval_result|overlay_action
|
||||
TTL: 7 days (configurable cleanup for completed/failed tasks)
|
||||
```
|
||||
|
||||
### 1.4 Worker Registry: Redis Hash + Sorted Set

```
Key: ade:workers:active
Type: Hash
Fields per worker:
- {workerId}: JSON { hostname, pid, startedAt, lastHeartbeat, version }

Key: ade:workers:heartbeat
Type: Sorted Set
Score: last heartbeat timestamp
Member: workerId
```

### 1.5 Consumer Group State

```
Stream Consumer Group: ade:queue:tasks
Group Name: ade-workers
Consumer Name: {workerId} (unique per process)
```

Redis Streams automatically track:
- Pending messages per consumer (XPENDING)
- Delivery count per message
- Idle time since last read

---

## 2. Task Entity Schema

### 2.1 TypeScript Interface

```typescript
// src/queue/redis/types.ts

export type TaskStatus =
  | "pending"    // Enqueued, not yet claimed
  | "running"    // Claimed by worker, processing
  | "completed"  // Successfully finished
  | "failed"     // Exhausted all retries
  | "cancelled"; // Explicitly cancelled

export type TaskKind =
  | "message"
  | "task_notification"
  | "approval_result"
  | "overlay_action";

export interface TaskPayload {
  // Task identification
  id: string; // UUID v4
  kind: TaskKind;

  // Execution context
  agentId?: string;
  conversationId?: string;
  clientMessageId?: string;

  // Content (varies by kind)
  content?: unknown; // For "message" kind
  text?: string;     // For notification/approval/overlay

  // Subagent execution params (for task_notification)
  subagentType?: string;
  prompt?: string;
  model?: string;
  existingAgentId?: string;
  existingConversationId?: string;
  maxTurns?: number;

  // Scheduling
  priority: number;  // 0-9, lower = higher priority
  runAfter?: number; // Timestamp ms (for delayed tasks)

  // Retry configuration
  maxAttempts: number;
  backoffMultiplier: number; // Default: 2
  maxBackoffMs: number;      // Default: 300000 (5 min)

  // Metadata
  enqueuedAt: number;
  source: "user" | "system" | "hook";
}

export interface TaskState extends TaskPayload {
  status: TaskStatus;
  workerId?: string;
  attemptCount: number;
  startedAt?: number;
  completedAt?: number;
  error?: string;
  result?: unknown;

  // Coalescing support (from QueueRuntime)
  isCoalescable: boolean;
  scopeKey?: string; // For grouping coalescable items
}
```

### 2.2 State Transitions

```
        ┌─────────────┐
        │   PENDING   │◄─────────────────┐
        │  (queued)   │                  │
        └──────┬──────┘                  │
               │ claim                   │ retry
               ▼                         │ (with delay)
        ┌─────────────┐                  │
        │   RUNNING   │──────────────────┘
        │  (claimed)  │  fail (retryable)
        └──┬───────┬──┘
  complete │       │ fail (final)
           ▼       ▼
  ┌─────────────┐ ┌─────────────┐
  │  COMPLETED  │ │   FAILED    │
  └─────────────┘ │ (exhausted) │
                  └─────────────┘
```

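The diagram can be encoded as an adjacency map so that workers and the orchestrator reject illegal transitions before writing to the task hash. A small sketch (including the `cancelled` state from §2.1 as a terminal state, which is an interpretation, since the diagram does not show cancellation edges):

```typescript
// Legal state transitions from the diagram above, plus explicit cancellation.
// "running" -> "pending" models the retry requeue path.
const TRANSITIONS: Record<string, readonly string[]> = {
  pending:   ["running", "cancelled"],
  running:   ["completed", "failed", "pending", "cancelled"],
  completed: [],
  failed:    [],
  cancelled: []
};

function canTransition(from: string, to: string): boolean {
  return (TRANSITIONS[from] ?? []).includes(to);
}
```

Guarding HSET status updates with `canTransition` turns silent state corruption (e.g., a late worker completing an already-cancelled task) into a loggable error.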
---

## 3. Worker Pool Registration and Heartbeat

### 3.1 Worker Lifecycle

```typescript
// src/queue/redis/worker.ts

class TaskWorker {
  private workerId!: string;
  private redis: RedisClient;
  private isRunning: boolean = false;
  private heartbeatInterval?: NodeJS.Timeout;
  private claimInterval?: NodeJS.Timeout;

  // Config
  private readonly HEARTBEAT_INTERVAL_MS = 5000;
  private readonly HEARTBEAT_TIMEOUT_MS = 30000;
  private readonly CLAIM_BATCH_SIZE = 10;
  private readonly PROCESSING_TIMEOUT_MS = 300000; // 5 min

  async start(): Promise<void> {
    this.workerId = generateWorkerId(); // {hostname}:{pid}:{uuid}

    // Register in worker registry
    await this.redis.hSet("ade:workers:active", this.workerId, JSON.stringify({
      hostname: os.hostname(),
      pid: process.pid,
      startedAt: Date.now(),
      lastHeartbeat: Date.now(),
      version: process.env.npm_package_version || "unknown"
    }));

    // Create consumer group on the stream (idempotent)
    try {
      await this.redis.xGroupCreate("ade:queue:tasks", "ade-workers", "$", {
        MKSTREAM: true
      });
    } catch (err) {
      // Group already exists - ignore
    }

    this.isRunning = true;
    this.startHeartbeat();
    this.startClaimLoop();
  }

  async stop(): Promise<void> {
    this.isRunning = false;
    clearInterval(this.heartbeatInterval);
    clearInterval(this.claimInterval);

    // Release pending tasks back to queue
    await this.releasePendingTasks();

    // Deregister
    await this.redis.hDel("ade:workers:active", this.workerId);
    await this.redis.zRem("ade:workers:heartbeat", this.workerId);
  }

  private startHeartbeat(): void {
    this.heartbeatInterval = setInterval(async () => {
      await this.redis.zAdd("ade:workers:heartbeat", {
        score: Date.now(),
        value: this.workerId
      });
      // Read-modify-write the registry entry so only lastHeartbeat changes
      const raw = await this.redis.hGet("ade:workers:active", this.workerId);
      const currentInfo = raw ? JSON.parse(raw) : {};
      await this.redis.hSet("ade:workers:active", this.workerId, JSON.stringify({
        ...currentInfo,
        lastHeartbeat: Date.now()
      }));
    }, this.HEARTBEAT_INTERVAL_MS);
  }
}
```

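`generateWorkerId` is referenced above but not defined in this document. One possible implementation matching the `{hostname}:{pid}:{uuid}` format noted in `start()`:

```typescript
import * as os from "os";
import { randomUUID } from "crypto";

// Produces a consumer name unique per process and per restart, in the
// "{hostname}:{pid}:{uuid}" format used by the worker registry above.
function generateWorkerId(): string {
  return `${os.hostname()}:${process.pid}:${randomUUID()}`;
}
```

Including a fresh UUID per restart matters: if a worker restarts and reuses its old consumer name, XPENDING entries from the crashed run would be attributed to the live process and never reclaimed by the orchestrator.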
### 3.2 Dead Worker Detection

```typescript
// src/queue/redis/orchestrator.ts (singleton, per-deployment)

class QueueOrchestrator {
  async detectAndReclaimDeadWorkerTasks(): Promise<number> {
    const now = Date.now();
    const cutoff = now - this.HEARTBEAT_TIMEOUT_MS;

    // Find dead workers
    const deadWorkers = await this.redis.zRangeByScore(
      "ade:workers:heartbeat",
      "-inf",
      cutoff
    );

    let reclaimedCount = 0;

    for (const workerId of deadWorkers) {
      // Find pending tasks for this worker using XPENDING
      const pending = await this.redis.xPendingRange(
        "ade:queue:tasks",
        "ade-workers",
        "-",
        "+",
        this.CLAIM_BATCH_SIZE
      );

      for (const item of pending) {
        if (item.consumer === workerId && item.idle > this.PROCESSING_TIMEOUT_MS) {
          // Use XAUTOCLAIM to atomically claim the entry for cleanup
          await this.redis.xAutoClaim(
            "ade:queue:tasks",
            "ade-workers",
            "orchestrator", // consumer name for cleanup
            this.PROCESSING_TIMEOUT_MS,
            item.id,
            { COUNT: 1 }
          );

          // Release back to pending by ACKing (removes from pending list)
          // The orchestrator will re-add to delayed queue for retry
          await this.redis.xAck("ade:queue:tasks", "ade-workers", item.id);
          await this.scheduleRetry(item.id);
          reclaimedCount++;
        }
      }

      // Clean up dead worker registration
      await this.redis.hDel("ade:workers:active", workerId);
      await this.redis.zRem("ade:workers:heartbeat", workerId);
    }

    return reclaimedCount;
  }
}
```

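The ZRANGEBYSCORE step above reduces to a pure predicate over `(workerId, lastHeartbeat)` pairs, which makes the cutoff logic easy to unit-test without a Redis server. A sketch, assuming an entry shape like node-redis's `{ value, score }` sorted-set reply:

```typescript
interface HeartbeatEntry {
  value: string; // workerId
  score: number; // last heartbeat timestamp (ms)
}

// Mirrors: ZRANGEBYSCORE ade:workers:heartbeat -inf (now - timeoutMs)
// Returns the IDs of workers whose heartbeat is older than the cutoff.
function findDeadWorkers(
  entries: HeartbeatEntry[],
  now: number,
  timeoutMs: number
): string[] {
  const cutoff = now - timeoutMs;
  return entries.filter(e => e.score <= cutoff).map(e => e.value);
}
```

Keeping the predicate separate from the Redis call lets tests pin down edge cases (heartbeat exactly at the cutoff, empty registry) that are awkward to reproduce against a live instance.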
---

## 4. Retry Logic with Exponential Backoff

### 4.1 Backoff Calculation

```typescript
// src/queue/redis/retry.ts

interface RetryConfig {
  attempt: number;      // 0-indexed (0 = first retry)
  baseDelayMs: number;  // Default: 1000
  multiplier: number;   // Default: 2
  maxDelayMs: number;   // Default: 300000 (5 min)
  jitterFactor: number; // Default: 0.1 (10% randomization)
}

function calculateRetryDelay(config: RetryConfig): number {
  // Exponential backoff: base * (multiplier ^ attempt)
  const exponentialDelay = config.baseDelayMs *
    Math.pow(config.multiplier, config.attempt);

  // Cap at max
  const cappedDelay = Math.min(exponentialDelay, config.maxDelayMs);

  // Add jitter to prevent thundering herd: ±jitterFactor
  const jitter = cappedDelay * config.jitterFactor * (Math.random() * 2 - 1);

  return Math.floor(cappedDelay + jitter);
}

// Examples with defaults:
// Attempt 0 (first retry): ~1000ms ±100ms
// Attempt 1: ~2000ms ±200ms
// Attempt 2: ~4000ms ±400ms
// Attempt 3: ~8000ms ±800ms
// Attempt 4: ~16000ms ±1600ms
// ...up to max 300000ms (5 min)
```

### 4.2 Retry Flow

```typescript
async function handleTaskFailure(
  taskId: string,
  error: Error,
  workerId: string
): Promise<void> {
  const taskKey = `ade:task:${taskId}`;
  const task = await redis.hGetAll(taskKey);

  const attemptCount = parseInt(task.attemptCount, 10) + 1;
  const maxAttempts = parseInt(task.maxAttempts, 10);

  if (attemptCount >= maxAttempts) {
    // Final failure - mark as failed
    await redis.hSet(taskKey, {
      status: "failed",
      error: error.message,
      completedAt: Date.now(),
      attemptCount: attemptCount.toString()
    });

    // Publish failure event for observers
    await redis.publish("ade:events:task-failed", JSON.stringify({
      taskId,
      error: error.message,
      totalAttempts: attemptCount
    }));

    // ACK to remove from pending
    // (XACK takes the stream entry ID; this assumes the entry ID is
    // stored alongside the task when claimed and is shown as taskId here)
    await redis.xAck("ade:queue:tasks", "ade-workers", taskId);
  } else {
    // Schedule retry
    const delay = calculateRetryDelay({
      attempt: attemptCount,
      baseDelayMs: 1000,
      multiplier: 2,
      maxDelayMs: 300000,
      jitterFactor: 0.1
    });

    const runAfter = Date.now() + delay;

    // Update task state
    await redis.hSet(taskKey, {
      status: "pending",
      attemptCount: attemptCount.toString(),
      error: error.message,
      workerId: "" // Clear worker assignment
    });

    // Add to delayed queue
    await redis.zAdd("ade:queue:delayed", {
      score: runAfter,
      value: taskId
    });

    // ACK to remove from stream pending
    await redis.xAck("ade:queue:tasks", "ade-workers", taskId);
  }
}
```

### 4.3 Delayed Task Promoter

```typescript
// Runs periodically (every 1 second) to move due tasks from delayed set to stream

async function promoteDelayedTasks(): Promise<number> {
  const now = Date.now();

  // Get due tasks (note: ZRANGEBYSCORE + ZREM below is not atomic;
  // wrap both in a Lua script or MULTI/EXEC if a second promoter
  // instance could run concurrently)
  const dueTasks = await redis.zRangeByScore(
    "ade:queue:delayed",
    "-inf",
    now,
    { LIMIT: { offset: 0, count: 100 } }
  );

  if (dueTasks.length === 0) return 0;

  // Remove from delayed queue
  await redis.zRem("ade:queue:delayed", dueTasks);

  // Re-add to stream for processing
  for (const taskId of dueTasks) {
    const task = await redis.hGetAll(`ade:task:${taskId}`);
    await redis.xAdd("ade:queue:tasks", "*", {
      taskId,
      payload: task.payload,
      priority: task.priority
    });
  }

  return dueTasks.length;
}
```

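The promoter's selection step is equivalent to a pure function over `(taskId, score)` pairs, which is useful for testing the cutoff and ordering semantics without Redis. A sketch, assuming node-redis-style `{ value, score }` entries:

```typescript
interface DelayedEntry {
  value: string; // taskId
  score: number; // scheduled execution timestamp (ms)
}

// Mirrors: ZRANGEBYSCORE ade:queue:delayed -inf <now> LIMIT 0 <limit>,
// which returns due entries in ascending score order.
function selectDueTasks(entries: DelayedEntry[], now: number, limit: number): string[] {
  return entries
    .filter(e => e.score <= now)
    .sort((a, b) => a.score - b.score)
    .slice(0, limit)
    .map(e => e.value);
}
```

The ascending-score order matters: it guarantees the most overdue retries are promoted first when a backlog builds up and the 100-entry limit kicks in.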
---

## 5. Integration with Existing Task.ts

### 5.1 Adapter Pattern

```typescript
// src/queue/redis/adapter.ts

import { QueueRuntime, QueueItem, DequeuedBatch } from "../queueRuntime";
import { RedisQueue } from "./queue";

/**
 * Redis-backed implementation of the QueueRuntime interface.
 * Allows drop-in replacement of the in-memory queue.
 */
export class RedisQueueAdapter implements QueueRuntime {
  private redisQueue: RedisQueue;
  private localBatchBuffer: Map<string, QueueItem> = new Map();

  constructor(redisUrl: string, options?: QueueRuntimeOptions) {
    this.redisQueue = new RedisQueue(redisUrl, {
      ...options,
      onTaskCompleted: this.handleTaskCompleted.bind(this),
      onTaskFailed: this.handleTaskFailed.bind(this)
    });
  }

  async enqueue(input: Omit<QueueItem, "id" | "enqueuedAt">): Promise<QueueItem | null> {
    // Map QueueItem to TaskPayload
    const taskId = generateUUID();
    const enqueuedAt = Date.now();

    const payload: TaskPayload = {
      id: taskId,
      kind: input.kind,
      agentId: input.agentId,
      conversationId: input.conversationId,
      clientMessageId: input.clientMessageId,
      text: (input as any).text,
      content: (input as any).content,
      priority: 5, // Default priority
      maxAttempts: 3,
      backoffMultiplier: 2,
      maxBackoffMs: 300000,
      enqueuedAt,
      source: "user",
      isCoalescable: isCoalescable(input.kind)
    };

    const success = await this.redisQueue.enqueue(payload);
    if (!success) return null;

    return {
      ...input,
      id: taskId,
      enqueuedAt
    } as QueueItem;
  }

  async tryDequeue(blockedReason: QueueBlockedReason | null): Promise<DequeuedBatch | null> {
    if (blockedReason !== null) {
      // Emit blocked event if needed (preserving QueueRuntime behavior)
      return null;
    }

    // Claim batch from Redis
    const batch = await this.redisQueue.claimBatch({
      consumerId: this.workerId,
      batchSize: this.getCoalescingBatchSize(),
      coalescingWindowMs: 50 // Small window for coalescing
    });

    if (!batch || batch.length === 0) return null;

    // Map back to QueueItem format
    const items: QueueItem[] = batch.map(task => this.mapTaskToQueueItem(task));

    return {
      batchId: generateBatchId(),
      items,
      mergedCount: items.length,
      queueLenAfter: await this.redisQueue.getQueueLength()
    };
  }

  // ... other QueueRuntime methods
}
```

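The `isCoalescable(input.kind)` helper used in the adapter is not defined in this document. One plausible mapping, consistent with §5.2's note that task notifications are not coalescable; treating plain messages as the only coalescable kind is an assumption:

```typescript
type TaskKind = "message" | "task_notification" | "approval_result" | "overlay_action";

// Assumption: only plain messages can be merged into a batch; the other
// kinds carry side effects (notifications, approvals, overlay actions)
// that must be delivered individually.
function isCoalescable(kind: TaskKind): boolean {
  return kind === "message";
}
```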
### 5.2 Task.ts Integration Points

**Current Flow (Task.ts line 403+):**
```typescript
// Background task spawning
const { taskId, outputFile, subagentId } = spawnBackgroundSubagentTask({
  subagentType: subagent_type,
  prompt,
  description,
  model,
  toolCallId,
  existingAgentId: args.agent_id,
  existingConversationId: args.conversation_id,
  maxTurns: args.max_turns,
});
```

**Proposed Redis Integration:**
```typescript
// New: Redis-backed task queue integration
interface TaskQueueEnqueueOptions {
  subagentType: string;
  prompt: string;
  description: string;
  model?: string;
  toolCallId?: string;
  existingAgentId?: string;
  existingConversationId?: string;
  maxTurns?: number;
  priority?: number;
  runInBackground?: boolean;
}

// In Task.ts - replace spawnBackgroundSubagentTask with:
export async function enqueueSubagentTask(
  args: TaskQueueEnqueueOptions,
  queue: RedisQueue
): Promise<TaskEnqueueResult> {
  const taskId = generateTaskId();
  const subagentId = generateSubagentId();

  // Register in subagent state store (for UI)
  registerSubagent(subagentId, args.subagentType, args.description, args.toolCallId, true);

  const outputFile = createBackgroundOutputFile(taskId);

  // Create task payload
  // (description/outputFile extend the §2.1 TaskPayload for subagent tasks)
  const payload: TaskPayload = {
    id: taskId,
    kind: "task_notification",
    subagentType: args.subagentType,
    prompt: args.prompt,
    description: args.description,
    model: args.model,
    existingAgentId: args.existingAgentId,
    existingConversationId: args.existingConversationId,
    maxTurns: args.maxTurns,
    subagentId,
    outputFile,
    priority: args.priority ?? 5,
    maxAttempts: 3,
    backoffMultiplier: 2,
    maxBackoffMs: 300000,
    enqueuedAt: Date.now(),
    source: "user",
    isCoalescable: false // Task notifications are not coalescable
  };

  // Enqueue to Redis
  await queue.enqueue(payload);

  return { taskId, outputFile, subagentId };
}
```

### 5.3 Worker Implementation for Subagents

```typescript
// src/queue/redis/subagent-worker.ts

class SubagentTaskWorker extends TaskWorker {
  protected async processTask(task: TaskState): Promise<void> {
    // Update subagent state to "running"
    updateSubagent(task.subagentId!, { status: "running" });

    try {
      // Execute subagent (existing manager.ts logic)
      const result = await spawnSubagent(
        task.subagentType!,
        task.prompt!,
        task.model,
        task.subagentId!,
        undefined, // signal - handled via task cancellation
        task.existingAgentId,
        task.existingConversationId,
        task.maxTurns
      );

      // Write transcript
      writeTaskTranscriptResult(task.outputFile!, result, "");

      // Complete subagent state
      completeSubagent(task.subagentId!, {
        success: result.success,
        error: result.error,
        totalTokens: result.totalTokens
      });

      // Send notification if not silent
      if (!task.silent) {
        const notification = formatTaskNotification({
          taskId: task.id,
          status: result.success ? "completed" : "failed",
          summary: `Agent "${task.description}" ${result.success ? "completed" : "failed"}`,
          result: result.success ? result.report : result.error,
          outputFile: task.outputFile!
        });

        // Add to message queue for parent agent
        addToMessageQueue({
          kind: "task_notification",
          text: notification
        });
      }

      // Mark task completed
      await this.completeTask(task.id, result);

    } catch (error) {
      const errorMessage = error instanceof Error ? error.message : String(error);

      // Update subagent state
      completeSubagent(task.subagentId!, { success: false, error: errorMessage });

      // Fail task (triggers retry logic)
      await this.failTask(task.id, new Error(errorMessage));
    }
  }
}
```

---

## 6. Operational Considerations

### 6.1 Redis Configuration

```yaml
# Recommended Redis config for task queue
maxmemory: 1gb
maxmemory-policy: volatile-ttl # Evict only keys with TTLs (completed/failed task hashes), never the stream or delayed queue

# Persistence (for durability)
appendonly: yes
appendfsync: everysec

# Stream trimming (prevent unbounded growth)
# Set via XTRIM or MAXLEN on XADD
```

### 6.2 Key Patterns and Cleanup

| Key Pattern | Type | TTL | Cleanup Strategy |
|-------------|------|-----|------------------|
| `ade:queue:tasks` | Stream | - | XTRIM by MAXLEN (keep 100k) |
| `ade:queue:delayed` | ZSET | - | Processed by promoter |
| `ade:task:{id}` | Hash | 7 days | Expire completed/failed |
| `ade:workers:active` | Hash | - | On worker deregistration |
| `ade:workers:heartbeat` | ZSET | - | On worker timeout |

### 6.3 Monitoring Metrics

```typescript
// Metrics to expose via Prometheus/StatsD
interface QueueMetrics {
  // Queue depth
  "ade_queue_pending_total": number;  // XPENDING count
  "ade_queue_delayed_total": number;  // ZCARD ade:queue:delayed
  "ade_queue_stream_length": number;  // XLEN ade:queue:tasks

  // Throughput
  "ade_tasks_enqueued_rate": number;  // XADD rate
  "ade_tasks_completed_rate": number; // Completion rate
  "ade_tasks_failed_rate": number;    // Failure rate

  // Worker health
  "ade_workers_active_total": number; // HLEN ade:workers:active
  "ade_workers_dead_total": number;   // Detected dead workers

  // Processing
  "ade_task_duration_ms": Histogram;  // Time from claim to complete
  "ade_task_wait_ms": Histogram;      // Time from enqueue to claim
  "ade_task_attempts": Histogram;     // Distribution of retry counts
}
```

### 6.4 Failure Modes

| Scenario | Handling |
|----------|----------|
| Redis unavailable | Tasks fail immediately; caller responsible for retry |
| Worker crash | Tasks reclaimed via heartbeat timeout (30s) |
| Poison message | Max retries (3), then moved to DLQ |
| Slow task | Processing timeout (5 min) triggers requeue |
| Duplicate task | Task ID (UUID) serves as an idempotency key; workers check task status before executing |

---

## 7. Migration Strategy

### Phase 1: Dual-Write (Week 1)
- Implement RedisQueueAdapter
- Write to both in-memory and Redis queues
- Read from in-memory only (Redis for validation)

### Phase 2: Shadow Mode (Week 2)
- Read from both queues
- Compare results, log discrepancies
- Fix any edge cases

### Phase 3: Cutover (Week 3)
- Switch reads to Redis
- Keep in-memory as fallback
- Monitor for 1 week

### Phase 4: Cleanup (Week 4)
- Remove in-memory queue code
- Full Redis dependency

|
||||
|
||||
## 8. Implementation Checklist
|
||||
|
||||
- [ ] Redis client configuration (ioredis or node-redis)
|
||||
- [ ] Task entity schema and serialization
|
||||
- [ ] Stream consumer group setup
|
||||
- [ ] Worker registration and heartbeat
|
||||
- [ ] Task claim and processing loop
|
||||
- [ ] Retry logic with exponential backoff
|
||||
- [ ] Delayed task promotion
|
||||
- [ ] Dead worker detection and reclamation
|
||||
- [ ] QueueRuntime adapter implementation
|
||||
- [ ] Task.ts integration
|
||||
- [ ] Subagent state synchronization
|
||||
- [ ] Metrics and monitoring
|
||||
- [ ] Error handling and DLQ
|
||||
- [ ] Tests (unit, integration, load)
|
||||
- [ ] Documentation
|
||||
|
||||
---

## 9. Appendix: Redis Commands Reference

| Operation | Command | Complexity |
|-----------|---------|------------|
| Enqueue task | `XADD` | O(1) |
| Claim tasks | `XREADGROUP` | O(N) N=count |
| Ack completion | `XACK` | O(1) |
| Get pending | `XPENDING` | O(1) |
| Claim pending | `XCLAIM` / `XAUTOCLAIM` | O(log N) |
| Delay task | `ZADD` delayed | O(log N) |
| Promote delayed | `ZRANGEBYSCORE` + `ZREM` + `XADD` | O(log N + M) |
| Register worker | `HSET` + `ZADD` | O(1) |
| Heartbeat | `ZADD` | O(log N) |
| Detect dead | `ZRANGEBYSCORE` | O(log N + M) |

257
docs/ade-research.md
Normal file
@@ -0,0 +1,257 @@

# Agent Development Environment (ADE) Research

**Date:** March 17, 2026
**Purpose:** Compare existing ADE solutions to inform Letta Community ADE development

---

## Executive Summary

The ADE category emerged in 2025 as agentic AI proved too complex for traditional IDE/CLI tooling. Three primary architectures exist:

1. **Letta ADE** - Memory-first, context window transparency, multi-model
2. **Intent (Augment)** - Spec-driven with coordinator/specialist/verifier pattern
3. **Warp Oz** - Terminal-native with cloud orchestration

Each approaches multi-agent orchestration differently, offering distinct tradeoffs for community implementation.

---

## 1. Letta ADE (Our Foundation)

### Core Philosophy
> "Designing great agents is all about designing great context windows"

Letta ADE makes the opaque world of context windows and agent reasoning **visible and manageable**.

### Key Features

| Feature | Implementation |
|---------|---------------|
| **State & Memory** | Stateful agents that learn from interactions vs stateless LLMs |
| **Context Management** | Editable memory blocks, tools, system prompts with character limits |
| **Memory Architecture** | Core Memory (in-context blocks) + Archival/Recall Memory (vector DB) |
| **Transparent Reasoning** | All agents must show their work - reasoning separated from user communication |
| **Tool Integration** | 7,000+ tools via Composio, custom Python tool editor |
| **Production Modes** | Simple/Interactive/Debug modes for different use cases |

### Architecture Highlights
- **Core Memory**: Editable in-context blocks (`core_memory_append`, `core_memory_replace`)
- **Archival Memory**: Vector database for free-form storage (`archival_memory_insert`, `archival_memory_search`)
- **Recall Memory**: Automatic conversation history tracking (`conversation_search`)
- **Context Pruning**: Recursive summarization + message pruning to manage window size

### Strengths
✅ Memory-first design (MemGPT heritage)
✅ Transparent reasoning by design
✅ Context window controls
✅ Real-time tool execution in ADE
✅ Production deployment ready

### Letta Code CLI Features
- Client-side tool execution (Bash, Read, Write execute locally)
- Streaming API with background mode for long operations
- Conversations API for parallel sessions with shared memory
- Subagent spawning via Task tool
- Memory-first coding with persistent context

---

## 2. Intent by Augment Code

### Core Philosophy
> "Spec-Driven Development puts the spec at the center of your workflow"

Intent uses **living specifications** that update as agents work, preventing the "outdated PRD" problem.

### Key Features

| Feature | Implementation |
|---------|---------------|
| **Spec-Driven** | Living spec as source of truth - updates as code changes |
| **Coordinator Pattern** | Coordinator → Specialists → Verifier pipeline |
| **Parallel Work** | Isolated git worktrees for concurrent agent execution |
| **Specialist Agents** | Investigate, Implement, Verify, Critique, Debug, Code Review |
| **BYOA** | Bring Your Own Agent (Claude Code, Codex, OpenCode supported) |
| **Context Engine** | Semantic dependency analysis across 400,000+ files |

### Architecture: Coordinator/Specialist/Verifier

```
Coordinator Agent
    ↓ analyzes codebase, drafts spec, generates tasks
Specialist Agents (parallel in isolated worktrees)
    ↓ execute scoped tasks
Verifier Agent
    ↓ validates against spec before merge
Changes Tab
    ↓ human review, merge/stage/create PR
```

### Specialist Roles
- **Investigate** - Explore codebase, assess feasibility
- **Implement** - Execute implementation plans
- **Verify** - Check implementations match specs
- **Critique** - Review specs for feasibility
- **Debug** - Analyze and fix issues
- **Code Review** - Automated reviews with severity

### Unique Features
- **Git Worktree Isolation**: Each agent runs in independent working directory
- **WARP.md**: Compatible with agents.md, claude.md for agent behavior
- **Context Engine**: Call-graph and dependency-chain understanding
- **Verifier Agent**: Catches misalignment before human review

### Compliance
- SOC 2 Type II (zero deviations, Coalfire audited)
- ISO/IEC 42001 (AI governance certification)
- Customer-Managed Encryption Keys (CMEK)
- Air-gapped deployment options

### Strengths
✅ Living specs prevent drift
✅ Verifier catches misalignment
✅ Enterprise compliance (dual certification)
✅ BYOA prevents lock-in
✅ Context Engine handles massive codebases

---

## 3. Warp Oz (Terminal-Native ADE)

### Core Philosophy
> "Break out of your shell" - Terminal as the primary surface for agentic development

Warp reimagines the terminal as an agent platform with **Oz orchestration**.

### Key Features

| Feature | Implementation |
|---------|---------------|
| **Full Terminal Use** | Agents can run interactive CLI apps (REPLs, debuggers, top) |
| **Cloud Agents** | Background agents on Warp infrastructure or self-hosted |
| **Local Agents** | Real-time interactive coding in Warp terminal |
| **Auto-Tracking** | Every agent produces link + audit trail |
| **Multi-Model** | Mixed-model approach with fallback chains |
| **Skills** | Reusable instructions (compatible with Claude Code, Codex) |

### Architecture: Oz Platform

**Local Mode:**
- Run directly in Warp app
- Real-time, interactive assistance
- Multi-step planning, debugging, fixing

**Cloud Mode:**
- Run on Warp infrastructure (or self-hosted)
- Scheduled agents (cron-like)
- Event triggers (Slack, GitHub, webhooks)
- Parallel execution across repos

### Oz Capabilities
- **Environments**: Docker containers + git repos + startup commands
- **Session Sharing**: Links to track and steer agents
- **Artifacts**: PRs, branches, plans automatically tracked
- **Skills**: Any Skill can become an agent automation
- **API/SDK/CLI**: Fully programmable agent stack

### Unique Features
- **Multi-Repo Changes**: One agent can work across repos
- **Computer Use**: Visual verification via screenshots
- **Agent Session Sharing**: Hop into any running agent
- **Cloud Mode**: Background automation with full visibility

### Performance Claims
- Terminal-Bench: #1 ranked (52% → 61.2%)
- SWE-bench Verified: 71%
- 60%+ merged PRs created by Oz
- 700K+ active developers

### Security
- SOC 2 Type 2 certified
- Contractual Zero Data Retention (ZDR) with Anthropic, OpenAI, Fireworks, Google
- Configurable permissions (Never/Always allow/Prompt/Let agent decide)
- Agent Profiles (Prod mode/YOLO mode)

### Strengths
✅ Full terminal control (unique in market)
✅ Cloud agent infrastructure
✅ Multi-repo changes
✅ Contractual ZDR across all providers
✅ Terminal-native workflow

---

## 4. Feature Comparison Matrix

| Feature | Letta ADE | Intent | Warp Oz |
|---------|-----------|--------|---------|
| **Orchestration Model** | Memory-driven | Coordinator/Specialist/Verifier | Local + Cloud agents |
| **Core Abstraction** | Context windows + Memory | Living specs + Git worktrees | Terminal + Environments |
| **Multi-Agent** | Subagents via Task | Parallel specialists | Cloud agent pool |
| **Isolation** | Memory blocks | Git worktrees | Docker environments |
| **Context Strategy** | Hierarchical memory | Semantic Context Engine | Codebase indexing + MCP |
| **Verification** | Tool return validation | Verifier agent | Human-in-the-loop |
| **BYOA** | Open source, BYOK | Claude/Codex/OpenCode | Multi-model, BYOK |
| **Compliance** | SOC 2 | SOC 2 + ISO 42001 | SOC 2 + ZDR |
| **Scale** | Terminal-Bench #1 | 400K+ files | 700K+ developers |
| **Unique** | Memory-first | Spec-driven | Terminal-native |

---

## 5. Community ADE Recommendations

Based on this research, here's what a **Letta Community ADE** should prioritize:

### Phase 1: Foundation (Letta Already Has)

- ✅ Memory-first architecture (Core/Archival/Recall)
- ✅ Context window transparency
- ✅ Subagent spawning (Task tool)
- ✅ Real-time tool execution
- ✅ Multi-model support

### Phase 2: Enhanced Orchestration (From Intent)

- **Git Worktree Isolation**: Execute subagents in isolated branches
- **Coordinator Pattern**: Formal coordinator/specialist/verifier roles
- **Approval Queue Enhancement**: Structured task delegation
- **Spec Tracking**: Document what was planned vs. executed

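The worktree-isolation idea above can be sketched with plain `git worktree` commands. A minimal Python wrapper follows; the function names, branch naming scheme, and directory layout are illustrative assumptions, not Letta's actual API:

```python
import subprocess
from pathlib import Path


def create_subagent_worktree(repo: Path, task_id: str) -> Path:
    """Create an isolated branch + worktree for one subagent task."""
    branch = f"subagent/{task_id}"          # assumed naming convention
    worktree = repo.parent / f"worktree-{task_id}"
    # `-b` creates the branch; the worktree gets its own checkout and index,
    # so parallel subagents never step on each other's working files
    subprocess.run(
        ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(worktree)],
        check=True, capture_output=True,
    )
    return worktree


def remove_subagent_worktree(repo: Path, worktree: Path) -> None:
    """Tear the worktree down once the branch is merged or discarded."""
    subprocess.run(
        ["git", "-C", str(repo), "worktree", "remove", "--force", str(worktree)],
        check=True, capture_output=True,
    )
```

The coordinator would merge (or delete) each `subagent/*` branch after verification, which is exactly the review point where the approval queue fits.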
### Phase 3: Scale Features (From Warp)

- **Cloud Agent Mode**: Background agents with session tracking
- **Multi-Repo Support**: Cross-repository changes
- **Skills System**: Reusable agent instructions
- **Session Sharing**: Links to share agent runs

### Phase 4: Advanced Features

- **Verification Layer**: Automated spec compliance checking
- **Context Engine**: Semantic dependency analysis
- **Scheduling**: Recurring agent tasks
- **Event Triggers**: React to GitHub/Slack events

---

## 6. Key Implementation Insights

### From Intent: Spec-Driven Works

The "living spec" concept prevents the most common agent failure mode: drift between intent and implementation. Letta's memory blocks could serve this purpose, with explicit "plan" and "execution" blocks.

### From Warp: The Terminal Is Underrated

Full terminal control lets agents use the same tools developers use (REPLs, debuggers, etc.). Letta Code's Bash tool already supports this, but it could be enhanced with terminal-session preservation.

### From Letta: Memory Is Differentiating

Neither Intent nor Warp has Letta's tiered memory architecture. This is a unique strength to build on: memory as the coordination layer, not just context.

---

## 7. Sources

1. [Letta ADE Blog](https://www.letta.com/blog/introducing-the-agent-development-environment)
2. [Letta ADE Docs](https://docs.letta.com/guides/ade/overview/)
3. [Intent by Augment](https://www.augmentcode.com/product/intent)
4. [Intent ADE Guide](https://www.augmentcode.com/guides/what-is-an-agentic-development-environment)
5. [Warp Oz Platform](https://www.warp.dev/oz)
6. [Warp Oz Launch](https://www.warp.dev/blog/oz-orchestration-platform-cloud-agents)

---

*Generated by Ani (Letta agent) - March 17, 2026*
97
docs/community-ade-project-state.md
Normal file
@@ -0,0 +1,97 @@
# Community ADE Project - State Management

**Project:** Letta Community Agentic Development Environment
**Orchestrator:** Ani (Annie Tunturi)
**Created:** March 18, 2026
**Status:** Phase 1 - Orchestration Layer

---

## Active Subagents

| Subagent | Type | Status | Assigned Task | Output Location |
|----------|------|--------|---------------|-----------------|
| explorer-1 | explore | PENDING | Codebase exploration - task queue patterns | /tmp/ade-explorer-1/ |
| architect-1 | feature-architect | PENDING | Design Redis queue integration | /tmp/ade-architect-1/ |
| researcher-1 | researcher | COMPLETED | ADE competitive analysis | docs/community-ade-research-synthesis-2026-03-18.md |

---

## Document Registry

### Research Documents
- [x] `community-ade-research-2026-03-18.md` - Initial research
- [x] `ade-phase1-orchestration-design.md` - Phase 1 technical design
- [x] `community-ade-research-synthesis-2026-03-18.md` - Web research synthesis
- [x] `ade-phase1-execution-plan.md` - 6-week execution plan

### Design Documents
- [x] `ade-redis-queue-design.md` - Redis queue architecture (COMPLETED by researcher-2)
- [ ] `ade-task-queue-spec.md` - Detailed task queue specification (IN PROGRESS)
- [ ] `ade-worker-pool-design.md` - Worker pool architecture (PENDING)
- [ ] `ade-dashboard-wireframes.md` - Dashboard UI design (PENDING)

### Implementation
- [ ] `ade-queue-prototype/` - In-memory prototype (NOT STARTED)
- [ ] `ade-redis-queue/` - Redis-backed implementation (NOT STARTED)
- [ ] `ade-worker-process/` - Worker daemon (NOT STARTED)

---

## Current Phase: Phase 1 - Orchestration Layer

### Goals
1. Build persistent task queue system
2. Implement worker pool for subagent execution
3. Add retry logic with exponential backoff
4. Integrate with existing Task tool

### Decisions Made
- Use Redis (not Celery) for direct control
- In-memory prototype first, then Redis
- Worker pool with heartbeat monitoring
- Defer Temporal to Phase 2 evaluation

### Open Questions
- Should we use Redis Streams or Sorted Sets?
- Worker count: fixed or dynamic?
- Task priority levels: simple (high/normal) or granular?

---

## Subagent Work Queue

### Ready to Assign
1. **Explore task queue patterns in codebase**
   - Type: explore
   - Focus: Find existing queue/spawning code
   - Output: File locations and patterns

2. **Design Redis queue architecture**
   - Type: architect
   - Focus: Data models, operations, integration points
   - Output: Architecture spec document

3. **Research Playwright Computer Use**
   - Type: researcher
   - Focus: Browser automation for agentic coding
   - Output: Integration approach

### Blocked
- None currently

### Completed
- [x] ADE competitive analysis (researcher-1)

---

## State Updates Log

**2026-03-18 09:23 EDT** - Project initiated, research documents created
**2026-03-18 10:01 EDT** - Attempting to spawn parallel subagents
**2026-03-18 02:03 EDT** - explorer-1 completed: Found Task.ts (line 403), manager.ts (spawnSubagent at line 883), in-memory QueueRuntime class. No Redis currently exists.
**2026-03-18 02:07 EDT** - researcher-2 completed: Redis queue architecture design. Key decisions: Redis Streams (consumer groups), Hash per task, 5s worker heartbeat, exponential backoff with jitter, adapter pattern integration.

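The "exponential backoff with jitter" that researcher-2 settled on fits in one function. This is the "full jitter" variant; the base and cap defaults are assumptions, not values from the design doc:

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full-jitter exponential backoff.

    Returns a delay drawn uniformly from [0, min(cap, base * 2**attempt)],
    so retries spread out instead of stampeding the worker pool in lockstep.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A worker would call `time.sleep(backoff_delay(n))` before the n-th retry of a failed task, with `cap` keeping the worst-case wait bounded.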
---

*This file is maintained by Ani. Update when subagents report progress.*
245
docs/community-ade-research-2026-03-18.md
Normal file
@@ -0,0 +1,245 @@
# Community ADE Research - Letta vs Commercial Alternatives

**Date:** March 18, 2026
**Researcher:** Ani (researcher subagent)
**Goal:** Analyze Letta ADE capabilities and gaps vs Intent, Warp, and other commercial alternatives

---

## Executive Summary

Letta has a **solid foundation** for an ADE (Agentic Development Environment) with best-in-class memory management and subagent orchestration. The gaps are primarily in **orchestration polish, web UI, and team collaboration features** rather than core agent capabilities.

**Estimated effort to full community ADE:** 6-9 months for a small team

---

## 1. Current Letta ADE Capabilities

### ✅ Subagent Orchestration (MATURE)
- Built-in approval queues with ACCEPT/REJECT flow
- Multi-agent fan-out (parallel subagent spawning)
- Result aggregation and error handling
- Agent lifecycle management (create/deploy/destroy)
- Conversation threading for complex workflows

### ✅ Git Integration (UNIQUE STRENGTH)
- MemFS with git versioning
- Worktree isolation for parallel execution
- Automatic checkpointing
- Branch/merge semantics for agent memory
- Diff-based memory updates

### ✅ CLI Experience (MATURE)
- Full headless mode (`--headless`)
- JSON streaming output (`--output-format stream-json`)
- System prompt customization (`--system-custom`)
- Tool filtering (`--tools`, `--allowedTools`)
- Session persistence

### ✅ Memory System (BEST-IN-CLASS)
- Hierarchical memory blocks
- Resident/On-Demand archival tiers
- Clear-immune sacred memory
- Memory consolidation (Aster background agent)
- Git-backed durability

### ✅ Skills System (SOLID)
- Portable, reusable tool packages
- MCP (Model Context Protocol) integration
- Custom skill development
- Tool registry and discovery

### ✅ Model Agnostic
- Works with any OpenAI-compatible API
- Synthetic API integration
- Local Ollama support
- Multi-provider fallback

---

## 2. Gaps vs Commercial ADEs

| Feature | Letta Status | Intent | Warp | Notes |
|---------|-------------|--------|------|-------|
| **Web Dashboard** | ❌ Missing | ✅ Full UI | ✅ Full UI | CLI-only currently |
| **Triggers/Schedules** | ❌ Missing | ✅ GitHub webhooks | ✅ Built-in | No automated triggers |
| **GitHub Integration** | ⚠️ Manual | ✅ Native PR reviews | ✅ PR comments | No native GitHub app |
| **Persistent Task Queue** | ⚠️ In-process | ✅ Temporal/Cadence | ✅ Durable | Tasks lost on restart |
| **Spec-Driven Dev** | ⚠️ Basic | ✅ Full PRD support | ✅ Constitution | No formal spec layer |
| **Team Collaboration** | ❌ Missing | ✅ Shared workspaces | ✅ Org features | Single-user focus |
| **Observability** | ⚠️ Logs only | ✅ Full traces | ✅ Metrics | No dashboard metrics |
| **RBAC/Permissions** | ⚠️ Tool-level only | ✅ Role-based | ✅ Enterprise auth | No user roles |

---

## 3. Community ADE Implementation Roadmap

### Phase 1: Orchestration Layer (6-8 weeks)
**Goal:** Persistent, durable task execution

**Components:**
- Task queue (Redis/RabbitMQ)
- Durable execution (Temporal.io or Cadence)
- Retry logic with exponential backoff
- Task state persistence
- Failure recovery

**Key Decisions:**
- Use existing Letta subagent system
- Add queue abstraction layer
- Maintain CLI compatibility

### Phase 2: Web Dashboard (4-6 weeks)
**Goal:** Real-time visibility and control

**Features:**
- Agent grid view (running/pending/failed)
- Real-time logs (WebSocket streaming)
- Approval queue UI
- Memory browser
- Task history

**Tech Stack:**
- React + Vite
- FastAPI backend
- WebSocket for real-time updates
- PostgreSQL for metadata

### Phase 3: Integration Ecosystem (3-4 weeks)
**Goal:** Connect to external tools

**Integrations:**
- GitHub App (PR reviews, issue comments)
- Slack/Discord notifications
- Webhook triggers
- CI/CD pipeline hooks
- Linear/Jira ticket creation

### Phase 4: Specification Layer (8-10 weeks)
**Goal:** PRD-driven development

**Features:**
- Executable specifications (Zod schemas)
- Constitution enforcement
- Architectural guardrails
- Test-first enforcement
- Spec versioning

**Example Workflow:**
```yaml
specification:
  prd: "./docs/prd.md"
  constitution:
    - "library-first: prefer existing packages"
    - "test-first: all code must have tests"
    - "simplicity: minimize abstraction layers"

execution:
  generate_plan: true
  auto_implement: false
  review_checkpoints: true
```

### Phase 5: Team Collaboration (4-6 weeks)
**Goal:** Multi-user, organization support

**Features:**
- Shared agent registry
- Organization memory
- Agent permissions/roles
- Session sharing
- Comment/annotation system

---

## 4. Technical Architecture Recommendations

### Recommended Stack

| Layer | Technology | Rationale |
|-------|------------|-----------|
| **Orchestration** | Temporal.io | Durable execution, retries, observability |
| **Queue** | Redis | Reliable, fast, pub/sub support |
| **API** | FastAPI | Async native, easy WebSockets |
| **Dashboard** | React + Vite | Modern, good DX, fast builds |
| **Database** | PostgreSQL | ACID, JSON support, mature |
| **Memory** | Git + Letta MemFS | Existing infrastructure |
| **Auth** | OAuth 2.0 / SSO | Enterprise compatibility |

### Integration Points

```typescript
// Letta API Integration
interface LettaCloudConfig {
  lettaBaseUrl: string;
  apiKey: string;

  // Cloud features
  triggers: WebhookConfig[];
  schedules: ScheduleConfig[];

  // Team features
  organizationId: string;
  sharedAgents: string[];
}

// Specification Layer
interface SpecificationConfig {
  prdPath: string;
  constitution: string[];
  guardrails: GuardrailRule[];
}
```

---

## 5. Letta's Unique Advantages

1. **Stateful by Design**: Unlike stateless alternatives, Letta's MemFS provides true persistence
2. **Git-Native**: Version control for agent memory is unique to Letta
3. **Model Agnostic**: Not locked into single provider
4. **Open Source**: Full transparency, community extensible
5. **CLI-First**: Developers love the terminal experience

---

## 6. Critical Path to MVP

**Minimum Viable Community ADE:**

1. ✅ **Already have**: Subagent orchestration, memory system, CLI
2. 🔄 **Need soon**: Persistent task queue, basic dashboard
3. 📋 **Next phase**: GitHub integration, triggers
4. 📋 **Future**: Full spec layer, team features

**Priorities:**
1. Fix Task tool reliability (persistent queue)
2. Build minimal dashboard (agent status, approvals)
3. Add GitHub webhook support
4. Implement basic spec validation

---

## 7. Conclusion

### Letta's Position
- **Best memory system** in the open-source ADE space
- **Solid foundation** for enterprise-grade ADE
- **Unique git-native approach** provides durability others lack
- **Gaps are solvable** with focused engineering effort

### Recommendation
**Build the community ADE on Letta**. The core architecture is superior to many commercial alternatives. The missing pieces (dashboard, queue durability, integrations) are well-understood engineering problems.

**Start with:**
1. Persistent task queue (fixes current Task reliability issues)
2. Minimal web dashboard (real-time visibility)
3. GitHub webhook integration (proves external integration model)

**Total effort to MVP:** 3-4 months focused work
**Total effort to full ADE:** 6-9 months

---

*Research conducted by Ani (researcher subagent) on March 18, 2026*
601
docs/community-ade-research-synthesis-2026-03-18.md
Normal file
@@ -0,0 +1,601 @@
# Community ADE Research Synthesis

**Date:** March 18, 2026
**Researcher:** Research Agent
**Goal:** Synthesize web research findings with existing documents and recommend technical next steps

---

## Executive Summary

Based on web research and analysis of existing documents, the Letta community ADE opportunity is **stronger than initially assessed**. The research confirms:

1. **Letta's core architecture is genuinely differentiated** - No open-source competitor combines stateful agents, hierarchical memory, and git-native persistence
2. **Commercial ADEs (Warp, Intent) validate the market** but leave gaps Letta can fill
3. **Technical patterns are well-established** - Redis queues, Temporal workflows, FastAPI + React dashboards are proven patterns
4. **Community momentum exists** - Multiple open-source tools (Aider, Cline, Agno) show appetite for agentic development tools

**Updated recommendation:** The 6-9 month estimate is realistic. The critical path is **orchestration layer + dashboard MVP**.

---

## 1. Letta Position Analysis (Updated)

### Current Strengths Confirmed

From [docs.letta.com](https://docs.letta.com) and [github.com/letta-ai/letta](https://github.com/letta-ai/letta):

| Feature | Letta | Competitive Landscape |
|---------|-------|----------------------|
| **Stateful Agents** | ✅ Core design | ❌ Most tools (Cline, Aider) are stateless |
| **Memory Blocks** | ✅ Hierarchical, tiered | ❌ Simple conversation history only |
| **Git-Native** | ✅ MemFS with versioning | ⚠️ Aider has git commits, no MemFS |
| **Model Agnostic** | ✅ Any OpenAI-compatible | ⚠️ Many lock to specific providers |
| **Subagents** | ✅ Built-in orchestration | ⚠️ Limited or external |
| **Skills/MCP** | ✅ Portable tool packages | ✅ Emerging standard |

### Key Differentiator: Stateful + Memory

Letta's **memory-first architecture** is unique in the open-source space:
- **Agno** ([docs.agno.com](https://docs.agno.com)): Agent framework, but no persistent memory architecture
- **Cline** ([github.com/cline/cline](https://github.com/cline/cline)): VS Code extension, session-only state
- **Aider** ([github.com/Aider-AI/aider](https://github.com/Aider-AI/aider)): Git-aware but no agent memory

This positions Letta as the **only open-source option** for long-running, learning agents that persist knowledge across sessions.

---

## 2. Commercial ADE Analysis

### Intent (intent.dev)

**Core Value Proposition:** Specification-Driven Development (SDD)

From [intent.dev](https://intent.dev):
- **The Specification as the Asset**: PRD is source of truth
- **15-Minute Workflows**: Automated documentation generation
- **Architectural Guardrails**: "Development Constitution" for enforcing principles
- **Total Pivotability**: Change spec → regenerate implementation

**Key Insight for Letta:**
Intent focuses on the *specification layer* - the "what" before the "how". This is Phase 4 in our roadmap. Intent's approach validates that **executable specifications are valuable**, but they don't provide the underlying agent infrastructure. Letta could integrate Intent-style spec validation without building the spec layer from scratch.

**Gap Letta Can Fill:**
- Intent appears to be a platform/service; Letta is open-source and self-hostable
- Intent doesn't mention stateful/memory-enabled agents
- No evidence of subagent orchestration

### Warp (warp.dev)

**Core Value Proposition:** Terminal-native ADE with "Oz" agent

From [warp.dev](https://warp.dev):
- **Multi-model by default**: OpenAI, Anthropic, Google models
- **Full Terminal Use**: Interactive terminal commands
- **Computer Use**: Browser automation for verification
- **MCP Support**: Linear, Figma, Slack, Sentry integration
- **WARP.md**: Configuration files compatible with agents.md, claude.md

**Key Features:**
```
- IDE + CLI hybrid experience
- Agent code review interface (line-level comments)
- Universal Input: @file, image upload, URL attachment
- Snapshot/restore for workspace exploration
- Enterprise: SSO, audit trails, VPC support
```

**Key Insight for Letta:**
Warp validates the **terminal-native workflow** but extends it with:
1. **Rich prompt context** (@mentions, images, URLs)
2. **Code review UI** in terminal (Letta could add approval queue UI)
3. **MCP ecosystem** (Letta already supports skills, should expand MCP)
4. **Workspace snapshots** (Letta's git worktrees provide similar isolation)

**Critical Gap Warp Leaves:**
- No mention of durable execution or task persistence
- Appears to be single-session focused
- No subagent orchestration (single agent model)
- No persistent memory across sessions

### Antigravity

**Status:** No functional website found. May be early-stage or rebranded.

---

## 3. Open-Source Community Landscape

### Aider (Most Direct Comparison)

From [github.com/Aider-AI/aider](https://github.com/Aider-AI/aider):

**Strengths:**
- Multi-file editing with diff view
- Automatic git commits with sensible messages
- Repository mapping for large codebases
- Voice support, image input
- Works with almost any LLM (including local)
- Lint/test integration (auto-fix)
- Strong community (28k+ GitHub stars)

**Architecture:**
- Git-repo centered (not stateful agents)
- Edits files directly (no MemFS)
- No subagent orchestration
- Session-based (no persistent memory)

**Lessons for Letta:**
- Aider's git integration patterns are excellent (auto-commit, sensible messages)
- Codebase mapping is crucial for large projects
- Diff-based editing is user-friendly
- Multi-model support is table stakes

### Cline (VS Code Extension)

From [github.com/cline/cline](https://github.com/cline/cline):

**Strengths:**
- Human-in-the-loop GUI (approve every change)
- VS Code native integration
- Computer Use (browser automation)
- MCP support for custom tools
- Workspace snapshots and restore
- Token/cost tracking
- Terminal integration in editor

**Architecture:**
- Extension-based (VS Code only)
- Session-based state
- No subagent orchestration
- No persistent memory

**Lessons for Letta:**
- Human-in-the-loop approval is important for trust
- Workspace snapshots enable safe exploration
- Cost transparency (token tracking) is valuable
- Browser automation (Computer Use) is becoming standard

### Agno

From [docs.agno.com](https://docs.agno.com):

**Positioning:** "AgentOS" - platform for building agents

**Observations:**
- Focus on production deployment patterns
- Less mature than Letta in documentation
- No clear differentiation on memory/state

---

## 4. Technical Pattern Research

### 4.1 Task Queues: Redis + Python

From [redis.io/docs](https://redis.io/docs/latest/develop/clients/redis-py/):

**Redis-py patterns for Letta:**
```python
import redis

# Connection with auto-decode (no manual byte handling)
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Illustrative values for a single task
task_id = 'task-123'
prompt = 'Explore task queue patterns'
subagent_type = 'explore'
timestamp = '2026-03-18T09:23:00Z'
priority = 1

# Hash for task storage (matches our design)
r.hset(f'task:{task_id}', mapping={
    'status': 'pending',
    'prompt': prompt,
    'subagent_type': subagent_type,
    'created_at': timestamp,
})

# Sorted Set for priority queue
r.zadd('queue:pending', {task_id: priority})

# Atomic dequeue: ZPOPMIN pops the lowest-score (most urgent) entry in one
# command; fall back to WATCH/MULTI/EXEC when a dequeue must touch multiple keys
next_task = r.zpopmin('queue:pending')  # [] when the queue is empty
```

**Key Insights:**
- `decode_responses=True` eliminates manual byte decoding
- `hset` with `mapping=` is clean for task storage
- Redis transactions (MULTI/EXEC) provide atomic queue operations
- Consider `redis[hiredis]` for performance

**Alternative: Celery**

From [docs.celeryq.dev](https://docs.celeryq.dev):

Celery provides:
- Distributed task processing
- Real-time + scheduling
- Worker management
- Built-in retry logic

**Recommendation:** For Phase 1, use **raw Redis** (not Celery):
- Celery adds an abstraction layer that may conflict with Letta's specific needs
- Our task queue has unique requirements (subagent spawning, git worktrees)
- Raw Redis gives more control over the data model
- Can migrate to Celery later if needed

### 4.2 Durable Execution: Temporal

From [temporal.io](https://temporal.io):

**Core Value:** "Write code as if failure doesn't exist"

**Key Concepts:**
- **Workflows**: Durable, fault-tolerant business logic
- **Activities**: Retryable, failure-prone operations
- **State persistence**: Automatic checkpointing
- **Signals**: External events to running workflows

**Temporal for Letta ADE:**
```python
from datetime import timedelta

from temporalio import workflow
from temporalio.common import RetryPolicy


# Potential Workflow structure
@workflow.defn
class SubagentWorkflow:
    @workflow.run
    async def run(self, task_id: str):
        # Activity: Spawn subagent (may fail)
        result = await workflow.execute_activity(
            spawn_subagent,
            task_id,
            start_to_close_timeout=timedelta(minutes=5),
            retry_policy=RetryPolicy(maximum_attempts=3),
        )

        # Activity: Wait for completion (long-running)
        final_result = await workflow.execute_activity(
            poll_subagent_completion,
            result.agent_id,
            start_to_close_timeout=timedelta(hours=1),
        )

        return final_result
```

**Decision Matrix:**

| Approach | Complexity | Durability | Letta Fit |
|----------|-----------|------------|-----------|
| Raw Redis + Workers | Low | Medium | ✅ Good fit |
| Temporal | Medium | High | ⚠️ Overkill? |
| Celery | Low | Medium | ⚠️ Too abstract |

**Recommendation:**
- **Phase 1**: Use Redis + custom workers (existing design)
- **Phase 2**: Evaluate Temporal for complex multi-step workflows
- Temporal shines for **long-running, multi-step** workflows with human-in-the-loop
- Letta's subagent tasks are relatively simple (spawn → run → complete)

### 4.3 Web Dashboard: React + FastAPI Patterns

From [ui.shadcn.com](https://ui.shadcn.com):

**Shadcn/ui** provides:
- Unstyled, accessible components
- Tailwind CSS integration
- Customizable design system
- Modern React patterns

**Recommended Dashboard Stack:**

```
Frontend:
- React + Vite (existing plan)
- shadcn/ui for components
- TanStack Query for data fetching
- WebSocket client for real-time updates

Backend:
- FastAPI (async native)
- WebSocket support built-in
- Redis for pub/sub
- PostgreSQL for metadata
```

**Dashboard Features (Prioritized):**

| Priority | Feature | Tech |
|----------|---------|------|
| P0 | Agent grid view | React + TanStack Query |
| P0 | Real-time logs | WebSocket |
| P1 | Approval queue UI | shadcn Dialog + Table |
| P1 | Task history | TanStack Query + Pagination |
| P2 | Memory browser | Tree view component |
| P2 | Metrics dashboard | Recharts or Tremor |

---

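Underneath the P0 real-time log feed is a plain pub/sub fan-out. A stdlib-only sketch of the broadcast hub a WebSocket endpoint could wrap follows; the class name and the `demo` coroutine are illustrative, and no FastAPI wiring is shown:

```python
import asyncio


class LogHub:
    """Fan-out hub: workers publish log lines, each connected dashboard
    client drains its own queue (backed by a WebSocket in production)."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    def publish(self, line: str) -> None:
        # put_nowait keeps workers from blocking on slow clients
        for q in self._subscribers:
            q.put_nowait(line)


async def demo() -> list[str]:
    hub = LogHub()
    client = hub.subscribe()   # a WebSocket handler would loop on client.get()
    hub.publish("task-123: spawned")
    hub.publish("task-123: completed")
    return [await client.get() for _ in range(2)]
```

In the FastAPI backend, each WebSocket connection would `subscribe()` on connect, forward `await queue.get()` results to the socket, and `unsubscribe()` on disconnect; Redis pub/sub replaces the in-process set when the API scales past one process.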
## 5. Updated Gap Analysis

### Revised Comparison Table

| Feature | Letta | Intent | Warp | Aider | Cline |
|---------|-------|--------|------|-------|-------|
| **Web Dashboard** | ❌ | ✅ | ✅ | ❌ | ❌ |
| **Persistent Tasks** | ⚠️ | ✅ | ⚠️ | ❌ | ❌ |
| **Stateful Agents** | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| **Subagent Orchestration** | ✅ | ❌ | ❌ | ❌ | ❌ |
| **Git-Native Memory** | ✅ | ❌ | ❌ | ⚠️ | ❌ |
| **MCP/Skills** | ✅ | ❌ | ✅ | ❌ | ✅ |
| **Approval Queues** | ✅ CLI | ✅ | ✅ | ❌ | ✅ |
| **Multi-Model** | ✅ | ? | ✅ | ✅ | ✅ |
| **Computer Use** | ❌ | ? | ✅ | ❌ | ✅ |
| **Spec-Driven Dev** | ⚠️ | ✅ | ❌ | ❌ | ❌ |

**Key Insights:**
1. **Letta leads in agent infrastructure** (state, memory, orchestration)
2. **Commercial tools lead in UX** (dashboards, IDE integration)
3. **No competitor has Letta's core combination** (stateful + memory + git + subagents)
4. **Computer Use** is becoming standard (Warp, Cline have it)

---

## 6. Critical Gaps Identified

### High Priority (MVP Blockers)

1. **Computer Use / Browser Automation**
   - Warp and Cline both offer browser automation
   - Letta agents should be able to launch browsers, click, screenshot
   - Enables web testing, visual verification

2. **Rich Context in Prompts**
   - Warp's "Universal Input" (@file, images, URLs)
   - Currently requires manual file reading
   - Should support: `@path/to/file`, `@https://example.com`, drag-drop images

3. **Workspace Snapshots**
   - Cline's snapshot/restore feature
   - Letta's git worktrees provide the foundation
   - Need UI for "save point" and "restore to point"

### Medium Priority (Competitive Parity)

4. **Cost Tracking**
   - Cline shows token usage and cost per request
   - Important for agent transparency

5. **Voice Input Flow**
   - Warp integrates with Wispr
   - Nice-to-have, not MVP critical

6. **MCP Ecosystem Expansion**
   - Expand skills to full MCP server support
   - Community MCP marketplace

---

## 7. Revised Implementation Recommendations

### Phase 1: Orchestration Layer (Refined)

**Duration:** 6 weeks (was 8)

**Changes from original design:**
1. **Week 1-2**: In-memory prototype (unchanged)
2. **Week 3-4**: Redis integration + worker pool (unchanged)
3. **Week 5**: API endpoints (REST + WebSocket)
4. **Week 6**: Task tool integration + testing

**Additions:**
- WebSocket endpoint for real-time task updates
- Cost tracking (token counting)
- Rich context parsing (@file, URLs)

**Deferred:**
- Temporal integration (evaluate in Phase 2)
- Advanced retry policies (basic exponential backoff sufficient)

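The rich-context parsing listed under Additions can start as a small tokenizer. The sketch below assumes a `@path` / `@url` mention syntax modeled on Warp's Universal Input; the actual grammar is undecided and the regex is an assumption:

```python
import re

# Matches @https://… URL tokens and @path/to/file tokens in a prompt
MENTION_RE = re.compile(r"@(https?://\S+|[\w./-]+)")


def parse_mentions(prompt: str) -> tuple[list[str], list[str]]:
    """Split @mentions out of a prompt into (file paths, URLs).

    The caller decides what to inline: read the files, fetch the URLs,
    and attach both as context before the prompt reaches the agent.
    """
    files: list[str] = []
    urls: list[str] = []
    for token in MENTION_RE.findall(prompt):
        (urls if token.startswith("http") else files).append(token)
    return files, urls
```

Drag-dropped images would arrive through a separate multipart channel in the dashboard rather than through this text-level parse.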
### Phase 2: Dashboard MVP (Refined)

**Duration:** 4 weeks

**Week 1**: Project setup + Agent grid view
- Vite + React + shadcn/ui setup
- TanStack Query integration
- Basic agent status display

**Week 2**: Real-time features
- WebSocket connection
- Live log streaming
- Status updates
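The real-time path can be prototyped as an in-memory fan-out hub; in the proposed stack each subscriber queue would be drained by a FastAPI WebSocket handler, with Redis pub/sub bridging events across processes. All names here are assumptions:

```python
import asyncio

class UpdateHub:
    """In-memory fan-out of task status events (single-process stand-in
    for Redis pub/sub feeding FastAPI WebSocket connections)."""
    def __init__(self) -> None:
        self._subscribers: set = set()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    async def publish(self, event: dict) -> None:
        # Every connected dashboard client gets its own copy of the event.
        for q in self._subscribers:
            await q.put(event)

async def demo() -> dict:
    hub = UpdateHub()
    sub = hub.subscribe()
    await hub.publish({"task_id": 1, "status": "running"})
    return await sub.get()
```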
**Week 3**: Approval queue UI
- Pending approvals list
- Accept/Reject buttons
- Comment/feedback input
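Behind the Accept/Reject buttons sits a small state machine the REST API would call into. A sketch (the `Approval` fields and statuses are assumptions):

```python
from dataclasses import dataclass

@dataclass
class Approval:
    id: int
    action: str
    status: str = "pending"
    comment: str = ""

class ApprovalQueue:
    """Backing store for the dashboard's approval list: pending items
    transition exactly once to approved or rejected."""
    def __init__(self) -> None:
        self._items: dict = {}
        self._next_id = 1

    def request(self, action: str) -> Approval:
        item = Approval(self._next_id, action)
        self._items[item.id] = item
        self._next_id += 1
        return item

    def pending(self) -> list:
        return [a for a in self._items.values() if a.status == "pending"]

    def resolve(self, approval_id: int, accept: bool, comment: str = "") -> Approval:
        item = self._items[approval_id]
        if item.status != "pending":
            raise ValueError(f"approval {approval_id} already {item.status}")
        item.status = "approved" if accept else "rejected"
        item.comment = comment
        return item
```

The one-shot transition guard is what keeps a double-clicked Accept button from resolving the same approval twice.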
**Week 4**: Task history + polish
- Task list with filtering
- Detail view
- Error display

**Additions based on research:**
- Cost display (per task, total)
- Workspace snapshot indicator
- @mention support in prompts

### Phase 3: Integration Ecosystem (Unchanged)

**Priority order:**
1. GitHub App (highest priority; matches Aider/Warp)
2. Slack notifications
3. Linear/Jira (MCP-based)
4. Webhook triggers
### Phase 4: Computer Use (NEW PHASE)

**Duration:** 4 weeks

**Rationale:** Computer Use is becoming table stakes (Warp and Cline both have it)

**Scope:**
- Browser automation (Playwright integration)
- Screenshot capture
- Click/type/scroll actions
- Visual verification workflows

**Integration:**
- New skill: `computer_use`
- Subagents can launch a browser
- Screenshots stored in MemFS
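A minimal sketch of the `computer_use` capture path, assuming Playwright's sync API and a hypothetical MemFS root path (`/workspace/.memfs` is an illustrative location, not a confirmed layout):

```python
import re
import time
from pathlib import Path

def screenshot_path(memfs_root: str, url: str) -> Path:
    """Deterministic MemFS location for a page screenshot."""
    slug = re.sub(r"[^a-z0-9]+", "-", url.lower()).strip("-")
    return Path(memfs_root) / "screenshots" / f"{slug}-{int(time.time())}.png"

def capture(url: str, memfs_root: str = "/workspace/.memfs") -> Path:
    """Launch a headless browser, navigate, and store a screenshot in MemFS.
    Requires `pip install playwright && playwright install chromium`."""
    from playwright.sync_api import sync_playwright  # imported lazily
    out = screenshot_path(memfs_root, url)
    out.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=str(out), full_page=True)
        browser.close()
    return out
```

Because screenshots land in MemFS, they are versioned with the rest of the workspace and can back visual-verification diffs between runs.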
### Phase 5: Specification Layer (Refined)

**Duration:** 6 weeks (was 8-10)

**Scope reduction:**
- Start with PRD validation (Zod schemas)
- Basic constitution enforcement (regex + AST rules)
- No full natural-language spec parsing yet

**Deferred:**
- Full spec regeneration (Intent-level functionality)
- Architectural diagram generation

### Phase 6: Team Collaboration (Unchanged)

---
## 8. Technical Stack Recommendations (Updated)

### Orchestration Layer

| Component | Original | Updated | Rationale |
|-----------|----------|---------|-----------|
| Queue | Redis | Redis ✅ | Proven, matches research |
| Durable Execution | Temporal | Redis + Workers | Temporal is overkill for Phase 1 |
| Workers | Python asyncio | Python asyncio ✅ | Good fit |
| API | FastAPI | FastAPI ✅ | Async-native, WebSocket support |
### Dashboard

| Component | Original | Updated | Rationale |
|-----------|----------|---------|-----------|
| Framework | React | React ✅ | Standard |
| Build Tool | Vite | Vite ✅ | Fast, modern |
| UI Library | - | shadcn/ui | Accessible, customizable |
| Styling | - | Tailwind CSS | Standard with shadcn |
| Data Fetching | - | TanStack Query | Caching, real-time sync |
| Charts | - | Tremor/Recharts | Dashboard metrics |
### Additional Components

| Component | Recommendation |
|-----------|----------------|
| Browser Automation | Playwright |
| Rich Context Parsing | Custom parser (@file, URL regex) |
| Cost Tracking | Token counting in subagent wrapper |
| WebSocket | FastAPI native + Redis pub/sub |

---
## 9. Risks and Mitigations

### Identified Risks

| Risk | Impact | Mitigation |
|------|--------|------------|
| Web search unavailable for research | Medium | Use fetch_webpage for known URLs |
| Temporal overengineering | High | Defer to Phase 2 evaluation |
| Dashboard scope creep | High | Strict MVP definition (4 weeks) |
| Computer Use complexity | Medium | Use Playwright, limit scope |
| Competition pace | Medium | Focus on Letta differentiators |
### Competitive Response

**If Warp releases an open-source version:**
- Warp is a terminal + IDE hybrid; Letta is agent infrastructure
- Different target users (Warp = developers, Letta = agent builders)
- Letta's stateful/memory approach remains differentiated

**If Intent releases its spec layer as an open standard:**
- Letta could adopt the Intent spec format
- Focus on execution infrastructure
- Potential collaboration opportunity

---
## 10. Next Steps (Prioritized)

### Immediate (This Week)

1. **Decision:** Confirm Redis-only vs. Temporal evaluation
2. **Prototype:** Build in-memory task queue (Week 1)
3. **Research:** Playwright integration for Computer Use
4. **Design:** Dashboard wireframes (shadcn components)

### Week 2-3

5. **Implement:** Redis integration
6. **Test:** Worker pool with 3+ concurrent workers
7. **API:** REST endpoints for task management
### Week 4-6

8. **Integrate:** Modify the Task tool to use the queue
9. **WebSocket:** Real-time updates endpoint
10. **Dashboard:** Start React project setup

### Success Metrics

| Metric | Target |
|--------|--------|
| Task durability | 0% loss on restart |
| Worker concurrency | 5+ parallel tasks |
| Dashboard load time | <2 seconds |
| Approval latency | <1 second from event |
---

## 11. Conclusion

### Updated Assessment

**Letta's position is stronger than initially thought:**

1. **No open-source competitor** has the combination of:
   - Stateful agents with hierarchical memory
   - Git-native persistence
   - Subagent orchestration
   - Model-agnostic design
2. **Commercial tools validate the market** but focus on different layers:
   - Intent: Specification layer (Letta can integrate)
   - Warp: Terminal UX (Letta can offer an alternative)

3. **Technical patterns are well understood**:
   - Redis queues: Proven, simple
   - Temporal: Powerful but may be overkill
   - React + FastAPI: Standard, well-supported
### Final Recommendation

**Proceed with Phase 1 (Orchestration) immediately.**

The research confirms:
- The orchestration layer design is sound
- Redis is the right choice for Phase 1
- The dashboard stack (React + shadcn + TanStack Query) is industry standard
- Competitive pressure is real, but Letta has unique advantages
**Revised Timeline:**
- Phase 1 (Orchestration): 6 weeks
- Phase 2 (Dashboard): 4 weeks
- Phase 3 (Integrations): 4 weeks
- Phase 4 (Computer Use): 4 weeks
- Phase 5 (Specifications): 6 weeks
- Phase 6 (Team): 4 weeks

**Total to full ADE:** 7 months (was 6-9)

**MVP (Phases 1-2):** 10 weeks (was 3-4 months)

---
*Research synthesis conducted on March 18, 2026*

*Sources: Letta docs, Intent.dev, Warp.dev, Temporal.io, Redis docs, Celery docs, GitHub (Aider, Cline)*