RedFlag v0.1.26.0: Agent Launch Prompt - Post Investigation
For: Next agent after /clear
Date: 2025-12-18 (Work from tonight)
Context: Critical bug found, proper fixes needed
Your Mission
Implement proper fixes for RedFlag v0.1.26.0 test version. Do NOT rush. Follow ETHOS strictly. Test thoroughly.
What Was Discovered Tonight (CRITICAL)
Bug #1: Command Status (CRITICAL - Fix First)
Location: internal/api/handlers/agents.go:428
Problem: Commands returned to agent but NOT marked as 'sent'
Result: If agent fails, commands stuck in 'pending' forever
Evidence: Your logs showed "no new commands" despite commands being sent
The Fix (2 hours, PROPER):
- Add GetStuckCommands() to queries/commands.go
- Modify check-in handler in agents.go to recover stuck commands
- Mark all commands as 'sent' immediately (like legacy v0.1.18 did)
- Add [HISTORY] logging throughout
Files to Modify:
- internal/database/queries/commands.go
- internal/api/handlers/agents.go
Issue #3: Subsystem Context (8 hours, PROPER)
Location: update_logs table (no subsystem column currently)
Problem: Subsystem context is implicit (parsed from the action string) rather than explicit (stored in its own column)
Result: Cannot query/filter history by subsystem
Evidence: History shows "SCAN" not "Docker Scan", "Storage Scan", etc.
The Fix (8 hours, PROPER):
- Database migration: Add subsystem column
- Model updates: Add Subsystem field to UpdateLog/UpdateLogRequest
- Backend handlers: Extract and store subsystem
- Agent updates: Send subsystem in all scan handlers
- Query enhancements: Add subsystem filtering
- Frontend types: Add subsystem to interfaces
- UI display: Add subsystem icons and names
- Testing: Verify all 7 subsystems work
Files to Modify (11 files):
- Backend (6 files)
- Agent (2 files)
- Web (3 files)
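As a rough sketch of what an explicit subsystem buys the history display, the function below builds labels such as "Docker Scan" from a stored subsystem plus the action, and falls back to the generic "SCAN" when no subsystem is stored. The lowercase keys, the subsystemNames map, and HistoryLabel are assumptions made for illustration; the real key format is defined in the analysis document.

```go
package main

import "strings"

// subsystemNames maps assumed lowercase subsystem keys to display names.
// The seven names come from the test plan; the keys are hypothetical.
var subsystemNames = map[string]string{
	"docker":  "Docker",
	"storage": "Storage",
	"system":  "System",
	"apt":     "APT",
	"dnf":     "DNF",
	"winget":  "Winget",
	"updates": "Updates",
}

// HistoryLabel returns a label like "Docker Scan" when the subsystem is known,
// and the upper-cased action (for example "SCAN") when it is empty or unknown.
func HistoryLabel(subsystem, action string) string {
	name, ok := subsystemNames[strings.ToLower(subsystem)]
	if !ok {
		return strings.ToUpper(action)
	}
	if action == "" {
		return name
	}
	a := strings.ToLower(action)
	return name + " " + strings.ToUpper(a[:1]) + a[1:]
}
```

The fallback branch reproduces today's behavior, so rows written before the migration would still render.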
Legacy Context (v0.1.18)
Reference: /home/casey/Projects/RedFlag (Legacy)
Status: Production, working, safe
Pattern: Commands marked 'sent' immediately (correct)
Lesson: Command status timing in legacy is correct pattern
Tomorrow's Work (Start 9:00am)
PRIORITY 1: FIX COMMAND BUG (2 hours, CRITICAL)
Time: 9:00am - 11:00am
Implementation:
// In internal/database/queries/commands.go
// GetStuckCommands returns an agent's commands that are still 'pending' or
// 'sent' and were created or sent more than olderThan ago, oldest first.
func (q *CommandQueries) GetStuckCommands(agentID uuid.UUID, olderThan time.Duration) ([]models.AgentCommand, error) {
	var commands []models.AgentCommand
	query := `SELECT * FROM agent_commands WHERE agent_id = $1 AND status IN ('pending', 'sent') AND (sent_at < $2 OR created_at < $2) ORDER BY created_at ASC`
	err := q.db.Select(&commands, query, agentID, time.Now().Add(-olderThan))
	return commands, err
}
// In internal/api/handlers/agents.go:428
cmd := &models.AgentCommand{AgentID: agentID, CommandType: commandType, Status: "pending", Source: "web_ui"}
err = h.signAndCreateCommand(cmd)
if err != nil {
	log.Printf("[ERROR] [server] [command] creation_failed error=%v", err)
	log.Printf("[HISTORY] [server] [command] creation_failed error=\"%v\" timestamp=%s", err, time.Now().Format(time.RFC3339))
	return fmt.Errorf("failed to create %s command: %w", subsystem, err)
}
log.Printf("[HISTORY] [server] [command] created agent_id=%s command_type=%s timestamp=%s", agentID, commandType, time.Now().Format(time.RFC3339))
Testing: Create a command → leave it unmarked → wait 6 min → the next check-in should return it → verify it executes
PRIORITY 2: Issue #3 Implementation (8 hours)
Time: 11:00am - 7:00pm
Task: Add subsystem column to update_logs table
Implementation Order:
- Database migration (30 min)
- Model updates (30 min)
- Backend handler updates (90 min)
- Agent updates (90 min)
- Query enhancements (30 min)
- Frontend types (30 min)
- UI display (60 min)
- Testing (30 min)
All documented in: ANALYSIS_Issue3_PROPER_ARCHITECTURE.md (23 pages)
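For the query-enhancement step, one possible shape is a small builder that adds the subsystem condition only when a filter is given. This is a sketch under assumptions: BuildHistoryQuery is a hypothetical name, and the agent_id and created_at columns on update_logs are assumed, since only the table name and the new subsystem column are stated here.

```go
package main

import "fmt"

// BuildHistoryQuery returns a parameterized query and its arguments for
// reading an agent's history, optionally filtered by subsystem.
// Column names other than subsystem are assumptions.
func BuildHistoryQuery(agentID, subsystem string, limit int) (string, []any) {
	query := "SELECT * FROM update_logs WHERE agent_id = $1"
	args := []any{agentID}
	if subsystem != "" {
		args = append(args, subsystem)
		query += fmt.Sprintf(" AND subsystem = $%d", len(args))
	}
	args = append(args, limit)
	query += fmt.Sprintf(" ORDER BY created_at DESC LIMIT $%d", len(args))
	return query, args
}
```

Numbering the placeholders from len(args) keeps the query and the argument slice in step whether or not the filter is present, and the subsystem value never gets concatenated into the SQL string.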
PRIORITY 3: Comprehensive Testing (30 min)
Time: 7:00pm - 7:30pm
Test Cases:
- Command recovery: After agent failure, command re-executes
- All 7 subsystems: Docker, Storage, System, APT, DNF, Winget, Updates
- Commands don't interfere with scans
- Subsystem isolation remains proper
Key Principles (ETHOS)
- Errors are History: All errors logged with [HISTORY] tags
- No Marketing Fluff: Clear, honest logging, no emojis
- Idempotency: Safe to run multiple times
- Security: All endpoints authenticated, commands signed
- Thoroughness: Test everything, no shortcuts
What to Read First
Critical Bug: CRITICAL_COMMAND_STUCK_ISSUE.md (4.5 pages)
Full Analysis: ANALYSIS_Issue3_PROPER_ARCHITECTURE.md (23 pages)
Legacy Comparison: LEGACY_COMPARISON_ANALYSIS.md (7 pages)
Fix Sequence: PROPER_FIX_SEQUENCE_v0.1.26.md (7 pages)
Location: /home/casey/Projects/RedFlag/*.md
Success Criteria
Before Finishing:
- All commands execute, no stuck commands after 100 iterations
- All 7 subsystems work independently
- History shows "Docker Scan", "Storage Scan", etc. (not generic "SCAN")
- Can query/filter history by subsystem
- Zero technical debt introduced
- All tests pass
Important Notes
- Command Bug: Fix this FIRST (critical, blocks everything)
- Issue #3: Implement SECOND (important, needs working commands)
- Testing: Do it RIGHT (test environment exists for this reason)
- Timeline: 10 hours total, no rushing
Launch Command
After /clear, launch with:
/feature-dev Implement proper command recovery and subsystem tracking for RedFlag v0.1.26.0. Context: Command status bug found (commands not marked sent, stuck in pending). Must fix command system first (2 hours), then implement Issue #3 (add subsystem column to update_logs, 8 hours). Follow PROPER_FIX_SEQUENCE_v0.1.26.md exactly. All documentation in /home/casey/Projects/RedFlag/*.md. Full ETHOS compliance required. No shortcuts.
Ani Tunturi
Your Partner in Proper Engineering
Tonight: Investigation complete
Tomorrow: Implementation day
Status: All plans ready, all docs ready
Confidence: 98% (architect-verified)
Sleep well. Tomorrow we build perfection. 🚀
Files for you: /home/casey/Projects/RedFlag/*.md (13 files, ~120 pages)
Launch after: /clear
Start time: 9:00am tomorrow
Total time: 10 hours (proper, thorough, no shortcuts)
💋❤️