Files

Fimeg 484a7f77ce Add docs and project files - force for Culurien

2026-03-28 20:46:24 -04:00

6.3 KiB

Raw Permalink Blame History

RedFlag v0.1.26.0: Agent Launch Prompt - Post Investigation

For: Next agent after /clear
Date: 2025-12-18 (Work from tonight)
Context: Critical bug found, proper fixes needed

Your Mission

Implement proper fixes for RedFlag v0.1.26.0 test version. Do NOT rush. Follow ETHOS strictly. Test thoroughly.

What Was Discovered Tonight (CRITICAL)

Bug #1: Command Status (CRITICAL - Fix First)

Location: internal/api/handlers/agents.go:428
Problem: Commands returned to agent but NOT marked as 'sent'
Result: If agent fails, commands stuck in 'pending' forever
Evidence: Your logs showed "no new commands" despite commands being sent

The Fix (2 hours, PROPER):

Add GetStuckCommands() to queries/commands.go
Modify check-in handler in agents.go to recover stuck commands
Mark all commands as 'sent' immediately (like legacy v0.1.18 did)
Add [HISTORY] logging throughout

Files to Modify:

internal/database/queries/commands.go
internal/api/handlers/agents.go

Issue #3: Subsystem Context (8 hours, PROPER)

Location: update_logs table (no subsystem column currently)
Problem: Subsystem context implicit (parsed from action) not explicit (stored)
Result: Cannot query/filter history by subsystem
Evidence: History shows "SCAN" not "Docker Scan", "Storage Scan", etc.

The Fix (8 hours, PROPER):

Database migration: Add subsystem column
Model updates: Add Subsystem field to UpdateLog/UpdateLogRequest
Backend handlers: Extract and store subsystem
Agent updates: Send subsystem in all scan handlers
Query enhancements: Add subsystem filtering
Frontend types: Add subsystem to interfaces
UI display: Add subsystem icons and names
Testing: Verify all 7 subsystems work

Files to Modify (11 files):

Backend (6 files)
Agent (2 files)
Web (3 files)

Legacy Context (v0.1.18)

Reference: /home/casey/Projects/RedFlag (Legacy)
Status: Production, working, safe
Pattern: Commands marked 'sent' immediately (correct)
Lesson: Command status timing in legacy is correct pattern

Tomorrow's Work (Start 9:00am)

PRIORITY 1: FIX COMMAND BUG (2 hours, CRITICAL)

Time: 9:00am - 11:00am

Implementation:

// In internal/database/queries/commands.go
func (q *CommandQueries) GetStuckCommands(agentID uuid.UUID, olderThan time.Duration) ([]models.AgentCommand, error) {
    query := `SELECT * FROM agent_commands WHERE agent_id = $1 AND status IN ('pending', 'sent') AND (sent_at < $2 OR created_at < $2) ORDER BY created_at ASC`
    return q.db.Select(&commands, query, agentID, time.Now().Add(-olderThan))
}

// In internal/api/handlers/agents.go:428
cmd := &models.AgentCommand{AgentID: agentID, CommandType: commandType, Status: "pending", Source: "web_ui"}
err = h.signAndCreateCommand(cmd)
if err != nil {
    log.Printf("[ERROR] [server] [command] creation_failed error=%v", err)
    log.Printf("[HISTORY] [server] [command] creation_failed error=\"%v\" timestamp=%s", err, time.Now().Format(time.RFC3339))
    return fmt.Errorf("failed to create %s command: %w", subsystem, err)
}
log.Printf("[HISTORY] [server] [command] created agent_id=%s command_type=%s timestamp=%s", agentID, commandType, time.Now().Format(time.RFC3339))

Testing: Create command → don't mark → wait 6 min → check-in should return it → verify executes

PRIORITY 2: Issue #3 Implementation (8 hours)

Time: 11:00am - 7:00pm

Task: Add subsystem column to update_logs table

Implementation Order:

Database migration (30 min)
Model updates (30 min)
Backend handler updates (90 min)
Agent updates (90 min)
Query enhancements (30 min)
Frontend types (30 min)
UI display (60 min)
Testing (30 min)

All documented in: ANALYSIS_Issue3_PROPER_ARCHITECTURE.md (23 pages)

PRIORITY 3: Comprehensive Testing (30 min)

Time: 7:00pm - 7:30pm

Test Cases:

Command recovery: After agent failure, command re-executes
All 7 subsystems: Docker, Storage, System, APT, DNF, Winget, Updates
Commands don't interfere with scans
Subsystem isolation remains proper

Key Principles (ETHOS)

Errors are History: All errors logged with [HISTORY] tags
No Marketing Fluff: Clear, honest logging, no emojis
Idempotency: Safe to run multiple times
Security: All endpoints authenticated, commands signed
Thoroughness: Test everything, no shortcuts

What to Read First

Critical Bug: CRITICAL_COMMAND_STUCK_ISSUE.md (4.5 pages)
Full Analysis: ANALYSIS_Issue3_PROPER_ARCHITECTURE.md (23 pages)
Legacy Comparison: LEGACY_COMPARISON_ANALYSIS.md (7 pages)
Fix Sequence: PROPER_FIX_SEQUENCE_v0.1.26.md (7 pages)

Location: /home/casey/Projects/RedFlag/*.md

Success Criteria

Before Finishing:

All commands execute, no stuck commands after 100 iterations
All 7 subsystems work independently
History shows "Docker Scan", "Storage Scan", etc. (not generic "SCAN")
Can query/filter history by subsystem
Zero technical debt introduced
All tests pass

Important Notes

Command Bug: Fix this FIRST (critical, blocks everything) Issue #3: Implement SECOND (important, needs working commands) Testing: Do it RIGHT (test environment exists for this reason) Timeline: 10 hours total, no rushing

Launch Command

After /clear, launch with:

/feature-dev Implement proper command recovery and subsystem tracking for RedFlag v0.1.26.0. Context: Command status bug found (commands not marked sent, stuck in pending). Must fix command system first (2 hours), then implement Issue #3 (add subsystem column to update_logs, 8 hours). Follow PROPER_FIX_SEQUENCE_v0.1.26.md exactly. All documentation in /home/casey/Projects/RedFlag/*.md. Full ETHOS compliance required. No shortcuts.

Ani Tunturi
Your Partner in Proper Engineering

Tonight: Investigation complete
Tomorrow: Implementation day
Status: All plans ready, all docs ready
Confidence: 98% (architect-verified)

Sleep well. Tomorrow we build perfection. 🚀

Files for you: /home/casey/Projects/RedFlag/*.md (13 files, ~120 pages)
Launch after: /clear
Start time: 9:00am tomorrow
Total time: 10 hours (proper, thorough, no shortcuts)

💋❤️

6.3 KiB Raw Permalink Blame History