Files

11 KiB

RedFlag Development TODOs

Version: v0.2.0 (Testing Phase - DO NOT PUSH TO PRODUCTION) Last Updated: 2025-11-11


Status Overview

Current Phase: UI/UX Polish + Integration Testing Production: Still on v0.1.23 Build: Working (all containers start successfully) Testing: Pending (manual upgrade test required)


Task Status

COMPLETED

Security Health Polish

  • File: aggregator-web/src/components/AgentScanners.tsx
  • Changes:
    • 2x2 grid layout (denser display)
    • Smaller text sizes (xs, 11px, 10px)
    • Inline metrics in cards
    • Condensed alerts/recs ("+N more" pattern)
    • Stats row with 4 columns
    • All information preserved
  • Status: DONE

Remove "Check for updates" - Agent List

  • File: aggregator-web/src/pages/Agents.tsx
  • Changes:
    • Removed useScanAgent import
    • Removed scanAgentMutation declaration
    • Removed handleScanAgent function
    • Removed "Scan Now" button from agent detail header
    • Removed "Scan Now" button in agent detail section
    • Removed "Trigger scan" button in agents table
  • Status: DONE

🔄 IN PROGRESS

Remove "Check for updates" - Updates & Packages Screen

  • Target File: aggregator-web/src/pages/Updates.tsx (or similar)
  • Action Required:
    1. Find the "Check for updates" button
    2. Remove button component
    3. Remove related handlers/mutations
    4. Remove dead code
  • Dependencies: None
  • Status: 🔄 FINDING FILE

📋 PENDING

1. Rebuild Web Container

  • Action: docker-compose build --no-cache web && docker-compose up -d web
  • Purpose: Apply Security Health polish + removed buttons to UI
  • Priority: HIGH (needed to see changes)
  • Dependencies: Remove "Check for updates" from Updates & Packages

2. Make Version Button Context-Aware

  • File: aggregator-web/src/pages/Agents.tsx
  • Current: Version button always shows "Current / Update"
  • Target: Only show "Update" when upgrade available
  • Action:
    • Check if agent has update via GET /api/v1/agents/:id/updates/available
    • Conditionally render button text
  • Priority: MEDIUM
  • Dependencies: Container rebuild

3. Add Agent Update Logging to History

  • Backend: aggregator-server/internal/services/
  • Frontend: Update History display
  • Action: Ensure agent updates are recorded in update_events table
  • Current Issue: Updates not showing in History
  • Priority: MEDIUM
  • Dependencies: Context-aware Version button

4. Polish Overview Metrics UI

  • File: aggregator-web/src/pages/Agents.tsx
  • Target: Make Overview metrics "sexier"
  • Issues: Metrics display is "plain"
  • Priority: LOW
  • Dependencies: Container rebuild

5. Polish Storage and Disks UI

  • File: aggregator-web/src/pages/Agents.tsx
  • Target: Make Storage/Disks "sexier"
  • Issues: Display is "plain"
  • Priority: LOW
  • Dependencies: Container rebuild

6. Force Update Fedora Agent (v0.1.23 → v0.2.0)

  • Purpose: Test manual upgrade path
  • Reference: MANUAL_UPGRADE.md
  • Steps:
    1. Build v0.2.0 binary
    2. Sign and add to database
    3. Follow MANUAL_UPGRADE.md
    4. Verify update works
  • Priority: MEDIUM (testing)
  • Dependencies: All UI/UX polish complete

Integration Testing Plan

Phase 1: UI/UX Polish (Current)

  • Security Health density
  • Remove Scan Now buttons
  • Remove Check for updates (Updates page)
  • Rebuild container
  • Make Version button context-aware
  • Add History logging

Phase 2: Manual Testing

  • Fedora agent upgrade (v0.1.23 → v0.2.0)
  • Update system test (v0.2.0 → v0.2.1)
  • Full integration suite

Phase 3: Production Readiness

  • All tests passing
  • Documentation updated
  • README reflects v0.2.0 features
  • Ready for v0.2.0 push

Testing Requirements

Before v0.2.0 Release

  1. Docker builds work
  2. All services start
  3. Security Health displays properly
  4. No "Check for updates" clutter
  5. Version button context-aware
  6. History logging works
  7. Manual upgrade tested
  8. Auto-update tested

NOT Required for v0.2.0

  • Overview metrics polish (defer)
  • Storage/Disks polish (defer)

Key Decisions

Version Bump Strategy

  • Current: v0.1.23 (production)
  • Target: v0.2.0 (after testing)
  • Path: Manual upgrade for v0.1.23 users, auto for v0.2.0+

Migration Strategy

  • v0.1.23: Manual upgrade (no auto-update)
  • v0.2.0+: Auto-update via UI
  • Guide: MANUAL_UPGRADE.md created

Code Quality

  • Removed: 4,168 lines (7 phases)
  • Fixed: Platform detection bug
  • Added: Template-based installers
  • Status: Stable

Migration/Reinstall Detection - DETAILED ANALYSIS

Current Gap

The install script template (linux.sh.tmpl, windows.ps1.tmpl) has NO migration detection. Running the install one-liner again doesn't detect existing installations.

Migration Detection Requirements (From Agent Code)

File Path Detection

  • Old paths: /etc/aggregator, /var/lib/aggregator
  • New paths: /etc/redflag, /var/lib/redflag

Version Detection (Must Check!)

// From detection.go:219-250
// Reads config.json to get:
// - agent_version (string): v0.1.23, v0.1.24, v0.2.0, etc.
// - version (int): Config version 0-5+
// - Security features: nonce_validation, machine_id_binding, etc.

Config Version Schemas:

  • v0-v3: Old config, no security features
  • v4: Subsystem configuration added
  • v5: Docker secrets migration
  • v6+: Modern RedFlag with all features

Migration Triggers (Auto-Detect)

  1. Old paths exist: /etc/aggregator or /var/lib/aggregator
  2. Config version < 4: Missing subsystems, security features
  3. Missing security features:
    • nonce_validation (v0.1.22+)
    • machine_id_binding (v0.1.22+)
    • ed25519_verification (v0.1.22+)
    • subsystem_configuration (v0.1.23+)

Install Script Required Behavior

Step 1: Detect Installation

# Check both old and new locations
OLD_CONFIG="/etc/aggregator/config.json"
NEW_CONFIG="/etc/redflag/config.json"

if [ -f "$NEW_CONFIG" ]; then
    echo "Existing installation detected at /etc/redflag"
    # Parse version from config.json
    CURRENT_VERSION=$(grep -o '"version": [0-9]*' "$NEW_CONFIG" | grep -o '[0-9]*')
    AGENT_VERSION=$(grep -o '"agent_version": "v[0-9.]*"' "$NEW_CONFIG" | grep -o 'v[0-9.]*')
elif [ -f "$OLD_CONFIG" ]; then
    echo "Old installation detected at /etc/aggregator - MIGRATION NEEDED"
else
    echo "Fresh install"
    exit 0
fi

Step 2: Determine Migration Needed

# From agent detection logic (detection.go:279-311)
MIGRATION_NEEDED=false

if [ "$CURRENT_VERSION" -lt "4" ]; then
    MIGRATION_NEEDED=true
    echo "Config version $CURRENT_VERSION < v4 - needs migration"
fi

# Check security features
if ! grep -q "nonce_validation" "$NEW_CONFIG" 2>/dev/null; then
    MIGRATION_NEEDED=true
    echo "Missing nonce_validation security feature"
fi

if [ "$MIGRATION_NEEDED" = true ]; then
    echo "Migration required to upgrade to latest security features"
fi

Step 3: Run Migration

if [ "$MIGRATION_NEEDED" = true ]; then
    echo "Running agent migration..."
    # Check if --migrate flag exists
    if /usr/local/bin/redflag-agent --help | grep -q "migrate"; then
        sudo /usr/local/bin/redflag-agent --migrate || {
            echo "ERROR: Migration failed!"
            # TODO: Report to server properly
            exit 1
        }
    else
        echo "Note: Agent will migrate on first start"
    fi
fi

Same Implementation Needed For:

  • linux.sh.tmpl (Bash)
  • windows.ps1.tmpl (PowerShell)

Migration Flow (Correct Behavior)

  1. Detect: Check file paths + parse versions
  2. Analyze: Config version + security features
  3. Backup: Old config to /etc/redflag.backup.{timestamp}/
  4. Migrate: Move paths + upgrade config + add security features
  5. Report: Success/failure to server (TODO: Implement)

Current State (2025-11-11 23:47)

COMPLETED

  1. Security Health UI - denser, 2x2 grid, continuous surface
  2. Version column - clickable "Update" badge opens AgentUpdatesModal
  3. Actions column - removed "Check for updates" buttons
  4. AgentHealth redesign - continuous flow like Overview/History
  5. Web container rebuild - all changes live
  6. Install Script Migration Detection - Made install one-liner idempotent!
    • Added migration detection to linux.sh.tmpl
    • Added migration detection to windows.ps1.tmpl
    • Checks both old (/etc/aggregator) AND new (/etc/redflag) paths
    • Parses version from config.json (agent_version + config version)
    • Detects missing security features (nonce, machine_binding, etc.)
    • Creates backups in /etc/redflag/backups/backup.{timestamp}/
  7. CRITICAL FIX: Stop Service Before Writing
    • Fixed install script to stop service BEFORE downloading binary
    • Prevents "curl: (23) client returned ERROR on write" error
    • Both Linux (systemctl stop) and Windows (Stop-Service) now stop service first
    • Ensures binary file is not locked during download
    • Migration Flow: Install script backs up old config → downloads fresh config → agent starts → agent detects old files → agent parses from BOTH locations → agent migrates automatically

🔄 NEXT PRIORITY

CRITICAL: Fix Binary URL Architecture Mismatch

Problem Identified:

curl: (23) client returned ERROR on write of 18 bytes
[Actually caused by: Exec format error]

Root Cause:

  1. Install script requests binary from: http://localhost:3000/downloads/linux
  2. Server returns 404 (file not found)
  3. Binary download FAILS - file is empty or corrupted
  4. Systemd tries to execute empty/corrupted file
  5. Gets "Exec format error"

URL Mismatch:

  • Install script uses: /downloads/{platform} where platform = "linux"
  • Server has files at: /downloads/linux-amd64, /downloads/linux-arm64
  • Architecture info is LOST in the template

Solution Options:

Option A: Template Architecture Detection (PROS)

  • Simple - works for any system
  • No server changes needed
  • Makes template more complex (300+ lines already)
  • Multiple exit paths for unsupported archs

Option B: Fix Server URL Generation (BETTER)

  • Keep template simple
  • Need server changes to detect architecture
  • How? Server doesn't know client arch when serving script

Option C: Include Architecture in URL

  • Template receives platform like "linux-amd64" not "linux"
  • Requires server to know client arch
  • How does server know?

Option D: Redirect/Best-Effort

  • Server has /downloads/linux → redirects to /downloads/linux-amd64
  • Simple, backward compatible
  • Works for 99% of cases

Recommended: Option D - Server handles both /linux and redirects to /linux-amd64 for x86_64 systems

🔄 SECOND PRIORITY

Implement Migration Error Reporting - Report migration failures to server

Critical Requirement:

  • CRITICAL: If migration fails or doesn't exist - MUST report to server
  • TODO: Implement migration error reporting to History table

📋 REMAINING

  • Test install script idempotency with various versions
  • Implement migration error reporting to server (History table)
  • Agent update logging to History
  • Overview/Storage metrics polish
  • Fedora agent manual upgrade test (v0.1.23.5)

Contact: @Fimeg for decisions, @Claude/@Sonnet for implementation