# RedFlag Development TODOs **Version:** v0.2.0 (Testing Phase - DO NOT PUSH TO PRODUCTION) **Last Updated:** 2025-11-11 --- ## Status Overview **Current Phase:** UI/UX Polish + Integration Testing **Production:** Still on v0.1.23 **Build:** ✅ Working (all containers start successfully) **Testing:** ⏳ Pending (manual upgrade test required) --- ## Task Status ### ✅ COMPLETED #### Security Health Polish - **File:** `aggregator-web/src/components/AgentScanners.tsx` - **Changes:** - 2x2 grid layout (denser display) - Smaller text sizes (xs, 11px, 10px) - Inline metrics in cards - Condensed alerts/recs ("+N more" pattern) - Stats row with 4 columns - All information preserved - **Status:** ✅ DONE #### Remove "Check for updates" - Agent List - **File:** `aggregator-web/src/pages/Agents.tsx` - **Changes:** - Removed `useScanAgent` import - Removed `scanAgentMutation` declaration - Removed `handleScanAgent` function - Removed "Scan Now" button from agent detail header - Removed "Scan Now" button in agent detail section - Removed "Trigger scan" button in agents table - **Status:** ✅ DONE --- ### 🔄 IN PROGRESS #### Remove "Check for updates" - Updates & Packages Screen - **Target File:** `aggregator-web/src/pages/Updates.tsx` (or similar) - **Action Required:** 1. Find the "Check for updates" button 2. Remove button component 3. Remove related handlers/mutations 4. Remove dead code - **Dependencies:** None - **Status:** 🔄 FINDING FILE --- ### 📋 PENDING #### 1. Rebuild Web Container - **Action:** `docker-compose build --no-cache web && docker-compose up -d web` - **Purpose:** Apply Security Health polish + removed buttons to UI - **Priority:** HIGH (needed to see changes) - **Dependencies:** Remove "Check for updates" from Updates & Packages #### 2. Make Version Button Context-Aware - **File:** `aggregator-web/src/pages/Agents.tsx` - **Current:** Version button always shows "Current / Update" - **Target:** Only show "Update" when upgrade available - **Action:** - Check if agent has update via `GET /api/v1/agents/:id/updates/available` - Conditionally render button text - **Priority:** MEDIUM - **Dependencies:** Container rebuild #### 3. Add Agent Update Logging to History - **Backend:** `aggregator-server/internal/services/` - **Frontend:** Update History display - **Action:** Ensure agent updates are recorded in `update_events` table - **Current Issue:** Updates not showing in History - **Priority:** MEDIUM - **Dependencies:** Context-aware Version button #### 4. Polish Overview Metrics UI - **File:** `aggregator-web/src/pages/Agents.tsx` - **Target:** Make Overview metrics "sexier" - **Issues:** Metrics display is "plain" - **Priority:** LOW - **Dependencies:** Container rebuild #### 5. Polish Storage and Disks UI - **File:** `aggregator-web/src/pages/Agents.tsx` - **Target:** Make Storage/Disks "sexier" - **Issues:** Display is "plain" - **Priority:** LOW - **Dependencies:** Container rebuild #### 6. Force Update Fedora Agent (v0.1.23 → v0.2.0) - **Purpose:** Test manual upgrade path - **Reference:** `MANUAL_UPGRADE.md` - **Steps:** 1. Build v0.2.0 binary 2. Sign and add to database 3. Follow MANUAL_UPGRADE.md 4. Verify update works - **Priority:** MEDIUM (testing) - **Dependencies:** All UI/UX polish complete --- ## Integration Testing Plan ### Phase 1: UI/UX Polish (Current) - [x] Security Health density - [x] Remove Scan Now buttons - [ ] Remove Check for updates (Updates page) - [ ] Rebuild container - [ ] Make Version button context-aware - [ ] Add History logging ### Phase 2: Manual Testing - [ ] Fedora agent upgrade (v0.1.23 → v0.2.0) - [ ] Update system test (v0.2.0 → v0.2.1) - [ ] Full integration suite ### Phase 3: Production Readiness - [ ] All tests passing - [ ] Documentation updated - [ ] README reflects v0.2.0 features - [ ] Ready for v0.2.0 push --- ## Testing Requirements ### Before v0.2.0 Release 1. ✅ Docker builds work 2. ✅ All services start 3. ✅ Security Health displays properly 4. ✅ No "Check for updates" clutter 5. ⏳ Version button context-aware 6. ⏳ History logging works 7. ⏳ Manual upgrade tested 8. ⏳ Auto-update tested ### NOT Required for v0.2.0 - Overview metrics polish (defer) - Storage/Disks polish (defer) --- ## Key Decisions ### Version Bump Strategy - **Current:** v0.1.23 (production) - **Target:** v0.2.0 (after testing) - **Path:** Manual upgrade for v0.1.23 users, auto for v0.2.0+ ### Migration Strategy - **v0.1.23:** Manual upgrade (no auto-update) - **v0.2.0+:** Auto-update via UI - **Guide:** `MANUAL_UPGRADE.md` created ### Code Quality - **Removed:** 4,168 lines (7 phases) - **Fixed:** Platform detection bug - **Added:** Template-based installers - **Status:** ✅ Stable --- ## Migration/Reinstall Detection - DETAILED ANALYSIS ### Current Gap The **install script template** (`linux.sh.tmpl`, `windows.ps1.tmpl`) has NO migration detection. Running the install one-liner again doesn't detect existing installations. ### Migration Detection Requirements (From Agent Code) #### File Path Detection - **Old paths:** `/etc/aggregator`, `/var/lib/aggregator` - **New paths:** `/etc/redflag`, `/var/lib/redflag` #### Version Detection (Must Check!) ```go // From detection.go:219-250 // Reads config.json to get: // - agent_version (string): v0.1.23, v0.1.24, v0.2.0, etc. // - version (int): Config version 0-5+ // - Security features: nonce_validation, machine_id_binding, etc. ``` **Config Version Schemas:** - **v0-v3:** Old config, no security features - **v4:** Subsystem configuration added - **v5:** Docker secrets migration - **v6+:** Modern RedFlag with all features #### Migration Triggers (Auto-Detect) 1. **Old paths exist:** `/etc/aggregator` or `/var/lib/aggregator` 2. **Config version < 4:** Missing subsystems, security features 3. **Missing security features:** - `nonce_validation` (v0.1.22+) - `machine_id_binding` (v0.1.22+) - `ed25519_verification` (v0.1.22+) - `subsystem_configuration` (v0.1.23+) ### Install Script Required Behavior #### Step 1: Detect Installation ```bash # Check both old and new locations OLD_CONFIG="/etc/aggregator/config.json" NEW_CONFIG="/etc/redflag/config.json" if [ -f "$NEW_CONFIG" ]; then echo "Existing installation detected at /etc/redflag" # Parse version from config.json CURRENT_VERSION=$(grep -o '"version": [0-9]*' "$NEW_CONFIG" | grep -o '[0-9]*') AGENT_VERSION=$(grep -o '"agent_version": "v[0-9.]*"' "$NEW_CONFIG" | grep -o 'v[0-9.]*') elif [ -f "$OLD_CONFIG" ]; then echo "Old installation detected at /etc/aggregator - MIGRATION NEEDED" else echo "Fresh install" exit 0 fi ``` #### Step 2: Determine Migration Needed ```bash # From agent detection logic (detection.go:279-311) MIGRATION_NEEDED=false if [ "$CURRENT_VERSION" -lt "4" ]; then MIGRATION_NEEDED=true echo "Config version $CURRENT_VERSION < v4 - needs migration" fi # Check security features if ! grep -q "nonce_validation" "$NEW_CONFIG" 2>/dev/null; then MIGRATION_NEEDED=true echo "Missing nonce_validation security feature" fi if [ "$MIGRATION_NEEDED" = true ]; then echo "Migration required to upgrade to latest security features" fi ``` #### Step 3: Run Migration ```bash if [ "$MIGRATION_NEEDED" = true ]; then echo "Running agent migration..." # Check if --migrate flag exists if /usr/local/bin/redflag-agent --help | grep -q "migrate"; then sudo /usr/local/bin/redflag-agent --migrate || { echo "ERROR: Migration failed!" # TODO: Report to server properly exit 1 } else echo "Note: Agent will migrate on first start" fi fi ``` ### Same Implementation Needed For: - `linux.sh.tmpl` (Bash) - `windows.ps1.tmpl` (PowerShell) ### Migration Flow (Correct Behavior) 1. **Detect:** Check file paths + parse versions 2. **Analyze:** Config version + security features 3. **Backup:** Old config to `/etc/redflag.backup.{timestamp}/` 4. **Migrate:** Move paths + upgrade config + add security features 5. **Report:** Success/failure to server (TODO: Implement) --- ## Current State (2025-11-11 23:47) ### ✅ COMPLETED 1. Security Health UI - denser, 2x2 grid, continuous surface 2. Version column - clickable "Update" badge opens AgentUpdatesModal 3. Actions column - removed "Check for updates" buttons 4. AgentHealth redesign - continuous flow like Overview/History 5. Web container rebuild - all changes live 6. **Install Script Migration Detection** - Made install one-liner idempotent! ✅ - Added migration detection to `linux.sh.tmpl` - Added migration detection to `windows.ps1.tmpl` - Checks both old (`/etc/aggregator`) AND new (`/etc/redflag`) paths - Parses version from config.json (agent_version + config version) - Detects missing security features (nonce, machine_binding, etc.) - Creates backups in `/etc/redflag/backups/backup.{timestamp}/` 7. **CRITICAL FIX: Stop Service Before Writing** ✅ - Fixed install script to stop service BEFORE downloading binary - Prevents "curl: (23) client returned ERROR on write" error - Both Linux (systemctl stop) and Windows (Stop-Service) now stop service first - Ensures binary file is not locked during download - **Migration Flow**: Install script backs up old config → downloads fresh config → agent starts → agent detects old files → agent parses from BOTH locations → agent migrates automatically ### 🔄 NEXT PRIORITY **CRITICAL: Fix Binary URL Architecture Mismatch** **Problem Identified:** ``` curl: (23) client returned ERROR on write of 18 bytes [Actually caused by: Exec format error] ``` **Root Cause:** 1. Install script requests binary from: `http://localhost:3000/downloads/linux` 2. Server returns 404 (file not found) 3. Binary download FAILS - file is empty or corrupted 4. Systemd tries to execute empty/corrupted file 5. Gets "Exec format error" **URL Mismatch:** - Install script uses: `/downloads/{platform}` where platform = "linux" - Server has files at: `/downloads/linux-amd64`, `/downloads/linux-arm64` - Architecture info is LOST in the template **Solution Options:** **Option A: Template Architecture Detection (PROS)** - ✅ Simple - works for any system - ✅ No server changes needed - ❌ Makes template more complex (300+ lines already) - ❌ Multiple exit paths for unsupported archs **Option B: Fix Server URL Generation (BETTER)** - ✅ Keep template simple - ❌ Need server changes to detect architecture - How? Server doesn't know client arch when serving script **Option C: Include Architecture in URL** - Template receives platform like "linux-amd64" not "linux" - Requires server to know client arch - How does server know? **Option D: Redirect/Best-Effort** - Server has `/downloads/linux` → redirects to `/downloads/linux-amd64` - Simple, backward compatible - ✅ Works for 99% of cases **Recommended:** Option D - Server handles both `/linux` and redirects to `/linux-amd64` for x86_64 systems ### 🔄 SECOND PRIORITY **Implement Migration Error Reporting** - Report migration failures to server **Critical Requirement:** - ❌ **CRITICAL:** If migration fails or doesn't exist - MUST report to server - ❌ **TODO:** Implement migration error reporting to History table ### 📋 REMAINING - Test install script idempotency with various versions - Implement migration error reporting to server (History table) - Agent update logging to History - Overview/Storage metrics polish - Fedora agent manual upgrade test (v0.1.23.5) --- **Contact:** @Fimeg for decisions, @Claude/@Sonnet for implementation