Files
Redflag/docs/3_BACKLOG/P0-008_Migration-Runs-On-Fresh-Install.md

119 lines
4.1 KiB
Markdown

# P0-008: Migration Runs on Fresh Install - False Positive Detection
**Priority:** P0 (Critical)
**Date Identified:** 2025-12-17
**Status:** ✅ FIXED
**Date Fixed:** 2025-12-17
**Fixed By:** Casey & Claude
## Problem Description
On fresh agent installations, the migration system incorrectly detects that migration is required and runs unnecessary migration logic before registration check, causing confusing logs and potential failures.
**Logs from Fresh Install:**
```
2025/12/17 10:26:38 [RedFlag Server Migrator] Agent may not function correctly until migration is completed
2025/12/17 10:26:38 [CONFIG] Adding missing 'updates' subsystem configuration
2025/12/17 10:26:38 Agent not registered. Run with -register flag first.
```
**Root Cause:**
- Fresh install creates minimal config with empty `agent_id`
- `DetectMigrationRequirements()` sees config file exists
- Checks for missing security features (subsystems, machine_id)
- Adds "security_hardening" migration since version is 0
- Runs migration BEFORE registration check
- This is unnecessary - fresh installs should be clean
**Why This Matters:**
- **Confusing UX**: Users see "migration required" on first run
- **False Positives**: Migration system detects upgrades where none exist
- **Potential Failures**: If migration fails, agent won't start
- **Performance**: Adds unnecessary startup delay
## Root Cause Analysis
### Current Logic Flow (Broken)
1. **Installer creates config**: `/etc/redflag/agent/config.json` with:
```json
{
"agent_id": "",
"registration_token": "...",
// ... other fields but NO subsystems, NO machine_id
}
```
2. **Agent starts** → `main.go:209` calls `DetectMigrationRequirements()`
3. **Detection sees**: Config file exists → version is 0 → missing security features
4. **Migration adds**: `subsystems` configuration → updates version
5. **THEN registration check runs** → agent_id is empty → fails
### The Fundamental Flaw
**Migration should ONLY run for actual upgrades, NEVER for fresh installs.**
Current code checks:
- ✅ Config file exists? → Yes (fresh install creates it)
- ❌ Agent is registered? → Not checked!
## Solution Implemented
**Added early return in `determineRequiredMigrations()` to skip migration for fresh installs:**
```go
// NEW: Check if this is a fresh install (config exists but agent_id is empty)
if configData, err := os.ReadFile(configPath); err == nil {
var config map[string]interface{}
if json.Unmarshal(configData, &config) == nil {
if agentID, ok := config["agent_id"].(string); !ok || agentID == "" {
// Fresh install - no migrations needed
return migrations
}
}
}
```
**Location:** `aggregator-agent/internal/migration/detection.go` lines 290-299
### How It Works
1. **Fresh install**: Config has empty `agent_id` → skip all migrations
2. **Registered agent**: Config has valid `agent_id` → proceed with migration detection
3. **Legacy upgrade**: Config has agent_id but old version → migration runs normally
## Files Modified
- `aggregator-agent/internal/migration/detection.go`
- Added fresh install detection (lines 290-299)
- No other changes needed
## Verification
**Testing fresh install:**
1. Install agent on clean system
2. Start service: `sudo systemctl start redflag-agent`
3. Check logs: `sudo journalctl -u redflag-agent -f`
4. **Should NOT see**: "Migration detected" or "Agent may not function correctly until migration"
5. **Should see only**: "Agent not registered" (if not registered yet)
**Testing upgrade:**
1. Install older version (v0.1.18 if available)
2. Register agent
3. Upgrade to current version
4. **Should see**: Migration running normally
## Impact
- **Fixed:** Fresh installs no longer trigger false migration
- **Why P0:** Confusing UX, potential for migration failures on first run
- **Performance:** Faster agent startup for new installations
- **Reliability:** Prevents migration failures blocking new users
---
**Note:** This fix prevents false positives while preserving legitimate migration for actual upgrades. The logic is simple: if agent_id is empty, it's a fresh install - skip migration.