# P0-008: Migration Runs on Fresh Install - False Positive Detection

**Priority:** P0 (Critical)
**Date Identified:** 2025-12-17
**Status:** ✅ FIXED
**Date Fixed:** 2025-12-17
**Fixed By:** Casey & Claude

## Problem Description

On fresh agent installations, the migration system incorrectly detects that a migration is required and runs unnecessary migration logic before the registration check, causing confusing logs and potential failures.

**Logs from Fresh Install:**
```
2025/12/17 10:26:38 [RedFlag Server Migrator] Agent may not function correctly until migration is completed
2025/12/17 10:26:38 [CONFIG] Adding missing 'updates' subsystem configuration
2025/12/17 10:26:38 Agent not registered. Run with -register flag first.
```

**Root Cause:**
- Fresh install creates minimal config with empty `agent_id`
- `DetectMigrationRequirements()` sees config file exists
- Checks for missing security features (subsystems, machine_id)
- Adds "security_hardening" migration since version is 0
- Runs migration BEFORE registration check
- This is unnecessary - fresh installs should be clean

**Why This Matters:**
- **Confusing UX**: Users see "migration required" on first run
- **False Positives**: Migration system detects upgrades where none exist
- **Potential Failures**: If migration fails, agent won't start
- **Performance**: Adds unnecessary startup delay

## Root Cause Analysis

### Current Logic Flow (Broken)

1. **Installer creates config**: `/etc/redflag/agent/config.json` with:
   ```json
   {
     "agent_id": "",
     "registration_token": "...",
     // ... other fields but NO subsystems, NO machine_id
   }
   ```

2. **Agent starts** → `main.go:209` calls `DetectMigrationRequirements()`

3. **Detection sees**: Config file exists → version is 0 → missing security features

4. **Migration adds**: `subsystems` configuration → updates version

5. **THEN registration check runs** → agent_id is empty → fails
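
The false positive in steps 1-3 can be reproduced in isolation. The sketch below is not the project's `DetectMigrationRequirements()`; it is a minimal stand-in that assumes the pre-fix detection looks only at the config version and at whether `subsystems` and `machine_id` are present. The function name, the `config_version` key, and the token value are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// detectWithoutRegistrationCheck is a hypothetical stand-in for the pre-fix
// detection: it never looks at agent_id, only at the version and at the
// presence of the security-related fields.
func detectWithoutRegistrationCheck(configData []byte) []string {
	var migrations []string
	var config map[string]interface{}
	if err := json.Unmarshal(configData, &config); err != nil {
		return migrations
	}

	// "config_version" is an assumed key name; a missing key yields 0.
	version, _ := config["config_version"].(float64)
	_, hasSubsystems := config["subsystems"]
	_, hasMachineID := config["machine_id"]

	if version == 0 && (!hasSubsystems || !hasMachineID) {
		migrations = append(migrations, "security_hardening")
	}
	return migrations
}

func main() {
	// Minimal config as written by a fresh install (token value is a placeholder).
	fresh := []byte(`{"agent_id": "", "registration_token": "example-token"}`)
	fmt.Println(detectWithoutRegistrationCheck(fresh)) // prints [security_hardening]
}
```

Under these assumptions, an unregistered, freshly installed agent is indistinguishable from a legacy agent that predates the security fields, which is why the migration gets scheduled.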

### The Fundamental Flaw

**Migration should ONLY run for actual upgrades, NEVER for fresh installs.**

The pre-fix code checks:
- ✅ Config file exists? → Yes (fresh install creates it)
- ❌ Agent is registered? → Not checked!

## Solution Implemented

**Added early return in `determineRequiredMigrations()` to skip migration for fresh installs:**

```go
// NEW: Check if this is a fresh install (config exists but agent_id is empty)
if configData, err := os.ReadFile(configPath); err == nil {
	var config map[string]interface{}
	if json.Unmarshal(configData, &config) == nil {
		if agentID, ok := config["agent_id"].(string); !ok || agentID == "" {
			// Fresh install - no migrations needed
			return migrations
		}
	}
}
```

**Location:** `aggregator-agent/internal/migration/detection.go` lines 290-299

### How It Works

1. **Fresh install**: Config has an empty (or missing) `agent_id` → skip all migrations
2. **Registered agent**: Config has a valid `agent_id` → proceed with migration detection
3. **Legacy upgrade**: Config has an `agent_id` but an old version → migration runs normally
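
As a rough illustration of these cases, the guard can be factored into a small predicate. `isFreshInstall` is a hypothetical helper, not a function from `detection.go`, and the sample agent ID and token are made up. It mirrors the condition in the snippet above: a config that fails to parse is not treated as fresh, so normal detection would still run for it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isFreshInstall reports whether configData parses as JSON and carries no
// usable agent_id (missing, not a string, or empty). Hypothetical helper.
func isFreshInstall(configData []byte) bool {
	var config map[string]interface{}
	if err := json.Unmarshal(configData, &config); err != nil {
		// Unparsable config: do not short-circuit, let normal detection decide.
		return false
	}
	agentID, ok := config["agent_id"].(string)
	return !ok || agentID == ""
}

func main() {
	fmt.Println("empty agent_id:", isFreshInstall([]byte(`{"agent_id": ""}`)))
	fmt.Println("missing agent_id:", isFreshInstall([]byte(`{"registration_token": "example-token"}`)))
	fmt.Println("registered:", isFreshInstall([]byte(`{"agent_id": "agent-123"}`)))
}
```

The first two calls report `true` and the third reports `false`, so only a config with a non-empty `agent_id` reaches the migration checks.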

## Files Modified

- `aggregator-agent/internal/migration/detection.go`
  - Added fresh install detection (lines 290-299)
  - No other changes needed

## Verification

**Testing fresh install:**
1. Install agent on clean system
2. Start service: `sudo systemctl start redflag-agent`
3. Check logs: `sudo journalctl -u redflag-agent -f`
4. **Should NOT see**: "Migration detected" or "Agent may not function correctly until migration"
5. **Should see only**: "Agent not registered" (if not registered yet)

**Testing upgrade:**
1. Install older version (v0.1.18 if available)
2. Register agent
3. Upgrade to current version
4. **Should see**: Migration running normally

## Impact

- **Fixed:** Fresh installs no longer trigger a false migration
- **Why P0:** Confusing UX, potential for migration failures on first run
- **Performance:** Faster agent startup for new installations
- **Reliability:** Prevents migration failures from blocking new users

---

**Note:** This fix prevents false positives while preserving legitimate migration for actual upgrades. The logic is simple: if `agent_id` is empty, the install is treated as fresh and migration is skipped.