# P0-009: Storage Scanner Reports Disk Info to Updates Table Instead of System Info
**Priority:** P0 (Critical - Data Architecture Issue)
**Date Identified:** 2025-12-17
**Date Created:** 2025-12-17
**Status:** Open (Investigation Complete, Fix Needed)
**Created By:** Casey & Claude
## Problem Description
The storage scanner (Disk Usage Reporter) is reporting disk/partition information to the **updates** table instead of populating **system_info**. This causes:
- Disk information not appearing in Agent Storage UI tab
- Disk data stored in wrong table (treated as updatable packages)
- Hourly delay for disk info to appear (waiting for system info report)
- Inappropriate severity tracking for static system information
## Current Behavior
**Agent Side:**
- Storage scanner runs and collects detailed disk info (mountpoints, devices, types, filesystems)
- Converts to `UpdateReportItem` format with severity levels
- Reports via `/api/v1/agents/:id/updates` endpoint
- Stored in `update_packages` table with package_type='storage'
**Server Side:**
- Storage metrics endpoint reads from `system_info` structure (not updates table)
- UI expects `agent.system_info.disk_info` array
- Disk data is in the wrong place, so the UI shows an empty Storage tab
## Root Cause
**In `aggregator-agent/internal/orchestrator/storage_scanner.go`:**
```go
func (s *StorageScanner) Scan() ([]client.UpdateReportItem, error) {
	// Converts StorageMetric to UpdateReportItem
	// Stores disk info in updates table, not system_info
}
```
**The storage scanner implements legacy interface:**
- Uses `Scan()` → `UpdateReportItem`
- Should use `ScanTyped()` → `TypedScannerResult` with `StorageData`
- System info reporters (system, filesystem) already use proper interface
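A minimal sketch of what the typed path could look like. The struct fields, the `diskCollector` hook, and the `"storage"` scanner name are assumptions made for illustration; only `StorageScanner`, `ScanTyped()`, `TypedScannerResult`, and `StorageData` are names taken from this document, and the real definitions in the agent may differ.

```go
package main

import "errors"

// DiskInfo is a hypothetical shape for one partition. The fields mirror the
// attributes listed in this document, not the real RedFlag structs.
type DiskInfo struct {
	Mountpoint string
	Device     string
	DiskType   string
	Filesystem string
	TotalBytes uint64
	UsedBytes  uint64
}

// TypedScannerResult is a simplified stand-in for the real type (assumption).
type TypedScannerResult struct {
	ScannerName string
	StorageData []DiskInfo
}

// diskCollector is a hypothetical hook so the sketch runs without touching the host.
type diskCollector func() ([]DiskInfo, error)

// StorageScanner here only carries the collector; the real struct has more state.
type StorageScanner struct {
	collect diskCollector
}

// ScanTyped returns disk data unchanged, with no conversion to update items
// and no severity attached.
func (s *StorageScanner) ScanTyped() (*TypedScannerResult, error) {
	if s.collect == nil {
		return nil, errors.New("storage scanner: no collector configured")
	}
	disks, err := s.collect()
	if err != nil {
		return nil, err
	}
	return &TypedScannerResult{ScannerName: "storage", StorageData: disks}, nil
}
```

The point of the sketch is the return type: the caller receives disk records it can forward as system info, rather than `UpdateReportItem` values that the server files as packages.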
## What I've Investigated
**Agent Code:**
- ✅ Storage scanner collects comprehensive disk data
- ✅ Data includes: mountpoints, devices, disk_type, filesystem, severity
- ❌ Reports via legacy conversion to updates
**Server Code:**
- ✅ Has `/api/v1/agents/:id/metrics/storage` endpoint (reads from system_info)
- ✅ Has proper `TypedScannerResult` infrastructure
- ❌ Never receives disk data because it's in wrong table
**Database Schema:**
- `update_packages` table stores disk info (package_type='storage')
- `agent_specs` table has `metadata` JSONB field
- No dedicated `system_info` table - it's embedded in agent response
**UI Code:**
- `AgentStorage.tsx` reads from `agent.system_info.disk_info`
- Has both overview bars AND detailed table implemented
- Works correctly when data is in right place
## What Needs to be Fixed
### Option 1: Store in AgentSpecs.metadata (Proper)
1. Modify storage scanner to return `TypedScannerResult`
2. Call `client.ReportSystemInfo()` with disk_info populated
3. Update `agent_specs.metadata` or add `system_info` column
4. Remove legacy `Scan()` method from storage scanner
**Pros:**
- Data in correct semantic location
- No hourly delay for disk info
- Aligns with system/filesystem scanners
- Works immediately with existing UI
**Cons:**
- Requires a database schema change if a `system_info` column is added (reusing `metadata` avoids this)
- Breaking change for existing disk usage report structure
- Need migration for existing storage data
### Option 2: Make Metrics READ from Updates (Workaround)
1. Keep storage scanner reporting to updates table
2. Modify `GetAgentStorageMetrics()` to read from updates
3. Transform update_packages rows into storage metrics format
**Pros:**
- No agent code changes
- Works with current data flow
- Quick fix
**Cons:**
- Semantic wrongness (system info in updates)
- Performance issues (querying updates table for system info)
- Still has severity tracking (inappropriate for static info)
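To make the workaround concrete, here is a rough sketch of the row-to-metric transformation in step 3. `UpdatePackageRow`, its fields, and the metadata keys are hypothetical; the real `update_packages` columns and the output of `GetAgentStorageMetrics()` are not shown in this document.

```go
package main

// UpdatePackageRow is a hypothetical view of one update_packages row.
type UpdatePackageRow struct {
	PackageType string
	PackageName string            // assumed to hold the mountpoint
	Metadata    map[string]string // assumed to hold device, filesystem, etc.
}

// StorageMetric is a hypothetical output shape for the storage metrics endpoint.
type StorageMetric struct {
	Mountpoint string
	Device     string
	Filesystem string
}

// storageMetricsFromUpdates keeps only rows with package_type 'storage' and
// maps them into storage metrics. Other package types are skipped.
func storageMetricsFromUpdates(rows []UpdatePackageRow) []StorageMetric {
	metrics := make([]StorageMetric, 0, len(rows))
	for _, r := range rows {
		if r.PackageType != "storage" {
			continue
		}
		metrics = append(metrics, StorageMetric{
			Mountpoint: r.PackageName,
			Device:     r.Metadata["device"],
			Filesystem: r.Metadata["filesystem"],
		})
	}
	return metrics
}
```

The filter on `PackageType` is where the semantic mismatch shows: the handler has to pick system information back out of a table meant for updatable packages.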
### Option 3: Dual Write (Temporary Bridge)
1. Storage scanner reports BOTH to system_info AND updates (for backward compatibility)
2. After migration, remove updates reporting
**Pros:**
- Backward compatible
- Smooth transition
**Cons:**
- Data duplication
- Temporary hack
- Still need Option 1 eventually
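A dual-write bridge might look roughly like the sketch below. The `reportFunc` hooks and the `legacyEnabled` switch are hypothetical and stand in for whatever client calls the agent actually uses.

```go
package main

import "fmt"

// DiskInfo is a hypothetical, trimmed-down disk record.
type DiskInfo struct {
	Mountpoint string
	Device     string
}

// reportFunc is a hypothetical hook for one reporting path.
type reportFunc func(disks []DiskInfo) error

// dualWrite reports to system_info first, then to the legacy updates path
// for as long as legacyEnabled stays true.
func dualWrite(disks []DiskInfo, toSystemInfo, toUpdates reportFunc, legacyEnabled bool) error {
	if err := toSystemInfo(disks); err != nil {
		return fmt.Errorf("report system_info: %w", err)
	}
	if !legacyEnabled {
		return nil
	}
	if err := toUpdates(disks); err != nil {
		return fmt.Errorf("report legacy updates: %w", err)
	}
	return nil
}
```

Setting `legacyEnabled` to false after the migration corresponds to step 2, at which point the function reduces to the Option 1 behavior.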
## Recommended Fix: Option 1
**Implement proper typed scanning for storage:**
1. **In `storage_scanner.go`:**
   - Remove `Scan()` legacy method
   - Implement `ScanTyped()` returning `TypedScannerResult`
   - Populate `TypedScannerResult.StorageData` with disks
2. **In metrics handler or agent check-in:**
   - When storage scanner runs, collect `TypedScannerResult`
   - Call `client.ReportSystemInfo()` with `report.SystemInfo.DiskInfo` populated
   - This updates agent's system_info in real-time
3. **In database:**
   - Add `system_info JSONB` column to agent_specs table
   - Or reuse existing metadata field
4. **In UI:**
   - No changes needed (already reads from system_info)
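As a sketch of step 2, the payload below nests `disk_info` under `system_info`, matching the `agent.system_info.disk_info` path the UI reads. The struct definitions, the JSON keys inside each disk entry, and `buildSystemInfoReport` are assumptions; the real request type accepted by `client.ReportSystemInfo()` is not shown in this document.

```go
package main

import "encoding/json"

// DiskInfo uses JSON keys assumed from the field list in this document.
type DiskInfo struct {
	Mountpoint string `json:"mountpoint"`
	Device     string `json:"device"`
	DiskType   string `json:"disk_type"`
	Filesystem string `json:"filesystem"`
}

// SystemInfo is a hypothetical subset of the system info payload.
type SystemInfo struct {
	DiskInfo []DiskInfo `json:"disk_info"`
}

// SystemInfoReport is a hypothetical request body.
type SystemInfoReport struct {
	SystemInfo SystemInfo `json:"system_info"`
}

// buildSystemInfoReport encodes the disks from a typed storage scan.
// A nil slice is replaced with an empty one so the UI receives [] and not null.
func buildSystemInfoReport(disks []DiskInfo) ([]byte, error) {
	if disks == nil {
		disks = []DiskInfo{}
	}
	return json.Marshal(SystemInfoReport{SystemInfo: SystemInfo{DiskInfo: disks}})
}
```

Encoding an empty array instead of `null` is a small design choice: it keeps the UI's array handling simple when a scan finds no disks.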
## Files to Modify
**Agent:**
- `/home/casey/Projects/RedFlag/aggregator-agent/internal/orchestrator/storage_scanner.go`
**Server:**
- `/home/casey/Projects/RedFlag/aggregator-server/internal/api/handlers/metrics.go`
- Database schema (add system_info column)
**Migration:**
- Create migration to add system_info column
- Optional: migrate existing storage rows (package_type='storage') from `update_packages` to system_info
## Testing After Fix
1. Install agent with fixed storage scanner
2. Navigate to Agent → Storage tab
3. Should immediately see:
   - Overview disk usage bars
   - Detailed partition table with all disks
   - Device names, types, filesystems, mountpoints
   - Severity indicators
4. No waiting for hourly system info report
5. Data should persist correctly
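The manual checks above could be backed by an automated one. The sketch below validates an agent response body, assuming it is JSON with a `system_info.disk_info` array whose entries have a `mountpoint` key. `checkDiskInfo` and the response struct are hypothetical helpers, not part of the existing codebase.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// agentResponse is a hypothetical subset of the agent API response.
type agentResponse struct {
	SystemInfo struct {
		DiskInfo []struct {
			Mountpoint string `json:"mountpoint"`
			Device     string `json:"device"`
		} `json:"disk_info"`
	} `json:"system_info"`
}

// checkDiskInfo returns an error if the response carries no usable disk data.
func checkDiskInfo(body []byte) error {
	var a agentResponse
	if err := json.Unmarshal(body, &a); err != nil {
		return fmt.Errorf("decode agent response: %w", err)
	}
	if len(a.SystemInfo.DiskInfo) == 0 {
		return errors.New("system_info.disk_info is empty")
	}
	for i, d := range a.SystemInfo.DiskInfo {
		if d.Mountpoint == "" {
			return fmt.Errorf("disk_info[%d] has no mountpoint", i)
		}
	}
	return nil
}
```

Running this against the agent response right after a storage scan, without waiting for the hourly report, would cover steps 3 and 4.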
## Related Issues
- P0-007: Install script path variables (fixed)
- P0-008: Migration false positives (fixed)
- P0-009: This issue (storage scanner wrong table)
## Notes for Implementer
- Look at how `system` scanner implements `ScanTyped()` for reference
- The agent already has `reportSystemInfo()` method - just need to populate disk_info
- Storage scanner is the ONLY scanner still using legacy Scan() interface
- Remove legacy Scan() method entirely once ScanTyped() is implemented