# P0-009: Storage Scanner Reports Disk Info to Updates Table Instead of System Info

**Priority:** P0 (Critical - Data Architecture Issue)
**Date Identified:** 2025-12-17
**Date Created:** 2025-12-17
**Status:** Open (Investigation Complete, Fix Needed)
**Created By:** Casey & Claude

## Problem Description

The storage scanner (Disk Usage Reporter) is reporting disk/partition information to the **updates** table instead of populating **system_info**. This causes:
- Disk information not appearing in Agent Storage UI tab
- Disk data stored in wrong table (treated as updatable packages)
- Hourly delay for disk info to appear (waiting for system info report)
- Inappropriate severity tracking for static system information

## Current Behavior

**Agent Side:**
- Storage scanner runs and collects detailed disk info (mountpoints, devices, types, filesystems)
- Converts to `UpdateReportItem` format with severity levels
- Reports via `/api/v1/agents/:id/updates` endpoint
- Stored in `update_packages` table with `package_type='storage'`

**Server Side:**
- Storage metrics endpoint reads from `system_info` structure (not updates table)
- UI expects `agent.system_info.disk_info` array
- Disk data is in the wrong place, so the UI shows no disks

## Root Cause

**In `aggregator-agent/internal/orchestrator/storage_scanner.go`:**
```go
func (s *StorageScanner) Scan() ([]client.UpdateReportItem, error) {
    // Converts StorageMetric to UpdateReportItem
    // Stores disk info in updates table, not system_info
}
```

**The storage scanner implements the legacy interface:**
- Uses `Scan()` → `UpdateReportItem`
- Should use `ScanTyped()` → `TypedScannerResult` with `StorageData`
- System info reporters (system, filesystem) already use the proper interface

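The difference between the two interfaces can be sketched as follows. This is a minimal, self-contained illustration, not the agent's actual code: the struct fields, the canned `metrics` slice, and the 90% severity threshold are all assumptions made for the example; only the names `StorageScanner`, `Scan()`, `ScanTyped()`, `UpdateReportItem`, `TypedScannerResult`, `StorageData` and `StorageMetric` come from this issue.

```go
package main

import "fmt"

// StorageMetric is a hypothetical per-disk record; field names are assumptions.
type StorageMetric struct {
	Mountpoint  string
	Device      string
	DiskType    string
	Filesystem  string
	UsedPercent float64
}

// UpdateReportItem stands in for client.UpdateReportItem (fields assumed).
type UpdateReportItem struct {
	PackageType string
	PackageName string
	Severity    string
}

// TypedScannerResult stands in for the typed result; only StorageData is modeled.
type TypedScannerResult struct {
	ScannerName string
	StorageData []StorageMetric
}

// StorageScanner holds canned metrics so the sketch runs without reading real disks.
type StorageScanner struct {
	metrics []StorageMetric
}

// Scan mirrors the legacy path: each disk is flattened into a package-style
// item, and a severity is derived from usage (the 90% threshold is made up).
func (s *StorageScanner) Scan() ([]UpdateReportItem, error) {
	items := make([]UpdateReportItem, 0, len(s.metrics))
	for _, m := range s.metrics {
		severity := "low"
		if m.UsedPercent >= 90 {
			severity = "critical"
		}
		items = append(items, UpdateReportItem{
			PackageType: "storage",
			PackageName: m.Mountpoint,
			Severity:    severity,
		})
	}
	return items, nil
}

// ScanTyped mirrors the typed path: disk records are returned as a copy,
// with no conversion to update items.
func (s *StorageScanner) ScanTyped() (*TypedScannerResult, error) {
	disks := make([]StorageMetric, len(s.metrics))
	copy(disks, s.metrics)
	return &TypedScannerResult{ScannerName: "storage", StorageData: disks}, nil
}

func main() {
	s := &StorageScanner{metrics: []StorageMetric{
		{Mountpoint: "/", Device: "/dev/sda1", DiskType: "ssd", Filesystem: "ext4", UsedPercent: 95},
	}}
	legacy, _ := s.Scan()
	typed, _ := s.ScanTyped()
	fmt.Printf("legacy: %+v\n", legacy[0])
	fmt.Printf("typed:  %+v\n", typed.StorageData[0])
}
```

In the legacy shape the device, disk type and filesystem have nowhere natural to live, which is part of why the data ends up looking like a package record.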
## What I've Investigated

**Agent Code:**
- ✅ Storage scanner collects comprehensive disk data
- ✅ Data includes: mountpoints, devices, disk_type, filesystem, severity
- ❌ Reports via legacy conversion to updates

**Server Code:**
- ✅ Has `/api/v1/agents/:id/metrics/storage` endpoint (reads from system_info)
- ✅ Has proper `TypedScannerResult` infrastructure
- ❌ Never receives disk data because it's in the wrong table

**Database Schema:**
- `update_packages` table stores disk info (`package_type='storage'`)
- `agent_specs` table has `metadata` JSONB field
- No dedicated `system_info` table - it's embedded in the agent response

**UI Code:**
- `AgentStorage.tsx` reads from `agent.system_info.disk_info`
- Has both overview bars AND detailed table implemented
- Works correctly when data is in the right place

## What Needs to be Fixed

### Option 1: Store in AgentSpecs.metadata (Proper)
1. Modify storage scanner to return `TypedScannerResult`
2. Call `client.ReportSystemInfo()` with disk_info populated
3. Update `agent_specs.metadata` or add `system_info` column
4. Remove legacy `Scan()` method from storage scanner

**Pros:**
- Data in correct semantic location
- No hourly delay for disk info
- Aligns with system/filesystem scanners
- Works immediately with existing UI

**Cons:**
- Requires database schema change (add system_info column or use metadata)
- Breaking change for existing disk usage report structure
- Need migration for existing storage data

### Option 2: Make Metrics READ from Updates (Workaround)
1. Keep storage scanner reporting to updates table
2. Modify `GetAgentStorageMetrics()` to read from updates
3. Transform `update_packages` rows into storage metrics format

**Pros:**
- No agent code changes
- Works with current data flow
- Quick fix

**Cons:**
- Semantically wrong (system info stored in updates)
- Performance issues (querying updates table for system info)
- Still has severity tracking (inappropriate for static info)

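Step 3 of this option (transforming `update_packages` rows into storage metrics) might look roughly like the sketch below. The row shape is hypothetical: this issue only confirms that storage rows carry `package_type='storage'`, so the assumption that the mountpoint lives in the package name and that the remaining fields live in a string metadata map is purely illustrative, as is the helper name `disksFromUpdateRows`.

```go
package main

import "fmt"

// UpdatePackageRow is a hypothetical projection of an update_packages row.
type UpdatePackageRow struct {
	PackageType string
	PackageName string            // assumed to hold the mountpoint
	Metadata    map[string]string // assumed to hold device, disk_type, filesystem
}

// DiskInfo is an assumed shape for one entry of system_info.disk_info.
type DiskInfo struct {
	Mountpoint string `json:"mountpoint"`
	Device     string `json:"device"`
	DiskType   string `json:"disk_type"`
	Filesystem string `json:"filesystem"`
}

// disksFromUpdateRows keeps only storage rows and reshapes them into DiskInfo
// entries. Missing metadata keys simply yield empty strings.
func disksFromUpdateRows(rows []UpdatePackageRow) []DiskInfo {
	disks := make([]DiskInfo, 0, len(rows))
	for _, r := range rows {
		if r.PackageType != "storage" {
			continue // real packages (apt, dnf, ...) are not disks
		}
		disks = append(disks, DiskInfo{
			Mountpoint: r.PackageName,
			Device:     r.Metadata["device"],
			DiskType:   r.Metadata["disk_type"],
			Filesystem: r.Metadata["filesystem"],
		})
	}
	return disks
}

func main() {
	rows := []UpdatePackageRow{
		{PackageType: "apt", PackageName: "curl"},
		{PackageType: "storage", PackageName: "/", Metadata: map[string]string{
			"device": "/dev/sda1", "disk_type": "ssd", "filesystem": "ext4",
		}},
	}
	for _, d := range disksFromUpdateRows(rows) {
		fmt.Printf("%+v\n", d)
	}
}
```

The filter on every read illustrates the performance and semantic cost listed under Cons: the metrics handler has to sift package rows to find disks.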
### Option 3: Dual Write (Temporary Bridge)
1. Storage scanner reports BOTH to system_info AND updates (for backward compatibility)
2. After migration, remove updates reporting

**Pros:**
- Backward compatible
- Smooth transition

**Cons:**
- Data duplication
- Temporary hack
- Still need Option 1 eventually

## Recommended Fix: Option 1

**Implement proper typed scanning for storage:**

1. **In `storage_scanner.go`:**
   - Remove `Scan()` legacy method
   - Implement `ScanTyped()` returning `TypedScannerResult`
   - Populate `TypedScannerResult.StorageData` with disks

2. **In metrics handler or agent check-in:**
   - When storage scanner runs, collect `TypedScannerResult`
   - Call `client.ReportSystemInfo()` with `report.SystemInfo.DiskInfo` populated
   - This updates agent's system_info in real-time

3. **In database:**
   - Add `system_info JSONB` column to agent_specs table
   - Or reuse existing metadata field

4. **In UI:**
   - No changes needed (already reads from system_info)

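Step 2 above could be wired up along these lines. This is a sketch under stated assumptions, not the agent's real API: the `SystemInfoReport` and `DiskInfo` shapes, the JSON tags, the `systemInfoReporter` interface, the `reportStorage` helper and the `recordingClient` test double are all hypothetical. The only things taken from this issue are the names `TypedScannerResult`, `StorageData`, `ReportSystemInfo` and the `report.SystemInfo.DiskInfo` path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DiskInfo is an assumed shape for one disk entry.
type DiskInfo struct {
	Mountpoint string `json:"mountpoint"`
	Device     string `json:"device"`
	DiskType   string `json:"disk_type"`
	Filesystem string `json:"filesystem"`
}

// TypedScannerResult models only the storage part of the typed result.
type TypedScannerResult struct {
	StorageData []DiskInfo
}

// SystemInfo and SystemInfoReport are assumed payload shapes whose JSON
// matches the agent.system_info.disk_info path the UI reads.
type SystemInfo struct {
	DiskInfo []DiskInfo `json:"disk_info"`
}

type SystemInfoReport struct {
	SystemInfo SystemInfo `json:"system_info"`
}

// systemInfoReporter abstracts the client call so the sketch needs no server.
type systemInfoReporter interface {
	ReportSystemInfo(report SystemInfoReport) error
}

// reportStorage copies the scanner's disks into the system info report and
// sends it, instead of converting them to update items.
func reportStorage(c systemInfoReporter, res *TypedScannerResult) error {
	if res == nil {
		return fmt.Errorf("storage scan returned no result")
	}
	var report SystemInfoReport
	report.SystemInfo.DiskInfo = res.StorageData
	return c.ReportSystemInfo(report)
}

// recordingClient is a test double that keeps the JSON it would have sent.
type recordingClient struct{ payload []byte }

func (r *recordingClient) ReportSystemInfo(report SystemInfoReport) error {
	b, err := json.Marshal(report)
	if err != nil {
		return err
	}
	r.payload = b
	return nil
}

func main() {
	c := &recordingClient{}
	res := &TypedScannerResult{StorageData: []DiskInfo{
		{Mountpoint: "/", Device: "/dev/sda1", DiskType: "ssd", Filesystem: "ext4"},
	}}
	if err := reportStorage(c, res); err != nil {
		fmt.Println("report failed:", err)
		return
	}
	fmt.Println(string(c.payload))
}
```

Keeping the reporter behind a small interface is only a convenience for testing the wiring; the real agent may already expose something equivalent.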
## Files to Modify

**Agent:**
- `/home/casey/Projects/RedFlag/aggregator-agent/internal/orchestrator/storage_scanner.go`

**Server:**
- `/home/casey/Projects/RedFlag/aggregator-server/internal/api/handlers/metrics.go`
- Database schema (add system_info column)

**Migration:**
- Create migration to add system_info column
- Optional: migrate existing storage data from `update_packages` to system_info

## Testing After Fix

1. Install agent with fixed storage scanner
2. Navigate to Agent → Storage tab
3. Should immediately see:
   - Overview disk usage bars
   - Detailed partition table with all disks
   - Device names, types, filesystems, mountpoints
   - Severity indicators
4. No waiting for hourly system info report
5. Data should persist correctly

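The manual check above could be backed by a small automated one that inspects the agent JSON the UI consumes. The sketch below assumes only that the payload exposes disks at `system_info.disk_info`, as `AgentStorage.tsx` expects; the `agentResponse` type and `diskCount` helper are hypothetical, and fetching the payload from the server is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// agentResponse models only the part of the agent payload the Storage tab reads.
type agentResponse struct {
	SystemInfo struct {
		DiskInfo []map[string]any `json:"disk_info"`
	} `json:"system_info"`
}

// diskCount returns how many disk_info entries an agent JSON payload carries.
// A count of zero after a storage scan would indicate the bug is still present.
func diskCount(body []byte) (int, error) {
	var resp agentResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, fmt.Errorf("decode agent response: %w", err)
	}
	return len(resp.SystemInfo.DiskInfo), nil
}

func main() {
	// Sample payload written by hand for the example, not captured from a server.
	sample := []byte(`{"system_info":{"disk_info":[{"mountpoint":"/","device":"/dev/sda1"}]}}`)
	n, err := diskCount(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("disks reported:", n)
}
```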
## Related Issues

- P0-007: Install script path variables (fixed)
- P0-008: Migration false positives (fixed)
- P0-009: This issue (storage scanner wrong table)

## Notes for Implementer

- Look at how the `system` scanner implements `ScanTyped()` for reference
- The agent already has a `reportSystemInfo()` method - it just needs `disk_info` populated
- Storage scanner is the ONLY scanner still using the legacy `Scan()` interface
- Remove the legacy `Scan()` method entirely once `ScanTyped()` is implemented