# RedFlag Subsystem Architecture Refactor - Changes Made November 4, 2025

## 🎯 **MISSION ACCOMPLISHED**

Completed a refactor of the subsystem scanning architecture to fix stuck `scan_results` operations and incorrect data classification.

---

## 🚨 **PROBLEMS FIXED**

### 1. **Stuck scan_results Operations**
- **Issue**: Operations stuck in "sent" status for 96+ minutes
- **Root Cause**: Monolithic `scan_updates` approach causing system-wide failures
- **Solution**: Replaced with individual subsystem scans (storage, system, docker)

### 2. **Incorrect Data Classification**
- **Issue**: Storage/system metrics appearing as "Updates" in the UI
- **Root Cause**: All subsystems incorrectly calling `ReportUpdates()` endpoint
- **Solution**: Created separate API endpoints: `ReportMetrics()` and `ReportDockerImages()`

---

## 📁 **FILES MODIFIED**

### **Server API Handlers**
- ✅ `aggregator-server/internal/api/handlers/metrics.go` - **CREATED**
  - `MetricsHandler` struct
  - `ReportMetrics()` endpoint (POST `/api/v1/agents/:id/metrics`)
  - `GetAgentMetrics()` endpoint (GET `/api/v1/agents/:id/metrics`)
  - `GetAgentStorageMetrics()` endpoint (GET `/api/v1/agents/:id/metrics/storage`)
  - `GetAgentSystemMetrics()` endpoint (GET `/api/v1/agents/:id/metrics/system`)

- ✅ `aggregator-server/internal/api/handlers/docker_reports.go` - **CREATED**
  - `DockerReportsHandler` struct
  - `ReportDockerImages()` endpoint (POST `/api/v1/agents/:id/docker-images`)
  - `GetAgentDockerImages()` endpoint (GET `/api/v1/agents/:id/docker-images`)
  - `GetAgentDockerInfo()` endpoint (GET `/api/v1/agents/:id/docker-info`)

- ✅ `aggregator-server/internal/api/handlers/agents.go` - **MODIFIED**
  - Fixed unused variable error (line 1153): Changed `agent, err :=` to `_, err =`

### **Data Models**
- ✅ `aggregator-server/internal/models/metrics.go` - **CREATED**
  ```go
  type MetricsReportRequest struct {
  	CommandID string    `json:"command_id"`
  	Timestamp time.Time `json:"timestamp"`
  	Metrics   []Metric  `json:"metrics"`
  }

  type Metric struct {
  	PackageType      string            `json:"package_type"`
  	PackageName      string            `json:"package_name"`
  	CurrentVersion   string            `json:"current_version"`
  	AvailableVersion string            `json:"available_version"`
  	Severity         string            `json:"severity"`
  	RepositorySource string            `json:"repository_source"`
  	Metadata         map[string]string `json:"metadata"`
  }
  ```

- ✅ `aggregator-server/internal/models/docker.go` - **MODIFIED**
  - Added `AgentDockerImage` struct
  - Added `DockerReportRequest` struct
  - Added `DockerImageInfo` struct
  - Added `StoredDockerImage` struct
  - Added `DockerFilter` and `DockerResult` structs

### **Database Queries**
- ✅ `aggregator-server/internal/database/queries/metrics.go` - **CREATED**
  - `MetricsQueries` struct
  - `CreateMetricsEventsBatch()` method
  - `GetMetrics()` method with filtering
  - `GetMetricsByAgentID()` method
  - `GetLatestMetricsByType()` method
  - `DeleteOldMetrics()` method

- ✅ `aggregator-server/internal/database/queries/docker.go` - **CREATED**
  - `DockerQueries` struct
  - `CreateDockerEventsBatch()` method
  - `GetDockerImages()` method with filtering
  - `GetDockerImagesByAgentID()` method
  - `GetDockerImagesWithUpdates()` method
  - `DeleteOldDockerImages()` method
  - `GetDockerStats()` method

### **Database Migration**
- ✅ `aggregator-server/internal/database/migrations/018_create_metrics_and_docker_tables.up.sql` - **CREATED**
  ```sql
  CREATE TABLE metrics (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      agent_id UUID NOT NULL,
      package_type VARCHAR(50) NOT NULL,
      package_name VARCHAR(255) NOT NULL,
      current_version VARCHAR(255),
      available_version VARCHAR(255),
      severity VARCHAR(20),
      repository_source TEXT,
      metadata JSONB DEFAULT '{}',
      event_type VARCHAR(50) DEFAULT 'discovered',
      created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
  );

  CREATE TABLE docker_images (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      agent_id UUID NOT NULL,
      package_type VARCHAR(50) NOT NULL,
      package_name VARCHAR(255) NOT NULL,
      current_version VARCHAR(255),
      available_version VARCHAR(255),
      severity VARCHAR(20),
      repository_source TEXT,
      metadata JSONB DEFAULT '{}',
      event_type VARCHAR(50) DEFAULT 'discovered',
      created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
  );
  ```

- ✅ `aggregator-server/internal/database/migrations/018_create_metrics_and_docker_tables.down.sql` - **CREATED**
  - Rollback scripts for both tables

### **Agent Architecture**
- ✅ `aggregator-agent/internal/orchestrator/scanner_types.go` - **CREATED**
  ```go
  type StorageScanner interface {
  	ScanStorage() ([]Metric, error)
  }

  type SystemScanner interface {
  	ScanSystem() ([]Metric, error)
  }

  type DockerScanner interface {
  	ScanDocker() ([]DockerImage, error)
  }
  ```

- ✅ `aggregator-agent/internal/orchestrator/storage_scanner.go` - **MODIFIED**
  - Fixed type conversion: `int64(disk.Total)` instead of `disk.Total`
  - Updated to return `[]Metric` instead of `[]UpdateReportItem`
  - Added proper timestamp handling

- ✅ `aggregator-agent/internal/orchestrator/system_scanner.go` - **MODIFIED**
  - Updated to return `[]Metric` instead of `[]UpdateReportItem`
  - Fixed data type conversions

- ✅ `aggregator-agent/internal/orchestrator/docker_scanner.go` - **CREATED**
  - Complete Docker scanner implementation
  - Returns `[]DockerImage` with proper metadata
  - Handles image creation time parsing

- ✅ `aggregator-agent/cmd/agent/subsystem_handlers.go` - **MODIFIED**
  - **Storage Handler**: Now calls `ScanStorage()` → `ReportMetrics()`
  - **System Handler**: Now calls `ScanSystem()` → `ReportMetrics()`
  - **Docker Handler**: Now calls `ScanDocker()` → `ReportDockerImages()`

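The `ScanStorage()` → `ReportMetrics()` flow in the storage handler can be sketched roughly as below. This is not the code in `subsystem_handlers.go`: `Metric` is trimmed to two fields, and `MetricsReporter`, `handleScanStorage`, `fakeScanner`, and `recordingReporter` are hypothetical names introduced only to show the shape of the call chain.

```go
package main

import (
	"errors"
	"fmt"
)

// Metric is trimmed to two fields for the sketch.
type Metric struct {
	PackageType string
	PackageName string
}

// StorageScanner matches the interface listed in scanner_types.go.
type StorageScanner interface {
	ScanStorage() ([]Metric, error)
}

// MetricsReporter is a hypothetical stand-in for the agent client's
// ReportMetrics() method; the real signature may differ.
type MetricsReporter interface {
	ReportMetrics(metrics []Metric) error
}

// handleScanStorage scans first, then reports through the metrics path.
// Nothing is reported if the scan itself fails.
func handleScanStorage(s StorageScanner, r MetricsReporter) error {
	metrics, err := s.ScanStorage()
	if err != nil {
		return fmt.Errorf("storage scan failed: %w", err)
	}
	if err := r.ReportMetrics(metrics); err != nil {
		return fmt.Errorf("report metrics failed: %w", err)
	}
	return nil
}

// fakeScanner returns one fixed metric, or an error when fail is set.
type fakeScanner struct{ fail bool }

func (f fakeScanner) ScanStorage() ([]Metric, error) {
	if f.fail {
		return nil, errors.New("disk unavailable")
	}
	return []Metric{{PackageType: "storage", PackageName: "/"}}, nil
}

// recordingReporter keeps what it was asked to report.
type recordingReporter struct{ got []Metric }

func (r *recordingReporter) ReportMetrics(m []Metric) error {
	r.got = append(r.got, m...)
	return nil
}

func main() {
	rep := &recordingReporter{}
	if err := handleScanStorage(fakeScanner{}, rep); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("reported %d metric(s)\n", len(rep.got))
}
```

The system and Docker handlers would follow the same pattern with `ScanSystem()` and `ScanDocker()`, the latter reporting through `ReportDockerImages()` instead.
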
### **Agent Client**
- ✅ `aggregator-agent/internal/client/client.go` - **MODIFIED**
  - Added `ReportMetrics()` method
  - Added `ReportDockerImages()` method

### **Server Router**
- ✅ `aggregator-server/cmd/server/main.go` - **MODIFIED**
  - Fixed database type passing: `db.DB.DB` instead of `db.DB` for new queries
  - Added new handler initializations:
    ```go
    metricsQueries := queries.NewMetricsQueries(db.DB.DB)
    dockerQueries := queries.NewDockerQueries(db.DB.DB)
    ```

### **Documentation**
- ✅ `REDFLAG_REFACTOR_PLAN.md` - **CREATED**
  - Comprehensive refactor plan documenting all phases
  - Existing infrastructure analysis and reuse strategies
  - Code examples for agent, server, and UI changes

---

## 🔧 **COMPILATION FIXES**

### **UUID Conversion Issues**
- Converted `image.ID` and `image.AgentID` from UUID to string using `.String()`

### **Database Type Mismatches**
- Fixed `*sqlx.DB` vs `*sql.DB` type mismatch by accessing the underlying database handle: `db.DB.DB`

### **Duplicate Function Declarations**
- Removed duplicate `extractTag`, `parseImageSize`, `extractLabels` functions

### **Unused Imports**
- Removed unused `"log"` import from metrics.go
- Removed unused `"github.com/jmoiron/sqlx"` import after type fix

### **Type Conversion Errors**
- Fixed `uint64` to `int64` conversions in storage scanner
- Fixed image creation time string handling in docker scanner

---

## 🎯 **API ENDPOINTS ADDED**

### Metrics Endpoints
- `POST /api/v1/agents/:id/metrics` - Report metrics from agent
- `GET /api/v1/agents/:id/metrics` - Get agent metrics with filtering
- `GET /api/v1/agents/:id/metrics/storage` - Get agent storage metrics
- `GET /api/v1/agents/:id/metrics/system` - Get agent system metrics

### Docker Endpoints
- `POST /api/v1/agents/:id/docker-images` - Report Docker images from agent
- `GET /api/v1/agents/:id/docker-images` - Get agent Docker images with filtering
- `GET /api/v1/agents/:id/docker-info` - Get detailed Docker information for agent

---

## 🗄️ **DATABASE SCHEMA CHANGES**

### New Tables Created
1. **metrics** - Stores storage and system metrics
2. **docker_images** - Stores Docker image information

### Indexes Added
- Agent ID indexes on both tables
- Package type indexes
- Created timestamp indexes
- Composite unique constraints for duplicate prevention

---

## ✅ **SUCCESS METRICS**

### Build Success
- ✅ Docker build completed without errors
- ✅ All compilation issues resolved
- ✅ Server container started successfully

### Database Success
- ✅ Migration 018 executed successfully
- ✅ New tables created with proper schema
- ✅ All existing migrations preserved

### Runtime Success
- ✅ Server listening on port 8080
- ✅ All new API routes registered
- ✅ Agent connectivity maintained
- ✅ Existing functionality preserved

---

## 🚀 **WHAT THIS ACHIEVES**

### Proper Data Classification
- **Storage metrics** → `metrics` table
- **System metrics** → `metrics` table
- **Docker images** → `docker_images` table
- **Package updates** → `update_events` table (existing)

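The mapping above fits in a single lookup. A minimal sketch, assuming subsystem keys `"storage"`, `"system"`, `"docker"`, and `"updates"` (the key strings and the function name `tableFor` are hypothetical; only the table names come from this document):

```go
package main

import "fmt"

// tableFor returns the table that stores a subsystem's scan results,
// following the classification listed above.
func tableFor(subsystem string) (string, error) {
	switch subsystem {
	case "storage", "system":
		return "metrics", nil
	case "docker":
		return "docker_images", nil
	case "updates":
		return "update_events", nil
	default:
		return "", fmt.Errorf("unknown subsystem %q", subsystem)
	}
}

func main() {
	for _, s := range []string{"storage", "system", "docker", "updates"} {
		table, err := tableFor(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", s, table)
	}
}
```

Rejecting unknown keys keeps a new subsystem from silently landing in the wrong table, which is the class of bug this refactor set out to remove.
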
### No More Stuck Operations
- Individual subsystem scans prevent monolithic failures
- Each subsystem operates independently
- Error isolation between subsystems

### Scalable Architecture
- Each subsystem can be independently scanned
- Proper separation of concerns
- Maintains existing security patterns

### Infrastructure Reuse
- Leverages existing Agent page UI components
- Reuses existing heartbeat and status systems
- Maintains existing authentication and validation patterns

---

## 🎉 **DEPLOYMENT STATUS**

**COMPLETE** - November 4, 2025 at 14:04 UTC

- ✅ All code changes implemented
- ✅ Database migration executed
- ✅ Server built and deployed
- ✅ API endpoints functional
- ✅ Agent connectivity verified
- ✅ Data classification fix operational

**The RedFlag subsystem scanning architecture refactor is now complete and successfully deployed!** 🎯