Add docs and project files - force for Culurien

This commit is contained in:
Fimeg
2026-03-28 20:46:24 -04:00
parent dc61797423
commit 484a7f77ce
343 changed files with 119530 additions and 0 deletions

View File

@@ -0,0 +1,378 @@
# P4-003: Agent File Management and Migration System
**Priority:** P4 (Technical Debt)
**Source Reference:** From analysis of needsfixingbeforepush.md lines 1477-1517 and DEVELOPMENT_TODOS.md lines 1611-1635
**Date Identified:** 2025-11-12
## Problem Description
Agent has no validation that working files belong to current agent binary/version. Stale files from previous agent installations interfere with current operations, causing timeout issues and data corruption. Mixed directory naming creates confusion and maintenance issues.
## Impact
- **Data Corruption:** Stale `last_scan.json` files with wrong agent IDs cause parsing timeouts
- **Installation Conflicts:** No clean migration between agent versions
- **Path Inconsistency:** Mixed `/var/lib/aggregator` vs `/var/lib/redflag` paths
- **Security Risk:** No file validation prevents potential file poisoning attacks
- **Maintenance Burden:** Manual cleanup required for corrupted files
## Current Issues Identified
### 1. Stale File Problem
```json
// /var/lib/aggregator/last_scan.json from October 14th
{
"last_scan_time": "2025-10-14T10:19:23.20489739-04:00", // OLD!
"agent_id": "49f9a1e8-66db-4d21-b3f4-f416e0523ed1", // OLD!
"updates": [/* 50,000+ lines causing timeouts */]
}
```
### 2. Path Inconsistency
- Old paths: `/var/lib/aggregator`, `/etc/aggregator`
- New paths: `/var/lib/redflag`, `/etc/redflag`
- Mixed usage across codebase
- No standardized migration strategy
### 3. No Version Validation
- Agent doesn't validate file ownership
- No binary signature validation of working files
- Stale files accumulate and cause issues
- No cleanup mechanisms
## Proposed Solution
Implement comprehensive file management and migration system:
### 1. File Validation and Migration System
```go
type FileManager struct {
CurrentAgentID string
CurrentVersion string
BasePaths PathConfig
MigrationConfig MigrationConfig
}
type PathConfig struct {
Config string // /etc/redflag/config.json
State string // /var/lib/redflag/
Backup string // /var/lib/redflag/backups/
Logs string // /var/log/redflag/
}
type MigrationConfig struct {
OldPaths []string // Legacy paths to migrate from
BackupEnabled bool
MaxBackups int
}
func (fm *FileManager) ValidateAndMigrate() error {
// 1. Check for legacy paths and migrate
if err := fm.migrateLegacyPaths(); err != nil {
return fmt.Errorf("path migration failed: %w", err)
}
// 2. Validate file ownership
if err := fm.validateFileOwnership(); err != nil {
return fmt.Errorf("file ownership validation failed: %w", err)
}
// 3. Clean up stale files
if err := fm.cleanupStaleFiles(); err != nil {
return fmt.Errorf("stale file cleanup failed: %w", err)
}
return nil
}
```
### 2. Agent File Ownership Validation
```go
type FileMetadata struct {
AgentID string `json:"agent_id"`
Version string `json:"version"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
Checksum string `json:"checksum"`
}
func (fm *FileManager) ValidateFile(filePath string) error {
// Check if file exists
if _, err := os.Stat(filePath); os.IsNotExist(err) {
return nil // No file to validate
}
// Read file metadata
metadata, err := fm.readFileMetadata(filePath)
if err != nil {
// No metadata found - treat as legacy file
return fm.handleLegacyFile(filePath)
}
// Validate agent ID matches
if metadata.AgentID != fm.CurrentAgentID {
return fm.handleMismatchedFile(filePath, metadata)
}
// Validate version compatibility
if !fm.isVersionCompatible(metadata.Version) {
return fm.handleVersionMismatch(filePath, metadata)
}
// Validate file integrity
if err := fm.validateFileIntegrity(filePath, metadata.Checksum); err != nil {
return fmt.Errorf("file integrity check failed for %s: %w", filePath, err)
}
return nil
}
```
### 3. Stale File Detection and Cleanup
```go
func (fm *FileManager) cleanupStaleFiles() error {
files := []string{
filepath.Join(fm.BasePaths.State, "last_scan.json"),
filepath.Join(fm.BasePaths.State, "pending_acks.json"),
filepath.Join(fm.BasePaths.State, "command_history.json"),
}
for _, file := range files {
if err := fm.ValidateFile(file); err != nil {
if isStaleFileError(err) {
// Backup and remove stale file
if err := fm.backupAndRemove(file); err != nil {
log.Printf("Warning: Failed to backup stale file %s: %v", file, err)
} else {
log.Printf("Cleaned up stale file: %s", file)
}
}
}
}
return nil
}
func (fm *FileManager) backupAndRemove(filePath string) error {
if !fm.MigrationConfig.BackupEnabled {
return os.Remove(filePath)
}
// Create backup with timestamp
timestamp := time.Now().Format("20060102-150405")
backupPath := filepath.Join(fm.BasePaths.Backup, fmt.Sprintf("%s.%s", filepath.Base(filePath), timestamp))
// Ensure backup directory exists
if err := os.MkdirAll(fm.BasePaths.Backup, 0755); err != nil {
return err
}
// Copy to backup
if err := copyFile(filePath, backupPath); err != nil {
return err
}
// Remove original
return os.Remove(filePath)
}
```
### 4. Path Standardization
```go
// Standardized paths for consistency
const (
DefaultConfigPath = "/etc/redflag/config.json"
DefaultStatePath = "/var/lib/redflag/"
DefaultBackupPath = "/var/lib/redflag/backups/"
DefaultLogPath = "/var/log/redflag/"
)
func GetStandardPaths() PathConfig {
return PathConfig{
Config: DefaultConfigPath,
State: DefaultStatePath,
Backup: DefaultBackupPath,
Logs: DefaultLogPath,
}
}
func (fm *FileManager) migrateLegacyPaths() error {
legacyPaths := []string{
"/etc/aggregator",
"/var/lib/aggregator",
}
for _, legacyPath := range legacyPaths {
if _, err := os.Stat(legacyPath); err == nil {
if err := fm.migrateFromPath(legacyPath); err != nil {
return fmt.Errorf("failed to migrate from %s: %w", legacyPath, err)
}
}
}
return nil
}
```
### 5. Binary Signature Validation
```go
func (fm *FileManager) validateBinarySignature(filePath string) error {
// Get current binary signature
currentBinary, err := os.Executable()
if err != nil {
return err
}
currentSignature, err := fm.calculateFileSignature(currentBinary)
if err != nil {
return err
}
// Read file's expected binary signature
metadata, err := fm.readFileMetadata(filePath)
if err != nil {
return err
}
if metadata.BinarySignature != "" && metadata.BinarySignature != currentSignature {
return fmt.Errorf("file was created by different binary version")
}
return nil
}
```
## Definition of Done
- [ ] File validation system checks agent ID and version compatibility
- [ ] Automatic cleanup of stale files from previous installations
- [ ] Path standardization implemented across codebase
- [ ] Migration system handles legacy path transitions
- [ ] Backup system preserves important files during cleanup
- [ ] Binary signature validation prevents file poisoning
- [ ] Configuration options for migration behavior
- [ ] Comprehensive logging for debugging file issues
## Implementation Details
### File Locations
- **Primary:** `aggregator-agent/internal/filesystem/` (new package)
- **Integration:** `aggregator-agent/cmd/agent/main.go` (initialization)
- **Config:** `aggregator-agent/internal/config/config.go`
### Configuration Options
```json
{
"file_management": {
"paths": {
"config": "/etc/redflag/config.json",
"state": "/var/lib/redflag/",
"backup": "/var/lib/redflag/backups/",
"logs": "/var/log/redflag/"
},
"migration": {
"cleanup_stale_files": true,
"backup_on_cleanup": true,
"max_backups": 10,
"migrate_legacy_paths": true
},
"validation": {
"validate_agent_id": true,
"validate_version": true,
"validate_binary_signature": false
}
}
}
```
### Integration Points
```go
// Agent initialization
func (a *Agent) initialize() error {
// Existing initialization...
// File management setup
fileManager := filesystem.NewFileManager(a.config, a.agentID, AgentVersion)
if err := fileManager.ValidateAndMigrate(); err != nil {
return fmt.Errorf("file management initialization failed: %w", err)
}
a.fileManager = fileManager
return nil
}
// Before scan operations
func (a *Agent) scanForUpdates() error {
// Validate files before operation
if err := a.fileManager.ValidateAndMigrate(); err != nil {
log.Printf("Warning: File validation failed, proceeding anyway: %v", err)
}
// Continue with scan...
}
```
## Testing Strategy
### Unit Tests
- File validation logic
- Migration path handling
- Backup and cleanup operations
- Signature validation
### Integration Tests
- Full migration scenarios
- Stale file detection
- Path transition testing
- Configuration validation
### Manual Test Scenarios
1. **Stale File Cleanup:**
- Install agent v1, create state files
- Install agent v2 with different agent ID
- Verify stale files are backed up and cleaned
2. **Path Migration:**
- Install agent with old paths
- Upgrade to new version
- Verify files are moved to new locations
3. **File Corruption Recovery:**
- Corrupt state files manually
- Restart agent
- Verify recovery or graceful degradation
## Prerequisites
- Configuration system supports nested structures
- Logging infrastructure supports structured output
- Agent has unique ID and version information
- File system permissions allow access to required paths
## Effort Estimate
**Complexity:** Medium-High
**Effort:** 3-4 days
- Day 1: File validation and cleanup system
- Day 2: Path migration and standardization
- Day 3: Binary signature validation
- Day 4: Integration testing and configuration
## Success Metrics
- Elimination of timeout issues from stale files
- Zero manual intervention required for upgrades
- Consistent path usage across codebase
- No data loss during migration operations
- Improved system startup reliability
- Enhanced security through file validation
## Monitoring
Track these metrics after implementation:
- File validation error rate
- Migration success rate
- Stale file cleanup frequency
- Path standardization compliance
- Agent startup time improvement
- User-reported file issues reduction