Migration Error Reporting System
Priority: P2 (New Feature)
Source Reference: From DEVELOPMENT_TODOS.md line 348
Status: Ready for Implementation
Problem Statement
When an agent migration fails, during either detection or execution, there is currently no mechanism to report the failure to the server so that it shows up in the History table. Failures are logged only on the agent itself, which makes it impossible to track migration issues across the agent fleet.
Feature Description
Implement a migration error reporting system that sends migration failure information to the server for storage in the update_events table, enabling administrators to see migration status and troubleshoot issues through the web interface.
Acceptance Criteria
- Migration failures are reported to the server with detailed error information
- Migration events appear in the agent History with appropriate severity levels
- Both detection failures and execution failures are captured and reported
- Error reports include context: migration type, error message, and system information
- Server accepts migration events via existing agent check-in mechanism
- Migration success/failure status is visible in the web interface
Technical Approach
1. Agent-Side Changes
Migration Event Structure (aggregator-agent/internal/migration/):
type MigrationEvent struct {
	EventType     string // "migration_detection" or "migration_execution"
	Status        string // "success", "failed", "warning"
	ErrorMessage  string // Detailed error message
	MigrationFrom string // Source version/path
	MigrationTo   string // Target version/path
	Timestamp     time.Time
	SystemInfo    map[string]interface{}
}
Enhanced Migration Logic:
- Wrap migration detection and execution with error reporting
- Capture detailed error context and system information
- Queue migration events alongside regular update events
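The wrapping and queueing steps above could look roughly like the following. This is a minimal sketch, not the agent's actual code: EventQueue, ReportStep, and errDenied are hypothetical names introduced for illustration, and the in-memory queue stands in for whatever queue the agent already uses for regular update events.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
	"time"
)

// MigrationEvent mirrors the struct proposed in this document.
type MigrationEvent struct {
	EventType     string
	Status        string
	ErrorMessage  string
	MigrationFrom string
	MigrationTo   string
	Timestamp     time.Time
	SystemInfo    map[string]interface{}
}

// EventQueue is a hypothetical in-memory queue (assumption); the real agent
// would reuse its existing update-event queue.
type EventQueue struct {
	mu     sync.Mutex
	events []MigrationEvent
}

// Enqueue appends an event for delivery at the next check-in.
func (q *EventQueue) Enqueue(ev MigrationEvent) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.events = append(q.events, ev)
}

// Drain returns all pending events and empties the queue.
func (q *EventQueue) Drain() []MigrationEvent {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := q.events
	q.events = nil
	return out
}

// errDenied is a sample failure used only by this sketch.
var errDenied = errors.New("permission denied writing config")

// ReportStep runs one migration phase, queues an event describing the
// outcome, and returns the step's error unchanged so that the existing
// migration logic keeps its current control flow.
func ReportStep(q *EventQueue, eventType, from, to string, step func() error) error {
	err := step()
	ev := MigrationEvent{
		EventType:     eventType,
		Status:        "success",
		MigrationFrom: from,
		MigrationTo:   to,
		Timestamp:     time.Now().UTC(),
		SystemInfo: map[string]interface{}{
			"os":   runtime.GOOS,
			"arch": runtime.GOARCH,
		},
	}
	if err != nil {
		ev.Status = "failed"
		ev.ErrorMessage = err.Error()
	}
	q.Enqueue(ev)
	return err
}

func main() {
	q := &EventQueue{}
	_ = ReportStep(q, "migration_execution", "v1", "v2", func() error {
		return errDenied
	})
	for _, ev := range q.Drain() {
		fmt.Printf("%s %s: %s\n", ev.EventType, ev.Status, ev.ErrorMessage)
	}
}
```

Returning the original error untouched keeps the wrapper purely additive, which matches the intent of not modifying core migration logic.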
2. Server-Side Changes
Database Schema (if needed):
- Verify that the update_events table can handle migration event types
- Add migration-specific event types if not already supported
API Handler (aggregator-server/internal/api/handlers/agent_updates.go):
- Accept migration events in existing check-in endpoint
- Validate migration event structure
- Store events with appropriate metadata
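As a sketch of what the validation step might involve, the following decodes a migration event from JSON and rejects malformed ones. The JSON field names, the DecodeMigrationEvent and ValidateMigrationEvent helpers, and the specific rules (for example, that a failed event must carry an error message) are assumptions made for illustration; the actual check-in payload format is not specified here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// MigrationEvent mirrors the agent-side struct; the JSON tags are assumed.
type MigrationEvent struct {
	EventType     string                 `json:"event_type"`
	Status        string                 `json:"status"`
	ErrorMessage  string                 `json:"error_message"`
	MigrationFrom string                 `json:"migration_from"`
	MigrationTo   string                 `json:"migration_to"`
	Timestamp     time.Time              `json:"timestamp"`
	SystemInfo    map[string]interface{} `json:"system_info"`
}

// Allowed values, taken from the comments on the proposed struct.
var (
	validEventTypes = map[string]bool{
		"migration_detection": true,
		"migration_execution": true,
	}
	validStatuses = map[string]bool{
		"success": true,
		"failed":  true,
		"warning": true,
	}
)

// ValidateMigrationEvent checks the fields the server needs before storing.
func ValidateMigrationEvent(ev MigrationEvent) error {
	if !validEventTypes[ev.EventType] {
		return fmt.Errorf("unknown event_type %q", ev.EventType)
	}
	if !validStatuses[ev.Status] {
		return fmt.Errorf("unknown status %q", ev.Status)
	}
	if ev.Status == "failed" && ev.ErrorMessage == "" {
		return errors.New("failed event requires error_message")
	}
	if ev.Timestamp.IsZero() {
		return errors.New("timestamp is required")
	}
	return nil
}

// DecodeMigrationEvent parses a JSON payload and validates the result.
func DecodeMigrationEvent(data []byte) (MigrationEvent, error) {
	var ev MigrationEvent
	if err := json.Unmarshal(data, &ev); err != nil {
		return ev, fmt.Errorf("decode migration event: %w", err)
	}
	return ev, ValidateMigrationEvent(ev)
}

func main() {
	payload := []byte(`{"event_type":"migration_execution","status":"failed",` +
		`"error_message":"disk full","migration_from":"v1","migration_to":"v2",` +
		`"timestamp":"2024-01-01T00:00:00Z"}`)
	ev, err := DecodeMigrationEvent(payload)
	fmt.Println(ev.EventType, ev.Status, err)
}
```

Rejecting invalid events at the handler keeps unknown event types out of update_events, so the History view only has to render values it knows about.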
Event Processing:
- Categorize migration events separately from regular updates
- Include migration-specific metadata in responses
3. Frontend Changes
History Display (aggregator-web/src/components/AgentUpdate.tsx):
- Show migration events with distinct styling
- Display migration status (success/failed/warning)
- Show detailed error messages in expandable sections
- Filter capability for migration-specific events
Definition of Done
- ✅ Migration failures are captured and sent to server
- ✅ Migration events appear in agent History with proper categorization
- ✅ Error messages include sufficient detail for troubleshooting
- ✅ Migration success/failure status is clearly visible in UI
- ✅ Both detection and execution phases are monitored
- ✅ Integration testing validates end-to-end error reporting flow
Test Plan
Unit Tests:
- Test migration event creation and validation
- Test error message formatting and context capture
- Test server-side event acceptance and storage
Integration Tests:
- Simulate migration detection failure with invalid config
- Simulate migration execution failure with permission issues
- Verify events appear in server database
- Test API response handling for migration events
Manual Tests:
- Create agent with old config format requiring migration
- Force migration failure (e.g., permissions, disk space)
- Verify error appears in History within reasonable time
- Test error message clarity and usefulness
Files to Modify
- aggregator-agent/internal/migration/detection.go - Add error reporting wrapper
- aggregator-agent/internal/migration/executor.go - Add error reporting wrapper
- aggregator-agent/cmd/agent/main.go - Handle migration event reporting
- aggregator-server/internal/api/handlers/agent_updates.go - Accept migration events
- aggregator-web/src/components/AgentUpdate.tsx - Display migration events
- aggregator-web/src/components/AgentUpdatesEnhanced.tsx - Enhanced display if used
Migration Event Types
Detection Events:
- migration_detection_success - Detected need for migration
- migration_detection_failed - Error during migration detection
- migration_detection_not_needed - No migration required
Execution Events:
- migration_execution_success - Migration completed successfully
- migration_execution_failed - Migration failed with errors
- migration_execution_partial - Partial success with warnings
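One way to make these event types usable on both agent and server is to define them as constants and derive the History severity from them. The sketch below assumes severity labels of "info", "warning", and "error"; those labels, and the Severity and Phase helpers, are hypothetical and would need to match whatever the History table already uses.

```go
package main

import (
	"fmt"
	"strings"
)

// Event type names as listed in this document.
const (
	DetectionSuccess   = "migration_detection_success"
	DetectionFailed    = "migration_detection_failed"
	DetectionNotNeeded = "migration_detection_not_needed"
	ExecutionSuccess   = "migration_execution_success"
	ExecutionFailed    = "migration_execution_failed"
	ExecutionPartial   = "migration_execution_partial"
)

// Severity maps an event type to an assumed History severity label.
// Unrecognized types return "unknown" so callers can decide how to handle them.
func Severity(eventType string) string {
	switch eventType {
	case DetectionFailed, ExecutionFailed:
		return "error"
	case ExecutionPartial:
		return "warning"
	case DetectionSuccess, DetectionNotNeeded, ExecutionSuccess:
		return "info"
	default:
		return "unknown"
	}
}

// Phase reports whether an event belongs to detection or execution,
// which is enough to drive a migration-specific filter.
func Phase(eventType string) string {
	switch {
	case strings.HasPrefix(eventType, "migration_detection_"):
		return "detection"
	case strings.HasPrefix(eventType, "migration_execution_"):
		return "execution"
	default:
		return ""
	}
}

func main() {
	for _, t := range []string{DetectionNotNeeded, ExecutionPartial, ExecutionFailed} {
		fmt.Printf("%s phase=%s severity=%s\n", t, Phase(t), Severity(t))
	}
}
```

Keeping the mapping in one function means a new event type only needs one additional case before it shows up with the right severity.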
Estimated Effort
- Development: 8-12 hours
- Testing: 4-6 hours
- Review: 2-3 hours
Dependencies
- Existing agent update reporting infrastructure
- Current migration detection and execution systems
- Agent check-in mechanism for event transmission
Risk Assessment
Low Risk - This feature enhances existing functionality without modifying core migration logic. The biggest risk is error message formatting, which can be easily adjusted based on testing feedback.