# Issue #1 Proper Solution Design

## Problem Root Cause

The agent check-in interval was being incorrectly overridden by scanner subsystem intervals.

## Current State After Kimi's "Fast Fix"

- The line that overrode the check-in interval was removed
- Scanner intervals are logged but not applied to agent polling
- Separation exists, but without validation or protection

## What's Missing (Why It's Not "Proper" Yet)

### 1. No Validation

- No bounds checking on interval values
- Could accept negative intervals
- Could accept intervals that are too short (causing server overload)
- Could accept intervals that are too long (causing the agent to appear dead)

### 2. No Idempotency Verification

- Not tested that operations can be run multiple times safely
- Config updates might not be idempotent
- No verification that rapid mode toggling is safe

### 3. No Protection Against Regressions

- No guardrails to prevent a future developer from re-introducing the bug
- No comments explaining WHY separation is critical
- No architectural documentation
- No tests that would catch it if someone re-introduces the override

### 4. Insufficient Error Handling

- syncServerConfig runs in a goroutine with no error recovery
- No retry logic if the server is temporarily unavailable
- No degraded mode operation
- No circuit breaker pattern

### 5. No Comprehensive Logging

- No context about WHAT changed in the interval
- No history of interval changes over time
- No error context for debugging

## Proper Solution Design

### Component 1: Validation Layer

```go
// IntervalValidator holds the allowed bounds for check-in and scanner intervals.
type IntervalValidator struct {
	minCheckInSeconds int // 60 seconds (1 minute)
	maxCheckInSeconds int // 3600 seconds (1 hour)
	minScannerMinutes int // 1 minute
	maxScannerMinutes int // 1440 minutes (24 hours)
}

// ValidateCheckInInterval rejects check-in intervals outside the configured bounds.
func (v *IntervalValidator) ValidateCheckInInterval(seconds int) error {
	if seconds < v.minCheckInSeconds {
		return fmt.Errorf("check-in interval %d seconds below minimum %d", seconds, v.minCheckInSeconds)
	}
	if seconds > v.maxCheckInSeconds {
		return fmt.Errorf("check-in interval %d seconds above maximum %d", seconds, v.maxCheckInSeconds)
	}
	return nil
}

// ValidateScannerInterval rejects scanner intervals outside the configured bounds.
func (v *IntervalValidator) ValidateScannerInterval(minutes int) error {
	if minutes < v.minScannerMinutes {
		return fmt.Errorf("scanner interval %d minutes below minimum %d", minutes, v.minScannerMinutes)
	}
	if minutes > v.maxScannerMinutes {
		return fmt.Errorf("scanner interval %d minutes above maximum %d", minutes, v.maxScannerMinutes)
	}
	return nil
}
```

### Component 2: Idempotency Protection

```go
// IntervalGuardian counts attempts to change the check-in interval.
type IntervalGuardian struct {
	lastValidatedInterval int
	violationCount        int
}

// CheckForOverrideAttempt returns an error and increments the violation count
// when proposedInterval differs from currentInterval.
func (g *IntervalGuardian) CheckForOverrideAttempt(currentInterval, proposedInterval int) error {
	if currentInterval != proposedInterval {
		g.violationCount++
		return fmt.Errorf("INTERVAL_OVERRIDE_DETECTED: current=%d, proposed=%d, violations=%d",
			currentInterval, proposedInterval, g.violationCount)
	}
	return nil
}

// GetViolationCount reports how many override attempts have been detected.
func (g *IntervalGuardian) GetViolationCount() int {
	return g.violationCount
}
```

### Component 3: Error Recovery with Retry

```go
func syncServerConfigWithRetry(apiClient *client.Client, cfg *config.Config, maxRetries int) error {
	var lastErr error
	for attempt := 1; attempt <= maxRetries; attempt++ {
		if err := syncServerConfigProper(apiClient, cfg); err != nil {
			lastErr = err
			log.Printf("[ERROR] [agent] [config] sync attempt %d/%d failed: %v", attempt, maxRetries, err)
			if attempt < maxRetries {
				// Exponential backoff: 1s, 2s, 4s, 8s...
				backoff := time.Duration(1<