Redflag/docs/3_BACKLOG/P3-001_Duplicate-Command-Prevention.md

# Duplicate Command Prevention System

**Priority**: P3 (Enhancement)
**Source Reference**: From quick-todos.md line 21
**Status**: Analyzed, Ready for Implementation

## Problem Statement

The current command scheduling system has no duplicate detection mechanism. Multiple instances of the same command can be queued for an agent (e.g., multiple `scan_apt` commands), causing unnecessary work, potential conflicts, and wasted system resources.

## Feature Description

Implement duplicate command prevention logic that checks for existing pending/sent commands of the same type before creating new ones, while preserving legitimate retry and interval scheduling behavior.

## Acceptance Criteria

1. The system checks for recent duplicates before creating a new command
2. A duplicate is identified by `AgentID` + `CommandType` + `Status IN ('pending', 'sent')`
3. A time window (default: 5 minutes) bounds the check so that legitimate repeats are still allowed
4. A command is skipped only when a matching command was created inside that window; the window is configurable
5. Legitimate scheduling and retry logic is preserved
6. Every skipped duplicate is logged for monitoring
7. Manual commands can override duplicate prevention

## Technical Approach

### 1. Database Query Layer

**New Query Function** (`aggregator-server/internal/database/queries/`):
```sql
-- Check for recent duplicate commands
SELECT COUNT(*) FROM commands
WHERE agent_id = $1
  AND command_type = $2
  AND status IN ('pending', 'sent')
  AND created_at > NOW() - INTERVAL '5 minutes';
```

**Go Implementation**:
```go
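// CheckRecentDuplicate reports whether the agent already has a pending or sent
// command of the given type created within timeWindow.
func (q *Queries) CheckRecentDuplicate(agentID uuid.UUID, commandType string, timeWindow time.Duration) (bool, error) {
    var count int
    // A time.Duration is an int64 nanosecond count, not a PostgreSQL interval,
    // so pass the window as seconds and build the interval with make_interval.
    err := q.db.QueryRow(`
        SELECT COUNT(*) FROM commands
        WHERE agent_id = $1
          AND command_type = $2
          AND status IN ('pending', 'sent')
          AND created_at > NOW() - make_interval(secs => $3)
    `, agentID, commandType, timeWindow.Seconds()).Scan(&count)
    if err != nil {
        return false, err
    }
    return count > 0, nil
}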
```

### 2. Scheduler Integration

**Enhanced Command Creation** (`aggregator-server/internal/services/scheduler.go`):
```go
func (s *Scheduler) CreateCommandWithDuplicateCheck(agentID uuid.UUID, commandType string, payload interface{}, force bool) error {
    // Default window; intended to be read from configuration (see section 3).
    const duplicateWindow = 5 * time.Minute

    // Skip duplicate check for forced commands
    if !force {
        isDuplicate, err := s.queries.CheckRecentDuplicate(agentID, commandType, duplicateWindow)
        if err != nil {
            return fmt.Errorf("failed to check for duplicates: %w", err)
        }
        if isDuplicate {
            // Returning nil means a skipped duplicate is not treated as an error.
            log.Printf("Skipping duplicate %s command for agent %s (active command created within %s)", commandType, agentID, duplicateWindow)
            return nil
        }
    }

    // Create command normally
    return s.queries.CreateCommand(agentID, commandType, payload)
}
```

### 3. Configuration

**Duplicate Prevention Settings**:
- Time window: 5 minutes (configurable via environment)
- Command types to check: `scan_apt`, `scan_dnf`, `scan_updates`, etc.
- Manual command override: Force flag to bypass duplicate check
- Logging level: Debug vs Info for duplicate skips
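
One way the configurable time window could be loaded is sketched below. This is only an illustration: the environment variable name `DUPLICATE_COMMAND_WINDOW` and the helper `parseDuplicateWindow` are hypothetical and not part of the existing codebase. The sketch falls back to the 5-minute default whenever the value is missing, malformed, or not positive.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// defaultDuplicateWindow mirrors the 5-minute default listed above.
const defaultDuplicateWindow = 5 * time.Minute

// parseDuplicateWindow converts a raw setting such as "10m" or "90s" into a
// duration. Empty, malformed, or non-positive values fall back to the default.
// (Hypothetical helper, shown for illustration only.)
func parseDuplicateWindow(raw string) time.Duration {
	if raw == "" {
		return defaultDuplicateWindow
	}
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return defaultDuplicateWindow
	}
	return d
}

func main() {
	// DUPLICATE_COMMAND_WINDOW is an assumed variable name, not an existing setting.
	window := parseDuplicateWindow(os.Getenv("DUPLICATE_COMMAND_WINDOW"))
	fmt.Println("duplicate window:", window)
}
```

Falling back silently keeps the scheduler running on a bad value; a deployment that prefers to fail fast could return the parse error instead.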

### 4. Monitoring and Logging

**Duplicate Prevention Metrics**:
- Counter for duplicates prevented per command type
- Logging of duplicate prevention with agent and command details
- Dashboard metrics showing duplicate prevention effectiveness
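
The per-command-type counter could be as small as the following sketch. It uses only the standard library as a stand-in; the type `DuplicateCounter` and its methods are hypothetical names, and the real `metrics.go` may well wrap an existing metrics library instead.

```go
package main

import (
	"fmt"
	"sync"
)

// DuplicateCounter tracks how many duplicate commands were skipped, keyed by
// command type. It is safe for concurrent use. (Hypothetical type.)
type DuplicateCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

// NewDuplicateCounter returns an empty counter.
func NewDuplicateCounter() *DuplicateCounter {
	return &DuplicateCounter{counts: make(map[string]int)}
}

// Inc records one skipped duplicate for the given command type.
func (c *DuplicateCounter) Inc(commandType string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[commandType]++
}

// Get returns the number of skipped duplicates for the given command type.
func (c *DuplicateCounter) Get(commandType string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[commandType]
}

func main() {
	c := NewDuplicateCounter()
	c.Inc("scan_apt")
	c.Inc("scan_apt")
	c.Inc("scan_dnf")
	fmt.Println("scan_apt:", c.Get("scan_apt"), "scan_dnf:", c.Get("scan_dnf"))
}
```

In the scheduler, `Inc` would be called at the point where a duplicate is logged and skipped.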

## Definition of Done

- ✅ Database query for duplicate detection implemented
- ✅ Scheduler integrates duplicate checking before command creation
- ✅ Configurable time window for duplicate detection
- ✅ Manual commands can bypass duplicate prevention
- ✅ Proper logging and monitoring of duplicate prevention
- ✅ Unit tests for various duplicate scenarios
- ✅ Integration testing with scheduler behavior
- ✅ Performance impact assessment (minimal overhead)

## Test Plan

1. **Unit Tests**
   - Test duplicate detection with various time windows
   - Test command type filtering
   - Test agent-specific duplicate checking
   - Test force override functionality

2. **Integration Tests**
   - Test scheduler behavior with duplicate prevention
   - Test legitimate retry scenarios still work
   - Test manual command override
   - Test performance impact under load

3. **Scenario Tests**
   - Multiple rapid `scan_apt` commands for the same agent (only the first should be created)
   - Different command types for the same agent (should not be treated as duplicates)
   - The same command type for different agents (should not be treated as duplicates)
   - Commands older than time window (should create new command)
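
The four scenarios can be exercised without a database by mirroring the SQL criteria in memory. The sketch below is an assumption-laden stand-in, not the project's test code: `Command`, `isRecentDuplicate`, and `runScenarios` are hypothetical names, and agent IDs are plain strings for brevity.

```go
package main

import (
	"fmt"
	"time"
)

// Command is a simplified, hypothetical stand-in for a row in the commands table.
type Command struct {
	AgentID   string
	Type      string
	Status    string
	CreatedAt time.Time
}

// isRecentDuplicate mirrors the SQL check: same agent, same command type,
// status pending or sent, and created strictly after now minus the window.
func isRecentDuplicate(queue []Command, agentID, cmdType string, now time.Time, window time.Duration) bool {
	cutoff := now.Add(-window)
	for _, c := range queue {
		active := c.Status == "pending" || c.Status == "sent"
		if c.AgentID == agentID && c.Type == cmdType && active && c.CreatedAt.After(cutoff) {
			return true
		}
	}
	return false
}

// runScenarios evaluates the four scenario tests against a fixed queue and
// reports, per scenario, whether the new command would count as a duplicate.
func runScenarios() map[string]bool {
	now := time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	window := 5 * time.Minute
	queue := []Command{
		{AgentID: "agent-a", Type: "scan_apt", Status: "pending", CreatedAt: now.Add(-1 * time.Minute)},
		{AgentID: "agent-a", Type: "scan_updates", Status: "sent", CreatedAt: now.Add(-10 * time.Minute)},
	}
	return map[string]bool{
		"rapid_same_agent": isRecentDuplicate(queue, "agent-a", "scan_apt", now, window),
		"different_type":   isRecentDuplicate(queue, "agent-a", "scan_dnf", now, window),
		"different_agent":  isRecentDuplicate(queue, "agent-b", "scan_apt", now, window),
		"outside_window":   isRecentDuplicate(queue, "agent-a", "scan_updates", now, window),
	}
}

func main() {
	results := runScenarios()
	for _, name := range []string{"rapid_same_agent", "different_type", "different_agent", "outside_window"} {
		fmt.Printf("%-17s duplicate=%v\n", name, results[name])
	}
}
```

Only the rapid same-agent case is flagged as a duplicate; the other three would create a new command.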

## Files to Modify

- `aggregator-server/internal/database/queries/commands.go` - Add duplicate check query
- `aggregator-server/internal/services/scheduler.go` - Integrate duplicate checking
- `aggregator-server/cmd/server/main.go` - Configuration for time window
- `aggregator-server/internal/services/metrics.go` - Add duplicate prevention metrics

## Duplicate Detection Logic

### Criteria for Duplicate
1. **Same Agent ID**: Commands for different agents are not duplicates
2. **Same Command Type**: `scan_apt` vs `scan_dnf` are different commands
3. **Recent Creation**: Within configured time window (default 5 minutes)
4. **Active Status**: Only 'pending' or 'sent' commands count as duplicates

### Time Window Considerations
- **5 minutes**: Prevents rapid-fire duplicate scheduling
- **Configurable**: Can be adjusted per deployment needs
- **Per Command Type**: Different command types may warrant different windows
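
If per-command-type windows are adopted, the lookup might resemble the sketch below: a default window plus optional overrides keyed by command type. The `WindowConfig` type, its `For` method, and the 15-minute `scan_apt` override are illustrative assumptions only.

```go
package main

import (
	"fmt"
	"time"
)

// WindowConfig holds a default duplicate window and optional per-type
// overrides. (Hypothetical type.)
type WindowConfig struct {
	Default time.Duration
	PerType map[string]time.Duration
}

// For returns the window for a command type, falling back to the default
// when no positive override is configured.
func (w WindowConfig) For(commandType string) time.Duration {
	if d, ok := w.PerType[commandType]; ok && d > 0 {
		return d
	}
	return w.Default
}

// exampleWindowConfig builds a sample configuration; the values are made up.
func exampleWindowConfig() WindowConfig {
	return WindowConfig{
		Default: 5 * time.Minute,
		PerType: map[string]time.Duration{
			"scan_apt": 15 * time.Minute,
		},
	}
}

func main() {
	cfg := exampleWindowConfig()
	fmt.Println("scan_apt:", cfg.For("scan_apt"))
	fmt.Println("scan_dnf:", cfg.For("scan_dnf"))
}
```

The result of `For` would be passed as the `timeWindow` argument to `CheckRecentDuplicate` in place of a single global value.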

### Override Mechanisms
1. **Manual Commands**: Admin-initiated commands can force execution
2. **Critical Commands**: Security or emergency updates bypass duplicate prevention
3. **Different Payloads**: Commands with different parameters may not be duplicates

## Estimated Effort

- **Development**: 6-8 hours
- **Testing**: 4-6 hours
- **Review**: 2-3 hours

## Dependencies

- Existing command queue system
- Scheduler service architecture
- Database query layer

## Risk Assessment

**Low Risk** - An additive enhancement: existing command creation is unchanged except that recent duplicates are skipped. The force override provides a safety valve for edge cases, and the configurable time window allows tuning based on operational needs.

## Performance Impact

- **Database Overhead**: One additional query per command creation (minimal)
- **Memory Impact**: Negligible
- **Network Impact**: None
- **CPU Impact**: Minimal, provided the lookup is backed by an index covering `agent_id`, `command_type`, and `status`

## Monitoring Metrics

- Duplicates prevented per hour/day
- Command creation success rate
- Average time between duplicate attempts
- Most frequent duplicate command types
- Agent-specific duplicate patterns