Files

Fimeg 484a7f77ce Add docs and project files - force for Culurien

2026-03-28 20:46:24 -04:00

16 KiB

Raw Permalink Blame History

P4-005: Testing Infrastructure Gaps

Priority: P4 (Technical Debt) Source Reference: From analysis of codebase testing coverage and existing test files Date Identified: 2025-11-12

Problem Description

RedFlag has minimal testing infrastructure with only 5 test files covering basic functionality. Critical components like agent communication, authentication, scanner integration, and database operations lack comprehensive test coverage. This creates high risk for regressions and makes confident deployment difficult.

Current Test Coverage Analysis

Existing Tests (5 files)

aggregator-agent/internal/circuitbreaker/circuitbreaker_test.go - Basic circuit breaker
aggregator-agent/test_disk.go - Disk detection testing (development)
test_disk_detection.go - Disk detection integration test
aggregator-server/internal/scheduler/queue_test.go - Queue operations (21 tests passing)
aggregator-server/internal/scheduler/scheduler_test.go - Scheduler logic (21 tests passing)

Critical Missing Test Areas

Agent Components (0% coverage)

Agent registration and authentication
Scanner implementations (APT, DNF, Docker, Windows, Winget)
Command execution and acknowledgment
File management and state persistence
Error handling and resilience
Cross-platform compatibility

Server Components (Minimal coverage)

API endpoints and handlers
Database operations and queries
Authentication and authorization
Rate limiting and security middleware
Agent lifecycle management
Update package distribution

Integration Testing (0% coverage)

End-to-end agent-server communication
Multi-agent scenarios
Error recovery and failover
Performance under load
Security validation (Ed25519, nonces, machine binding)

Security Testing (0% coverage)

Cryptographic operations validation
Authentication bypass attempts
Input validation and sanitization
Rate limiting effectiveness
Machine binding enforcement

Impact

Regression Risk: No safety net for code changes
Deployment Confidence: Cannot verify system reliability
Quality Assurance: Manual testing is time-consuming and error-prone
Security Validation: No automated security testing
Performance Testing: No way to detect performance regressions
Documentation Gaps: Tests serve as living documentation

Proposed Solution

Implement comprehensive testing infrastructure across all components:

1. Unit Testing Framework

// Test configuration and utilities
// aggregator/internal/testutil/testutil.go
package testutil

import (
    "testing"
    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/mock"
    "github.com/stretchr/testify/suite"
)

type TestSuite struct {
    suite.Suite
    DB     *sql.DB
    Config *Config
    Server *httptest.Server
}

func (s *TestSuite) SetupSuite() {
    // Initialize test database
    s.DB = setupTestDB()

    // Initialize test configuration
    s.Config = &Config{
        DatabaseURL: "postgres://test:test@localhost/redflag_test",
        ServerPort:  0, // Random port for testing
    }
}

func (s *TestSuite) TearDownSuite() {
    if s.DB != nil {
        s.DB.Close()
    }
    cleanupTestDB()
}

func (s *TestSuite) SetupTest() {
    // Reset database state before each test
    resetTestDB(s.DB)
}

// Mock implementations
type MockScanner struct {
    mock.Mock
}

func (m *MockScanner) ScanForUpdates() ([]UpdateReportItem, error) {
    args := m.Called()
    return args.Get(0).([]UpdateReportItem), args.Error(1)
}

2. Agent Component Tests

// aggregator-agent/cmd/agent/main_test.go
func TestAgentRegistration(t *testing.T) {
    tests := []struct {
        name           string
        token          string
        expectedStatus int
        expectedError  string
    }{
        {
            name:           "Valid registration",
            token:          "valid-token-123",
            expectedStatus: http.StatusCreated,
        },
        {
            name:           "Invalid token",
            token:          "invalid-token",
            expectedStatus: http.StatusUnauthorized,
            expectedError:  "invalid registration token",
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            server := setupTestServer(t)
            defer server.Close()

            agent := &Agent{
                ServerURL: server.URL,
                Token:     tt.token,
            }

            err := agent.Register()
            if tt.expectedError != "" {
                assert.Contains(t, err.Error(), tt.expectedError)
            } else {
                assert.NoError(t, err)
            }
        })
    }
}

// aggregator-agent/internal/scanner/dnf_test.go
func TestDNFScanner(t *testing.T) {
    // Test with mock dnf command
    scanner := &DNFScanner{}

    t.Run("Successful scan", func(t *testing.T) {
        // Mock successful dnf check-update output
        withMockCommand("dnf", "check-update", successfulDNFOutput, func() {
            updates, err := scanner.ScanForUpdates()
            assert.NoError(t, err)
            assert.NotEmpty(t, updates)

            // Verify update parsing
            nginx := findUpdate(updates, "nginx")
            assert.NotNil(t, nginx)
            assert.Equal(t, "1.20.1", nginx.CurrentVersion)
            assert.Equal(t, "1.21.0", nginx.AvailableVersion)
        })
    })

    t.Run("DNF not available", func(t *testing.T) {
        scanner.executable = "nonexistent-dnf"
        _, err := scanner.ScanForUpdates()
        assert.Error(t, err)
        assert.Contains(t, err.Error(), "dnf not found")
    })
}

3. Server Component Tests

// aggregator-server/internal/api/handlers/agents_test.go
func TestAgentsHandler_RegisterAgent(t *testing.T) {
    suite := &TestSuite{}
    suite.SetupSuite()
    defer suite.TearDownSuite()

    tests := []struct {
        name           string
        requestBody    string
        expectedStatus int
        setupToken     bool
    }{
        {
            name:        "Valid registration",
            requestBody: `{"hostname":"test-host","os_type":"linux","agent_version":"0.1.23"}`,
            setupToken:  true,
            expectedStatus: http.StatusCreated,
        },
        {
            name:           "Invalid JSON",
            requestBody:    `{"hostname":}`,
            expectedStatus: http.StatusBadRequest,
        },
        {
            name:           "Missing token",
            requestBody:    `{"hostname":"test-host"}`,
            expectedStatus: http.StatusUnauthorized,
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            suite.SetupTest()

            if tt.setupToken {
                token := createTestToken(suite.DB, 5)
                suite.Config.JWTSecret = "test-secret"
            }

            req := httptest.NewRequest("POST", "/api/v1/agents/register",
                strings.NewReader(tt.requestBody))
            req.Header.Set("Content-Type", "application/json")
            if tt.setupToken {
                req.Header.Set("Authorization", "Bearer test-token")
            }

            w := httptest.NewRecorder()
            handler := NewAgentsHandler(suite.DB, suite.Config)
            handler.RegisterAgent(w, req)

            assert.Equal(t, tt.expectedStatus, w.Code)
        })
    }
}

// aggregator-server/internal/database/queries/agents_test.go
func TestAgentQueries(t *testing.T) {
    db := setupTestDB(t)
    queries := NewAgentQueries(db)

    t.Run("Create and retrieve agent", func(t *testing.T) {
        agent := &models.Agent{
            ID:        uuid.New(),
            Hostname:  "test-host",
            OSType:    "linux",
            Version:   "0.1.23",
            CreatedAt: time.Now(),
        }

        // Create agent
        err := queries.CreateAgent(agent)
        assert.NoError(t, err)

        // Retrieve agent
        retrieved, err := queries.GetAgent(agent.ID)
        assert.NoError(t, err)
        assert.Equal(t, agent.Hostname, retrieved.Hostname)
        assert.Equal(t, agent.OSType, retrieved.OSType)
    })
}

4. Integration Tests

// integration/agent_server_test.go
func TestAgentServerIntegration(t *testing.T) {
    if testing.Short() {
        t.Skip("Skipping integration test in short mode")
    }

    // Setup test environment
    server := setupIntegrationServer(t)
    defer server.Cleanup()

    agent := setupIntegrationAgent(t, server.URL)
    defer agent.Cleanup()

    t.Run("Complete agent lifecycle", func(t *testing.T) {
        // Registration
        err := agent.Register()
        assert.NoError(t, err)

        // First check-in (no commands)
        commands, err := agent.CheckIn()
        assert.NoError(t, err)
        assert.Empty(t, commands)

        // Send scan command
        scanCmd := &Command{
            Type: "scan_updates",
            ID:   uuid.New(),
        }
        err = server.SendCommand(agent.ID, scanCmd)
        assert.NoError(t, err)

        // Second check-in (should receive command)
        commands, err = agent.CheckIn()
        assert.NoError(t, err)
        assert.Len(t, commands, 1)
        assert.Equal(t, "scan_updates", commands[0].Type)

        // Execute command and report results
        result := agent.ExecuteCommand(commands[0])
        err = agent.ReportResult(result)
        assert.NoError(t, err)

        // Verify command completion
        cmdStatus, err := server.GetCommandStatus(scanCmd.ID)
        assert.NoError(t, err)
        assert.Equal(t, "completed", cmdStatus.Status)
    })
}

// integration/security_test.go
func TestSecurityFeatures(t *testing.T) {
    server := setupIntegrationServer(t)
    defer server.Cleanup()

    t.Run("Machine binding enforcement", func(t *testing.T) {
        agent1 := setupIntegrationAgent(t, server.URL)
        agent2 := setupIntegrationAgentWithMachineID(t, server.URL, agent1.MachineID)

        // Register first agent
        err := agent1.Register()
        assert.NoError(t, err)

        // Attempt to register second agent with same machine ID
        err = agent2.Register()
        assert.Error(t, err)
        assert.Contains(t, err.Error(), "machine ID already registered")
    })

    t.Run("Ed25519 signature validation", func(t *testing.T) {
        // Test with valid signature
        validPackage := createSignedPackage(t, server.PrivateKey)
        err := agent.VerifyPackageSignature(validPackage)
        assert.NoError(t, err)

        // Test with invalid signature
        invalidPackage := createSignedPackage(t, "wrong-key")
        err = agent.VerifyPackageSignature(invalidPackage)
        assert.Error(t, err)
        assert.Contains(t, err.Error(), "invalid signature")
    })
}

5. Performance Tests

// performance/load_test.go
func BenchmarkAgentCheckIn(b *testing.B) {
    server := setupBenchmarkServer(b)
    defer server.Cleanup()

    agent := setupBenchmarkAgent(b, server.URL)

    b.ResetTimer()
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            _, err := agent.CheckIn()
            if err != nil {
                b.Fatal(err)
            }
        }
    })
}

func TestConcurrentAgents(t *testing.T) {
    server := setupIntegrationServer(t)
    defer server.Cleanup()

    numAgents := 100
    var wg sync.WaitGroup
    errors := make(chan error, numAgents)

    for i := 0; i < numAgents; i++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()

            agent := setupIntegrationAgentWithID(t, server.URL, fmt.Sprintf("agent-%d", id))
            err := agent.Register()
            if err != nil {
                errors <- fmt.Errorf("agent %d registration failed: %w", id, err)
                return
            }

            // Perform several check-ins
            for j := 0; j < 5; j++ {
                _, err := agent.CheckIn()
                if err != nil {
                    errors <- fmt.Errorf("agent %d check-in %d failed: %w", id, j, err)
                    return
                }
            }
        }(i)
    }

    wg.Wait()
    close(errors)

    // Check for any errors
    for err := range errors {
        t.Error(err)
    }
}

6. Test Database Setup

// internal/testutil/db.go
package testutil

import (
    "database/sql"
    "fmt"
    "os"
)

func setupTestDB(t *testing.T) *sql.DB {
    db, err := sql.Open("postgres", "postgres://test:test@localhost/redflag_test?sslmode=disable")
    if err != nil {
        t.Fatalf("Failed to connect to test database: %v", err)
    }

    // Run migrations
    if err := runMigrations(db); err != nil {
        t.Fatalf("Failed to run migrations: %v", err)
    }

    return db
}

func resetTestDB(db *sql.DB) error {
    tables := []string{
        "agent_commands", "update_logs", "registration_token_usage",
        "registration_tokens", "refresh_tokens", "agents",
    }

    tx, err := db.Begin()
    if err != nil {
        return err
    }
    defer tx.Rollback()

    for _, table := range tables {
        _, err := tx.Exec(fmt.Sprintf("DELETE FROM %s", table))
        if err != nil {
            return err
        }
    }

    return tx.Commit()
}

Definition of Done

Unit test coverage >80% for all critical components
Integration test coverage for all major workflows
Performance tests for scalability validation
Security tests for authentication and cryptographic features
CI/CD pipeline with automated testing
Test database setup and migration testing
Mock implementations for external dependencies
Test documentation and examples

Implementation Plan

Phase 1: Foundation (Week 1)

Set up testing framework and utilities
Create test database setup
Implement mock objects for external dependencies
Add basic unit tests for core components

Phase 2: Agent Testing (Week 2)

Scanner implementation tests
Agent lifecycle tests
Error handling and resilience tests
Cross-platform compatibility tests

Phase 3: Server Testing (Week 3)

API endpoint tests
Database operation tests
Authentication and security tests
Rate limiting and middleware tests

Phase 4: Integration & Performance (Week 4)

End-to-end integration tests
Multi-agent scenarios
Performance and load tests
Security validation tests

Testing Strategy

Unit Tests

Focus on individual component behavior
Mock external dependencies
Fast execution (<1 second per test)
Cover edge cases and error conditions

Integration Tests

Test component interactions
Use real database and filesystem
Slower execution but comprehensive coverage
Validate complete workflows

Performance Tests

Measure response times and throughput
Test under realistic load conditions
Identify performance bottlenecks
Validate scalability claims

Security Tests

Validate authentication mechanisms
Test cryptographic operations
Verify input validation
Check for common vulnerabilities

Prerequisites

Test database instance (PostgreSQL)
CI/CD pipeline infrastructure
Mock implementations for external services
Performance testing environment
Security testing tools and knowledge

Effort Estimate

Complexity: High Effort: 4 weeks (1 developer)

Week 1: Testing framework and foundation
Week 2: Agent component tests
Week 3: Server component tests
Week 4: Integration and performance tests

Success Metrics

Code coverage >80% for critical components
All major workflows covered by integration tests
Performance tests validate 10,000+ agent support
Security tests verify authentication and cryptography
CI/CD pipeline runs tests automatically
Regression detection for new features
Documentation includes testing guidelines

Monitoring

Track these metrics after implementation:

Test execution time trends
Code coverage percentage
Test failure rates
Performance benchmark results
Security test findings
Developer satisfaction with testing tools

16 KiB Raw Permalink Blame History