# Security Logging Infrastructure

This document describes the structured security logging system implemented in RedFlag v0.2.x.

## Overview

The security logging system provides structured, JSON-formatted logging for security-related events across both the server and agent components. It enables:

- Audit trail of security events
- Real-time monitoring of potential threats
- Historical analysis of security incidents
- Privacy-respecting logging (IP address hashing)

## Architecture

### Server-Side Components

1. **Security Event Model** (`internal/models/security_event.go`)
   - Defines the structure of all security events
   - Includes timestamp, severity, event type, agent ID, and contextual data
   - Supports IP address hashing for privacy

2. **Security Logger** (`internal/logging/security_logger.go`)
   - Handles event logging to both files and database
   - Supports log rotation with lumberjack
   - Asynchronous processing with buffered events
   - Configurable log levels and event filtering

3. **Configuration** (`internal/config/config.go`)
   - Defines a `SecurityLogging` section that holds all security logging options
   - Environment variable support for all settings
   - Default values optimized for production

### Agent-Side Components

1. **Security Logger** (`internal/logging/security_logger.go`)
   - Simplified implementation for agent constraints
   - Local file logging with batching
   - Optional forwarding to server
   - Minimal dependencies

2. **Configuration** (`internal/config/config.go`)
   - Security logging configuration embedded in agent config
   - Environment variable overrides supported
   - Sensible defaults for agent deployments

## Event Types

The system tracks the following security events:

| Event Type | Description | Default Severity |
|------------|-------------|------------------|
| CMD_SIGNATURE_VERIFICATION_FAILED | Command signature verification failed | CRITICAL |
| CMD_SIGNATURE_VERIFICATION_SUCCESS | Command signature verification succeeded | INFO |
| UPDATE_NONCE_INVALID | Update nonce validation failed | WARNING |
| UPDATE_SIGNATURE_VERIFICATION_FAILED | Update signature verification failed | CRITICAL |
| MACHINE_ID_MISMATCH | Machine ID change detected | WARNING |
| AUTH_JWT_VALIDATION_FAILED | JWT authentication failed | WARNING |
| PRIVATE_KEY_NOT_CONFIGURED | Private signing key missing | CRITICAL |
| AGENT_REGISTRATION_FAILED | Agent registration failed | WARNING |
| UNAUTHORIZED_ACCESS_ATTEMPT | Unauthorized API access attempt | WARNING |
| CONFIG_TAMPERING_DETECTED | Configuration file tampering detected | WARNING |
| ANOMALOUS_BEHAVIOR | Anomalous agent behavior detected | WARNING |
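
The table can be read as a lookup from event type to default severity. The following is a minimal sketch of that lookup, not the code in `internal/models/security_event.go`; the names `defaultSeverity` and `DefaultSeverityFor` are hypothetical, and the `WARNING` fallback for unknown event types is an assumption.

```go
package main

import "fmt"

// defaultSeverity mirrors the event table above.
// The map and function names are hypothetical, not RedFlag identifiers.
var defaultSeverity = map[string]string{
	"CMD_SIGNATURE_VERIFICATION_FAILED":    "CRITICAL",
	"CMD_SIGNATURE_VERIFICATION_SUCCESS":   "INFO",
	"UPDATE_NONCE_INVALID":                 "WARNING",
	"UPDATE_SIGNATURE_VERIFICATION_FAILED": "CRITICAL",
	"MACHINE_ID_MISMATCH":                  "WARNING",
	"AUTH_JWT_VALIDATION_FAILED":           "WARNING",
	"PRIVATE_KEY_NOT_CONFIGURED":           "CRITICAL",
	"AGENT_REGISTRATION_FAILED":            "WARNING",
	"UNAUTHORIZED_ACCESS_ATTEMPT":          "WARNING",
	"CONFIG_TAMPERING_DETECTED":            "WARNING",
	"ANOMALOUS_BEHAVIOR":                   "WARNING",
}

// DefaultSeverityFor returns the default severity listed in the table.
func DefaultSeverityFor(eventType string) string {
	if severity, ok := defaultSeverity[eventType]; ok {
		return severity
	}
	return "WARNING" // assumed fallback for event types not in the table
}

func main() {
	fmt.Println(DefaultSeverityFor("UPDATE_NONCE_INVALID")) // prints "WARNING"
}
```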

## Configuration

### Server Configuration

Environment variables:
- `REDFLAG_SECURITY_LOG_ENABLED`: Enable/disable security logging (default: true)
- `REDFLAG_SECURITY_LOG_LEVEL`: Minimum log level (none, error, warning, info, debug) (default: warning)
- `REDFLAG_SECURITY_LOG_SUCCESSES`: Log success events (default: false)
- `REDFLAG_SECURITY_LOG_PATH`: Log file path (default: /var/log/redflag/security.json)
- `REDFLAG_SECURITY_LOG_MAX_SIZE`: Maximum log file size in MB (default: 100)
- `REDFLAG_SECURITY_LOG_MAX_FILES`: Number of rotated log files (default: 10)
- `REDFLAG_SECURITY_LOG_RETENTION`: Retention period in days (default: 90)
- `REDFLAG_SECURITY_LOG_TO_DB`: Store events in database (default: true)
- `REDFLAG_SECURITY_LOG_HASH_IP`: Hash IP addresses for privacy (default: true)
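
As an illustration of how these variables and their defaults could be resolved, here is a hedged sketch. The `SecurityLogConfig` struct, its field names, and the helper functions are hypothetical and do not reflect the actual code in `internal/config/config.go`. Falling back to the default when a value is missing or unparsable is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// SecurityLogConfig is a hypothetical struct holding the documented settings.
type SecurityLogConfig struct {
	Enabled       bool
	Level         string
	LogSuccesses  bool
	Path          string
	MaxSizeMB     int
	MaxFiles      int
	RetentionDays int
	LogToDB       bool
	HashIP        bool
}

// lookupFunc has the same shape as os.LookupEnv, so a map can stand in for it.
type lookupFunc func(key string) (string, bool)

func envString(lookup lookupFunc, key, def string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return def
}

func envBool(lookup lookupFunc, key string, def bool) bool {
	if v, ok := lookup(key); ok {
		if b, err := strconv.ParseBool(v); err == nil {
			return b
		}
	}
	return def // missing or unparsable: keep the documented default
}

func envInt(lookup lookupFunc, key string, def int) int {
	if v, ok := lookup(key); ok {
		if n, err := strconv.Atoi(v); err == nil {
			return n
		}
	}
	return def // missing or unparsable: keep the documented default
}

// LoadSecurityLogConfig applies the defaults listed above.
func LoadSecurityLogConfig(lookup lookupFunc) SecurityLogConfig {
	return SecurityLogConfig{
		Enabled:       envBool(lookup, "REDFLAG_SECURITY_LOG_ENABLED", true),
		Level:         envString(lookup, "REDFLAG_SECURITY_LOG_LEVEL", "warning"),
		LogSuccesses:  envBool(lookup, "REDFLAG_SECURITY_LOG_SUCCESSES", false),
		Path:          envString(lookup, "REDFLAG_SECURITY_LOG_PATH", "/var/log/redflag/security.json"),
		MaxSizeMB:     envInt(lookup, "REDFLAG_SECURITY_LOG_MAX_SIZE", 100),
		MaxFiles:      envInt(lookup, "REDFLAG_SECURITY_LOG_MAX_FILES", 10),
		RetentionDays: envInt(lookup, "REDFLAG_SECURITY_LOG_RETENTION", 90),
		LogToDB:       envBool(lookup, "REDFLAG_SECURITY_LOG_TO_DB", true),
		HashIP:        envBool(lookup, "REDFLAG_SECURITY_LOG_HASH_IP", true),
	}
}

func main() {
	cfg := LoadSecurityLogConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the loader testable without touching the process environment.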

### Agent Configuration

JSON configuration fields:
```json
{
  "security_logging": {
    "enabled": true,
    "level": "warning",
    "log_successes": false,
    "file_path": "security.log",
    "max_size_mb": 50,
    "max_files": 5,
    "batch_size": 10,
    "send_to_server": true
  }
}
```

Environment variables:
- `REDFLAG_AGENT_SECURITY_LOG_ENABLED`
- `REDFLAG_AGENT_SECURITY_LOG_LEVEL`
- `REDFLAG_AGENT_SECURITY_LOG_SUCCESSES`
- `REDFLAG_AGENT_SECURITY_LOG_PATH`
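
The JSON block above maps naturally onto a Go struct with `json` tags. The sketch below is illustrative only: the type names `AgentConfig` and `AgentSecurityLogging` and the function `ParseAgentConfig` are hypothetical, and treating the sample values as defaults for omitted fields is an assumption, not documented agent behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AgentSecurityLogging mirrors the "security_logging" JSON object (hypothetical type).
type AgentSecurityLogging struct {
	Enabled      bool   `json:"enabled"`
	Level        string `json:"level"`
	LogSuccesses bool   `json:"log_successes"`
	FilePath     string `json:"file_path"`
	MaxSizeMB    int    `json:"max_size_mb"`
	MaxFiles     int    `json:"max_files"`
	BatchSize    int    `json:"batch_size"`
	SendToServer bool   `json:"send_to_server"`
}

// AgentConfig holds only the part of the agent config relevant here.
type AgentConfig struct {
	SecurityLogging AgentSecurityLogging `json:"security_logging"`
}

// ParseAgentConfig decodes JSON on top of assumed defaults, so fields that are
// absent from the input keep their pre-populated values.
func ParseAgentConfig(data []byte) (AgentConfig, error) {
	cfg := AgentConfig{SecurityLogging: AgentSecurityLogging{
		Enabled:      true,
		Level:        "warning",
		FilePath:     "security.log",
		MaxSizeMB:    50,
		MaxFiles:     5,
		BatchSize:    10,
		SendToServer: true,
	}}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return AgentConfig{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := ParseAgentConfig([]byte(`{"security_logging": {"level": "info"}}`))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(cfg.SecurityLogging.Level, cfg.SecurityLogging.BatchSize) // prints "info 10"
}
```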

## Usage Examples

### Server Integration

```go
// Initialize in main.go
securityLogger, err := logging.NewSecurityLogger(config.SecurityLogging, db)
if err != nil {
    log.Fatal(err)
}
defer securityLogger.Close()

// Log verification failure
securityLogger.LogCommandVerificationFailure(
    agentID,
    commandID,
    "signature mismatch",
)

// Create custom event
event := models.NewSecurityEvent(
    "WARNING",
    models.SecurityEventTypes.AnomalousBehavior,
    agentID,
    "Agent check-in frequency changed dramatically",
)
event.WithDetail("previous_interval", "300s")
event.WithDetail("current_interval", "5s")
securityLogger.Log(event)
```

### Agent Integration

```go
// Initialize in main.go
securityLogger, err := logging.NewSecurityLogger(config, dataDir)
if err != nil {
    log.Fatal(err)
}
defer securityLogger.Close()

// Log signature verification failure
securityLogger.LogCommandVerificationFailure(
    commandID,
    "signature verification failed",
)

// Get and send batch to server
events := securityLogger.GetBatch()
if len(events) > 0 {
    if sendToServer(events) {
        securityLogger.ClearBatch()
    }
}
```
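
The `GetBatch` / `ClearBatch` pair implies a buffer that survives a failed send. One way such a buffer could work is sketched below; this is an assumption about the design, not the agent's actual implementation. The `eventBatch` type is hypothetical and stores plain strings to stay short. `ClearBatch` drops only the events returned by the last `GetBatch`, so events logged while a send is in flight are kept.

```go
package main

import (
	"fmt"
	"sync"
)

// eventBatch is a hypothetical in-memory buffer of pending events.
type eventBatch struct {
	mu     sync.Mutex
	events []string
	handed int // number of events returned by the last GetBatch call
}

// Add appends an event to the pending buffer.
func (b *eventBatch) Add(event string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.events = append(b.events, event)
}

// GetBatch returns a copy of the pending events and remembers how many it returned.
func (b *eventBatch) GetBatch() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]string, len(b.events))
	copy(out, b.events)
	b.handed = len(out)
	return out
}

// ClearBatch removes only the events handed out by the last GetBatch call.
func (b *eventBatch) ClearBatch() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.events = append([]string(nil), b.events[b.handed:]...)
	b.handed = 0
}

func main() {
	b := &eventBatch{}
	b.Add("first")
	b.Add("second")
	sent := b.GetBatch()
	b.Add("third") // arrives while the batch is being sent
	b.ClearBatch()
	fmt.Println(len(sent), b.GetBatch()) // prints "2 [third]"
}
```

If the send fails, the caller simply skips `ClearBatch`, and the same events are returned by the next `GetBatch`.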

## Log Format

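Events are logged as JSON objects. The example shows a raw `ip_address`; with `REDFLAG_SECURITY_LOG_HASH_IP` enabled (the default), this field holds the hashed value instead: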

```json
{
  "timestamp": "2025-12-13T10:30:45.123456Z",
  "level": "WARNING",
  "event_type": "CMD_SIGNATURE_VERIFICATION_FAILED",
  "agent_id": "550e8400-e29b-41d4-a716-446655440000",
  "message": "Command signature verification failed",
  "trace_id": "trace-123456",
  "ip_address": "192.168.1.100",
  "details": {
    "command_id": "cmd-123",
    "reason": "signature mismatch"
  },
  "metadata": {
    "source": "api",
    "user_agent": "redflag-agent/0.2.0"
  }
}
```
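
For consumers that read the log file, the format above can be decoded with the standard library. The `SecurityEvent` struct below is a hypothetical mirror of the JSON fields, not the model defined in `internal/models/security_event.go`, and `ParseLogLine` is an illustrative helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// SecurityEvent mirrors the JSON log format shown above (hypothetical type).
type SecurityEvent struct {
	Timestamp time.Time              `json:"timestamp"`
	Level     string                 `json:"level"`
	EventType string                 `json:"event_type"`
	AgentID   string                 `json:"agent_id,omitempty"`
	Message   string                 `json:"message"`
	TraceID   string                 `json:"trace_id,omitempty"`
	IPAddress string                 `json:"ip_address,omitempty"`
	Details   map[string]interface{} `json:"details,omitempty"`
	Metadata  map[string]interface{} `json:"metadata,omitempty"`
}

// ParseLogLine decodes one JSON log entry into a SecurityEvent.
func ParseLogLine(line []byte) (SecurityEvent, error) {
	var ev SecurityEvent
	err := json.Unmarshal(line, &ev)
	return ev, err
}

func main() {
	line := []byte(`{
	  "timestamp": "2025-12-13T10:30:45.123456Z",
	  "level": "WARNING",
	  "event_type": "CMD_SIGNATURE_VERIFICATION_FAILED",
	  "agent_id": "550e8400-e29b-41d4-a716-446655440000",
	  "message": "Command signature verification failed",
	  "details": {"command_id": "cmd-123", "reason": "signature mismatch"}
	}`)
	ev, err := ParseLogLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(ev.EventType, ev.Details["reason"])
}
```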

## Database Schema

When database logging is enabled, events are stored in the `security_events` table:

```sql
CREATE TABLE security_events (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
    level VARCHAR(20) NOT NULL,
    event_type VARCHAR(100) NOT NULL,
    agent_id UUID,
    message TEXT NOT NULL,
    trace_id VARCHAR(100),
    ip_address VARCHAR(100),
    details JSONB,
    metadata JSONB
);

-- Indexes for efficient querying
CREATE INDEX idx_security_events_timestamp ON security_events(timestamp);
CREATE INDEX idx_security_events_agent_id ON security_events(agent_id);
CREATE INDEX idx_security_events_level ON security_events(level);
CREATE INDEX idx_security_events_event_type ON security_events(event_type);
```

## Privacy Considerations

1. **IP Address Hashing**: When enabled, IP addresses are hashed with SHA-256 and only the first 8 characters of the digest are shown
2. **Minimal Data**: Only essential security data is logged
3. **Configurable Scope**: Logging of success events can be disabled to reduce noise
4. **Configurable Retention**: Old log files are cleaned up automatically once the retention period expires
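
The IP hashing described in item 1 can be sketched as follows. This is a minimal illustration, assuming the digest is hex-encoded before truncation and that no salt or key is applied; the function name `HashIP` is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// HashIP returns the first 8 hex characters of the SHA-256 digest of ip.
// Hex encoding and the absence of a salt are assumptions of this sketch.
func HashIP(ip string) string {
	sum := sha256.Sum256([]byte(ip))
	return hex.EncodeToString(sum[:])[:8]
}

func main() {
	fmt.Println(HashIP("192.168.1.100"))
}
```

The same address always maps to the same 8-character value, so events from one source can still be grouped. An unsalted hash of an IPv4 address can be reversed by enumerating the address space, so a keyed hash may be worth considering where stronger privacy is required.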

## Monitoring and Alerting

The structured JSON format enables easy integration with monitoring tools:

- **Elasticsearch + Kibana**: Index logs for searching and visualization
- **Splunk**: Forward logs for SIEM analysis
- **Prometheus + Alertmanager**: Count events by type and trigger alerts
- **Grafana**: Create dashboards for security metrics

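Example Prometheus queries, assuming a `security_events_total` counter labelled with `level`, `event_type`, and `agent_id` is exported (see Future Enhancements):
```promql
# Increase in critical security events over the last 5 minutes
increase(security_events_total{level="CRITICAL"}[5m])

# Top 10 event types by count over the last hour
topk(10, sum by (event_type) (increase(security_events_total[1h])))

# Top 10 agents by number of security events over the last 5 minutes
topk(10, sum by (agent_id) (increase(security_events_total[5m])))
```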

## Performance Considerations

1. **Asynchronous Processing**: The server queues events on a buffered channel so that callers are not blocked by file or database writes
2. **Batch Writing**: The agent batches events before sending them to the server
3. **Log Rotation**: Automatic rotation keeps log files from filling the disk
4. **Level Filtering**: Events below the configured level are dropped before any further processing
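
Items 1 and 4 can be combined in a small sketch: events below the minimum severity are rejected first, and the rest are queued on a buffered channel and written by a background goroutine. All names here (`AsyncLogger`, `Event`, `severityRank`) are hypothetical, and dropping events when the buffer is full is an assumed policy, not documented RedFlag behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a simplified stand-in for a security event.
type Event struct {
	Level   string
	Message string
}

// severityRank orders the severities used in the event table.
var severityRank = map[string]int{"INFO": 1, "WARNING": 2, "CRITICAL": 3}

// AsyncLogger filters events by severity and writes them in the background.
type AsyncLogger struct {
	minRank int
	events  chan Event
	wg      sync.WaitGroup
}

// NewAsyncLogger starts a goroutine that passes queued events to sink.
func NewAsyncLogger(minLevel string, buffer int, sink func(Event)) *AsyncLogger {
	l := &AsyncLogger{minRank: severityRank[minLevel], events: make(chan Event, buffer)}
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		for ev := range l.events {
			sink(ev)
		}
	}()
	return l
}

// Log reports whether the event was queued. It never blocks the caller.
func (l *AsyncLogger) Log(ev Event) bool {
	if severityRank[ev.Level] < l.minRank {
		return false // below the configured level: dropped early
	}
	select {
	case l.events <- ev:
		return true
	default:
		return false // buffer full: dropped (assumed policy)
	}
}

// Close stops accepting events and waits for queued events to be written.
func (l *AsyncLogger) Close() {
	close(l.events)
	l.wg.Wait()
}

func main() {
	logger := NewAsyncLogger("WARNING", 16, func(ev Event) {
		fmt.Println(ev.Level, ev.Message)
	})
	logger.Log(Event{Level: "INFO", Message: "filtered out"})
	logger.Log(Event{Level: "CRITICAL", Message: "signature mismatch"})
	logger.Close() // prints "CRITICAL signature mismatch" before returning
}
```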

## Troubleshooting

### Common Issues

1. **Permission Denied**: Ensure log directory exists and is writable
   ```bash
   sudo mkdir -p /var/log/redflag
   sudo chown redflag:redflag /var/log/redflag
   ```

2. **Missing Database Table**: The logger creates its tables automatically, but the database user must have CREATE privileges

3. **High CPU Usage**: In high-traffic environments, increase batch sizes or raise the minimum log level (for example, from `info` to `warning`) so that fewer events are processed

4. **Large Log Files**: Lower the maximum file size, the number of rotated files, or the retention period

### Debug Mode

Set log level to "debug" for verbose logging:
```bash
export REDFLAG_SECURITY_LOG_LEVEL=debug
```

## Future Enhancements

1. **Structured Metrics**: Export counts by event type to Prometheus
2. **Event Correlation**: Link related events with correlation IDs
3. **Remote Logging**: Support for syslog and remote log aggregation
4. **Event Filtering**: Advanced filtering rules based on agent, type, or content
5. **Retention Policies**: Per-event-type retention configurations
6. **Encryption**: Encrypt sensitive log fields at rest