Redflag/docs/4_LOG/December_2025/2025-11-16_Security-Hardening-Implementation-Guide.md

# RedFlag Security Hardening Guide

## Overview

This guide provides comprehensive hardening recommendations for RedFlag deployments in production environments. It covers network security, key management, monitoring, and incident response procedures.

## Production Deployment Checklist

### Pre-Deployment Requirements

#### Security Configuration
- [ ] Generate unique Ed25519 signing key
- [ ] Set strong JWT secret (>32 random chars)
- [ ] Enable TLS 1.3 with valid certificates
- [ ] Configure minimum agent version (v0.1.22+)
- [ ] Set appropriate token and seat limits
- [ ] Enable all security logging
- [ ] Configure alerting thresholds

#### Network Security
- [ ] Place server behind corporate firewall
- [ ] Use dedicated security group/VPC segment
- [ ] Configure inbound port restrictions (default: 8443)
- [ ] Enable DDoS protection at network boundary
- [ ] Configure outbound restrictions if needed
- [ ] Set up VPN or private network for agent connectivity

#### Infrastructure Security
- [ ] Use dedicated service account for RedFlag
- [ ] Enable OS-level security updates
- [ ] Configure file system encryption
- [ ] Set up backup encryption
- [ ] Enable audit logging at OS level
- [ ] Configure intrusion detection system

### Server Hardening

#### TLS Configuration
```nginx
# nginx reverse proxy example
server {
    listen 443 ssl http2;
    server_name redflag.company.com;

    # TLS 1.3 only for best security
    ssl_protocols TLSv1.3;
    ssl_ciphers TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256;
    ssl_prefer_server_ciphers off;

    # Certificate chain
    ssl_certificate /etc/ssl/certs/redflag-fullchain.pem;
    ssl_certificate_key /etc/ssl/private/redflag.key;

    # HSTS
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    # Security headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    add_header Referrer-Policy strict-origin-when-cross-origin;

    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

#### System Service Configuration
```ini
# /etc/systemd/system/redflag-server.service
[Unit]
Description=RedFlag Server
After=network.target

[Service]
Type=simple
User=redflag
Group=redflag
Environment=REDFLAG_SIGNING_PRIVATE_KEY=/etc/redflag/private_key
Environment=REDFLAG_TLS_CERT_FILE=/etc/ssl/certs/redflag.crt
Environment=REDFLAG_TLS_KEY_FILE=/etc/ssl/private/redflag.key
ExecStart=/usr/local/bin/redflag-server
Restart=always
RestartSec=5
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/redflag /var/log/redflag

[Install]
WantedBy=multi-user.target
```

#### File Permissions
```bash
# Secure configuration files
chmod 600 /etc/redflag/private_key
chmod 600 /etc/redflag/config.env
chmod 640 /var/log/redflag/*.log

# Application permissions
chown root:root /usr/local/bin/redflag-server
chmod 755 /usr/local/bin/redflag-server

# Directory permissions
chmod 750 /var/lib/redflag
chmod 750 /var/log/redflag
chmod 751 /etc/redflag
```

### Agent Hardening

#### Agent Service Configuration (Linux)
```ini
# /etc/systemd/system/redflag-agent.service
[Unit]
Description=RedFlag Agent
After=network.target

[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/redflag-agent -config /etc/redflag/agent.json
Restart=always
RestartSec=30
CapabilityBoundingSet=
AmbientCapabilities=
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/redflag /var/log/redflag /tmp

[Install]
WantedBy=multi-user.target
```

#### Agent Configuration Hardening
```json
{
    "server_url": "https://redflag.company.com:8443",
    "agent_id": "generated-at-registration",
    "machine_binding": {
        "enforced": true,
        "validate_hardware": true
    },
    "security": {
        "require_tls": true,
        "verify_certificates": true,
        "public_key_fingerprint": "cached_from_server"
    },
    "logging": {
        "level": "info",
        "security_events": true
    }
}
```

## Key Management Best Practices

### Ed25519 Key Generation
```bash
#!/bin/bash
# Production key generation script

# Generate new key pair
PRIVATE_KEY=$(openssl rand -hex 32)
PUBLIC_KEY=$(echo -n "$PRIVATE_KEY" | xxd -r -p | tail -c 32 | xxd -p)

# Store securely
echo "$PRIVATE_KEY" | vault kv put secret/redflag/signing-key value=-
chmod 600 /tmp/private_key
echo "$PRIVATE_KEY" > /tmp/private_key

# Show fingerprint (first 8 bytes)
FINGERPRINT=$(echo "$PUBLIC_KEY" | cut -c1-16)
echo "Public key fingerprint: $FINGERPRINT"

# Cleanup
rm -f /tmp/private_key
```

### Using HashiCorp Vault
```bash
# Store key in Vault
vault kv put secret/redflag/signing-key \
    private_key=$PRIVATE_KEY \
    public_key=$PUBLIC_KEY \
    created_at=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# Retrieve for deployment
export REDFLAG_SIGNING_PRIVATE_KEY=$(vault kv get -field=private_key secret/redflag/signing-key)
```

### Key Rotation Procedure
```bash
#!/bin/bash
# Key rotation with minimal downtime

NEW_KEY=$(openssl rand -hex 32)
OLD_KEY=$(vault kv get -field=private_key secret/redflag/signing-key)

# 1. Update server with both keys temporarily
export REDFLAG_SIGNING_PRIVATE_KEY=$NEW_KEY
systemctl restart redflag-server

# 2. Update agents (grace period starts)
# Agents will receive new public key on next check-in

# 3. Monitor for 24 hours
# Check that all agents have updated

# 4. Archive old key
vault kv patch secret/redflag/retired-keys \
    "$(date +%Y%m%d)_key=$OLD_KEY"

echo "Key rotation complete"
```

### AWS KMS Integration (Example)
```go
// Retrieve key from AWS KMS
func getSigningKeyFromKMS() (string, error) {
    sess := session.Must(session.NewSession())
    kms := kms.New(sess)

    result, err := kms.Decrypt(&kms.DecryptInput{
        CiphertextBlob: encryptedKey,
    })
    if err != nil {
        return "", err
    }

    return hex.EncodeToString(result.Plaintext), nil
}
```

## Network Security Recommendations

### Firewall Rules
```bash
# iptables rules for RedFlag server
iptables -A INPUT -p tcp --dport 8443 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 172.16.0.0/12 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 192.168.0.0/16 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -j DROP

# Allow only outbound HTTPS from agents
iptables -A OUTPUT -p tcp --dport 443 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 80 -j DROP
```

### AWS Security Group Example
```json
{
    "Description": "RedFlag Server Security Group",
    "IpPermissions": [
        {
            "IpProtocol": "tcp",
            "FromPort": 8443,
            "ToPort": 8443,
            "UserIdGroupPairs": [{"GroupId": "sg-agent-group"}],
            "IpRanges": [
                {"CidrIp": "10.0.0.0/8"},
                {"CidrIp": "172.16.0.0/12"},
                {"CidrIp": "192.168.0.0/16"}
            ]
        }
    ]
}
```

### Network Segmentation
```
[DMZ] --firewall--> [Application Tier] --firewall--> [Database Tier]

RedFlag Components:
- Load Balancer (DMZ)
- Web UI Server (Application Tier)
- API Server (Application Tier)
- PostgreSQL Database (Database Tier)
```

## Monitoring and Alerting Setup

### Prometheus Metrics Export
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'redflag'
    scheme: https
    tls_config:
      cert_file: /etc/ssl/certs/redflag.crt
      key_file: /etc/ssl/private/redflag.key
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: '/metrics'
    scrape_interval: 15s
```

### Grafana Dashboard Panels
```json
{
    "dashboard": {
        "title": "RedFlag Security Overview",
        "panels": [
            {
                "title": "Failed Updates",
                "targets": [
                    {
                        "expr": "rate(redflag_update_failures_total[5m])",
                        "legendFormat": "Failed Updates/sec"
                    }
                ]
            },
            {
                "title": "Machine Binding Violations",
                "targets": [
                    {
                        "expr": "redflag_machine_binding_violations_total",
                        "legendFormat": "Total Violations"
                    }
                ]
            },
            {
                "title": "Authentication Failures",
                "targets": [
                    {
                        "expr": "rate(redflag_auth_failures_total[5m])",
                        "legendFormat": "Auth Failures/sec"
                    }
                ]
            }
        ]
    }
}
```

### AlertManager Rules
```yaml
# alertmanager.yml
groups:
- name: redflag-security
  rules:
  - alert: UpdateVerificationFailure
    expr: rate(redflag_update_failures_total[5m]) > 0.1
    for: 2m
    labels:
      severity: critical
    annotations:
      summary: "High update failure rate detected"
      description: "Update verification failures: {{ $value }}/sec"

  - alert: MachineBindingViolation
    expr: increase(redflag_machine_binding_violations_total[5m]) > 0
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: "Machine binding violation detected"
      description: "Possible agent impersonation attempt"

  - alert: AuthenticationFailureSpike
    expr: rate(redflag_auth_failures_total[5m]) > 1
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Authentication failure spike"
      description: "{{ $value }} failed auth attempts/sec"
```

### ELK Stack Configuration
```json
{
  "index": "redflag-security-*",
  "mappings": {
    "properties": {
      "timestamp": {"type": "date"},
      "event_type": {"type": "keyword"},
      "agent_id": {"type": "keyword"},
      "severity": {"type": "keyword"},
      "message": {"type": "text"},
      "source_ip": {"type": "ip"}
    }
  }
}
```

## Incident Response Procedures

### Detection Workflow

#### 1. Immediate Detection
```bash
# Check for recent security events
grep "SECURITY" /var/log/redflag/server.log | tail -100

# Monitor failed updates
curl -s "https://server:8443/api/v1/security/overview" | jq .

# Check agent compliance
curl -s "https://server:8443/api/v1/agents?compliance=false"
```

#### 2. Threat Classification
```
Critical:
- Update verification failures
- Machine binding violations
- Private key compromise

High:
- Authentication failure spikes
- Agent version downgrade attempts
- Unauthorized registration attempts

Medium:
- Configuration changes
- Unusual agent patterns
- Network anomalies
```

### Response Procedures

#### Update Tampering Incident
```bash
#!/bin/bash
# Incident response: update tampering

# 1. Isolate affected systems
iptables -I INPUT -s <affected-ip-range> -j DROP

# 2. Revoke potentially compromised update
curl -X DELETE -H "Authorization: Bearer $TOKEN" \
  https://server:8443/api/v1/updates/<update-id>

# 3. Rotate signing key
rotate-signing-key.sh

# 4. Force agent verification
for agent in $(get-all-agents.sh); do
  curl -X POST -H "Authorization: Bearer $TOKEN" \
    -d '{"action": "verify"}" \
    https://server:8443/api/v1/agents/$agent/verify
done

# 5. Generate incident report
generate-incident-report.sh update-tampering
```

#### Machine Binding Violation Response
```bash
#!/bin/bash
# Incident response: machine binding violation

AGENT_ID=$1
VIOLATION_COUNT=$(get-violation-count.sh $AGENT_ID)

if [ $VIOLATION_COUNT -gt 3 ]; then
  # Block agent
  curl -X POST -H "Authorization: Bearer $TOKEN" \
    -d '{"blocked": true, "reason": "machine binding violation"}' \
    https://server:8443/api/v1/agents/$AGENT_ID/block

  # Notify security team
  send-security-alert.sh "Agent $AGENT_ID blocked for machine ID violations"
else
  # Issue warning
  curl -X POST -H "Authorization: Bearer $TOKEN" \
    -d '{"message": "Security warning: machine ID mismatch detected"}' \
    https://server:8443/api/v1/agents/$AGENT_ID/warn
fi
```

### Forensics Collection

#### Evidence Collection Script
```bash
#!/bin/bash
# Collect forensic artifacts

INCIDENT_ID=$1
EVIDENCE_DIR="/evidence/$INCIDENT_ID"
mkdir -p $EVIDENCE_DIR

# Server logs
cp /var/log/redflag/*.log $EVIDENCE_DIR/
tar -czf $EVIDENCE_DIR/system-logs.tar.gz /var/log/syslog /var/log/auth.log

# Database dump of security events
pg_dump -h localhost -U redflag redflag \
  -t security_events -f $EVIDENCE_DIR/security_events.sql

# Agent states
curl -s "https://server:8443/api/v1/agents" | jq . > $EVIDENCE_DIR/agents.json

# Network connections
netstat -tulpn > $EVIDENCE_DIR/network-connections.txt
ss -tulpn >> $EVIDENCE_DIR/network-connections.txt

# Hash and sign evidence
find $EVIDENCE_DIR -type f -exec sha256sum {} \; > $EVIDENCE_DIR/hashes.txt
gpg --detach-sign --armor $EVIDENCE_DIR/hashes.txt
```

## Compliance Mapping

### SOC 2 Type II Controls
```
CC6.1 - Logical and Physical Access Controls:
- Machine binding implementation
- JWT authentication
- Registration token limits

CC7.1 - System Operation:
- Security event logging
- Monitoring and alerting
- Incident response procedures

CC6.7 - Transmission:
- TLS 1.3 encryption
- Update package signing
- Certificate management
```

### ISO 27001 Annex A Controls
```
A.10.1 - Cryptographic Controls:
- Ed25519 update signing
- Key management procedures
- Encryption at rest/in transit

A.12.4 - Event Logging:
- Comprehensive audit trails
- Log retention policies
- Tamper-evident logging

A.14.2 - Secure Development:
- Security by design
- Regular security assessments
- Vulnerability management
```

## Backup and Recovery

### Encrypted Backup Script
```bash
#!/bin/bash
# Secure backup procedure

BACKUP_DIR="/backup/redflag/$(date +%Y%m%d)"
mkdir -p $BACKUP_DIR

# 1. Database backup
pg_dump -h localhost -U redflag redflag | \
  gpg --cipher-algo AES256 --compress-algo 1 --symmetric \
      --output $BACKUP_DIR/database.sql.gpg

# 2. Configuration backup
tar -czf - /etc/redflag/ | \
  gpg --cipher-algo AES256 --compress-algo 1 --symmetric \
      --output $BACKUP_DIR/config.tar.gz.gpg

# 3. Keys backup (separate location)
tar -czf - /opt/redflag/keys/ | \
  gpg --cipher-algo AES256 --compress-algo 1 --symmetric \
      --output /secure/offsite/keys_$(date +%Y%m%d).tar.gz.gpg

# 4. Verify backup
gpg --batch --passphrase "$BACKUP_PASSPHRASE" \
  --decrypt $BACKUP_DIR/database.sql.gpg | \
  head -20

# 5. Clean old backups (retain 30 days)
find /backup/redflag -type d -mtime +30 -exec rm -rf {} \;
```

### Disaster Recovery Test
```bash
#!/bin/bash
# Monthly DR test

# 1. Spin up test environment
docker-compose -f docker-compose.test.yml up -d

# 2. Restore database
gpg --batch --passphrase "$BACKUP_PASSPHRASE" \
  --decrypt $BACKUP_DIR/database.sql.gpg | \
  psql -h localhost -U redflag redflag

# 3. Verify functionality
./dr-tests.sh

# 4. Cleanup
docker-compose -f docker-compose.test.yml down
```

## Security Testing

### Penetration Testing Checklist
```
Authentication:
- Test weak passwords
- JWT token manipulation attempts
- Registration token abuse
- Session fixation checks

Authorization:
- Privilege escalation attempts
- Cross-tenant data access
- API endpoint abuse

Update Security:
- Signed package tampering
- Replay attack attempts
- Downgrade attack testing

Infrastructure:
- TLS configuration validation
- Certificate chain verification
- Network isolation testing
```

### Automated Security Scanning
```yaml
# .github/workflows/security-scan.yml
name: Security Scan

on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Run Gosec Security Scanner
        uses: securecodewarrior/github-action-gosec@master
        with:
          args: '-no-fail -fmt sarif -out results.sarif ./...'

      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload SARIF files
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: results.sarif
```

## Reference Architecture

### Enterprise Deployment
```
                    [Internet]
                         |
                 [CloudFlare/WAF]
                         |
                [Application Load Balancer]
                    (TLS Termination)
                         |
                 +-----------------+
                 |  Bastion Host   |
                 +-----------------+
                         |
          +------------------------------+
          |     Private Network          |
          |                              |
   +------+-----+               +--------+--------+
   | RedFlag     |               | PostgreSQL      |
   | Server      |               | (Encrypted)     |
   | (Cluster)   |               +-----------------+
   +------+-----+
          |
   +------+------------+------------+-------------+
   |                   |            |             |
[K8s Cluster]   [Bare Metal]  [VMware]     [Cloud VMs]
   |                   |            |             |
[RedFlag Agents] [RedFlag Agents][RedFlag Agents][RedFlag Agents]
```

## Security Contacts and Resources

### Team Contacts
- Security Team: security@company.com
- Incident Response: ir@company.com
- Engineering: redflag-team@company.com

### External Resources
- CVE Database: https://cve.mitre.org
- OWASP Testing Guide: https://owasp.org/www-project-web-security-testing-guide/
- NIST Cybersecurity Framework: https://www.nist.gov/cyberframework

### Internal Resources
- Security Documentation: `/docs/SECURITY.md`
- Configuration Guide: `/docs/SECURITY-SETTINGS.md`
- Incident Response Runbook: `/docs/INCIDENT-RESPONSE.md`
- Architecture Decisions: `/docs/ADR/`