Redflag/docs/4_LOG/December_2025/2025-11-16_Security-Hardening-Implementation-Guide.md

# RedFlag Security Hardening Guide
## Overview
This guide provides comprehensive hardening recommendations for RedFlag deployments in production environments. It covers network security, key management, monitoring, and incident response procedures.
## Production Deployment Checklist
### Pre-Deployment Requirements
#### Security Configuration
- [ ] Generate unique Ed25519 signing key
- [ ] Set strong JWT secret (>32 random chars)
- [ ] Enable TLS 1.3 with valid certificates
- [ ] Configure minimum agent version (v0.1.22+)
- [ ] Set appropriate token and seat limits
- [ ] Enable all security logging
- [ ] Configure alerting thresholds
#### Network Security
- [ ] Place server behind corporate firewall
- [ ] Use dedicated security group/VPC segment
- [ ] Configure inbound port restrictions (default: 8443)
- [ ] Enable DDoS protection at network boundary
- [ ] Configure outbound restrictions if needed
- [ ] Set up VPN or private network for agent connectivity
#### Infrastructure Security
- [ ] Use dedicated service account for RedFlag
- [ ] Enable OS-level security updates
- [ ] Configure file system encryption
- [ ] Set up backup encryption
- [ ] Enable audit logging at OS level
- [ ] Configure intrusion detection system
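
Two of the checklist items above (the JWT secret length and the minimum agent version) can be checked mechanically before deployment. The following is a minimal sketch, not part of RedFlag: the function names `jwt_secret_ok`, `version_at_least`, and `generate_jwt_secret` are hypothetical, and the version comparison assumes GNU `sort` with `-V` support.

```bash
#!/bin/bash
# Hypothetical pre-flight helpers; not shipped with RedFlag.

# Succeeds when the secret is longer than 32 characters.
jwt_secret_ok() {
    [ "${#1}" -gt 32 ]
}

# Succeeds when version $1 is >= minimum version $2 (a leading "v" is ignored).
# Relies on GNU sort -V for version ordering.
version_at_least() {
    local have min
    have=${1#v}
    min=${2#v}
    [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n 1)" = "$min" ]
}

# Prints 48 random bytes, base64-encoded (64 characters).
generate_jwt_secret() {
    head -c 48 /dev/urandom | base64 | tr -d '\n'
}
```

For example, `version_at_least v0.1.22 v0.1.22` succeeds and `version_at_least v0.1.9 v0.1.22` fails, which matches the `v0.1.22+` requirement in the checklist.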
### Server Hardening
#### TLS Configuration
```nginx
# nginx reverse proxy example
server {
    listen 443 ssl http2;
    server_name redflag.company.com;
    # TLS 1.3 only for best security
    ssl_protocols TLSv1.3;
    # ssl_ciphers does not apply to TLS 1.3; its suites are set via OpenSSL (nginx 1.19.4+)
    ssl_conf_command Ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256;
    ssl_prefer_server_ciphers off;
    # Certificate chain
    ssl_certificate /etc/ssl/certs/redflag-fullchain.pem;
    ssl_certificate_key /etc/ssl/private/redflag.key;
    # HSTS
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    # Security headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    add_header Referrer-Policy strict-origin-when-cross-origin;
    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
#### System Service Configuration
```ini
# /etc/systemd/system/redflag-server.service
[Unit]
Description=RedFlag Server
After=network.target
[Service]
Type=simple
User=redflag
Group=redflag
Environment=REDFLAG_SIGNING_PRIVATE_KEY=/etc/redflag/private_key
Environment=REDFLAG_TLS_CERT_FILE=/etc/ssl/certs/redflag.crt
Environment=REDFLAG_TLS_KEY_FILE=/etc/ssl/private/redflag.key
ExecStart=/usr/local/bin/redflag-server
Restart=always
RestartSec=5
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/redflag /var/log/redflag
[Install]
WantedBy=multi-user.target
```
#### File Permissions
```bash
# Secure configuration files
chmod 600 /etc/redflag/private_key
chmod 600 /etc/redflag/config.env
chmod 640 /var/log/redflag/*.log
# Application permissions
chown root:root /usr/local/bin/redflag-server
chmod 755 /usr/local/bin/redflag-server
# Directory permissions
chmod 750 /var/lib/redflag
chmod 750 /var/log/redflag
chmod 751 /etc/redflag
```
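
The permissions above can drift after upgrades or manual edits, so a periodic audit is useful. The sketch below is an illustration only: `check_mode` is a hypothetical helper, and it assumes GNU `stat` (Linux) for the `-c '%a'` format.

```bash
#!/bin/bash
# Hypothetical audit helper; assumes GNU stat (Linux).

# check_mode PATH EXPECTED_MODE
# Prints one status line and returns non-zero when the octal mode differs.
check_mode() {
    local path expected actual
    path=$1
    expected=$2
    actual=$(stat -c '%a' "$path") || return 2
    if [ "$actual" != "$expected" ]; then
        echo "MISMATCH $path: mode $actual, expected $expected"
        return 1
    fi
    echo "OK $path: mode $actual"
}
```

A typical call would be `check_mode /etc/redflag/private_key 600`, repeated for each path listed in the block above.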
### Agent Hardening
#### Agent Service Configuration (Linux)
```ini
# /etc/systemd/system/redflag-agent.service
[Unit]
Description=RedFlag Agent
After=network.target
[Service]
Type=simple
User=root
Group=root
ExecStart=/usr/local/bin/redflag-agent -config /etc/redflag/agent.json
Restart=always
RestartSec=30
CapabilityBoundingSet=
AmbientCapabilities=
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/redflag /var/log/redflag /tmp
[Install]
WantedBy=multi-user.target
```
#### Agent Configuration Hardening
```json
{
"server_url": "https://redflag.company.com:8443",
"agent_id": "generated-at-registration",
"machine_binding": {
"enforced": true,
"validate_hardware": true
},
"security": {
"require_tls": true,
"verify_certificates": true,
"public_key_fingerprint": "cached_from_server"
},
"logging": {
"level": "info",
"security_events": true
}
}
```
## Key Management Best Practices
### Ed25519 Key Generation
```bash
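#!/bin/bash
# Production key generation script
set -euo pipefail
umask 077
# Generate new key pair in a private temporary directory
WORK_DIR=$(mktemp -d)
trap 'rm -rf "$WORK_DIR"' EXIT
openssl genpkey -algorithm ed25519 -out "$WORK_DIR/private_key.pem"
# The last 32 bytes of each DER encoding are the raw seed / raw public key
PRIVATE_KEY=$(openssl pkey -in "$WORK_DIR/private_key.pem" -outform DER | tail -c 32 | xxd -p -c 32)
PUBLIC_KEY=$(openssl pkey -in "$WORK_DIR/private_key.pem" -pubout -outform DER | tail -c 32 | xxd -p -c 32)
# Store securely (read from stdin so the key never appears in the process list)
printf '%s' "$PRIVATE_KEY" | vault kv put secret/redflag/signing-key private_key=-
# Show fingerprint (first 8 bytes)
FINGERPRINT=$(echo "$PUBLIC_KEY" | cut -c1-16)
echo "Public key fingerprint: $FINGERPRINT"
# Cleanup is handled by the EXIT trap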
```
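
The fingerprint shown by the script is simply the first 8 bytes of the public key in hex. As a sketch of how an operator might recompute and compare it on another host, the hypothetical helper below validates that its input looks like a raw 32-byte Ed25519 public key (64 hex characters) before truncating it. The function name is an assumption, not a RedFlag command.

```bash
#!/bin/bash
# Hypothetical helper; mirrors the `cut -c1-16` step of the generation script.

# key_fingerprint HEX_PUBLIC_KEY
# Prints the first 8 bytes (16 hex characters) of a 32-byte public key.
key_fingerprint() {
    local key
    key=$1
    case $key in
        *[!0-9a-fA-F]*)
            echo "public key must be hex-encoded" >&2
            return 1 ;;
    esac
    if [ "${#key}" -ne 64 ]; then
        echo "expected 64 hex characters, got ${#key}" >&2
        return 1
    fi
    printf '%s\n' "$key" | cut -c1-16
}
```

Comparing this value against the fingerprint recorded at generation time gives a quick check that an agent or operator is looking at the expected key.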
### Using HashiCorp Vault
```bash
# Store key in Vault
vault kv put secret/redflag/signing-key \
    private_key="$PRIVATE_KEY" \
    public_key="$PUBLIC_KEY" \
    created_at="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
# Retrieve for deployment
REDFLAG_SIGNING_PRIVATE_KEY=$(vault kv get -field=private_key secret/redflag/signing-key)
export REDFLAG_SIGNING_PRIVATE_KEY
```
### Key Rotation Procedure
```bash
#!/bin/bash
# Key rotation with minimal downtime
set -euo pipefail
NEW_KEY=$(openssl rand -hex 32)
OLD_KEY=$(vault kv get -field=private_key secret/redflag/signing-key)
# 1. Switch the server to the new key
# (export only affects this shell; the redflag-server unit must receive
# the new key through its own environment before the restart)
export REDFLAG_SIGNING_PRIVATE_KEY=$NEW_KEY
vault kv patch secret/redflag/signing-key private_key="$NEW_KEY"
systemctl restart redflag-server
# 2. Update agents (grace period starts)
# Agents will receive new public key on next check-in
# 3. Monitor for 24 hours
# Check that all agents have updated
# 4. Archive old key (kv patch requires retired-keys to already exist)
vault kv patch secret/redflag/retired-keys \
    "$(date +%Y%m%d)_key=$OLD_KEY"
echo "Key rotation complete"
```
### AWS KMS Integration (Example)
```go
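// Retrieve key from AWS KMS (uses aws-sdk-go v1)
import (
	"encoding/hex"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/kms"
)

// getSigningKeyFromKMS decrypts the KMS-encrypted signing key and
// returns it hex-encoded. encryptedKey is the ciphertext blob.
func getSigningKeyFromKMS(encryptedKey []byte) (string, error) {
	sess := session.Must(session.NewSession())
	// Named svc so the client does not shadow the kms package
	svc := kms.New(sess)
	result, err := svc.Decrypt(&kms.DecryptInput{
		CiphertextBlob: encryptedKey,
	})
	if err != nil {
		return "", err
	}
	return hex.EncodeToString(result.Plaintext), nil
}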
```
## Network Security Recommendations
### Firewall Rules
```bash
# iptables rules for RedFlag server
iptables -A INPUT -p tcp --dport 8443 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 172.16.0.0/12 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 192.168.0.0/16 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -j DROP
# Allow only outbound HTTPS from agents
iptables -A OUTPUT -p tcp --dport 443 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 80 -j DROP
```
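
The three `ACCEPT` rules admit only RFC 1918 private ranges on port 8443, and the final `DROP` rejects everything else. To check in advance whether a given agent address would be admitted, the same matching can be sketched in plain shell arithmetic. The helper names below are hypothetical, and the sketch handles dotted-quad IPv4 addresses only.

```bash
#!/bin/bash
# Hypothetical helpers that mirror the source-address matching of the rules above.

# ip_to_int A.B.C.D -> prints the address as a 32-bit integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK PREFIX_LEN -> succeeds when IP lies inside NETWORK/PREFIX_LEN.
in_cidr() {
    local ip net mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# allowed_source IP -> succeeds when IP matches one of the three ACCEPT rules.
allowed_source() {
    in_cidr "$1" 10.0.0.0 8 ||
    in_cidr "$1" 172.16.0.0 12 ||
    in_cidr "$1" 192.168.0.0 16
}
```

For instance, `allowed_source 172.31.255.255` succeeds while `allowed_source 172.32.0.1` fails, because `172.16.0.0/12` ends at `172.31.255.255`.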
### AWS Security Group Example
```json
{
"Description": "RedFlag Server Security Group",
"IpPermissions": [
{
"IpProtocol": "tcp",
"FromPort": 8443,
"ToPort": 8443,
"UserIdGroupPairs": [{"GroupId": "sg-agent-group"}],
"IpRanges": [
{"CidrIp": "10.0.0.0/8"},
{"CidrIp": "172.16.0.0/12"},
{"CidrIp": "192.168.0.0/16"}
]
}
]
}
```
### Network Segmentation
```
[DMZ] --firewall--> [Application Tier] --firewall--> [Database Tier]
RedFlag Components:
- Load Balancer (DMZ)
- Web UI Server (Application Tier)
- API Server (Application Tier)
- PostgreSQL Database (Database Tier)
```
## Monitoring and Alerting Setup
### Prometheus Metrics Export
```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'redflag'
    scheme: https
    tls_config:
      cert_file: /etc/ssl/certs/redflag.crt
      key_file: /etc/ssl/private/redflag.key
    static_configs:
      - targets: ['localhost:9090']
    metrics_path: '/metrics'
    scrape_interval: 15s
```
### Grafana Dashboard Panels
```json
{
"dashboard": {
"title": "RedFlag Security Overview",
"panels": [
{
"title": "Failed Updates",
"targets": [
{
"expr": "rate(redflag_update_failures_total[5m])",
"legendFormat": "Failed Updates/sec"
}
]
},
{
"title": "Machine Binding Violations",
"targets": [
{
"expr": "redflag_machine_binding_violations_total",
"legendFormat": "Total Violations"
}
]
},
{
"title": "Authentication Failures",
"targets": [
{
"expr": "rate(redflag_auth_failures_total[5m])",
"legendFormat": "Auth Failures/sec"
}
]
}
]
}
}
```
### AlertManager Rules
```yaml
# Prometheus alerting rules (loaded through rule_files in prometheus.yml;
# Alertmanager receives and routes the alerts these rules fire)
groups:
  - name: redflag-security
    rules:
      - alert: UpdateVerificationFailure
        expr: rate(redflag_update_failures_total[5m]) > 0.1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High update failure rate detected"
          description: "Update verification failures: {{ $value }}/sec"
      - alert: MachineBindingViolation
        expr: increase(redflag_machine_binding_violations_total[5m]) > 0
        for: 0m
        labels:
          severity: warning
        annotations:
          summary: "Machine binding violation detected"
          description: "Possible agent impersonation attempt"
      - alert: AuthenticationFailureSpike
        expr: rate(redflag_auth_failures_total[5m]) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Authentication failure spike"
          description: "{{ $value }} failed auth attempts/sec"
```
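
The thresholds in these rules are per-second rates, which can be hard to judge at a glance. Because `rate(counter[5m])` is the average per-second increase over the window, a threshold maps to an approximate event count by multiplying by the window length in seconds. The helper below is a hypothetical convenience for that arithmetic, using `awk` for the floating-point multiplication.

```bash
#!/bin/bash
# Hypothetical helper: converts a per-second rate threshold into an
# approximate number of events over the rule's window.

# events_in_window RATE_PER_SEC WINDOW_SECONDS
events_in_window() {
    awk -v rate="$1" -v window="$2" 'BEGIN { printf "%.0f\n", rate * window }'
}
```

With the values above, `events_in_window 0.1 300` gives 30 update failures per five minutes for `UpdateVerificationFailure`, and `events_in_window 1 300` gives 300 failed authentications for `AuthenticationFailureSpike`.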
### ELK Stack Configuration
```json
{
"index": "redflag-security-*",
"mappings": {
"properties": {
"timestamp": {"type": "date"},
"event_type": {"type": "keyword"},
"agent_id": {"type": "keyword"},
"severity": {"type": "keyword"},
"message": {"type": "text"},
"source_ip": {"type": "ip"}
}
}
}
```
## Incident Response Procedures
### Detection Workflow
#### 1. Immediate Detection
```bash
# Check for recent security events
grep "SECURITY" /var/log/redflag/server.log | tail -100
# Monitor failed updates
curl -s "https://server:8443/api/v1/security/overview" | jq .
# Check agent compliance
curl -s "https://server:8443/api/v1/agents?compliance=false"
```
#### 2. Threat Classification
```
Critical:
- Update verification failures
- Machine binding violations
- Private key compromise
High:
- Authentication failure spikes
- Agent version downgrade attempts
- Unauthorized registration attempts
Medium:
- Configuration changes
- Unusual agent patterns
- Network anomalies
```
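
The classification above can be encoded as a lookup so that triage scripts route events consistently. This is a sketch only: the event type identifiers are invented for illustration and are not RedFlag's actual `event_type` values.

```bash
#!/bin/bash
# Hypothetical mapping from event type to the severity tiers listed above.
# The event type names are assumptions, not RedFlag identifiers.

# classify_event EVENT_TYPE -> prints critical, high, medium, or unknown.
classify_event() {
    case $1 in
        update_verification_failure|machine_binding_violation|private_key_compromise)
            echo critical ;;
        auth_failure_spike|version_downgrade_attempt|unauthorized_registration)
            echo high ;;
        config_change|unusual_agent_pattern|network_anomaly)
            echo medium ;;
        *)
            echo unknown ;;
    esac
}
```

Returning `unknown` for unrecognized types, rather than defaulting to a low tier, keeps new event types visible until someone classifies them.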
### Response Procedures
#### Update Tampering Incident
```bash
#!/bin/bash
# Incident response: update tampering
# 1. Isolate affected systems
iptables -I INPUT -s <affected-ip-range> -j DROP
# 2. Revoke potentially compromised update
curl -X DELETE -H "Authorization: Bearer $TOKEN" \
https://server:8443/api/v1/updates/<update-id>
# 3. Rotate signing key
rotate-signing-key.sh
# 4. Force agent verification
for agent in $(get-all-agents.sh); do
    curl -X POST -H "Authorization: Bearer $TOKEN" \
        -d '{"action": "verify"}' \
        "https://server:8443/api/v1/agents/$agent/verify"
done
# 5. Generate incident report
generate-incident-report.sh update-tampering
```
#### Machine Binding Violation Response
```bash
#!/bin/bash
# Incident response: machine binding violation
AGENT_ID=$1
VIOLATION_COUNT=$(get-violation-count.sh "$AGENT_ID")
if [ "$VIOLATION_COUNT" -gt 3 ]; then
# Block agent
curl -X POST -H "Authorization: Bearer $TOKEN" \
-d '{"blocked": true, "reason": "machine binding violation"}' \
https://server:8443/api/v1/agents/$AGENT_ID/block
# Notify security team
send-security-alert.sh "Agent $AGENT_ID blocked for machine ID violations"
else
# Issue warning
curl -X POST -H "Authorization: Bearer $TOKEN" \
-d '{"message": "Security warning: machine ID mismatch detected"}' \
https://server:8443/api/v1/agents/$AGENT_ID/warn
fi
```
### Forensics Collection
#### Evidence Collection Script
```bash
#!/bin/bash
# Collect forensic artifacts
INCIDENT_ID=$1
EVIDENCE_DIR="/evidence/$INCIDENT_ID"
mkdir -p $EVIDENCE_DIR
# Server logs
cp /var/log/redflag/*.log $EVIDENCE_DIR/
tar -czf $EVIDENCE_DIR/system-logs.tar.gz /var/log/syslog /var/log/auth.log
# Database dump of security events
pg_dump -h localhost -U redflag redflag \
-t security_events -f $EVIDENCE_DIR/security_events.sql
# Agent states
curl -s "https://server:8443/api/v1/agents" | jq . > $EVIDENCE_DIR/agents.json
# Network connections
netstat -tulpn > $EVIDENCE_DIR/network-connections.txt
ss -tulpn >> $EVIDENCE_DIR/network-connections.txt
# Hash and sign evidence
# (hashes.txt is excluded so the manifest does not try to hash itself)
find "$EVIDENCE_DIR" -type f ! -name hashes.txt -exec sha256sum {} \; > "$EVIDENCE_DIR/hashes.txt"
gpg --detach-sign --armor $EVIDENCE_DIR/hashes.txt
```
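
Collected evidence is only useful if its integrity can be re-checked later. A minimal sketch of that check follows, assuming GNU coreutils `sha256sum` and a `hashes.txt` manifest that does not list itself. `verify_evidence` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper; assumes GNU coreutils sha256sum.

# verify_evidence DIR
# Succeeds when every file listed in DIR/hashes.txt still matches its
# recorded SHA-256 digest; mismatching files are reported on stdout.
verify_evidence() {
    sha256sum --check --quiet "$1/hashes.txt"
}
```

The detached signature written by `gpg --detach-sign --armor` (to `hashes.txt.asc`) can be checked separately with `gpg --verify`, which ties the manifest to the person who collected it.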
## Compliance Mapping
### SOC 2 Type II Controls
```
CC6.1 - Logical and Physical Access Controls:
- Machine binding implementation
- JWT authentication
- Registration token limits
CC7.1 - System Operation:
- Security event logging
- Monitoring and alerting
- Incident response procedures
CC6.7 - Transmission:
- TLS 1.3 encryption
- Update package signing
- Certificate management
```
### ISO 27001 Annex A Controls
```
A.10.1 - Cryptographic Controls:
- Ed25519 update signing
- Key management procedures
- Encryption at rest/in transit
A.12.4 - Event Logging:
- Comprehensive audit trails
- Log retention policies
- Tamper-evident logging
A.14.2 - Secure Development:
- Security by design
- Regular security assessments
- Vulnerability management
```
## Backup and Recovery
### Encrypted Backup Script
```bash
#!/bin/bash
# Secure backup procedure (expects BACKUP_PASSPHRASE in the environment)
BACKUP_DIR="/backup/redflag/$(date +%Y%m%d)"
# Non-interactive passphrase handling; GnuPG 2.1+ requires loopback pinentry
GPG_PASS=(--batch --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE")
mkdir -p "$BACKUP_DIR"
# 1. Database backup
pg_dump -h localhost -U redflag redflag | \
    gpg "${GPG_PASS[@]}" --cipher-algo AES256 --compress-algo 1 --symmetric \
    --output "$BACKUP_DIR/database.sql.gpg"
# 2. Configuration backup
tar -czf - /etc/redflag/ | \
    gpg "${GPG_PASS[@]}" --cipher-algo AES256 --compress-algo 1 --symmetric \
    --output "$BACKUP_DIR/config.tar.gz.gpg"
# 3. Keys backup (separate location)
tar -czf - /opt/redflag/keys/ | \
    gpg "${GPG_PASS[@]}" --cipher-algo AES256 --compress-algo 1 --symmetric \
    --output "/secure/offsite/keys_$(date +%Y%m%d).tar.gz.gpg"
# 4. Verify backup
gpg "${GPG_PASS[@]}" --decrypt "$BACKUP_DIR/database.sql.gpg" | \
    head -20
# 5. Clean old backups (retain 30 days); only dated subdirectories are removed
find /backup/redflag -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +
```
### Disaster Recovery Test
```bash
#!/bin/bash
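# Monthly DR test
# Pass the backup directory to restore as the first argument;
# expects BACKUP_PASSPHRASE in the environment
BACKUP_DIR=${1:?usage: $0 <backup-dir>}
# 1. Spin up test environment
docker-compose -f docker-compose.test.yml up -d
# 2. Restore database
gpg --batch --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
    --decrypt "$BACKUP_DIR/database.sql.gpg" | \
    psql -h localhost -U redflag redflag
# 3. Verify functionality
./dr-tests.sh
# 4. Cleanup
docker-compose -f docker-compose.test.yml down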
```
## Security Testing
### Penetration Testing Checklist
```
Authentication:
- Test weak passwords
- JWT token manipulation attempts
- Registration token abuse
- Session fixation checks
Authorization:
- Privilege escalation attempts
- Cross-tenant data access
- API endpoint abuse
Update Security:
- Signed package tampering
- Replay attack attempts
- Downgrade attack testing
Infrastructure:
- TLS configuration validation
- Certificate chain verification
- Network isolation testing
```
### Automated Security Scanning
```yaml
# .github/workflows/security-scan.yml
name: Security Scan
on: [push, pull_request]
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Gosec Security Scanner
        uses: securecodewarrior/github-action-gosec@master
        with:
          args: '-no-fail -fmt sarif -out results.sarif ./...'
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload SARIF files
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: results.sarif
```
## Reference Architecture
### Enterprise Deployment
```
                         [Internet]
                              |
                      [CloudFlare/WAF]
                              |
                 [Application Load Balancer]
                      (TLS Termination)
                              |
                     +-----------------+
                     |  Bastion Host   |
                     +-----------------+
                              |
               +------------------------------+
               |       Private Network        |
               |                              |
        +------+-----+               +--------+--------+
        |  RedFlag   |               |   PostgreSQL    |
        |   Server   |               |   (Encrypted)   |
        | (Cluster)  |               +-----------------+
        +------+-----+
               |
        +------+------------+------------+-------------+
        |                   |            |             |
  [K8s Cluster]       [Bare Metal]   [VMware]     [Cloud VMs]
        |                   |            |             |
[RedFlag Agents] [RedFlag Agents][RedFlag Agents][RedFlag Agents]
```
## Security Contacts and Resources
### Team Contacts
- Security Team: security@company.com
- Incident Response: ir@company.com
- Engineering: redflag-team@company.com
### External Resources
- CVE Database: https://cve.mitre.org
- OWASP Testing Guide: https://owasp.org/www-project-web-security-testing-guide/
- NIST Cybersecurity Framework: https://www.nist.gov/cyberframework
### Internal Resources
- Security Documentation: `/docs/SECURITY.md`
- Configuration Guide: `/docs/SECURITY-SETTINGS.md`
- Incident Response Runbook: `/docs/INCIDENT-RESPONSE.md`
- Architecture Decisions: `/docs/ADR/`