# RedFlag Security Architecture

This document outlines the security architecture and implementation details for RedFlag's Ed25519-based cryptographic update system.

## Overview

RedFlag implements a defense-in-depth security model for agent updates using:

- **Ed25519 Digital Signatures** for binary authenticity
- **Runtime Public Key Distribution** via Trust-On-First-Use (TOFU)
- **Nonce-based Replay Protection** for command freshness (commands older than 5 minutes are rejected)
- **Atomic Update Process** with automatic rollback and watchdog

## Architecture Overview

```mermaid
graph TB
    A[Server Signs Package] --> B[Ed25519 Signature]
    B --> C[Package Distribution]
    C --> D[Agent Downloads]
    D --> E[Signature Verification]
    E --> F[AES-256-GCM Decryption]
    F --> G[Checksum Validation]
    G --> H[Atomic Installation]
    H --> I[Service Restart]
    I --> J[Update Confirmation]

    subgraph "Security Layers"
        K[Nonce Validation]
        L[Signature Verification]
        M[Encryption]
        N[Checksum Validation]
    end
```

## Threat Model

### Protected Against

- **Package Tampering**: Ed25519 signatures prevent unauthorized modifications
- **Replay Attacks**: Nonce-based validation ensures command freshness (< 5 minutes)
- **Eavesdropping**: AES-256-GCM encryption protects packages in transit
- **Downgrade Attacks**: Version-based updates prevent rollback to older versions
- **Privilege Escalation**: Atomic updates with proper file permissions

### Assumptions

- Server private key is securely stored and protected
- Agent system has basic file system protections
- Network transport uses HTTPS/TLS
- Initial agent registration is secure

## Cryptographic Operations

### Key Generation (Server Setup)

```bash
# Generate Ed25519 key pair for RedFlag
go run scripts/generate-keypair.go

# Output:
# REDFLAG_SIGNING_PRIVATE_KEY=c038751ba992c9335501a0853b83e93190021075...
# REDFLAG_PUBLIC_KEY=37f6d2a4ffe0f83bcb91d0ee2eb266833f766e8180866d31...

# Add the private key to server environment
# (Public key is distributed to agents automatically via API)
```

### Package Signing Flow

```mermaid
sequenceDiagram
    participant S as Server
    participant PKG as Update Package
    participant A as Agent

    S->>PKG: 1. Generate Package
    S->>PKG: 2. Calculate SHA-256 Checksum
    S->>PKG: 3. Sign with Ed25519 Private Key
    S->>PKG: 4. Add Metadata (version, platform, etc.)
    S->>PKG: 5. Encrypt with AES-256-GCM (optional)

    PKG->>A: 6. Distribute Package

    A->>A: 7. Verify Signature
    A->>A: 8. Validate Nonce (< 5min)
    A->>A: 9. Decrypt Package (if encrypted)
    A->>A: 10. Verify Checksum
    A->>A: 11. Atomic Installation
    A->>S: 12. Update Confirmation
```

## Implementation Details

### 1. Ed25519 Signature System

#### Server-side (signing.go)

```go
// SignFile creates an Ed25519 signature for an update package.
func (s *SigningService) SignFile(filePath string) (*models.AgentUpdatePackage, error) {
	// Read the whole package; the signature covers the raw file content.
	content, err := os.ReadFile(filePath)
	if err != nil {
		return nil, fmt.Errorf("read package: %w", err)
	}

	hash := sha256.Sum256(content)
	signature := ed25519.Sign(s.privateKey, content)

	return &models.AgentUpdatePackage{
		Signature: hex.EncodeToString(signature),
		Checksum:  hex.EncodeToString(hash[:]),
		// ... other metadata
	}, nil
}

// VerifySignature validates package authenticity.
func (s *SigningService) VerifySignature(content []byte, signatureHex string) (bool, error) {
	signature, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false, fmt.Errorf("decode signature: %w", err)
	}
	return ed25519.Verify(s.publicKey, content, signature), nil
}
```

#### Agent-side (subsystem_handlers.go)

```go
// Fetch and cache public key at agent startup
// Cached to /etc/aggregator/server_public_key
publicKey, err := crypto.FetchAndCacheServerPublicKey(serverURL)
if err != nil {
	return fmt.Errorf("fetch server public key: %w", err)
}

// Signature verification during update
sigHex, ok := params["signature"].(string)
if !ok {
	return fmt.Errorf("missing or non-string signature parameter")
}
signature, err := hex.DecodeString(sigHex)
if err != nil || !ed25519.Verify(publicKey, packageContent, signature) {
	return fmt.Errorf("invalid package signature")
}
```

### Public Key Distribution (TOFU Model)

#### Server provides public key via API

`GET /api/v1/public-key` (no authentication required) returns:

```json
{
  "public_key": "37f6d2a4ffe0f83bcb91d0ee2eb266833f766e8180866d31...",
  "fingerprint": "37f6d2a4ffe0f83b",
  "algorithm": "ed25519",
  "key_size": 32
}
```

#### Agent fetches and caches at startup

```go
// During agent registration
publicKey, err := crypto.FetchAndCacheServerPublicKey(serverURL)
// Cached to /etc/aggregator/server_public_key for future use
```

**Security Model**: Trust-On-First-Use (TOFU)

- Like SSH fingerprints: the agent trusts the key it receives on first connection
- Protected by HTTPS/TLS during initial fetch
- Cached locally for all future verifications
- Optional: Manual fingerprint verification (out-of-band)

### 2. Nonce-Based Replay Protection

#### Server-side Nonce Generation

```go
// Generate and sign nonce for update command
func (s *SigningService) SignNonce(nonceUUID uuid.UUID, timestamp time.Time) (string, error) {
	// The signed message has the form "<uuid>:<unix-seconds>".
	nonceData := fmt.Sprintf("%s:%d", nonceUUID.String(), timestamp.Unix())
	signature := ed25519.Sign(s.privateKey, []byte(nonceData))
	return hex.EncodeToString(signature), nil
}

// Verify nonce freshness and signature
func (s *SigningService) VerifyNonce(nonceUUID uuid.UUID, timestamp time.Time,
	signatureHex string, maxAge time.Duration) (bool, error) {
	if time.Since(timestamp) > maxAge {
		return false, fmt.Errorf("nonce expired")
	}
	// ... signature verification
}
```

#### Agent-side Validation

```go
// Extract nonce parameters from command
nonceUUIDStr := params["nonce_uuid"].(string)
nonceTimestampStr := params["nonce_timestamp"].(string)
nonceSignature := params["nonce_signature"].(string)

// TODO: Implement full validation
// - Parse timestamp
// - Verify < 5min freshness
// - Verify Ed25519 signature
// - Prevent replay attacks
```

### 3. AES-256-GCM Encryption

#### Key Derivation

```go
// Derive AES-256 key from nonce
func deriveKeyFromNonce(nonce string) []byte {
	hash := sha256.Sum256([]byte(nonce))
	return hash[:] // 32 bytes for AES-256
}
```

#### Decryption Process

```go
// Decrypt update package with AES-256-GCM.
// The nonce argument is the command nonce used for key derivation; the GCM
// nonce is carried as a prefix of the encrypted data.
func decryptAES256GCM(encryptedData, nonce string) ([]byte, error) {
	key := deriveKeyFromNonce(nonce)

	data, err := hex.DecodeString(encryptedData)
	if err != nil {
		return nil, fmt.Errorf("decode ciphertext: %w", err)
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}

	// Extract GCM nonce and ciphertext
	nonceSize := gcm.NonceSize()
	if len(data) < nonceSize {
		return nil, fmt.Errorf("ciphertext too short")
	}
	nonceBytes, ciphertext := data[:nonceSize], data[nonceSize:]

	// Decrypt and verify the authentication tag
	return gcm.Open(nil, nonceBytes, ciphertext, nil)
}
```

## Update Process Flow

### 1. Server Startup

1. **Load Private Key**: From `REDFLAG_SIGNING_PRIVATE_KEY` environment variable
2. **Initialize Signing Service**: Ed25519 operations ready
3. **Serve Public Key**: Available at `GET /api/v1/public-key`

### 2. Agent Installation (One-Liner)

```bash
curl -sSL https://redflag.example/install.sh | bash
```

1. **Download Agent**: Pre-built binary from server
2. **Start Agent**: Automatic startup
3. **Register**: Agent ↔ Server authentication
4. **Fetch Public Key**: From `GET /api/v1/public-key`
5. **Cache Key**: Saved to `/etc/aggregator/server_public_key`

### 3. Package Preparation (Server)

1. **Build**: Compile agent binary for target platform
2. **Sign**: Create Ed25519 signature using server private key
3. **Store**: Persist package with signature + metadata in database

### 4. Command Distribution (Server → Agent)

1. **Generate Nonce**: Create UUID + timestamp for freshness (<5min)
2. **Sign Nonce**: Ed25519 sign nonce for authenticity
3. **Create Command**: Bundle update parameters with signed nonce
4. **Distribute**: Send command to target agents

### 5. Package Reception (Agent)

1. **Validate Nonce**: Check that the timestamp is less than 5 minutes old and verify the Ed25519 signature
2. **Download**: Fetch package from secure URL
3. **Verify Signature**: Validate Ed25519 signature against cached public key
4. **Verify Checksum**: SHA-256 integrity check

### 6. Atomic Installation (Agent)

1. **Backup**: Copy current binary to `.bak`
2. **Install**: Atomically replace with new binary
3. **Restart**: Restart agent service (systemd/service/Windows service)
4. **Watchdog**: Poll server every 15s for version confirmation (5min timeout)
5. **Confirm or Rollback**:
   - ✓ Success → cleanup backup
   - ✗ Timeout/Failure → automatic rollback from backup

## Security Best Practices

### Server Operations

- ✅ Private key stored in secure environment (hardware security module recommended)
- ✅ Regular key rotation (see TODO in signing.go)
- ✅ Audit logging of all signing operations
- ✅ Network access controls for signing endpoints

### Agent Operations

- ✅ Public key fetched via TOFU (Trust-On-First-Use)
- ✅ Nonce validation prevents replay attacks (<5min freshness; see the TODO under Agent-side Validation)
- ✅ Signature verification prevents tampering
- ✅ Watchdog polls server for version confirmation
- ✅ Atomic updates prevent partial installations
- ✅ Automatic rollback on watchdog timeout/failure

### Network Security

- ✅ HTTPS/TLS for all communications
- ✅ Package integrity verification
- ✅ Timeout controls for downloads
- ✅ Rate limiting on update endpoints

## Key Rotation Strategy

### Planned Implementation (TODO)

```mermaid
graph LR
    A[Key v1 Active] --> B[Generate Key v2]
    B --> C[Dual-Key Period]
    C --> D[Sign with v1+v2]
    D --> E[Phase out v1]
    E --> F[Key v2 Active]
```

### Rotation Steps

1. **Generate**: Create new Ed25519 key pair (v2)
2. **Distribute**: Add v2 public key to agents
3. **Transition**: Sign packages with both v1 and v2
4. **Verify**: Agents accept signatures from either key
5. **Phase-out**: Gradually retire v1
6. **Cleanup**: Remove v1 from agent trust store

### Migration Considerations

- Backward compatibility during transition
- Grace period for key rotation (30 days recommended)
- Monitoring for rotation completion
- Emergency rollback procedures

## Vulnerability Management

### Known Mitigations

- **Supply Chain**: Ed25519 signatures prevent package tampering
- **Replay Attacks**: Nonce validation ensures freshness
- **Privilege Escalation**: Atomic updates with proper permissions
- **Information Disclosure**: AES-256-GCM encryption for transit

### Security Monitoring

- Monitor for failed signature verifications
- Alert on nonce replay attempts
- Track update success/failure rates
- Audit signing service access logs

### Incident Response

1. **Compromise Detection**: Monitor for signature verification failures
2. **Key Rotation**: Immediate rotation if private key compromised
3. **Agent Update**: Force security updates to all agents
4. **Investigation**: Audit logs for unauthorized access

## Compliance Considerations

- **Cryptography**: Uses NIST-approved algorithms (Ed25519, AES-256-GCM, SHA-256); FIPS validation applies to a specific cryptographic module, not to the algorithm choice alone
- **Audit Trail**: Complete logging of all signing and update operations
- **Access Control**: Role-based access to signing infrastructure
- **Data Protection**: Encryption in transit and at rest

## Future Enhancements

### Planned Security Features

- [ ] Hardware Security Module (HSM) integration for private key protection
- [ ] Certificate-based agent authentication
- [ ] Mutual TLS for server-agent communication
- [ ] Package reputation scoring
- [ ] Zero-knowledge proof-based update verification

### Performance Optimizations

- [ ] Parallel signature verification
- [ ] Cached public key validation
- [ ] Optimized crypto operations
- [ ] Delta update support

## Testing and Validation

### Security Testing

- **Unit Tests**: 80% coverage for crypto operations
- **Integration Tests**: Full update cycle simulation
- **Penetration Testing**: Regular third-party security assessments
- **Fuzz Testing**: Cryptographic input validation

### Test Scenarios

1. **Valid Update**: Normal successful update flow
2. **Invalid Signature**: Tampered package rejection
3. **Expired Nonce**: Replay attack prevention
4. **Corrupted Package**: Checksum validation
5. **Service Failure**: Automatic rollback
6. **Network Issues**: Timeout and retry handling

## References

- [Ed25519 Specification](https://tools.ietf.org/html/rfc8032)
- [AES-GCM Specification](https://tools.ietf.org/html/rfc5116)
- [NIST Cryptographic Standards](https://csrc.nist.gov/projects/cryptographic-standards-and-guidelines)

## Reporting Security Issues

Please report security vulnerabilities responsibly:

- Email: security@redflag-project.org
- PGP Key: Available on request
- Response time: Within 48 hours

---

*Last updated: v0.1.21*

*Security classification: Internal use*