# RedFlag Security Analysis & Gaps

**Last Updated**: October 30, 2025
**Version**: 0.1.16
**Status**: Pre-Alpha Security Review

## Executive Summary

RedFlag implements a three-tier authentication system (Registration Tokens → JWT Access Tokens → Refresh Tokens) with token validation and seat-based registration control. The core authentication flow is in place, but the gaps listed below should be closed before widespread deployment.

**Overall Security Rating**: 🟡 **Good with Recommendations**

---

## ✅ Current Security Strengths

### 1. **Authentication System**
- ✅ **Registration Token Validation**: All agent registrations require valid, non-expired tokens
- ✅ **Seat-Based Tokens**: Multi-use tokens with configurable seat limits prevent unlimited registrations
- ✅ **Token Consumption Enforcement**: Registration is rolled back server-side if the token cannot be consumed
- ✅ **JWT Authentication**: Industry-standard JWT tokens (24-hour expiry)
- ✅ **Refresh Token System**: 90-day refresh tokens reduce frequent re-authentication
- ✅ **Bcrypt Password Hashing**: Admin passwords hashed with bcrypt (cost factor 10)
- ✅ **Token Audit Trail**: `registration_token_usage` table tracks all token uses

### 2. **Network Security**
- ✅ **TLS/HTTPS Support**: Proxy-aware configuration supports HTTPS termination
- ✅ **Rate Limiting**: Configurable rate limits on all API endpoints
- ✅ **CORS Configuration**: CORS headers configured in Nginx (currently permissive; see gap 6)

### 3. **Installation Security (Linux)**
- ✅ **Dedicated System User**: Agents run as `redflag-agent` user (not root)
- ✅ **Limited Sudo Access**: Only specific update commands allowed via `/etc/sudoers.d/`
- ✅ **Systemd Hardening**: Service isolation and resource limits

---

## ⚠️ Security Gaps & Recommendations

### **CRITICAL** - High Priority Issues

#### 1. **No Agent Identity Verification**
**Risk**: Medium-High
**Impact**: Agent impersonation, duplicate agents

**Current State**:
- Agents authenticate via JWT stored in `config.json`
- No verification that the agent is on the original machine
- Copying `config.json` to another machine allows impersonation

**Attack Scenario**:
```bash
# Attacker scenario:
# 1. Compromise one agent machine
# 2. Copy C:\ProgramData\RedFlag\config.json
# 3. Install agent on multiple machines using same config.json
# 4. All machines appear as the same agent (hostname collision)
```

**Recommendations**:
1. **Machine Fingerprinting** (Implement Soon):
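   ```go
   // Generate machine ID from hardware identifiers
   // (macAddress, biosSerial, cpuID are collected per platform)
   sum := sha256.Sum256([]byte(macAddress + "|" + biosSerial + "|" + cpuID))
   machineID := hex.EncodeToString(sum[:])

   // Store in agent record at registration
   agent.MachineID = machineID

   // Verify on every check-in
   if storedMachineID != reportedMachineID {
       log.Println("ALERT: possible agent impersonation detected")
       requireReAuthentication()
   }
   ```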

2. **Certificate-Based Authentication** (Future Enhancement):
   - Generate unique TLS client certificates during registration
   - Mutual TLS (mTLS) for agent-server communication
   - Automatic certificate rotation

3. **Hostname Uniqueness Constraint** (Easy Win):
   ```sql
   ALTER TABLE agents ADD CONSTRAINT unique_hostname UNIQUE (hostname);
   ```
   - Prevents multiple agents from registering with the same hostname
   - A rejected registration signals a potential duplicate to the admin
   - **Note**: Also rejects legitimate re-installs unless the old agent record is removed or replaced first
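
The fingerprinting idea above can be sketched as a small, self-contained Go program. This is a minimal sketch, not RedFlag's implementation: `Fingerprint` and `SameMachine` are hypothetical names, and collecting the MAC address, BIOS serial, and CPU ID is platform-specific and left out. Note that the ID is reported by the agent itself, so it raises the bar for casual `config.json` copying rather than proving identity.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"strings"
)

// Fingerprint derives a stable machine ID from hardware identifiers.
// Inputs are trimmed and lowercased so formatting differences do not
// change the result.
func Fingerprint(macAddress, biosSerial, cpuID string) string {
	parts := []string{
		strings.ToLower(strings.TrimSpace(macAddress)),
		strings.ToLower(strings.TrimSpace(biosSerial)),
		strings.ToLower(strings.TrimSpace(cpuID)),
	}
	// A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:])
}

// SameMachine compares the stored and reported IDs in constant time.
func SameMachine(stored, reported string) bool {
	return subtle.ConstantTimeCompare([]byte(stored), []byte(reported)) == 1
}

func main() {
	stored := Fingerprint("00:1A:2B:3C:4D:5E", "SN-12345", "cpu-0001")
	reported := Fingerprint("00:1A:2B:3C:4D:5F", "SN-12345", "cpu-0001")
	if !SameMachine(stored, reported) {
		fmt.Println("possible agent impersonation: machine ID mismatch")
	}
}
```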

---

#### 2. **No Hostname Uniqueness Enforcement**
**Risk**: Medium
**Impact**: Confusion, potential security monitoring bypass

**Current State**:
- Database allows multiple agents with identical hostnames
- No warning when registering duplicate hostname
- UI may not distinguish between agents clearly

**Attack Scenario**:
- Attacker registers rogue agent with same hostname as legitimate agent
- Monitoring/alerting may miss malicious activity
- Admin may update wrong agent

**Recommendations**:
1. **Add Unique Constraint** (Database Level):
   ```sql
   -- Option 1: Strict (may break legitimate re-installs)
   ALTER TABLE agents ADD CONSTRAINT unique_hostname UNIQUE (hostname);

   -- Option 2: Soft (warning only)
   CREATE INDEX idx_agents_hostname ON agents(hostname);
   -- Check for duplicates in application code
   ```

2. **UI Warnings**:
   - Show warning icon next to duplicate hostnames
   - Display machine ID or IP address for disambiguation
   - Require admin confirmation before allowing duplicate

3. **Registration Policy**:
   - Allow "replace" mode: deactivate old agent when registering same hostname
   - Require manual admin approval for duplicates

---

#### 3. **Insecure config.json Storage**
**Risk**: Medium
**Impact**: Token theft, unauthorized access

**Current State**:
- Linux: `/etc/aggregator/config.json` (readable by `redflag-agent` user)
- Windows: `C:\ProgramData\RedFlag\config.json` (readable by service account)
- Contains sensitive JWT tokens and refresh tokens

**Attack Scenario**:
```bash
# Linux privilege escalation:
# 1. Compromise limited user account
# 2. Exploit local privilege escalation vuln
# 3. Read config.json as redflag-agent user
# 4. Extract JWT/refresh tokens
# 5. Impersonate agent from any machine
```

**Recommendations**:
1. **File Permissions** (Immediate):
   ```bash
   # Linux
   chmod 600 /etc/aggregator/config.json  # Only owner readable
   chown redflag-agent:redflag-agent /etc/aggregator/config.json

   # Windows (via ACLs)
   icacls "C:\ProgramData\RedFlag\config.json" /grant "NT AUTHORITY\SYSTEM:(F)" /inheritance:r
   ```

2. **Encrypted Storage** (Future):
   - Encrypt tokens at rest using machine-specific key
   - Use OS keyring/credential manager:
     - Linux: `libsecret` or `keyctl`
     - Windows: Windows Credential Manager

3. **Token Rotation Monitoring**:
   - Alert on suspicious token refresh patterns
   - Rate limit refresh token usage per agent

---

### **HIGH** - Important Security Enhancements

#### 4. **No Admin User Enumeration Protection**
**Risk**: Medium
**Impact**: Account takeover, brute force attacks

**Current State**:
- Login endpoint reveals whether username exists:
  - Valid username + wrong password: "Invalid password"
  - Invalid username: "User not found"
- Enables username enumeration attacks

**Recommendations**:
1. **Generic Error Messages**:
   ```go
   // Bad (current):
   if user == nil {
       return "User not found"
   }
   if !checkPassword() {
       return "Invalid password"
   }

   // Good (proposed):
   if user == nil || !checkPassword() {
       return "Invalid username or password"
   }
   ```

2. **Rate Limiting** (already implemented ✅):
   - Current: 10 requests/minute for login
   - Good baseline, consider reducing to 5/minute

3. **Account Lockout** (Future):
   - Lock account after 5 failed attempts
   - Require admin unlock or auto-unlock after 30 minutes

---

#### 5. **JWT Secret Not Configurable**
**Risk**: Medium
**Impact**: Token forgery if secret compromised

**Current State**:
- JWT secrets hardcoded in server code
- No rotation mechanism
- Shared across all deployments (if using defaults)

**Recommendations**:
1. **Environment Variable Configuration** (Immediate):
   ```go
   // server/cmd/server/main.go
   jwtSecret := os.Getenv("JWT_SECRET")
   if jwtSecret == "" {
       jwtSecret = generateRandomSecret() // Generate if not provided
       log.Warn("JWT_SECRET not set, using generated secret (won't persist across restarts)")
   }
   ```

2. **Secret Rotation** (Future):
   - Support multiple active secrets (old + new)
   - Gradual rollover: issue with new, accept both
   - Documented rotation procedure

3. **Kubernetes Secrets Integration** (For Containerized Deployments):
   - Store JWT secret in Kubernetes Secret
   - Mount as environment variable or file
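
The "issue with new, accept both" rollover can be illustrated with plain HMAC-SHA256, the primitive behind HS256-signed JWTs. This sketch signs raw payloads rather than real JWTs, and `KeyRing` is a hypothetical type; with a JWT library the same idea would apply to the key lookup step.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// sign returns the HMAC-SHA256 of payload under secret.
func sign(secret, payload []byte) []byte {
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	return mac.Sum(nil)
}

// KeyRing holds the secret used for new tokens plus older secrets that
// are still accepted during a rollover window.
type KeyRing struct {
	Current  []byte
	Previous [][]byte
}

// Sign always uses the current secret.
func (k KeyRing) Sign(payload []byte) []byte {
	return sign(k.Current, payload)
}

// Verify accepts a signature made with the current or any previous secret.
func (k KeyRing) Verify(payload, sig []byte) bool {
	ok := hmac.Equal(sign(k.Current, payload), sig)
	for _, old := range k.Previous {
		if hmac.Equal(sign(old, payload), sig) {
			ok = true
		}
	}
	return ok
}

func main() {
	payload := []byte(`{"agent_id":"example"}`)
	before := KeyRing{Current: []byte("old-secret")}
	sig := before.Sign(payload)

	// After rotation, tokens issued under the old secret still verify.
	during := KeyRing{Current: []byte("new-secret"), Previous: [][]byte{[]byte("old-secret")}}
	fmt.Println("accepted during rollover:", during.Verify(payload, sig))

	// Once the old secret is retired, they no longer do.
	after := KeyRing{Current: []byte("new-secret")}
	fmt.Println("accepted after retirement:", after.Verify(payload, sig))
}
```

With 24-hour access tokens, the old secret would need to stay in `Previous` for at least 24 hours after rotation so that outstanding tokens expire naturally.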

---

#### 6. **No Request Origin Validation**
**Risk**: Low-Medium
**Impact**: CSRF attacks, unauthorized API access

**Current State**:
- API accepts requests from any origin (behind Nginx)
- No CSRF token validation for state-changing operations
- Relies on JWT authentication only

**Recommendations**:
1. **CORS Strictness**:
   ```nginx
   # Current (permissive):
   add_header 'Access-Control-Allow-Origin' '*';

   # Recommended (strict):
   add_header 'Access-Control-Allow-Origin' 'https://your-domain.com';
   add_header 'Access-Control-Allow-Credentials' 'true';
   ```

2. **CSRF Protection** (For Web UI):
   - Add CSRF tokens to state-changing forms
   - Validate Origin/Referer headers for non-GET requests
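
Origin validation for non-GET requests could be done in a small `net/http` middleware, sketched below. `RequireTrustedOrigin` and the `/api/v1/agents` path are hypothetical. Because this version rejects state-changing requests that carry no `Origin` header, it would suit only the web UI routes; agent check-ins do not come from a browser and would need to be exempt.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// RequireTrustedOrigin rejects state-changing requests whose Origin header
// is not in the allow-list. Safe methods pass through unchanged.
func RequireTrustedOrigin(allowed map[string]bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.Method {
		case http.MethodGet, http.MethodHead, http.MethodOptions:
			next.ServeHTTP(w, r)
			return
		}
		if origin := r.Header.Get("Origin"); !allowed[origin] {
			http.Error(w, "forbidden origin", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor runs one request through the middleware and returns the
// resulting HTTP status code.
func statusFor(method, origin string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := RequireTrustedOrigin(map[string]bool{"https://your-domain.com": true}, ok)

	req := httptest.NewRequest(method, "/api/v1/agents", nil)
	if origin != "" {
		req.Header.Set("Origin", origin)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("trusted POST:", statusFor(http.MethodPost, "https://your-domain.com"))
	fmt.Println("untrusted POST:", statusFor(http.MethodPost, "https://evil.example"))
	fmt.Println("untrusted GET:", statusFor(http.MethodGet, "https://evil.example"))
}
```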

---

### **MEDIUM** - Best Practice Improvements

#### 7. **Insufficient Audit Logging**
**Risk**: Low
**Impact**: Forensic investigation difficulties

**Current State**:
- Basic logging to stdout (captured by Docker/systemd)
- No centralized audit log for security events
- No alerting on suspicious activity

**Recommendations**:
1. **Structured Audit Events**:
   ```go
   // Log security events with context
   auditLog.Log(AuditEvent{
       Type: "AGENT_REGISTERED",
       Actor: "registration-token-abc123",
       Target: "agent-hostname-xyz",
       IP: "192.168.1.100",
       Success: true,
       Timestamp: time.Now(),
   })
   ```

2. **Log Retention**:
   - Minimum 90 days for compliance
   - Immutable storage (append-only)

3. **Security Alerts**:
   - Failed login attempts > threshold
   - Token seat exhaustion (potential attack)
   - Multiple agents from same IP
   - Unusual update patterns

---

#### 8. **No Input Validation on Agent Metadata**
**Risk**: Low
**Impact**: XSS, log injection, data corruption

**Current State**:
- Agent metadata stored as JSONB without sanitization
- Could contain malicious payloads
- May be rendered in the UI without proper escaping if any view bypasses React's default escaping

**Recommendations**:
1. **Input Sanitization**:
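   ```go
   // Validate metadata before storage
   if len(metadata.Hostname) > 255 {
       return errors.New("hostname too long")
   }

   // Reject unexpected characters rather than HTML-escaping before storage:
   // escaping belongs at output time (React already does it), and
   // pre-escaped values would be double-escaped in the UI.
   // hostnamePattern is a hypothetical precompiled *regexp.Regexp.
   if !hostnamePattern.MatchString(metadata.Hostname) {
       return errors.New("hostname contains invalid characters")
   }
   ```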

2. **Output Encoding** (Frontend):
   - React already escapes by default ✅
   - Verify no `dangerouslySetInnerHTML` usage
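
A hostname check for agent metadata might be sketched as follows. It assumes hostnames follow the usual DNS label rules (letters, digits, hyphens, dot-separated); `ValidateHostname` is a hypothetical helper, and environments that allow other characters, such as underscores in some Windows machine names, would need a looser pattern.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// hostnamePattern accepts dot-separated labels of letters, digits, and
// hyphens, where no label starts or ends with a hyphen.
var hostnamePattern = regexp.MustCompile(
	`^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*$`)

// ValidateHostname rejects empty, oversized, or malformed hostnames.
func ValidateHostname(h string) error {
	if h == "" {
		return errors.New("hostname is empty")
	}
	if len(h) > 255 {
		return errors.New("hostname too long")
	}
	if !hostnamePattern.MatchString(h) {
		return errors.New("hostname contains invalid characters")
	}
	return nil
}

func main() {
	for _, h := range []string{"web-01.example.com", "<script>alert(1)</script>"} {
		fmt.Printf("%q: %v\n", h, ValidateHostname(h))
	}
}
```

Rejecting bad input at registration also protects the logs, since a hostname containing a newline could otherwise be used for log injection.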

---

#### 9. **Database Credentials in Environment**
**Risk**: Low
**Impact**: Database compromise if environment leaked

**Current State**:
- PostgreSQL credentials in `.env` file
- Environment variables visible to all container processes

**Recommendations**:
1. **Secrets Management** (Production):
   - Use Docker Secrets or Kubernetes Secrets
   - Vault integration for enterprise deployments

2. **Principle of Least Privilege**:
   - App user: SELECT, INSERT, UPDATE only
   - Migration user: DDL permissions
   - No SUPERUSER for application

---

## 🔒 Auto-Update Security Considerations

### **New Feature**: Agent Self-Update Capability

#### Threats:
1. **Man-in-the-Middle (MITM) Attack**:
   - Attacker intercepts binary download
   - Serves malicious binary to agent
   - Agent installs compromised version

2. **Rollout Bomb**:
   - Bad update pushed to all agents simultaneously
   - Mass service disruption
   - Difficult rollback at scale

3. **Downgrade Attack**:
   - Force agent to install older, vulnerable version
   - Exploit known vulnerabilities

#### Mitigations (Recommended Implementation):

1. **Binary Signing & Verification**:
   ```go
   // Server signs binary with private key
   signature := signBinary(binary, privateKey)

   // Agent verifies with public key (embedded in agent)
   if !verifySignature(binary, signature, publicKey) {
       return errors.New("invalid binary signature")
   }
   ```

2. **Checksum Validation**:
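   ```go
   // Server provides SHA-256 checksum (hex-encoded)
   expectedHash := "abc123..."

   // Agent verifies after download.
   // sha256.Sum256 returns a [32]byte, so hex-encode it before comparing.
   sum := sha256.Sum256(downloadedBinary)
   actualHash := hex.EncodeToString(sum[:])
   if actualHash != expectedHash {
       return errors.New("checksum mismatch")
   }
   ```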

3. **HTTPS-Only Downloads**:
   - Require TLS for binary downloads
   - Certificate pinning (optional)

4. **Staggered Rollout**:
   ```go
   // Update in waves to limit blast radius
   // (values are the cumulative percentage of agents updated)
   rolloutStrategy := StaggeredRollout{
       Wave1Percent: 5,   // Canary group
       Wave2Percent: 25,  // After 1 hour
       Wave3Percent: 100, // After 24 hours
   }
   ```

5. **Version Pinning**:
   - Prevent downgrades (only allow newer versions)
   - Admin override for emergency rollback

6. **Rollback Capability**:
   - Keep previous binary as backup
   - Automatic rollback if new version fails health check

---

## 📊 Security Scorecard

| Category | Status | Score | Notes |
|----------|--------|-------|-------|
| **Authentication** | 🟢 Good | 8/10 | Strong token system, needs machine fingerprinting |
| **Authorization** | 🟡 Fair | 6/10 | JWT-based, needs RBAC for multi-tenancy |
| **Data Protection** | 🟡 Fair | 6/10 | TLS supported, config.json needs encryption |
| **Input Validation** | 🟡 Fair | 7/10 | Basic validation, needs metadata sanitization |
| **Audit Logging** | 🟡 Fair | 5/10 | Basic logging, needs structured audit events |
| **Secret Management** | 🟡 Fair | 6/10 | Basic .env, needs secrets manager |
| **Network Security** | 🟢 Good | 8/10 | Rate limiting, HTTPS; CORS needs tightening (gap 6) |
| **Update Security** | 🔴 Not Implemented | 0/10 | Auto-update not yet implemented |

**Overall Score**: 6.5/10 (approximate average of the implemented categories; Update Security excluded) - **Good for Alpha, Needs Hardening for Production**

---

## 🎯 Recommended Implementation Order

### Phase 1: Critical (Before Beta)
1. ✅ Fix rate-limiting page errors
2. ⬜ Implement machine fingerprinting for agents
3. ⬜ Add hostname uniqueness constraint (soft warning)
4. ⬜ Secure config.json file permissions
5. ⬜ Implement auto-update with signature verification

### Phase 2: Important (Before Production)
1. ⬜ Generic login error messages (prevent enumeration)
2. ⬜ Configurable JWT secrets via environment
3. ⬜ Structured audit logging
4. ⬜ Input validation on all agent metadata

### Phase 3: Best Practices (Production Hardening)
1. ⬜ Encrypted config.json storage
2. ⬜ Secrets management integration (Vault/Kubernetes)
3. ⬜ Security event alerting
4. ⬜ Automated security scanning (Dependabot, Snyk)

---

## 🔍 Penetration Testing Checklist

Before production deployment, conduct testing for:

- [ ] **JWT Token Manipulation**: Attempt to forge/tamper with JWTs
- [ ] **Registration Token Reuse**: Verify seat limits enforced
- [ ] **Agent Impersonation**: Copy config.json between machines
- [ ] **Brute Force**: Login attempts, token validation
- [ ] **SQL Injection**: All database queries (use parameterized queries ✅)
- [ ] **XSS**: Agent metadata in UI
- [ ] **CSRF**: State-changing operations without token
- [ ] **Path Traversal**: Binary downloads, file operations
- [ ] **Rate Limit Bypass**: Multiple IPs, header manipulation
- [ ] **Privilege Escalation**: Agent user permissions on host OS

---

## 📝 Compliance Notes

### GDPR / Privacy
- ✅ No PII collected by default
- ⚠️ IP addresses are logged (these can count as personal data under GDPR)
- ⚠️ Consider data retention policy for logs

### SOC 2 / ISO 27001
- ⬜ Needs documented security policies
- ⬜ Needs access control matrix
- ⬜ Needs incident response plan

---

## 📚 References

- [OWASP Top 10](https://owasp.org/www-project-top-ten/)
- [CWE Top 25](https://cwe.mitre.org/top25/)
- [JWT Best Practices](https://datatracker.ietf.org/doc/html/rfc8725)
- [NIST Cybersecurity Framework](https://www.nist.gov/cyberframework)

---

**Document Version**: 1.0
**Next Review**: After auto-update implementation
**Maintained By**: Development Team