
# RedFlag Token Authentication System Analysis

This document breaks down RedFlag's token authentication system, based on an analysis of the codebase.

## Executive Summary

RedFlag uses a three-tier token system with different lifetimes and purposes:
1. **Registration Tokens** - Used once per agent for initial enrollment; a single token can cover multiple seats
2. **JWT Access Tokens** - Short-lived (24h) stateless tokens for API authentication
3. **Refresh Tokens** - Long-lived (90d) rotating tokens for automatic renewal

**Important Clarification**: The "rotating token system" is **ACTIVE and working** (not discontinued). It refers to the refresh token flow: each renewal (roughly every 24h, when the JWT expires) issues a new JWT and resets the refresh token's 90-day expiry window.

## 1. Registration Tokens (One-Time Use Multi-Seat Tokens)

### Purpose & Characteristics
- **Initial agent registration/enrollment** with the server
- **Multi-seat support** - Single token can register multiple agents
- **One-time use per agent** - Each agent uses it once during registration
- **Configurable expiration** - Admins set expiration (max 7 days)

### Technical Implementation

#### Token Generation
```go
// aggregator-server/internal/config/config.go:138-144
func GenerateSecureToken() string {
    bytes := make([]byte, 32)
    if _, err := rand.Read(bytes); err != nil {
        return ""
    }
    return hex.EncodeToString(bytes)
}
```
- **Method**: Cryptographically secure 32-byte random token → 64-character hex string
- **Algorithm**: `crypto/rand.Read()` for entropy

#### Database Schema
```sql
-- 011_create_registration_tokens_table.up.sql
CREATE TABLE registration_tokens (
    token VARCHAR(64) UNIQUE PRIMARY KEY,
    max_seats INT DEFAULT 1,
    seats_used INT DEFAULT 0,
    expires_at TIMESTAMP NOT NULL,
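    -- PostgreSQL has no inline ENUM(...) column type; an equivalent CHECK constraint is shown
    status VARCHAR(16) DEFAULT 'active'
        CHECK (status IN ('active', 'used', 'expired', 'revoked')),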
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW()
);
```

#### Seat Tracking System
- **Validation**: `status = 'active' AND expires_at > NOW() AND seats_used < max_seats`
- **Usage Tracking**: `registration_token_usage` table maintains audit trail
- **Status Flow**: `active` → `used` (seats exhausted) or `expired` (time expires)

### Registration Flow
```
1. Admin creates registration token with seat limit
2. Token distributed to agents (via config, environment variable, etc.)
3. Agent uses token for initial registration at /api/v1/agents/register
4. Server validates token and claims one seat (increments seats_used)
5. Server generates AgentID + JWT + Refresh token
6. Agent saves AgentID, JWT, and refresh token; discards registration token
```

## 2. JWT Access Tokens (Stateless Short-Lived Tokens)

### Purpose & Characteristics
- **API authentication** for agent-server communication
- **Web dashboard authentication** for users
- **Stateless validation** - No database lookup required
- **Short lifetime** - 24 hours for security

### Token Structure

#### Agent JWT Claims
```go
// aggregator-server/internal/api/middleware/auth.go:13-17
type AgentClaims struct {
    AgentID string `json:"agent_id"`
    jwt.RegisteredClaims
}
```

#### User JWT Claims
```go
// aggregator-server/internal/api/handlers/auth.go:41-47
type UserClaims struct {
    UserID   int    `json:"user_id"`
    Username string `json:"username"`
    Role     string `json:"role"`
    jwt.RegisteredClaims
}
```

### Security Properties
- **Algorithm**: HS256 using shared secret
- **Secret Storage**: `REDFLAG_JWT_SECRET` environment variable
- **Validation**: Bearer token in `Authorization: Bearer {token}` header
- **Stateless**: Server validates using secret, no database lookup needed

### Key Security Consideration
```go
// aggregator-server/cmd/server/main.go:130
if cfg.Admin.JWTSecret == "" {
    cfg.Admin.JWTSecret = GenerateSecureToken()
    log.Printf("Generated JWT secret: %s", cfg.Admin.JWTSecret) // Debug exposure!
}
```
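- **Development Risk**: When no secret is configured, the server generates one and prints it to the log in plaintext
- **Production Requirement**: Set `REDFLAG_JWT_SECRET` explicitly and keep it stable across restarts; per the snippet above, an unset secret is regenerated at startup, which would invalidate every outstanding JWT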
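
One possible way to address both points is to fail fast when the secret is missing rather than generating and logging one. The sketch below is an assumption about how that could look, not existing RedFlag code: `loadJWTSecret` and the `minSecretLen` threshold are hypothetical, and the lookup function is injected so the logic can be tested without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// minSecretLen is an assumed lower bound; 32 bytes matches the HMAC-SHA256
// output size.
const minSecretLen = 32

// loadJWTSecret reads REDFLAG_JWT_SECRET through lookup and returns an error
// when it is missing or too short, instead of generating a throwaway secret.
func loadJWTSecret(lookup func(string) (string, bool)) (string, error) {
	secret, ok := lookup("REDFLAG_JWT_SECRET")
	if !ok || secret == "" {
		return "", errors.New("REDFLAG_JWT_SECRET is not set")
	}
	if len(secret) < minSecretLen {
		return "", fmt.Errorf("REDFLAG_JWT_SECRET is too short: got %d bytes, want at least %d",
			len(secret), minSecretLen)
	}
	return secret, nil
}

func main() {
	secret, err := loadJWTSecret(os.LookupEnv)
	if err != nil {
		// A real server would likely abort startup here.
		fmt.Fprintln(os.Stderr, "configuration error:", err)
		return
	}
	// Log only non-sensitive facts about the secret, never its value.
	fmt.Printf("JWT secret loaded (%d bytes)\n", len(secret))
}
```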

## 3. Refresh Tokens (Rotating Long-Lived Tokens)

### Purpose & Characteristics
- **Automatic agent renewal** without re-registration
- **Long lifetime** - 90 days with sliding window
- **Rotating mechanism** - A new JWT is issued on each renewal and the refresh token's expiry is extended
- **Secure storage** - Only SHA-256 hashes stored in database

### Database Schema
```sql
-- 008_create_refresh_tokens_table.up.sql
CREATE TABLE refresh_tokens (
    agent_id UUID REFERENCES agents(id) PRIMARY KEY,
    token_hash VARCHAR(64) UNIQUE NOT NULL,
    expires_at TIMESTAMP NOT NULL,
    last_used_at TIMESTAMP DEFAULT NOW(),
    revoked BOOLEAN DEFAULT FALSE
);
```

### Token Generation & Security
```go
// aggregator-server/internal/database/queries/refresh_tokens.go
func GenerateRefreshToken() (string, error) {
    bytes := make([]byte, 32)
    if _, err := rand.Read(bytes); err != nil {
        return "", err
    }
    return hex.EncodeToString(bytes), nil
}

func HashRefreshToken(token string) string {
    hash := sha256.Sum256([]byte(token))
    return hex.EncodeToString(hash[:])
}
```

### Renewal Process (The "Rotating Token System")
```
1. Agent JWT expires (after 24h)
2. Agent sends refresh request to /api/v1/agents/renew
3. Server validates refresh token hash against database
4. Server generates NEW JWT access token (24h)
5. Server updates refresh_tokens.last_used_at
6. Server resets refresh_tokens.expires_at to NOW() + 90 days (sliding window)
7. Agent updates config with new JWT token
```

#### Key Features
- **Sliding Window Expiration**: 90-day window resets on each use
- **Hash Storage**: Only SHA-256 hashes stored, plaintext tokens never persisted
- **Rotation**: New JWT issued each time, refresh token extended
- **Revocation Support**: Manual revocation possible via database

## 4. Agent Configuration & Token Usage

### Configuration Structure
```go
// aggregator-agent/internal/config/config.go:48-90
type Config struct {
    // ... other fields ...
    RegistrationToken string `json:"registration_token,omitempty"` // One-time registration token
    Token             string `json:"token"`                      // JWT access token (24h)
    RefreshToken      string `json:"refresh_token"`              // Refresh token (90d)
}
```

### File Storage & Security
```go
// config.go:274-280
func (c *Config) Save() error {
    // ... validation logic ...
    jsonData, err := json.MarshalIndent(c, "", "  ")
    if err != nil {
        return err
    }

    return os.WriteFile(c.Path, jsonData, 0600) // Owner read/write only
}
```
- **Storage**: Plaintext JSON configuration file
- **Permissions**: 0600 (owner read/write only)
- **Location**: Typically `/etc/redflag/agent.json` or user-specified path

### Agent Registration Flow
```go
// aggregator-agent/cmd/agent/main.go:450-476
func runRegistration(cfg *config.Config) (*config.Config, error) {
    if cfg.RegistrationToken == "" {
        return nil, fmt.Errorf("registration token required for initial setup")
    }

    // Create temporary client with registration token
    client := api.NewClient("", cfg.ServerURL, cfg.SkipTLSVerify)

    // Register with server
    regReq := api.RegisterRequest{
        RegistrationToken: cfg.RegistrationToken,
        Hostname:          cfg.Hostname,
        Version:           version.Version,
    }

    // Process registration response
    // ...
    cfg.Token = resp.Token           // JWT access token
    cfg.RefreshToken = resp.RefreshToken
    cfg.AgentID = resp.AgentID

    return cfg, nil
}
```

### Token Renewal Logic
```go
// aggregator-agent/cmd/agent/main.go:484-519
func renewTokenIfNeeded(cfg *config.Config) error {
    if cfg.RefreshToken == "" {
        return fmt.Errorf("no refresh token available")
    }

    // Create temporary client without auth for renewal
    client := api.NewClient("", cfg.ServerURL, cfg.SkipTLSVerify)

    renewReq := api.RenewRequest{
        AgentID:      cfg.AgentID,
        RefreshToken: cfg.RefreshToken,
    }

    resp, err := client.RenewToken(renewReq)
    if err != nil {
        return err // Falls back to re-registration
    }

    // Update config with new JWT token
    cfg.Token = resp.Token
    return cfg.Save() // Persist updated config
}
```

## 5. Security Analysis & Configuration Encryption Implications

### Current Security Posture

#### Strengths
- **Strong Token Generation**: Cryptographically secure random tokens
- **Proper Token Separation**: Different tokens for different purposes
- **Hash Storage**: Refresh tokens stored as hashes only
- **JWT Stateless Validation**: No database storage for access tokens
- **File Permissions**: Config files with 0600 permissions

#### Vulnerabilities
- **Plaintext Storage**: The agent config holds the JWT and refresh token as cleartext JSON
- **JWT Secret Exposure**: The server logs an auto-generated JWT secret when none is configured
- **Registration Token Exposure**: Stored in plaintext until used
- **Config File Access**: Anyone who can read the config file can steal the tokens

### Configuration Encryption Impact Analysis

#### Critical Challenge: Token Refresh Workflow
```
Current Flow:
1. Agent reads config (plaintext) → gets refresh_token
2. Agent calls /api/v1/agents/renew with refresh_token
3. Server returns new JWT access_token
4. Agent writes new access_token to config (plaintext)

Encrypted Config Flow:
1. Agent must decrypt config to get refresh_token
2. Agent calls /api/v1/agents/renew
3. Server returns new JWT access_token
4. Agent must encrypt and write updated config
```

#### Key Implementation Challenges

1. **Key Management**
   - Where to store encryption keys?
   - How to handle key rotation?
   - Agent process must have access to keys

2. **Atomic Operations**
   - Decrypt → Modify → Encrypt must be atomic
   - Prevent partial writes during token updates
   - Handle encryption/decryption failures gracefully

3. **Debugging & Recovery**
   - Encrypted configs complicate debugging
   - Lost encryption keys = lost agent registration
   - Backup/restore complexity increases

4. **Performance Overhead**
   - Decryption on every config read
   - Encryption on every token renewal
   - Memory footprint for decrypted config

#### Recommended Encryption Strategy

1. **Selective Field Encryption**
   ```json
   {
     "agent_id": "123e4567-e89b-12d3-a456-426614174000",
     "token": "enc:v1:aes256gcm:encrypted_jwt_token_here",
     "refresh_token": "enc:v1:aes256gcm:encrypted_refresh_token_here",
     "server_url": "https://redflag.example.com"
   }
   ```
   - Encrypt only sensitive fields (tokens)
   - Preserve JSON structure for compatibility
   - Include version prefix for future migration

2. **Key Storage Options**
   - **Environment Variables**: `REDFLAG_ENCRYPTION_KEY`
   - **Kernel Keyring**: Store keys in OS keyring
   - **Dedicated KMS**: AWS KMS, Azure Key Vault, etc.
   - **File-Based**: Encrypted key file with strict permissions

3. **Graceful Degradation**
   ```go
   func LoadConfig() (*Config, error) {
       // Try encrypted first
       if cfg, err := loadEncryptedConfig(); err == nil {
           return cfg, nil
       }

       // Fallback to plaintext for migration
       return loadPlaintextConfig()
   }
   ```

4. **Migration Path**
   - Detect plaintext configs and auto-encrypt on first load
   - Provide migration utilities for existing deployments
   - Support both encrypted and plaintext during transition

## 6. Token Lifecycle Summary

```
Registration Token Lifecycle:
┌─────────────┐    ┌──────────────┐    ┌─────────────┐    ┌─────────────┐
│   Generated │───▶│ Distributed  │───▶│   Used      │───▶│  Expired/   │
│  (Admin UI) │    │ (To Agents)  │    │ (Agent Reg) │    │  Revoked    │
└─────────────┘    └──────────────┘    └─────────────┘    └─────────────┘
                       │
                       ▼
              ┌──────────────────┐
              │ Agent Registration│
              │  (Creates:       │
              │   AgentID, JWT,   │
              │   RefreshToken)   │
              └──────────────────┘

JWT Access Token Lifecycle:
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌──────────────┐
│   Generated │───▶│   Valid     │───▶│   Expired   │───▶│   Renewed    │
│ (Reg/Renew) │    │   (24h)     │    │             │    │ (via Refresh)│
└─────────────┘    └─────────────┘    └─────────────┘    └──────────────┘
                          │
                          ▼
                   ┌──────────────┐
                   │ API Requests │
                   │ (Bearer Auth)│
                   └──────────────┘

Refresh Token Lifecycle (The "Rotating System"):
┌──────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│  Generated   │───▶│   Valid     │───▶│   Used for  │───▶│  Rotated    │
│(Registration)│    │   (90d)     │    │  Renewal    │    │ (90d Reset) │
└──────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
```

## 7. Security Recommendations

### Immediate Improvements
1. **Remove JWT Secret Logging** in production builds
2. **Implement Config File Encryption** for sensitive fields
3. **Add Token Usage Monitoring** and anomaly detection
4. **Secure Registration Token Distribution** beyond config files

### Configuration Encryption Implementation
1. **Use AES-256-GCM** for field-level encryption
2. **Store encryption keys** in kernel keyring or secure environment
3. **Implement atomic config updates** to prevent corruption
4. **Provide migration utilities** for existing deployments
5. **Add config backup/restore** functionality

### Long-term Security Enhancements
1. **Hardware Security Modules (HSMs)** for key management
2. **Certificate-based authentication** as alternative to tokens
3. **Zero-trust architecture** for agent-server communication
4. **Regular security audits** and penetration testing

## 8. Conclusion

The RedFlag token authentication system is well-designed with proper separation of concerns and appropriate token lifetimes. The main security consideration is the plaintext storage of tokens in agent configuration files.

**Key Takeaways:**
- The rotating token system is **ACTIVE** and refers to refresh token rotation
- Config encryption is feasible but requires careful key management
- Token refresh workflow must remain functional after encryption
- Gradual migration path is essential for existing deployments

The recommended approach is **selective field encryption** with strong key management practices, ensuring the token refresh workflow remains operational while significantly improving security at rest.