# A2: Replay Attack Fix Implementation Report

**Date:** 2026-03-28

**Branch:** unstabledeveloper

**Audit Reference:** docs/A2_Replay_Attack_Audit.md

---

## Summary

This document covers the implementation of fixes for 7 audit findings (F-1 through F-7) identified in the replay attack surface audit. All fixes maintain backward compatibility with pre-A1 agents and servers.

---

## Files Changed

### Server Side

| File | Change |
|------|--------|
| `aggregator-server/internal/services/signing.go` | v3 signed message format includes agent_id (F-1) |
| `aggregator-server/internal/models/command.go` | Added `ExpiresAt`, `AgentID`, `CreatedAt` to structs (F-7, F-1, F-3) |
| `aggregator-server/internal/database/queries/commands.go` | TTL filter in GetPendingCommands/GetStuckCommands, expires_at in CreateCommand (F-6, F-7) |
| `aggregator-server/internal/api/handlers/updates.go` | RetryCommand refactored to sign via signAndCreateCommand (F-5) |
| `aggregator-server/internal/api/handlers/agents.go` | GetCommands passes AgentID and CreatedAt to CommandItem (F-1, F-3) |
| `aggregator-server/internal/database/queries/docker.go` | Fix pre-existing fmt.Sprintf build error (unrelated) |
| `aggregator-server/internal/database/migrations/026_add_expires_at.up.sql` | New migration: expires_at column + index + backfill (F-7) |
| `aggregator-server/internal/database/migrations/026_add_expires_at.down.sql` | Rollback migration (F-7) |

### Agent Side

| File | Change |
|------|--------|
| `aggregator-agent/internal/crypto/verification.go` | v3 message format, field-count detection, old-format 48h expiry (F-1, F-3) |
| `aggregator-agent/internal/orchestrator/command_handler.go` | Dedup set, commandMaxAge=4h, CleanupExecutedIDs (F-2, F-4) |
| `aggregator-agent/internal/client/client.go` | Added AgentID and CreatedAt to Command struct (F-1, F-3) |
| `aggregator-agent/cmd/agent/main.go` | Wired CleanupExecutedIDs into key refresh cycle (F-2) |

### Test Files (Updated)

| File | Tests Updated |
|------|---------------|
| `aggregator-server/internal/services/signing_replay_test.go` | TestRetryCommandIsUnsigned, TestRetryCommandMustBeSigned, TestSignedCommandNotBoundToAgent, TestOldFormatCommandHasNoExpiry |
| `aggregator-server/internal/database/queries/commands_ttl_test.go` | TestGetPendingCommandsHasNoTTLFilter, TestGetPendingCommandsMustHaveTTLFilter |
| `aggregator-server/internal/api/handlers/retry_signing_test.go` | simulateRetryCommand, TestRetryCommandEndpointProducesUnsignedCommand, TestRetryCommandEndpointMustProduceSignedCommand |
| `aggregator-agent/internal/crypto/replay_test.go` | TestOldFormatReplayIsUnbounded, TestNewFormatCommandCanBeReplayedWithin24Hours, TestSameCommandCanBeVerifiedTwice, TestCrossAgentSignatureVerifies; new: TestOldFormatRecentCommandStillPasses, TestCommandBeyond4HoursIsRejected |
| `aggregator-agent/internal/crypto/verification_test.go` | All tests updated for v3 format (AgentID), signCommand helper updated, signCommandV2 added |

---

## Signed Message Format (v3)

### New Format

```
"{agent_id}:{cmd_id}:{command_type}:{sha256(params)}:{unix_timestamp}"
```

5 colon-separated fields.

### Previous Formats (backward compat)

- **v2 (4 fields):** `"{cmd_id}:{command_type}:{sha256(params)}:{unix_timestamp}"` (has signed_at, no agent_id)
- **v1 (3 fields):** `"{cmd_id}:{command_type}:{sha256(params)}"` (no timestamp, no agent_id)

### Backward Compatibility Detection

The agent's `VerifyCommandWithTimestamp` detects the format:

1. If `cmd.AgentID != ""` → try v3 first. If v3 fails, fall back to v2 with warning.
2. If `cmd.AgentID == ""` and `cmd.SignedAt != nil` → v2 format with warning.
3. If `cmd.SignedAt == nil` → v1 format (oldest) with warning + 48h created_at check.

Warnings are logged at the `[crypto]` level to alert operators to upgrade.


---

## Deduplication Window

- **Implementation:** In-memory `executedIDs map[string]time.Time` in `CommandHandler`
- **Window:** Entries are kept for `commandMaxAge` (4 hours)
- **Cleanup:** Runs every 6 hours when `ShouldRefreshKey()` fires
- **Restart Limitation:** The map is lost on agent restart, so a command signed within the last `commandMaxAge` can be replayed after the agent restarts. A TODO comment documents the future disk persistence path.

---

## Two-Phase Plan for Retiring Old-Format Commands

### Phase 1 (Implemented Now)

- Old-format commands (no `signed_at`) whose `created_at` is more than 48h old are rejected by `VerifyCommand`
- Old-format commands within 48h still pass (backward compat for recent commands)
- The `created_at` field is now included in the `CommandItem` API response

### Phase 2 (Future Work: 90 Days After Migration 025 Deployment)

- Remove the old-format fallback in `VerifyCommandWithTimestamp` entirely
- Enforce `signed_at` as required on all commands
- Remove `VerifyCommand()` from the public API
- This ensures all commands use timestamped, agent-bound signatures

---

## Docker Build + Test Output

### Server Build

```
docker-compose build server
# ... builds successfully
Service server Built
```

### Server Tests

```
=== RUN TestRetryCommandIsUnsigned
--- PASS: TestRetryCommandIsUnsigned (0.00s)
=== RUN TestRetryCommandMustBeSigned
--- PASS: TestRetryCommandMustBeSigned (0.00s)
=== RUN TestSignedCommandNotBoundToAgent
--- PASS: TestSignedCommandNotBoundToAgent (0.00s)
=== RUN TestOldFormatCommandHasNoExpiry
--- PASS: TestOldFormatCommandHasNoExpiry (0.00s)
ok github.com/Fimeg/RedFlag/aggregator-server/internal/services
=== RUN TestGetPendingCommandsHasNoTTLFilter
--- PASS: TestGetPendingCommandsHasNoTTLFilter (0.00s)
=== RUN TestGetPendingCommandsMustHaveTTLFilter
--- PASS: TestGetPendingCommandsMustHaveTTLFilter (0.00s)
=== RUN TestRetryCommandQueryDoesNotCopySignature
--- PASS: TestRetryCommandQueryDoesNotCopySignature (0.00s)
ok github.com/Fimeg/RedFlag/aggregator-server/internal/database/queries
=== RUN TestRetryCommandEndpointProducesUnsignedCommand
--- PASS: TestRetryCommandEndpointProducesUnsignedCommand (0.00s)
=== RUN TestRetryCommandEndpointMustProduceSignedCommand
--- PASS: TestRetryCommandEndpointMustProduceSignedCommand (0.00s)
=== RUN TestRetryCommandHTTPHandlerProducesUnsignedCommand_Integration
--- SKIP: TestRetryCommandHTTPHandlerProducesUnsignedCommand_Integration (0.00s)
ok github.com/Fimeg/RedFlag/aggregator-server/internal/api/handlers
```

### Agent Tests

```
=== RUN TestCacheMetadataIsExpired
--- PASS: TestCacheMetadataIsExpired (0.00s)
=== RUN TestOldFormatReplayIsUnbounded
--- PASS: TestOldFormatReplayIsUnbounded (0.00s)
=== RUN TestOldFormatRecentCommandStillPasses
--- PASS: TestOldFormatRecentCommandStillPasses (0.00s)
=== RUN TestNewFormatCommandCanBeReplayedWithin24Hours
--- PASS: TestNewFormatCommandCanBeReplayedWithin24Hours (0.00s)
=== RUN TestCommandBeyond4HoursIsRejected
--- PASS: TestCommandBeyond4HoursIsRejected (0.00s)
=== RUN TestSameCommandCanBeVerifiedTwice
--- PASS: TestSameCommandCanBeVerifiedTwice (0.00s)
=== RUN TestCrossAgentSignatureVerifies
--- PASS: TestCrossAgentSignatureVerifies (0.00s)
=== RUN TestVerifyCommandWithTimestamp_ValidRecent
--- PASS: TestVerifyCommandWithTimestamp_ValidRecent (0.00s)
=== RUN TestVerifyCommandWithTimestamp_TooOld
--- PASS: TestVerifyCommandWithTimestamp_TooOld (0.00s)
=== RUN TestVerifyCommandWithTimestamp_FutureBeyondSkew
--- PASS: TestVerifyCommandWithTimestamp_FutureBeyondSkew (0.00s)
=== RUN TestVerifyCommandWithTimestamp_FutureWithinSkew
--- PASS: TestVerifyCommandWithTimestamp_FutureWithinSkew (0.00s)
=== RUN TestVerifyCommandWithTimestamp_BackwardCompatNoTimestamp
--- PASS: TestVerifyCommandWithTimestamp_BackwardCompatNoTimestamp (0.00s)
=== RUN TestVerifyCommandWithTimestamp_WrongKey
--- PASS: TestVerifyCommandWithTimestamp_WrongKey (0.00s)
=== RUN TestVerifyCommand_BackwardCompat
--- PASS: TestVerifyCommand_BackwardCompat (0.00s)
ok github.com/Fimeg/RedFlag/aggregator-agent/internal/crypto
```

All tests pass. No regressions detected.