Complete RedFlag codebase with two major security audit implementations.
== A-1: Ed25519 Key Rotation Support ==
Server:
- SignCommand sets SignedAt timestamp and KeyID on every signature
- signing_keys database table (migration 020) for multi-key rotation
- InitializePrimaryKey registers active key at startup
- /api/v1/public-keys endpoint for rotation-aware agents
- SigningKeyQueries for key lifecycle management
Agent:
- Key-ID-aware verification via CheckKeyRotation
- FetchAndCacheAllActiveKeys for rotation pre-caching
- Cache metadata with TTL and staleness fallback
- SecurityLogger events for key rotation and command signing
== A-2: Replay Attack Fixes (F-1 through F-7) ==
F-5 CRITICAL - RetryCommand now signs via signAndCreateCommand
F-1 HIGH - v3 format: "{agent_id}:{cmd_id}:{type}:{hash}:{ts}"
F-7 HIGH - Migration 026: expires_at column with partial index
F-6 HIGH - GetPendingCommands/GetStuckCommands filter by expires_at
F-2 HIGH - Agent-side executedIDs dedup map with cleanup
F-4 HIGH - commandMaxAge reduced from 24h to 4h
F-3 CRITICAL - Old-format commands rejected after 48h via CreatedAt
Verification fixes: migration idempotency (ETHOS #4), log format
compliance (ETHOS #1), stale comments updated.
All 24 tests passing. Docker --no-cache build verified.
See docs/ for full audit reports and deviation log (DEV-001 to DEV-019).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A-2 Pre-Fix Test Suite
Date: 2026-03-28
Branch: unstabledeveloper
Purpose: Document replay attack bugs BEFORE fixes are applied.
These tests prove that the bugs exist today and will prove the fixes work when applied. Do NOT modify these tests before the fix is ready — they are the regression baseline.
Test Files Created
| File | Package | Bugs Documented |
|---|---|---|
| `aggregator-server/internal/services/signing_replay_test.go` | `services_test` | F-5, F-1, F-3 |
| `aggregator-agent/internal/crypto/replay_test.go` | `crypto` | F-3, F-4, F-2, F-1 |
| `aggregator-server/internal/database/queries/commands_ttl_test.go` | `queries_test` | F-6, F-7, F-5 |
| `aggregator-server/internal/api/handlers/retry_signing_test.go` | `handlers_test` | F-5 |
How to Run
# Server-side tests (all pre-fix tests)
cd aggregator-server && go test ./internal/services/... -v -run "TestRetry|TestSigned|TestOld"
cd aggregator-server && go test ./internal/database/queries/... -v -run TestGetPending
cd aggregator-server && go test ./internal/api/handlers/... -v -run TestRetryCommand
# Agent-side tests (all pre-fix tests)
cd aggregator-agent && go test ./internal/crypto/... -v -run "TestOld|TestNew|TestSame|TestCross"
# Run everything with verbose output
cd aggregator-server && go test ./... -v 2>&1 | grep -E "(PASS|FAIL|BUG|---)"
cd aggregator-agent && go test ./... -v 2>&1 | grep -E "(PASS|FAIL|BUG|---)"
Test Inventory
Behaviour Categories
PASS-NOW / FAIL-AFTER-FIX — Asserts the CURRENT (buggy) behaviour. The test passes because the bug exists. When the fix is applied, the behaviour changes and this test fails — signalling that the test itself needs to be updated to assert the new correct state.
FAIL-NOW / PASS-AFTER-FIX — Asserts the CORRECT post-fix behaviour. The test fails because the bug exists. When the fix is applied, the assertion becomes true and the test passes — proving the fix works.
File 1: aggregator-server/internal/services/signing_replay_test.go
TestRetryCommandIsUnsigned
- Bug: F-5 — RetryCommand creates unsigned commands
- What it asserts: `retried.Signature == ""`, `retried.SignedAt == nil`, `retried.KeyID == ""`
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: `queries.RetryCommand` (commands.go:189) builds a new `AgentCommand` struct without calling `signAndCreateCommand`. All three signature fields are zero values.
- What changes after fix: `RetryCommand` will call `signAndCreateCommand`, so `retried.Signature` will be non-empty and the assertions flip to failures.
- Operator impact: Until fixed, every "Retry" click in the dashboard creates an unsigned command. In strict enforcement mode the agent rejects it silently, logging "command verification failed: strict enforcement requires signed commands". The server returns HTTP 200, so the operator sees no error.
TestRetryCommandMustBeSigned
- Bug: F-5 — RetryCommand creates unsigned commands
- What it asserts: `retried.Signature != ""`, `retried.SignedAt != nil`, `retried.KeyID != ""`
- Category: FAIL-NOW / PASS-AFTER-FIX
- Why it currently fails: The retry command is unsigned (bug F-5 exists).
- What changes after fix: All three fields will be populated; test passes.
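The two F-5 tests above assert opposite states of the same three fields. As a minimal sketch (the trimmed `AgentCommand` struct and both predicate names are hypothetical, not the project's code), the pairing can be expressed as:

```go
package main

import "time"

// AgentCommand is a trimmed, hypothetical stand-in for the real struct.
// Only the three signature fields discussed above are modelled.
type AgentCommand struct {
	Signature string
	SignedAt  *time.Time
	KeyID     string
}

// isUnsigned mirrors the PASS-NOW assertion: all three fields are zero values.
func isUnsigned(c AgentCommand) bool {
	return c.Signature == "" && c.SignedAt == nil && c.KeyID == ""
}

// isFullySigned mirrors the FAIL-NOW assertion: all three fields are populated.
func isFullySigned(c AgentCommand) bool {
	return c.Signature != "" && c.SignedAt != nil && c.KeyID != ""
}

// newSignedExample returns a command with placeholder signature data,
// standing in for what a signing path might produce.
func newSignedExample() AgentCommand {
	ts := time.Date(2026, 3, 28, 12, 0, 0, 0, time.UTC)
	return AgentCommand{Signature: "placeholder-signature", SignedAt: &ts, KeyID: "placeholder-key"}
}
```

A partially populated command (for example, only `Signature` set) satisfies neither predicate, which is one reason to keep both assertions rather than treating one as the negation of the other.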
TestSignedCommandNotBoundToAgent
- Bug: F-1 (`agent_id` absent from signed payload)
- What it asserts: `agentA.String()` is NOT in the signed message, and `ed25519.Verify` returns `true` for the command regardless of which agent receives it.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: Signed message is `{id}:{type}:{sha256(params)}:{ts}`. No `agent_id` component. `ed25519.Verify` ignores anything outside the signed message.
- What changes after fix: When `agent_id` is added to the signed message, the message reconstructed in the test (without `agent_id`) will not match the signature, so `ed25519.Verify` returns `false` and the test fails.
- Attack scenario: An attacker with DB write access can copy a signed command from agent A into agent B's `agent_commands` queue. The signature passes verification on agent B.
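The attack scenario can be reproduced outside the codebase with the standard `crypto/ed25519` package. The sketch below is illustrative only: `buildMessage` and `crossAgentReplay` are hypothetical helpers, the Unix-seconds timestamp encoding and hex-encoded hash are assumptions, and the agent-bound layout follows the `{agent_id}:{cmd_id}:{type}:{hash}:{ts}` shape named in the commit message.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// buildMessage joins the signed fields. With agentID == "" it produces the
// pre-fix layout "{id}:{type}:{sha256(params)}:{ts}"; otherwise the agent ID
// is prefixed. The integer timestamp format is an assumption.
func buildMessage(agentID, cmdID, cmdType string, params []byte, ts int64) []byte {
	sum := sha256.Sum256(params)
	msg := fmt.Sprintf("%s:%s:%s:%d", cmdID, cmdType, hex.EncodeToString(sum[:]), ts)
	if agentID != "" {
		msg = agentID + ":" + msg
	}
	return []byte(msg)
}

// crossAgentReplay signs one command for agent A, then reports whether the
// signature verifies when agent A and agent B each reconstruct the message.
func crossAgentReplay(bindAgent bool) (verifiesOnA, verifiesOnB bool, err error) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand.Reader
	if err != nil {
		return false, false, err
	}
	params := []byte(`{"package":"curl"}`)
	const ts = int64(1700000000)

	// idFor returns the agent ID only when binding is enabled.
	idFor := func(agent string) string {
		if bindAgent {
			return agent
		}
		return ""
	}

	sig := ed25519.Sign(priv, buildMessage(idFor("agent-a"), "cmd-1", "install", params, ts))
	verifiesOnA = ed25519.Verify(pub, buildMessage(idFor("agent-a"), "cmd-1", "install", params, ts), sig)
	verifiesOnB = ed25519.Verify(pub, buildMessage(idFor("agent-b"), "cmd-1", "install", params, ts), sig)
	return verifiesOnA, verifiesOnB, nil
}
```

With `bindAgent` set to `false` both agents accept the signature, which is the F-1 behaviour. With `bindAgent` set to `true` agent B reconstructs a different message and verification fails.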
TestOldFormatCommandHasNoExpiry
- Bug: F-3 (old-format commands, which have no `signed_at`, are valid forever)
- What it asserts: `ed25519.Verify` returns `true` for an old-format signature (no timestamp in the message) regardless of when verification occurs.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: `ed25519.Verify` is a pure cryptographic check with no time component. The old format `{id}:{type}:{sha256(params)}` contains no timestamp, so there is nothing to expire.
- What changes after fix: Either `VerifyCommand` is updated to reject old-format commands outright (requiring `signed_at`), or a `created_at` check is added. The test would then need to be updated to expect rejection.
File 2: aggregator-agent/internal/crypto/replay_test.go
Uses helpers `generateKeyPair`, `signCommand`, `signCommandOld` from `verification_test.go` (same package).
TestOldFormatReplayIsUnbounded
- Bug: F-3 (`VerifyCommand` has no time check)
- What it asserts: `v.VerifyCommand(cmd, pub)` returns `nil` for a command with `SignedAt == nil` (old format), regardless of age.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: `VerifyCommand` (verification.go:25) performs only an Ed25519 signature check. No `created_at` or `SignedAt` field is examined.
- What changes after fix: After adding an expiry check, `VerifyCommand` will return an error for old-format commands beyond a defined age, and this test will fail.
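One possible shape for such an expiry check is sketched below. It assumes the 48h limit and the use of `CreatedAt` mentioned in the commit message. The `Command` struct, the error value, and both helper names are hypothetical, and the exact boundary comparison in the real verifier may differ.

```go
package main

import (
	"errors"
	"time"
)

// oldFormatMaxAge follows the 48h figure from the commit message.
const oldFormatMaxAge = 48 * time.Hour

// Command is a hypothetical, trimmed command shape for this sketch.
type Command struct {
	SignedAt  *time.Time // nil marks an old-format command
	CreatedAt time.Time
}

// ErrOldFormatExpired is a hypothetical error value.
var ErrOldFormatExpired = errors.New("old-format command exceeds maximum age")

// checkOldFormatAge rejects old-format commands created more than
// oldFormatMaxAge before now. New-format commands are left to the
// timestamp-based check and pass through here.
func checkOldFormatAge(cmd Command, now time.Time) error {
	if cmd.SignedAt != nil {
		return nil
	}
	if now.Sub(cmd.CreatedAt) > oldFormatMaxAge {
		return ErrOldFormatExpired
	}
	return nil
}

// oldFormatRejectedAfterHours builds an old-format command created the given
// number of hours before a fixed instant and reports whether it is rejected.
func oldFormatRejectedAfterHours(hours int) bool {
	now := time.Date(2026, 3, 28, 12, 0, 0, 0, time.UTC)
	cmd := Command{CreatedAt: now.Add(-time.Duration(hours) * time.Hour)}
	return checkOldFormatAge(cmd, now) != nil
}
```

Passing `now` in as a parameter keeps the check deterministic under test, so a case like "47 hours old passes, 49 hours old is rejected" does not depend on the wall clock.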
TestNewFormatCommandCanBeReplayedWithin24Hours
- Bug: F-4 — 24-hour replay window (large but intentional)
- What it asserts: `VerifyCommandWithTimestamp` returns `nil` for a command signed 23h59m ago (within the 24h `commandMaxAge`).
- Category: PASS-NOW / WILL-REMAIN-PASSING until `commandMaxAge` is reduced
- Why it currently passes: By design. The 24h window is intentional to accommodate polling intervals and network delays.
- What changes after fix: If `commandMaxAge` is reduced (e.g. to 4h per the A-2 audit recommendation), this test will FAIL for commands older than the new limit. Update the `time.Duration` in the test when `commandMaxAge` is changed.
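One way to avoid hand-editing the test duration each time the window changes is to derive the boundary ages from the constant itself. The sketch below assumes the 4h value and an inclusive comparison. `withinMaxAge`, `boundaryAges`, and `checkBoundaries` are hypothetical names and do not describe how `VerifyCommandWithTimestamp` is implemented.

```go
package main

import "time"

// commandMaxAge uses the 4h value recommended by the A-2 audit.
const commandMaxAge = 4 * time.Hour

// withinMaxAge reports whether a command signed at signedAt is still inside
// the replay window at time now. The inclusive boundary is an assumption.
func withinMaxAge(signedAt, now time.Time) bool {
	return now.Sub(signedAt) <= commandMaxAge
}

// boundaryAges returns one age just inside and one just outside the window,
// computed from the constant so tests track any future change to it.
func boundaryAges() (inside, outside time.Duration) {
	return commandMaxAge - time.Minute, commandMaxAge + time.Minute
}

// checkBoundaries evaluates both boundary ages against a fixed instant.
func checkBoundaries() (insideAccepted, outsideAccepted bool) {
	now := time.Date(2026, 3, 28, 12, 0, 0, 0, time.UTC)
	inside, outside := boundaryAges()
	return withinMaxAge(now.Add(-inside), now), withinMaxAge(now.Add(-outside), now)
}
```

With the constant at 4h, the inside age works out to 3h59m and the outside age to 4h01m.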
TestSameCommandCanBeVerifiedTwice
- Bug: F-2 — No nonce; same command verifies any number of times
- What it asserts: `VerifyCommandWithTimestamp` returns `nil` on the second and third call with identical inputs.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: `VerifyCommandWithTimestamp` is a stateless pure function. No nonce, no executed-command set, no single-use guarantee.
- What changes after fix: After agent-side deduplication (executed-command ID set) is added, the second call for a previously-seen command UUID will return an error.
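An executed-command ID set with cleanup might look roughly like the following. This is a sketch under assumptions, not the agent's actual code: the type, its methods, and the retention parameter are hypothetical, and the demo function uses fixed timestamps.

```go
package main

import (
	"sync"
	"time"
)

// executedIDs remembers command IDs that have already been processed.
type executedIDs struct {
	mu   sync.Mutex
	seen map[string]time.Time // command ID -> time first seen
}

func newExecutedIDs() *executedIDs {
	return &executedIDs{seen: make(map[string]time.Time)}
}

// markExecuted records id and reports true only the first time it is seen.
func (e *executedIDs) markExecuted(id string, now time.Time) bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	if _, dup := e.seen[id]; dup {
		return false
	}
	e.seen[id] = now
	return true
}

// cleanup drops entries older than retention and returns how many were removed.
func (e *executedIDs) cleanup(now time.Time, retention time.Duration) int {
	e.mu.Lock()
	defer e.mu.Unlock()
	removed := 0
	for id, at := range e.seen {
		if now.Sub(at) > retention {
			delete(e.seen, id)
			removed++
		}
	}
	return removed
}

// replayDemo processes the same ID twice, then runs cleanup five hours later
// with a four-hour retention.
func replayDemo() (first, second bool, removed int) {
	start := time.Date(2026, 3, 28, 12, 0, 0, 0, time.UTC)
	set := newExecutedIDs()
	first = set.markExecuted("cmd-1", start)
	second = set.markExecuted("cmd-1", start.Add(time.Minute))
	removed = set.cleanup(start.Add(5*time.Hour), 4*time.Hour)
	return first, second, removed
}
```

The retention passed to `cleanup` should be at least as long as the signature age limit. Otherwise an ID could be evicted while its signature is still within the accepted window, reopening the replay.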
TestCrossAgentSignatureVerifies
- Bug: F-1 — Signed message has no agent binding
- What it asserts: The signed message components are `[cmd_id, cmd_type, sha256(params), timestamp]`, with no `agent_id`. `VerifyCommandWithTimestamp` passes for a copy of the command representing delivery to a different agent.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: `client.Command` has no `agent_id` field, and `reconstructMessageWithTimestamp` does not include one.
- What changes after fix: After `agent_id` is added to the signed message (and correspondingly to `client.Command`), the reconstructed message in the verifier will include `agent_id`, and a command signed for agent A will fail verification on agent B.
File 3: aggregator-server/internal/database/queries/commands_ttl_test.go
These tests operate on a copied query string constant. When the fix adds a TTL clause to `GetPendingCommands`, update `getPendingCommandsQuery` in this file to match.
TestGetPendingCommandsHasNoTTLFilter
- Bug: F-6 + F-7 (`GetPendingCommands` has no TTL filter; no `expires_at` column)
- What it asserts: The query string does NOT contain `"INTERVAL"` or `"expires_at"`.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: The production query (commands.go:52) is `SELECT * FROM agent_commands WHERE agent_id = $1 AND status = 'pending' ORDER BY created_at ASC LIMIT 100`. Neither `INTERVAL` nor `expires_at` appears.
- What changes after fix: Update `getPendingCommandsQuery` to the new query containing the TTL clause. The absence-assertions will then fail (indicator found), so update them.
TestGetPendingCommandsMustHaveTTLFilter
- Bug: F-6 + F-7 — same
- What it asserts: The query DOES contain a TTL indicator (`"INTERVAL"` or `"expires_at"`).
- Category: FAIL-NOW / PASS-AFTER-FIX
- Why it currently fails: No TTL clause exists in the current query.
- What changes after fix: Update `getPendingCommandsQuery`; the indicator will be found and the test passes.
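The string-level check used by both TTL tests can be sketched as follows. The pre-fix query is copied from the text above. The post-fix query is purely hypothetical (the actual clause chosen by the fix is not shown in this document), and `hasTTLFilter` is an illustrative helper name.

```go
package main

import "strings"

// preFixQuery is the production query quoted above (commands.go:52).
const preFixQuery = `SELECT * FROM agent_commands WHERE agent_id = $1 AND status = 'pending' ORDER BY created_at ASC LIMIT 100`

// postFixQuery is a hypothetical example of a query with a TTL clause.
// The real post-fix query may be written differently.
const postFixQuery = `SELECT * FROM agent_commands WHERE agent_id = $1 AND status = 'pending' AND (expires_at IS NULL OR expires_at > NOW()) ORDER BY created_at ASC LIMIT 100`

// hasTTLFilter reports whether the query text contains either TTL indicator.
// The match is case-sensitive, like a plain substring assertion.
func hasTTLFilter(query string) bool {
	return strings.Contains(query, "INTERVAL") || strings.Contains(query, "expires_at")
}
```

Because the check inspects a copied string rather than the live query, it only stays meaningful while the copy is kept in sync with production.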
TestRetryCommandQueryDoesNotCopySignature
- Bug: F-5 (query-layer confirmation)
- What it asserts: Documentary. Logs that `RetryCommand` omits `signature`, `key_id`, `signed_at` from the new command struct.
- Category: Always passes (documentation test). Update the logged field lists when fix is applied.
File 4: aggregator-server/internal/api/handlers/retry_signing_test.go
TestRetryCommandEndpointProducesUnsignedCommand
- Bug: F-5 — Handler returns 200 but creates an unsigned command
- What it asserts: `retried.Signature == ""`, `retried.SignedAt == nil`, `retried.KeyID == ""` using `simulateRetryCommand`, which replicates the exact struct construction in `queries.RetryCommand`.
- Category: PASS-NOW / FAIL-AFTER-FIX
- Why it currently passes: `simulateRetryCommand` exactly mirrors the current production code (commands.go:202), with no signing call.
- What changes after fix: `simulateRetryCommand` must be updated to include the signing call, or the test must be rewritten against the fixed implementation.
TestRetryCommandEndpointMustProduceSignedCommand
- Bug: F-5
- What it asserts: `retried.Signature != ""`, `retried.SignedAt != nil`, `retried.KeyID != ""`
- Category: FAIL-NOW / PASS-AFTER-FIX
- Why it currently fails: `simulateRetryCommand` produces an unsigned command (bug exists).
- What changes after fix: The production code will produce a signed command; update `simulateRetryCommand` to call the signing service and the assertions will pass.
TestRetryCommandHTTPHandlerProducesUnsignedCommand_Integration
- Bug: F-5
- Status: Skipped — requires live DB or interface extraction (see TODO in file).
- How to enable: Extract `CommandQueriesInterface` from `CommandQueries` and update handlers to accept the interface, then replace `simulateRetryCommand` with a real handler invocation via `httptest`.
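The interface extraction described above could take roughly this shape. Everything here is an assumption made for illustration: the method signature, the plain `net/http` handler (the real handlers may use a different framework), the JSON field names, and the fake implementation. Only `CommandQueriesInterface` and the use of `httptest` come from the text.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
)

// AgentCommand is a trimmed, hypothetical response shape.
type AgentCommand struct {
	ID        string `json:"id"`
	Signature string `json:"signature"`
	KeyID     string `json:"key_id"`
}

// CommandQueriesInterface is the extracted seam. The method signature is an
// assumption made for this sketch.
type CommandQueriesInterface interface {
	RetryCommand(originalID string) (*AgentCommand, error)
}

// fakeQueries is an in-memory stand-in for the database-backed queries.
type fakeQueries struct{ signature string }

func (f fakeQueries) RetryCommand(originalID string) (*AgentCommand, error) {
	return &AgentCommand{ID: originalID + "-retry", Signature: f.signature, KeyID: "key-1"}, nil
}

// retryHandler depends only on the interface, so a test can inject a fake.
func retryHandler(q CommandQueriesInterface) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		cmd, err := q.RetryCommand(r.URL.Query().Get("id"))
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(cmd)
	}
}

// invokeRetry drives the handler through httptest without a live server or DB.
func invokeRetry(q CommandQueriesInterface, id string) (int, AgentCommand, error) {
	req := httptest.NewRequest(http.MethodPost, "/retry?id="+id, nil)
	rec := httptest.NewRecorder()
	retryHandler(q).ServeHTTP(rec, req)
	var out AgentCommand
	err := json.NewDecoder(rec.Body).Decode(&out)
	return rec.Code, out, err
}
```

A test built this way can assert on both the HTTP status and the signature fields of the returned command, which covers the "HTTP 200 but unsigned" case that the simulated tests can only approximate.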
State-Change Summary
| Test | Current State | After A-2 Fix |
|---|---|---|
| TestRetryCommandIsUnsigned | PASS | FAIL (flip expected) |
| **TestRetryCommandMustBeSigned** | **FAIL** | PASS |
| TestSignedCommandNotBoundToAgent | PASS | FAIL (flip expected) |
| TestOldFormatCommandHasNoExpiry | PASS | FAIL (flip expected) |
| TestOldFormatReplayIsUnbounded | PASS | FAIL (flip expected) |
| TestNewFormatCommandCanBeReplayedWithin24Hours | PASS | PASS (or FAIL if maxAge reduced) |
| TestSameCommandCanBeVerifiedTwice | PASS | FAIL (flip expected) |
| TestCrossAgentSignatureVerifies | PASS | FAIL (flip expected) |
| TestGetPendingCommandsHasNoTTLFilter | PASS | FAIL (flip expected) |
| **TestGetPendingCommandsMustHaveTTLFilter** | **FAIL** | PASS |
| TestRetryCommandQueryDoesNotCopySignature | PASS | documentary (update manually) |
| TestRetryCommandEndpointProducesUnsignedCommand | PASS | FAIL (flip expected) |
| **TestRetryCommandEndpointMustProduceSignedCommand** | **FAIL** | PASS |

Tests in bold currently FAIL. These are the "tests written to fail with current code" that satisfy the TDD requirement directly. All other tests currently PASS, documenting the bug-as-behavior, and will flip to FAIL when the fix changes the behavior they assert.
Maintenance Notes
- When applying the fix for F-5: Update `simulateRetryCommand` in `retry_signing_test.go` to reflect the new signed-command production. Update the assertions in `TestRetryCommandIsUnsigned` and `TestRetryCommandEndpointProducesUnsignedCommand` to assert the correct post-fix state.
- When applying the fix for F-6/F-7: Update `getPendingCommandsQuery` in `commands_ttl_test.go` to the new query text. Invert the assertions in `TestGetPendingCommandsHasNoTTLFilter` to assert presence (not absence) of TTL.
- When applying the fix for F-3: Update `TestOldFormatCommandHasNoExpiry` and `TestOldFormatReplayIsUnbounded` to assert that old-format commands ARE rejected, or that the backward-compat path has a defined expiry.
- When applying the fix for F-1: Update `TestSignedCommandNotBoundToAgent` and `TestCrossAgentSignatureVerifies` to pass an `agent_id` into the signed message and assert that a cross-agent replay fails verification.
- When applying the fix for F-2: Update `TestSameCommandCanBeVerifiedTwice` to assert that the second call returns an error (deduplication firing).
Post-Fix Status (2026-03-28)
All fixes have been applied. Test status:
| Test | Pre-Fix | Post-Fix | Status |
|---|---|---|---|
| TestRetryCommandIsUnsigned | PASS | UPDATED — now asserts signed | VERIFIED PASSING |
| TestRetryCommandMustBeSigned | FAIL | UPDATED — now passes | VERIFIED PASSING |
| TestSignedCommandNotBoundToAgent | PASS | UPDATED — asserts agent_id binding | VERIFIED PASSING |
| TestOldFormatCommandHasNoExpiry | PASS | UPDATED — documents crypto vs app-layer | VERIFIED PASSING |
| TestOldFormatReplayIsUnbounded | PASS | UPDATED — asserts 48h rejection | VERIFIED PASSING |
| TestOldFormatRecentCommandStillPasses | N/A | NEW — backward compat for recent old-format | VERIFIED PASSING |
| TestNewFormatCommandCanBeReplayedWithin24Hours | PASS | UPDATED — uses 4h window (3h59m) | VERIFIED PASSING |
| TestCommandBeyond4HoursIsRejected | N/A | NEW — asserts 4h rejection | VERIFIED PASSING |
| TestSameCommandCanBeVerifiedTwice | PASS | UPDATED — documents verifier purity, dedup at ProcessCommand | VERIFIED PASSING |
| TestCrossAgentSignatureVerifies | PASS | UPDATED — asserts cross-agent failure | VERIFIED PASSING |
| TestGetPendingCommandsHasNoTTLFilter | PASS | UPDATED — asserts TTL presence | VERIFIED PASSING |
| TestGetPendingCommandsMustHaveTTLFilter | FAIL | UPDATED — now passes | VERIFIED PASSING |
| TestRetryCommandQueryDoesNotCopySignature | PASS | Unchanged (documentary) | VERIFIED PASSING |
| TestRetryCommandEndpointProducesUnsignedCommand | PASS | UPDATED — asserts signed | VERIFIED PASSING |
| TestRetryCommandEndpointMustProduceSignedCommand | FAIL | UPDATED — now passes | VERIFIED PASSING |