Commit Graph

3 Commits

Author SHA1 Message Date
f71f878a35 fix(concurrency): wire retry_count increment for stuck command re-delivery (DEV-029)
retry_count column and filter existed but counter was never
incremented. Stuck commands always had retry_count=0 and
always passed the WHERE retry_count < 5 filter, making
the cap ineffective.

Fix: Added RedeliverStuckCommandTx that sets
retry_count = retry_count + 1 on stuck->sent re-delivery.
GetCommands handler now uses MarkCommandSentTx for new
commands (retry_count stays 0) and RedeliverStuckCommandTx
for stuck re-delivery (retry_count increments).

All 77 tests pass. DEV-029 resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 08:16:12 -04:00
3ca42d50f4 fix(concurrency): B-2 data integrity and race condition fixes
- Wrap agent registration in DB transaction (F-B2-1/F-B2-8)
  All 4 ops atomic, manual DeleteAgent rollback removed
- Use SELECT FOR UPDATE SKIP LOCKED for atomic command delivery (F-B2-2)
  Concurrent requests get different commands, no duplicates
- Wrap token renewal in DB transaction (F-B2-9)
  Validate + update expiry atomic
- Add rate limit to GET /agents/:id/commands (F-B2-4)
  agent_checkin rate limiter applied
- Add retry_count column, cap stuck command retries at 5 (F-B2-10)
  Migration 029, GetStuckCommands filters retry_count < 5
- Cap polling jitter at current interval (fixes rapid mode) (F-B2-5)
  maxJitter = min(pollingInterval/2, 30s)
- Add exponential backoff with full jitter on reconnection (F-B2-7)
  calculateBackoff: base=10s, cap=5min, reset on success

All tests pass. No regressions from A-series or B-1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 08:00:36 -04:00
59ab7cbd5f test(concurrency): B-2 pre-fix tests for data integrity and concurrency bugs
Pre-fix test suite documenting 7 data integrity and concurrency
bugs. Tests FAIL where they assert correct post-fix behavior,
PASS where they document current buggy state.

Tests added:
- F-B2-1/8 HIGH: Registration not transactional (3 tests)
- F-B2-2 MEDIUM: Command delivery race condition (3 tests)
- F-B2-9 MEDIUM: Token renewal not transactional (2 tests)
- F-B2-4 MEDIUM: No rate limit on GetCommands (3 tests)
- F-B2-5 LOW: Jitter negates rapid mode (2 tests)
- F-B2-10 LOW: No max retry for stuck commands (2 tests)
- F-B2-7 MEDIUM: No exponential backoff on reconnection (2 tests)

Current state: 7 FAIL, 10 PASS. No A/B-1 regressions.
See docs/B2_PreFix_Tests.md for full inventory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 07:45:16 -04:00