P0-001: Rate Limit First Request Bug
Priority: P0 (Critical)
Source Reference: From RateLimitFirstRequestBug.md line 4
Date Identified: 2025-11-12
Problem Description
Every FIRST agent registration gets rate limited with HTTP 429 Too Many Requests, even though it's the very first request from a clean system. This happens consistently when running the one-liner installer, forcing a 1-minute wait before the registration succeeds.
Expected Behavior: First registration should succeed immediately (0/5 requests used)
Actual Behavior: First registration gets 429 Too Many Requests
Reproduction Steps
- Full rebuild to ensure clean state:

    docker-compose down -v --remove-orphans && \
    rm config/.env && \
    docker-compose build --no-cache && \
    cp config/.env.bootstrap.example config/.env && \
    docker-compose up -d

- Wait for server to be ready (sleep 10)
- Complete setup wizard and generate a registration token
- Make first registration API call:

    curl -v -X POST http://localhost:8080/api/v1/agents/register \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $TOKEN" \
      -d '{
        "hostname": "test-host",
        "os_type": "linux",
        "os_version": "Fedora 39",
        "os_architecture": "x86_64",
        "agent_version": "0.1.17"
      }'

- Observe 429 response on first request
Root Cause Analysis
The most likely cause is a key-namespace bug in the rate limiter: keys are not namespaced by limit type, so different endpoints share the same counter.
Current (broken) implementation:
key := keyFunc(c) // Just "127.0.0.1"
allowed, resetTime := rl.checkRateLimit(key, config)
The issue: the Download, Install, and Register endpoints all use the same IP-based key, so the three requests count against one shared 5-request limit instead of each limit type having its own.
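The shared counter can be reproduced in isolation. The sketch below is not the project's `rate_limiter.go`; `counter` and `hit` are hypothetical names, and it assumes a plain per-key request count with a limit of 5. It only shows that a registration arriving after a download and an install-script request no longer sees a fresh budget.

```go
package main

import "fmt"

// counter is a hypothetical stand-in for the limiter's state:
// one request count per key, with no notion of limit type.
type counter map[string]int

// hit records one request for key and reports how many requests
// remain under limit (never negative).
func (c counter) hit(key string, limit int) int {
	c[key]++
	if remaining := limit - c[key]; remaining > 0 {
		return remaining
	}
	return 0
}

func main() {
	const limit = 5
	c := counter{}
	// The broken version keys only on the client IP.
	ip := "127.0.0.1"

	// Download request.
	c.hit(ip, limit)
	// Install-script request.
	c.hit(ip, limit)
	// First registration request: it shares the counter used above.
	remaining := c.hit(ip, limit)
	fmt.Println("remaining after first registration:", remaining)
}
```

With an unshared counter the first registration would report 4 remaining; here it reports 2, because the two earlier requests were charged to the same key.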
Proposed Solution
Implement namespacing for rate limiter keys by limit type:
key := keyFunc(c)
namespacedKey := limitType + ":" + key // "agent_registration:127.0.0.1"
allowed, resetTime := rl.checkRateLimit(namespacedKey, config)
This ensures:
- agent_registration endpoints get their own counter per IP
- public_access endpoints (downloads, install scripts) get their own counter
- agent_reports endpoints get their own counter
Definition of Done
- First agent registration request succeeds with HTTP 200/201
- Rate limit headers show X-RateLimit-Remaining: 4 on first request
- Multiple endpoints don't interfere with each other's counters
- Rate limiting still works correctly after 5 requests to same endpoint type
- Agent one-liner installer works without forced 1-minute wait
Test Plan
- Direct API Test:

    # Test 1: Verify first request succeeds
    # (%header{...} requires a curl version that supports it in --write-out)
    curl -s -w "\nStatus: %{http_code}, Remaining: %header{x-ratelimit-remaining}\n" \
      -X POST http://localhost:8080/api/v1/agents/register \
      -H "Authorization: Bearer $TOKEN" \
      -d '{"hostname":"test","os_type":"linux","os_version":"test","os_architecture":"x86_64","agent_version":"0.1.17"}'
    # Expected: Status: 200/201, Remaining: 4

- Cross-Endpoint Isolation Test:

    # Make requests to different endpoint types
    curl http://localhost:8080/api/v1/downloads/linux/amd64  # public_access
    curl http://localhost:8080/api/v1/install/linux  # public_access
    curl -X POST http://localhost:8080/api/v1/agents/register -H "Authorization: Bearer $TOKEN" -d '{"hostname":"test"}'  # agent_registration
    # Registration should still have full limit available

- Rate Limit Still Works Test:

    # Make 6 registration requests
    for i in {1..6}; do
      curl -s -w "Request $i: %{http_code}\n" \
        -X POST http://localhost:8080/api/v1/agents/register \
        -H "Authorization: Bearer $TOKEN" \
        -d "{\"hostname\":\"test-$i\",\"os_type\":\"linux\"}"
    done
    # Expected: Requests 1-5 = 200/201, Request 6 = 429

- Agent Binary Integration Test:

    # Download and test actual agent registration
    wget http://localhost:8080/api/v1/downloads/linux/amd64 -O redflag-agent
    chmod +x redflag-agent
    ./redflag-agent --server http://localhost:8080 --token "$TOKEN" --register
    # Should succeed immediately without rate limit errors
Files to Modify
- aggregator-server/internal/api/middleware/rate_limiter.go (likely location)
- Any rate limiting configuration files
- Tests for rate limiting functionality
Impact
- Critical: Blocks new agent installations
- User Experience: Forces unnecessary 1-minute delays during setup
- Reliability: Makes system appear broken during normal operations
- Production: Prevents smooth agent deployment workflows
Verification Commands
After fix implementation:
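# Check rate limit headers on first request
# (-D - prints the response headers; -I would request HEAD and conflicts with -d)
curl -s -o /dev/null -D - -X POST http://localhost:8080/api/v1/agents/register \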
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"hostname":"test"}'
# Should show:
# X-RateLimit-Limit: 5
# X-RateLimit-Remaining: 4
# X-RateLimit-Reset: [timestamp]