Add docs and project files - force for Culurien
153
docs/3_BACKLOG/P0-001_Rate-Limit-First-Request-Bug.md
Normal file
@@ -0,0 +1,153 @@
# P0-001: Rate Limit First Request Bug

**Priority:** P0 (Critical)
**Source Reference:** From RateLimitFirstRequestBug.md line 4
**Date Identified:** 2025-11-12

## Problem Description

The first agent registration is always rate limited with HTTP 429 Too Many Requests, even when it is the very first registration request against a clean system. This happens consistently when running the one-liner installer and forces a 1-minute wait before registration succeeds.

**Expected Behavior:** First registration should succeed immediately (0/5 requests used)
**Actual Behavior:** First registration gets 429 Too Many Requests

## Reproduction Steps

1. Full rebuild to ensure clean state:
```bash
# rm -f keeps the && chain going when config/.env does not exist yet
docker-compose down -v --remove-orphans && \
  rm -f config/.env && \
  docker-compose build --no-cache && \
  cp config/.env.bootstrap.example config/.env && \
  docker-compose up -d
```

2. Wait for the server to be ready (for example, `sleep 10`)

3. Complete the setup wizard and generate a registration token (the commands below read it from `$TOKEN`)

4. Make first registration API call:
```bash
curl -v -X POST http://localhost:8080/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "hostname": "test-host",
    "os_type": "linux",
    "os_version": "Fedora 39",
    "os_architecture": "x86_64",
    "agent_version": "0.1.17"
  }'
```

5. Observe 429 response on first request

## Root Cause Analysis

The most likely cause is a **Rate Limiter Key Namespace Bug**: rate limiter keys aren't namespaced by limit type, so different endpoints share the same counter.

**Current (broken) implementation:**
```go
key := keyFunc(c) // Just "127.0.0.1"
allowed, resetTime := rl.checkRateLimit(key, config)
```

**The issue:** Download + Install + Register endpoints all use the same IP-based key, so 3 requests count against a shared 5-request limit.
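
The shared-counter effect can be illustrated with a minimal sketch. The `sharedLimiter` type and `simulate` helper are hypothetical and are not the project's code; window expiry is left out, and the number of prior requests is an assumption. With two prior requests the registration is still allowed but has only 2 requests remaining instead of 4; it is rejected only once the shared key has already been hit five times in the window, so the exact request count made by the installer is worth confirming during the fix.

```go
package main

import "fmt"

// sharedLimiter is a hypothetical counter keyed only by client IP,
// mirroring the un-namespaced key described above.
type sharedLimiter struct {
	limit  int
	counts map[string]int
}

// allow records one request for key and reports whether it is within the limit.
func (l *sharedLimiter) allow(key string) bool {
	l.counts[key]++
	return l.counts[key] <= l.limit
}

// remaining reports how many requests key has left, never below zero.
func (l *sharedLimiter) remaining(key string) int {
	if r := l.limit - l.counts[key]; r > 0 {
		return r
	}
	return 0
}

// simulate sends `prior` requests from other endpoints (downloads, install
// scripts) and then one registration request, all from the same IP.
func simulate(prior int) (allowed bool, remaining int) {
	rl := &sharedLimiter{limit: 5, counts: map[string]int{}}
	ip := "127.0.0.1"
	for i := 0; i < prior; i++ {
		rl.allow(ip)
	}
	allowed = rl.allow(ip)
	return allowed, rl.remaining(ip)
}

func main() {
	allowed, remaining := simulate(2)
	fmt.Println(allowed, remaining) // prints: true 2
}
```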

## Proposed Solution

Implement namespacing for rate limiter keys by limit type:

```go
key := keyFunc(c)
namespacedKey := limitType + ":" + key // "agent_registration:127.0.0.1"
allowed, resetTime := rl.checkRateLimit(namespacedKey, config)
```

This ensures:
- `agent_registration` endpoints get their own counter per IP
- `public_access` endpoints (downloads, install scripts) get their own counter
- `agent_reports` endpoints get their own counter

## Definition of Done

- [ ] First agent registration request succeeds with HTTP 200/201
- [ ] Rate limit headers show `X-RateLimit-Remaining: 4` on first request
- [ ] Multiple endpoints don't interfere with each other's counters
- [ ] Rate limiting still works correctly after 5 requests to same endpoint type
- [ ] Agent one-liner installer works without forced 1-minute wait

## Test Plan

1. **Direct API Test:**
```bash
# Test 1: Verify first request succeeds
# (%header{...} in --write-out requires curl 7.84.0 or newer)
curl -s -w "\nStatus: %{http_code}, Remaining: %header{x-ratelimit-remaining}\n" \
  -X POST http://localhost:8080/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"hostname":"test","os_type":"linux","os_version":"test","os_architecture":"x86_64","agent_version":"0.1.17"}'

# Expected: Status: 200/201, Remaining: 4
```

2. **Cross-Endpoint Isolation Test:**
```bash
# Make requests to different endpoint types (response bodies are discarded)
curl -s -o /dev/null http://localhost:8080/api/v1/downloads/linux/amd64 # public_access
curl -s -o /dev/null http://localhost:8080/api/v1/install/linux # public_access
curl -X POST http://localhost:8080/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"hostname":"test"}' # agent_registration

# Registration should still have full limit available
```

3. **Rate Limit Still Works Test:**
```bash
# Make 6 registration requests and print only the status codes
for i in {1..6}; do
  curl -s -o /dev/null -w "Request $i: %{http_code}\n" \
    -X POST http://localhost:8080/api/v1/agents/register \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
    -d "{\"hostname\":\"test-$i\",\"os_type\":\"linux\"}"
done

# Expected: Requests 1-5 = 200/201, Request 6 = 429
```

4. **Agent Binary Integration Test:**
```bash
# Download and test actual agent registration
wget http://localhost:8080/api/v1/downloads/linux/amd64 -O redflag-agent
chmod +x redflag-agent
./redflag-agent --server http://localhost:8080 --token "$TOKEN" --register

# Should succeed immediately without rate limit errors
```

## Files to Modify

- `aggregator-server/internal/api/middleware/rate_limiter.go` (likely location)
- Any rate limiting configuration files
- Tests for rate limiting functionality

## Impact

- **Critical:** Blocks new agent installations
- **User Experience:** Forces unnecessary 1-minute delays during setup
- **Reliability:** Makes system appear broken during normal operations
- **Production:** Prevents smooth agent deployment workflows

## Verification Commands

After fix implementation:
```bash
# Check rate limit headers on first request
# (-D - prints the response headers; -I would request HEAD and conflicts with -d)
curl -s -o /dev/null -D - -X POST http://localhost:8080/api/v1/agents/register \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"hostname":"test"}'

# Should show:
# X-RateLimit-Limit: 5
# X-RateLimit-Remaining: 4
# X-RateLimit-Reset: [timestamp]
```