Redflag/docs/D1_Fix_Implementation.md
jpetree331 db67049e92 fix(identity): D-1 machine ID deduplication fixes
- Remove unhashed 'unknown-' fallback from registration (F-D1-1)
  Registration aborts if GetMachineID() fails (no bad data)
- Add POST /admin/agents/:id/rebind-machine-id endpoint (F-D1-2)
  Admin can update stored machine ID after hardware change
- Delete dead example_integration.go with wrong usage (F-D1-3)
- Remove redundant Windows machineid.ID() retry (F-D1-4)
- Replace fmt.Printf with log.Printf in client.go (F-D1-5)

Operator note: agents registered with 'unknown-' machine IDs
must be rebound before upgrading. See D1_Fix_Implementation.md.

All tests pass. No regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 09:53:43 -04:00

# D-1 Machine ID Fix Implementation
**Date:** 2026-03-29
**Branch:** culurien

---
## Files Changed
| File | Change |
|------|--------|
| `cmd/agent/main.go` | Removed unhashed "unknown-" fallback; registration aborts if GetMachineID() fails (F-D1-1) |
| `internal/client/client.go` | Replaced fmt.Printf with log.Printf for machine ID errors (F-D1-5) |
| `internal/system/machine_id.go` | Removed redundant machineid.ID() retry in Windows fallback, added Windows reinstall documentation (F-D1-4) |
| `internal/logging/example_integration.go` | DELETED — dead code with incorrect machineid.ID() usage (F-D1-3) |
| `server/internal/api/handlers/agents.go` | Added RebindMachineID admin endpoint (F-D1-2) |
| `server/internal/database/queries/agents.go` | Added UpdateMachineID query function (F-D1-2) |
| `server/cmd/server/main.go` | Registered rebind-machine-id admin route (F-D1-2) |
## Strategy (Task 1): Option C
Option C was chosen: trust the canonical `system.GetMachineID()` entirely. If it fails (which requires ALL fallbacks to fail, including the final hostname-os-arch fallback), registration aborts with `log.Fatalf`. Because the internal fallback chain in `GetMachineID()` always produces a SHA256 hash, every machine ID sent at registration has the same 64-character hex format, and the unhashed "unknown-" value can no longer be stored.
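The abort-on-failure behaviour can be sketched as follows. This is a minimal illustration, not the code in `cmd/agent/main.go` or `internal/system/machine_id.go`: `hashID` and `fallbackMachineID` are hypothetical names, and the exact string that the real hostname-os-arch fallback hashes is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"log"
	"os"
	"runtime"
)

// hashID returns the lowercase hex SHA-256 digest of raw.
// The result is always 64 characters long.
func hashID(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// fallbackMachineID is a hypothetical stand-in for the last step of the
// fallback chain: it hashes "hostname-os-arch" (assumed format).
func fallbackMachineID() (string, error) {
	host, err := os.Hostname()
	if err != nil || host == "" {
		return "", errors.New("hostname unavailable")
	}
	return hashID(fmt.Sprintf("%s-%s-%s", host, runtime.GOOS, runtime.GOARCH)), nil
}

func main() {
	id, err := fallbackMachineID()
	if err != nil {
		// No unhashed "unknown-" value is substituted; registration stops here.
		log.Fatalf("cannot determine machine ID, aborting registration: %v", err)
	}
	fmt.Println("machine ID:", id)
}
```

The point of the sketch is that there is no code path that yields a non-hashed identifier: the function either returns a 64-character hex string or an error that stops registration.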
## Operator Migration Guide
Agents registered with the old "unknown-hostname" fallback can be identified by a `machine_id` in the DB that is not 64 hex characters. They will be locked out after this upgrade, because the new runtime client sends a proper SHA256 hash that no longer matches the stored value. To find affected agents, run:
```sql
SELECT id, hostname, machine_id FROM agents
WHERE LENGTH(machine_id) != 64 OR machine_id LIKE 'unknown-%';
```
For each agent found, use the rebind endpoint:
```
POST /api/v1/admin/agents/{id}/rebind-machine-id
{"new_machine_id": "<64-char hex string from agent>"}
```
Alternatively, re-register the agent with a new registration token.
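For operators who prefer to script the rebind call, the request could be built roughly like this. It is a sketch under assumptions: `buildRebindRequest` is a hypothetical helper, the base URL is a placeholder, and passing the admin credential as a Bearer token is a guess, since this document does not say how `WebAuthMiddleware` expects credentials.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// rebindBody mirrors the JSON input shown above.
type rebindBody struct {
	NewMachineID string `json:"new_machine_id"`
}

// buildRebindRequest prepares (but does not send) the rebind request.
// The Authorization header format is an assumption.
func buildRebindRequest(baseURL, agentID, newMachineID, token string) (*http.Request, error) {
	payload, err := json.Marshal(rebindBody{NewMachineID: newMachineID})
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/api/v1/admin/agents/" + url.PathEscape(agentID) + "/rebind-machine-id"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+token) // assumed auth scheme
	return req, nil
}

func main() {
	// Placeholder values; the real machine ID comes from the agent.
	newID := strings.Repeat("ab", 32)
	req, err := buildRebindRequest("http://localhost:8080", "agent-1", newID, "admin-token")
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```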
## Rebind Endpoint Specification
- **Route:** `POST /api/v1/admin/agents/:id/rebind-machine-id`
- **Auth:** WebAuthMiddleware + RequireAdmin (admin group)
- **Input:** `{"new_machine_id": "64-char-lowercase-hex-string"}`
- **Validation:** exactly 64 chars, lowercase hex only (`[0-9a-f]`)
- **Audit log:** old and new machine ID logged with admin user ID
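The validation rule above can be expressed as a small check. This is an illustrative sketch; `isValidMachineID` is a hypothetical name and not necessarily how the handler in `server/internal/api/handlers/agents.go` implements it.

```go
package main

import (
	"fmt"
	"strings"
)

// isValidMachineID reports whether id is exactly 64 characters long
// and contains only lowercase hex digits [0-9a-f].
func isValidMachineID(id string) bool {
	if len(id) != 64 {
		return false
	}
	for i := 0; i < len(id); i++ {
		c := id[i]
		isDigit := c >= '0' && c <= '9'
		isLowerHex := c >= 'a' && c <= 'f'
		if !isDigit && !isLowerHex {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidMachineID(strings.Repeat("a", 64))) // true
	fmt.Println(isValidMachineID("unknown-host01"))        // false: wrong length
	fmt.Println(isValidMachineID(strings.Repeat("A", 64))) // false: uppercase
}
```

The same check also separates legacy "unknown-" identifiers from hashed ones, since a legacy value contains characters outside `[0-9a-f]` and is normally not 64 characters long.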