Redflag/docs/D1_Fix_Implementation.md
jpetree331 db67049e92 fix(identity): D-1 machine ID deduplication fixes
- Remove unhashed 'unknown-' fallback from registration (F-D1-1)
  Registration aborts if GetMachineID() fails (no bad data)
- Add POST /admin/agents/:id/rebind-machine-id endpoint (F-D1-2)
  Admin can update stored machine ID after hardware change
- Delete dead example_integration.go with wrong usage (F-D1-3)
- Remove redundant Windows machineid.ID() retry (F-D1-4)
- Replace fmt.Printf with log.Printf in client.go (F-D1-5)

Operator note: agents registered with 'unknown-' machine IDs
must be rebound before upgrading. See D1_Fix_Implementation.md.

All tests pass. No regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 09:53:43 -04:00


D-1 Machine ID Fix Implementation

Date: 2026-03-29
Branch: culurien


Files Changed

| File | Change |
|---|---|
| cmd/agent/main.go | Removed unhashed "unknown-" fallback; registration aborts if GetMachineID() fails (F-D1-1) |
| internal/client/client.go | Replaced fmt.Printf with log.Printf for machine ID errors (F-D1-5) |
| internal/system/machine_id.go | Removed redundant machineid.ID() retry in Windows fallback; added Windows reinstall documentation (F-D1-4) |
| internal/logging/example_integration.go | DELETED: dead code with incorrect machineid.ID() usage (F-D1-3) |
| server/internal/api/handlers/agents.go | Added RebindMachineID admin endpoint (F-D1-2) |
| server/internal/database/queries/agents.go | Added UpdateMachineID query function (F-D1-2) |
| server/cmd/server/main.go | Registered rebind-machine-id admin route (F-D1-2) |

Strategy (Task 1): Option C

Used Option C: trust the canonical system.GetMachineID() entirely. If it fails, which requires every fallback to fail, including the final hostname-os-arch fallback, registration aborts with log.Fatalf. This is the safest approach because the internal fallback chain in GetMachineID() always produces a SHA256 hash, so every machine ID sent at registration has the same 64-character hex format.
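The abort-on-failure behaviour can be illustrated with a minimal sketch. It is not the code in cmd/agent/main.go or internal/system/machine_id.go: fallbackMachineID and hashID are hypothetical names, and the exact string that the real hostname-os-arch fallback hashes is an assumption. The point is only that a hashed fallback always yields 64 hex characters, and that the caller stops instead of inventing an unhashed ID.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"log"
	"os"
	"runtime"
)

// hashID returns the lowercase hex SHA-256 digest of raw.
// A SHA-256 digest is 32 bytes, so the result is always 64 characters.
func hashID(raw string) string {
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// fallbackMachineID is a hypothetical stand-in for the last step of the
// fallback chain: it hashes "hostname-os-arch". The real input format used
// by system.GetMachineID() is an assumption here.
func fallbackMachineID(hostname string) (string, error) {
	if hostname == "" {
		return "", errors.New("no hostname available")
	}
	raw := fmt.Sprintf("%s-%s-%s", hostname, runtime.GOOS, runtime.GOARCH)
	return hashID(raw), nil
}

func main() {
	hostname, _ := os.Hostname()

	id, err := fallbackMachineID(hostname)
	if err != nil {
		// Mirrors the Option C decision: abort, do not register bad data.
		log.Fatalf("cannot determine machine ID, aborting registration: %v", err)
	}
	fmt.Println("machine ID:", id)
}
```

Returning an error from the helper and calling log.Fatalf only at the top level keeps the helper testable while preserving the abort semantics.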

Operator Migration Guide

If any agents were registered with the old "unknown-hostname" fallback (identifiable by machine_id not being 64 hex chars in the DB), they will be locked out after this upgrade because the new runtime client sends a proper SHA256 hash. To recover:

SELECT id, hostname, machine_id FROM agents
WHERE LENGTH(machine_id) != 64 OR machine_id LIKE 'unknown-%';

For each agent found, use the rebind endpoint:

POST /api/v1/admin/agents/{id}/rebind-machine-id
{"new_machine_id": "<64-char hex string from agent>"}

Or re-register the agent with a new registration token.
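For operators who need to rebind several agents, the request above could be built from a small Go helper. This is a sketch under stated assumptions: newRebindRequest, the base URL, and the agent ID shown are hypothetical, and authentication is left to the caller because this document does not describe what credentials WebAuthMiddleware expects.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// rebindRequest matches the JSON body documented for the rebind endpoint.
type rebindRequest struct {
	NewMachineID string `json:"new_machine_id"`
}

// newRebindRequest builds (but does not send) the POST request.
// baseURL is an assumption, e.g. "https://redflag.example.com".
// The caller must attach whatever admin credentials the server requires.
func newRebindRequest(baseURL, agentID, machineID string) (*http.Request, error) {
	body, err := json.Marshal(rebindRequest{NewMachineID: machineID})
	if err != nil {
		return nil, err
	}
	endpoint := fmt.Sprintf("%s/api/v1/admin/agents/%s/rebind-machine-id",
		baseURL, url.PathEscape(agentID))
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder values for illustration only.
	req, err := newRebindRequest(
		"https://redflag.example.com",
		"agent-123",
		"ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
	)
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request is then a matter of adding the admin credentials and passing it to an http.Client.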

Rebind Endpoint Specification

  • Route: POST /api/v1/admin/agents/:id/rebind-machine-id
  • Auth: WebAuthMiddleware + RequireAdmin (admin group)
  • Input: {"new_machine_id": "64-char-lowercase-hex-string"}
  • Validation: exactly 64 chars, lowercase hex only [0-9a-f]
  • Audit log: old and new machine ID logged with admin user ID
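The validation rule in the specification above (exactly 64 characters, lowercase hex only) can be expressed as a short check. This is an illustrative sketch, not the handler code in server/internal/api/handlers/agents.go; isValidMachineID is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// isValidMachineID reports whether id is exactly 64 characters long and
// contains only the characters 0-9 and a-f. Uppercase hex is rejected.
func isValidMachineID(id string) bool {
	if len(id) != 64 {
		return false
	}
	for i := 0; i < len(id); i++ {
		c := id[i]
		isDigit := c >= '0' && c <= '9'
		isLowerHex := c >= 'a' && c <= 'f'
		if !isDigit && !isLowerHex {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isValidMachineID(strings.Repeat("a", 64))) // valid length and alphabet
	fmt.Println(isValidMachineID(strings.Repeat("A", 64))) // uppercase is rejected
	fmt.Println(isValidMachineID("unknown-myhost"))        // legacy unhashed format
}
```

A byte-wise loop avoids compiling a regular expression and makes the lowercase-only requirement explicit.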