Files
Redflag/docs/B1_Fix_Implementation.md
jpetree331 ec0d880036 fix(database): B-1 schema integrity and migration fixes
- Fix migration 024 self-insert and bad column reference (F-B1-1, F-B1-2)
  Uses existing enabled/auto_run columns instead of non-existent deprecated
- Abort server on migration failure instead of warning (F-B1-11)
  main.go now calls log.Fatalf, prints [INFO] only on success
- Fix migration 018 scanner_config filename suffix (F-B1-3)
  Renumbered to 027 with .up.sql suffix
- Remove GRANT to non-existent role in scanner_config (F-B1-4)
- Resolve duplicate migration numbers 009 and 012 (F-B1-13)
  Renamed to 009b and 012b for unique lexical sorting
- Add IF NOT EXISTS to all non-idempotent migrations (F-B1-15)
  Fixed: 011, 012, 017, 023, 023a
- Replace N+1 dashboard stats loop with GetAllUpdateStats (F-B1-6)
  Single aggregate query replaces per-agent loop
- Add composite index on agent_commands(status, sent_at) (F-B1-5)
  New migration 028 with partial index for timeout service
- Add background refresh token cleanup goroutine (F-B1-10)
  24-hour ticker calls CleanupExpiredTokens
- ETHOS log format in migration runner (no emojis)

All 55 tests pass (41 server + 14 agent). No regressions.
See docs/B1_Fix_Implementation.md and DEV-025 through DEV-028.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 07:03:35 -04:00

55 lines
2.8 KiB
Markdown

# B-1 Database & Schema Integrity Fix Implementation
**Date:** 2026-03-29
**Branch:** culurien
---
## Files Changed
| File | Change |
|------|--------|
| `migrations/024_disable_updates_subsystem.up.sql` | Removed self-insert + bad column reference (F-B1-1, F-B1-2) |
| `migrations/024_disable_updates_subsystem.down.sql` | Updated to match fixed up migration |
| `cmd/server/main.go` | Migration failure now calls log.Fatalf (F-B1-11); background token cleanup added (F-B1-10) |
| `internal/database/db.go` | ETHOS log format in migration runner (no emojis) |
| `migrations/018_create_scanner_config_table.sql` | DELETED (F-B1-3) |
| `migrations/027_create_scanner_config_table.up.sql` | NEW — renumbered from 018, fixed suffix and removed GRANT (F-B1-3, F-B1-4) |
| `migrations/027_create_scanner_config_table.down.sql` | NEW — down migration |
| `migrations/009_add_retry_tracking.up.sql` | RENAMED to 009b (F-B1-13) |
| `migrations/012_create_admin_user.up.sql` | RENAMED to 012b (F-B1-13) |
| `migrations/011_create_registration_tokens_table.up.sql` | Added IF NOT EXISTS (F-B1-15) |
| `migrations/012_add_token_seats.up.sql` | Added IF NOT EXISTS (F-B1-15) |
| `migrations/017_add_machine_id.up.sql` | Added IF NOT EXISTS (F-B1-15) |
| `migrations/023_client_error_logging.up.sql` | Added IF NOT EXISTS (F-B1-15) |
| `migrations/023a_command_deduplication.up.sql` | Added IF NOT EXISTS (F-B1-15) |
| `migrations/028_add_stuck_commands_index.up.sql` | NEW — partial index for GetStuckCommands (F-B1-5) |
| `migrations/028_add_stuck_commands_index.down.sql` | NEW — down migration |
| `internal/api/handlers/stats.go` | Replaced N+1 loop with GetAllUpdateStats() (F-B1-6) |
## Migration 024 Fix (Option B)
Used existing `enabled` and `auto_run` columns instead of the non-existent `deprecated` column. The intent of the migration was to disable the legacy updates subsystem — `SET enabled = false, auto_run = false` achieves this using columns that exist in the schema since migration 015.
## Final Migration Sequence
```
001 → 003 → 004 → 005 → 006 → 007 → 008 → 009 → 009b → 010 →
011 → 012 → 012b → 013 → 014 → 015 → 016 → 017 → 018 → 019 →
020 → 021 → 022 → 023 → 023a → 024 → 025 → 026 → 027 → 028
```
No duplicate numbers. Monotonically increasing. All have .up.sql suffix.
## N+1 Fix
Replaced per-agent `GetUpdateStatsFromState(agent.ID)` loop (stats.go:64) with single call to `GetAllUpdateStats()` which aggregates across all agents in one query.
## Background Cleanup
24-hour ticker goroutine calls `refreshTokenQueries.CleanupExpiredTokens()`. Logs success with count and failure with error. No context cancellation (main.go doesn't use a server context — documented as DEV-025).
## Test Results
55 tests pass (41 server + 14 agent). Zero regressions.