Redflag/docs/historical/IMPLEMENTATION_COMPLETE.md

Heartbeat Fix - Implementation Complete

Summary

Fixed the heartbeat UI refresh issue by implementing smart polling with a recentlyTriggered state.

What Was Fixed

Problem

When users clicked "Enable Heartbeat", the UI showed "Sending..." but never updated to show the heartbeat badge. Users had to manually refresh the page to see changes.

Root Cause

The polling interval was 2 minutes when heartbeat was inactive. After clicking the button, users had to wait up to 2 minutes for the next poll to see the agent's response.

Solution Implemented

1. useHeartbeat.ts - Added Smart Polling

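import { useEffect, useState } from 'react';
// Assumes TanStack Query v4, where the refetchInterval callback receives the data first
import { useQuery } from '@tanstack/react-query';

export const useHeartbeatStatus = (agentId: string, enabled: boolean = true) => {
  const [recentlyTriggered, setRecentlyTriggered] = useState(false);

  const query = useQuery({
    queryKey: ['heartbeat', agentId],
    // queryFn and remaining options omitted in this excerpt
    refetchInterval: (data) => {
      // Fast polling (5s) while waiting for the agent's response
      if (recentlyTriggered) return 5000;

      // Medium polling (10s) when heartbeat is active
      if (data?.active) return 10000;

      // Slow polling (2min) when idle
      return 120000;
    },
  });

  // Auto-clear the flag once the agent confirms. Done in an effect so that
  // state is not updated in the middle of a render.
  const active = query.data?.active;
  useEffect(() => {
    if (recentlyTriggered && active) setRecentlyTriggered(false);
  }, [recentlyTriggered, active]);

  return { ...query, recentlyTriggered, setRecentlyTriggered };
};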

2. Agents.tsx - Trigger Fast Polling on Button Click

const { data: heartbeatStatus, recentlyTriggered, setRecentlyTriggered } = useHeartbeatStatus(...);

const handleRapidPollingToggle = async (agentId: string, enabled: boolean) => {
  // ... API call ...

  // Poll every 5 seconds for the next 15 seconds
  setRecentlyTriggered(true);
  setTimeout(() => setRecentlyTriggered(false), 15000);
};
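
One caveat with the handler above: every click schedules its own setTimeout, so a second click inside the 15-second window leaves the first timer running, and that timer can clear the flag early. A minimal sketch of a restartable window is shown below. The names createFastPollWindow and TimerApi are hypothetical and not part of the RedFlag code; the injectable timer object exists only so the helper can be exercised without waiting on real timers.

```typescript
// Hypothetical helper, not part of the RedFlag codebase.
interface TimerApi {
  set: (fn: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
}

// Default implementation backed by the real timer functions.
const realTimers: TimerApi = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function createFastPollWindow(
  setRecentlyTriggered: (value: boolean) => void,
  durationMs: number = 15000,
  timers: TimerApi = realTimers,
) {
  let pending: unknown;

  return {
    // Start the fast-polling window, or restart it if one is already running.
    trigger(): void {
      if (pending !== undefined) timers.clear(pending); // drop the stale timer
      setRecentlyTriggered(true);
      pending = timers.set(() => {
        pending = undefined;
        setRecentlyTriggered(false);
      }, durationMs);
    },
    // Cancel the pending timer, e.g. when the component unmounts.
    cancel(): void {
      if (pending !== undefined) timers.clear(pending);
      pending = undefined;
    },
  };
}
```

In the click handler, calling trigger() on such an object would take the place of the setRecentlyTriggered(true) and setTimeout pair.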

How It Works Now

  1. User clicks "Enable Heartbeat"

    • Button shows "Sending..."
    • recentlyTriggered set to true
    • Polling interval drops from 2 minutes to 5 seconds
  2. Agent processes command (2-3 seconds)

    • Agent receives command
    • Agent enables rapid polling
    • Agent sends immediate check-in with heartbeat metadata
  3. Next poll catches update (within 5 seconds)

    • Polling every 5 seconds catches agent's response
    • UI updates to show RED/BLUE badge
    • recentlyTriggered auto-clears when active=true
  4. Total wait time: 5-8 seconds, instead of up to 2 minutes
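
The timing in the steps above can be approximated with a small calculation. This is a simplified sketch: msUntilBadgeVisible is a hypothetical name, and it assumes the first fast poll fires one full interval after the click. Network latency and render time are ignored, so real waits can be somewhat longer.

```typescript
// Sketch only. Assumes polls fire at pollIntervalMs, 2 * pollIntervalMs, ...
// after the click, and that a poll landing exactly when the agent reports
// already sees the new state.
function msUntilBadgeVisible(agentReportMs: number, pollIntervalMs: number = 5000): number {
  if (agentReportMs < 0 || pollIntervalMs <= 0) {
    throw new RangeError("agentReportMs must be >= 0 and pollIntervalMs must be > 0");
  }
  // The first poll at or after the agent's report is the one that catches it.
  const pollsNeeded = Math.max(1, Math.ceil(agentReportMs / pollIntervalMs));
  return pollsNeeded * pollIntervalMs;
}

const typicalWait = msUntilBadgeVisible(2500); // agent reports after 2.5 s
const slowAgentWait = msUntilBadgeVisible(6000); // agent reports after 6 s
```

Under these assumptions an agent that reports within 2-3 seconds is caught by the first poll at 5000 ms, while one that needs 6 seconds is caught by the second poll at 10000 ms, still inside the 15-second fast window.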

Files Modified

  1. /aggregator-web/src/hooks/useHeartbeat.ts - Added recentlyTriggered state and smart polling logic
  2. /aggregator-web/src/pages/Agents.tsx - Updated to use new hook API and trigger fast polling

Performance Impact

  • When idle: 1 API call per 2 minutes (about 96% fewer calls than the original 5-second polling, which made 24 calls in the same period)
  • After button click: 1 API call per 5 seconds for 15 seconds
  • During active heartbeat: 1 API call per 10 seconds
  • Window focus: Instant refresh (refetchOnWindowFocus: true)
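
To make the request rates above concrete, the sketch below converts each polling interval into calls per hour and counts the calls in one fast-polling window. The constant and function names are illustrative only, and the window count assumes the first poll fires one interval after the click.

```typescript
// Intervals taken from the list above; the names are illustrative.
const POLL_INTERVALS_MS = { fast: 5000, active: 10000, idle: 120000 };

// Steady-state API calls per hour at a fixed polling interval.
function callsPerHour(intervalMs: number): number {
  return 3600000 / intervalMs;
}

// Calls made during one fast-polling window after a button click.
function callsInFastWindow(
  windowMs: number = 15000,
  intervalMs: number = POLL_INTERVALS_MS.fast,
): number {
  return Math.floor(windowMs / intervalMs);
}

const idlePerHour = callsPerHour(POLL_INTERVALS_MS.idle); // 30
const activePerHour = callsPerHour(POLL_INTERVALS_MS.active); // 360
const burstCalls = callsInFastWindow(); // 3
```

A button click therefore costs about three extra requests, which is small next to the 30 requests per hour an idle agent page already makes.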

Testing Checklist

  • Click "Enable Heartbeat" - badge appears within 5-8 seconds
  • Badge shows RED for manual heartbeat
  • Badge shows BLUE for system heartbeat (trigger DNF update)
  • Switch tabs and return - state refreshes correctly
  • No manual page refresh needed
  • Polling slows down after 15 seconds

Additional Notes

  • The fix respects the agent as the source of truth (no optimistic UI updates)
  • Server doesn't need to report "success" before agent confirms
  • Polling every 5 seconds during the 15-second window gives the agent time to report (typically 2-3 seconds)
  • After 15 seconds, polling returns to normal speed (2 minutes when idle)

History vs Agents Overview - Unified Command Display

Current State:

  • History page (aggregator-web/src/pages/History.tsx): Full timeline, all agents, detailed with logs
  • Agents Overview tab (aggregator-web/src/pages/Agents.tsx:590-750): Compact view, single agent, max 3-4 entries

Problems Identified:

  1. Display inconsistency: Same command type shows differently in History vs Overview
  2. Hard-coded mappings: Each page has its own command type → display name logic
  3. No shared utilities: "scan_storage" displays as "Storage Scan" in one place, "scan storage" in another

Recommendation: Create shared command display utilities

File: aggregator-web/src/lib/command-display.ts (NEW - 1 hour)

export interface CommandDisplay {
  action: string;
  verb: string;
  noun: string;
  icon: string;
}

export const getCommandDisplay = (commandType: string): CommandDisplay => {
  // Typed as a Record so arbitrary command strings can be used as lookup keys
  const map: Record<string, CommandDisplay> = {
    'scan_storage': { action: 'Storage Scan', verb: 'Scan', noun: 'Disk', icon: 'HardDrive' },
    'scan_system': { action: 'System Scan', verb: 'Scan', noun: 'Metrics', icon: 'Cpu' },
    'scan_docker': { action: 'Docker Scan', verb: 'Scan', noun: 'Images', icon: 'Container' },
    // ... all platform-specific scans
  };
  // Unknown command types fall back to a generic entry showing the raw type
  return map[commandType] ?? { action: commandType, verb: 'Operation', noun: 'Unknown', icon: 'Activity' };
};

Why: Single source of truth, both pages use same mappings

Command Display Consolidation

Current Command Display Locations:

  1. History page: Full timeline with logs, syntax highlighting, pagination
  2. Agents Overview: Compact list (3-4 entries), agent-specific, real-time
  3. Updates page: Recent commands (50 limit), all agents

Are they too similar?

  • Similar: All show command_type, status, timestamp, icons
  • Different: History shows full logs, Overview is compact, Updates has retry feature

Architectural Decision: PARTIAL CONSOLIDATION (not full)

Recommended:

  1. Extract shared display logic (1 hour)
    • Same command → same name, icon, color everywhere
  2. Keep specialized components (don't over-engineer)
    • History = full timeline with all features
    • Overview = compact window (3-4 entries max)
    • Updates = full list with retry

What NOT to do: Don't create abstract "CommandComponent" that tries to be all three (over-engineering)

What TO do: Extract utility functions into shared lib, keep components focused on their job
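
A rough sketch of what partial consolidation could look like: one shared lookup, with each page keeping its own thin formatter. The lookup here is an abbreviated stand-in for the proposed command-display.ts, and formatHistoryRow, formatOverviewRows, the status string and the agent name are all hypothetical, chosen only for illustration.

```typescript
// Abbreviated stand-in for the shared utility; only two entries shown.
interface CommandDisplay {
  action: string;
  icon: string;
}

const COMMAND_DISPLAY: Record<string, CommandDisplay> = {
  scan_storage: { action: 'Storage Scan', icon: 'HardDrive' },
  scan_docker: { action: 'Docker Scan', icon: 'Container' },
};

function getCommandDisplay(commandType: string): CommandDisplay {
  // Unknown types fall back to the raw command type.
  return COMMAND_DISPLAY[commandType] ?? { action: commandType, icon: 'Activity' };
}

// History-style formatter: one detailed line per command (hypothetical).
function formatHistoryRow(commandType: string, status: string, agentName: string): string {
  const { action } = getCommandDisplay(commandType);
  return `${action} on ${agentName} (${status})`;
}

// Overview-style formatter: compact labels, capped at a few entries (hypothetical).
function formatOverviewRows(commandTypes: string[], maxEntries: number = 4): string[] {
  return commandTypes.slice(0, maxEntries).map((type) => getCommandDisplay(type).action);
}
```

Both formatters resolve names through the same lookup, so "scan_storage" reads as "Storage Scan" on every page, while layout decisions such as the entry cap stay local to each page.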

Technical Debt: Too Many TODO Files

Current State: 30+ MD files created in 3 days; most have TODO sections

Violation: ETHOS Section 5 - "NEVER use banned words..." and Section 1 - "Errors are History"

Problem: Files that won't be completed = documentation debt

Why this happens:

  1. We create files during planning (good intention)
  2. Code changes faster than docs get updated (reality)
  3. Docs become out-of-sync (technical debt)

Solution:

  • Stop creating new MD files with TODOs
  • Put implementation details in JSDoc above functions
  • Completed features get a brief "# Completed" section in main README
  • Unfinished work stays in git branch until done
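
As an illustration of keeping implementation details in JSDoc, the polling rules from the heartbeat fix could live directly above a small function. pickHeartbeatPollInterval is a hypothetical name; the intervals are the ones described earlier in this document.

```typescript
/**
 * Picks the heartbeat status polling interval in milliseconds.
 *
 * - 5000 while waiting for the agent to confirm a just-sent command
 * - 10000 while the heartbeat is active
 * - 120000 when idle
 *
 * @param recentlyTriggered true during the short window after the user clicks the toggle
 * @param active whether the agent currently reports an active heartbeat
 * @returns the interval to hand to the query's refetchInterval option
 */
function pickHeartbeatPollInterval(recentlyTriggered: boolean, active: boolean | undefined): number {
  if (recentlyTriggered) return 5000;
  if (active) return 10000;
  return 120000;
}
```

Because the comment sits next to the code it describes, a change to the intervals and a change to their documentation land in the same diff.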

Recommendation: No new MD files unless feature is 100% complete and merged