Future Enhancements & Considerations
Critical Testing Issues
Windows Agent Update Persistence Bug
Status: Needs Investigation
Problem: Microsoft Defender security updates reappear in scan results after installation
- Updates are marked as installed but show up again in later scans
- Possible Windows Update state caching issue
- May be related to Windows Update Agent refresh timing
Investigation Needed:
- Verify update installation actually completes on Windows side
- Check Windows Update API state after installation
- Compare package state in database vs Windows registry
- Test with different update types (Defender vs other updates)
- May need to force WUA refresh after installation
Priority: High - affects Windows agent reliability
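One way to start the database-vs-scan comparison is a small diff helper. This is only a sketch: the function name and the data shapes (a set of installed update IDs and a dict of scan results keyed by update ID) are assumptions, not the project's actual schema.

```python
from collections import defaultdict


def find_reappearing_updates(installed_in_db, scan_results):
    """Group updates the database marks as installed but a later scan still reports.

    installed_in_db: set of update IDs recorded as installed (hypothetical shape).
    scan_results: dict of update ID -> category from the latest scan (hypothetical shape).
    """
    reappearing = defaultdict(list)
    for update_id, category in scan_results.items():
        if update_id in installed_in_db:
            reappearing[category].append(update_id)
    return {category: sorted(ids) for category, ids in reappearing.items()}


# Placeholder identifiers, not real KB numbers.
installed = {"update-1", "update-2", "update-3"}
latest_scan = {"update-2": "defender", "update-3": "defender", "update-9": "cumulative"}
print(find_reappearing_updates(installed, latest_scan))
```

Grouping by category makes it easy to see whether only Defender updates reappear or whether other update types are affected too.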
Immediate Priority - Real-Time Operations
Intelligent Heartbeat System Enhancement
Current State:
- Manual heartbeat toggle (pink icon when active)
- User-initiated only
- Fixed duration options
Proposed Enhancement:
- Auto-trigger heartbeat on operations: Any command sent to agent triggers heartbeat automatically
- Color coding:
  - Blue: System-initiated heartbeat (scan, install, etc)
  - Pink: User-initiated manual heartbeat
- Lifecycle management: Heartbeat auto-ends when operation completes
- Smart detection: Don't spam heartbeat commands if already active
Implementation Strategy:
- Phase 1: Scan operations auto-trigger heartbeat
- Phase 2: Install/approve operations auto-trigger heartbeat
- Phase 3: Any agent command auto-triggers appropriate heartbeat duration
- Phase 4: Heartbeat duration scales with operation type (30s scan vs 10m install)
User Experience:
- User clicks "Scan Now" → blue heartbeat activates → scan completes → heartbeat stops
- User clicks "Install" → blue heartbeat activates → install completes → heartbeat stops
- User manually triggers heartbeat → pink icon → user controls duration
Priority: High - improves responsiveness without manual intervention
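The auto-trigger, smart-detection, and lifecycle rules could be sketched roughly as follows. The class and method names are hypothetical, the fallback duration is an assumption, and only the 30s/10m values come from the phases listed in this section.

```python
import time

# Durations in seconds. Scan and install values follow the notes; the default is an assumption.
DURATIONS = {"scan": 30, "install": 600}
DEFAULT_DURATION = 60


class HeartbeatManager:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._active = {}  # agent_id -> (source, expires_at)

    def active_source(self, agent_id):
        """Return 'system' (blue), 'manual' (pink), or None if no heartbeat is active."""
        entry = self._active.get(agent_id)
        if entry is None or entry[1] <= self._clock():
            self._active.pop(agent_id, None)
            return None
        return entry[0]

    def start_manual(self, agent_id, duration):
        self._active[agent_id] = ("manual", self._clock() + duration)

    def on_command(self, agent_id, command_type):
        """Return True if a new heartbeat command should be sent to the agent."""
        if self.active_source(agent_id) is not None:
            return False  # smart detection: do not spam an already-active heartbeat
        duration = DURATIONS.get(command_type, DEFAULT_DURATION)
        self._active[agent_id] = ("system", self._clock() + duration)
        return True

    def on_complete(self, agent_id):
        """End system heartbeats when the operation finishes; manual ones stay user-controlled."""
        if self.active_source(agent_id) == "system":
            del self._active[agent_id]
```

Injecting the clock keeps the logic testable without real waiting. Whether a long install should extend a shorter heartbeat that is already running is left open here.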
Dashboard Visualization Enhancement:
- Live Commands Dashboard Widget: Aggregate view of all active operations
- Color coding extends to commands:
  - Pink badges: User-initiated commands (manual scan, manual install, etc)
  - Blue badges: System-orchestrated commands (auto-scan, auto-heartbeat, approved workflows)
- Fleet monitoring at a glance:
  - Visual breakdown: "X agents with blue (system) operations | Y agents with pink (manual) operations"
  - Quick filtering: "Show only system-orchestrated operations" vs "Show only user-initiated"
  - Live count: "Active system operations triggering heartbeats: 3"
- Agent list integration:
  - Small blue/pink indicator dots next to agent names
  - Sort/filter by active heartbeat status and source
- Dashboard stats showing heartbeat distribution across fleet
Use Case: MSP/homelab fleet monitoring - differentiate between automated orchestration (blue) and manual intervention (pink) at a glance. Helps identify which systems need attention vs which are running autonomously.
Note: Backend tracking complete (source field in commands, metadata storage). Frontend visualization deferred for post-V1.0.
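Since the source field is already tracked, the fleet breakdown is mostly an aggregation. A minimal sketch, assuming active commands arrive as (agent_id, source) pairs with source values "system" and "manual" (the actual stored values may differ):

```python
def fleet_breakdown(active_commands):
    """Count distinct agents per command source.

    active_commands: iterable of (agent_id, source) pairs (assumed shape).
    """
    agents_by_source = {"system": set(), "manual": set()}
    for agent_id, source in active_commands:
        agents_by_source.setdefault(source, set()).add(agent_id)
    return {source: len(agents) for source, agents in agents_by_source.items()}


def summary_line(counts):
    """Format the counts in the wording proposed for the dashboard widget."""
    return (
        f"{counts['system']} agents with blue (system) operations | "
        f"{counts['manual']} agents with pink (manual) operations"
    )


commands = [("agent-1", "system"), ("agent-1", "system"), ("agent-2", "manual")]
print(summary_line(fleet_breakdown(commands)))
```

Counting distinct agents rather than commands matches the "X agents with ..." wording; a per-command count would need a separate tally.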
Strategic Architecture Decisions
Update Management Philosophy - Pre-V1.0 Discussion Needed
Core Questions:
- Are we a mirror? Do we cache/store update packages locally?
- Are we a gatekeeper? Do we proxy updates through our server?
- Are we an orchestrator? Do we just coordinate direct agent→repo downloads?
Current Implementation: Orchestrator model
- Agents download directly from upstream repos
- Server coordinates approval/installation
- No package caching or storage
Alternative Models to Consider:
Model A: Package Proxy/Cache
- Server downloads and caches approved updates
- Agents pull from local server instead of internet
- Pros: Bandwidth savings, offline capability, version pinning
- Cons: Storage requirements, security responsibility, repo sync complexity
Model B: Approval Database
- Server stores approval decisions without packages
- Agents check "is package X approved?" before installing from upstream
- Pros: Lightweight, flexible, audit trail
- Cons: No offline capability, no bandwidth savings
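To make Model B concrete, here is a rough sketch of an approval store with an audit trail. All names are hypothetical. It treats anything not explicitly approved as "do not install", which is only one possible answer to the open question about what "disapprove" should mean.

```python
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    DISAPPROVED = "disapproved"
    PENDING = "pending"


class ApprovalStore:
    """Stores approval decisions only; packages still come from upstream repos."""

    def __init__(self):
        self._decisions = {}  # (package, version) -> Decision
        self.audit_log = []   # append-only list of (package, version, decision, actor)

    def record(self, package, version, decision, actor):
        self._decisions[(package, version)] = decision
        self.audit_log.append((package, version, decision.value, actor))

    def is_approved(self, package, version):
        """Answer the agent's 'is package X approved?' question before it installs."""
        decision = self._decisions.get((package, version), Decision.PENDING)
        return decision is Decision.APPROVED
```

Keying decisions by (package, version) means a new upstream version starts out pending again, which may or may not be the desired behavior.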
Model C: Hybrid Approach
- Critical updates: Cache locally (security patches)
- Regular updates: Direct from upstream
- User-configurable per update category
Windows Enforcement Challenge:
- Linux: Can control APT/DNF sources easily
- Windows: Windows Update has limited local control
- Winget: Can control sources
- Need unified approach that works cross-platform
Questions for V1.0:
- Do users want local update caching?
- Is bandwidth savings worth storage complexity?
- Should "disapprove" mean "block installation" or just "don't auto-install"?
- How do we handle Windows Update's limited control surface?
Decision Timeline: Before V1.0 - this affects database schema, agent architecture, storage requirements
High Priority - Security & Authentication
Cryptographically Signed Agent Binaries
Problem: Currently, agents can be copied between servers, duplicated, or spoofed. Rate limiting is IP-based, which doesn't prevent abuse at the agent level.
Proposed Solution:
- Server generates unique cryptographic signature when building/distributing agent binaries
- Each agent binary is bound to the specific server instance via:
  - SSH keys or X.509 certificates
  - Server's public/private key pair
  - Unique server identifier embedded in binary at build time
- Agent presents cryptographic proof of authenticity during registration and check-ins
- Server validates signature before accepting any agent communication
Benefits:
- Better Rate Limiting: Track and limit per-agent-binary instead of per-IP
  - Prevents multiple agents from same host sharing rate limit bucket
  - Each unique agent has its own quota
  - Detect and block duplicated/copied agents
- Prevents Cross-Server Agent Migration:
  - Agent built for Server A cannot register with Server B
  - Stops unauthorized agent redistribution
  - Ensures agents only communicate with their originating server
- Audit Trail:
  - Track which specific binary version is running where
  - Identify compromised or rogue agent binaries
  - Revoke specific agent signatures if needed
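Per-agent rate limiting could look something like the sketch below. It uses a token bucket, which is an assumption for illustration (the existing limiter's algorithm is not described here), and the agent key stands in for whatever verified identity signing would provide.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Refill based on elapsed time, then spend one token if available."""
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class PerAgentLimiter:
    """One bucket per agent key, so agents on the same host or IP do not share quota."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self._bucket_args = (capacity, refill_per_second, clock)
        self._buckets = {}

    def allow(self, agent_key):
        bucket = self._buckets.get(agent_key)
        if bucket is None:
            bucket = self._buckets[agent_key] = TokenBucket(*self._bucket_args)
        return bucket.allow()
```

Without signed binaries the agent key is easy to spoof, which is why this only becomes meaningful once the server can verify who is calling.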
Implementation Considerations:
- Use Ed25519 or RSA for signing (fast, secure)
- Embed server public key in agent binary at build time
- Store server private key securely (not in env file)
- Agent includes signature in Authorization header alongside token
- Server validates: signature + token + agent_id combo
- Migration path for existing unsigned agents
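The "signature + token + agent_id" check can be sketched with the standard library alone. Note the swap: this uses HMAC-SHA256 as a stand-in for Ed25519/RSA signatures, so it is symmetric and does not match the embedded-public-key design above. It only illustrates binding a proof to one server, one agent, and one token. All function names are hypothetical.

```python
import hashlib
import hmac


def issue_proof(server_key: bytes, server_id: str, agent_id: str, token: str) -> str:
    """Create a proof bound to this server, agent, and token.

    Stand-in for a real signature. Assumes the IDs and token contain no newlines.
    """
    message = "\n".join((server_id, agent_id, token)).encode()
    return hmac.new(server_key, message, hashlib.sha256).hexdigest()


def validate_checkin(server_key: bytes, server_id: str, agent_id: str, token: str, proof: str) -> bool:
    """Recompute the proof and compare in constant time."""
    expected = issue_proof(server_key, server_id, agent_id, token)
    return hmac.compare_digest(expected, proof)


key_a = b"a" * 32  # placeholder keys, not how real keys should be generated
key_b = b"b" * 32
proof = issue_proof(key_a, "server-a", "agent-1", "token-123")
print(validate_checkin(key_a, "server-a", "agent-1", "token-123", proof))
print(validate_checkin(key_b, "server-b", "agent-1", "token-123", proof))
```

A proof issued by Server A fails on Server B because the key and server identifier differ, which is the cross-server migration property described above. A real implementation would verify an asymmetric signature instead.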
Timeline: Sooner than initially thought - foundational security improvement
Medium Priority - UI/UX Improvements
Rate Limit Settings UI
Current State: API endpoints exist, UI skeleton present but non-functional
Needed:
- Display current rate limit values for all endpoint types
- Live editing of limits with validation
- Show current usage/remaining per limit type
- Reset to defaults button
- Preview impact before applying changes
- Warning when setting limits too low
Location: Settings page → Rate Limits section
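The validation and low-limit warning might be as simple as the sketch below. The threshold value and the per-minute unit are assumptions, and the logic is shown in Python for illustration even though it would likely live in the frontend or the API layer.

```python
# Hypothetical threshold below which the UI would warn; the real value is undecided.
LOW_LIMIT_WARNING = 10


def validate_rate_limit(name, requests_per_minute):
    """Return (errors, warnings) for a proposed limit on one endpoint type."""
    errors, warnings = [], []
    if isinstance(requests_per_minute, bool) or not isinstance(requests_per_minute, int):
        errors.append(f"{name}: limit must be a whole number")
    elif requests_per_minute <= 0:
        errors.append(f"{name}: limit must be greater than zero")
    elif requests_per_minute < LOW_LIMIT_WARNING:
        warnings.append(f"{name}: {requests_per_minute}/min may block normal agent check-ins")
    return errors, warnings
```

Separating errors from warnings lets the UI block invalid values while still allowing a deliberately low limit after confirmation.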
Server Status/Splash During Operations
Current State: Dashboard shows "Failed to load" during server restarts/maintenance
Needed:
- Detect when server is unreachable vs actual error
- Show friendly "Server restarting..." splash instead of error
- Maybe animated spinner or progress indicator
- Different states:
  - Server starting up
  - Server restarting (config change)
  - Server maintenance
  - Actual error (needs user action)
Possible Implementation:
- SetupCompletionChecker could handle this (already polling /health)
- Add status overlay component
- Detect specific error types (network vs 500 vs 401)
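The error-type detection could be reduced to a small classifier over the /health poll result. This is a sketch of the decision logic only, written in Python for illustration. The state names, the messages other than "Server restarting...", and the use of 503 to signal maintenance are assumptions.

```python
SPLASH_MESSAGES = {
    "unreachable": "Server restarting...",
    "maintenance": "Server maintenance in progress",
    "auth": "Session expired, please sign in again",
    "error": "Something went wrong and needs attention",
}


def classify_health(status_code=None, network_error=False):
    """Map a /health poll result to a dashboard state."""
    if network_error or status_code is None:
        return "unreachable"  # no HTTP response at all: likely starting or restarting
    if 200 <= status_code < 300:
        return "ok"
    if status_code == 401:
        return "auth"
    if status_code == 503:
        return "maintenance"  # assumption: the server returns 503 during maintenance
    return "error"


state = classify_health(network_error=True)
print(SPLASH_MESSAGES.get(state, ""))
```

Distinguishing "starting up" from "restarting" would need extra information, such as whether the dashboard had a successful poll earlier in the session.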
Dashboard Statistics Loading State
Current: Hard error when stats unavailable
Better:
- Skeleton loaders for stat cards
- Graceful degradation if some stats fail
- Retry button for failed stat fetches
- Cache last-known-good values briefly
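A brief last-known-good cache might work roughly like this. The class name and the (value, is_stale) return shape are hypothetical; the stale flag is there so the UI can mark a card as outdated and offer the retry button.

```python
import time


class LastKnownGood:
    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self._cache = {}  # stat name -> (value, stored_at)

    def get(self, name, fetch):
        """Return (value, is_stale). Re-raises the fetch error if no recent copy exists."""
        now = self._clock()
        try:
            value = fetch()
        except Exception:
            cached = self._cache.get(name)
            if cached is not None and now - cached[1] <= self.max_age_seconds:
                return cached[0], True
            raise
        self._cache[name] = (value, now)
        return value, False
```

Because each stat is cached under its own name, one failing stat can degrade on its own while the others keep loading normally.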
Lower Priority - Feature Enhancements
Agent Auto-Update System
Currently agents must be manually updated. Need:
- Server-initiated agent updates
- Rollback capability
- Staged rollouts (canary deployments)
- Version compatibility checks
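For staged rollouts, one common approach is to hash each agent ID into a stable bucket and update only agents below the current rollout percentage. This is a sketch of that idea, not a decided design; the function names are hypothetical.

```python
import hashlib


def rollout_bucket(agent_id: str) -> int:
    """Map an agent ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(agent_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100


def should_update(agent_id: str, rollout_percent: int) -> bool:
    """True if this agent falls inside the current rollout stage."""
    return rollout_bucket(agent_id) < rollout_percent


agents = [f"agent-{n}" for n in range(20)]
canary = [a for a in agents if should_update(a, 10)]
print(canary)
```

Because the bucket is derived from the ID, raising the percentage only ever adds agents, so the canary group stays inside every later stage.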
Proxmox Integration
Planned feature for managing VMs/containers:
- Detect Proxmox hosts
- List VMs and containers
- Trigger updates at VM/container level
- Separate update categories for host vs guests
Mobile-Responsive Dashboard
Works but not optimized:
- Better mobile nav (hamburger menu)
- Touch-friendly buttons
- Responsive tables (card view on mobile)
- PWA support for installing as app
Notification System
- Email alerts for failed updates
- Webhook integration (Discord, Slack, etc)
- Configurable notification rules
- Quiet hours / alert throttling
Scheduled Update Windows
- Define maintenance windows per agent
- Auto-approve updates during windows
- Block updates outside windows
- Timezone-aware scheduling
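The timezone-aware window check is small enough to sketch. The function name and parameters are hypothetical. The timezone is passed as a tzinfo object, so a zoneinfo.ZoneInfo instance could be used for real zones with daylight saving rules; the usage line uses a fixed offset to stay self-contained.

```python
from datetime import datetime, time, timedelta, timezone


def in_maintenance_window(now_utc, start, end, agent_tz):
    """Check whether now_utc falls inside the agent's local maintenance window.

    now_utc: timezone-aware datetime.
    start, end: local wall-clock times; end is exclusive.
    agent_tz: tzinfo for the agent's location.
    """
    local = now_utc.astimezone(agent_tz).time()
    if start <= end:
        return start <= local < end
    return local >= start or local < end  # window crosses midnight


tz = timezone(timedelta(hours=-5))  # fixed offset used only for this example
now = datetime(2025, 1, 1, 7, 30, tzinfo=timezone.utc)
print(in_maintenance_window(now, time(2, 0), time(4, 0), tz))
```

Auto-approve and block decisions could both hang off this one predicate, with per-agent windows stored alongside the agent's timezone.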
Technical Debt
Configuration Management
Current: Settings scattered between database, .env file, and hardcoded defaults
Better:
- Unified settings table in database
- Web UI for all configuration
- Import/export settings
- Settings version history
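During a migration to a unified settings table, a layered lookup could keep the old sources working. The precedence used here (database, then environment, then defaults) and the environment variable naming are assumptions. The 60-second default mirrors the hardcoded 1 minute shutdown delay mentioned under Code Organization.

```python
import os

# Hardcoded fallbacks; values are kept as strings to match environment variables.
DEFAULTS = {"shutdown_delay_seconds": "60"}


def resolve_setting(key, db_settings, environ=None, defaults=None):
    """Return (value, source), checking the database, then the environment, then defaults."""
    environ = os.environ if environ is None else environ
    defaults = DEFAULTS if defaults is None else defaults
    if key in db_settings:
        return db_settings[key], "database"
    env_name = key.upper()  # assumed naming convention for the .env file
    if env_name in environ:
        return environ[env_name], "env"
    if key in defaults:
        return defaults[key], "default"
    raise KeyError(key)


print(resolve_setting("shutdown_delay_seconds", db_settings={}, environ={}))
```

Returning the source alongside the value would let a settings UI show where each value currently comes from, which helps while settings are still scattered.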
Testing Coverage
- Add integration tests for rate limiter
- Test agent registration flow end-to-end
- UI component tests for critical paths
- Load testing for concurrent agents
Documentation
- API reference needs expansion
- Agent installation guide for edge cases
- Troubleshooting guide
- Architecture diagrams
Code Organization
- Rate limiter settings should be database-backed (currently in-memory only)
- Agent timeout values hardcoded (need to be configurable)
- Shutdown delay hardcoded at 1 minute (user-adjustable needed)
Notes & Philosophy
- Less is more: No enterprise BS, keep it simple
- FOSS mentality: All software has bugs, best effort approach
- Homelab-first: Build for real use cases, not investor pitches
- Honest about limitations: Document what doesn't work
- Community-driven: Users know their needs best
Implementation Priority Order
- Cryptographic agent signing - Security foundation, enables better rate limiting
- Rate limit UI completion - Already have API, just need frontend
- Server status splash - UX improvement, quick win
- Settings management refactor - Enables other features
- Auto-update system - Major feature, needs careful design
- Everything else - As time permits
Last updated: 2025-10-31