feat: add resilience and reliability features for agent subsystems
Added circuit breakers with configurable timeouts for all subsystems (APT, DNF, Docker, Windows, Winget, Storage). Replaces cron-based scheduler with priority queue that should scale beyond 1000+ agents if your homelab is that big. Command acknowledgment system ensures results aren't lost on network failures or restarts. Agent tracks pending acknowledgments with persistent state and automatic retry. - Circuit breakers: 3 failures in 1min opens circuit, 30s cooldown - Per-subsystem timeouts: 30s-10min depending on scanner - Priority queue scheduler: O(log n), worker pool, jitter, backpressure - Acknowledgments: at-least-once delivery, max 10 retries over 24h - All tests passing (26/26)
This commit is contained in:
11
README.md
11
README.md
@@ -7,6 +7,7 @@
|
||||
**Self-hosted update management for homelabs**
|
||||
|
||||
Cross-platform agents • Web dashboard • Single binary deployment • No enterprise BS
|
||||
No MacOS yet - need real hardware, not hackintosh hopes and prayers
|
||||
|
||||
```
|
||||
v0.1.18 - Alpha Release
|
||||
@@ -50,13 +51,13 @@ RedFlag lets you manage software updates across all your servers from one dashbo
|
||||
|------------------|---------------------|---------------|
|
||||
|  |  |  |
|
||||
|
||||
| Linux Update History | Windows Agent Details | Agent List |
|
||||
| Linux Update Details | Linux Health Details | Agent List |
|
||||
|---------------------|----------------------|------------|
|
||||
|  |  |  |
|
||||
|  |  |  |
|
||||
|
||||
| Windows Update History |
|
||||
|------------------------|
|
||||
|  |
|
||||
| Linux Update History | Windows Agent Details | Windows Update History |
|
||||
|---------------------|----------------------|------------------------|
|
||||
|  |  |  |
|
||||
|
||||
</details>
|
||||
|
||||
|
||||
Reference in New Issue
Block a user