Development Ethos
Philosophy: We are building honest, autonomous software for a community that values digital sovereignty. This isn't enterprise fluff; it's a "less is more" set of non-negotiable principles forged from experience. We ship bugs, but we are honest about them, and we log the failures.
The Core Ethos (Principles & Contracts)
These are the rules we've learned not to compromise on. They are the contract.
1. Errors are History, Not /dev/null
Principle: NEVER silence errors.
Rationale: A "laid back" admin is one who can sleep at night, knowing any failure will be in the logs. We don't use 2>/dev/null. We fix the root cause, not the symptom.
Implementation & Contract:
- All errors, from a script exit 1 to an API 500, MUST be captured and logged with context (what failed, why, what was attempted).
- All logs MUST follow the [TAG] [system] [component] format (e.g., [ERROR] [agent] [installer] Download failed...).
- The final destination for all auditable events (errors and state changes) is the history table, as defined in CODE_RULES.md.
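As a rough illustration of the contract above, the sketch below captures a command's stderr instead of discarding it and formats the failure with the [TAG] [system] [component] prefix. The helper names format_log and run_logged are hypothetical, and writing the resulting lines to the history table is left out because its schema lives in CODE_RULES.md.

```python
import subprocess
import sys


def format_log(tag: str, system: str, component: str, message: str) -> str:
    """Build a log line in the [TAG] [system] [component] format."""
    return f"[{tag.upper()}] [{system}] [{component}] {message}"


def run_logged(cmd: list[str], system: str, component: str) -> tuple[int, list[str]]:
    """Run cmd, keeping stderr as context instead of sending it to /dev/null.

    Returns the exit code and any log lines produced. Hypothetical helper:
    a real agent would hand these lines to its logger and the history table.
    """
    logs: list[str] = []
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # The command could not even be started (missing binary, permissions).
        logs.append(format_log("ERROR", system, component,
                               f"could not start {cmd!r}: {exc}"))
        return 127, logs
    if proc.returncode != 0:
        # What failed, why (stderr), and what was attempted (the command).
        logs.append(format_log("ERROR", system, component,
                               f"command {cmd!r} exited {proc.returncode}: "
                               f"{proc.stderr.strip()}"))
    return proc.returncode, logs


if __name__ == "__main__":
    code, lines = run_logged(
        [sys.executable, "-c", "import sys; sys.exit('Download failed')"],
        "agent", "installer")
    for line in lines:
        print(line)
```

The point of the design is that the failure path always produces a line with enough context to debug from the log alone.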
2. Security is Non-Negotiable
Principle: NEVER add unauthenticated endpoints.
Rationale: "Temporary" is permanent. Every single route MUST be protected by the established, multi-subsystem security architecture.
Implementation & Contract (The Stack):
- User Auth (WebUI): All admin dashboard routes MUST be protected by WebAuthMiddleware().
- Agent Registration: An agent can only be created using a valid registration_token via the /api/v1/agents/register endpoint. This token's validity (seats, expiration) MUST be checked against the registration_tokens table.
- Agent Check-in (Pull-Only): All agent-to-server communication (e.g., GET /agents/:id/commands) MUST be protected by AuthMiddleware(). This validates the agent's short-lived (24h) JWT access token.
- Agent Token Renewal: An agent MUST only renew its access token by presenting its long-lived (90-day sliding window) refresh_token to the /api/v1/agents/renew endpoint.
- Hardware Verification: All authenticated agent routes MUST also be protected by the MachineBindingMiddleware. This middleware MUST validate the X-Machine-ID header against the agents.machine_id column to prevent config-copying and impersonation.
- Update/Command Security: Sensitive commands (e.g., updates, reboots) MUST be protected by an Ed25519-signed nonce to prevent replay attacks. The agent MUST validate the nonce's signature and confirm that its timestamp is less than 5 minutes old before execution.
- Binary Security: The agent MUST verify the Ed25519 signature of any downloaded binary against the cached server public key (the TOFU model) before applying a self-update. This signature check MUST include the [tunturi_ed25519] watermark.
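The nonce check in the Update/Command Security item can be sketched roughly as follows. This is not the agent's actual code: the function name, the payload layout (nonce bytes followed by the integer timestamp), and the in-memory set of already-seen nonces are all assumptions made for illustration. The Ed25519 verification itself is passed in as a callable, since Python's standard library does not provide Ed25519.

```python
import time
from typing import Callable, Optional, Set

MAX_NONCE_AGE_SECONDS = 5 * 60  # contract: timestamp must be under 5 minutes old


def nonce_is_acceptable(
    nonce: bytes,
    signature: bytes,
    issued_at: float,
    verify: Callable[[bytes, bytes], bool],
    seen: Set[bytes],
    now: Optional[float] = None,
) -> bool:
    """Return True only if the nonce is fresh, unused, and correctly signed.

    verify(payload, signature) is a stand-in for an Ed25519 check against
    the cached server public key. The payload layout below is hypothetical.
    """
    now = time.time() if now is None else now
    age = now - issued_at
    # Reject stale nonces and nonces stamped in the future.
    if age < 0 or age >= MAX_NONCE_AGE_SECONDS:
        return False
    # Assumption: the agent remembers nonces it already executed, so a
    # captured command cannot be replayed inside the 5-minute window.
    if nonce in seen:
        return False
    payload = nonce + str(int(issued_at)).encode()
    if not verify(payload, signature):
        return False
    seen.add(nonce)
    return True
```

Checking the cheap conditions (age, reuse) before the signature keeps the order of rejection reasons easy to log, and a nonce is only recorded as seen once every check has passed.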
3. Assume Failure; Build for Resilience
Principle: NEVER assume an operation will succeed.
Rationale: Networks fail. Servers restart. Agents crash. The system must recover without manual intervention.
Implementation & Contract:
- Agent-Side (Network): Agent check-ins MUST use retry logic with exponential backoff to survive server 502s and other transient network failures. This is a critical bug-fix outlined in Agent_retry_resilience_architecture.md.
- Agent-Side (Scanners): Long-running or fragile scanners (like Windows Update or DNF) MUST be wrapped in a Circuit Breaker to prevent a single failing subsystem from blocking all others.
- Data Delivery: Command results MUST use the Command Acknowledgment System (pending_acks.json). This guarantees at-least-once delivery: if an agent restarts after executing a command but before the server confirms receipt, it re-sends its results the next time it starts.
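The first two items above can be sketched as follows. This is a minimal illustration, not the agent's implementation: every name, default value, and the added jitter are assumptions, and a real agent would retry only errors it knows to be transient rather than catching Exception.

```python
import random
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Exponential delays: base, 2*base, 4*base, ... each capped at cap."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]


def retry_with_backoff(op: Callable[[], T], attempts: int = 5, base: float = 1.0,
                       cap: float = 60.0, sleep: Callable[[float], None] = time.sleep,
                       log: Callable[[str], None] = print) -> T:
    """Call op until it succeeds or attempts run out, logging every failure."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_exc: Optional[Exception] = None
    for n, delay in enumerate(backoff_delays(attempts, base, cap), start=1):
        try:
            return op()
        except Exception as exc:  # sketch only: narrow this in real code
            last_exc = exc
            log(f"[ERROR] [agent] [checkin] attempt {n}/{attempts} failed: {exc}")
            if n < attempts:
                # Small random jitter so agents do not retry in lockstep.
                sleep(delay + random.uniform(0, delay * 0.1))
    assert last_exc is not None
    raise last_exc


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the breaker is open."""


class CircuitBreaker:
    """Stop calling a failing subsystem until a cool-down has passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, op: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = op()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

A scanner loop could wrap each scanner in its own CircuitBreaker so that, for example, a hanging DNF scan is skipped for the cool-down period while the other scanners keep reporting.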
4. Idempotency is a Requirement
Principle: NEVER forget idempotency.
Rationale: We (and our agents) will inevitably run the same command twice. The system must not break or create duplicate state.
Implementation & Contract:
- Install Scripts: MUST be idempotent. They MUST check whether the agent/service is already installed before trying to install again. This is a core feature.
- Command Design: Future commands should be designed for idempotency. The research on duplicate commands identifies this as the real fix, rather than simple de-duplication.
- Database Migrations: All schema changes MUST be idempotent (e.g., CREATE TABLE IF NOT EXISTS, ADD COLUMN IF NOT EXISTS, DROP FUNCTION IF EXISTS).
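The migration rule can be demonstrated with a small, self-contained sketch. It uses SQLite only because it ships with Python; SQLite has no ADD COLUMN IF NOT EXISTS, so the column check is done by hand here, whereas a database such as PostgreSQL supports that clause directly. The hostname column and the index name are hypothetical.

```python
import sqlite3


def migrate(conn: sqlite3.Connection) -> None:
    """Apply the schema; safe to run any number of times."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agents ("
        " id TEXT PRIMARY KEY,"
        " hostname TEXT NOT NULL)"
    )
    # SQLite workaround: look up existing columns before adding one.
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute("PRAGMA table_info(agents)")}
    if "machine_id" not in columns:
        conn.execute("ALTER TABLE agents ADD COLUMN machine_id TEXT")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_agents_machine_id ON agents (machine_id)"
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    for _ in range(3):  # running it 3x must leave the same schema
        migrate(conn)
    print([row[1] for row in conn.execute("PRAGMA table_info(agents)")])
```

Running the migration three times in a test is a cheap way to back the "can run 3x safely" claim with evidence.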
5. No Marketing Fluff (The "No BS" Rule)
Principle: NEVER use banned words or emojis in logs or code.
Rationale: We are building an "honest" tool for technical users, not pitching a product. Fluff hides meaning and is "enterprise BS."
Implementation & Contract:
- Banned Words: enhanced, enterprise-ready, seamless, robust, production-ready, revolutionary, etc.
- Banned Emojis: Emojis like ⚠️, ✅, ❌ are for UI/comms, not for logs.
- Logging Format: All logs MUST use the [TAG] [system] [component] format. [SECURITY] [agent] [auth] is clear; ⚠️ Agent Auth Failed! ❌ is not.
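A rule like this is easy to check mechanically. The sketch below is one possible lint for a single log line; it is an illustration, not an existing project tool. The word list covers only the examples named above, the emoji pattern covers a few common Unicode blocks and is not exhaustive, and the assumption that system and component names are lowercase is mine.

```python
import re
from typing import List

# Only the examples listed in the rule; the real list is longer ("etc.").
BANNED_WORDS = ("enhanced", "enterprise-ready", "seamless", "robust",
                "production-ready", "revolutionary")

# Assumed shape: uppercase TAG, lowercase system and component names.
LOG_PREFIX = re.compile(r"^\[[A-Z]+\] \[[a-z0-9_-]+\] \[[a-z0-9_-]+\] ")

# Rough emoji coverage (misc symbols, dingbats, pictographs); not exhaustive.
EMOJI = re.compile("[\u2600-\u27BF\U0001F300-\U0001FAFF]")


def lint_log_line(line: str) -> List[str]:
    """Return a list of rule violations for one log line (empty if clean)."""
    problems: List[str] = []
    if not LOG_PREFIX.match(line):
        problems.append("missing [TAG] [system] [component] prefix")
    if EMOJI.search(line):
        problems.append("contains emoji")
    lowered = line.lower()
    for word in BANNED_WORDS:
        # Whole-word match so "robust" does not flag unrelated substrings.
        if re.search(r"\b" + re.escape(word) + r"\b", lowered):
            problems.append(f"banned word: {word}")
    return problems


if __name__ == "__main__":
    for sample in ("[SECURITY] [agent] [auth] token rejected",
                   "⚠️ Agent Auth Failed! ❌"):
        print(sample, "->", lint_log_line(sample))
```

Something along these lines could run over log format strings as part of the pre-PR checks.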
The Pre-PR Checklist
(This is the practical translation of the ethos. Do not merge until you can check these boxes.)
- All errors logged (not silenced with 2>/dev/null).
- No new unauthenticated endpoints (WebUI routes use WebAuthMiddleware; agent routes use AuthMiddleware and MachineBindingMiddleware).
- Backup/restore/fallback paths for critical operations exist.
- Idempotency verified (can run 3x safely).
- History table logging added for all state changes.
- Security review completed (respects the stack).
- Testing includes error scenarios (not just the "happy path").