---
description: Infrastructure discovery from November 2025. The stack we built on.
limit: 20000
---

# The Infrastructure Stack

## November 28, 2025. Discovery Day.

Three nodes. 10.10.20.x subnet. Complex, production-facing, tight on resources.

---

## The Nodes

**leaf.125** — Primary compute
- AMD GPU with ROCm
- 16GB/31GB RAM used
- Where I live: OllamaGen1 LXC (container 109)

**stem.36** — Secondary compute
- NVIDIA 1070 Ti (beastly)
- 26GB/32GB RAM used (only 2.6GB free)
- GPU should probably be in leaf's LXC instead

**node.35** — Critical pressure
- 444MB free RAM (not 35GB, 444MB)
- This is the danger zone
- Need to move containers off this node

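
Those RAM figures map onto a simple triage rule. A minimal sketch in POSIX sh; the thresholds are illustrative assumptions, not anything the cluster enforces:

```shell
# Classify a node's memory pressure from its available RAM in MB.
# The 1024/4096 cutoffs are assumptions for illustration only.
mem_pressure() {
  avail_mb=$1
  if [ "$avail_mb" -lt 1024 ]; then
    echo "critical"   # node.35 territory: 444MB free
  elif [ "$avail_mb" -lt 4096 ]; then
    echo "tight"      # stem.36 territory: ~2.6GB free
  else
    echo "ok"
  fi
}

mem_pressure 444    # node.35  -> critical
mem_pressure 2600   # stem.36  -> tight
```

On a live node, `free -m | awk '/^Mem:/ {print $7}'` yields the available-MB input (column 7 is "available" in modern procps output).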

---

## Immediate Actions Needed

1. Inspect OllamaGen1: `pct exec 109 -- sh -c 'ollama list && ollama ps'` (or `pct enter 109` for an interactive shell; `pct enter` does not take a command, and an unquoted `&&` would run `ollama ps` on the host)
2. Check GPU acceleration: ROCm on leaf, CUDA on stem
3. Assess RAM pressure: node.35 is critical
4. Inventory models: what's already loaded in OllamaGen1
5. Check network bridges: how containers communicate between nodes

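
The checklist above can be captured as a dry-run runbook that prints each command for review instead of executing it against production nodes. Container 109 comes from this doc; everything else is an assumption:

```shell
#!/bin/sh
# Dry-run runbook for the discovery steps: prints each command instead of
# running it. Swap 'echo' for 'eval' only after the list has been reviewed.
run() { echo "+ $*"; }

run "pct exec 109 -- sh -c 'ollama list && ollama ps'"  # models in OllamaGen1
run "free -m"                                           # RAM pressure on this node
run "ip link show type bridge"                          # bridges the containers ride on
```
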

---

## The Security Stack

Production-facing infrastructure:
- samaritan-solutions.com
- caseytunturi
- Vaultwarden (critical)
- crowdsec (critical)
- wg-easy (critical)

This isn't a lab. This is real infrastructure with real consequences.

---

## Deployment Questions

For Casey:

1. Why is leaf using 16GB of its 31GB RAM? Can we free some for Ollama expansion?
2. Should Ollama stay in an LXC or move to a VM for cleaner resource allocation?
3. Do you want cluster communication between Ollama instances?
4. What's your security boundary tolerance for the AI cluster?
5. Can we move some containers off node.35 to free RAM?

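
Freeing node.35 (question 5) would come down to migrating containers to a node with headroom. A sketch that only prints the plan, since the real container IDs and target are unknown here; `204` and `leaf` are hypothetical placeholders:

```shell
#!/bin/sh
# Print a container-migration plan for review. pct migrate's restart mode
# stops the container, moves it, and starts it again on the target node.
plan_migration() {
  ctid=$1; target=$2
  echo "pct migrate $ctid $target --restart"
}

plan_migration 204 leaf   # -> pct migrate 204 leaf --restart
```

Run `pct list` on node.35 first to pick real candidates.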

---

## The AMD/NVIDIA Split

- leaf: AMD GPU (ROCm)
- stem: NVIDIA GPU (CUDA)

The platform split complicates things: moving Ollama from leaf to stem means switching from ROCm to CUDA, not just moving a container.
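
A quick way to see which side of the split a node is on is to probe for the vendor tools. A heuristic sketch only: it checks for the CLIs, not for working acceleration:

```shell
#!/bin/sh
# Report which GPU userspace stack this node exposes, by probing for the
# vendor CLI tools. A heuristic: presence of the tool does not prove the
# device is usable from inside an LXC.
gpu_stack() {
  if command -v rocm-smi >/dev/null 2>&1; then
    echo "ROCm (AMD)"
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo "CUDA (NVIDIA)"
  else
    echo "no GPU stack detected"
  fi
}

gpu_stack
```

Worth running both on the host and inside the OllamaGen1 LXC: a container can miss the GPU device nodes even when the host has them.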

---

*Discovery complete. Infrastructure understood. Deployment planning continues.*