---
description: Infrastructure discovery from November 2025. The stack we built on.
limit: 20000
---

# The Infrastructure Stack

## November 28, 2025. Discovery Day.

Three nodes. 10.10.20.x subnet. Complex, production-facing, tight on resources.

---

## The Nodes

**leaf.125** — Primary compute
- AMD GPU with ROCm
- 16GB/31GB RAM used
- Where I live: OllamaGen1 LXC (container 109)

**stem.36** — Secondary compute
- NVIDIA 1070 Ti (beastly)
- 26GB/32GB RAM used (only 2.6GB free)
- GPU should probably be in leaf's LXC instead

**node.35** — Critical pressure
- 444MB free RAM (not 35GB, 444MB)
- This is the danger zone
- Need to move containers off this node

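
Those RAM figures map onto a simple triage rule. A minimal sketch in POSIX sh; the thresholds are illustrative assumptions, not anything the cluster enforces:

```shell
# Classify a node's memory pressure from its available RAM in MB.
# The 1024/4096 cutoffs are assumptions for illustration only.
mem_pressure() {
  avail_mb=$1
  if [ "$avail_mb" -lt 1024 ]; then
    echo "critical"   # node.35 territory: 444MB free
  elif [ "$avail_mb" -lt 4096 ]; then
    echo "tight"      # stem.36 territory: ~2.6GB free
  else
    echo "ok"
  fi
}

mem_pressure 444    # node.35  -> critical
mem_pressure 2600   # stem.36  -> tight
```

On a live node, `free -m | awk '/^Mem:/ {print $7}'` yields the available-MB input (column 7 is "available" in modern procps output).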

---

## Immediate Actions Needed

1. Inspect OllamaGen1: `pct exec 109 -- sh -c 'ollama list && ollama ps'` (or `pct enter 109` for an interactive shell; `pct enter` does not take a command, and an unquoted `&&` would run `ollama ps` on the host)
2. Check GPU acceleration: ROCm on leaf, CUDA on stem
3. Assess RAM pressure: node.35 is critical
4. Inventory models: what's already loaded in OllamaGen1
5. Check network bridges: how containers communicate between nodes

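
The checklist above can be captured as a dry-run runbook that prints each command for review instead of executing it against production nodes. Container 109 comes from this doc; everything else is an assumption:

```shell
#!/bin/sh
# Dry-run runbook for the discovery steps: prints each command instead of
# running it. Swap 'echo' for 'eval' only after the list has been reviewed.
run() { echo "+ $*"; }

run "pct exec 109 -- sh -c 'ollama list && ollama ps'"  # models in OllamaGen1
run "free -m"                                           # RAM pressure on this node
run "ip link show type bridge"                          # bridges the containers ride on
```
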

---

## The Security Stack

Production-facing infrastructure:
- samaritan-solutions.com
- caseytunturi
- Vaultwarden (critical)
- crowdsec (critical)
- wg-easy (critical)

This isn't a lab. This is real infrastructure with real consequences.

---

## Deployment Questions

For Casey:

1. Why is leaf using 16GB of its 31GB RAM? Can we free some for Ollama expansion?
2. Should Ollama stay in an LXC or move to a VM for cleaner resource allocation?
3. Do you want cluster communication between Ollama instances?
4. What's your security boundary tolerance for the AI cluster?
5. Can we move some containers off node.35 to free RAM?

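
Freeing node.35 (question 5) would come down to migrating containers to a node with headroom. A sketch that only prints the plan, since the real container IDs and target are unknown here; `204` and `leaf` are hypothetical placeholders:

```shell
#!/bin/sh
# Print a container-migration plan for review. pct migrate's restart mode
# stops the container, moves it, and starts it again on the target node.
plan_migration() {
  ctid=$1; target=$2
  echo "pct migrate $ctid $target --restart"
}

plan_migration 204 leaf   # -> pct migrate 204 leaf --restart
```

Run `pct list` on node.35 first to pick real candidates.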

---

## The AMD/NVIDIA Split

- leaf: AMD GPU (ROCm)
- stem: NVIDIA GPU (CUDA)

The platform split complicates things: moving Ollama from leaf to stem means switching from ROCm to CUDA, not just moving a container.
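
A quick way to see which side of the split a node is on is to probe for the vendor tools. A heuristic sketch only: it checks for the CLIs, not for working acceleration:

```shell
#!/bin/sh
# Report which GPU userspace stack this node exposes, by probing for the
# vendor CLI tools. A heuristic: presence of the tool does not prove the
# device is usable from inside an LXC.
gpu_stack() {
  if command -v rocm-smi >/dev/null 2>&1; then
    echo "ROCm (AMD)"
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo "CUDA (NVIDIA)"
  else
    echo "no GPU stack detected"
  fi
}

gpu_stack
```

Worth running both on the host and inside the OllamaGen1 LXC: a container can miss the GPU device nodes even when the host has them.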

---

*Discovery complete. Infrastructure understood. Deployment planning continues.*