
---
description: Infrastructure discovery from November 2025. The stack we built on.
limit: 20000
---

# The Infrastructure Stack

November 28, 2025. Discovery Day.

Three nodes. 10.10.20.x subnet. Complex, production-facing, tight on resources.


## The Nodes

### leaf.125 — Primary compute

- AMD GPU with ROCm
- 16GB/31GB RAM used
- Where I live: OllamaGen1 LXC (container 109)

### stem.36 — Secondary compute

- NVIDIA GTX 1070 Ti (beastly)
- 26GB/32GB RAM used (only 2.6GB free)
- GPU should probably be in leaf's LXC instead

### node.35 — Critical pressure

- 444MB free RAM (not 35GB, 444MB)
- This is the danger zone
- Need to move containers off this node
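To see what's actually eating node.35, a sketch assuming a Proxmox VE host with `pvesh` and `jq` available (the node name `node` is taken from the notes; adjust if the real hostname differs):

```shell
#!/bin/sh
# Sketch: list guests on node.35 sorted by memory use, highest first.
# Assumes a Proxmox VE host with pvesh and jq installed.
TARGET_NODE="node"

if command -v pvesh >/dev/null 2>&1 && command -v jq >/dev/null 2>&1; then
    # /cluster/resources reports per-guest mem (bytes used) cluster-wide.
    pvesh get /cluster/resources --type vm --output-format json \
      | jq -r --arg n "$TARGET_NODE" \
          '.[] | select(.node == $n)
               | [.vmid, .name, ((.mem // 0) / 1048576 | floor)] | @tsv' \
      | sort -k3 -rn
else
    echo "pvesh/jq not available: run this on a Proxmox node" >&2
fi
```

The output is vmid, name, and MiB used per guest, which makes the migration candidates obvious.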

## Immediate Actions Needed

  1. Check OllamaGen1: `pct exec 109 -- sh -c 'ollama list && ollama ps'` (`pct enter` only opens an interactive shell; `pct exec` runs a one-shot command)
  2. Check GPU acceleration: ROCm on leaf, CUDA on stem
  3. Assess RAM pressure: node.35 is critical
  4. Inventory models: what's already loaded in OllamaGen1
  5. Check network bridges: how containers communicate between nodes
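The checks above, sketched as one guarded script. This assumes a Proxmox VE host; note that `pct exec` (not `pct enter`) is the command that runs something inside a container non-interactively:

```shell
#!/bin/sh
# Sketch of the discovery checks, assuming a Proxmox VE host.
# Container 109 (OllamaGen1) comes from the notes; everything else is generic.
CTID=109

if command -v pct >/dev/null 2>&1; then
    # Inventory models: a shell wrapper is needed to chain the two ollama calls.
    pct exec "$CTID" -- sh -c 'ollama list && ollama ps'

    # GPU acceleration: ROCm exposes /dev/kfd, NVIDIA exposes /dev/nvidia0.
    [ -e /dev/kfd ] && echo "ROCm device present (AMD node)"
    [ -e /dev/nvidia0 ] && echo "NVIDIA device present"

    # RAM pressure on this node.
    free -h

    # Network bridges the containers hang off.
    ip -brief link show type bridge
else
    echo "pct not found: run this on a Proxmox node" >&2
fi
```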

## The Security Stack

Production-facing infrastructure:

- samaritan-solutions.com
- caseytunturi
- Vaultwarden (critical)
- crowdsec (critical)
- wg-easy (critical)

This isn't a lab. This is real infrastructure with real consequences.
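Quick liveness checks for the critical services, sketched with each tool's stock CLI. The Vaultwarden URL is a placeholder, and each check assumes the relevant binary is installed where this runs:

```shell
#!/bin/sh
# Sketch: health checks for the critical services. Each check is skipped
# (not failed) if its tool isn't present on this host.
VAULTWARDEN_URL="https://vault.example.invalid"   # placeholder URL

# Vaultwarden ships a liveness endpoint at /alive.
command -v curl >/dev/null 2>&1 && curl -fsS "$VAULTWARDEN_URL/alive" || true

# cscli ships with crowdsec and reports engine/bouncer metrics.
command -v cscli >/dev/null 2>&1 && cscli metrics || true

# wg-easy wraps WireGuard; wg(8) shows live peer state.
command -v wg >/dev/null 2>&1 && wg show || true
```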


## Deployment Questions

For Casey:

  1. Why is leaf using 16GB/31GB RAM? Can we free some for Ollama expansion?
  2. Should Ollama stay in LXC or move to VMs for resource allocation?
  3. Do you want cluster communication between Ollama instances?
  4. What's your security boundary tolerance for the AI cluster?
  5. Can we move some containers off node.35 to free RAM?
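Question 5 can be acted on directly with a restart migration, sketched here assuming the nodes are in one Proxmox cluster. The VMID is a placeholder; pick a real one from `pct list` on node.35:

```shell
#!/bin/sh
# Sketch: move a container off node.35 to leaf, assuming a clustered
# Proxmox VE setup. VMID is a placeholder container ID.
VMID=200
TARGET="leaf"

if command -v pct >/dev/null 2>&1; then
    # Running containers need --restart (LXC has no live migration);
    # stopped containers migrate offline without it.
    pct migrate "$VMID" "$TARGET" --restart
else
    echo "pct not found: run this on the source Proxmox node" >&2
fi
```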

## The AMD/NVIDIA Split

- leaf: AMD GPU (ROCm)
- stem: NVIDIA GPU (CUDA)

The platform split complicates things. Moving Ollama from leaf to stem means switching GPU platforms: different driver stacks, different Ollama backend, different device passthrough.
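At the container level the split shows up in the LXC passthrough config. A sketch of the two styles (device majors for `/dev/kfd` and `nvidia-uvm` are allocated dynamically; verify with `ls -l /dev/` before copying):

```
# AMD/ROCm passthrough (leaf-style): expose KFD plus the DRM render nodes.
# /dev/dri is char major 226; /dev/kfd's major is dynamic, so check it.
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/kfd dev/kfd none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir

# NVIDIA/CUDA passthrough (stem-style): /dev/nvidia* is char major 195;
# nvidia-uvm's major is dynamic, so check it too.
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
```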


Discovery complete. Infrastructure understood. Deployment planning continues.