# Control Server Operations Guide

**Host:** control (CT 127)
**IP:** 192.168.1.127
**Location:** pve2
**User:** maddox
**Last Updated:** January 23, 2026

---

## Overview

The control server is the centralized command center for managing the Proxmox cluster infrastructure. It provides:

- **Passwordless SSH** to all 13 managed hosts
- **Ansible automation** for cluster-wide operations
- **tmux sessions** for multi-host management
- **Git-based configuration** synced to Forgejo

---

## Quick Start

### Launch Interactive Menu

```bash
~/scripts/control-menu.sh
```

### Launch Multi-Host SSH Session

```bash
~/scripts/ssh-manager.sh
```

### Run Ansible Ad-Hoc Command

```bash
cd ~/clustered-fucks
ansible all -m ping
# Escape Go-template braces so Jinja2 doesn't try to expand them:
ansible docker_hosts -m shell -a "docker ps --format 'table {{ '{{' }}.Names{{ '}}' }}\t{{ '{{' }}.Status{{ '}}' }}'"
```

---

## Directory Structure

```
/home/maddox/
├── .ssh/
│   ├── config              # SSH host definitions
│   ├── tmux-hosts.conf     # tmux session configuration
│   ├── id_ed25519          # SSH private key
│   └── id_ed25519.pub      # SSH public key (add to new hosts)
│
├── clustered-fucks/        # Git repo (synced to Forgejo)
│   ├── ansible.cfg         # Ansible configuration
│   ├── inventory/
│   │   ├── hosts.yml       # Host inventory
│   │   └── group_vars/
│   │       └── all.yml     # Global variables
│   └── playbooks/
│       ├── check-status.yml
│       ├── docker-prune.yml
│       ├── restart-utils.yml
│       ├── update-all.yml
│       └── deploy-utils.yml
│
└── scripts/
    ├── ssh-manager.sh      # tmux multi-host launcher
    ├── control-menu.sh     # Interactive Ansible menu
    └── add-host.sh         # New host onboarding
```

---

## Managed Hosts

| Host | IP | User | Port | Type | Group |
|------|----|------|------|------|-------|
| pve2 | .3 | root | 22 | Proxmox | proxmox_nodes |
| pve-dell | .4 | root | 22 | Proxmox | proxmox_nodes |
| replicant | .80 | maddox | 22 | VM | docker_hosts |
| databases | .81 | root | 22 | VM | docker_hosts |
| immich | .82 | root | 22 | VM | docker_hosts |
| media-transcode | .120 | root | 22 | LXC | docker_hosts |
| network-services | .121 | root | 22 | LXC | docker_hosts |
| download-stack | .122 | root | 22 | LXC | docker_hosts |
| docker666 | .123 | root | 22 | LXC | docker_hosts |
| tailscale-home | .124 | root | 22 | LXC | docker_hosts |
| dns-lxc | .125 | root | 22 | LXC | infrastructure |
| nas | .251 | maddox | 44822 | NAS | legacy |
| alien | .252 | maddox | 22 | Docker | legacy |

---

## Ansible Host Groups

| Group | Members | Use Case |
|-------|---------|----------|
| `all` | All 13 hosts | Connectivity tests |
| `docker_hosts` | 8 hosts | Docker operations |
| `all_managed` | 11 hosts | System updates |
| `proxmox_nodes` | pve2, pve-dell | Node-level ops |
| `infrastructure` | dns-lxc | Non-Docker infra |
| `legacy` | nas, alien | Manual operations |
| `vms` | replicant, databases, immich | VM-specific |
| `lxcs` | 6 LXC containers | LXC-specific |

---

## Playbooks Reference

### check-status.yml

Reports disk usage, memory usage, and container counts.

```bash
ansible-playbook playbooks/check-status.yml
```

**Target:** all_managed
**Output:** Per-host status line (Disk=X% Mem=X% Containers=X)

---

### update-all.yml

Runs apt update and upgrade on all Docker hosts.

```bash
ansible-playbook playbooks/update-all.yml

# With reboot if required:
ansible-playbook playbooks/update-all.yml -e "reboot=true"
```

**Target:** docker_hosts
**Note:** Checks for a reboot requirement and notifies, but doesn't auto-reboot unless run with `-e "reboot=true"`.

---

### docker-prune.yml

Cleans unused Docker resources (images, networks, build cache).

```bash
ansible-playbook playbooks/docker-prune.yml
```

**Target:** docker_hosts
**Note:** dns-lxc will fail (no Docker); this is expected.

---

### restart-utils.yml

Restarts the utils stack (watchtower, autoheal, docker-proxy) on all hosts.

```bash
ansible-playbook playbooks/restart-utils.yml
```

**Target:** docker_hosts
**Note:** Uses the host-specific `docker_appdata` variable for non-standard paths.

---

### deploy-utils.yml

Deploys the standardized utils stack to a new host.
```bash
ansible-playbook playbooks/deploy-utils.yml --limit new-host
```

**Target:** docker_hosts
**Note:** Creates the directory structure and .env file only; the compose file must be added separately.

---

## Scripts Reference

### ssh-manager.sh

Launches a tmux session with SSH connections to all hosts.

```bash
~/scripts/ssh-manager.sh
```

**Features:**

- Window 0: Control (local shell)
- Windows 1-13: Individual host SSH sessions
- Final window: Multi-View (all hosts in split panes)

**Navigation:**

- `Ctrl+b` then window number to switch
- `Ctrl+b d` to detach (keeps session running)
- `tmux attach -t cluster` to reattach

---

### control-menu.sh

Interactive menu for common operations.

```bash
~/scripts/control-menu.sh
```

**Menu Options:**

```
[1] Ping All       - Test connectivity
[2] Check Status   - Disk/memory/containers
[3] Update All     - apt upgrade docker hosts
[4] Docker Prune   - Clean unused resources
[5] Restart Utils  - Restart utils stack everywhere
[A] Ad-hoc Command - Run custom command
[I] Inventory      - Show host list
[S] SSH Manager    - Launch tmux session
[Q] Quit
```

---

### add-host.sh

Wizard for onboarding new hosts.

```bash
~/scripts/add-host.sh
```

**Steps:**

1. Prompts for hostname, IP, user, port, description
2. Tests SSH connectivity
3. Copies SSH key if needed
4. Adds to `~/.ssh/config`
5. Adds to `~/.ssh/tmux-hosts.conf`

**Note:** The Ansible inventory must be edited manually.
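For reference, the block the wizard appends to `~/.ssh/config` follows the same shape as the existing entries in the Reference Files section below. This is a sketch with hypothetical example values; the exact output of `add-host.sh` may differ:

```
Host new-host
    HostName 192.168.1.XXX
    User root
    Port 22
```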
---

## Common Operations

### SSH to a Specific Host

```bash
ssh replicant
ssh databases
ssh nas          # Uses port 44822 automatically
```

### Run Command on All Docker Hosts

```bash
cd ~/clustered-fucks
ansible docker_hosts -m shell -a "docker ps -q | wc -l"
```

### Run Command on Specific Host

```bash
ansible replicant -m shell -a "df -h"
```

### Copy File to All Hosts

```bash
ansible docker_hosts -m copy -a "src=/path/to/file dest=/path/to/dest"
```

### Check Specific Service

```bash
# Escape Go-template braces so Jinja2 doesn't try to expand them:
ansible docker_hosts -m shell -a "docker ps --filter name=watchtower --format '{{ '{{' }}.Status{{ '}}' }}'"
```

### View Ansible Inventory

```bash
ansible-inventory --graph
ansible-inventory --list
```

---

## Git Workflow

### Repository Location

- **Local:** `~/clustered-fucks/`
- **Remote:** `ssh://git@192.168.1.81:2222/maddox/clustered-fucks.git`
- **Web:** https://git.3ddbrewery.com/maddox/clustered-fucks

### Standard Workflow

```bash
cd ~/clustered-fucks

# Make changes to playbooks/inventory
vim playbooks/new-playbook.yml

# Commit and push
git add -A
git commit -m "Add new playbook"
git push origin main
```

### Pull Latest Changes

```bash
cd ~/clustered-fucks
git pull origin main
```

---

## Adding a New Host

### 1. Run Onboarding Script

```bash
~/scripts/add-host.sh
```

### 2. Edit Ansible Inventory

```bash
vim ~/clustered-fucks/inventory/hosts.yml
```

Add under the appropriate group:

```yaml
new-host:
  ansible_host: 192.168.1.XXX
  ansible_user: root
```

If the host has a non-standard appdata path:

```yaml
new-host:
  ansible_host: 192.168.1.XXX
  ansible_user: root
  docker_appdata: /custom/path/appdata
```

### 3. Test Connection

```bash
ansible new-host -m ping
```

### 4. Commit Changes

```bash
cd ~/clustered-fucks
git add -A
git commit -m "Add new-host to inventory"
git push origin main
```

---

## Troubleshooting

### SSH Connection Refused

```bash
# Check if SSH is running on target
ssh -v hostname

# If connection refused, access via Proxmox console:
#   For LXC: pct enter <ctid>
#   For VM:  qm terminal <vmid>

# Inside the container/VM:
apt install openssh-server
systemctl enable ssh
systemctl start ssh
```

### SSH Permission Denied

```bash
# Check key is in authorized_keys on target
ssh-copy-id hostname

# If still failing, check permissions on target:
# (via Proxmox console)
chmod 700 ~
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
chown -R root:root ~/.ssh   # or the appropriate user
```

### Ansible "Missing sudo password"

The host is configured with `ansible_become: yes` but no password is set.

Fix: either remove `ansible_become: yes` from the inventory, or set up passwordless sudo on the target:

```bash
# As root on the target:
echo "username ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/username
```

### Playbook Skips Host

Check if the host is in the correct group:

```bash
ansible-inventory --graph
```

Check host variables:

```bash
ansible-inventory --host hostname
```

### Docker Command Not Found

The host is in `docker_hosts` but doesn't have Docker installed. Move it to the `infrastructure` group:

```yaml
infrastructure:
  hosts:
    hostname:
      ansible_host: 192.168.1.XXX
```

---

## Non-Standard Configurations

### Hosts with Different Appdata Paths

| Host | Path |
|------|------|
| replicant | `/home/maddox/docker/appdata` |
| docker666 | `/root/docker/appdata` |
| All others | `/home/docker/appdata` |

These are handled via the `docker_appdata` variable in the inventory.

### Hosts with Non-Standard SSH

| Host | Port | User |
|------|------|------|
| nas | 44822 | maddox |

Configured in both `~/.ssh/config` and `inventory/hosts.yml`.
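On the inventory side, the nas overrides can be expressed with Ansible's standard connection variables (`ansible_port`, `ansible_user`). A sketch, assuming the group layout from the Ansible Host Groups table; the actual `hosts.yml` may be structured differently:

```yaml
legacy:
  hosts:
    nas:
      ansible_host: 192.168.1.251
      ansible_user: maddox
      ansible_port: 44822
```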
### Hosts Without Utils Stack

| Host | Reason |
|------|--------|
| tailscale-home | Only runs Headscale, no utils needed |
| dns-lxc | No Docker installed |

---

## Maintenance

### Update Ansible

```bash
sudo apt update
sudo apt install --only-upgrade ansible
```

### Regenerate SSH Keys (if compromised)

```bash
# Generate new key
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519

# Distribute to all hosts (will prompt for passwords)
for host in pve2 pve-dell replicant databases immich media-transcode network-services download-stack docker666 tailscale-home dns-lxc alien; do
  ssh-copy-id $host
done

# NAS requires special handling
ssh-copy-id -p 44822 maddox@192.168.1.251
```

### Backup Configuration

```bash
cd ~/clustered-fucks
git add -A
git commit -m "Backup: $(date +%Y-%m-%d)"
git push origin main
```

---

## Reference Files

### ~/.ssh/config

```
Host *
    StrictHostKeyChecking accept-new
    ServerAliveInterval 60
    ServerAliveCountMax 3

Host pve2
    HostName 192.168.1.3
    User root

Host pve-dell
    HostName 192.168.1.4
    User root

Host replicant
    HostName 192.168.1.80
    User maddox

Host databases
    HostName 192.168.1.81
    User root

Host immich
    HostName 192.168.1.82
    User root

Host media-transcode
    HostName 192.168.1.120
    User root

Host network-services
    HostName 192.168.1.121
    User root

Host download-stack
    HostName 192.168.1.122
    User root

Host docker666
    HostName 192.168.1.123
    User root

Host tailscale-home
    HostName 192.168.1.124
    User root

Host dns-lxc
    HostName 192.168.1.125
    User root

Host nas
    HostName 192.168.1.251
    User maddox
    Port 44822

Host alien
    HostName 192.168.1.252
    User maddox
```

### ~/clustered-fucks/ansible.cfg

```ini
[defaults]
inventory = inventory/hosts.yml
remote_user = root
host_key_checking = False
retry_files_enabled = False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400
stdout_callback = yaml
forks = 10

[privilege_escalation]
become = False

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
```

---

## Changelog

| Date | Change |
|------|--------|
| 2026-01-23 | Initial deployment, all hosts connected, playbooks tested |