Disaster Recovery Plan
This document outlines the procedures to restore critical services in the event of a catastrophic failure. The primary backup mechanism is ArchiveForge, which stores compressed tar.gz archives of service data on the NAS at 192.168.1.251.
Guiding Principles
- Prioritize Critical Services: Restore essential services first (e.g., Traefik, Authelia, ArchiveForge), followed by high-priority applications.
- Assume Total Loss: This plan assumes the primary application host (192.168.1.252) is unrecoverable and a new host is being provisioned.
- Test Regularly: This plan should be tested quarterly to ensure its effectiveness.
Phase 1: Infrastructure Restoration
This phase focuses on bringing the core infrastructure back online on a new host.
- Provision New Host:
  - Install a fresh OS (e.g., Ubuntu Server).
  - Install Docker and Docker Compose.
  - Configure networking to match the old host's static IP (192.168.1.252).
- Mount External Storage:
  - Mount the NAS storage to the new host. Ensure the mount points are identical to the previous setup (e.g., /volume1/Media, /volume1/docker/backup).
  - Verify read/write access.
- Restore Core Services:
  - Traefik: Restore the Traefik configuration from its separate backup location if it is not part of appdata; otherwise restore it from the appdata backup. Start Traefik.
  - Authelia: Restore the Authelia configuration and start the service.
  - ArchiveForge: Restore the ArchiveForge service. This is critical for restoring other applications.
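Before moving on to Phase 2, it may help to confirm that the new host actually meets the prerequisites listed above. The following is a minimal sketch, not part of the original plan: the function names check_tools and check_mount are hypothetical, and the probe-file approach is just one simple way to verify read/write access.

```shell
# Preflight sketch (hypothetical helpers, not part of ArchiveForge).

# check_tools TOOL... -> returns 0 only if every named tool is on PATH.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "MISSING TOOL: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# check_mount DIR -> returns 0 if DIR exists and a probe file can be
# created and removed in it; 1 if DIR is missing; 2 if it is not writable.
check_mount() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "MISSING: $dir" >&2
        return 1
    fi
    local probe="$dir/.dr-probe.$$"
    if ! touch "$probe" 2>/dev/null; then
        echo "NOT WRITABLE: $dir" >&2
        return 2
    fi
    rm -f "$probe"
    echo "OK: $dir"
}

# Example invocation, using the tools and mount points named in this plan:
#   check_tools docker docker-compose
#   check_mount /volume1/Media
#   check_mount /volume1/docker/backup
```

Note that check_mount only proves the directory is writable; it does not prove the path is an actual NAS mount rather than an empty local directory, so a manual look at the mount table is still worthwhile.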
Phase 2: Application Service Restoration
This phase details the process of restoring individual application services from the ArchiveForge backups.
General Restoration Steps
The general process for restoring a service from an ArchiveForge backup is as follows:
- Identify the Latest Backup: Locate the most recent backup for the desired service in the ArchiveForge backup directory on the NAS (e.g., /volume1/docker/backup/ArchiveForge/daily/...).
- Stop the Service: If the service is running (e.g., with a fresh but empty configuration), stop it:
      cd /mnt/docker-storage/appdata/[service-name]
      docker-compose down
- Restore the Data: Extract the backup archive into the service's appdata directory. This will overwrite the existing configuration and data.
      tar -xzf /path/to/backup/[service-name]-YYYYMMDD-HHMMSS.tar.gz -C /mnt/docker-storage/appdata/[service-name]
- Verify Permissions: Ensure the restored files have the correct ownership and permissions. This is especially important if PUID and PGID are used in the docker-compose.yml.
- Start the Service:
      cd /mnt/docker-storage/appdata/[service-name]
      docker-compose up -d
- Verify Functionality: Check the container logs and access the service's web UI to ensure it's running correctly and the data has been restored.
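The first step, identifying the latest backup, is easy to get wrong by hand when many dated archives exist. The sketch below shows one way to automate it. It assumes archives follow the [service-name]-YYYYMMDD-HHMMSS.tar.gz naming used above and sit somewhere beneath a single backup root; the function name latest_backup is hypothetical, and the approach relies on those timestamps sorting chronologically as plain text.

```shell
# latest_backup ROOT SERVICE
# Prints the path of the newest archive for SERVICE found under ROOT,
# or returns 1 if none exists. Assumes file names of the form
# SERVICE-YYYYMMDD-HHMMSS.tar.gz and paths without tabs or newlines.
latest_backup() {
    local root="$1" service="$2"
    local ts='[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9]'
    local newest
    # Prefix each path with its base name so the sort is driven by the
    # timestamp in the file name, not by the directory it lives in.
    newest=$(find "$root" -type f -name "${service}-${ts}.tar.gz" 2>/dev/null \
        | awk -F/ '{ print $NF "\t" $0 }' \
        | sort \
        | tail -n 1 \
        | cut -f2-)
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Example invocation (backup root taken from this plan):
#   archive=$(latest_backup /volume1/docker/backup/ArchiveForge/daily readarr)
```

The strict timestamp pattern is deliberate: a looser glob such as readarr-* would also match archives of any other service whose name merely starts with the same prefix.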
Example: Restoring Readarr
- Locate Backup: Find the latest readarr backup on the NAS.
- Stop Readarr:
      cd /mnt/docker-storage/appdata/readarr
      docker-compose down
- Restore Data:
      # Example path, replace with actual backup file
      tar -xzf /volume1/docker/backup/ArchiveForge/daily/2025-12-09/readarr-20251209-020000.tar.gz -C /mnt/docker-storage/appdata/readarr
- Start and Verify:
      docker-compose up -d
      docker-compose logs -f
  Access https://readarr.3ddbrewery.com to confirm your library and settings are restored.
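Because extraction overwrites the appdata directory in place, a truncated or corrupt archive can leave a service half-restored. One possible safeguard, sketched here under the assumption that backups are plain gzip-compressed tar files as described in this plan, is to list the archive before extracting it. The function name restore_archive is hypothetical; stopping and starting the container is left to the surrounding steps.

```shell
# restore_archive ARCHIVE DEST
# Lists ARCHIVE first so an unreadable file is caught before anything in
# DEST is touched, then extracts it into DEST.
# Returns 1 if ARCHIVE is missing, 2 if it fails the listing check,
# 3 if DEST cannot be created, otherwise tar's own exit status.
restore_archive() {
    local archive="$1" dest="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # tar -tzf has to decompress and walk the whole archive to list it,
    # so a non-zero exit here means the file is damaged or not a tar.gz.
    if ! tar -tzf "$archive" >/dev/null 2>&1; then
        echo "archive failed integrity check: $archive" >&2
        return 2
    fi
    mkdir -p "$dest" || return 3
    tar -xzf "$archive" -C "$dest"
}

# Example invocation (paths taken from the Readarr example):
#   restore_archive \
#     /volume1/docker/backup/ArchiveForge/daily/2025-12-09/readarr-20251209-020000.tar.gz \
#     /mnt/docker-storage/appdata/readarr
```

A successful listing only shows the archive is structurally readable; it does not confirm the application data inside is complete, so the final web UI check remains necessary.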
Phase 3: External Database Restoration
For services that use the external database on the NAS (192.168.1.251), a separate restoration procedure is required. This procedure depends on how that database is backed up (e.g., mysqldump snapshots).
This section needs to be completed once the backup strategy for the external database is fully documented.
- Identify Backup: Locate the latest SQL dump file.
- Restore Dump: Use the appropriate database command to restore the backup.
      # For MySQL/MariaDB
      mysql -u [username] -p [database_name] < /path/to/backup.sql
      # For PostgreSQL
      psql -U [username] -d [database_name] -f /path/to/backup.sql
- Verify: Check the database to ensure the data has been restored correctly.
Next Review: This document should be reviewed and updated quarterly, or whenever there is a significant change to the infrastructure.