# Troubleshooting Guide

_Last updated: 2026-01-05_

This guide provides solutions to common issues encountered in this Docker-based infrastructure.

## Issue: Container is restarting or won't start

**Symptoms:**

- `docker ps` shows the container is `restarting` or `exited`.
- The `docker-compose up -d` command fails with an error.

**Diagnosis:**

1. **Check the logs:** The first step is always to check the container's logs.
   ```bash
   docker-compose logs -f
   ```
   Look for error messages, stack traces, or any other indication of what went wrong.
2. **Check dependencies:** If the container depends on other services (e.g., a database), ensure those services are running and healthy.
   ```bash
   docker-compose ps
   ```
3. **Check configuration:**
   - **Environment variables:** Ensure all required environment variables are set correctly in the `.env` file or `docker-compose.yml`.
   - **Volumes:** Verify that all volume paths are correct and that the files and directories on the host have the correct permissions. The user the container runs as (often specified with `PUID` and `PGID`) needs read and write access to the volume paths.
   - **Ports:** Check for port conflicts. If another service on the host is already using the same port, the container will fail to start. Use `sudo lsof -i -P -n | grep LISTEN` to list listening ports.

**Resolution:**

- Once the root cause is identified from the logs or configuration check, address the issue. This may involve:
  - Correcting an environment variable.
  - Fixing file permissions on a volume.
  - Changing a port mapping.
  - Restarting a dependency.
- After applying the fix, try starting the container again:
  ```bash
  docker-compose up -d --force-recreate
  ```

## Issue: 502 Bad Gateway from Traefik

**Symptoms:**

- Accessing a service through its domain (e.g., `https://books.3ddbrewery.com`) results in a "502 Bad Gateway" error from Traefik.

**Diagnosis:**

1. **Check the Traefik dashboard:** The Traefik dashboard (if accessible) shows the configured routers, services, and middleware. Look for any errors related to the service in question.
2. **Check Traefik's logs:**
   ```bash
   docker logs traefik
   ```
   Look for errors related to the service, such as "no servers found".
3. **Check the service's logs:**
   ```bash
   docker-compose logs -f
   ```
   The service itself might be crashing or unhealthy.
4. **Check network connectivity:**
   - Ensure the service is connected to the `traefik_proxy` network in its `docker-compose.yml`.
   - From the Traefik container, try to ping the service's container (replace `<container-name>` with the service's container name).
     ```bash
     docker exec -it traefik /bin/sh
     ping <container-name>
     ```
5. **Check Traefik labels:**
   - Ensure the `traefik.http.services.<service-name>.loadbalancer.server.port` label in the `docker-compose.yml` file is set to the port the application listens on inside the container.
   - Verify that all Traefik labels are correctly formatted.

**Resolution:**

- **Service not on `traefik_proxy` network:** Add the service to the `traefik_proxy` network in its `docker-compose.yml`.
- **Incorrect port:** Correct the port in the `traefik.http.services.<service-name>.loadbalancer.server.port` label.
- **Service not running:** Troubleshoot the service using the "Container is restarting" guide above.

## Issue: 404 Not Found from Traefik

**Symptoms:**

- Accessing a service through its domain results in a "404 Not Found" error.

**Diagnosis:**

1. **Check the Traefik dashboard:** Verify that a router has been created for the domain you are trying to access.
2. **Check the `rule` label:** Ensure the `traefik.http.routers.<router-name>.rule` label is set to the correct `Host(...)`.
3. **Check DNS:** Make sure your DNS record points the domain to the IP address of the Traefik server.

**Resolution:**

- **Incorrect rule:** Correct the `Host(...)` rule in the `docker-compose.yml` file.
- **DNS issue:** Correct the DNS record for the domain.
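
The network check from the 502 Bad Gateway diagnosis can be scripted. The following is a minimal sketch rather than part of the existing setup: the function names `on_network` and `check_traefik_network` are hypothetical, and it assumes the `docker` CLI is available on the host and that the proxy network is named `traefik_proxy` as described above.

```bash
# on_network NETWORK "LIST"
# Succeeds if NETWORK appears in the whitespace-separated LIST of names.
on_network() {
    wanted=$1
    for net in $2; do
        if [ "$net" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# check_traefik_network CONTAINER (hypothetical helper)
# Reports whether CONTAINER is attached to the traefik_proxy network.
# Returns 0 if attached, 1 if not, 2 if docker inspect itself failed.
check_traefik_network() {
    container=$1
    # The template prints every attached network name followed by a space.
    networks=$(docker inspect \
        --format '{{range $k, $v := .NetworkSettings.Networks}}{{$k}} {{end}}' \
        "$container") || return 2
    networks=${networks% }   # drop the trailing space left by the template
    if on_network traefik_proxy "$networks"; then
        echo "$container: attached to traefik_proxy"
    else
        echo "$container: not attached to traefik_proxy (networks: $networks)"
        return 1
    fi
}
```

After sourcing these functions, running `check_traefik_network <container-name>` prints a one-line verdict. A non-zero exit status means the container either is not on the proxy network or could not be inspected.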
## Issue: Authentication Failures

**Symptoms:**

- Being unable to log in to a service that is protected by Authelia.
- Seeing "Unauthorized" or "Forbidden" errors.

**Diagnosis:**

1. **Check Authelia's logs:**
   ```bash
   docker logs authelia
   ```
   Look for any errors related to the authentication attempt.
2. **Check the application's logs:** The application itself might be rejecting the authenticated request.
   ```bash
   docker-compose logs -f
   ```
   In the case of `books_webv2`, check the backend logs for any errors related to the `Remote-User` header.
3. **Check the Traefik middleware:** Ensure the `traefik.http.routers.<router-name>.middlewares` label is correctly set to `authelia-brewery` or `authelia-fails`.

**Resolution:**

- **Restart Authelia:** Sometimes, simply restarting Authelia can resolve issues.
  ```bash
  docker restart authelia
  ```
- **Check user credentials:** Double-check the username and password.
- **Check Authelia configuration:** Review Authelia's `configuration.yml` for any errors.

## Issue: MariaDB/MySQL Replication Stopped

**⚠️ CURRENT STATUS**: As of January 2026, `node` database replication has been **intentionally disabled**. All applications connect directly to the primary server (`192.168.1.251`). This section is retained for reference if replication is re-enabled in the future.

**Symptoms:**

- Secondary database server shows `Replica_IO_Running` or `Replica_SQL_Running` as `No`.
- `Seconds_Behind_Source` is not `0` or shows a large number.
- Applications using the secondary database have stale data.

**Diagnosis:**

1. **Check replication status on secondary server:** Connect to the secondary database server using phpMyAdmin or the MySQL client and run:
   ```sql
   SHOW REPLICA STATUS\G
   ```
   Or for older versions:
   ```sql
   SHOW SLAVE STATUS\G
   ```
2. **Check key fields:**
   - `Replica_IO_Running`: Should be `Yes`
   - `Replica_SQL_Running`: Should be `Yes`
   - `Seconds_Behind_Source`: Should be `0`
   - `Last_Error`: Should be empty; any error shown here indicates what went wrong

   MariaDB, and older MySQL versions using `SHOW SLAVE STATUS`, report the first three fields as `Slave_IO_Running`, `Slave_SQL_Running`, and `Seconds_Behind_Master`.
3. **Check primary server status:**
   ```sql
   SHOW MASTER STATUS;
   ```
   Note the `File` and `Position` values.
4. **Check binary log settings:** Ensure binary logging is enabled on the primary server:
   ```sql
   SHOW VARIABLES LIKE 'log_bin';
   ```

**Resolution:**

**Common Fix - Restart Replication:**

```sql
-- On secondary server
STOP REPLICA;
START REPLICA;
SHOW REPLICA STATUS\G
```

**If there's a specific error:**

- **Skip one transaction (if error is known to be safe):**
  ```sql
  STOP REPLICA;
  SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
  START REPLICA;
  ```
  **⚠️ Warning:** Only use this if you understand the error and know it's safe to skip.

**If replication is completely broken:**

- **Re-establish replication from current position:**
  1. Get current position from primary:
     ```sql
     -- On primary
     SHOW MASTER STATUS;
     ```
  2. Reset and reconfigure replica:
     ```sql
     -- On secondary
     -- Replace <file> and <position> with the File and Position values
     -- reported by SHOW MASTER STATUS on the primary.
     STOP REPLICA;
     CHANGE MASTER TO MASTER_LOG_FILE='<file>', MASTER_LOG_POS=<position>;
     START REPLICA;
     SHOW REPLICA STATUS\G
     ```

**Prevention:**

- Monitor replication status regularly.
- Ensure both servers have sufficient disk space.
- Check network connectivity between primary and secondary servers.
- Review MariaDB error logs: `/var/log/mysql/error.log`.
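
Regular monitoring of replication status can be automated with a small check. The sketch below is illustrative only: the function names `replication_ok` and `check_replica` are hypothetical, and it assumes the `mysql` client on the host can reach the secondary server with credentials from its own configuration (on MariaDB hosts the client may be installed as `mariadb` instead). It accepts both the `Replica_*` and `Slave_*` field names.

```bash
# replication_ok
# Reads `SHOW REPLICA STATUS\G` (or `SHOW SLAVE STATUS\G`) output on stdin.
# Succeeds only if both the IO thread and the SQL thread report "Yes".
replication_ok() {
    awk -F': *' '
        # Vertical output right-aligns field names, so strip leading spaces.
        { gsub(/^ +/, "", $1) }
        $1 == "Replica_IO_Running"  || $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Replica_SQL_Running" || $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# check_replica (hypothetical wrapper)
# Queries the server the mysql client is configured for and prints a verdict.
# Empty output, e.g. when replication is not configured, counts as a failure.
check_replica() {
    if mysql -e 'SHOW REPLICA STATUS\G' | replication_ok; then
        echo "replication OK"
    else
        echo "replication NOT running"
        return 1
    fi
}
```

Because `check_replica` returns a non-zero status on failure, it could be called from a cron job or another scheduler that alerts on failed commands. While replication remains intentionally disabled, the check is expected to report a failure.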