Troubleshooting Guide
Last updated: 2026-01-05
This guide provides solutions to common issues encountered in this Docker-based infrastructure.
Issue: Container is restarting or won't start
Symptoms:
- docker ps shows the container is restarting or exited.
- The docker-compose up -d command fails with an error.
Diagnosis:
- Check the logs: The first step is always to check the container's logs.
    docker-compose logs -f <service-name>
  Look for error messages, stack traces, or any indication of what might be wrong.
- Check dependencies: If the container depends on other services (e.g., a database), ensure those services are running and healthy.
    docker-compose ps
- Check configuration:
  - Environment variables: Ensure all required environment variables are set correctly in the .env file or docker-compose.yml.
  - Volumes: Verify that all volume paths are correct and that the files and directories on the host have the correct permissions. The user running the Docker container (often specified with PUID and PGID) needs to have read and write access to the volume paths.
  - Ports: Check for port conflicts. If another service on the host is using the same port, the container will fail to start. To check for listening ports, run:
      sudo lsof -i -P -n | grep LISTEN
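The port check above lists every listening socket, which can be long on a busy host. As a minimal sketch, the output can be narrowed to a single port. The function name port_listeners and the port 8080 are hypothetical, chosen for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `lsof -i -P -n` style output on stdin and
# keep only the LISTEN sockets bound to the given port.
port_listeners() {
  local port="$1"
  # lsof prints listening sockets as e.g. "TCP *:8080 (LISTEN)".
  # Requiring ":<port> (LISTEN)" at end of line keeps port 80 from matching 8080.
  grep -E ":${port} \(LISTEN\)\$" || true
}

# Usage (sudo is needed to see sockets owned by other users):
#   sudo lsof -i -P -n | port_listeners 8080
```

If this prints a line, the first column is the command that already holds the port, so the container's port mapping needs to change or that process needs to stop.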
Resolution:
- Once the root cause is identified from the logs or configuration check, address the issue. This may involve:
  - Correcting an environment variable.
  - Fixing file permissions on a volume.
  - Changing a port mapping.
  - Restarting a dependency.
- After applying the fix, try starting the container again:
    docker-compose up -d --force-recreate <service-name>
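When the fix involves volume permissions, it can help to confirm ownership before recreating the container. The sketch below assumes GNU coreutils stat (Linux) and treats ownership as a proxy for read/write access, which is an assumption: group or ACL permissions may also grant access. The function name owned_by, the path, and the 1000:1000 IDs are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if PATH is owned by UID:GID.
# Assumes GNU `stat -c`; BSD/macOS stat uses different flags.
owned_by() {
  local path="$1" uid="$2" gid="$3"
  [ "$(stat -c '%u:%g' "$path")" = "${uid}:${gid}" ]
}

# Usage, with the container's PUID/PGID values (1000:1000 is only an example):
#   owned_by /path/to/volume 1000 1000 || echo "ownership mismatch"
```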
Issue: 502 Bad Gateway from Traefik
Symptoms:
- Accessing a service through its domain (e.g., https://books.3ddbrewery.com) results in a "502 Bad Gateway" error from Traefik.
Diagnosis:
- Check the Traefik dashboard: The Traefik dashboard (if accessible) shows the configured routers, services, and middleware. Look for any errors related to the service in question.
- Check Traefik's logs:
    docker logs traefik
  Look for errors related to the service, such as "no servers found".
- Check the service's logs:
    docker-compose logs -f <service-name>
  The service itself might be crashing or unhealthy.
- Check network connectivity:
  - Ensure the service is connected to the traefik_proxy network in its docker-compose.yml.
  - From the Traefik container, try to ping the service's container. Open a shell in the Traefik container, then run ping from inside it:
      docker exec -it traefik /bin/sh
      ping <container_name>
- Check Traefik labels:
  - Ensure the traefik.http.services.<service-name>.loadbalancer.server.port label in the docker-compose.yml file is set to the port the application listens on inside the container.
  - Verify that all Traefik labels are correctly formatted.
Resolution:
- Service not on traefik_proxy network: Add the service to the traefik_proxy network in its docker-compose.yml.
- Incorrect port: Correct the port in the traefik.http.services.<service-name>.loadbalancer.server.port label.
- Service not running: Troubleshoot the service using the "Container is restarting" guide above.
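Network membership can also be checked from the running container, not only from the compose file. A minimal sketch: docker inspect can print a container's networks as JSON, and a small filter can test for traefik_proxy. The function name on_network is hypothetical, and the plain-text match assumes the network name appears as a key in that JSON.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the JSON printed by
#   docker inspect -f '{{json .NetworkSettings.Networks}}' <container_name>
# on stdin and succeed if the named network appears as a key.
on_network() {
  local network="$1"
  # -F treats the pattern as a fixed string, -q suppresses output.
  grep -qF "\"${network}\":"
}

# Usage:
#   docker inspect -f '{{json .NetworkSettings.Networks}}' <container_name> \
#     | on_network traefik_proxy || echo "not on traefik_proxy"
```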
Issue: 404 Not Found from Traefik
Symptoms:
- Accessing a service through its domain results in a "404 Not Found" error.
Diagnosis:
- Check the Traefik dashboard: Verify that a router has been created for the domain you are trying to access.
- Check the rule label: Ensure the traefik.http.routers.<service-name>.rule label is set to the correct Host(...).
- Check DNS: Make sure your DNS is correctly pointing the domain to the IP address of the Traefik server.
Resolution:
- Incorrect rule: Correct the Host(...) rule in the docker-compose.yml file.
- DNS issue: Correct the DNS record for the domain.
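To separate a DNS problem from a router problem, one option is to request the domain with curl while pinning it to the Traefik server's IP, so the request bypasses DNS. The sketch below is illustrative only: next_step_for_status is a hypothetical helper that maps the status code back to the checks described in this guide, and <traefik-ip> is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an HTTP status code to the checks in this guide.
next_step_for_status() {
  case "$1" in
    502) echo "502: check the service's network, port label, and logs" ;;
    404) echo "404: check the router rule label and DNS" ;;
    *)   echo "$1: not covered by the Traefik sections of this guide" ;;
  esac
}

# Usage sketch. --resolve pins the domain to <traefik-ip> for this request,
# and -k skips certificate verification.
#   code="$(curl -sk -o /dev/null -w '%{http_code}' \
#     --resolve books.3ddbrewery.com:443:<traefik-ip> https://books.3ddbrewery.com/)"
#   next_step_for_status "$code"
```

If the pinned request succeeds but a normal request returns 404, the DNS record is likely pointing somewhere other than the Traefik server. If both return 404, the router rule is the more likely cause.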
Issue: Authentication Failures
Symptoms:
- Being unable to log in to a service that is protected by Authelia.
- Seeing "Unauthorized" or "Forbidden" errors.
Diagnosis:
- Check Authelia's logs:
    docker logs authelia
  Look for any errors related to the authentication attempt.
- Check the application's logs: The application itself might be rejecting the authentication.
    docker-compose logs -f <service-name>
  In the case of books_webv2, check the backend logs for any errors related to the Remote-User header.
- Check the Traefik middleware: Ensure the traefik.http.routers.<service-name>.middlewares label is correctly set to authelia-brewery or authelia-fails.
Resolution:
- Restart Authelia: Sometimes, simply restarting Authelia can resolve issues.
    docker restart authelia
- Check user credentials: Double-check the username and password.
- Check Authelia configuration: Review Authelia's configuration.yml for any errors.
Issue: MariaDB/MySQL Replication Stopped
⚠️ CURRENT STATUS: As of January 2026, node database replication has been intentionally disabled. All applications connect directly to the primary server (192.168.1.251). This section is retained for reference if replication is re-enabled in the future.
Symptoms:
- Secondary database server shows Replica_IO_Running or Replica_SQL_Running as No.
- Seconds_Behind_Source is not 0 or shows a large number.
- Applications using the secondary database have stale data.
Diagnosis:
- Check replication status on secondary server: Connect to the secondary database server using phpMyAdmin or MySQL client and run:
    SHOW REPLICA STATUS\G
  Or for older versions:
    SHOW SLAVE STATUS\G
- Check key fields:
  - Replica_IO_Running: Should be Yes
  - Replica_SQL_Running: Should be Yes
  - Seconds_Behind_Source: Should be 0
  - Last_Error: Should be empty. If there's an error here, it will indicate what went wrong.
- Check primary server status:
    SHOW MASTER STATUS;
  Note the File and Position values.
- Check binary log settings: Ensure binary logging is enabled on the primary server:
    SHOW VARIABLES LIKE 'log_bin';
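The two running-state fields above can be checked from a script as well as by eye. A minimal sketch follows, assuming the status output is in the vertical \G format. The function name replication_ok is hypothetical. It accepts both the Replica_ and Slave_ field prefixes, since the field names differ between server versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `SHOW REPLICA STATUS\G` (or SHOW SLAVE STATUS\G)
# output on stdin and succeed only if both replication threads report Yes.
replication_ok() {
  awk -F': ' '
    { gsub(/^[ \t]+/, "", $1) }                       # strip the \G left padding
    $1 ~ /^(Replica|Slave)_IO_Running$/  { io  = $2 }
    $1 ~ /^(Replica|Slave)_SQL_Running$/ { sql = $2 }
    END { exit !(io == "Yes" && sql == "Yes") }       # exit 0 only when both are Yes
  '
}

# Usage (host and user are placeholders):
#   mysql -h <secondary-host> -u <user> -p -e 'SHOW REPLICA STATUS\G' \
#     | replication_ok || echo "replication is not running"
```

This only covers the thread state. Seconds_Behind_Source and Last_Error still need to be read from the full output.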
Resolution:
Common Fix - Restart Replication:
-- On secondary server
STOP REPLICA;
START REPLICA;
SHOW REPLICA STATUS\G
If there's a specific error:
- Skip one transaction (if error is known to be safe):
  ⚠️ Warning: Only use this if you understand the error and know it's safe to skip.
    STOP REPLICA;
    SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
    START REPLICA;
If replication is completely broken:
- Re-establish replication from current position:
  - Get current position from primary:
      -- On primary
      SHOW MASTER STATUS;
  - Reset and reconfigure replica:
      -- On secondary
      STOP REPLICA;
      CHANGE MASTER TO
        MASTER_LOG_FILE='<file from primary>',
        MASTER_LOG_POS=<position from primary>;
      START REPLICA;
      SHOW REPLICA STATUS\G
Prevention:
- Monitor replication status regularly
- Ensure both servers have sufficient disk space
- Check network connectivity between primary and secondary servers
- Review MariaDB error logs: /var/log/mysql/error.log