this post was submitted on 24 Oct 2024

I generally let my server do its thing, but I consistently run into an issue when I install system updates and then reboot: some Docker containers come online, while others need to be started manually. All containers were running before the system shut down.

  • My containers are managed with docker compose.
  • Their compose files have restart: always
  • It's not always the same containers that fail to come online
  • Some of them depend on an NFS mount point being ready on the host, but not all

The host is running Ubuntu 24.04 (Noble).

Most of these containers were migrated from my previous server, where this issue never manifested.

Does anyone have ideas for what to look for?

SOLVED

The issue was that Docker was starting before my NFS mount point was ready, and the containers which depended on it were failing to start.

Symptoms: journalctl -b0 -u docker showed the following log lines (-b0 means to limit logs to the most recent boot):

level=error msg="failed to start container" container=fe98f37d1bc3debb204a52eddd0c9448e8f0562aea533c5dc80d7abbbb969ea3 error="error while creating mount source path '/mnt/nas/REDACTED': mkdir /mnt/nas/REDACTED: operation not permitted"
...
level=warning msg="ShouldRestart failed, container will not be restarted" container=fe98f37d1bc3debb204a52eddd0c9448e8f0562aea533c5dc80d7abbbb969ea3 daemonShuttingDown=true error="restart canceled" execDuration=5m8.349967675s exitStatus="{0 2024-10-29 00:07:32.878574627 +0000 UTC}" hasBeenManuallyStopped=false restartCount=0

I had previously made my mount directory unwritable whenever the NFS share is not mounted, so this lined up with my expectations.

I couldn't remember how systemd names mount points, but the following command helped me find it: systemctl list-units -t mount | grep /mnt/nas

It gave me mnt-nas.mount as the name of the mount unit, so then I just added it to the After= and Requires= lines in my /etc/systemd/system/docker.service file:

[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target docker.socket firewalld.service containerd.service time-set.target mnt-nas.mount
Wants=network-online.target containerd.service
Requires=docker.socket mnt-nas.mount
...
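An alternative that avoids maintaining a full copy of the unit file is a systemd drop-in, which gets merged with the existing docker.service. This is only a sketch: the helper name write_docker_dropin and the file name nfs-mount.conf are my own choices, and mnt-nas.mount is the unit name from this post.

```shell
# Write a systemd drop-in that orders docker.service after a mount unit.
# Usage: write_docker_dropin <drop-in directory> <mount unit>
# (write_docker_dropin and nfs-mount.conf are hypothetical names.)
write_docker_dropin() {
    dropin_dir=$1
    mount_unit=$2
    mkdir -p "$dropin_dir" || return 1
    # After= and Requires= given in a drop-in are added to the existing lists.
    cat > "$dropin_dir/nfs-mount.conf" <<EOF
[Unit]
After=$mount_unit
Requires=$mount_unit
EOF
}

# Example (as root):
#   write_docker_dropin /etc/systemd/system/docker.service.d mnt-nas.mount
#   systemctl daemon-reload
```

If you know the mount path, `systemd-escape --path --suffix=mount /mnt/nas` should also print the unit name without grepping the unit list.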
top 8 comments
[–] [email protected] 15 points 1 month ago

I recently discovered what had been causing this for me for years: IP-specific port bindings. The ports of a few containers were bound only to the LAN IP of the system, but if DHCP hadn't obtained an IP by the time the Docker service started, those containers couldn't be started at all, and Docker in its wisdom won't bother retrying.

The reasons to move my compose stacks to separate systemd services are piling up.
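For the port-binding case, one possible workaround is to hold a stack back until the LAN address actually exists. A rough sketch, assuming the `ip` tool from iproute2 is available; the helper name wait_for_ipv4 and the address 192.168.1.10 are placeholders, not anything from this thread:

```shell
# Return 0 once the given IPv4 address is assigned to any interface,
# or 1 if it has not appeared after <timeout> seconds (default 30).
# wait_for_ipv4 is a hypothetical helper name.
wait_for_ipv4() {
    addr=$1
    remaining=${2:-30}
    while :; do
        # `ip -o -4 addr show` prints one line per address,
        # containing e.g. "inet 192.168.1.10/24".
        if ip -o -4 addr show | grep -qF "inet $addr/"; then
            return 0
        fi
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Example: wait_for_ipv4 192.168.1.10 60 && docker compose up -d
```

Something like this could run from a wrapper script or a per-stack systemd unit before the stack is brought up.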

[–] CameronDev 6 points 1 month ago (1 children)

Can you make your docker service start after the NFS Mount to rule that out?

A restart policy only takes effect after a container starts successfully. In this case, starting successfully means that the container is up for at least 10 seconds and Docker has started monitoring it. This prevents a container which doesn't start at all from going into a restart loop.

https://docs.docker.com/engine/containers/start-containers-automatically/#restart-policy-details

If your containers are crashing before the 10-second threshold, then they won't restart.

[–] [email protected] 2 points 3 weeks ago (1 children)

Yep, the problem was that docker started before the NFS mount. Adding the dependency to my systemd docker unit did the trick!

[–] CameronDev 1 points 3 weeks ago

Excellent, thanks for the update!

[–] [email protected] 4 points 1 month ago

I hate to be that guy, but uh, what do the logs say?

The container logs would probably be most useful since you should (probably) be able to tell if they're having issues starting and/or simply not attempting to launch at all.
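A quick way to collect those logs after a reboot might look like the sketch below. The helper name show_stopped_logs is made up; the `docker ps` and `docker logs` flags are standard. Note that in the original post the decisive error showed up in the daemon log (journalctl -b0 -u docker) rather than in the container logs, so it is worth checking both.

```shell
# Print the last log lines of every container in the "exited" state.
# show_stopped_logs is a hypothetical helper name.
show_stopped_logs() {
    lines=${1:-20}
    docker ps -a --filter status=exited --format '{{.Names}}' |
    while read -r name; do
        echo "== $name =="
        docker logs --tail "$lines" "$name" 2>&1
    done
}

# Example: show_stopped_logs 50
```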

[–] [email protected] 1 points 1 month ago

Put that mount point into the compose file(s). You can define volumes with type nfs and basically have Docker-Compose manage the mounts.
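As a sketch of what that looks like from the command line, the local volume driver accepts NFS options. In a compose file the same three options would go under driver_opts of a top-level volume. The helper name, server address, export path and volume name below are all placeholders:

```shell
# Create a named volume backed by an NFS export.
# create_nfs_volume is a hypothetical helper; arguments are
# <volume name> <NFS server address> <exported path>.
create_nfs_volume() {
    name=$1
    server=$2
    export_path=$3
    docker volume create --driver local \
        --opt type=nfs \
        --opt "o=addr=$server,rw" \
        --opt "device=:$export_path" \
        "$name"
}

# Example: create_nfs_volume nas_media 192.168.1.5 /export/media
```

As I understand it, Docker then performs the NFS mount itself when a container using the volume starts, so the host mount point is no longer part of the boot ordering.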

[–] [email protected] 1 points 1 month ago

Add a healthcheck to test the mount directory. If it's deemed unhealthy, the container should restart until it passes. If your NFS mount isn't automounting properly, you should fix that though.
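One caveat when writing such a check: inside the container a bind-mounted directory exists whether or not the NFS share was mounted on the host, so testing for the directory alone may not tell you much. A possible approach is to look for a marker file that lives only on the share. This is a sketch under that assumption; the helper name and the .nfs-mounted file name are invented for illustration.

```shell
# Succeed only if the marker file is visible in the given directory.
# check_nfs_marker and .nfs-mounted are hypothetical names; the marker has to
# be created once on the NFS share itself, not in the local mount directory.
check_nfs_marker() {
    [ -f "$1/.nfs-mounted" ]
}

# The equivalent test for a compose healthcheck would be:
#   test -f /data/.nfs-mounted
```

Also, as far as I know, a standalone Docker engine only marks a container as unhealthy and does not restart it by itself, so the check may need to be paired with something that acts on the health status.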

[–] [email protected] -1 points 2 weeks ago

docker container update --restart always [container name]