dependency failed to start: container for service "web" is unhealthy #67

clone3448 · 2023-11-07T16:09:55Z

Good day, I tried to deploy the production docker compose setup in Container Manager on my Synology DS923+ but got the error: dependency failed to start: container for service "web" is unhealthy.
I have altered the compose provided on github a bit to this (mainly the volumes):
```yaml
services:
  web:
    image: wger/server:latest
    container_name: wger_server
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_healthy
    env_file:
      - /volume1/docker/wger/config/prod.env
    volumes:
      - static:/home/wger/static
      - media:/home/wger/media
    expose:
      - 8000
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://localhost:8000
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  nginx:
    image: nginx:stable
    container_name: wger_nginx
    depends_on:
      - web
    volumes:
      - /volume1/docker/wger/config/nginx.conf:/etc/nginx/conf.d/default.conf
      - static:/wger/static:ro
      - media:/wger/media:ro
    ports:
      - "8001:80"
    healthcheck:
      test: service nginx status
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: wger_db
    environment:
      - POSTGRES_USER=wger
      - POSTGRES_PASSWORD=wger
      - POSTGRES_DB=wger
    volumes:
      - postgres-data:/var/lib/postgresql/data/
    expose:
      - 5432
    healthcheck:
      test: pg_isready -U wger
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  cache:
    image: redis
    container_name: wger_cache
    expose:
      - 6379
    volumes:
      - redis-data:/data
    healthcheck:
      test: redis-cli ping
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  celery_worker:
    image: wger/server:latest
    container_name: wger_celery_worker
    command: /start-worker
    env_file:
      - /volume1/docker/wger/config/prod.env
    volumes:
      - media:/home/wger/media
    depends_on:
      web:
        condition: service_healthy
    healthcheck:
      test: celery -A wger inspect ping
      interval: 10s
      timeout: 5s
      retries: 5

  celery_beat:
    image: wger/server:latest
    container_name: wger_celery_beat
    command: /start-beat
    volumes:
      - celery-beat:/home/wger/beat/
    env_file:
      - /volume1/docker/wger/config/prod.env
    depends_on:
      celery_worker:
        condition: service_healthy

volumes:
  postgres-data:
  celery-beat:
  static:
  media:
  redis-data:

networks:
  default:
    name: wger_network
```
Furthermore, the nginx.conf is unaltered, and in the prod.env I only changed SECRET_KEY and SIGNING_KEY.

I can access the website, which looks like this:

The wger_server docker container seems to be not working correctly, looking in the log I see the following:

After thousands of items being deleted, I get this:

The other docker containers seem to not have a lot of issues in the log, except wger_celery_worker

The console terminal of the entire stack looks like this:

What is going wrong in my configurations and how can I deal with it? First time I am using databases in a docker compose file.

bbkz · 2023-11-07T18:01:56Z

I don't know the docker-compose setup. But starting up the wger container takes a long time, especially on lower-end hardware. As I'm running it on Raspberry Pis and similar, I had to do some tweaks.

For gunicorn not to run into a timeout, you may need to add the following environment variable:

GUNICORN_CMD_ARGS="--timeout 240 --workers=2"

Another idea would be to disable the healthchecks. I don't know how docker compose behaves, but Kubernetes will otherwise kill the container and start it again (in a loop).
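
For a docker compose setup, both suggestions could be sketched roughly as follows. This is only an illustration, assuming the `web` service from the compose file posted above; the variable could just as well go into prod.env. `healthcheck: disable: true` is the Compose option for switching off a service's healthcheck. Services that depend on `web` with `condition: service_healthy` would probably need that condition changed too, since `web` would no longer report a health status.

```yaml
# Sketch only: a fragment of the web service, not a complete compose file.
services:
  web:
    image: wger/server:latest
    environment:
      # Longer gunicorn worker timeout and two workers
      # (values taken from the comment above; tune for your hardware).
      GUNICORN_CMD_ARGS: "--timeout 240 --workers=2"
    healthcheck:
      # Turns off the healthcheck for this service.
      disable: true
```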

rolandgeider · 2023-11-07T18:44:08Z

Hi! Do you get some error in the logs when opening the application? (in the web service) I just started a new instance with the default compose and conf file and everything booted up nicely:

NAME                 IMAGE                COMMAND                  SERVICE         CREATED              STATUS                        PORTS
wger_cache           redis                "docker-entrypoint.s…"   cache           About a minute ago   Up About a minute (healthy)   0.0.0.0:6379->6379/tcp
wger_celery_beat     wger/server:latest   "/start-beat"            celery_beat     About a minute ago   Up About a minute             8000/tcp
wger_celery_flower   wger/server:latest   "/start-flower"          celery_flower   About a minute ago   Up About a minute (healthy)   0.0.0.0:5555->5555/tcp, 8000/tcp
wger_celery_worker   wger/server:latest   "/start-worker"          celery_worker   About a minute ago   Up About a minute (healthy)   8000/tcp
wger_db              postgres:15-alpine   "docker-entrypoint.s…"   db              About a minute ago   Up About a minute (healthy)   0.0.0.0:5432->5432/tcp
wger_nginx           nginx:stable         "/docker-entrypoint.…"   nginx           About a minute ago   Up About a minute (healthy)   0.0.0.0:80->80/tcp, 0.0.0.0:8080->80/tcp
wger_server          wger/server:latest   "/home/wger/entrypoi…"   web             About a minute ago   Up About a minute (healthy)   8000/tcp

Somebody else had the problem that the application tried to set up the database before it was ready, so some things were missing. What helped them was to drop the db volume, start the db service manually first and then all the rest (this is only needed for the first initial run, later it's not important).

clone3448 · 2023-11-08T14:41:12Z

First of all, thank you for responding.
@rolandgeider When I open the application, I do not see any new logs after the following ones, which appeared when I rebuilt the stack (no change):

When I deleted the volume entry of the db service in the compose file and started the db service manually, I had the same issue.
Do you think I should disable the healthchecks under wger_server as proposed by bbkz? When I did, I still had the same issue. However, I was then thinking about celery_worker and celery_beat: they do not start because of this healthcheck dependency.

@bbkz I don't think the problem is caused by lower-end hardware. However, I tried adding the env entry GUNICORN_CMD_ARGS="--timeout 240 --workers=2" to the prod.env file, but it made no difference to the result.

clone3448 · 2023-11-08T15:11:53Z

When I removed the healthcheck dependency for the celery_worker, I produced a log for that container, maybe this might help troubleshooting:

But okay, I then restored the first compose file and enabled debug mode in the prod.env with DJANGO_DEBUG=True.
The webpage now shows the following:

```
Environment:

Request Method: GET
Request URL: http://workout.XXXXXXX.com/en/software/terms-of-service

Django Version: 4.1.9
Python Version: 3.10.6
Installed Applications:
('django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.messages',
 'django.contrib.sessions',
 'django.contrib.sites',
 'django.contrib.staticfiles',
 'django_extensions',
 'storages',
 'wger.config',
 'wger.core',
 'wger.mailer',
 'wger.exercises',
 'wger.gym',
 'wger.manager',
 'wger.nutrition',
 'wger.software',
 'wger.utils',
 'wger.weight',
 'wger.gallery',
 'wger.measurements',
 'captcha',
 'django.contrib.sitemaps',
 'easy_thumbnails',
 'compressor',
 'crispy_forms',
 'crispy_bootstrap5',
 'rest_framework',
 'rest_framework.authtoken',
 'django_filters',
 'rest_framework_simplejwt',
 'drf_spectacular',
 'drf_spectacular_sidecar',
 'django_bootstrap_breadcrumbs',
 'corsheaders',
 'axes',
 'simple_history',
 'django_email_verification',
 'actstream',
 'fontawesomefree')
Installed Middleware:
('corsheaders.middleware.CorsMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'wger.utils.middleware.JavascriptAJAXRedirectionMiddleware',
 'wger.utils.middleware.WgerAuthenticationMiddleware',
 'wger.utils.middleware.RobotsExclusionMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'django.middleware.locale.LocaleMiddleware',
 'simple_history.middleware.HistoryRequestMiddleware',
 'axes.middleware.AxesMiddleware')

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/django/core/handlers/exception.py", line 56, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.10/dist-packages/django/core/handlers/base.py", line 220, in _get_response
    response = response.render()
  File "/usr/local/lib/python3.10/dist-packages/django/template/response.py", line 114, in render
    self.content = self.rendered_content
  File "/usr/local/lib/python3.10/dist-packages/django/template/response.py", line 92, in rendered_content
    return template.render(context, self._request)
  File "/usr/local/lib/python3.10/dist-packages/django/template/backends/django.py", line 61, in render
    return self.template.render(context)
  File "/usr/local/lib/python3.10/dist-packages/django/template/base.py", line 173, in render
    with context.bind_template(self):
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.10/dist-packages/django/template/context.py", line 254, in bind_template
    updates.update(processor(self.request))
  File "/home/wger/src/wger/utils/context_processor.py", line 85, in processor
    get_custom_header(request),
  File "/home/wger/src/wger/utils/context_processor.py", line 126, in get_custom_header
    global_gymconfig = GymConfig.objects.get(pk=1)
  File "/usr/local/lib/python3.10/dist-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/django/db/models/query.py", line 650, in get
    raise self.model.DoesNotExist(

Exception Type: DoesNotExist at /en/software/terms-of-service
Exception Value: GymConfig matching query does not exist.
```
The logs of the wger_server container talk about an internal server error:

rolandgeider · 2023-11-08T17:56:46Z

Yes, the GymConfig error definitely means that the database wasn't initialised properly.

Sorry, I didn't mean that you should remove the volume from the compose file, just that you delete the volume itself and start the db service manually first, like this:

docker compose down
docker volume rm docker_postgres-data
docker compose up db -d # wait some seconds
docker compose up

(also you should get a medal for all the logs you provide!)

clone3448 · 2023-11-08T18:33:38Z

Thank you! I expected that providing as much as possible would be the best way to troubleshoot :)
When you talked about not deleting the volume from the compose file and just deleting the volume itself, I went looking for where the files were actually stored; they were not stored anywhere I could find. So I changed the volume path again to a folder that I have now created, because the folder did not exist at first. I changed the compose file to the following:
```yaml
services:
  web:
    image: wger/server:latest
    container_name: wger_server
    depends_on:
      db:
        condition: service_healthy
      cache:
        condition: service_healthy
    env_file:
      - /volume1/docker/wger/config/prod.env
    volumes:
      - static:/home/wger/static
      - media:/home/wger/media
    expose:
      - 8000
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://localhost:8000
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  nginx:
    image: nginx:stable
    container_name: wger_nginx
    depends_on:
      - web
    volumes:
      - /volume1/docker/wger/config/nginx.conf:/etc/nginx/conf.d/default.conf
      - static:/wger/static:ro
      - media:/wger/media:ro
    ports:
      - "8001:80"
    healthcheck:
      test: service nginx status
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  db:
    image: postgres:15-alpine
    container_name: wger_db
    environment:
      - POSTGRES_USER=wger
      - POSTGRES_PASSWORD=wger
      - POSTGRES_DB=wger
    volumes:
      - /volume1/docker/wger/postgres-data:/var/lib/postgresql/data/
    expose:
      - 5432
    healthcheck:
      test: pg_isready -U wger
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  cache:
    image: redis
    container_name: wger_cache
    expose:
      - 6379
    volumes:
      - redis-data:/data
    healthcheck:
      test: redis-cli ping
      interval: 10s
      timeout: 5s
      retries: 5
    restart: unless-stopped

  celery_worker:
    image: wger/server:latest
    container_name: wger_celery_worker
    command: /start-worker
    env_file:
      - /volume1/docker/wger/config/prod.env
    volumes:
      - media:/home/wger/media
    depends_on:
      web:
        condition: service_healthy
    healthcheck:
      test: celery -A wger inspect ping
      interval: 10s
      timeout: 5s
      retries: 5

  celery_beat:
    image: wger/server:latest
    container_name: wger_celery_beat
    command: /start-beat
    volumes:
      - celery-beat:/home/wger/beat/
    env_file:
      - /volume1/docker/wger/config/prod.env
    depends_on:
      celery_worker:
        condition: service_healthy

volumes:
  postgres-data:
  celery-beat:
  static:
  media:
  redis-data:

networks:
  default:
    name: wger_network
```

Now I can find the db volume, and it holds files and folders! So that is some progress. Now the website looks like this, and I think this is more familiar to you:

I will check whether all features work another time, maybe tonight and update you. However, where should I look to really know it all works according to plan?

rolandgeider · 2023-11-08T18:47:26Z

The volumes are handled by docker and are stored... somewhere, but they solve a lot of problems with things like permissions etc. You can inspect a volume with `docker volume inspect <name>` if you want to know where the actual files are stored. But mapping folders manually should work as well.
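
As a rough side-by-side of the two approaches, using only the paths already posted in this thread (a sketch, not a complete compose file):

```yaml
services:
  db:
    volumes:
      # Named volume, managed by docker (declared under `volumes:` below):
      - postgres-data:/var/lib/postgresql/data/
      # Bind mount alternative: use this line instead of the one above.
      # The host folder should exist before starting the stack.
      # - /volume1/docker/wger/postgres-data:/var/lib/postgresql/data/

volumes:
  postgres-data:
```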

You can download the exercise images with `docker compose exec web python3 manage.py download-exercise-images` and see if they appear (I'm not sure if we fixed the issue with the cache, so they might need some time to show up). If you can see those and the rest seems to work, you should be good to go.

goodnewz · 2024-03-31T22:34:16Z

@clone3448 I had a similar problem. It turns out that the first time wger starts, it does some extra setup that takes a bit more time. If that does not finish within the healthcheck window of 5 retries at 10s intervals, the container fails the healthcheck and is marked unhealthy. Docker provides an option for this situation called `start_period`. All you do is add `start_period: 300s` to the `healthcheck:` section of the web container, and Bob is your uncle.

The first time Wger starts, it does some extra setup things that require a bit more time to finish before the health check calls it quits. This commit adds a reasonable warmup period before it starts to enforce the health checks. It addresses wger-project#67.
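
As a sketch of what that could look like, assuming the `web` healthcheck from the compose file posted earlier in this thread (all other values are copied from there; only `start_period` is new):

```yaml
# Sketch only: a fragment of the web service, not a complete compose file.
services:
  web:
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://localhost:8000
      interval: 10s
      timeout: 5s
      retries: 5
      # Failed checks during the first 300s do not count towards `retries`,
      # so a slow first start is not marked unhealthy.
      start_period: 300s
```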

rolandgeider · 2024-04-01T09:30:37Z

FYI the PR with the start period is merged, hopefully this fixes it

greenbagels · 2024-05-05T13:26:54Z

> @clone3448 I had a similar problem. It turns out the first time wger starts, it does some extra setup things that require a bit more time. If it does not finish within the healthcheck interval of 5*10s, it fails the healthcheck with state unhealthy. Docker provides an option for such a situation called start_period. All you do is add start_period: 300s to the healthcheck: section of the web container, and Bob is your uncle.

Hi, just curious: you mentioned you need to add that start_period option to the web container, but your PR doesn't (it adds it to the nginx container). Is this intentional?

clone3448 changed the title ~~dependency failed to start: container for service "web" in unhealthy~~ dependency failed to start: container for service "web" is unhealthy Nov 7, 2023

goodnewz mentioned this issue Mar 31, 2024

Add reasonable warmup timers when first starting the containers #83

Merged
