Persist database in docker image #18

Open
hallo02 opened this issue Apr 30, 2019 · 1 comment


hallo02 commented Apr 30, 2019

Hey there

What would be the drawbacks of processing the desired pbf file and loading it into Postgres during the image build?
With that approach, docker-entrypoint.sh wouldn't need to distinguish between a "CREATE" and a "RESTORE" mode; it would just start PostgreSQL. The original approach of restoring data using GKE would become unnecessary.

Obviously the image size could be enormous.

I tried the approach successfully with https://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf (2.2MB).
docker-entrypoint.sh was reduced to essentially

#!/bin/bash
# Start PostgreSQL
service postgresql start

# Tail Apache logs
# tail -f /var/log/apache2/* &

# Run Apache in the foreground
/usr/sbin/apache2ctl -D FOREGROUND

The Dockerfile was extended as follows:

# Download the pbf extract and import it into the database
ARG NOMINATIM_PBF_URL

# Each "check || create" pair is wrapped in parentheses so that a failure
# earlier in the && chain cannot fall through to the || branch.
# Note: useradd -p expects an already-hashed password, not plain text.
RUN NOMINATIM_DATA_PATH=${NOMINATIM_DATA_PATH:="/srv/nominatim/data"} \
    && NOMINATIM_DATA_LABEL=${NOMINATIM_DATA_LABEL:="data"} \
    && NOMINATIM_PBF_URL=${NOMINATIM_PBF_URL:="http://download.geofabrik.de/europe/switzerland-latest.osm.pbf"} \
    && NOMINATIM_POSTGRESQL_DATA_PATH=${NOMINATIM_POSTGRESQL_DATA_PATH:="/var/lib/postgresql/9.3/main"} \
    && curl -L "$NOMINATIM_PBF_URL" --create-dirs -o "$NOMINATIM_DATA_PATH/$NOMINATIM_DATA_LABEL.osm.pbf" \
    && chmod 755 "$NOMINATIM_DATA_PATH" \
    && service postgresql start \
    && (sudo -u postgres psql postgres -tAc "SELECT 1 FROM pg_roles WHERE rolname='nominatim'" | grep -q 1 || sudo -u postgres createuser -s nominatim) \
    && (sudo -u postgres psql postgres -tAc "SELECT 1 FROM pg_roles WHERE rolname='www-data'" | grep -q 1 || sudo -u postgres createuser -SDR www-data) \
    && sudo -u postgres psql postgres -c "DROP DATABASE IF EXISTS nominatim" \
    && useradd -m -p password1234 nominatim \
    && sudo -u nominatim /srv/nominatim/build/utils/setup.php --osm-file "$NOMINATIM_DATA_PATH/$NOMINATIM_DATA_LABEL.osm.pbf" --all --threads 2

For a second attempt I used https://download.geofabrik.de/europe/switzerland-latest.osm.pbf (295MB).
Building the image took around 90 minutes, but I ran into the following problem: moby/moby#22610.
It seems the host's disk space filled up during container startup until Kubernetes killed the pod. I will provide more information about the cause in the next few days. The space is not consumed by data inside the container. Maybe, as the issue points out, it is caused by accumulated container logs.

Thank you for your thoughts.

@peter-evans
Owner

Hi @hallo02

What would be the drawbacks if the desired pbf file would be processed and loaded into postgres during the image build?

I guess the main drawbacks are that you need to create a new image every time you want to update the data, and that the image would be enormous, as you pointed out. However, if it works for your use case, go for it.

What might help in your case is a multi-stage build: load the data into Postgres in a builder stage (FROM ... AS builder) and then copy only the Postgres data files into the final image. That way, accumulated logs and dependencies that you don't need when the container is running might be left behind in the builder stage instead of ending up in the final image.
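
The builder-stage idea could be sketched roughly as the multi-stage Dockerfile below. It is only a sketch under stated assumptions and has not been tested against this repository: `nominatim-base` is a hypothetical image name standing in for an image that already contains PostgreSQL 9.3 and the Nominatim build under /srv/nominatim/build, and the data directory path is the one from the Dockerfile snippet in the issue. The role checks from the original snippet are omitted on the assumption that the base image has no `nominatim` or `www-data` roles yet.

```dockerfile
# Assumption: "nominatim-base" is a hypothetical image with PostgreSQL 9.3
# and the Nominatim build already installed.

# Stage 1: download the extract and run the import.
FROM nominatim-base AS builder
ARG NOMINATIM_PBF_URL=https://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf
RUN curl -L "$NOMINATIM_PBF_URL" --create-dirs -o /srv/nominatim/data/data.osm.pbf \
    && service postgresql start \
    && sudo -u postgres createuser -s nominatim \
    && sudo -u postgres createuser -SDR www-data \
    && useradd -m nominatim \
    && sudo -u nominatim /srv/nominatim/build/utils/setup.php \
         --osm-file /srv/nominatim/data/data.osm.pbf --all --threads 2 \
    && service postgresql stop

# Stage 2: start again from the clean base and copy only the database files.
FROM nominatim-base
# Empty the freshly initialised cluster so no stale files are mixed in.
RUN rm -rf /var/lib/postgresql/9.3/main/*
COPY --from=builder --chown=postgres:postgres \
     /var/lib/postgresql/9.3/main /var/lib/postgresql/9.3/main
# PostgreSQL refuses to start unless the data directory is private.
RUN chmod 700 /var/lib/postgresql/9.3/main
COPY docker-entrypoint.sh /docker-entrypoint.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
```

Stopping PostgreSQL at the end of the builder's RUN step is there so the data files are copied after a clean shutdown. The downloaded pbf file and anything else written during the import outside the data directory stay in the builder stage, so they should not add to the final image size; the imported database itself still does.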
