Graylog fails to run after updating docker image from 4.3.3 to 4.3.4 #217

Open
MahdiGhiasi opened this issue Aug 4, 2022 · 25 comments

@MahdiGhiasi

MahdiGhiasi commented Aug 4, 2022

We've updated our Graylog docker image to the latest version pushed to graylog/graylog:4.3 (which is 4.3.4), and since then the Graylog container fails to start.

Inspecting docker logs shows that this seems to be the issue:

graylog_1        | adding environment opts
graylog_1        | mkdir: cannot create directory ‘/data’: Permission denied

Our elastic and mongo are running fine, but graylog container fails to start with the error message above.

We have two separate instances of Graylog (not linked, used separately by separate teams), and both have encountered this issue after upgrading. They were both working fine on v4.3.3.


We use a folder-mounted journal folder in our docker-compose file:

 graylog:
    image: graylog/graylog:4.3
    restart: always
    volumes:
      - /root/docker-data/graylog/graylog_journal:/usr/share/graylog/data/journal

Update (by @mpfz0r):

This only happens on Docker versions older than 20.10.10 (https://docs.docker.com/engine/release-notes/#201010), because their default seccomp policy does not allow the clone3 syscall.

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi
could you run your graylog in docker-compose with a debugging entry point for us and give us the output?
entrypoint: "/bin/bash -c 'find /usr/share/graylog -type d -ls'"

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

This seems to be where this happens:
But I don't see why there is a regression between 4.3.3 and 4.3.4

https://github.com/Graylog2/graylog-docker/blob/4.3/docker-entrypoint.sh#L91-L94

setup() {
  # Create data directories
  for d in journal log plugin config contentpacks
  do
    dir=${GRAYLOG_HOME}/data/${d}
    [[ -d "${dir}" ]] || mkdir -p "${dir}"

    if [[ "$(stat --format='%U:%G' $dir)" != 'graylog:graylog' ]] && [[ -w "$dir" ]]; then
      chown -R graylog:graylog "$dir" || echo "Warning can not change owner to graylog:graylog"
    fi
  done
}

@coffee-squirrel

coffee-squirrel commented Aug 4, 2022

We're currently running 2 environments on the 4.3.4 images (1 OSS, 1 Enterprise) and haven't run into this yet.

mkdir: cannot create directory ‘/data’ almost makes it seem like ${GRAYLOG_HOME} isn't set. So far I've only seen that type of message with something like mkdir -p /data/foo (or /data, of course).
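
The effect described above can be reproduced in isolation. The following is a minimal sketch that assumes nothing beyond bash itself: when `GRAYLOG_HOME` is unset or empty, the path built by the entrypoint collapses to `/data/journal`. The `data_dir_strict` variant is purely illustrative (it is not known to be part of the upstream script) and shows how a `${VAR:?}` guard would fail early with a clearer message.

```shell
#!/usr/bin/env bash
# Build the data directory path the same way the entrypoint loop does.
data_dir() {
  local d="$1"
  printf '%s\n' "${GRAYLOG_HOME}/data/${d}"
}

# Hypothetical stricter variant: abort with a clear message
# if GRAYLOG_HOME is unset or empty.
data_dir_strict() {
  local d="$1"
  printf '%s\n' "${GRAYLOG_HOME:?GRAYLOG_HOME is not set}/data/${d}"
}

unset GRAYLOG_HOME
data_dir journal    # prints /data/journal

GRAYLOG_HOME=/usr/share/graylog
data_dir journal    # prints /usr/share/graylog/data/journal
```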

@MahdiGhiasi
Author

@mpfz0r Here's the output for the debugging entry point you requested:

graylog_1        |   1317781      4 drwxr-xr-x   8 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog
graylog_1        |   1317784      4 drwxr-xr-x   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/bin
graylog_1        |   1317797      4 drwxr-xr-x   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/plugin
graylog_1        |   1317805      4 drwxr-xr-x   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/plugins-merged
graylog_1        |    527453      4 drwxr-x---   7 graylog  graylog      4096 Aug  4 17:26 /usr/share/graylog/data
graylog_1        |    527460      4 drwxr-x---   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/data/plugin
graylog_1        |   1703937      4 drwxrwxrwx   3 graylog  graylog      4096 Aug  4 04:20 /usr/share/graylog/data/journal
graylog_1        |   1703938      4 drwxr-xr-x   2 graylog  graylog      4096 Aug  4 04:01 /usr/share/graylog/data/journal/messagejournal-0
graylog_1        |    527457      4 drwxr-x---   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/data/data
graylog_1        |    527459      4 drwxr-x---   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/data/log
graylog_1        |    527454      4 drwxr-x---   2 graylog  graylog      4096 Aug  4 17:26 /usr/share/graylog/data/config
graylog_1        |   1317796      4 drwxr-xr-x   2 graylog  graylog      4096 Aug  3 18:01 /usr/share/graylog/log
graylog_1        |   1317798      4 drwxr-xr-x   2 graylog  graylog      4096 Aug  3 19:02 /usr/share/graylog/plugins-default

@MahdiGhiasi
Author

I've also confirmed again that docker image graylog/graylog:4.3.3 can start properly but graylog/graylog:4.3.4 gives the error I described, in our environment (nothing is changed in our environment other than the docker image tag).

I've rolled back our production Graylog instance to 4.3.3 for now and it runs properly, but I'm happy to run any tests or provide any details that might help find the root cause of this in 4.3.4 in our environment.

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi Thanks.

Could you run entrypoint: "/bin/bash -xv /docker-entrypoint.sh"
for me?
I think @coffee-squirrel might be right. The /data error output is odd.

@MahdiGhiasi
Author

@mpfz0r There you go:

graylog_1        | + grep -q UseConcMarkSweepGC
graylog_1        | + /opt/java/openjdk/bin/java -XX:+PrintFlagsFinal
graylog_1        | adding environment opts
graylog_1        |
graylog_1        | # and add the previous saved settings to our defaults
graylog_1        | if [[ ! -z ${__GRAYLOG_SERVER_JAVA_OPTS} ]]
graylog_1        | then
graylog_1        |   echo "adding environment opts"
graylog_1        |   GRAYLOG_SERVER_JAVA_OPTS="${GRAYLOG_SERVER_JAVA_OPTS} ${__GRAYLOG_SERVER_JAVA_OPTS}"
graylog_1        |   export GRAYLOG_SERVER_JAVA_OPTS
graylog_1        | fi
graylog_1        | + [[ ! -z -Xms256m -Xmx256m ]]
graylog_1        | + echo 'adding environment opts'
graylog_1        | + GRAYLOG_SERVER_JAVA_OPTS='-Xms256m -Xmx256m -Xms256m -Xmx256m'
graylog_1        | + export GRAYLOG_SERVER_JAVA_OPTS
graylog_1        |
graylog_1        | # Convert all environment variables with names ending in __FILE into the content of
graylog_1        | # the file that they point at and use the name without the trailing __FILE.
graylog_1        | # This can be used to carry in Docker secrets.
graylog_1        | for VAR_NAME in $(env | grep '^GRAYLOG_[^=]\+__FILE=.\+' | sed -r 's/^(GRAYLOG_[^=]*)__FILE=.*/\1/g'); do
graylog_1        |   VAR_NAME_FILE="${VAR_NAME}__FILE"
graylog_1        |   if [ "${!VAR_NAME}" ]; then
graylog_1        |     echo >&2 "ERROR: Both ${VAR_NAME} and ${VAR_NAME_FILE} are set but are exclusive"
graylog_1        |     exit 1
graylog_1        |   fi
graylog_1        |   VAR_FILENAME="${!VAR_NAME_FILE}"
graylog_1        |   echo "Getting secret ${VAR_NAME} from ${VAR_FILENAME}"
graylog_1        |   if [ ! -r "${VAR_FILENAME}" ]; then
graylog_1        |     echo >&2 "ERROR: ${VAR_FILENAME} does not exist or is not readable"
graylog_1        |     exit 1
graylog_1        |   fi
graylog_1        |   export "${VAR_NAME}"="$(< "${VAR_FILENAME}")"
graylog_1        |   unset "${VAR_NAME_FILE}"
graylog_1        | done
graylog_1        | ++ env
graylog_1        | ++ sed -r 's/^(GRAYLOG_[^=]*)__FILE=.*/\1/g'
graylog_1        | ++ grep '^GRAYLOG_[^=]\+__FILE=.\+'
graylog_1        |
graylog_1        |
graylog_1        | # Delete outdated PID file
graylog_1        | [[ -e /tmp/graylog.pid ]] && rm --force /tmp/graylog.pid
graylog_1        | + [[ -e /tmp/graylog.pid ]]
graylog_1        |
graylog_1        | # check if we are inside kubernetes, Graylog should be run as statefulset and $POD_NAME env var should be defined like this
graylog_1        | #          env:
graylog_1        | #          - name: POD_NAME
graylog_1        | #            valueFrom:
graylog_1        | #              fieldRef:
graylog_1        | #                fieldPath: metadata.name
graylog_1        | # First stateful member is having pod name ended with -0, so
graylog_1        | if [[ ! -z "${POD_NAME}" ]]
graylog_1        | then
graylog_1        |  if echo "${POD_NAME}" | grep "\\-0$" >/dev/null
graylog_1        |  then
graylog_1        |    export GRAYLOG_IS_MASTER="true"
graylog_1        |  else
graylog_1        |    export GRAYLOG_IS_MASTER="false"
graylog_1        |  fi
graylog_1        | fi
graylog_1        | + [[ ! -z '' ]]
graylog_1        |
graylog_1        | # check if we are inside a nomad cluster
graylog_1        | # First member is having alloc-index 0, so
graylog_1        | if [[ ! -z "${NOMAD_ALLOC_INDEX}" ]]; then
graylog_1        |   if [ ${NOMAD_ALLOC_INDEX} == 0 ]; then
graylog_1        |     export GRAYLOG_IS_MASTER="true"
graylog_1        |   else
graylog_1        |     export GRAYLOG_IS_MASTER="false"
graylog_1        |   fi
graylog_1        | fi
graylog_1        | + [[ ! -z '' ]]
graylog_1        |
graylog_1        | # Merge plugin dirs to allow mounting of /plugin as a volume
graylog_1        | export GRAYLOG_PLUGIN_DIR=/usr/share/graylog/plugins-merged
graylog_1        | + export GRAYLOG_PLUGIN_DIR=/usr/share/graylog/plugins-merged
graylog_1        | + GRAYLOG_PLUGIN_DIR=/usr/share/graylog/plugins-merged
graylog_1        | rm -f /usr/share/graylog/plugins-merged/*
graylog_1        | + rm -f /usr/share/graylog/plugins-merged/graylog-plugin-aws-4.3.4.jar /usr/share/graylog/plugins-merged/graylog-plugin-collector-4.3.4.jar /usr/share/graylog/plugins-merged/graylog-plugin-integrations-4.3.4.jar /usr/share/graylog/plugins-merged/graylog-plugin-threatintel-4.3.4.jar /usr/share/graylog/plugins-merged/graylog-storage-elasticsearch6-4.3.4.jar /usr/share/graylog/plugins-merged/graylog-storage-elasticsearch7-4.3.4.jar
graylog_1        | find /usr/share/graylog/plugins-default/ -type f -exec cp {} /usr/share/graylog/plugins-merged/ \;
graylog_1        | + find /usr/share/graylog/plugins-default/ -type f -exec cp '{}' /usr/share/graylog/plugins-merged/ ';'
graylog_1        | find /usr/share/graylog/plugin/ -type f -exec cp {} /usr/share/graylog/plugins-merged/ \;
graylog_1        | + find /usr/share/graylog/plugin/ -type f -exec cp '{}' /usr/share/graylog/plugins-merged/ ';'
graylog_1        |
graylog_1        |
graylog_1        | setup() {
graylog_1        |   # Create data directories
graylog_1        |   for d in journal log plugin config contentpacks
graylog_1        |   do
graylog_1        |     dir=${GRAYLOG_HOME}/data/${d}
graylog_1        |     [[ -d "${dir}" ]] || mkdir -p "${dir}"
graylog_1        |
graylog_1        |     if [[ "$(stat --format='%U:%G' $dir)" != 'graylog:graylog' ]] && [[ -w "$dir" ]]; then
graylog_1        |       chown -R graylog:graylog "$dir" || echo "Warning can not change owner to graylog:graylog"
graylog_1        |     fi
graylog_1        |   done
graylog_1        | }
graylog_1        |
graylog_1        | graylog() {
graylog_1        |
graylog_1        |   exec "${JAVA_HOME}/bin/java" \
graylog_1        |     ${GRAYLOG_SERVER_JAVA_OPTS} \
graylog_1        |     -jar \
graylog_1        |     -Dlog4j.configurationFile="${GRAYLOG_HOME}/data/config/log4j2.xml" \
graylog_1        |     -Djava.library.path="${GRAYLOG_HOME}/lib/sigar/" \
graylog_1        |     -Dgraylog2.installation_source=docker \
graylog_1        |     "${GRAYLOG_HOME}/graylog.jar" \
graylog_1        |     "$@" \
graylog_1        |     -f "${GRAYLOG_HOME}/data/config/graylog.conf"
graylog_1        | }
graylog_1        |
graylog_1        | run() {
graylog_1        |   setup
graylog_1        |
graylog_1        |   # if being called without an argument assume "server" for backwards compatibility
graylog_1        |   if [ $# = 0 ]; then
graylog_1        |     graylog server "$@"
graylog_1        |   fi
graylog_1        |
graylog_1        |   graylog "$@"
graylog_1        | }
graylog_1        |
graylog_1        | run "$@"
graylog_1        | + run
graylog_1        | + setup
graylog_1        | + for d in journal log plugin config contentpacks
graylog_1        | + dir=/data/journal
graylog_1        | + [[ -d /data/journal ]]
graylog_1        | + mkdir -p /data/journal
graylog_1        | mkdir: cannot create directory ‘/data’: Permission denied
graylog_graylog_1 exited with code 1

It seems that @coffee-squirrel is right, the last few lines seem to indicate that ${GRAYLOG_HOME} is not being set properly.

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi Yeah, looks like it. But your output is truncated. It is missing the part where /etc/profile is sourced, which is where GRAYLOG_HOME should be set.

@MahdiGhiasi
Author

@mpfz0r Oh, sorry. Here's the complete output:

#!/bin/bash

set -e
+ set -e

# save the settings over the docker(-compose) environment
__GRAYLOG_SERVER_JAVA_OPTS=${GRAYLOG_SERVER_JAVA_OPTS}
+ __GRAYLOG_SERVER_JAVA_OPTS='-Xms256m -Xmx256m'

# shellcheck disable=SC1091
source /etc/profile
+ source /etc/profile
# /etc/profile: system-wide .profile file for the Bourne shell (sh(1))
# and Bourne compatible shells (bash(1), ksh(1), ash(1), ...).

if [ "${PS1-}" ]; then
  if [ "${BASH-}" ] && [ "$BASH" != "/bin/sh" ]; then
    # The file bash.bashrc already sets the default PS1.
    # PS1='\h:\w\$ '
    if [ -f /etc/bash.bashrc ]; then
      . /etc/bash.bashrc
    fi
  else
    if [ "$(id -u)" -eq 0 ]; then
      PS1='# '
    else
      PS1='$ '
    fi
  fi
fi
++ '[' '' ']'

if [ -d /etc/profile.d ]; then
  for i in /etc/profile.d/*.sh; do
    if [ -r $i ]; then
      . $i
    fi
  done
  unset i
fi
++ '[' -d /etc/profile.d ']'
++ for i in /etc/profile.d/*.sh
++ '[' -r /etc/profile.d/01-locale-fix.sh ']'
++ for i in /etc/profile.d/*.sh
++ '[' -r /etc/profile.d/graylog.sh ']'
++ unset i

#Set default GC
if [[ -z ${GRAYLOG_DOCKER_DISABLE_CMS_GC} ]]
then
  if "${JAVA_HOME}/bin/java" -XX:+PrintFlagsFinal 2>&1 |grep -q UseParNewGC; then
    GRAYLOG_SERVER_JAVA_OPTS="${GRAYLOG_SERVER_JAVA_OPTS} -XX:+UseParNewGC"
    export GRAYLOG_SERVER_JAVA_OPTS
  fi
  if "${JAVA_HOME}/bin/java" -XX:+PrintFlagsFinal 2>&1 |grep -q UseConcMarkSweepGC; then
    GRAYLOG_SERVER_JAVA_OPTS="${GRAYLOG_SERVER_JAVA_OPTS} -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled"
    export GRAYLOG_SERVER_JAVA_OPTS
  fi
fi
+ [[ -z '' ]]
+ /opt/java/openjdk/bin/java -XX:+PrintFlagsFinal
+ grep -q UseParNewGC
+ /opt/java/openjdk/bin/java -XX:+PrintFlagsFinal
+ grep -q UseConcMarkSweepGC

# and add the previous saved settings to our defaults
if [[ ! -z ${__GRAYLOG_SERVER_JAVA_OPTS} ]]
then
  echo "adding environment opts"
  GRAYLOG_SERVER_JAVA_OPTS="${GRAYLOG_SERVER_JAVA_OPTS} ${__GRAYLOG_SERVER_JAVA_OPTS}"
  export GRAYLOG_SERVER_JAVA_OPTS
fi
+ [[ ! -z -Xms256m -Xmx256m ]]
+ echo 'adding environment opts'
adding environment opts
+ GRAYLOG_SERVER_JAVA_OPTS='-Xms256m -Xmx256m -Xms256m -Xmx256m'
+ export GRAYLOG_SERVER_JAVA_OPTS

# Convert all environment variables with names ending in __FILE into the content of
# the file that they point at and use the name without the trailing __FILE.
# This can be used to carry in Docker secrets.
for VAR_NAME in $(env | grep '^GRAYLOG_[^=]\+__FILE=.\+' | sed -r 's/^(GRAYLOG_[^=]*)__FILE=.*/\1/g'); do
  VAR_NAME_FILE="${VAR_NAME}__FILE"
  if [ "${!VAR_NAME}" ]; then
    echo >&2 "ERROR: Both ${VAR_NAME} and ${VAR_NAME_FILE} are set but are exclusive"
    exit 1
  fi
  VAR_FILENAME="${!VAR_NAME_FILE}"
  echo "Getting secret ${VAR_NAME} from ${VAR_FILENAME}"
  if [ ! -r "${VAR_FILENAME}" ]; then
    echo >&2 "ERROR: ${VAR_FILENAME} does not exist or is not readable"
    exit 1
  fi
  export "${VAR_NAME}"="$(< "${VAR_FILENAME}")"
  unset "${VAR_NAME_FILE}"
done
++ sed -r 's/^(GRAYLOG_[^=]*)__FILE=.*/\1/g'
++ grep '^GRAYLOG_[^=]\+__FILE=.\+'
++ env


# Delete outdated PID file
[[ -e /tmp/graylog.pid ]] && rm --force /tmp/graylog.pid
+ [[ -e /tmp/graylog.pid ]]

# check if we are inside kubernetes, Graylog should be run as statefulset and $POD_NAME env var should be defined like this
#          env:
#          - name: POD_NAME
#            valueFrom:
#              fieldRef:
#                fieldPath: metadata.name
# First stateful member is having pod name ended with -0, so
if [[ ! -z "${POD_NAME}" ]]
then
 if echo "${POD_NAME}" | grep "\\-0$" >/dev/null
 then
   export GRAYLOG_IS_MASTER="true"
 else
   export GRAYLOG_IS_MASTER="false"
 fi
fi
+ [[ ! -z '' ]]

# check if we are inside a nomad cluster
# First member is having alloc-index 0, so
if [[ ! -z "${NOMAD_ALLOC_INDEX}" ]]; then
  if [ ${NOMAD_ALLOC_INDEX} == 0 ]; then
    export GRAYLOG_IS_MASTER="true"
  else
    export GRAYLOG_IS_MASTER="false"
  fi
fi
+ [[ ! -z '' ]]

# Merge plugin dirs to allow mounting of /plugin as a volume
export GRAYLOG_PLUGIN_DIR=/usr/share/graylog/plugins-merged
+ export GRAYLOG_PLUGIN_DIR=/usr/share/graylog/plugins-merged
+ GRAYLOG_PLUGIN_DIR=/usr/share/graylog/plugins-merged
rm -f /usr/share/graylog/plugins-merged/*
+ rm -f '/usr/share/graylog/plugins-merged/*'
find /usr/share/graylog/plugins-default/ -type f -exec cp {} /usr/share/graylog/plugins-merged/ \;
+ find /usr/share/graylog/plugins-default/ -type f -exec cp '{}' /usr/share/graylog/plugins-merged/ ';'
find /usr/share/graylog/plugin/ -type f -exec cp {} /usr/share/graylog/plugins-merged/ \;
+ find /usr/share/graylog/plugin/ -type f -exec cp '{}' /usr/share/graylog/plugins-merged/ ';'


setup() {
  # Create data directories
  for d in journal log plugin config contentpacks
  do
    dir=${GRAYLOG_HOME}/data/${d}
    [[ -d "${dir}" ]] || mkdir -p "${dir}"

    if [[ "$(stat --format='%U:%G' $dir)" != 'graylog:graylog' ]] && [[ -w "$dir" ]]; then
      chown -R graylog:graylog "$dir" || echo "Warning can not change owner to graylog:graylog"
    fi
  done
}

graylog() {

  exec "${JAVA_HOME}/bin/java" \
    ${GRAYLOG_SERVER_JAVA_OPTS} \
    -jar \
    -Dlog4j.configurationFile="${GRAYLOG_HOME}/data/config/log4j2.xml" \
    -Djava.library.path="${GRAYLOG_HOME}/lib/sigar/" \
    -Dgraylog2.installation_source=docker \
    "${GRAYLOG_HOME}/graylog.jar" \
    "$@" \
    -f "${GRAYLOG_HOME}/data/config/graylog.conf"
}

run() {
  setup

  # if being called without an argument assume "server" for backwards compatibility
  if [ $# = 0 ]; then
    graylog server "$@"
  fi

  graylog "$@"
}

run "$@"
+ run
+ setup
+ for d in journal log plugin config contentpacks
+ dir=/data/journal
+ [[ -d /data/journal ]]
+ mkdir -p /data/journal
mkdir: cannot create directory ‘/data’: Permission denied

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi
Looks like /etc/profile.d/graylog.sh is not readable for some reason.

What is the output of entrypoint: "/bin/bash -c 'ls -l /etc/profile.d/'"?

@MahdiGhiasi
Author

@mpfz0r

graylog_1        | total 8
graylog_1        | -rw-r--r-- 1 root root  96 Oct 15  2021 01-locale-fix.sh
graylog_1        | -rw-r--r-- 1 root root 564 Aug  3 19:02 graylog.sh

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi entrypoint: "/bin/bash -c 'ls -ld /etc/profile.d/; ls -ld /etc; ls -ld /'"

@MahdiGhiasi
Author

@mpfz0r

graylog_1        | drwxr-xr-x 1 root root 4096 Aug  3 19:02 /etc/profile.d/
graylog_1        | drwxr-xr-x 1 root root 4096 Aug  4 18:24 /etc
graylog_1        | drwxr-xr-x 1 root root 4096 Aug  4 18:24 /

@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi entrypoint: "/bin/bash -c '[ -r /etc/profile.d/graylog.sh ] && echo YEAH'"

@MahdiGhiasi
Author

@mpfz0r This does not print YEAH.


@mpfz0r
Member

mpfz0r commented Aug 4, 2022

@MahdiGhiasi Hmm, that's weird. For now I'm out of ideas. Maybe someone else has one?
Which docker version are you running?

@MahdiGhiasi
Author

MahdiGhiasi commented Aug 4, 2022

@mpfz0r Docker version 20.10.5 (build 55c4c88), running on Ubuntu 20.04.1 LTS.


On a maybe related note, I've also tried to upgrade another machine to Graylog 4.3.4 from 4.3.3, this one also fails but for an entirely different reason! (This machine is running Docker version 20.10.8 build 3967b7d, Ubuntu 20.04.2 LTS)

This one gets past the "adding environment opts" step, but Java fails to start in the container, reporting insufficient memory.

However, heap size is set to 1.5GB (-Xms1536m -Xmx1536m) and there's at least 8GB free memory on this server, so I don't know why it's complaining about that.

And this one also works fine on 4.3.3, but breaks on 4.3.4.

Here's the log for that:

graylog_1        | [0.003s][warning][os,thread] Failed to start thread "GC Thread#0" - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
graylog_1        | #
graylog_1        | # There is insufficient memory for the Java Runtime Environment to continue.
graylog_1        | # Cannot create worker GC thread. Out of system resources.
graylog_1        | # An error report file with more information is saved as:
graylog_1        | # /usr/share/graylog/hs_err_pid8.log
graylog_1        | adding environment opts
graylog_1        | [0.003s][warning][os,thread] Failed to start thread "GC Thread#0" - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
graylog_1        | #
graylog_1        | # There is insufficient memory for the Java Runtime Environment to continue.
graylog_1        | # Cannot create worker GC thread. Out of system resources.
graylog_1        | # Can not save log file, dump to screen..
graylog_1        | #
graylog_1        | # There is insufficient memory for the Java Runtime Environment to continue.
graylog_1        | # Cannot create worker GC thread. Out of system resources.
graylog_1        | # Possible reasons:
graylog_1        | #   The system is out of physical RAM or swap space
graylog_1        | #   The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
graylog_1        | # Possible solutions:
graylog_1        | #   Reduce memory load on the system
graylog_1        | #   Increase physical memory or swap space
graylog_1        | #   Check if swap backing store is full
graylog_1        | #   Decrease Java heap size (-Xmx/-Xms)
graylog_1        | #   Decrease number of Java threads
graylog_1        | #   Decrease Java thread stack sizes (-Xss)
graylog_1        | #   Set larger code cache with -XX:ReservedCodeCacheSize=
graylog_1        | #   JVM is running with Unscaled Compressed Oops mode in which the Java heap is
graylog_1        | #     placed in the first 4GB address space. The Java Heap base address is the
graylog_1        | #     maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
graylog_1        | #     to set the Java Heap base and to place the Java Heap above 4GB virtual address.
graylog_1        | # This output file may be truncated or incomplete.
graylog_1        | #
graylog_1        | #  Out of Memory Error (workerManager.hpp:70), pid=7, tid=7
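
One detail in the log above is worth isolating: thread creation failed with EPERM. Genuine resource exhaustion is normally reported by pthread_create as EAGAIN, so EPERM hints at a permission or sandbox restriction rather than a real lack of memory. The following is a rough heuristic sketch; the function name and the category labels are made up for illustration.

```shell
#!/usr/bin/env bash
# Heuristic: read JVM output on stdin and report how thread creation failed.
# EPERM suggests a permission or sandbox problem (for example a seccomp filter);
# EAGAIN suggests the system really ran out of resources.
classify_thread_failure() {
  local log
  log="$(cat)"
  if grep -q 'pthread_create failed (EPERM)' <<<"$log"; then
    echo "permission-related"
  elif grep -q 'pthread_create failed (EAGAIN)' <<<"$log"; then
    echo "resource-related"
  else
    echo "unknown"
  fi
}

# Example using the first log line from this report:
echo '[0.003s][warning][os,thread] Failed to start thread "GC Thread#0" - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.' \
  | classify_thread_failure
```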

@coffee-squirrel

coffee-squirrel commented Aug 4, 2022

Since you're on Docker 20.10.5 (released 2021-03-02) and 20.10.8 (released 2021-08-03), I'd suggest upgrading to at least 20.10.10 (released 2021-10-25; the latest is 20.10.17, released 2022-06-06). Doing a bit of searching on the second issue, I found https://stackoverflow.com/a/72841934 (and therefore adoptium/containers#215), which seems like it could be related (potentially to both issues), given that 4.3.4 is now based on the Jammy (Ubuntu 22.04) variant of the eclipse-temurin images.

@pschichtel

What @coffee-squirrel said, except that we have also noticed this behavior with Alpine-based temurin images.

@mpfz0r
Member

mpfz0r commented Aug 16, 2022

@MahdiGhiasi
I still have no idea what's causing this, but out of the blue, could you try this for me?
entrypoint: "/bin/cat /etc/profile.d/graylog.sh"
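
The two checks used so far (`[ -r ... ]` and an actual read with `cat`) can be compared side by side. This is a minimal sketch with a hypothetical helper name; if the two results disagree, the permission check itself is suspect rather than the file mode.

```shell
#!/usr/bin/env bash
# Report both the result of the permission test and of an actual read attempt.
check_readable() {
  local file="$1" test_r="no" read_ok="no"
  [ -r "$file" ] && test_r="yes"
  cat -- "$file" >/dev/null 2>&1 && read_ok="yes"
  echo "test-r=${test_r} read=${read_ok}"
}

check_readable /etc/profile.d/graylog.sh
```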

@pschichtel

@mpfz0r I think the profile-script thing is a red herring. The root cause is that the new eclipse-temurin base image ships a newer glibc version that uses the clone3 syscall. Docker's default seccomp policy blocked that syscall until the policy was updated in 20.10.10, so all older Docker versions will fail with the same issue.

If you are able to downgrade your docker version to 20.10.9 or older you should be able to reproduce this issue.
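
To check whether a host falls into the affected range, the version comparison can be scripted. This is a small sketch that assumes GNU `sort -V` is available; the helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# Return success if version $1 is strictly older than version $2.
# Relies on GNU sort's version ordering (-V).
version_lt() {
  [ "$1" != "$2" ] && [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print whether a Docker Engine version predates the seccomp update in 20.10.10.
check_docker_version() {
  local v="$1"
  if version_lt "$v" "20.10.10"; then
    echo "$v: older than 20.10.10, likely affected"
  else
    echo "$v: 20.10.10 or newer"
  fi
}

check_docker_version 20.10.5
check_docker_version 20.10.17
# On a real host the version could be obtained with:
#   docker version --format '{{.Server.Version}}'
```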

@mpfz0r
Member

mpfz0r commented Aug 16, 2022

@pschichtel Thanks! That makes a lot of sense. For reference, it's this comment: adoptium/containers#215 (comment) in particular

OK, so I guess we can close this case, unless we need to support Docker versions older than 20.10.10.

@mpfz0r
Member

mpfz0r commented Aug 16, 2022

@MahdiGhiasi

Can you update your docker version?
If not, you can try running it unconfined as a workaround:
https://stackoverflow.com/questions/46053672/set-secomp-to-unconfined-in-docker-compose
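
For illustration, the unconfined workaround might look roughly like the sketch below. It prints a compose fragment to merge into the existing service definition. The service name `graylog` is taken from the compose snippet in the original report, and the function name is hypothetical. Note that this turns off the default seccomp filtering for that container, so upgrading Docker is the preferable fix.

```shell
#!/usr/bin/env bash
# Print a compose fragment that runs the graylog service without
# Docker's default seccomp profile.
seccomp_override() {
  cat <<'EOF'
services:
  graylog:
    security_opt:
      - seccomp:unconfined
EOF
}

seccomp_override
# Equivalent flag for a plain docker run:
#   docker run --security-opt seccomp=unconfined graylog/graylog:4.3
```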

@MahdiGhiasi
Author

I can confirm that updating docker to version 20.10.17 solves both issues we were facing.

@mpfz0r
Member

mpfz0r commented Aug 17, 2022

Great. I'm gonna keep this open in case more people are running into this.
In the worst case we could switch to the temurin focal docker images.
