Fluentd worker crashing on startup when connecting to Graylog #1479

Open
sumith-aeropost opened this issue Jan 19, 2024 · 4 comments

sumith-aeropost commented Jan 19, 2024

Describe the bug

We've installed Fluentd in our AWS EKS cluster, connecting to Graylog, and it was functioning well. However, two days ago, the fluentd worker unexpectedly crashed. Fluentd pod logs consistently display the following messages:

2024-01-19 04:20:50 +0000 [error]: #0 unexpected error error_class=NameError error="uninitialized constant GELF::Notifier::Fixnum"
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/gelf-3.0.0/lib/gelf/notifier.rb:65:in `level='
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/gelf-3.0.0/lib/gelf/notifier.rb:24:in `initialize'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-gelf-hs-1.0.8/lib/fluent/plugin/out_gelf.rb:52:in `new'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluent-plugin-gelf-hs-1.0.8/lib/fluent/plugin/out_gelf.rb:52:in `start'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/compat/call_super_mixin.rb:42:in `start'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:203:in `block in start'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:192:in `block (2 levels) in lifecycle'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:191:in `each'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:191:in `block in lifecycle'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:178:in `each'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:178:in `lifecycle'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/root_agent.rb:202:in `start'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/engine.rb:248:in `start'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/engine.rb:147:in `run'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:617:in `block in run_worker'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:962:in `main_process'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/supervisor.rb:608:in `run_worker'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/lib/fluent/command/fluentd.rb:372:in `<top (required)>'
  2024-01-19 04:20:50 +0000 [error]: #0 <internal:/usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
  2024-01-19 04:20:50 +0000 [error]: #0 <internal:/usr/local/lib/ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/gems/fluentd-1.16.3/bin/fluentd:15:in `<top (required)>'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/bin/fluentd:25:in `load'
  2024-01-19 04:20:50 +0000 [error]: #0 /fluentd/vendor/bundle/ruby/3.2.0/bin/fluentd:25:in `<main>'
2024-01-19 04:20:50 +0000 [error]: Worker 0 exited unexpectedly with status 1
Logs from 19/01/2024, 09:50:44

Any help on how to fix this would be appreciated. We can provide further logs or code if necessary.

To Reproduce

Fluentd Pod logs: same stack trace as shown under "Describe the bug" above.

Expected behavior

Fluentd should connect to the Graylog instance. It had been working fine for a long time and then suddenly crashed.

Your Environment

- Tag of using fluentd-kubernetes-daemonset: v1-debian-graylog

Your Configuration

fluentd.yaml


#ref: https://github.com/fluent/fluentd-kubernetes-daemonset (fcdf045)

# create an identity for fluentd
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system

# grant fluentd permissions to read, list, and watch pods and namespaces in Kubernetes cluster
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - namespaces
    verbs:
      - get
      - list
      - watch

# bind the fluentd ServiceAccount to these permissions using the ClusterRoleBinding
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system

# deploy fluentd DaemonSet
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
      version: v1
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      # Enable tolerations if you want to run daemonset on master nodes.
      # Recommended to disable on managed k8s.
      # tolerations:
      # - key: node-role.kubernetes.io/master
      #   effect: NoSchedule
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-graylog
          imagePullPolicy: IfNotPresent
          env:
            - name: FLUENT_GRAYLOG_HOST
              value: "log.int.*****.com"
            - name: FLUENT_GRAYLOG_PORT
              value: "12208"
            - name: FLUENT_GRAYLOG_PROTOCOL
              value: "udp"
            - name: FLUENTD_SYSTEMD_CONF
              value: "disable"
          resources:
            requests:
              cpu: 200m
              memory: 0.5Gi
            limits:
              # ===========
              # Less memory leads to child process problems.
              cpu: 1000m
              memory: 1Gi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
          securityContext:
              privileged: true
      terminationGracePeriodSeconds: 30
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers


Your Error Log

Identical to the stack trace under "Describe the bug" above.

Additional context

No response

sumith-aeropost changed the title from "Fluentd worker crashing on startup when trying to connect to Graylog" to "Fluentd worker crashing on startup when connecting to Graylog" on Jan 19, 2024

@AleksanderGrzybowski

Hi, I just hit the same error when using this image. I'm not a Ruby programmer, but I've read that the Fixnum class is deprecated. Maybe there is a mismatch between the Ruby version and the GELF plugin version? If you check https://github.com/graylog-labs/gelf-rb/blob/master/lib/gelf/notifier.rb, you'll see it uses Integer, but the code in the container uses Fixnum.

I'll try updating everything in the image to the newest versions in a custom Dockerfile. Maybe that will do the trick.
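
For anyone else landing here, a minimal sketch of the failure mode, assuming Ruby 3.2 or newer (the stack trace paths show ruby/3.2.0): the `Fixnum` constant was deprecated in Ruby 2.4 and removed in 3.2, so code that still names it raises NameError when that line runs. `Demo::Notifier` and both setters below are hypothetical stand-ins, not the gelf gem's actual source.

```ruby
# Hypothetical stand-in for a notifier with a log-level setter.
# This is NOT the gelf gem's source; it only mirrors the failure mode.
module Demo
  class Notifier
    attr_reader :level

    # Portable check: Integer exists on every Ruby version,
    # whereas the Fixnum constant was removed in Ruby 3.2.
    def level=(value)
      @level = value.is_a?(Integer) ? value : Integer(value)
    end

    # Legacy-style check that references Fixnum by name.
    def legacy_level=(value)
      @level = value.is_a?(Fixnum) ? value : Integer(value)
    end
  end
end

notifier = Demo::Notifier.new
notifier.level = 3
puts "portable setter: level=#{notifier.level}"

begin
  notifier.legacy_level = 3
  puts "legacy setter worked (Ruby older than 3.2)"
rescue NameError => e
  puts "legacy setter failed: #{e.message}"
end
```

On Ruby 3.2+ the legacy setter should fail with a NameError that names the enclosing class, much like the message in the logs above. On older Rubies it still works, possibly with a deprecation warning.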

@AleksanderGrzybowski

I've managed to work around this issue with the following Dockerfile, plus setting LD_PRELOAD="" to fix a separate issue. This works for me:
RUN gem install gelf
RUN gem install fluent-plugin-gelf-hs

kemalceng commented Mar 11, 2024

> I've managed to work around this issue via the following Dockerfile + setting LD_PRELOAD="" to fix some other issue. This works for me:
> RUN gem install gelf
> RUN gem install fluent-plugin-gelf-hs

Running gem install gelf fluent-plugin-gelf-hs worked for us too. The difference was that it pulled in version 3.1.0 of gelf instead of 3.0.0. We manually changed the version in the Gemfile used by the Docker image and it worked.
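
A rough sketch of how one might check a gelf version string against the version reported to work here. The 3.1.0 threshold is taken from this comment only, and `gelf_ok?` is a hypothetical helper, not part of any gem.

```ruby
require "rubygems"

# Assumption: 3.1.0 is the first gelf version reported in this thread to work.
MIN_GELF = Gem::Version.new("3.1.0")

# Hypothetical helper: true when the given version string is at least MIN_GELF.
def gelf_ok?(version_string)
  Gem::Version.new(version_string) >= MIN_GELF
end

puts gelf_ok?("3.0.0") # false: the version shown in the stack trace
puts gelf_ok?("3.1.0") # true: the version reported to work
```

Inside the container, the installed version could be fed to this check from the output of `gem list gelf`.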

@mszyzdek

This is the chain of related events that led to the disaster:

The good news is that the offending Fixnum reference was removed in gelf gem version 3.0.1, in a commit meant to prepare the gem for the Ruby 2.4 deprecation:
graylog-labs/gelf-rb@7cc3cbb
So perhaps all that is needed is to bump the gelf version in Gemfile.erb in this project.
