free(): invalid pointer with latest fluent/fluentd-kubernetes-daemonset:v1-debian-forward-arm64 image #1478

Open
smparekh opened this issue Jan 17, 2024 · 4 comments

@smparekh

Describe the bug

Using the latest v1-debian-forward-arm64 image results in the container aborting with free(): invalid pointer and restarting constantly, which leads to a node eviction.

To Reproduce

I have provided a redacted config below to reproduce the issue.

Expected behavior

The worker should come up and stay up.

Your Environment

- Tag of using fluentd-kubernetes-daemonset:v1-debian-forward-arm64

Your Configuration

    @include "#{ENV['FLUENTD_SYSTEMD_CONF'] || 'systemd'}.conf"
    @include "#{ENV['FLUENTD_PROMETHEUS_CONF'] || 'prometheus'}.conf"
    @include conf.d/*.

    <label @FLUENT_LOG>
      <match fluent.**>
        @type null
        @id ignore_fluent_logs
      </match>
    </label>

    <match kubelet>
      @type null
    </match>

    <filter kubernetes.**>
      @type kubernetes_metadata
      @id filter_kube_metadata
      kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
      verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || true}"
      ca_file "#{ENV['KUBERNETES_CA_FILE']}"
      skip_labels "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_LABELS'] || 'false'}"
      skip_container_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_CONTAINER_METADATA'] || 'false'}"
      skip_master_url "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_MASTER_URL'] || 'false'}"
      skip_namespace_metadata "#{ENV['FLUENT_KUBERNETES_METADATA_SKIP_NAMESPACE_METADATA'] || 'false'}"
      watch "#{ENV['FLUENT_KUBERNETES_WATCH'] || 'true'}"
    </filter>

    <source>
      @type tail
      @id in_tail_container_logs
      path "#{ENV['FLUENT_CONTAINER_TAIL_PATH'] || '/var/log/containers/*.log'}"
      pos_file "#{File.join('/var/log/', ENV.fetch('FLUENT_POS_EXTRA_DIR', ''), 'fluentd-containers.log.pos')}"
      tag "#{ENV['FLUENT_CONTAINER_TAIL_TAG'] || 'kubernetes.*'}"
      exclude_path "#{ENV['FLUENT_CONTAINER_TAIL_EXCLUDE_PATH'] || use_default}"
      read_from_head true
      <parse>
        @type "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TYPE'] || 'json'}"
        time_format "#{ENV['FLUENT_CONTAINER_TAIL_PARSER_TIME_FORMAT'] || '%Y-%m-%dT%H:%M:%S.%NZ'}"
      </parse>
    </source>

    <filter qfunctions.**>
      @type record_transformer
      enable_ruby true
      <record>
        message ${record["message"].gsub(/^.*std(out|err):\s/, '')}
      </record>
    </filter>

    <filter qfunctions.**>
      @type parser
      format json
      key_name message
      emit_invalid_record_to_error false
    </filter>

    <match qfunctions.**>
      @type rewrite_tag_filter
      <rule>
        key tenant_id
        pattern /^abc1234$/
        tag abc1234
      </rule>
      <rule>
        key tenant_id
        pattern /.+/
        tag clear
      </rule>
    </match>
    <match abc1234.**>
      @type http
      @id out_abc1234
      @log_level info
      
      endpoint "#{ENV['ENDPOINT']}"
      http_method post
      content_type application/json
      json_array true
      <format>
        @type json
      </format>
      headers {"X-P-Stream": "functions", "X-P-Meta-Org-Id": "abc1234"}
      <auth>
        method basic
        username "#{ENV['USERNAME']}"
        password "#{ENV['PASSWORD']}"
      </auth>
    </match>

    <match clear>
      @type null
    </match>
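
For anyone following the qfunctions part of the config above, here is a rough Python sketch of what the three stages appear to do to a single record: strip everything up to the stdout:/stderr: prefix, parse the rest as JSON, then retag on tenant_id using the first rule that matches. This is only an illustration, not Fluentd code. The function names and the sample line are made up, and it assumes that rewrite_tag_filter tries rules in order and does not re-emit records that match no rule, and that unparsable messages are dropped.

```python
import json
import re

# Same pattern as the record_transformer filter; Ruby's gsub becomes re.sub here.
PREFIX = re.compile(r"^.*std(out|err):\s")

# (pattern, new_tag) pairs in the order the <rule> blocks appear in the config.
RULES = [
    (re.compile(r"^abc1234$"), "abc1234"),
    (re.compile(r".+"), "clear"),
]


def strip_prefix(message):
    """Drop everything up to and including 'stdout: ' or 'stderr: '."""
    return PREFIX.sub("", message)


def parse_message(message):
    """Parse the message as JSON; return None when it is not a JSON object."""
    try:
        parsed = json.loads(message)
    except ValueError:
        return None
    return parsed if isinstance(parsed, dict) else None


def route(record):
    """Return the tag chosen by the first matching rule, or None if none match."""
    tenant_id = record.get("tenant_id")
    if not isinstance(tenant_id, str):
        return None
    for pattern, new_tag in RULES:
        if pattern.search(tenant_id):
            return new_tag
    return None


def process(raw_message):
    """Run one log line through strip -> parse -> route."""
    record = parse_message(strip_prefix(raw_message))
    return None if record is None else route(record)


# Hypothetical sample line, not taken from the issue.
sample = '2024-01-17T15:48:14Z stdout: {"tenant_id": "abc1234", "msg": "hi"}'
print(process(sample))
```

In this sketch only records tagged abc1234 would reach the http output; everything else with a tenant_id goes to the clear tag, which the config sends to null.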


Your Error Log

2024-01-17 15:48:14 +0000 [error]: Worker 0 exited unexpectedly with signal SIGABRT
2024-01-17 15:48:15 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2024-01-17 15:48:15 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2024-01-17 15:48:15 +0000 [info]: adding match pattern="kubelet" type="null"
2024-01-17 15:48:15 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2024-01-17 15:48:15 +0000 [info]: adding filter pattern="qfunctions.**" type="record_transformer"
2024-01-17 15:48:15 +0000 [info]: adding filter pattern="qfunctions.**" type="parser"
2024-01-17 15:48:15 +0000 [info]: adding match pattern="qfunctions.**" type="rewrite_tag_filter"
2024-01-17 15:48:15 +0000 [info]: #0 adding rewrite_tag_filter rule: tenant_id [#<Fluent::PluginHelper::RecordAccessor::Accessor:0x0000ffff7b7b91b8 @keys="tenant_id">, /^abc1234$/, "", "abc1234", nil]
2024-01-17 15:48:15 +0000 [info]: #0 adding rewrite_tag_filter rule: tenant_id [#<Fluent::PluginHelper::RecordAccessor::Accessor:0x0000ffff7b7b8790 @keys="tenant_id">, /.+/, "", "clear", nil]
2024-01-17 15:48:15 +0000 [info]: adding match pattern="abc1234.**" type="http"
2024-01-17 15:48:15 +0000 [warn]: #0 [out_abc1234] Status code 503 is going to be removed from default `retryable_response_codes` from fluentd v2. Please add it by yourself if you wish
2024-01-17 15:48:15 +0000 [info]: adding match pattern="clear" type="null"
2024-01-17 15:48:15 +0000 [info]: adding source type="systemd"
2024-01-17 15:48:15 +0000 [info]: adding source type="systemd"
2024-01-17 15:48:15 +0000 [info]: adding source type="systemd"
2024-01-17 15:48:15 +0000 [info]: adding source type="prometheus"
2024-01-17 15:48:15 +0000 [info]: adding source type="prometheus_output_monitor"
2024-01-17 15:48:15 +0000 [info]: adding source type="tail"
2024-01-17 15:48:15 +0000 [info]: #0 starting fluentd worker pid=361 ppid=6 worker=0
2024-01-17 15:48:15 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/contact-task-runtime-5cbd49696c-fmqkz_openfaas-fn_contact-task-runtime-90840620b3e6f1d26b85a666402b31aa3a5d5f9faf8f2388c919c87c5ce082a1.log
2024-01-17 15:48:15 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/ground-task-runtime-65446d7bcc-527dl_openfaas-fn_ground-task-runtime-b12db1d88da3a582965a7ff372367d9676e9e640f505694022c6f5da97649e46.log
2024-01-17 15:48:15 +0000 [info]: #0 fluentd worker is now running worker=0
free(): invalid pointer
2024-01-17 15:48:17 +0000 [error]: Worker 0 exited unexpectedly with signal SIGABRT
2024-01-17 15:48:18 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
2024-01-17 15:48:18 +0000 [info]: adding match in @FLUENT_LOG pattern="fluent.**" type="null"
2024-01-17 15:48:18 +0000 [info]: adding match pattern="kubelet" type="null"
2024-01-17 15:48:18 +0000 [info]: adding filter pattern="kubernetes.**" type="kubernetes_metadata"
2024-01-17 15:48:18 +0000 [info]: adding filter pattern="qfunctions.**" type="record_transformer"
2024-01-17 15:48:18 +0000 [info]: adding filter pattern="qfunctions.**" type="parser"
2024-01-17 15:48:18 +0000 [info]: adding match pattern="qfunctions.**" type="rewrite_tag_filter"
2024-01-17 15:48:18 +0000 [info]: #0 adding rewrite_tag_filter rule: tenant_id [#<Fluent::PluginHelper::RecordAccessor::Accessor:0x0000ffff8cd245b0 @keys="tenant_id">, /^org_2Jf4UxF6FEwCMecX$/, "", "abc1234", nil]
2024-01-17 15:48:18 +0000 [info]: #0 adding rewrite_tag_filter rule: tenant_id [#<Fluent::PluginHelper::RecordAccessor::Accessor:0x0000ffff8cd23f98 @keys="tenant_id">, /.+/, "", "clear", nil]
2024-01-17 15:48:18 +0000 [info]: adding match pattern="abc1234.**" type="http"
2024-01-17 15:48:18 +0000 [warn]: #0 [out_abc1234] Status code 503 is going to be removed from default `retryable_response_codes` from fluentd v2. Please add it by yourself if you wish
2024-01-17 15:48:18 +0000 [info]: adding match pattern="clear" type="null"
2024-01-17 15:48:18 +0000 [info]: adding source type="systemd"
2024-01-17 15:48:18 +0000 [info]: adding source type="systemd"
2024-01-17 15:48:18 +0000 [info]: adding source type="systemd"
2024-01-17 15:48:18 +0000 [info]: adding source type="prometheus"
2024-01-17 15:48:18 +0000 [info]: adding source type="prometheus_output_monitor"
2024-01-17 15:48:18 +0000 [info]: adding source type="tail"
2024-01-17 15:48:18 +0000 [info]: #0 starting fluentd worker pid=376 ppid=6 worker=0
2024-01-17 15:48:18 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/contact-task-runtime-5cbd49696c-fmqkz_openfaas-fn_contact-task-runtime-90840620b3e6f1d26b85a666402b31aa3a5d5f9faf8f2388c919c87c5ce082a1.log
2024-01-17 15:48:18 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/ground-task-runtime-65446d7bcc-527dl_openfaas-fn_ground-task-runtime-b12db1d88da3a582965a7ff372367d9676e9e640f505694022c6f5da97649e46.log
2024-01-17 15:48:18 +0000 [info]: #0 fluentd worker is now running worker=0
free(): invalid pointer

Additional context

We have a DaemonSet in a cluster that has been running for about 22 days, and there we are not seeing the invalid pointer issue.

@smparekh
Author

The SHA-256 digest we are having the issue with: 59886dc179d52a43dfdf061c764e9856dafc67c41dd78e9d868872000d9e660a

@smparekh
Author

Reverting to this SHA doesn't throw the invalid pointer error: f0c0d41aba562c5f4ce13f2b00ae50c381925063cfcc7ec7a9f2a4f622ee9535
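
If the values quoted in this thread are registry manifest digests (an assumption; they could also be local image IDs), the known-good image can be pinned by digest instead of the moving v1-debian-forward-arm64 tag. A small Python sketch of building such a reference; the helper name is made up for illustration:

```python
import re

REPO = "fluent/fluentd-kubernetes-daemonset"
# Digest reported in this thread as not throwing the error.
KNOWN_GOOD = "f0c0d41aba562c5f4ce13f2b00ae50c381925063cfcc7ec7a9f2a4f622ee9535"


def pinned_image(repo, digest):
    """Return 'repo@sha256:<digest>' after checking the digest is 64 hex chars."""
    if not re.fullmatch(r"[0-9a-f]{64}", digest):
        raise ValueError(f"not a sha256 hex digest: {digest!r}")
    return f"{repo}@sha256:{digest}"


print(pinned_image(REPO, KNOWN_GOOD))
```

The resulting string would go in the DaemonSet's container image field in place of the tag.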

@StevenChangNoodoe

I have the same issue in fluent/fluentd-kubernetes-daemonset:v1-debian-cloudwatch.
I reverted to this SHA, which doesn't throw the error either: b7185b3483d2ca5c3e923e33641dd3814865321b34da05c46eda96576da905a0
v1-debian-cloudwatch.log

@CAR6807

CAR6807 commented Apr 4, 2024

Also seeing this in the fluent/fluentd-kubernetes-daemonset:v1.16.5-debian-forward-1.0 image.

Logging fails:

2024-04-03 20:27:34 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/node-problem-detector-kwwk8_kube-system_node-problem-detector-4e2796e4c3ca14953fda355aca52c0200a0f53b7b0596d7e94ec89169c782f8a.log
2024-04-03 20:27:34 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/unbound-exporter-llm48_unbound_unbound-exporter-bd636614623be73dc03069f9a0fefffb779c47d2c034e796d3364fb49fb2e6fe.log
2024-04-03 20:27:34 +0000 [info]: #0 [in_tail_container_logs] following tail of /var/log/containers/unbound-exporter-llm48_unbound_unbound-exporter-init-1b88c92fa871c07c66d558a84a656879a1b13dfa12c6b533b37ec9ae74fc555f.log
2024-04-03 20:27:34 +0000 [info]: #0 fluentd worker is now running worker=0
free(): invalid pointer
2024-04-03 20:27:37 +0000 [error]: Worker 0 exited unexpectedly with signal SIGABRT
2024-04-03 20:27:37 +0000 [info]: #0 init worker0 logger path=nil rotate_age=nil rotate_size=nil
