
Errors while sending data to Eventhub #502

Open
dipendra-singh opened this issue Dec 1, 2023 · 3 comments
dipendra-singh commented Dec 1, 2023

Describe the bug

When I use out_kafka2 to send logs to an Event Hub that has 200 partitions, I am facing this issue:

2023-12-01 22:56:45 +0000 [warn]: #0 [kfk2] Send exception occurred: Failed to send messages to livelogs/192, livelogs/140, livelogs/148, livelogs/152, livelogs/22, livelogs/97, livelogs/114, livelogs/49, livelogs/174, livelogs/92, livelogs/26, livelogs/84, livelogs/64, livelogs/199, livelogs/68, livelogs/134, livelogs/30, livelogs/83, livelogs/8, livelogs/10, livelogs/158, livelogs/66, livelogs/71, livelogs/14, livelogs/94, livelogs/93, livelogs/117, livelogs/62, livelogs/89, livelogs/198, livelogs/40, livelogs/155, livelogs/180, livelogs/24, livelogs/179, livelogs/159, livelogs/184, livelogs/38, livelogs/141, livelogs/27, livelogs/46, livelogs/82, livelogs/53, livelogs/154, livelogs/16, livelogs/21, livelogs/34, livelogs/90, livelogs/181, livelogs/196, livelogs/79, livelogs/96, livelogs/23, livelogs/6, livelogs/182, livelogs/104, livelogs/183, livelogs/67, livelogs/124, livelogs/167, livelogs/43, livelogs/75, livelogs/156, livelogs/57, livelogs/102, livelogs/194, livelogs/130, livelogs/106, livelogs/32, livelogs/101, livelogs/170, livelogs/95, livelogs/146, livelogs/47, livelogs/103, livelogs/144, livelogs/176, livelogs/118, livelogs/185, livelogs/138, livelogs/9, livelogs/160, livelogs/41, livelogs/164, livelogs/166, livelogs/86, livelogs/28, livelogs/195, livelogs/127, livelogs/51, livelogs/61, livelogs/189, livelogs/54, livelogs/87, livelogs/139, livelogs/72, livelogs/7, livelogs/77, livelogs/123, livelogs/18, livelogs/50, livelogs/91, livelogs/5, livelogs/59, livelogs/132, livelogs/147, livelogs/110, livelogs/44, livelogs/136, livelogs/108, livelogs/173, livelogs/45, livelogs/151, livelogs/81, livelogs/172, livelogs/29, livelogs/131, livelogs/58, livelogs/197, livelogs/4, livelogs/178, livelogs/65, livelogs/129, livelogs/17, livelogs/98, livelogs/142, livelogs/137, livelogs/121, livelogs/150, livelogs/190, livelogs/116, livelogs/163, livelogs/162, livelogs/33, livelogs/69, livelogs/74, livelogs/187, livelogs/2, livelogs/35, livelogs/15, livelogs/42, livelogs/20, livelogs/171, livelogs/88, livelogs/126, livelogs/186, livelogs/36, livelogs/109, livelogs/193, livelogs/105, livelogs/149, livelogs/48, livelogs/3, livelogs/175, livelogs/63, livelogs/177, livelogs/115, livelogs/161, livelogs/122, livelogs/25, livelogs/11, livelogs/120, livelogs/31, livelogs/99, livelogs/70, livelogs/107, livelogs/73, livelogs/111, livelogs/1, livelogs/128, livelogs/76, livelogs/119, livelogs/165, livelogs/168, livelogs/78, livelogs/13, livelogs/12, livelogs/188, livelogs/191, livelogs/100, livelogs/112, livelogs/143, livelogs/80, livelogs/39, livelogs/153, livelogs/19, livelogs/60, livelogs/169, livelogs/52, livelogs/0, livelogs/56, livelogs/145, livelogs/125, livelogs/135, livelogs/113, livelogs/133, livelogs/157, livelogs/85, livelogs/55
2023-12-01 22:56:45 +0000 [warn]: #0 [kfk2] Exception Backtrace : /fluentd/vendor/bundle/ruby/3.1.0/gems/fluent-plugin-kafka-0.18.1/lib/fluent/plugin/kafka_producer_ext.rb:243:in `deliver_messages_with_retries'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluent-plugin-kafka-0.18.1/lib/fluent/plugin/kafka_producer_ext.rb:129:in `deliver_messages'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluent-plugin-kafka-0.18.1/lib/fluent/plugin/out_kafka2.rb:377:in `write'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

To Reproduce

Handling a scale of about 80K logs/sec
Event Hub with 200 partitions on the dedicated tier (1 CU)

Expected behavior

Retries should succeed and the messages should be delivered.

Your Environment

- Fluentd version: v1.16.1
- TD Agent version:
- fluent-plugin-kafka version: 1.2
- ruby-kafka version: v1.51
- Operating system: Ubuntu
- Kernel version:

Your Configuration

I am using the following configuration:

<store>
  @type "kafka2"
  @id kfk2
  @log_level "debug"
  get_kafka_client_log true
  brokers xyz.servicebus.windows.net:9093
  use_event_time true
  default_topic "livelogs"
  max_send_retries 3
  required_acks -1
  idempotent true
  ssl_ca_certs_from_system true
  ssl_ca_cert 
  ack_timeout 10
  username "$ConnectionString"
  password xxxxxx
  <inject>
    time_key "time"
    time_type string
    time_format "%Y-%m-%dT%H:%M:%S"
  </inject>
  <buffer>
    flush_mode interval
    retry_type exponential_backoff
    @type "file"
    chunk_limit_size 1MB
    overflow_action drop_oldest_chunk
    flush_thread_count 16
  </buffer>
  <format>
    @type "json"
  </format>
</store>
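
For reference, below is a variant of the same output with more aggressive retry and backoff settings that one might try against a high-partition Event Hub. This is only a sketch built from documented out_kafka2 and Fluentd buffer parameters, not a fix confirmed in this thread; the concrete values are assumptions.

<store>
  @type kafka2
  @id kfk2
  brokers xyz.servicebus.windows.net:9093
  default_topic livelogs
  # More send-level retries inside the plugin before the chunk is
  # handed back to the Fluentd buffer for its own retry cycle.
  max_send_retries 10
  required_acks -1
  # Give the broker more time to acknowledge before timing out.
  ack_timeout 30
  # Disabling the idempotent producer is one variable to rule out
  # against Event Hubs (assumption, not verified in this thread).
  idempotent false
  ssl_ca_certs_from_system true
  username "$ConnectionString"
  password xxxxxx
  <buffer>
    @type file
    flush_mode interval
    flush_interval 5s
    retry_type exponential_backoff
    retry_max_interval 60s
    chunk_limit_size 1MB
    flush_thread_count 16
    overflow_action drop_oldest_chunk
  </buffer>
  <format>
    @type json
  </format>
</store>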

Your Error Log

2023-12-01 22:56:45 +0000 [warn]: #0 [kfk2] Exception Backtrace : /fluentd/vendor/bundle/ruby/3.1.0/gems/fluent-plugin-kafka-0.18.1/lib/fluent/plugin/kafka_producer_ext.rb:243:in `deliver_messages_with_retries'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluent-plugin-kafka-0.18.1/lib/fluent/plugin/kafka_producer_ext.rb:129:in `deliver_messages'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluent-plugin-kafka-0.18.1/lib/fluent/plugin/out_kafka2.rb:377:in `write'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin/output.rb:1225:in `try_flush'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin/output.rb:1538:in `flush_thread_run'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin/output.rb:510:in `block (2 levels) in start'
/fluentd/vendor/bundle/ruby/3.1.0/gems/fluentd-1.16.1/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'`

Additional context

No response

dipendra-singh (Author)

@ashie any update on this?

@jodr5786

Is there any update on this? We've encountered the same issue when trying to output to an Azure Event Hub. Logs will flow for ~24 hours or so, then abruptly stop with the error below. Restarting the pods (we're running the operator in Kubernetes/AKS) will cause it to work again for a short time before the same errors occur.

2024-03-19 15:11:50 +0000 [warn]: #0 [clusterflow:cattle-logging-system:opt-in:clusteroutput:cattle-logging-system:kafka-rancher-logs] Send exception occurred: Failed to send messages to logging-rancher_logs-np/1, logging-rancher_logs-np/0
2024-03-19 15:11:50 +0000 [warn]: #0 [clusterflow:cattle-logging-system:opt-in:clusteroutput:cattle-logging-system:kafka-rancher-logs] Exception Backtrace : /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-kafka-0.17.5/lib/fluent/plugin/kafka_producer_ext.rb:242:in `deliver_messages_with_retries'
/usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-kafka-0.17.5/lib/fluent/plugin/kafka_producer_ext.rb:128:in `deliver_messages'
/usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-kafka-0.17.5/lib/fluent/plugin/out_kafka2.rb:299:in `write'
/usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1179:in `try_flush'
/usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1500:in `flush_thread_run'
/usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
/usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2024-03-19 15:11:50 +0000 [info]: #0 [clusterflow:cattle-logging-system:opt-in:clusteroutput:cattle-logging-system:kafka-rancher-logs] initialized kafka producer: fluentd
2024-03-19 15:11:50 +0000 [warn]: #0 [clusterflow:cattle-logging-system:opt-in:clusteroutput:cattle-logging-system:kafka-rancher-logs] failed to flush the buffer. retry_times=0 next_retry_time=2024-03-19 15:11:51 +0000 chunk="61404ba7f31e469aa7afc435ed28b99b" error_class=Kafka::DeliveryFailed error="Failed to send messages to logging-rancher_logs-np/1, logging-rancher_logs-np/0"
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-kafka-0.17.5/lib/fluent/plugin/kafka_producer_ext.rb:242:in `deliver_messages_with_retries'
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-kafka-0.17.5/lib/fluent/plugin/kafka_producer_ext.rb:128:in `deliver_messages'
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-kafka-0.17.5/lib/fluent/plugin/out_kafka2.rb:299:in `write'
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1179:in `try_flush'
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1500:in `flush_thread_run'
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
  2024-03-19 15:11:50 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'


dipendra-singh commented Mar 21, 2024

@jodr5786 switching to kafka_buffered worked for me, but I am hoping for a better solution here since kafka_buffered is deprecated. Also, when I used kafka_buffered I saw some fragmented-chunk errors, but they were innocuous; the data flows without any issue.
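
For anyone trying the same workaround, a minimal out_kafka_buffered sketch is below. It assumes the same broker, topic, and SASL credentials as the kafka2 config above; the buffer settings use the older buffer_type/buffer_path style that kafka_buffered documents, and the specific values (including the buffer path) are placeholders, not settings confirmed in this thread.

<store>
  @type kafka_buffered
  brokers xyz.servicebus.windows.net:9093
  default_topic livelogs
  # kafka_buffered formats records itself instead of using a <format> section.
  output_data_type json
  max_send_retries 3
  required_acks -1
  ssl_ca_certs_from_system true
  username "$ConnectionString"
  password xxxxxx
  # Old-style (v0.12) buffer parameters, as shown in the plugin's examples.
  buffer_type file
  buffer_path /var/log/fluent/kafka_buffered.*.buffer
  flush_interval 5s
</store>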
