[BUG] Kafka topic connector having issues with kafka partition rebalance #7149

Open
1 task done
planetf1 opened this issue Nov 29, 2022 · 2 comments
Labels
bug Something isn't working message-bus message bus, async processessing, topics pinned Keep open (do not time out)

Comments

@planetf1
Member

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Noticed this when debugging a db2 integration setup:

2022-11-29 16:17:51.081  INFO 39183 --- [ool-56-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-6c2845ff-e8f6-415a-84a8-537ed0df1102-55, groupId=6c2845ff-e8f6-415a-84a8-537ed0df1102] Setting offset for partition egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.registration-0 to the committed offset FetchPosition{offset=792, offsetEpoch=Optional[0], currentLeader=LeaderAndEpoch{leader=Optional[192.168.178.134:9092 (id: 0 rack: null)], epoch=0}}
2022-11-29 16:17:51.082  INFO 39183 --- [ool-24-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-acce90cb-18c0-4a75-8db6-917f483
tadata{offset=0, leaderEpoch=null, metadata=''}} failed: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
2022-11-29 16:17:50.492  WARN 39183 --- [ool-82-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-40605fed-50e5-4a98-a924-af9dbbf69530-81, groupId=40605fed-50e5-4a98-a924-af9dbbf69530] Asynchronous auto-commit of offsets {egeria.omag.server.cocoMDS6.omas.assetmanager.outTopic-0=OffsetAndMetadata{offset=18983, leaderEpoch=0, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
2022-11-29 16:17:50.494  INFO 39183 --- [ool-56-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-6c2845ff-e8f6-415a-84a8-537ed0df1102-55, groupId=6c2845ff-e8f6-415a-84a8-537ed0df1102] Successfully joined group with generation Generation{generationId=7, memberId='consumer-6c2845ff-e8f6-415a-84a8-537ed0df1102-55-976183a4-3c2b-4f81-a28d-32f9737ed5b5', protocol='range'}
2022-11-29 16:17:50.493  INFO 39183 --- [ool-38-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-67937cc0-f8e7-4bb8-8031-77a0b155b524-37, groupId=67937cc0-f8e7-4bb8-8031-77a0b155b524] Successfully joined group with generation Generation{generationId=9, memberId='consumer-67937cc0-f8e7-4bb8-8031-77a0b155b524-37-9d45891d-6421-4726-bce9-484a24f02bcf', protocol='range'}
2022-11-29 16:17:50.495  INFO 39183 --- [ool-82-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-40605fed-50e5-4a98-a924-af9dbbf69530-81, groupId=40605fed-50e5-4a98-a924-af9dbbf69530] Failing OffsetCommit request since the consumer is not part of an active group
2022-11-29 16:17:50.495  INFO 39183 --- [ool-66-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-405bf426-3888-402a-a28c-3cc8796bf5a2-65, groupId=405bf426-3888-402a-a28c-3cc8796bf5a2] Request joining group due to: need to re-join with the given member-id: consumer-405bf426-3888-402a-a28c-3cc8796bf5a2-65-7ac52688-23f5-4744-b054-9e62a3b143c1
2022-11-29 16:17:50.495  INFO 39183 --- [ool-38-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-67937cc0-f8e7-4bb8-8031-77a0b155b524-37, groupId=67937cc0-f8e7-4bb8-8031-77a0b155b524] Finished assignment for group at generation 9: {consumer-67937cc0-f8e7-4bb8-8031-77a0b155b524-37-9d45891d-6421-4726-bce9-484a24f02bcf=Assignment(partitions=[egeria.omag.openmetadata.repositoryservices.cohort.cocoCohort.OMRSTopic.registration-0])}
2022-11-29 16:17:50.495  WARN 39183 --- [ool-80-thread-1] o.a.k.c.c.internals.ConsumerCoordinator  : [Consumer clientId=consumer-40605fed-50e5-4a98-a924-af9dbbf69530-79, groupId=40605fed-50e5-4a98-a924-af9dbbf69530] Asynchronous auto-commit of offsets {egeria.omag.server.cocoMDS6.omas.subjectarea.outTopic-0=OffsetAndMetadata{offset=0, leaderEpoch=null, metadata=''}} failed: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
Tue Nov 29 16:17:50 GMT 2022 cocoMDS6 Information OCF-KAFKA-TOPIC-CONNECTOR-0018 The Egeria client was rebalanced by Kafka and failed to commit already consumed events

This needs investigation, as we may:
a) be failing to commit offsets for messages that have already been read, leading to those messages being processed twice
b) have performance issues in the topic connector that could cause other processing issues
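
As a starting point for that investigation, the two consumer settings named in the Kafka warning could be tuned. The sketch below only builds a `java.util.Properties` object using the standard Kafka property names from the log message. The class and method names are hypothetical, the values are arbitrary examples rather than recommendations, and how such properties would be passed into the Egeria Kafka topic connector configuration is not shown here.

```java
import java.util.Properties;

// Hypothetical helper, not part of Egeria or Kafka: collects the two consumer
// settings that the Kafka warning suggests adjusting.
public class ConsumerTuningSketch {

    // Builds consumer properties with a longer poll interval and smaller batches.
    // Kafka's documented defaults are 300000 ms and 500 records respectively.
    static Properties tunedConsumerProperties(int maxPollIntervalMs, int maxPollRecords) {
        if (maxPollIntervalMs <= 0 || maxPollRecords <= 0) {
            throw new IllegalArgumentException("both values must be positive");
        }
        Properties props = new Properties();
        // Maximum allowed gap between poll() calls before the consumer is
        // considered failed and the group rebalances.
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        // Upper bound on records returned by a single poll(), so each batch
        // takes less time to process.
        props.setProperty("max.poll.records", Integer.toString(maxPollRecords));
        return props;
    }

    public static void main(String[] args) {
        // Example values only: 10 minutes between polls, 50 records per batch.
        Properties props = tunedConsumerProperties(600_000, 50);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Raising `max.poll.interval.ms` only hides slow processing, while lowering `max.poll.records` shortens each batch. Neither removes the possibility of duplicate delivery after a rebalance, so hypothesis a) would still need checking separately.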

Expected Behavior

No rebalancing issues should be seen.

Steps To Reproduce

No response

Environment

- Egeria:
- OS:
- Java:
- Browser (for UI issues):
- Additional connectors and integration:

Any Further Information?

No response

@planetf1 planetf1 added bug Something isn't working triage New bug/issue which needs checking & assigning labels Nov 29, 2022
@planetf1 planetf1 added the message-bus message bus, async processessing, topics label Dec 5, 2022
@planetf1 planetf1 self-assigned this Dec 5, 2022
@planetf1 planetf1 removed the triage New bug/issue which needs checking & assigning label Dec 5, 2022
@github-actions

github-actions bot commented Feb 4, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Feb 4, 2023
@planetf1 planetf1 removed the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Feb 6, 2023
@github-actions

github-actions bot commented Apr 8, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Apr 8, 2023
@planetf1 planetf1 removed the no-issue-activity Issues automatically marked as stale because they have not had recent activity. label Apr 14, 2023
@planetf1 planetf1 removed their assignment May 15, 2023
@mandy-chessell mandy-chessell added the pinned Keep open (do not time out) label Jun 29, 2023