ServiceBusReceiver receive_messages max_wait_time race condition locks the message but sends back empty list #35591

Open
cabal-daniel opened this issue May 12, 2024 · 4 comments
Labels: Client, customer-reported, Messaging, needs-team-attention, question, Service Bus

Comments

@cabal-daniel

  • Package Name: azure-servicebus (installed with pip)

  • Package Version: 7.12.2

  • Operating System: macOS and Ubuntu

  • Python Version: 3.8.18

Describe the bug

Receiving code:

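# sb_client is an existing ServiceBusClient; queue is the queue name.
with sb_client.get_queue_receiver(queue) as receiver:
    # Wait up to max_wait_time seconds for up to max_message_count messages.
    messages = receiver.receive_messages(
        max_message_count=max_message_count,
        max_wait_time=max_wait_time,
    )
    print(messages)
    for message in messages:
        # Settle each message so its lock is released and it leaves the queue.
        receiver.complete_message(message)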

If max_wait_time is set in ServiceBusReceiver.receive_messages, a race condition can occur: the receive operation locks a message, but receive_messages returns an empty list. The caller never sees the message, so it cannot complete it, and the message stays locked until its lock duration expires.

To Reproduce
Steps to reproduce the behavior:

  1. Set max_wait_time to 2 and receive 1 message at a time
  2. Spin up 10 workers, each calling the receiving code above in a while loop at a polling interval
  3. Have a sender send 100 or so messages to the queue, 1 at a time, each with a fairly large body

Actual behaviour: some of the messages get locked in the queue. In our experiments, roughly 2-4 messages out of 100 end up in the locked state without ever being completed. The receive_messages method shouldn't lock a message unless it can return that message in full.
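For anyone trying to reproduce this, the polling loop from step 2 can be sketched roughly as below. This is only an illustration, not the exact worker we run: poll_worker and its parameters are hypothetical names, and the receiver is passed in so the loop can be exercised against a stand-in object as well as a real ServiceBusReceiver. Counting the empty polls makes it easier to line them up with messages that later turn out to be locked.

```python
import time


def poll_worker(receiver, polls, max_message_count=1, max_wait_time=2, poll_interval=0.5):
    """Call receive_messages `polls` times and complete whatever comes back.

    `receiver` is anything exposing receive_messages() and complete_message(),
    for example a ServiceBusReceiver. Returns (completed, empty_polls).
    """
    completed = 0
    empty_polls = 0
    for _ in range(polls):
        messages = receiver.receive_messages(
            max_message_count=max_message_count,
            max_wait_time=max_wait_time,
        )
        if not messages:
            # An empty list is where the suspected race would hide: a message
            # may have been locked on the service side but never returned.
            empty_polls += 1
        for message in messages:
            receiver.complete_message(message)
            completed += 1
        time.sleep(poll_interval)
    return completed, empty_polls
```

Against a live queue, the receiver would come from sb_client.get_queue_receiver(queue) as in the receiving code above, with one such loop running in each of the 10 workers.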

Expected behavior
All messages get completed, and none stay locked.

I suspect the code for that behaviour is somewhere in here: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/servicebus/azure-servicebus/azure/servicebus/_servicebus_receiver.py#L454

Perhaps max_wait_time could allow a short "grace period", so that a message already in transit when the timeout is reached can be received in full before the call returns.
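To make the grace period idea concrete, here is a minimal sketch of the behaviour I have in mind. It is not the SDK's implementation and does not use any SDK internals: receive_with_grace, try_receive, transfer_in_flight, grace_period and poll_step are all hypothetical names, and the two callbacks stand in for whatever the client knows about a delivery that has started but not finished.

```python
import time


def receive_with_grace(try_receive, transfer_in_flight, max_wait_time, grace_period,
                       clock=time.monotonic, sleep=time.sleep, poll_step=0.01):
    """Poll for one message, extending the wait if a transfer has already started.

    try_receive() returns a complete message or None.
    transfer_in_flight() returns True while a message is partially received.
    """
    deadline = clock() + max_wait_time
    hard_deadline = deadline + grace_period
    while True:
        message = try_receive()
        if message is not None:
            return [message]
        now = clock()
        if now >= hard_deadline:
            # Grace period exhausted: give up even if a transfer is pending.
            return []
        if now >= deadline and not transfer_in_flight():
            # Normal timeout: nothing is on its way, so returning empty is safe.
            return []
        sleep(poll_step)
```

With grace_period=0 this degenerates to a plain timeout, which is the case where a half-received message would be lost to the caller. The hard deadline keeps the total wait bounded at max_wait_time + grace_period.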

@github-actions bot added the Client, customer-reported, needs-team-attention, question, Service Attention, and Service Bus labels May 12, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @EldertGrootenboer.

@kashifkhan
Member

Thank you for the feedback, @cabal-daniel. We will investigate and get back to you asap.

@kashifkhan added the Messaging label and removed the Service Attention and needs-team-attention labels May 13, 2024
@kashifkhan self-assigned this May 13, 2024
@github-actions bot added the needs-team-attention and needs-team-triage labels May 13, 2024
@xiangyan99 removed the needs-team-triage label May 13, 2024
@cabal-daniel
Author

Have you guys looked into this? It's happening a lot in our prod environments. We can't avoid setting max_wait_time, so messages keep getting dropped.

@cabal-daniel
Author

@rakshith91 @swathipil

It seems like you two are the original authors of this line. We are seeing frames being dropped by Service Bus. Could you comment on a possible remedy for this line?

https://github.com/Azure/azure-sdk-for-python/blob/d335a043d50cb3f4c484202dedf77bb[…]b4/sdk/eventhub/azure-eventhub/azure/eventhub/_pyamqp/client.py
