Pub/sub: amount of streaming pull operations way higher than streaming ack operations #4841

@krelst

We are using Python 2.7.12 with google-cloud-pubsub==0.30.1 and grpcio==1.9.1, and have a very basic subscriber script:

import random
import time

from google.cloud import pubsub

GCLOUD_PROJECT_ID = '<our_project_id>'  # placeholder for the real project ID
SUBSCRIPTION_NAME = '<subscription_name>'  # placeholder for the real subscription name

def pubsub_message_callback(message):
    time.sleep(random.random())
    message.ack()

def main():
    sub_client = pubsub.SubscriberClient()
    full_subscription_name = 'projects/{project_id}/subscriptions/{subscription}'.format(
        project_id=GCLOUD_PROJECT_ID,
        subscription=SUBSCRIPTION_NAME,
    )
    pubsub_subscription = sub_client.subscribe(
        full_subscription_name,
        flow_control=pubsub.types.FlowControl(max_messages=1)
    )
    pubsub_subscription.open(
        pubsub_message_callback
    )
    while True:
        pass

if __name__ == '__main__':
    main()

With a large backlog, after running this script for 20 minutes, the number of streaming pull operations is 100 times the number of streaming ack operations (see image).

Since flow control is set to max_messages=1, I would expect the subscriber to always ack the received message before receiving another one. Even without max_messages=1, this still looks like unintended behaviour, since the total number of pull operations is far higher than the number of ack operations (so what happens to the received but unacked messages?). Is this a bug or intended behaviour?
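
To make the expectation above concrete, here is a minimal, self-contained sketch of the behaviour I assumed max_messages=1 would give: a new message is delivered only after the previous one has been acked. It is a toy model only. ToyFlowControl and simulate are hypothetical names, not part of google-cloud-pubsub, and nothing here reflects how the library or the streaming pull/ack metrics actually work.

```python
import threading


class ToyFlowControl(object):
    """Hypothetical stand-in for a flow controller (not the library's class)."""

    def __init__(self, max_messages):
        # One semaphore slot per message allowed to be outstanding (unacked).
        self._slots = threading.Semaphore(max_messages)
        self._lock = threading.Lock()
        self.delivered = 0
        self.acked = 0
        self.outstanding = 0
        self.max_outstanding_seen = 0

    def deliver(self):
        # Block until fewer than max_messages messages are outstanding.
        self._slots.acquire()
        with self._lock:
            self.delivered += 1
            self.outstanding += 1
            self.max_outstanding_seen = max(
                self.max_outstanding_seen, self.outstanding)

    def ack(self):
        with self._lock:
            self.acked += 1
            self.outstanding -= 1
        # Free the slot only after the bookkeeping is done.
        self._slots.release()


def simulate(num_messages, max_messages):
    """Deliver num_messages, acking each one from its own callback thread."""
    flow = ToyFlowControl(max_messages)
    threads = []
    for _ in range(num_messages):
        flow.deliver()  # waits here while the limit is reached
        t = threading.Thread(target=flow.ack)
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return flow
```

Under this model, the delivered and acked counts always end up equal, and the number of outstanding messages never exceeds max_messages. That is the one-to-one relationship I expected to see between the pull and ack graphs.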

I also asked this as a question on Stack Overflow.
