pubsub: Many instances of "The StreamingPull stream closed for an expected reason and should be recreated ..." #9788

Closed
HaraldNordgren opened this issue Apr 17, 2024 · 4 comments
Labels: api: pubsub (Issues related to the Pub/Sub API.); status: investigating (The issue is under investigation, which is determined to be non-trivial.)

HaraldNordgren commented Apr 17, 2024

Client

cloud.google.com/go/pubsub v1.37.0

Environment

GKE

Go Environment

$ go version
go version go1.21.8 darwin/arm64
$ go env
GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/Harald/Library/Caches/go-build'
GOENV='/Users/Harald/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/Harald/go/pkg/mod'
GONOPROXY='github.com/dietdoctor/*'
GONOSUMDB='github.com/dietdoctor/*'
GOOS='darwin'
GOPATH='/Users/Harald/go'
GOPRIVATE='github.com/dietdoctor/*'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/Harald/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.8.darwin-arm64'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/Harald/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.21.8.darwin-arm64/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.8'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/Harald/dd/hive/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/j2/0w9hqz1x01g3j4_yz0w0sw780000gp/T/go-build3312079878=/tmp/go-build -gno-record-gcc-switches -fno-common'

Code

func initPubSubClient(cc *cli.Context, log logrus.FieldLogger) (*pubsub.Client, error) {
	clientOpts := googleAPIClientOptions(cc, log)

	log.Debug("Creating a new pubsub client.")
	client, err := pubsub.NewClient(context.Background(), cc.String("gcp-project"), clientOpts...)
	if err != nil {
		return nil, fmt.Errorf("failed to create a pubsub client: %w", err)
	}

	log.Debug("Pubsub client created.")
	return client, nil
}

func googleAPIClientOptions(cc *cli.Context, log logrus.FieldLogger) []option.ClientOption {
	var clientOpts []option.ClientOption
	if cc.IsSet("gcp-credentials-file") {
		log.Debugf("Using %s credentials file for Google API auth.", cc.String("gcp-credentials-file"))
		clientOpts = append(clientOpts, option.WithCredentialsFile(cc.String("gcp-credentials-file")))
	}
	return clientOpts
}

The client is then used like this:

func Action(log *logrus.Logger) cli.ActionFunc {
	fn := func(cc *cli.Context) error {
		psubClient, err := initPubSubClient(cc, log)
		if err != nil {
			return err
		}

		...

		c := Ctrl{
			sub: psubClient.Subscription(name),
		}

		g, ctx := errgroup.WithContext(context.Background())

		g.Go(func() error {
			sigChan := make(chan os.Signal, 1)
			signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)

			select {
			case sig := <-sigChan:
				log.Infof("Received signal, exiting: %s", sig)
				return psubClient.Close()
			case <-ctx.Done():
				log.Infof("Received context cancel signal, exiting: %s", ctx.Err())
				return psubClient.Close()
			}
		})

		g.Go(func() error {
			return c.sub.Receive(ctx, c.msgHandler)
		})

		return g.Wait()
	}

	return newAction("name", log, fn)
}

Expected behavior

No error messages.

Actual behavior

The StreamingPull stream closed for an expected reason and should be recreated, which is done automatically if using Cloud Pub/Sub client libraries. Refer to https://cloud.google.com/pubsub/docs/pull#streamingpull for more information.

Screenshots

[Screenshot: 2024-04-17 at 11:51:58]

@HaraldNordgren added the "triage me" label Apr 17, 2024
@product-auto-label (bot) added the "api: pubsub" label Apr 17, 2024
@hongalex
Member

Hi, how long have you been experiencing this issue? Does this correspond to a recent version bump of the Pub/Sub library?

Separately, in the last code block you have, I wasn't able to see where exactly Receive is called. Could you amend that block to include that?

@hongalex added the "status: investigating" label and removed the "triage me" label Apr 17, 2024
@HaraldNordgren
Author

Hi @hongalex!

My Honeycomb data goes back 2 months, and the issue has been occurring for at least that long. At the start of that period we were using pubsub v1.36.2, and for about the last month we have been using pubsub v1.37.0. Both versions show this issue.

I have amended my code block to include info on how Receive is called 🤗

@hongalex
Member

StreamingPull streams are closed periodically, roughly every 30 minutes, which your graph above confirms. This is intentional and lets the server reassign resources properly. While this behavior isn't documented specifically, StreamingPull streams being closed with a non-OK status is normal: https://cloud.google.com/pubsub/docs/pull-troubleshooting#troubleshooting_a_streamingpull_subscription

Given that the error isn't tied to poor behavior, this is working as intended. If you have noticed your streams behaving poorly though, please let us know so we can investigate further.
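
If the periodic closures are cluttering traces or dashboards, one option is to label them as benign on the reporting side rather than treat them as failures. Below is a minimal sketch, assuming the message text quoted in this issue stays stable (the server wording may change). `isExpectedStreamClose` and `expectedCloseFragment` are hypothetical names for illustration and are not part of the Pub/Sub client library.

```go
package main

import (
	"fmt"
	"strings"
)

// expectedCloseFragment is copied from the error text reported in this issue.
// Matching on message text is an assumption: the server wording may change.
const expectedCloseFragment = "StreamingPull stream closed for an expected reason"

// isExpectedStreamClose is a hypothetical helper (not part of the Pub/Sub
// client) that reports whether an error message describes the routine,
// server-initiated stream closure.
func isExpectedStreamClose(msg string) bool {
	return strings.Contains(msg, expectedCloseFragment)
}

func main() {
	msgs := []string{
		"The StreamingPull stream closed for an expected reason and should be recreated, which is done automatically if using Cloud Pub/Sub client libraries.",
		"permission denied",
	}
	// Label each message so routine closures can be filtered out of alerts.
	for _, m := range msgs {
		if isExpectedStreamClose(m) {
			fmt.Println("benign:", m)
		} else {
			fmt.Println("investigate:", m)
		}
	}
}
```

Substring matching is deliberately simple here. Anything that does not match falls through to "investigate", so an unexpected change in the wording would surface as noise rather than hide a real error.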

@hongalex
Member

Closing this issue since I think my previous comment answers this. If you're experiencing other kinds of unexpected behavior, please open another issue and I'll investigate there.
