Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(storage): fix endpoint selection logic #3172

Merged
merged 13 commits into from Nov 16, 2020
36 changes: 22 additions & 14 deletions storage/storage.go
Expand Up @@ -43,6 +43,7 @@ import (
"cloud.google.com/go/internal/version"
"google.golang.org/api/googleapi"
"google.golang.org/api/option"
"google.golang.org/api/option/internaloption"
raw "google.golang.org/api/storage/v1"
htransport "google.golang.org/api/transport/http"
)
Expand Down Expand Up @@ -105,41 +106,48 @@ type Client struct {
func NewClient(ctx context.Context, opts ...option.ClientOption) (*Client, error) {
var host, readHost, scheme string

// In general, it is recommended to use NewService instead of NewClient, since NewService
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the recommended path is to call NewService without WithHTTPClient, which avoids the need for NewClient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. Maybe we can consider exporting the internal http client from NewService in the future so it can support the direct-access, client reuse scenario found in reader.go.

andyrzhao marked this conversation as resolved.
Show resolved Hide resolved
andyrzhao marked this conversation as resolved.
Show resolved Hide resolved
// configures the correct default endpoints when initializing the internal http client.
// However, in our case, reader.go needs to access the client directly, so the client
andyrzhao marked this conversation as resolved.
Show resolved Hide resolved
// gets created manually here so it can be re-used by both reader.go and NewService.
// This means we need to manually configure the default endpoint options on the http client.
// Furthermore, we need to account for STORAGE_EMULATOR_HOST override when setting
// the default endpoints.
if host = os.Getenv("STORAGE_EMULATOR_HOST"); host == "" {
scheme = "https"
readHost = "storage.googleapis.com"

// Prepend default options to avoid overriding options passed by the user.
opts = append([]option.ClientOption{option.WithScopes(ScopeFullControl), option.WithUserAgent(userAgent)}, opts...)

opts = append(opts, internaloption.WithDefaultEndpoint("https://storage.googleapis.com/storage/v1/"))
opts = append(opts, internaloption.WithDefaultMTLSEndpoint("https://storage.mtls.googleapis.com/storage/v1/"))
Comment on lines +123 to +124
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

internaloption API question: why do we have WithDefault... rather than prepending the defaults like we do above (see "Prepend default options" comment)? Should have asked this when updating internaloption, but I didn't think of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yea, the reason why "WithDefaultEndpoint" was introduced is that we needed a way to distinguish between user specified override and auto-generated default endpoint. In other words, with the "Prepend" technique, downstream code has no way to tell whether someone wanted to override the endpoint explicitly, since it's all coming from a "WithEndpoint". This distinction is needed to support DCA, where we automatically upgrade the default endpoint to defaultMTLSendpoint, unless explicitly overridden by the user. Therefore, base clients should almost always use WithDefaultEndpoint. (Note that this PR sets WithEndpoint for NewService because the endpoint has already been computed - the alternative would be to pass in the full set of ClientOptions.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we should do that directly, rather than indirectly through "default" vs not. So, we could add another setting that's something like attemptDCA with a default of true. If the user specifies an option that means we shouldn't attempt DCA, set attemptDCA=false.

Right now, the logic you laid out (which makes sense) is all implicit, making this more difficult to understand. I could totally be missing something, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So DCA aside, I'd argue that having the client options "WithDefaultEndpoint" and "WithDefaultMTLSEndpoint" actually makes the client intentions more explicit and less error prone, and we no longer have to rely on endpoint options being propagated in the right order, which is made even more troublesome by hand-written clients. Having the defaults as separate options also allows us to support other use-cases, such as "endpoint merging". For example, cbro added the following logic in https://github.com/googleapis/google-api-go-client/blob/master/transport/internal/dca/dca.go a while back:

// If the endpoint override is an address (host:port) rather than full base
// URL (ex. https://...), then the user-provided address will be merged into
// the default endpoint. For example, WithEndpoint("myhost:8000") and
// WithDefaultEndpoint("https://foo.com/bar/baz") will return "https://myhost:8080/bar/baz"

And circling back to your original comment about DCA, I agree the logic feels a bit like a black box today, but it is by design, as we eventually want to make it the default behavior that happens without any additional user configuration. You can read more about the DCA spec at https://google.aip.dev/auth/4114, including the env variables we already support today to control the behavior. Thanks!

} else {
scheme = "http"
readHost = host

opts = append([]option.ClientOption{option.WithoutAuthentication()}, opts...)

opts = append(opts, internaloption.WithDefaultEndpoint(host))
opts = append(opts, internaloption.WithDefaultMTLSEndpoint(host))
}

// htransport selects the correct endpoint among WithEndpoint (user override), WithDefaultEndpoint, and WithDefaultMTLSEndpoint.
hc, ep, err := htransport.NewClient(ctx, opts...)
if err != nil {
return nil, fmt.Errorf("dialing: %v", err)
}
rawService, err := raw.NewService(ctx, option.WithHTTPClient(hc))
// RawService should be created with the chosen endpoint to take account of user override.
rawService, err := raw.NewService(ctx, option.WithEndpoint(ep), option.WithHTTPClient(hc))
if err != nil {
return nil, fmt.Errorf("storage client: %v", err)
}
if ep == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I'm understanding this correctly, we no longer need the ep == "" case since we can always expect a valid endpoint to be returned given that we are passing in WithEndpoint to NewService. Is that correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the endpoint is computed by htransport, which will now always return a non-empty endpoint after this change, since we are setting the defaults. Passing "WithEndpoint" to NewService is just a short-cut since the endpoint has already been computed (the alternative would be to pass the full set of user-specified ClientOptions to NewService, which would end up recomputing the endpoint.)

// Override the default value for BasePath from the raw client.
// TODO: remove when the raw client uses this endpoint as its default (~end of 2020)
rawService.BasePath = "https://storage.googleapis.com/storage/v1/"
} else {
// If the endpoint has been set explicitly, use this for the BasePath
// as well as readHost
rawService.BasePath = ep
u, err := url.Parse(ep)
if err != nil {
return nil, fmt.Errorf("supplied endpoint %v is not valid: %v", ep, err)
}
readHost = u.Host
// Update readHost with the chosen endpoint.
u, err := url.Parse(ep)
if err != nil {
return nil, fmt.Errorf("supplied endpoint %v is not valid: %v", ep, err)
andyrzhao marked this conversation as resolved.
Show resolved Hide resolved
}
readHost = u.Host

return &Client{
hc: hc,
Expand Down