Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spanner): Adding GFE Latency and Header Missing Count Metrics #5199
feat(spanner): Adding GFE Latency and Header Missing Count Metrics #5199
Changes from 11 commits
b672cd0
dc4b117
67ae2ab
8f86321
91ca57a
444bd38
83cba34
a3aff82
32aa54c
77d40f9
58fab64
212d86a
06b42ef
f074b9f
50bf9c1
5dc51b1
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and elsewhere: I think it is better to add an
if err != nil
{ return err }` statement before checking for the latency values. I worry that we otherwise might start returning errors that are harder to interpret for users if the following happens:client.PartitionRead
returns an error. That error is not checked here at the moment.md != nil
), but that header does not contain a GFE latency value. So instead of the error that was returned fromPartitionedRead
, we return an error that indicates that it could not find a GFE Latency value.I worry that the above would be confusing for users, and make it harder to debug errors that occur because of for example invalid requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually if there is no GFE Latency value it won't throw an error. Instead it will add to the header missing count. It only throws an error if it got a garbage value in the value for the key value pair "server-timing". Also currently in the Java implementation, they intercept every rpc call(even when there is a failed operation) and use it to increase the header missing count. I feel if we don't add to the Header missing count in this case then across languages the header missing count might be different in someway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. So agreed to keep this as is on that.
I still worry a little bit that we might be returning an error from extracting the metadata instead of the error from the actual operation. I think we had a discussion about that earlier, and agreed on returning the error from those methods. I think I'm starting to change my mind on that. It is not an error that the user can do anything about, and it might just cover other potentially more important errors. So I would suggest then ignoring those errors. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we log these errors instead then? How do we let the user know that an error has occurred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have access to the logger, then logging it is probably a good idea. I don't think users can do anything with these errors anyways, so I don't think we need to actively inform them any more than that (and the chance that it happens should also be minimal, as it would mean that the data that is sent back is invalid)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per our conversation going to add a trace instead of returning an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this early return is no longer needed now that we don't return an error if getting the latency stats fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is still needed as there might be an error in getting the Header, and in that case we return the error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but that happens on line 333. The reason that we added the early return above in the first place, was that we would return the error that might be returned
createContextAndCaptureGFELatencyMetrics
. As we don't do that anymore, we can safely wait until line 333 before returning the error from theclient.Header()
method.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then won't md, err := client.Header() override the value of the original error. I will have to add a separate check as md, errGFE : = client.Header()
if errGFE != nil {
return nil, errGFE
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we mean the same return in this case. I think you can safely return the
if err != nil
block starting at line 325. Theerr
on line 329 will not override theerr
from the call toclient.Header()
, as that is a different variable, although it does have the same name. Thaterr
is only visible in the statement on line 329.So to summarize:
if err != nil
on line 321 should remain.if err != nil
on line 325 can be removed. (And yes, themd, err := client.Header()
does override the previous value oferr
, but that is not a problem as we know thaterr
at that point isnil
)if err := createContextAndCaptureGFELatencyMetrics
on line 329 can safely remain. That will not override anything, as it is a separate variable only defined on line 329.return client, err
on line 333 will therefore always return the value oferr
that was returned byclient.Header()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay makes sense. My bad. I thought you meant remove the error return on line 321. Will do that.