feat(bigtable): Add a DirectPath fallback integration test #3384
bigtable/integration_test.go
Outdated
// Blackhole directpath address to test fallback.
func TestIntegration_DirectPathFallback(t *testing.T) {
- // Blackhole directpath address to test fallback.
- func TestIntegration_DirectPathFallback(t *testing.T) {
+ // TestIntegration_DirectPathFallback tests the fallback when the directpath address is unavailable.
+ func TestIntegration_DirectPathFallback(t *testing.T) {
Fixed. I will remember that the first word of the comment should be the same as the function name. Thanks for this!
bigtable/integration_test.go
Outdated
for i := 0; i < numRPCsToSend; i++ {
    _, _ = table.ReadRow(ctx, "jadams")
    if _, useDp := isDirectPathRemoteAddress(testEnv); useDp != isBlackhole {
        atomic.AddUint64(&numCount, 1)
I think I'm missing something -- is this run in parallel?
The bigtable client will open 4 channels by default (https://github.com/googleapis/google-cloud-go/blob/master/bigtable/bigtable.go#L78), so they could be run in parallel.
This particular code will not run in parallel. The code you linked to is for pooling the underlying network connections.
Thanks for pointing that out. I removed the atomic operation.
bigtable/integration_test.go
Outdated
blackholeDpv6Cmd := "sudo ip6tables -I INPUT -s 2001:4860:8040::/42 -j DROP && sleep 5 && echo blackholeDpv6"
blackholeDpv4Cmd := "sudo iptables -I INPUT -s 34.126.0.0/18 -j DROP && sleep 5 && echo blackholeDpv4"
allowDpv6Cmd := "sudo ip6tables -I INPUT -s 2001:4860:8040::/42 -j ACCEPT && sleep 5 && echo allowDpv6"
allowDpv4Cmd := "sudo iptables -I INPUT -s 34.126.0.0/18 -j ACCEPT && sleep 5 && echo allowDpv4"
This makes me really nervous. I don't expect a test to need sudo or alter iptables. Is there another way to do this?
The test itself is a bit tricky because we want to create a bad situation that makes the client think it cannot receive DirectPath responses (i.e., blackholed), so that we can test fallback. Previously in Java, we modified the grpc netty library to achieve the blackhole (code here: https://github.com/googleapis/java-bigtable/blob/master/google-cloud-bigtable/src/test/java/com/google/cloud/bigtable/data/v2/it/DirectPathFallbackIT.java#L184), but this is very hacky and, sorry, I have no idea how to do a similar thing in Go.
Agreed that this seems really problematic as-is.
I'm more familiar with the HTTP clients but if I were testing something similar there, I would look into either setting up a small mock server to hit for my test, or using a custom roundtripper at the transport layer to produce the behavior I was looking for. Is it possible to do something similar in grpc?
In this test, we want to create a low-level failure at the TCP/IP layer, and grpc does not have any API to do that. Besides, a mock server is more like what we would use in unit tests, but this is an integration test and we want to make it as real as possible, so we may not want to use a mock server.
When we run the tests, we own the VM that runs them (create the VM, use sudo to set up the VM, run the test, destroy the VM), so it will be fine to use sudo without worrying about environment consistency.
Also notice that users are not supposed to run this test, since users are not supposed to know whether the traffic is DirectPath or CFE. I added a check for the AttemptDirectPath flag at the beginning of the test. Does this look good?
From @igorbernstein2, this go test might help us out: https://github.com/grpc/grpc-go/blob/ff1fc890e43ac77e922f53a2cef396b3c6a8f2a1/interop/grpclb_fallback/client.go#L78
I agree with the feedback; tests should avoid mutating the machine state.
Ideally it should follow a pattern similar to the Java implementation, where the network failures are injected in process:
https://github.com/googleapis/java-bigtable/blob/master/google-cloud-bigtable/src/test/java/com/google/cloud/bigtable/data/v2/it/DirectPathFallbackIT.java
It appears that grpc-go has some hooks to do something similar via the DialContext. The grpclb_fallback test that @kolea2 linked to seems to use those hooks to test this exact scenario.
@kolea2 @igorbernstein2 The grpclb_fallback test uses a shell command to blackhole the IP, https://github.com/grpc/grpc-go/blob/ff1fc890e43ac77e922f53a2cef396b3c6a8f2a1/interop/grpclb_fallback/client.go#L202-L207. From what I know, the command is iptables. So it seems we still need to use iptables here?
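For comparison, the in-process injection suggested earlier in this thread could be sketched as a custom dialer. This is only an illustration under stated assumptions: the type blackholeDialer, the helpers newFakeDialer and canDial, and the "34.126." host prefix are hypothetical and not part of the PR, and the fake dial function stands in for a real network dial. grpc-go accepts a function of this shape through grpc.WithContextDialer. Refusing a dial fails fast and only affects new connections, whereas an iptables DROP rule silently drops packets on established connections too, so this sketch does not reproduce the same failure mode.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"strings"
	"sync/atomic"
)

// errBlackholed is returned instead of dialing a blocked address.
var errBlackholed = errors.New("address is blackholed")

// dialFunc matches the function signature accepted by grpc.WithContextDialer.
type dialFunc func(ctx context.Context, addr string) (net.Conn, error)

// blackholeDialer (hypothetical) refuses connections to selected hosts
// while blackholing is switched on, and delegates to next otherwise.
type blackholeDialer struct {
	blackholed atomic.Bool
	isBlocked  func(host string) bool
	next       dialFunc
}

// DialContext has the dialFunc shape, so d.DialContext can be passed
// wherever a context dialer is expected.
func (d *blackholeDialer) DialContext(ctx context.Context, addr string) (net.Conn, error) {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return nil, err
	}
	if d.blackholed.Load() && d.isBlocked(host) {
		return nil, fmt.Errorf("dial %s: %w", addr, errBlackholed)
	}
	return d.next(ctx, addr)
}

// newFakeDialer wires the dialer to an in-memory pipe so the sketch runs
// without touching the network. The "34.126." prefix is an assumption
// used only for illustration.
func newFakeDialer() *blackholeDialer {
	return &blackholeDialer{
		isBlocked: func(host string) bool { return strings.HasPrefix(host, "34.126.") },
		next: func(ctx context.Context, addr string) (net.Conn, error) {
			client, server := net.Pipe()
			server.Close() // nothing is served in this demo
			return client, nil
		},
	}
}

// canDial reports whether a dial to addr succeeds.
func canDial(d *blackholeDialer, addr string) bool {
	conn, err := d.DialContext(context.Background(), addr)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	d := newFakeDialer()
	fmt.Println("before blackhole:", canDial(d, "34.126.1.2:443"))
	d.blackholed.Store(true)
	fmt.Println("during blackhole:", canDial(d, "34.126.1.2:443"))
	fmt.Println("other address:", canDial(d, "10.0.0.1:443"))
	d.blackholed.Store(false)
	fmt.Println("after restore:", canDial(d, "34.126.1.2:443"))
}
```

Because the switch is an atomic flag, a test could toggle it between stages without shelling out, at the cost of exercising a less realistic failure than a packet-level blackhole.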
bigtable/integration_test.go
Outdated
}

// Precondition: wait for DirectPath to connect.
countEnough := exerciseDirectPath(ctx, testEnv, table /*blackholeDP = */, false)
Go style does not have inline comments like this -- it's usually a sign the function being called should be split into more than one function. Perhaps one for the true case and one for the false case. That would also hopefully make it easier to understand & debug.
Sorry, I do not understand this comment. I found that this test https://github.com/googleapis/google-cloud-go/blob/master/bigtable/integration_test.go#L2013 also has similar inline comments. Could you give me an example of how I should write the comments? Thanks for the help!
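One way to read the reviewer's suggestion, sketched under assumptions: instead of annotating a bare true or false at the call site, give each case its own named function. The names checkTraffic, waitForDirectPath, and waitForCFE below are hypothetical, and checkTraffic only returns a description so that the sketch stays runnable without a Bigtable client.

```go
package main

import "fmt"

// checkTraffic is a hypothetical stand-in for the helper under review.
// It only describes what it would wait for.
func checkTraffic(blackholeDP bool) string {
	if blackholeDP {
		return "waiting for CFE traffic"
	}
	return "waiting for DirectPath traffic"
}

// waitForDirectPath and waitForCFE name the two cases, so call sites do
// not need a /*blackholeDP = */ comment to explain a bare true or false.
func waitForDirectPath() string { return checkTraffic(false) }
func waitForCFE() string        { return checkTraffic(true) }

func main() {
	fmt.Println(waitForDirectPath())
	fmt.Println(waitForCFE())
}
```

The boolean still exists internally, but readers of the test body see intent-revealing names, and gofmt has no inline comment to move around.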
bigtable/integration_test.go
Outdated
    }
}

func exerciseDirectPath(ctx context.Context, testEnv IntegrationEnv, table *Table, isBlackhole bool) bool {
It's unclear to me what the return value is. Perhaps a different function name?
I changed the function name to examineTraffic. The function checks and returns whether enough DirectPath or CFE traffic is seen within 2 minutes.
cc @kolea2
bigtable/integration_test.go
Outdated
blackholeDpv6Cmd := "sudo ip6tables -I INPUT -s 2001:4860:8040::/42 -j DROP && sleep 5 && echo blackholeDpv6"
blackholeDpv4Cmd := "sudo iptables -I INPUT -s 34.126.0.0/18 -j DROP && sleep 5 && echo blackholeDpv4"
allowDpv6Cmd := "sudo ip6tables -I INPUT -s 2001:4860:8040::/42 -j ACCEPT && sleep 5 && echo allowDpv6"
allowDpv4Cmd := "sudo iptables -I INPUT -s 34.126.0.0/18 -j ACCEPT && sleep 5 && echo allowDpv4"
Agreed that this seems really problematic as-is.
I'm more familiar with the HTTP clients but if I were testing something similar there, I would look into either setting up a small mock server to hit for my test, or using a custom roundtripper at the transport layer to produce the behavior I was looking for. Is it possible to do something similar in grpc?
I have an idea about the |
I like the flag setup much more. Thanks!
flag.StringVar(&blackholeDpv6Cmd, "it.blackhole-dpv6-cmd", "", "Command to make LB and backend addresses blackholed over dpv6")
flag.StringVar(&blackholeDpv4Cmd, "it.blackhole-dpv4-cmd", "", "Command to make LB and backend addresses blackholed over dpv4")
flag.StringVar(&allowDpv6Cmd, "it.allow-dpv6-cmd", "", "Command to make LB and backend addresses allowed over dpv6")
flag.StringVar(&allowDpv4Cmd, "it.allow-dpv4-cmd", "", "Command to make LB and backend addresses allowed over dpv4")
Add a comment with the expected values for these flags?
Comment added.
}

if len(blackholeDpv6Cmd) == 0 {
    t.Fatal("-it.blackhole-dpv6-cmd unset")
Looks like this will fail the test if the flags aren't passed in. Perhaps we should skip instead?
We have that 3 lines above, from L2164 to L2166:
    if !testEnv.Config().AttemptDirectPath {
        return
    }
If the test is meant to exercise DirectPath, then we can fail the test if no command is given.
bigtable/integration_test.go
Outdated
}

// Precondition: wait for DirectPath to connect.
countEnough := examineTraffic(ctx, testEnv, table /*blackholeDP = */, false)
I'm not sure what this blackholeDP = comment means.
Sorry, I wanted to put /*blackholeDP=*/ after the comma, like countEnough := examineTraffic(ctx, testEnv, table, /*blackholeDP = */false), to indicate the meaning of the parameter, i.e., whether DirectPath is blackholed or not, because a bare true or false here could be confusing. However, gofmt moves the comment before the comma.
Those types of comments are not ordinarily done in Go style, which is why gofmt doesn't handle them well -- they're uncommon and not designed for. I gave a few suggestions below of ways we could simplify.
bigtable/integration_test.go
Outdated
    }
} else {
    if blackholeDP {
-     }
- } else {
-     if blackholeDP {
+     }
+     return
+ }
+ if blackholeDP {
https://github.com/golang/go/wiki/CodeReviewComments#indent-error-flow is related.
Fixed. Thanks for catching the format!
Mostly a Go style review. I'm not 100% sure I understand the goal of all of the code, so some questions are very open.
bigtable/integration_test.go
Outdated
    minCompleteRPC = 40
)

countEnough := false
Can we return as soon as we set this to true, thereby avoiding the need for this variable at all?
Yeah, I have removed the variable.
bigtable/integration_test.go
Outdated
// examineTraffic counts RPCs use DirectPath or CFE traffic.
func examineTraffic(ctx context.Context, testEnv IntegrationEnv, table *Table, expectDP bool) bool {
    var numCount uint64
Consider:
- var numCount uint64
+ count := 0
Done.
bigtable/integration_test.go
Outdated
    numCount++
}
time.Sleep(100 * time.Millisecond)
countEnough = numCount >= minCompleteRPC
With the suggestion to return early above, the check should be above the time.Sleep call. No need to sleep if we're already done.
Done.
bigtable/integration_test.go
Outdated
if _, useDP := isDirectPathRemoteAddress(testEnv); useDP != expectDP {
    numCount++
}
Does useDP ever change in this loop? It appears that it will always be the same value?
Also, why send any RPCs if this will never be true? Could we just return before doing anything?
Perhaps the caller should check isDirectPathRemoteAddress; then this function doesn't need the expectDP argument?
useDP does get changed. useDP is returned by isDirectPathRemoteAddress(), which will return true if the traffic is DirectPath, and false if the traffic is CFE. So when DP is allowed, we expect useDP to be true, but when DP is blackholed, we expect useDP to be false.
Sorry, the name of the variable expectDP is misleading; I have renamed it to blackholeDP. Thus, when useDP != blackholeDP, we get the traffic as expected. Specifically, as described in the PR description, this test has three stages, as follows:
- First wait until DirectPath traffic is observed. In this stage, useDP should be true, and blackholeDP should be false.
- Blackhole DirectPath net, and wait until the client falls back to use CFE. In this stage, useDP should be false, and blackholeDP should be true.
- Unblackhole DirectPath net, and the client should upgrade back to use DirectPath again. In this stage, useDP should be true, and blackholeDP should be false.
The DirectPath and CFE traffic are distinguished by peer IP.
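The three stages above can be sketched as a polling loop driven by a fake probe. This is a simplified illustration, not the PR's code: waitForTraffic, simulateStages, the probe callback, and the thresholds are hypothetical names and values, and the simulated blackhole takes effect instantly, whereas a real client needs time to fall back or upgrade.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForTraffic (hypothetical) polls probe until minMatches probes have
// reported the wanted path, or until ctx is done. probe returns true when
// the last RPC used DirectPath.
func waitForTraffic(ctx context.Context, probe func() bool, blackholeDP bool, minMatches int, interval time.Duration) bool {
	matches := 0
	for {
		// Same condition as in the review thread: traffic is as
		// expected when useDP differs from blackholeDP.
		if useDP := probe(); useDP != blackholeDP {
			matches++
		}
		if matches >= minMatches {
			return true // return before sleeping once enough RPCs matched
		}
		select {
		case <-ctx.Done():
			return false
		case <-time.After(interval):
		}
	}
}

// simulateStages runs the three stages against a fake probe whose result
// flips when the simulated blackhole is toggled.
func simulateStages() (stage1, stage2, stage3 bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	blackholed := false
	probe := func() bool { return !blackholed }

	// Stage 1: DirectPath traffic is observed.
	stage1 = waitForTraffic(ctx, probe, false, 5, time.Millisecond)
	// Stage 2: blackhole DirectPath and wait for fallback to CFE.
	blackholed = true
	stage2 = waitForTraffic(ctx, probe, true, 5, time.Millisecond)
	// Stage 3: remove the blackhole and wait for the upgrade back.
	blackholed = false
	stage3 = waitForTraffic(ctx, probe, false, 5, time.Millisecond)
	return stage1, stage2, stage3
}

func main() {
	s1, s2, s3 := simulateStages()
	fmt.Println("DirectPath observed:", s1)
	fmt.Println("fallback to CFE observed:", s2)
	fmt.Println("upgrade to DirectPath observed:", s3)
}
```

Checking the count before waiting follows the early-return feedback given elsewhere in this review, and the context deadline bounds each stage instead of a fixed iteration count.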
bigtable/integration_test.go
Outdated
@@ -2259,3 +2343,34 @@ func isDirectPathRemoteAddress(testEnv IntegrationEnv) (_ string, _ bool) {
    // DirectPath ipv6 can use either ipv4 or ipv6 traffic.
    return remoteIP, strings.HasPrefix(remoteIP, directPathIPV4Prefix) || strings.HasPrefix(remoteIP, directPathIPV6Prefix)
}

func blackholeOrAllowDirectPath(testEnv IntegrationEnv, t *testing.T, blackholeDP bool) {
Can we split this into two functions, one corresponding to blackholeDP=true and one for false?
That would get rid of the need for the blackholeDP argument at all, making calls more clear and easier to debug.
Done
Force-pushed from 24cd371 to 001b5a7.
bigtable/integration_test.go
Outdated
defer cleanup()

if !testEnv.Config().AttemptDirectPath {
    return
Should this return or t.Skip? Return means the test "passes." So, skipping seems more correct.
Done
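The precondition handling discussed in this review (skip when DirectPath is not being attempted, fail when it is attempted but a command flag is missing) can be sketched as a small decision function. The names action, decide, and requiredFlags are hypothetical; only the flag names and the "unset" message come from the diff above. In a real test the result would be mapped to t.Skip or t.Fatal.

```go
package main

import "fmt"

// action says what the test should do after checking its preconditions.
type action int

const (
	actionRun action = iota
	actionSkip
	actionFail
)

// requiredFlags lists the command flags registered in the diff above.
var requiredFlags = []string{
	"it.blackhole-dpv6-cmd",
	"it.blackhole-dpv4-cmd",
	"it.allow-dpv6-cmd",
	"it.allow-dpv4-cmd",
}

// decide (hypothetical) returns the action and a message for the log.
// In a test this would translate to:
//
//	switch act, msg := decide(...); act {
//	case actionSkip:
//		t.Skip(msg)
//	case actionFail:
//		t.Fatal(msg)
//	}
func decide(attemptDirectPath bool, cmds map[string]string) (action, string) {
	if !attemptDirectPath {
		// Skipping, rather than returning, keeps the test from
		// being reported as passed when it never ran.
		return actionSkip, "DirectPath is not being attempted"
	}
	for _, name := range requiredFlags {
		if cmds[name] == "" {
			return actionFail, "-" + name + " unset"
		}
	}
	return actionRun, ""
}

func main() {
	act, msg := decide(false, nil)
	fmt.Println(act == actionSkip, msg)

	act, msg = decide(true, map[string]string{"it.blackhole-dpv6-cmd": "cmd"})
	fmt.Println(act == actionFail, msg)
}
```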
bigtable/integration_test.go
Outdated
countEnough := examineTraffic(ctx, testEnv, table, false)
if !countEnough {
Perhaps this name could be more clear?
- countEnough := examineTraffic(ctx, testEnv, table, false)
- if !countEnough {
+ dpEnabled := examineTraffic(ctx, testEnv, table, false)
+ if !dpEnabled {
Done.
bigtable/integration_test.go
Outdated
if testEnv.Config().DirectPathIPV4Only {
    cmdRes := exec.Command("bash", "-c", blackholeDpv4Cmd)
    out, _ := cmdRes.CombinedOutput()
    t.Logf(string(out))
} else {
    cmdRes := exec.Command("bash", "-c", blackholeDpv4Cmd)
    out, _ := cmdRes.CombinedOutput()
    t.Logf(string(out))
    cmdRes = exec.Command("bash", "-c", blackholeDpv6Cmd)
    out, _ = cmdRes.CombinedOutput()
    t.Logf(string(out))
}
- if testEnv.Config().DirectPathIPV4Only {
-     cmdRes := exec.Command("bash", "-c", blackholeDpv4Cmd)
-     out, _ := cmdRes.CombinedOutput()
-     t.Logf(string(out))
- } else {
-     cmdRes := exec.Command("bash", "-c", blackholeDpv4Cmd)
-     out, _ := cmdRes.CombinedOutput()
-     t.Logf(string(out))
-     cmdRes = exec.Command("bash", "-c", blackholeDpv6Cmd)
-     out, _ = cmdRes.CombinedOutput()
-     t.Logf(string(out))
- }
+ cmdRes := exec.Command("bash", "-c", blackholeDpv4Cmd)
+ out, _ := cmdRes.CombinedOutput()
+ t.Logf(string(out))
+ if testEnv.Config().DirectPathIPV4Only {
+     return
+ }
+ cmdRes = exec.Command("bash", "-c", blackholeDpv6Cmd)
+ out, _ = cmdRes.CombinedOutput()
+ t.Logf(string(out))
Done.
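The suggested version still discards the error from CombinedOutput and passes command output to t.Logf as the format string, so a % in the output would be read as a formatting verb. A possible further cleanup, sketched under assumptions, is shown here: runner, bashRunner, runAll, and blackholeCmds are hypothetical names, and the demo uses a fake runner so nothing on the machine is changed.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runner executes a shell command and returns its combined output.
type runner func(cmd string) (string, error)

// bashRunner is what a real run might use; it is not called in this demo.
func bashRunner(cmd string) (string, error) {
	out, err := exec.Command("bash", "-c", cmd).CombinedOutput()
	return string(out), err
}

// runAll runs each command in order, logs its output, and stops at the
// first failure instead of silently ignoring it.
func runAll(run runner, logf func(format string, args ...any), cmds ...string) error {
	for _, cmd := range cmds {
		out, err := run(cmd)
		logf("%s", out) // output is an argument, never the format string
		if err != nil {
			return fmt.Errorf("run %q: %w", cmd, err)
		}
	}
	return nil
}

// blackholeCmds picks the commands for the configured IP families: the
// IPv4 command always runs, the IPv6 one only when not IPv4-only.
func blackholeCmds(ipv4Only bool, v4Cmd, v6Cmd string) []string {
	if ipv4Only {
		return []string{v4Cmd}
	}
	return []string{v4Cmd, v6Cmd}
}

func main() {
	fake := func(cmd string) (string, error) { return "ran: " + cmd + "\n", nil }
	logf := func(format string, args ...any) { fmt.Printf(format, args...) }

	cmds := blackholeCmds(false, "blackhole-v4", "blackhole-v6")
	if err := runAll(fake, logf, cmds...); err != nil {
		fmt.Println("error:", err)
	}
}
```

In a test, t.Logf could be passed directly as logf, and a non-nil error from runAll could be reported with t.Fatal.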
bigtable/integration_test.go
Outdated
    }
}

// examineTraffic counts RPCs use DirectPath or CFE traffic.
- // examineTraffic counts RPCs use DirectPath or CFE traffic.
+ // examineTraffic returns whether RPCs use DirectPath (blackholeDP = false) or CFE (blackholeDP = true).
Is that right? I'm not sure if there is a more clear way to say it.
Yes this is correct.
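Add a DirectPath fallback test for bigtable in Golang. The logic of the test is:
- First wait until DirectPath traffic is observed;
- Blackhole DirectPath net, and wait until the client falls back to use CFE;
- Unblackhole DirectPath net, and the client should upgrade back to use DirectPath again.
The DirectPath and CFE traffic are distinguished by peer IP.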
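As an illustration of distinguishing traffic by peer IP, here is a sketch that checks an address against CIDR ranges with net/netip rather than string prefixes. The two ranges are the ones that appear in the iptables commands earlier in this review; whether they cover every DirectPath address is an assumption, and directPathRanges and isDirectPathIP are hypothetical names, not the PR's isDirectPathRemoteAddress.

```go
package main

import (
	"fmt"
	"net/netip"
)

// directPathRanges holds the ranges used by the blackhole commands in
// this review. Treating them as the full set of DirectPath addresses is
// an assumption made for this sketch.
var directPathRanges = []netip.Prefix{
	netip.MustParsePrefix("34.126.0.0/18"),
	netip.MustParsePrefix("2001:4860:8040::/42"),
}

// isDirectPathIP (hypothetical) reports whether remoteIP falls inside
// one of the DirectPath ranges.
func isDirectPathIP(remoteIP string) (bool, error) {
	addr, err := netip.ParseAddr(remoteIP)
	if err != nil {
		return false, fmt.Errorf("parse peer IP %q: %w", remoteIP, err)
	}
	// Unmap turns an IPv4-mapped IPv6 address into plain IPv4 so that
	// it can match the IPv4 prefix.
	addr = addr.Unmap()
	for _, p := range directPathRanges {
		if p.Contains(addr) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, ip := range []string{"34.126.10.20", "34.126.64.1", "2001:4860:8041::1", "10.0.0.1"} {
		ok, err := isDirectPathIP(ip)
		fmt.Println(ip, ok, err)
	}
}
```

A CIDR check avoids false matches that a textual prefix can produce, at the cost of parsing each peer address.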