Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(connlib): simulate IO in state machine tests #4728

Merged
merged 47 commits into from May 22, 2024

Conversation

thomaseizinger
Copy link
Member

@thomaseizinger thomaseizinger commented Apr 22, 2024

This is similar to #4097 and #4585 but for the entire ClientState and GatewayState. We also do it in the context of a property-based test with the vision that we can deterministically explore a large space of state transitions and see where our main property breaks: Being able to send an ICMP packet from the client to the gateway.

In other words, we now correctly pass all the Transmits back and forth between the components as if they would receive it from the network. Due to the nature of property-based tests, this already exercises a very large input space. For example, if the client does not have an IPv6 socket and the gateway doesn't have an IPv4 socket, this test already checks whether we then correctly fall back to using a relay (because the allocation we make on the relay is the only network path where the STUN requests pass through).

What this does not (yet) do is set up a proper network topology. The dispatch_transmit function will happily "route" a Transmit from e.g. the client to the gateway even if they are in different subnets. In other words, these tests assume that the actual network itself works and we can exchange UDP packets between the components.

For now, we only send ICMPs to CIDR resources. As a next step, we can extend this to DNS resources by sending DNS queries for our DNS resources and then sending an ICMP to the resolved IP.

Copy link

vercel bot commented Apr 22, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
firezone ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 22, 2024 11:01pm

Copy link

github-actions bot commented Apr 22, 2024

Terraform Cloud Plan Output

Plan: 15 to add, 15 to change, 15 to destroy.

Terraform Cloud Plan

Copy link

github-actions bot commented Apr 22, 2024

Performance Test Results

TCP

Test Name Received/s Sent/s Retransmits
direct-tcp-client2server 240.5 MiB (+1%) 243.1 MiB (+1%) 276 (+44%)
direct-tcp-server2client 244.3 MiB (-1%) 245.3 MiB (-1%) 229 (-14%)
relayed-tcp-client2server 224.2 MiB (-2%) 225.7 MiB (-1%) 247 (+16%)
relayed-tcp-server2client 240.1 MiB (+3%) 240.8 MiB (+3%) 332 (+3%)

UDP

Test Name Total/s Jitter Lost
direct-udp-client2server 500.0 MiB (-0%) 0.05ms (+46%) 41.14% (+4%)
direct-udp-server2client 500.0 MiB (+0%) 0.01ms (-56%) 24.61% (+10%)
relayed-udp-client2server 500.0 MiB (+0%) 0.06ms (-84%) 56.27% (-3%)
relayed-udp-server2client 500.0 MiB (+0%) 0.03ms (+71%) 42.92% (-16%)

@thomaseizinger thomaseizinger force-pushed the chore/connlib/tests-send-transmit branch from 19e7358 to 846b05c Compare April 24, 2024 07:49
@thomaseizinger thomaseizinger force-pushed the chore/connlib/tests-send-transmit branch from 1beedd2 to abf84f0 Compare May 20, 2024 09:23
@thomaseizinger thomaseizinger changed the title wip: wire up client & gateway in state machine tests test(gateway): simulate IO in state machine tests May 20, 2024
@github-actions github-actions bot added the kind/test PRs that are focused on increasing test coverage label May 20, 2024
@thomaseizinger thomaseizinger changed the title test(gateway): simulate IO in state machine tests test(connlib): simulate IO in state machine tests May 21, 2024
@thomaseizinger thomaseizinger force-pushed the chore/connlib/tests-send-transmit branch from ea0b736 to 1d9760f Compare May 21, 2024 05:25
@@ -5871,8 +5862,8 @@ checksum = "a2eb9349b6444b326872e140eb1cf5e7c522154d69e7a0ffb0fb81c06b37543f"

[[package]]
name = "str0m"
version = "0.5.0"
source = "git+https://github.com/firezone/str0m?branch=main#44b616945145033ade739f30cdce004728011619"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR updates str0m in to include algesten/str0m#515. Because main of str0m also includes algesten/str0m#508 which caused problems (#4826) when we integrated it into firezone, I've reverted that PR in our fork in firezone/str0m@07c4fb0.

Thus, our current diff to upsteam str0m looks like this: algesten/str0m@main...firezone:str0m:main

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In here, we update the APIs of ClientState to be easier to call from the tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing for the gateway. These updates make it easier / possible to call the necessary APIs from the proptest.

@thomaseizinger thomaseizinger marked this pull request as ready for review May 21, 2024 06:00
Copy link
Collaborator

@conectado conectado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still have to finish the review but it's looking really good!

leaving just a small suggestion

rust/relay/src/proptest.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@conectado conectado left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

}
Transition::SendICMPPacketToIp4Resource { src, r_idx } => {
Transition::SendICMPPacketToIp4Resource { r_idx } => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something good to add in a later PR would be sending packets the other way, to test the encapsulate on the gateway-side and decapsulate on the client-side

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll record it as a follow-up issue!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: #5099

Comment on lines +294 to +295
let weight_ip4 = if num_ip4_resources == 0 { 0 } else { 3 };
let weight_ip6 = if num_ip6_resources == 0 { 0 } else { 3 };
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use 3 here to represent that it's more likely to have packets flowing than resources changing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And more likely to generate packets for resources than non-resources.

rust/connlib/tunnel/src/tests.rs Outdated Show resolved Hide resolved
@thomaseizinger
Copy link
Member Author

@ReactorScram I am merging based on @conectado 's review and to move my stacked PRs forward. I'll implement any feedback in follow-up PRs!

@thomaseizinger thomaseizinger added this pull request to the merge queue May 22, 2024
Merged via the queue into main with commit 92676f0 May 22, 2024
135 checks passed
@thomaseizinger thomaseizinger deleted the chore/connlib/tests-send-transmit branch May 22, 2024 23:23
rust/connlib/shared/src/error.rs Show resolved Hide resolved
rust/connlib/snownet/src/node.rs Show resolved Hide resolved
fn preconditions(state: &Self::State, transition: &Self::Transition) -> bool {
match transition {
Transition::AddCidrResource(r) => {
// TODO: PRODUCTION CODE DOES NOT HANDLE THIS!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Production does not handle a gateway getting a resource that it doesn't have connectivity for?

rust/connlib/tunnel/src/tests.rs Show resolved Hide resolved
rust/connlib/tunnel/src/tests.rs Show resolved Hide resolved
}

/// Implementation of our reference state machine.
/// Stub implementation of the portal.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really cool if I could use all this to test the Windows Clients in CI without having access to Docker / Elixir / a staging portal. but the SimPortal couldn't hook up to a real Client tunnel yet, right?

I think we had it as a long-term goal but it would take a while?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are running in Windows CI already! But they don't do any platform-specific stuff so unless the Rust compiler mis-compiles some of the state machine stuff, I'd be surprised to find a windows-specific bug with these tests.

Or what exactly were you thinking of running in CI here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean I'd like to boot up the Client, open the GUI stuff, open wintun and the virtual network interface, and exercise all the platform-specific code with only a fake Portal and fake Gateway, since those don't run easily in Windows and are hard to hook into the runners.

Like sort of tell connlib "I know there is no portal but just connect everything anyway"

github-merge-queue bot pushed a commit that referenced this pull request May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/test PRs that are focused on increasing test coverage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants