Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: Add agent process supervisor tests #20741

Merged
merged 95 commits into from
May 30, 2023
Merged

Conversation

averche
Copy link
Contributor

@averche averche commented May 24, 2023

This introduces 6 integration tests for agent running in process supervisor mode (#20739):

  1. simple test to ensure environment variables are injected correctly
  2. test when the child process exits early
  3. test when the child process exits early with an error
  4. test sending sigterm to the child process
  5. test sending sigusr1 to the child process
  6. test child process ignoring signals

This PR is part of a larger effort to implement secrets as environment variable support in agent (VLT-253)

This PR was started in #20722 by @dhuckins

dhuckins and others added 30 commits May 8, 2023 11:16
Co-authored-by: Anton Averchenkov <84287187+averche@users.noreply.github.com>
Co-authored-by: Anton Averchenkov <84287187+averche@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Co-authored-by: Anton Averchenkov <84287187+averche@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Co-authored-by: Anton Averchenkov <84287187+averche@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
Signed-off-by: Daniel Huckins <dhuckins@users.noreply.github.com>
@averche averche requested a review from a team as a code owner May 24, 2023 21:05
command/agent/exec/test-app/main.go Show resolved Hide resolved
command/agent/exec/exec_test.go Outdated Show resolved Hide resolved
command/agent/exec/exec_test.go Show resolved Hide resolved
"request_id": "8af096e9-518c-7351-eff5-5ba20554b21f",
"lease_id": "",
"renewable": false,
"lease_duration": 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a test on a dynamic secret, as well as static (it would likely make more sense on a real Vault server, as mentioned above).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dynamic secret might be a bit difficult to set up on a real server. I'm thinking we can use fake server that simply scrambles the response on each request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could request a token? That should be easy enough to set up, I think. For example, while TestAgent_Cache_DynamicSecret uses the API proxy subsystem of Agent, I don't see why it couldn't be doable using templating?

}},
testAppArgs: []string{"--stop-after", "10s"},
testAppStopSignal: syscall.SIGTERM,
testAppPort: 34001,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of static ports, could we use something similar to generateListenerAddress in agent_test to try and avoid collisions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a similar concern (port collisions when running the tests from multiple PR's) but if I understand correctly, GHA should run each PR in an isolated/containerized environment so it should not be an issue

}()

for name, testCase := range testCases {
t.Run(name, func(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use t.Parallel() on the next line here (and at the start of the function)? I can't see anything that might conflict, but I could be missing something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried t.Parallel but it's failing the tests -- possibly something related to the data race.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the tests currently can't be run in parallel since there is a shared setup (a common test app that is built ahead of time). This would probably work better in a test suite/fixture, but these don't seem to be used in Vault.

@VioletHynes
Copy link
Contributor

Oh, as an FYI, there's a data race test failure, in case you've not seen it:

WARNING: DATA RACE
Write at 0x00c000c46cc0 by goroutine 163:
  github.com/hashicorp/consul-template/child.(*Child).kill.func1()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:439 +0x86
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.20.4/x64/src/runtime/panic.go:476 +0x32
  github.com/hashicorp/consul-template/child.(*Child).internalStop()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:296 +0x1d5
  github.com/hashicorp/consul-template/child.(*Child).Stop()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:274 +0x1877
  github.com/hashicorp/vault/command/agent/exec.(*Server).Run()
      /home/runner/work/vault/vault/command/agent/exec/exec.go:131 +0x1857
  github.com/hashicorp/vault/command/agent/exec.TestServer_Run.func2.1()
      /home/runner/work/vault/vault/command/agent/exec/exec_test.go:203 +0x67

Previous read at 0x00c000c46cc0 by goroutine 184:
  github.com/hashicorp/consul-template/child.(*Child).kill.func2()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:457 +0x84

Goroutine 163 (running) created at:
  github.com/hashicorp/vault/command/agent/exec.TestServer_Run.func2()
      /home/runner/work/vault/vault/command/agent/exec/exec_test.go:202 +0xbbe
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.4/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.4/x64/src/testing/testing.go:1629 +0x47

Goroutine 184 (running) created at:
  github.com/hashicorp/consul-template/child.(*Child).kill()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:455 +0x4ab
  github.com/hashicorp/consul-template/child.(*Child).internalStop()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:296 +0x1d5
  github.com/hashicorp/consul-template/child.(*Child).Stop()
      /home/runner/go/pkg/mod/github.com/hashicorp/consul-template@v0.32.0/child/child.go:274 +0x1877
  github.com/hashicorp/vault/command/agent/exec.(*Server).Run()
      /home/runner/work/vault/vault/command/agent/exec/exec.go:131 +0x1857
  github.com/hashicorp/vault/command/agent/exec.TestServer_Run.func2.1()
      /home/runner/work/vault/vault/command/agent/exec/exec_test.go:203 +0x67
==================
2023-05-24T21:12:28.749Z [INFO]  exec server stopped
    testing.go:1446: race detected during execution of test

=== FAIL: command/agent/exec TestServer_Run (71.43s)

@averche
Copy link
Contributor Author

averche commented May 25, 2023

We've been looking into this data race and trying to figure it out, gives us a concern about the existing code 😄

@averche
Copy link
Contributor Author

averche commented May 25, 2023

The data race seems to be internal to consul-template. I've created an issue hashicorp/consul-template#1753 with a minimal example of failure.

@averche averche changed the base branch from main to agent-fix-exit-ch-issue May 26, 2023 18:00
@averche averche added the backport/1.14.x Backport changes to `release/1.14.x` label May 29, 2023
Base automatically changed from agent-fix-exit-ch-issue to main May 30, 2023 16:23
@averche averche merged commit 21eccf8 into main May 30, 2023
92 of 93 checks passed
@averche averche deleted the agent-process-supervisor-tests branch May 30, 2023 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.14.x Backport changes to `release/1.14.x`
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants