Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dependencies): further adopt dependencies API to ksymbols and probes #3967

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

AlonZivony
Copy link
Collaborator

@AlonZivony AlonZivony commented Apr 9, 2024

1. Explain what the PR does

This PR expands and uses the dependencies API introduced in #3931 for other events dependencies handling.
It adds the probes dependencies to the dependencies manager, and use it to manage probes cancelling.
This PR also add the dependencies watchers for adding and removing the probes, and use the dependencies to cancel events with missing ksymbols.

Fix #3933

2. Explain how to test it

3. Other comments

@AlonZivony
Copy link
Collaborator Author

This PR was extracted from #3965 and will serve it for its logic

@AlonZivony
Copy link
Collaborator Author

This however won't solve the issue that we need to update the events_map in the eBPF so events removed won't be there anymore. @geyslan is this something you can do? I am a bit lost with how to do it with the new policies versioning mechanism.

@AlonZivony AlonZivony force-pushed the feature/further-dependencies-adaptation branch from 34ba958 to 89b530b Compare April 9, 2024 12:32
@geyslan
Copy link
Member

geyslan commented Apr 9, 2024

This however won't solve the issue that we need to update the events_map in the eBPF so events removed won't be there anymore. @geyslan is this something you can do? I am a bit lost with how to do it with the new policies versioning mechanism.

For sure, could you put TODO comments?

@AlonZivony AlonZivony force-pushed the feature/further-dependencies-adaptation branch 7 times, most recently from 5542f04 to 706ea06 Compare April 14, 2024 13:44
@AlonZivony AlonZivony force-pushed the feature/further-dependencies-adaptation branch 3 times, most recently from d44a759 to 578bbb6 Compare April 30, 2024 16:43
@AlonZivony AlonZivony marked this pull request as ready for review May 1, 2024 08:38
Copy link
Collaborator

@NDStrahilevitz NDStrahilevitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to go over the logic again later this week.

pkg/ebpf/tracee.go Show resolved Hide resolved
pkg/ebpf/c/tracee.bpf.c Outdated Show resolved Hide resolved
tests/integration/dependencies_test.go Outdated Show resolved Hide resolved
tests/integration/dependencies_test.go Outdated Show resolved Hide resolved
tests/integration/dependencies_test.go Outdated Show resolved Hide resolved
return false, nil
}

// second stage: validate events
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added support for unexpected events only in this function (ExpectAtLeastOneForEach) as I only used this one.
If its seems good to you @NDStrahilevitz I can add it to all other functions as well.
I admit that I think that the current API created by @geyslan is a bit too complicated - it has too many functions so the differences between them is hard to understand. Anyways this PR is not about the testing infrastructure, so I only open it here for discussion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit that I think that the current API created by @geyslan is a bit too complicated

I agree with you @AlonZivony. Integration API must be rethought.

@AlonZivony AlonZivony force-pushed the feature/further-dependencies-adaptation branch from c134e0b to eaf952a Compare May 15, 2024 08:37
@AlonZivony AlonZivony force-pushed the feature/further-dependencies-adaptation branch from eaf952a to 4d3a9bf Compare May 15, 2024 08:49
@AlonZivony
Copy link
Collaborator Author

After a conversation with @yanivagman we arrived to the following problems that should be addressed with the watchers mechanism:

  1. Add watchers order should be from dependencies to dependants, and removed watchers as well. The dependants cannot work without the dependencies, so there is no logic in having them without their dependencies.
  2. All watchers should be called after all the tree management operations are finished.
  3. As some watchers might call RemoveEvent, we should either call watchers by order of invocation, or cancel irrelevant watchers (for example, cancel add watchers on event that its remove was called already).

Copy link
Collaborator

@yanivagman yanivagman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @AlonZivony
The automatic handling of event dependencies will make it easier to implement dynamic policy modification in the future. @geyslan FYI

pkg/events/dependencies/manager.go Outdated Show resolved Hide resolved
pkg/events/dependencies/manager.go Outdated Show resolved Hide resolved
}

// removeNode removes the node from the tree.
func (m *Manager) removeHandle(handle *HandleNode) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the meaning of returning bool here? If it represent an error, let's use error instead

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It represents if the remove was successful in the sense of if the handle existed in the manager.
I can replace it with an error, and create an error for "value not found". But as currently the remove cannot fail, the bool value seems cleaner to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using error for non existing value is a a common practice

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can replace it with an error, and create an error for "value not found".

So the check against it should be if errors.Is(err, dependencies.ValueNotFoundError) {. It's a good practice if we'll have more error types, if not... the check against the bool is cleaner.

}

func (hn *HandleNode) addDependant(dependant events.ID) {
hn.dependants = append(hn.dependants, dependant)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we check that dependant not already exists in the list?

probesToEvents[probeDep] = append(probesToEvents[probeDep], id)
for _, probe := range depsNode.GetDependencies().GetProbes() {
err := t.probes.Attach(probe.GetHandle(), t.cgroups)
if err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - to avoid extra indentation, use

if err == nil {
    continue
}

detachUnusedProbes := func(probesToCheck []events.Probe) {
for _, probe := range probesToCheck {
_, ok := t.eventsDependencies.GetHandle(probe.GetHandle())
if !ok {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - to avoid extra indentation, use:

if ok {
    continue
}

Comment on lines 1254 to 1256
if err != nil {
return errfmt.WrapError(err)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not required since we call it anyway below

pkg/ebpf/tracee.go Show resolved Hide resolved
Track dependencies on probes.
This makes it easier to understand when new probes are added or old ones removed.
Change the handling of the probes attach and detach to use the dependencies watchers.
This allows to handle correctly failure of events and future events updates.
Create an end-to-end test for the dependencies failure mechanism.
This should check that the new dependencies mechanism is integrating well with Tracee.
This PR introduce test events for the first time.
Move useful helper functions for tests in the logger and policies area
to the testutils package.
Use integration tests existing functions instead of self implemented.
Change the watchers to support any type of node to trigger watchers, according to required type.
Also add the ability for add watcher to cancel the add of the node to get less buggy order of operations.
@AlonZivony AlonZivony force-pushed the feature/further-dependencies-adaptation branch from 4d3a9bf to 0f62fe0 Compare May 22, 2024 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cancelled events probes are not cleaned properly
4 participants