Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add filter operator #2805

Merged
merged 3 commits into from
May 29, 2024
Merged

add filter operator #2805

merged 3 commits into from
May 29, 2024

Conversation

flyth
Copy link
Member

@flyth flyth commented May 6, 2024

This re-adds filter functionality to the image-based gadgets.

I've slightly changed the syntax to be a bit clearer now:

Before:
--filter ... field:abc ... field2:>=123

After:
--filter ... field==abc ... field2>=123

Also, now the colon : is used to target specific DataSources:

--filter ds:field==abc

Several things should be fixed/cleaned up before merging this (already in discussion / being worked on):

Later on this should be extended to also expose the filters to eBPF programs so that they can adopt filtering inside eBPF already if possilbe.

pkg/operators/filter/filter.go Fixed Show fixed Hide fixed
pkg/operators/filter/filter.go Fixed Show fixed Hide fixed
pkg/operators/filter/filter.go Fixed Show fixed Hide fixed
pkg/operators/filter/filter.go Fixed Show fixed Hide fixed
pkg/operators/filter/filter.go Fixed Show fixed Hide fixed
pkg/operators/filter/filter.go Fixed Show fixed Hide fixed
ds.Subscribe(func(ds datasource.DataSource, data datasource.Data) error {
for _, fn := range funcs {
if !fn(ds, data) {
return fmt.Errorf("skip")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your idea here? This approach wouldn't work for arrays. I was wondering how we could allow subscribers to "discard" data. I'm thinking to what we do for enriching containers where enrichers can return false if they want the container to be discarded. Another way is to expose a method to mark the data that a subscriber wants to discard. Something like data.Discard() and then, in EmitAndRelease, we stop sending that data element to the rest of subscribers. WDYT?

cc @mauriciovasquezbernal

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that error there is just a dummy - I discussed that already with @mauriciovasquezbernal and we came to the conclusion to use a specific error from the datasource package for it (datasource.ErrSkip or something like it). So this would make it skip that event without returning a real error message.

For arrays the same could happen in the EmitAndRelease() in that that specific error would shrink the array.

@flyth flyth force-pushed the michael/oci/filter branch 2 times, most recently from c15e7cb to 26136b2 Compare May 16, 2024 19:28
@flyth flyth marked this pull request as ready for review May 16, 2024 19:51
Copy link
Member

@burak-ok burak-ok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't run/test it yet

pkg/operators/filter/filter.go Show resolved Hide resolved
pkg/operators/filter/filter.go Show resolved Hide resolved
pkg/operators/filter/filter.go Show resolved Hide resolved
pkg/operators/filter/filter.go Outdated Show resolved Hide resolved
pkg/operators/filter/filter.go Show resolved Hide resolved
pkg/operators/filter/filter.go Outdated Show resolved Hide resolved
@flyth flyth force-pushed the michael/oci/filter branch 2 times, most recently from 4b10db8 to aeaaeb1 Compare May 23, 2024 12:06
Copy link
Member

@burak-ok burak-ok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filtering doesn't seem to work for me:

michael/oci/filter ✓ ➭ go run --exec "sudo -E" ./cmd/ig/... run trace_exec -F comm==date 
INFO[0000] Experimental features enabled                
RUNTIME.CONTAINER… MNTNS_ID PID        PPID       UID        GID        LOGINUID   SESSIONID … ARGS_COU… U… ARGS_SIZE COMM      ARGS      TIMESTAMP
hardcore_maxwell   40265327 412076     196850     0          0          4294967295 42949672… 0 1         f… 10        date      /bin/date 2024-05-2
hardcore_maxwell   40265327 412089     196850     0          0          4294967295 42949672… 0 2         f… 22        wget      /bin/wget 2024-05-2

pkg/operators/filter/filter_test.go Outdated Show resolved Hide resolved
pkg/operators/filter/filter_test.go Outdated Show resolved Hide resolved
match: false,
},
{
name: "string lte positive",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General question: Do we want to allow > < operations on strings? Most of the time they are used by accident

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather keep it and not restrict it artificially - do you think chances of using it by accident are really that big? There are valid use cases IMHO (although it could also be done with the (much slower) regex feature).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't know if it is useful. Currently no use cases comes into my mind. But I might miss something ofc

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, restrict the output to entries starting with certain characters (or a range) - like said, that could also be done using regex, but this is much faster.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have anything against it. It is just a part of my brain screams when it sees > < with strings :D

pkg/operators/filter/filter_test.go Show resolved Hide resolved
pkg/operators/filter/filter_test.go Show resolved Hide resolved
@flyth
Copy link
Member Author

flyth commented May 24, 2024

Filtering doesn't seem to work for me

Yeah, priorities... We have to come up with some values we can add to / subtract from. I thought the CLI would be 50000, but it's actually 10000. Adjusted it in here now.

Copy link
Member

@burak-ok burak-ok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the implementation

Filtering doesn't seem to work for me

Yeah, priorities... We have to come up with some values we can add to / subtract from. I thought the CLI would be 50000, but it's actually 10000. Adjusted it in here now.

We should put this all in a single file with consts

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks really good. A lot of nits and minor comments.

pkg/datasource/data.go Outdated Show resolved Hide resolved
Comment on lines 36 to 47
type dsError string

func (err dsError) Error() string {
return string(err)
}

const (
ErrDiscard = dsError("discarded")
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't var ErrDiscard = errors.New("discarded") work?

(AFAIU it's how it's done by standard libraries https://cs.opensource.google/go/go/+/refs/tags/go1.22.3:src/io/fs/walk.go;l=20)

Please also add a comment indicating this is used by operator to discard a packet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does work, however it makes that a mutable variable. This way it's const. I've read a couple of articles around this, but I'm also torn because it's still heavily used in the standard library (maybe just because of backward compatibility - idk).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I was surprised it was done in that way by the standard library, however it seems it's because of compatibility reasons golang/go#63602 (comment).

For future PR: We could define

type dsError string

func (err dsError) Error() string {
	return string(err)
}

in a helper file and use it from other places, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that type would ofc need to be exported, then. But I think it's small enough to be kept inside the pkg it is being used from.

pkg/datasource/data.go Show resolved Hide resolved
pkg/datasource/data.go Outdated Show resolved Hide resolved
pkg/operators/filter/filter.go Outdated Show resolved Hide resolved
pkg/operators/filter/filter_test.go Outdated Show resolved Hide resolved
pkg/operators/filter/filter_test.go Show resolved Hide resolved
pkg/operators/filter/filter.go Show resolved Hide resolved
cmd/ig/main.go Outdated Show resolved Hide resolved
flyth added 3 commits May 29, 2024 21:10
…) loops

Signed-off-by: Michael Friese <mfriese@microsoft.com>
Signed-off-by: Michael Friese <mfriese@microsoft.com>
Signed-off-by: Michael Friese <mfriese@microsoft.com>
Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@flyth
Copy link
Member Author

flyth commented May 29, 2024

Thanks for the reviews!

@flyth flyth merged commit bd6ef58 into main May 29, 2024
60 checks passed
@flyth flyth deleted the michael/oci/filter branch May 29, 2024 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants