-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add filter operator #2805
add filter operator #2805
Conversation
pkg/operators/filter/filter.go
Outdated
ds.Subscribe(func(ds datasource.DataSource, data datasource.Data) error { | ||
for _, fn := range funcs { | ||
if !fn(ds, data) { | ||
return fmt.Errorf("skip") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's your idea here? This approach wouldn't work for arrays. I was wondering how we could allow subscribers to "discard" data. I'm thinking to what we do for enriching containers where enrichers can return false if they want the container to be discarded. Another way is to expose a method to mark the data that a subscriber wants to discard. Something like data.Discard()
and then, in EmitAndRelease
, we stop sending that data element to the rest of subscribers. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that error there is just a dummy - I discussed that already with @mauriciovasquezbernal and we came to the conclusion to use a specific error from the datasource package for it (datasource.ErrSkip or something like it). So this would make it skip that event without returning a real error message.
For arrays the same could happen in the EmitAndRelease()
in that that specific error would shrink the array.
c15e7cb
to
26136b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't run/test it yet
4b10db8
to
aeaaeb1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering doesn't seem to work for me:
michael/oci/filter ✓ ➭ go run --exec "sudo -E" ./cmd/ig/... run trace_exec -F comm==date
INFO[0000] Experimental features enabled
RUNTIME.CONTAINER… MNTNS_ID PID PPID UID GID LOGINUID SESSIONID … ARGS_COU… U… ARGS_SIZE COMM ARGS TIMESTAMP
hardcore_maxwell 40265327 412076 196850 0 0 4294967295 42949672… 0 1 f… 10 date /bin/date 2024-05-2
hardcore_maxwell 40265327 412089 196850 0 0 4294967295 42949672… 0 2 f… 22 wget /bin/wget 2024-05-2
match: false, | ||
}, | ||
{ | ||
name: "string lte positive", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General question: Do we want to allow > <
operations on strings? Most of the time they are used by accident
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather keep it and not restrict it artificially - do you think chances of using it by accident are really that big? There are valid use cases IMHO (although it could also be done with the (much slower) regex feature).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't know if it is useful. Currently no use cases comes into my mind. But I might miss something ofc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, restrict the output to entries starting with certain characters (or a range) - like said, that could also be done using regex, but this is much faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have anything against it. It is just a part of my brain screams when it sees > < with strings :D
Yeah, priorities... We have to come up with some values we can add to / subtract from. I thought the CLI would be 50000, but it's actually 10000. Adjusted it in here now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the implementation
Filtering doesn't seem to work for me
Yeah, priorities... We have to come up with some values we can add to / subtract from. I thought the CLI would be 50000, but it's actually 10000. Adjusted it in here now.
We should put this all in a single file with const
s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks really good. A lot of nits and minor comments.
pkg/datasource/datasource.go
Outdated
type dsError string | ||
|
||
func (err dsError) Error() string { | ||
return string(err) | ||
} | ||
|
||
const ( | ||
ErrDiscard = dsError("discarded") | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't var ErrDiscard = errors.New("discarded")
work?
(AFAIU it's how it's done by standard libraries https://cs.opensource.google/go/go/+/refs/tags/go1.22.3:src/io/fs/walk.go;l=20)
Please also add a comment indicating this is used by operator to discard a packet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does work, however it makes that a mutable variable. This way it's const. I've read a couple of articles around this, but I'm also torn because it's still heavily used in the standard library (maybe just because of backward compatibility - idk).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I was surprised it was done in that way by the standard library, however it seems it's because of compatibility reasons golang/go#63602 (comment).
For future PR: We could define
type dsError string
func (err dsError) Error() string {
return string(err)
}
in a helper file and use it from other places, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that type would ofc need to be exported, then. But I think it's small enough to be kept inside the pkg it is being used from.
…) loops Signed-off-by: Michael Friese <mfriese@microsoft.com>
Signed-off-by: Michael Friese <mfriese@microsoft.com>
Signed-off-by: Michael Friese <mfriese@microsoft.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks for the reviews! |
This re-adds filter functionality to the image-based gadgets.
I've slightly changed the syntax to be a bit clearer now:
Before:
--filter ... field:abc ... field2:>=123
After:
--filter ... field==abc ... field2>=123
Also, now the colon
:
is used to target specific DataSources:--filter ds:field==abc
Several things should be fixed/cleaned up before merging this (already in discussion / being worked on):
fieldAccessor.Bool()
datasources: Add missing setters, allocate memory and add tests #2823fieldAccessor.String()
/fieldAccessor.CString()
handling datasources: Add missing setters, allocate memory and add tests #2823Drop
error that can be returned by DataSource subscribers) (done in this PR)Later on this should be extended to also expose the filters to eBPF programs so that they can adopt filtering inside eBPF already if possilbe.