Maximum offset distance in condition - perhaps strings module idea? #1897

Open
tlansec opened this issue Mar 20, 2023 · 5 comments

@tlansec
Contributor

tlansec commented Mar 20, 2023

Is your feature request related to a problem? Please describe.
Sometimes I have a rule with a relatively complex set of strings. Most of the time I'm interested in matching against a single file, but I also want to match the rule against memory samples, so I don't want to use a filesize constraint because the memory sample is very large. Instead, I want to specify the maximum distance between any matched strings.

Describe the solution you'd like
This may be an idea for the distant future, but I'd like a solution where I can (for a complex set of strings) have a special variable like:

strings:
     [whatever strings i want]
condition: 
     all of them and
     strings.max_offset - strings.min_offset < N

This strings variable (or module) could also allow inspection of:

  • total_string_matches
  • max_string_match_length
  • min_string_match_length
  • other things I haven't thought of
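
Until something like this exists, the spread check can be approximated by hand with the existing `@` and `#` operators. Below is a minimal sketch for three strings. It assumes that `@a[1]` is the lowest match offset of `$a` and `@a[#a]` the highest (occurrences numbered in offset order), and that `math.min` and `math.max` are available in the installed YARA version. The rule name, the strings, and the 4096 bound are placeholders.

```yara
import "math"

rule max_offset_distance_sketch
{
    strings:
        $a = "first marker"
        $b = "second marker"
        $c = { DE AD BE EF }

    condition:
        // Require every string to match, so #a, #b and #c are all >= 1
        all of them and
        // Highest last-occurrence offset minus lowest first-occurrence offset
        math.max(math.max(@a[#a], @b[#b]), @c[#c]) -
        math.min(math.min(@a[1], @b[1]), @c[1]) < 4096
}
```

The nesting grows with every string added, which is why this does not scale to a complex string set.
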
@wxsBSD
Collaborator

wxsBSD commented Mar 23, 2023

At least one of these could be done already: total string matches could be done with the #a syntax.
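
As a rough sketch of that approach (rule name, strings, and threshold are placeholders), the per-string counts can be summed directly:

```yara
rule total_matches_sketch
{
    strings:
        $a = "alpha"
        $b = "beta"
        $c = "gamma"

    condition:
        // #a is the number of occurrences of $a; the sum is the total across all strings
        #a + #b + #c >= 10
}
```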

I don't see an easy way to do the others at the moment, as I don't think there is currently a way to pass a YARA string into a function. It is a good idea though.

I had initially thought about wanting to express the logic like this:

math.abs($a.max_offset() - $b.min_offset()) < 100

While that would be a nice, extensible way to do it, it would mean the compiler gets more and more complicated with each new function or attribute we want to expose on a string match. As such, I'm liking your idea of expressing this in a module:

math.abs(strings.max_offset($a) - strings.min_offset($b)) < 100

This all assumes we eventually grow support for using YARA strings as arguments. I had wanted to spend my time on yara-x, but this is such an intriguing idea that I'm curious what @plusvic says about it. I feel like this is something I could likely implement fairly quickly too.
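
In the meantime, for a single pair of strings, a close equivalent can be sketched with today's syntax: `@a[#a]` is the offset of the last occurrence of `$a` and `@b[1]` the offset of the first occurrence of `$b`. This assumes `math.abs` is present in the installed YARA version. The rule and string names are placeholders.

```yara
import "math"

rule pair_distance_sketch
{
    strings:
        $a = "alpha"
        $b = "beta"

    condition:
        // Both strings must match; otherwise @a[#a] and @b[1] are undefined
        $a and $b and
        math.abs(@a[#a] - @b[1]) < 100
}
```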

@tlansec
Contributor Author

tlansec commented Mar 24, 2023

For this:

> total string matches could be done with the #a syntax.

It could be done, but for a rule containing 30 strings the rule becomes cumbersome to read and write.

> [...] allowing for YARA strings to be used as arguments

I think this is actually the most elegant solution, because the condition you wrote is the kind of thing I most often want to express.

@plusvic
Member

plusvic commented Mar 24, 2023

This request is interesting because it exposes the current limitations in the language. I agree with @wxsBSD's comment: in order to implement this (and more powerful features in the future) we may need to implement one of the following features (or both):

  1. String identifiers as arguments to functions, like in foo($a), where foo has access to all the information about the $a pattern, including the current matches.

  2. Methods associated with string identifiers, like in $a.foo(). This is really a special case of 1; once you have 1, implementing this should be straightforward.

There are more cases in which this would be helpful, and I'm getting more and more convinced that we must introduce something like this in order to unleash a series of enhancements that would bring YARA to the next level in terms of expressiveness.

I wouldn't implement this in the current C implementation, though. It would require a lot of changes, and my focus is now on bringing YARA-X forward. I also think that this is going to be easier to implement in YARA-X.

What we could start doing collectively is designing the changes that we want, writing RFCs like this one #1783. Even if we don't start implementing these ideas right away, we can start working on defining and polishing the ideas. The delay may be beneficial, as the ideas have time to settle down and mature before they are implemented.

@tlansec
Contributor Author

tlansec commented Mar 24, 2023

I am happy to try and come up with an RFC like the one cited if you like (although maybe Wes has more experience writing such documents). Should RFCs go in "Discussions" as their own Conversation or do they get put in the Road Ahead discussion?

@plusvic
Member

plusvic commented Mar 24, 2023

For the time being I would put each RFC as an independent discussion.
