Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow regular expressions in ctl:ruleRemoveTargetByX variable names II. #3121

Open
wants to merge 4 commits into
base: v2/master
Choose a base branch
from

Conversation

airween
Copy link
Member

@airween airween commented Apr 16, 2024

This PR is the renewal and addition of the PR #1683, and solves #911.

Example:

SecRule REQUEST_URI "@beginswith /index.php"
    "id:1001,phase:1,pass,nolog,
    ctl:ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/"

The new patch works with PCRE2 too.

Actually it is only for v2, so until there is no patch for v3 I don't want to merge it (hope I can do that soon).

Many thanks for your work, @vvidic, the credit is yours.

Copy link
Contributor

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strlen(value) could be cached in a variable, as it's used several times

Copy link
Contributor

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(strlen(value) > 2 && value[0] == '/' && ...
Better to begin with check of value[0] as, most of the time, the test will be false.

Copy link
Contributor

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (regex == NULL) { ...
We should return 0 immediately to simplify the flow.

Let's do it for if (targets != NULL) also when we're at it

Copy link
Contributor

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if((strlen(myvalue) == strlen(value)) && strncasecmp(myvalue,value,strlen(myvalue)) == 0)
Why not
if(strcasecmp(myvalue,value,strlen(myvalue)) == 0)

Copy link
Contributor

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else if (value == NULL && myvalue == NULL)    {
   if (msr->txcfg->debuglog_level >= 9) {
      msr_log(msr, 9, "fetch_target_exception: Target %s will not be processed.", target);
   }
   match = 1;
} else if (value == NULL && myvalue != NULL)   {
   if (msr->txcfg->debuglog_level >= 9) {
      msr_log(msr, 9, "fetch_target_exception: Target %s will not be processed.", target);
   }
   match = 1;
}

Simplify:

else {
   if (msr->txcfg->debuglog_level >= 9) {
      msr_log(msr, 9, "fetch_target_exception: Target %s will not be processed.", target);
   }
   match = 1;
}

Copy link
Contributor

@marcstern marcstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inside if(strncasecmp(myname, name,strlen(myname)) == 0) there's a test if(value != NULL && myvalue != NULL)
If value is not NULL, like TX:a, REQUEST_HEADERS:abc, 'myvalue' (corresponding to the actual target) cannot be NULL (TX alone is not possible).
Thus if(value != NULL) is sufficient

Copy link

sonarcloud bot commented Apr 17, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
3 Security Hotspots
B Maintainability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

@airween
Copy link
Member Author

airween commented Apr 17, 2024

strlen(value) could be cached in a variable, as it's used several times

Thanks, fixed in c796804.

@airween
Copy link
Member Author

airween commented Apr 17, 2024

if(strlen(value) > 2 && value[0] == '/' && ... Better to begin with check of value[0] as, most of the time, the test will be false.

I introduced the value_len which is an int, I think now the order is no matter anymore, so I kept it as is.

@airween
Copy link
Member Author

airween commented Apr 17, 2024

if (regex == NULL) { ... We should return 0 immediately to simplify the flow.

Let's do it for if (targets != NULL) also when we're at it

I think we can change it in a next round. regex value is produced by msc_pregcomp(), which returns NULL in case of failure, so I should compare it with a correct type.

@airween
Copy link
Member Author

airween commented Apr 17, 2024

if((strlen(myvalue) == strlen(value)) && strncasecmp(myvalue,value,strlen(myvalue)) == 0) Why not if(strcasecmp(myvalue,value,strlen(myvalue)) == 0)

strlen(value) is replaced by value_len, and I think it's faster than the strncasecmp(). If the length of strings are not equal, then the engine does not start to compare the strings. I think this make sense.

@airween
Copy link
Member Author

airween commented Apr 17, 2024

...
Simplify:
...

Thanks, this make sense - see 1790659.

@airween
Copy link
Member Author

airween commented Apr 17, 2024

Inside if(strncasecmp(myname, name,strlen(myname)) == 0) there's a test if(value != NULL && myvalue != NULL) If value is not NULL, like TX:a, REQUEST_HEADERS:abc, 'myvalue' (corresponding to the actual target) cannot be NULL (TX alone is not possible). Thus if(value != NULL) is sufficient

After a quick analysis I'm not sure about that. value holds the inspected exclusion target in the cycle. But myvalue can be NULL even the value is not null. myvalue derived from the inspected target.

Even though it might myvalue depends on value, but an extra comparing is not a big load. I suggest we should keep that there.

@airween
Copy link
Member Author

airween commented Apr 17, 2024

Thus if(value != NULL) is sufficient

Using both checks prevents the segfault. If the user's config is wrong, then it can lead to myvalue is NULL but value is not, eg:

SecRule REQUEST_FILENAME "@rx /admin" \
    "id:1001,\
    phase:1,\
    t:none,\
    nolog,\
    ctl:ruleRemoveTargetById=2001;REMOTE_HOST:/^foo.*$/"

which makes no sense, but if the user confuses a collection with a variable, then it could be a problem. I suggest to keep that there.

@marcstern
Copy link
Contributor

Thus if(value != NULL) is sufficient

Using both checks prevents the segfault. If the user's config is wrong, then it can lead to myvalue is NULL but value is not, eg:

SecRule REQUEST_FILENAME "@rx /admin" \
    "id:1001,\
    phase:1,\
    t:none,\
    nolog,\
    ctl:ruleRemoveTargetById=2001;REMOTE_HOST:/^foo.*$/"

which makes no sense, but if the user confuses a collection with a variable, then it could be a problem. I suggest to keep that there.
This faulty syntax should be detected and refused, but that's another issue. If we want to support that case, then if (value && *value) is sufficient

if (regex == NULL) { ... We should return 0 immediately to simplify the flow.
Let's do it for if (targets != NULL) also when we're at it

I think we can change it in a next round. regex value is produced by msc_pregcomp(), which returns NULL in case of failure, so I should compare it with a correct type.

The following code is strictly equivalent to the existing one, except that it removes one 'if' level (return early) and simplify a bit this code that becomes complex to follow:

if (targets == NULL) {
            if (msr->txcfg->debuglog_level >= 9) {
                msr_log(msr, 9, "fetch_target_exception: No exception target found for rule id %s.", rule->actionset->id);
            return 0;
        }

@airween
Copy link
Member Author

airween commented Apr 21, 2024

The following code is strictly equivalent to the existing one, except that it removes one 'if' level (return early) and simplify a bit this code that becomes complex to follow:

if (targets == NULL) {
            if (msr->txcfg->debuglog_level >= 9) {
                msr_log(msr, 9, "fetch_target_exception: No exception target found for rule id %s.", rule->actionset->id);
            return 0;
        }

I see your point, yes, that would make the code more cleaner.

if(value_len > 2 && value[0] == '/' && value[value_len - 1] == '/') {
value[value_len - 1] = '\0';
#ifdef WITH_PCRE2
regex = msc_pregcomp(msr->mp, value + 1,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (copied) code contains a memory leak: the regex variable compiled in each cycle during the exclusion check, but never freed (never called msc_pcre_cleanup()).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should reorganize this part. I would expect that regex variable above must be compiled during startup, and not when the engine inspect the exclusion.

@airween
Copy link
Member Author

airween commented Apr 22, 2024

I'm wondering why is a cycle here, I mean target created by split of targets by , (comma) which is a copy of exceptions here.

This would be make sense if the syntax of rule were allowed multiple targets, eg:

   ctl:ruleRemoveTargetById=9XXXXX;ARGS_GET,ARGS_POST,ARGS

but this is not allowed.

Variable exceptions passed when the function is called here, the item build in this line.

@marcstern, @dune73 do you have any idea, what was the original aim of this solution?

@marcstern
Copy link
Contributor

Where is it defined that multiple targets is not allowed?
Remark: in case it's allowed, it would be quicker to return 1 immediately instead of parsing all targets.

@airween
Copy link
Member Author

airween commented Apr 22, 2024

Where is it defined that multiple targets is not allowed?

How looks like the syntax?

@marcstern
Copy link
Contributor

Where is it defined that multiple targets is not allowed?
How looks like the syntax?
ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

@airween
Copy link
Member Author

airween commented Apr 23, 2024

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

The parser does not allow this syntax. I tried this and other similar syntax's with no luck. The error message is:

AH00526: Syntax error on line...
Error parsing actions: Unknown action: ARGS

The , (comma) character is the action separator, even we use a quoted form (I tried to quote the whole argument like you (including rule id), or just quote the targets).

@marcstern
Copy link
Contributor

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

The parser does not allow this syntax. I tried this and other similar syntax's with no luck. The error message is:

AH00526: Syntax error on line...
Error parsing actions: Unknown action: ARGS

The , (comma) character is the action separator, even we use a quoted form (I tried to quote the whole argument like you (including rule id), or just quote the targets).

My syntax was indeed incorrect.
The following is accepted: ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

@dune73
Copy link
Member

dune73 commented Apr 23, 2024

@airween asked me to chime in. Sorry to be so silent otherwise.

I see how the definition of multiple targets on a single runtime rule exclusion within single quotes would work syntactically. But I do not think it brings much of an advantage beyond an assumed minimal speed improvement.

The rule exclusion becomes harder to read and if you think of a graphical rule editor the UX designer would probably curse you using very harsh words for this addition.

Or in other words: Adding a regex to the target definition: Solves an itch where we have no alternative. Adding the option to add multiple arguments to a single ctl statement: Solves a small problem at the cost of hard to maintain config.

@airween
Copy link
Member Author

airween commented Apr 23, 2024

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

My syntax was indeed incorrect. The following is accepted: ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

Sorry, but I don't see any differences.

@airween
Copy link
Member Author

airween commented Apr 23, 2024

I see how the definition of multiple targets on a single runtime rule exclusion within single quotes would work syntactically.

do you think this syntax works? I wasn't able to write any rule with multiple targets.

But I do not think it brings much of an advantage beyond an assumed minimal speed improvement.

I'm not sure this brings any speed improvements. In both cases there are the same cycles.

The rule exclusion becomes harder to read and if you think of a graphical rule editor the UX designer would probably curse you using very harsh words for this addition.

This is a valuable argument on long time, but may be we don't need to care of this issue actually.

Or in other words: Adding a regex to the target definition: Solves an itch where we have no alternative. Adding the option to add multiple arguments to a single ctl statement: Solves a small problem at the cost of hard to maintain config.

Actually regex in targets (with this implementation) would be very-very slow. If there is a real user demand (and based on the comments under original PR there is) I agree we need to implement this feature, but not this way.

The questions are:

  • do our users want this feature? If yes, we should add it to both engine (and what about Coraza? @jptosso, @fzipi, @jcchavezs?)
  • what about the multiple targets in one exclusion? Should it work? If yes, with what syntax? (this is important to understand how the current code works and how can we improve the speed of regex targets)

@dune73
Copy link
Member

dune73 commented Apr 23, 2024

Honestly, I think the speed problem is around the whole concept of runtime rule exclusions and how it is implemented.

You have a target list and then you remove individual targets from that list based on the ignore-list at runtime. The way it is done is very slow, very often slower than executing the rule itself. That effectively means that disabling rules via rule exclusions for speed gain is not possible. (And that use case is very real: If I assure that an ARGS follows a positive definition, I would like to improve the performance by removing this ARGS from further evaluation, but this approach actually degrades performance for simple ARGS. It's faster to run the rules ...)

And I remember looking at the execution in detail and I think either ModSec v2 or v3 would actually run the rule and remove the result if there was a rule exclusion instead of skipping the execution. But I am no longer 100% sure this was really the case.

But back to this problem here: The situation that pops up regularly is random cookie names and also random or enumerated argument names, thus targets where you do not know the name in advance. For the cookies, there is no alternative to disabling cookie inspection for a rule completely because the (Drupal!) session cookie has a random name.

I think the regex definition of runtime rule exclusions is needed, but the speed should be improved.

@airween
Copy link
Member Author

airween commented Apr 23, 2024

Honestly, I think the speed problem is around the whole concept of runtime rule exclusions and how it is implemented.

yes.

You have a target list and then you remove individual targets from that list based on the ignore-list at runtime.

First I want to understand why the exclusions handling code wants to handle multiple targets, but the parser does not allow it. So it would be very good to see a valid syntax, there the exclusion has multiple targets and parser eats it.

@marcstern
Copy link
Contributor

ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/,ARGS:abc'"

@marcstern
Copy link
Contributor

I think having a multiple targets is probably (a little bit) quicker than specifying multiple ctl actions.

The main problem is that a lot of parsing is done at run-time and especially the regex compiling.
Pre-parsing the exceptions at config time and storing each of them in a table (with their pre-compiled regex) would probably enhance the perf a lot. Then, specifying multiple targets would be useless.

Nota that, for one of the most common cases, excluding a target from all rules (like the Drupal session cookie), there's a much better solution: allowing all collections to be writeable. That way, we can remove it completely. But that's another story (that will come back, I promise).

@airween
Copy link
Member Author

airween commented Apr 23, 2024

ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/,ARGS:abc'"

AH00526: Syntax error on line 22 of ....conf:
Error parsing actions: Invalid quoted pair at position 47: ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\\d+]$/,ARGS:abc'"

@dune73
Copy link
Member

dune73 commented Apr 23, 2024

I agree with @marcstern completely.

And you probably need to give your parser some rest @airween. It's clearly having a bad day.

@airween
Copy link
Member Author

airween commented Apr 23, 2024

Pre-parsing the exceptions at config time and storing each of them in a table (with their pre-compiled regex) would probably enhance the perf a lot. Then, specifying multiple targets would be useless.

This is what I want to solve, but I don't see how multiple targets handling works. I mean I see the code, but I can't write a rule which will be allowed by the config parser.

@marcstern
Copy link
Contributor

You're right @airween, the syntax doesn't work ... because of a bug I fixed in my test version.
#2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

If you don't use a , it should work:
Use SecAction "phase:1,ctl:'ruleRemoveTargetById=942100;ARGS:/^password[0-9+]$/,ARGS:abc'"

@airween
Copy link
Member Author

airween commented Apr 23, 2024

You're right @airween, the syntax doesn't work ... because of a bug I fixed in my test version.

oh, it's good to know.

#2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

So before we continue the work, we should merge this PR?

If you don't use a , it should work: Use SecAction "phase:1,ctl:'ruleRemoveTargetById=942100;ARGS:/^password[0-9+]$/,ARGS:abc'"

I tried without regex format, but it still don't work:

    ...
    ctl:'ruleRemoveTargetById=2003;ARGS:cba,ARGS:abc'"

the result:

AH00526: Syntax error on line 22 of testconf.conf:
Error parsing actions: Unknown action: "

The config:

$ nl -ba testconf.conf | grep 22
    22	ctl:'ruleRemoveTargetById=2003;ARGS:cba,ARGS:abc'"

@marcstern
Copy link
Contributor

#2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

So before we continue the work, we should merge this PR?
Definitely

@airween airween added the 2.x Related to ModSecurity version 2.x label May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.x Related to ModSecurity version 2.x do not merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants