feat: Allow regular expressions in ctl:ruleRemoveTargetByX variable names II. #3121

airween · 2024-04-16T16:36:57Z

This PR is the renewal and addition of the PR #1683, and solves #911.

Example:

SecRule REQUEST_URI "@beginswith /index.php"
    "id:1001,phase:1,pass,nolog,
    ctl:ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/"

The new patch works with PCRE2 too.

Actually it is only for v2, so until there is no patch for v3 I don't want to merge it (hope I can do that soon).

Many thanks for your work, @vvidic, the credit is yours.

marcstern

strlen(value) could be cached in a variable, as it's used several times

marcstern

if(strlen(value) > 2 && value[0] == '/' && ...
Better to begin with check of value[0] as, most of the time, the test will be false.

marcstern

if (regex == NULL) { ...
We should return 0 immediately to simplify the flow.

Let's do it for if (targets != NULL) also when we're at it

marcstern

if((strlen(myvalue) == strlen(value)) && strncasecmp(myvalue,value,strlen(myvalue)) == 0)
Why not
if(strcasecmp(myvalue,value,strlen(myvalue)) == 0)

marcstern

else if (value == NULL && myvalue == NULL)    {
   if (msr->txcfg->debuglog_level >= 9) {
      msr_log(msr, 9, "fetch_target_exception: Target %s will not be processed.", target);
   }
   match = 1;
} else if (value == NULL && myvalue != NULL)   {
   if (msr->txcfg->debuglog_level >= 9) {
      msr_log(msr, 9, "fetch_target_exception: Target %s will not be processed.", target);
   }
   match = 1;
}

Simplify:

else {
   if (msr->txcfg->debuglog_level >= 9) {
      msr_log(msr, 9, "fetch_target_exception: Target %s will not be processed.", target);
   }
   match = 1;
}

marcstern

Inside if(strncasecmp(myname, name,strlen(myname)) == 0) there's a test if(value != NULL && myvalue != NULL)
If value is not NULL, like TX:a, REQUEST_HEADERS:abc, 'myvalue' (corresponding to the actual target) cannot be NULL (TX alone is not possible).
Thus if(value != NULL) is sufficient

…d compilation

sonarcloud · 2024-04-17T16:09:43Z

Quality Gate failed

Failed conditions
3 Security Hotspots
B Maintainability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

airween · 2024-04-17T16:10:35Z

strlen(value) could be cached in a variable, as it's used several times

Thanks, fixed in c796804.

airween · 2024-04-17T16:12:05Z

if(strlen(value) > 2 && value[0] == '/' && ... Better to begin with check of value[0] as, most of the time, the test will be false.

I introduced the value_len which is an int, I think now the order is no matter anymore, so I kept it as is.

airween · 2024-04-17T16:14:20Z

if (regex == NULL) { ... We should return 0 immediately to simplify the flow.

Let's do it for if (targets != NULL) also when we're at it

I think we can change it in a next round. regex value is produced by msc_pregcomp(), which returns NULL in case of failure, so I should compare it with a correct type.

airween · 2024-04-17T16:16:14Z

if((strlen(myvalue) == strlen(value)) && strncasecmp(myvalue,value,strlen(myvalue)) == 0) Why not if(strcasecmp(myvalue,value,strlen(myvalue)) == 0)

strlen(value) is replaced by value_len, and I think it's faster than the strncasecmp(). If the length of strings are not equal, then the engine does not start to compare the strings. I think this make sense.

airween · 2024-04-17T16:17:08Z

...
Simplify:
...

Thanks, this make sense - see 1790659.

airween · 2024-04-17T16:39:47Z

Inside if(strncasecmp(myname, name,strlen(myname)) == 0) there's a test if(value != NULL && myvalue != NULL) If value is not NULL, like TX:a, REQUEST_HEADERS:abc, 'myvalue' (corresponding to the actual target) cannot be NULL (TX alone is not possible). Thus if(value != NULL) is sufficient

After a quick analysis I'm not sure about that. value holds the inspected exclusion target in the cycle. But myvalue can be NULL even the value is not null. myvalue derived from the inspected target.

Even though it might myvalue depends on value, but an extra comparing is not a big load. I suggest we should keep that there.

airween · 2024-04-17T18:15:12Z

Thus if(value != NULL) is sufficient

Using both checks prevents the segfault. If the user's config is wrong, then it can lead to myvalue is NULL but value is not, eg:

SecRule REQUEST_FILENAME "@rx /admin" \
    "id:1001,\
    phase:1,\
    t:none,\
    nolog,\
    ctl:ruleRemoveTargetById=2001;REMOTE_HOST:/^foo.*$/"

which makes no sense, but if the user confuses a collection with a variable, then it could be a problem. I suggest to keep that there.

marcstern · 2024-04-19T15:03:30Z

Thus if(value != NULL) is sufficient

Using both checks prevents the segfault. If the user's config is wrong, then it can lead to myvalue is NULL but value is not, eg:
SecRule REQUEST_FILENAME "@rx /admin" \
    "id:1001,\
    phase:1,\
    t:none,\
    nolog,\
    ctl:ruleRemoveTargetById=2001;REMOTE_HOST:/^foo.*$/"
which makes no sense, but if the user confuses a collection with a variable, then it could be a problem. I suggest to keep that there.
This faulty syntax should be detected and refused, but that's another issue. If we want to support that case, then if (value && *value) is sufficient

if (regex == NULL) { ... We should return 0 immediately to simplify the flow.
Let's do it for if (targets != NULL) also when we're at it

I think we can change it in a next round. regex value is produced by msc_pregcomp(), which returns NULL in case of failure, so I should compare it with a correct type.

The following code is strictly equivalent to the existing one, except that it removes one 'if' level (return early) and simplify a bit this code that becomes complex to follow:

if (targets == NULL) {
            if (msr->txcfg->debuglog_level >= 9) {
                msr_log(msr, 9, "fetch_target_exception: No exception target found for rule id %s.", rule->actionset->id);
            return 0;
        }

airween · 2024-04-21T19:48:20Z

The following code is strictly equivalent to the existing one, except that it removes one 'if' level (return early) and simplify a bit this code that becomes complex to follow:
if (targets == NULL) {
            if (msr->txcfg->debuglog_level >= 9) {
                msr_log(msr, 9, "fetch_target_exception: No exception target found for rule id %s.", rule->actionset->id);
            return 0;
        }

I see your point, yes, that would make the code more cleaner.

airween · 2024-04-21T19:52:30Z

apache2/re.c

+                        if(value_len > 2 && value[0] == '/' && value[value_len - 1] == '/') {
+                            value[value_len - 1] = '\0';
+#ifdef WITH_PCRE2
+                            regex = msc_pregcomp(msr->mp, value + 1,


This (copied) code contains a memory leak: the regex variable compiled in each cycle during the exclusion check, but never freed (never called msc_pcre_cleanup()).

We should reorganize this part. I would expect that regex variable above must be compiled during startup, and not when the engine inspect the exclusion.

airween · 2024-04-22T15:40:53Z

I'm wondering why is a cycle here, I mean target created by split of targets by , (comma) which is a copy of exceptions here.

This would be make sense if the syntax of rule were allowed multiple targets, eg:

   ctl:ruleRemoveTargetById=9XXXXX;ARGS_GET,ARGS_POST,ARGS

but this is not allowed.

Variable exceptions passed when the function is called here, the item build in this line.

@marcstern, @dune73 do you have any idea, what was the original aim of this solution?

marcstern · 2024-04-22T17:31:39Z

Where is it defined that multiple targets is not allowed?
Remark: in case it's allowed, it would be quicker to return 1 immediately instead of parsing all targets.

airween · 2024-04-22T18:25:14Z

Where is it defined that multiple targets is not allowed?

How looks like the syntax?

marcstern · 2024-04-23T07:19:45Z

Where is it defined that multiple targets is not allowed?
How looks like the syntax?
ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

airween · 2024-04-23T07:42:35Z

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

The parser does not allow this syntax. I tried this and other similar syntax's with no luck. The error message is:

AH00526: Syntax error on line...
Error parsing actions: Unknown action: ARGS

The , (comma) character is the action separator, even we use a quoted form (I tried to quote the whole argument like you (including rule id), or just quote the targets).

marcstern · 2024-04-23T08:33:12Z

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"
The parser does not allow this syntax. I tried this and other similar syntax's with no luck. The error message is:
AH00526: Syntax error on line...
Error parsing actions: Unknown action: ARGS
The , (comma) character is the action separator, even we use a quoted form (I tried to quote the whole argument like you (including rule id), or just quote the targets).

My syntax was indeed incorrect.
The following is accepted: ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

dune73 · 2024-04-23T08:49:40Z

@airween asked me to chime in. Sorry to be so silent otherwise.

I see how the definition of multiple targets on a single runtime rule exclusion within single quotes would work syntactically. But I do not think it brings much of an advantage beyond an assumed minimal speed improvement.

The rule exclusion becomes harder to read and if you think of a graphical rule editor the UX designer would probably curse you using very harsh words for this addition.

Or in other words: Adding a regex to the target definition: Solves an itch where we have no alternative. Adding the option to add multiple arguments to a single ctl statement: Solves a small problem at the cost of hard to maintain config.

airween · 2024-04-23T08:52:28Z

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

My syntax was indeed incorrect. The following is accepted: ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

Sorry, but I don't see any differences.

airween · 2024-04-23T09:04:55Z

I see how the definition of multiple targets on a single runtime rule exclusion within single quotes would work syntactically.

do you think this syntax works? I wasn't able to write any rule with multiple targets.

But I do not think it brings much of an advantage beyond an assumed minimal speed improvement.

I'm not sure this brings any speed improvements. In both cases there are the same cycles.

The rule exclusion becomes harder to read and if you think of a graphical rule editor the UX designer would probably curse you using very harsh words for this addition.

This is a valuable argument on long time, but may be we don't need to care of this issue actually.

Or in other words: Adding a regex to the target definition: Solves an itch where we have no alternative. Adding the option to add multiple arguments to a single ctl statement: Solves a small problem at the cost of hard to maintain config.

Actually regex in targets (with this implementation) would be very-very slow. If there is a real user demand (and based on the comments under original PR there is) I agree we need to implement this feature, but not this way.

The questions are:

do our users want this feature? If yes, we should add it to both engine (and what about Coraza? @jptosso, @fzipi, @jcchavezs?)
what about the multiple targets in one exclusion? Should it work? If yes, with what syntax? (this is important to understand how the current code works and how can we improve the speed of regex targets)

dune73 · 2024-04-23T09:16:18Z

Honestly, I think the speed problem is around the whole concept of runtime rule exclusions and how it is implemented.

You have a target list and then you remove individual targets from that list based on the ignore-list at runtime. The way it is done is very slow, very often slower than executing the rule itself. That effectively means that disabling rules via rule exclusions for speed gain is not possible. (And that use case is very real: If I assure that an ARGS follows a positive definition, I would like to improve the performance by removing this ARGS from further evaluation, but this approach actually degrades performance for simple ARGS. It's faster to run the rules ...)

And I remember looking at the execution in detail and I think either ModSec v2 or v3 would actually run the rule and remove the result if there was a rule exclusion instead of skipping the execution. But I am no longer 100% sure this was really the case.

But back to this problem here: The situation that pops up regularly is random cookie names and also random or enumerated argument names, thus targets where you do not know the name in advance. For the cookies, there is no alternative to disabling cookie inspection for a rule completely because the (Drupal!) session cookie has a random name.

I think the regex definition of runtime rule exclusions is needed, but the speed should be improved.

airween · 2024-04-23T09:24:01Z

Honestly, I think the speed problem is around the whole concept of runtime rule exclusions and how it is implemented.

yes.

You have a target list and then you remove individual targets from that list based on the ignore-list at runtime.

First I want to understand why the exclusions handling code wants to handle multiple targets, but the parser does not allow it. So it would be very good to see a valid syntax, there the exclusion has multiple targets and parser eats it.

marcstern · 2024-04-23T09:28:48Z

ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/,ARGS:abc'"

marcstern · 2024-04-23T09:39:42Z

I think having a multiple targets is probably (a little bit) quicker than specifying multiple ctl actions.

The main problem is that a lot of parsing is done at run-time and especially the regex compiling.
Pre-parsing the exceptions at config time and storing each of them in a table (with their pre-compiled regex) would probably enhance the perf a lot. Then, specifying multiple targets would be useless.

Nota that, for one of the most common cases, excluding a target from all rules (like the Drupal session cookie), there's a much better solution: allowing all collections to be writeable. That way, we can remove it completely. But that's another story (that will come back, I promise).

airween · 2024-04-23T09:40:52Z

ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/,ARGS:abc'"

AH00526: Syntax error on line 22 of ....conf:
Error parsing actions: Invalid quoted pair at position 47: ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\\d+]$/,ARGS:abc'"

dune73 · 2024-04-23T09:43:23Z

I agree with @marcstern completely.

And you probably need to give your parser some rest @airween. It's clearly having a bad day.

airween · 2024-04-23T09:44:06Z

Pre-parsing the exceptions at config time and storing each of them in a table (with their pre-compiled regex) would probably enhance the perf a lot. Then, specifying multiple targets would be useless.

This is what I want to solve, but I don't see how multiple targets handling works. I mean I see the code, but I can't write a rule which will be allowed by the config parser.

marcstern · 2024-04-23T10:32:43Z

You're right @airween, the syntax doesn't work ... because of a bug I fixed in my test version.
#2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

If you don't use a , it should work:
Use SecAction "phase:1,ctl:'ruleRemoveTargetById=942100;ARGS:/^password[0-9+]$/,ARGS:abc'"

airween · 2024-04-23T11:35:00Z

You're right @airween, the syntax doesn't work ... because of a bug I fixed in my test version.

oh, it's good to know.

#2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

So before we continue the work, we should merge this PR?

If you don't use a , it should work: Use SecAction "phase:1,ctl:'ruleRemoveTargetById=942100;ARGS:/^password[0-9+]$/,ARGS:abc'"

I tried without regex format, but it still don't work:

    ...
    ctl:'ruleRemoveTargetById=2003;ARGS:cba,ARGS:abc'"

the result:

AH00526: Syntax error on line 22 of testconf.conf:
Error parsing actions: Unknown action: "

The config:

$ nl -ba testconf.conf | grep 22
    22	ctl:'ruleRemoveTargetById=2003;ARGS:cba,ARGS:abc'"

marcstern · 2024-04-24T10:14:00Z

#2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

So before we continue the work, we should merge this PR?
Definitely

Allow regular expressions in ctl:ruleRemoveTargetByX variable names II.

be37489

airween requested a review from marcstern April 16, 2024 16:37

airween self-assigned this Apr 16, 2024

airween added the do not merge label Apr 16, 2024

airween mentioned this pull request Apr 16, 2024

Allow regular expressions in ctl:ruleRemoveTargetByX variable names #911 #1683

Closed

marcstern requested changes Apr 17, 2024

View reviewed changes

airween added 3 commits April 17, 2024 17:44

Split error handling into two parts; add regular log in case of faile…

2b2535c

…d compilation

Pre-calculate string length instead of strlen()

c796804

Statement simplification

1790659

airween commented Apr 21, 2024

View reviewed changes

airween added the 2.x Related to ModSecurity version 2.x label May 3, 2024

feat: Allow regular expressions in ctl:ruleRemoveTargetByX variable names II. #3121

Are you sure you want to change the base?

feat: Allow regular expressions in ctl:ruleRemoveTargetByX variable names II. #3121

Conversation

airween commented Apr 16, 2024 • edited

marcstern left a comment

Choose a reason for hiding this comment

marcstern left a comment

Choose a reason for hiding this comment

marcstern left a comment • edited

Choose a reason for hiding this comment

marcstern left a comment

Choose a reason for hiding this comment

marcstern left a comment

Choose a reason for hiding this comment

marcstern left a comment

Choose a reason for hiding this comment

sonarcloud bot commented Apr 17, 2024

Quality Gate failed

airween commented Apr 17, 2024

airween commented Apr 17, 2024

airween commented Apr 17, 2024

airween commented Apr 17, 2024

airween commented Apr 17, 2024

airween commented Apr 17, 2024

airween commented Apr 17, 2024

marcstern commented Apr 19, 2024

airween commented Apr 21, 2024

airween Apr 21, 2024

Choose a reason for hiding this comment

airween Apr 21, 2024

Choose a reason for hiding this comment

airween commented Apr 22, 2024

marcstern commented Apr 22, 2024

airween commented Apr 22, 2024

marcstern commented Apr 23, 2024

airween commented Apr 23, 2024

marcstern commented Apr 23, 2024

dune73 commented Apr 23, 2024

airween commented Apr 23, 2024

airween commented Apr 23, 2024

dune73 commented Apr 23, 2024 • edited

airween commented Apr 23, 2024

marcstern commented Apr 23, 2024

marcstern commented Apr 23, 2024

airween commented Apr 23, 2024

dune73 commented Apr 23, 2024

airween commented Apr 23, 2024

marcstern commented Apr 23, 2024

airween commented Apr 23, 2024

marcstern commented Apr 24, 2024

airween commented Apr 16, 2024 •

edited

marcstern left a comment •

edited

dune73 commented Apr 23, 2024 •

edited