Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize regex matching for empty label values in posting match #14075

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

yeya24
Copy link
Contributor

@yeya24 yeya24 commented May 10, 2024

Now =~"" is optimized into empty string matcher https://github.com/prometheus/prometheus/blob/main/model/labels/regexp.go#L680, which matches empty string only.

Then we can skip matching because all label values should be non empty string.

@yeya24 yeya24 requested a review from jesusvazquez as a code owner May 10, 2024 04:45
Signed-off-by: Ben Ye <benye@amazon.com>
@beorn7
Copy link
Member

beorn7 commented May 15, 2024

Hmm, this looks good to me at first glance, but I don't get the line of argument that this is only possible because of a prior optimization.

Wasn't it always the case that =~"" is equivalent to ="", whether it is optimized that way or not? In other words, could this optimization (in this PR) have been done anyway?

Maybe I'm missing something. @bboreham you probably have more context.

@yeya24
Copy link
Contributor Author

yeya24 commented May 15, 2024

Hmm, this looks good to me at first glance, but I don't get the line of argument that this is only possible because of a prior optimization.
Wasn't it always the case that =~"" is equivalent to ="", whether it is optimized that way or not? In other words, could this optimization (in this PR) have been done anyway?

Without the empty string matcher optimization, I remember we cannot simply convert =~"" to ="" because regex string needs to be anchored.

@beorn7
Copy link
Member

beorn7 commented May 15, 2024

Regexp's in PromQL have always been fully anchored. So =~"" always has been equivalent to =~"^$".

@@ -354,8 +354,9 @@ func inversePostingsForMatcher(ctx context.Context, ix IndexReader, m *labels.Ma
}

res := vals[:0]
// If the inverse match is ="", we just want all the values.
if m.Type == labels.MatchEqual && m.Value == "" {
// If the inverse match is ="" or =~"", we just want all the values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me stop and think. Is this clearer?

Suggested change
// If the inverse match is ="" or =~"", we just want all the values.
// If the match before inversion was !="" or !~"", we just want all the values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Addressed in latest commit.

@bboreham
Copy link
Member

Is there a test covering this case?

@beorn7
Copy link
Member

beorn7 commented May 15, 2024

Is there a test covering this case?

I don't think so.

@yeya24
Copy link
Contributor Author

yeya24 commented May 15, 2024

Regexp's in PromQL have always been fully anchored. So ="" always has been equivalent to ="^$".

Then maybe I misunderstood and probably it is unrelated to https://github.com/prometheus/prometheus/blob/main/model/labels/regexp.go#L680. The optimization is always there just we didn't implement at the beginning.

Updated: I mixed this one with the .+ and .* optimization case where anchor is an issue. =~ should be ok.

Is there a test covering this case?

I will update the test

Signed-off-by: Ben Ye <benye@amazon.com>
@yeya24
Copy link
Contributor Author

yeya24 commented May 15, 2024

Added a unit test case.

@beorn7
Copy link
Member

beorn7 commented May 16, 2024

I mixed this one with the .+ and .* optimization case where anchor is an issue.

I think the issue with those is that the . doesn't match a newline character (which we would like to change in v3, see #7859).

Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be fine now. @bboreham do you agree?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants