Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(matching): unify constraint-introduced metavariables #10014

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

brandonspark
Copy link
Contributor

What:

This PR makes it such that simultaneous conditions which introduce metavariables will unify their metavariables in the resulting match.

Why:

Previously, we had logic which was such that for every succeeding instance of a condition (for instance, each succeeding match in a metavariable-pattern), we would produce a brand new match which only had the new metavariables from that instance.

This means, for instance, if we had a metavariable-regex and metavariable-pattern, both of which introduce metavariables, we would produce two matches with the metavariables from either, instead of one match which unifies their metavariables. This is not intuitive, and undesirable.

How:

We make it such that we handle all constraints and ranges simultaneously (so that we can still generate the per-condition matching explanations). We save all the introduced metavariables per succeeding constraint, and then find all the ways to combine them pairwise to unify their metavariables.

Test plan:

Automated tests.

Copy link
Contributor

PR checklist:

  • Purpose of the code is evident to future readers
  • Tests included or PR comment includes a reproducible test plan
  • Documentation is up-to-date
  • A changelog entry was added to changelog.d for any user-facing change
  • Change has no security implications (otherwise, ping security team)

If you're unsure about any of this, please see:

@brandonspark brandonspark requested review from a team, akuhlens and IagoAbal and removed request for a team March 27, 2024 19:56
@brandonspark brandonspark changed the title feat: unify constraint-introduced metavariables feat(matching): unify constraint-introduced metavariables Apr 2, 2024
@brandonspark brandonspark requested a review from emjin April 4, 2024 17:48
@brandonspark brandonspark requested a review from aryx April 16, 2024 22:04
@brandonspark
Copy link
Contributor Author

@aryx poke on this

@aryx
Copy link
Collaborator

aryx commented Apr 25, 2024

@IagoAbal , opinions?

@@ -1,10 +1,14 @@

# OLD BEHAVIOR:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we thought this was important, but now it's not important anymore? 🤔 What motivated this test that now we're fine breaking? I'm not questioning that the new behavior is useful, but I'm questioning whether we can just stop supporting the old behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt anyone was hinging on the old behavior. I don't think this particular implementation was important, I think it was just important that the bindings persisted whatsoever, irrespective of how the bindings were then unified.

I believe this to have been an oversight at the time I implemented persistent bindings. The new behavior is such that you should still get the finding, unless the persisted bindings are in conflict.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be good to then note here that "previous behavior was but in retrospective this was a mistake for , we prefer new behavior". When I read these comments it felt kinda arbitrary why we changed the behavior, and the old comment here tries to make a point that it's important to produce a finding.

@@ -0,0 +1,20 @@
rules:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add that rule that Lewis was trying to write as a test?

Before, we would produce a new match per condition success, where each match had only the
metavariables introduced by a single succeeding instance of the condition.
*)
List_.fold_right
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the use of fold_right vs fold_left important here?

introducing a different set of bindings.

For metavariable-regex foo\((?<A>)(?<B>)\) on the text "foo(12)", we could have
[ [A, ""; B, "12"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have appreciated using a running example, a bit like in here, to better visualize what each step is doing.

let rec filter_ranges (env : env) (xs : (RM.t * MV.bindings list) list)
(cond : R.metavar_cond) : (RM.t * MV.bindings list) list =
(* [filter_ranges] filters all our ranges by all our conditions simultaneously.
We do this via "phases" -- first, for each condition, and within each
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what this means now, but before reading the code, I didn't really get it.

instance, which introduces no metavariables.
*)
Some [ [] ]
else None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a MV.bindings list option or just MV.bindings list ?

@IagoAbal
Copy link
Collaborator

IagoAbal commented May 6, 2024

@IagoAbal , opinions?

@aryx if you're not against, I would approve this. I would just appreciate some comments to make it easier to follow the code. The change makes sense, I hope it doesn't have much of a perf impact, I would like to see the Argo workflow running on this again (the old run's data is gone).

@@ -554,147 +554,209 @@ let children_explanations_of_xpat (env : env) (xpat : Xpattern.t) : ME.t list =

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No changelog!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re to the test that changed, this is indeed a breaking change. I'd hope it doesn't break many rules, but if someone has a couple of metavariable-patterns where it reused the same metavariable, it may stop working. Need to make that clear in the changelog so folks can easily spot what may be causing their rules to break. We could also provide a rule option to disable this behavior, but I'd personally be fine not adding it for now and wait until we get any complain.

@@ -554,147 +554,209 @@ let children_explanations_of_xpat (env : env) (xpat : Xpattern.t) : ME.t list =

let hook_pro_entropy_analysis : (string -> bool) option ref = ref None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should update docs too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants