Skip to content
This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Can the compiler move guards around? #154

Open
gvanrossum opened this issue Sep 20, 2020 · 7 comments
Open

Can the compiler move guards around? #154

gvanrossum opened this issue Sep 20, 2020 · 7 comments
Labels
accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP

Comments

@gvanrossum
Copy link
Owner

gvanrossum commented Sep 20, 2020

Since we only allow guards at the top level, one is sometimes required to write things like

case [x, y, z] if x >= 0:

Here presumably the compiler generates something like

  • check for sequence
  • check length == 3
  • extract x
  • extract y
  • extract z
  • check guard

But the "check guard" step could be move up two slots so it comes before we even extract y and z. In some cases (e.g. for more complex patterns y and z) this might make a significant difference.

  • @brandtbucher: Would this be easy to do? You'd have to check the guard for occurrences of the various capture variables.
  • All: We'd have to trust (or not care) that there isn't a function call in a guard that implicitly references one of the later capture variables, so we'd have to have some explicit language in the PEP to allow this. (Similar to the existing language about caching certain outcomes.)
@gvanrossum gvanrossum added the open Still under discussion label Sep 20, 2020
@brandtbucher
Copy link
Collaborator

My immediate impression is that this seems fine. I haven’t thought about the full implications, but it shouldn’t be too hard to implement some sort of dependency graph in the pattern compiler (something like this will be needed to do simpler optimizations like caching name lookups anyways).

However, while I have been doing some experimenting on the side, I’m not planning on implementing any optimizations until the “dumb” implementation is already merged. The PR review will already be wildly complex, and I think that knowledge transfer for other maintainers will be better if we keep things simple to start.

@gvanrossum
Copy link
Owner Author

That’s a good plan, and I would even say if you can simplify the code by making it slower that would be fine.

@brandtbucher
Copy link
Collaborator

Cool. Isn’t there a quote attributed to you about Python having the “dumbest compiler ever” (or something to that effect)? ;)

@Tobias-Kohn
Copy link
Collaborator

We might have to be very, very careful with this. At the top of my mind, I am thinking of something along the lines of:

y = 0

def get_y():
    return y

def foo(arg):
    global y
    match arg:
        case (x, y) if x == get_y(): ...

With the proposed optimisation, we clearly end up getting the wrong result.

So, if the compiler can prove that the expression in the guard has no way of accessing any variables bound by subsequent patterns, it might be fine.

However, the intent of the question is probably more about the guarantees given by the specs, though. The specs would probably not want to explicitly name or propose such optimisations, i.e. allow them only to be applied if the runtime semantics does not change at all. That being said, at the same time, the specs should make it clear that there is no guarantee ever that the interpreter will or will not check any given pattern.

In a case like the following, the interpreter does IMHO not have to check the patterns P and Q applied to 1 and 2, respectively, first. If it discovers that 3 == "three" from the outset, it can just skip the entire pattern directly. At the same token: it does not have to skip P and Q, either, of course.

match (1, 2, 3):
    case (P, Q, "three"): ...

@brandtbucher
Copy link
Collaborator

brandtbucher commented Sep 21, 2020

We'd have to trust (or not care) that there isn't a function call in a guard that implicitly references one of the later capture variables, so we'd have to have some explicit language in the PEP to allow this. (Similar to the existing language about caching certain outcomes.)

I think that if we're already fine breaking things like side-effecty attribute/item access, then breaking @Tobias-Kohn's example is a no-brainer. ;) Note that our handling of dotted names means that this pattern won't be an issue for assignments to attributes on self, for example... I think that using global/nonlocal name bindings in unintuitive ways like this is the only trivial way of producing broken code.

Are there any non-toy examples that are broken by this, but are still clearly better written using pattern matching than other control-flow constructs like if/else? I'm struggling to come up with any.

@gvanrossum
Copy link
Owner Author

I think it's a purely theoretical concern, but nevertheless people could write code to find out how the implementation works in practice (or just use "dis"), and then write obfuscated code that depends on it just to show off. :-)

@gvanrossum gvanrossum pinned this issue Oct 20, 2020
@gvanrossum gvanrossum added accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP and removed open Still under discussion labels Oct 20, 2020
@gvanrossum
Copy link
Owner Author

I consider this fully pepped; PEP 635 has explicit language:

If a guard stipulates that a variable x must be positive, say (i.e. if x > 0), the interpreter might check this directly after binding x and before any further subpatterns are considered.

@gvanrossum gvanrossum unpinned this issue Oct 22, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP
Projects
None yet
Development

No branches or pull requests

3 participants