This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Thomas Wouters' objections #165

gvanrossum opened this issue Nov 6, 2020 · 13 comments

Comments

@gvanrossum
Owner

See his python-dev message.

Mainly he's still very worried about wildcards and would strongly prefer us to use ?.

He's also worried about mapping patterns defaulting to ignore extra keys.

The root reason for those worries is different -- for mappings he's just worried that this is a trap for users, while for wildcards his worry is that it closes the door to unifying patterns and assignments more.

I'm not sure how much I want to push back on this before I hear how the rest of the SC thinks about this.

For mappings we could just adopt the other semantics, require people to write **_ to ignore extra keys, and be done with it -- if the SC wants that. I personally think that this is the common use case but I can see how some people would think of mapping patterns as more similar to sequence patterns than to class patterns.

For wildcards it does look like it would be very painful to evolve to a version of the language where _ is always a wildcard; we would have to invent a new approach to i18n tooling and introduce that gradually over many versions before we could even think about deprecating use of _ in expressions. (Though perhaps we could deprecate using _ when it's known to be a local variable much sooner.)

I am particularly worried about i18n tooling because my past experience at large companies using these suggests that this kind of tooling does not receive a lot of developer love, and having to change all occurrences of _("...") to something else would be a huge pain.

That said I do see a way forward there, just a long and slow way (Thomas seems to think there is no way).

I just find ? as a wildcard hideous.

Thoughts about a strategy here?

@brandtbucher
Collaborator

brandtbucher commented Nov 6, 2020

I am willing to compromise on mapping patterns (perhaps in exchange for _ 🙂). I don't have a strong opinion... but "consistency" arguments aside, the proposed change doesn't result in any loss in expressiveness or clarity. In fact, we cut out the need for a guard (and a wasteful dict construction) in the "total match" case:

            total                     not total
current     {..., **x} if not x       {...}
proposed    {...}                     {..., **_}

Although "not total" is perhaps the more common use case.

Regarding "closing the gap" between patterns and assignment targets, the language I drafted in PEP 635 refuting PEP 642 pretty much sums up my thoughts on the issue:

We believe that attempting to unify the grammars of assignment targets and patterns is attractive, but misguided... In contrast, consider function parameters and iterable unpacking: while they are certainly similar, they each have key syntactic incompatibilities that reflect their different purposes...

There are deeper issues as well. Even if the grammars for both assignments and patterns are made "consistent" with one another, strings, byte-strings, mappings, sequences, and iterators will all behave differently in both contexts. This leaves us in a worse situation: one where the grammar is consistent, but the behavior differs in meaningful ways.

I think it's unfortunately quite easy for others to look at the final syntax and semantics, and not realize all of the long, winding roads and dead-ends we traveled to get there. We didn't create sequence patterns and mapping patterns by just saying "container displays and iterable unpacking are cool, let's do those!".

More specifically, I think that _ is beside the point; it's sort of a necessary evil that sequence patterns and iterable unpacking look the same, and sometimes work the same. But taking additional steps to make them any more alike will only create confusion, in my opinion (at a certain point, the differences become "special cases").

@brandtbucher
Collaborator

Or we just get the ball rolling with from __future__ import _. 🙃

@brandtbucher
Collaborator

brandtbucher commented Nov 6, 2020

Whoops. I hate that button.

@brandtbucher brandtbucher reopened this Nov 6, 2020
@Tobias-Kohn
Collaborator

I fully agree with both of you on this. And after Brandt's great answer here to the "Making the two unnecessarily different..."-part, there is not much I can add.

Concerning the mapping patterns, I do not really have a strong opinion on that and am thus open to changing it if there is consensus that the "not total" version is better.

@gvanrossum
Owner Author

Although "not total" is perhaps the more common use case [for mapping patterns].

And there's the rub. For class patterns we certainly know it -- a class may have dozens of keys but a particular usage may have only a need for a few (e.g. the BinOp class we use in examples may have extra attributes to indicate line/column numbers etc.).

The current proposal is assuming that "not total" is significantly more common for mapping patterns. Apart from intuition, is there any way we can quantify this?

Short of data, my intuition is that the typical use case for mapping patterns is e.g. checking JSON server responses, like this:

match response:
    case {"error": message}: ...
    case {"status": "OK", "data": data_bytes}: ...

In this context we certainly want to ignore extra keys -- that's the common convention for responses, and this code is presumably equivalent to something like this:

if "error" in response:
    message = response["error"]
    ...
elif response.get("status") == "OK" and "data" in response:
    data_bytes = response["data"]
    ...

Can someone come up with an example where totality is important?

(I should look at how totality is used for TypedDict, PEP 589.)

@gvanrossum
Owner Author

More specifically, I think that _ is beside the point; it's sort of a necessary evil that sequence patterns and iterable unpacking look the same, and sometimes work the same. But taking additional steps to make them any more alike will only create confusion, in my opinion (at a certain point, the differences become "special cases").

I'd give Thomas (being on the SC) some credit. He was in the Skype meeting with the SC and clearly has followed the discussion carefully. His were some of the most thoughtful comments on the PEP 622 draft that was discussed there. And he represents an important different culture in that his company has different conventions for unused assignments. I don't think they are unique or wrong -- just different.

@Tobias-Kohn
Collaborator

Here is an example where you might want to have "total" semantics. It is certainly not up to win a beauty contest, of course, but might help with the discussion nonetheless.

def create_rect(*args, **kwargs):
    match args, kwargs:
        case [(x1, y1), (x2, y2)], {}:
            return rect(x1, y1, x2, y2)
        case [x1, y1], { 'width': wd, 'height': ht }:
            return rect(x1, y1, x1 + wd, y1 + ht)
        case [x1, y1], { 'right': x2, 'bottom': y2 }:
            return rect(x1, y1, x2, y2)
        case [], { 'left': x1, 'top': y1, 'right': x2, 'bottom': y2 }:
            return rect(x1, y1, x2, y2)
        case [], { 'topleft': (x1, y1), 'bottomright': (x2, y2) }:
            return rect(x1, y1, x2, y2)

The idea behind this example is to use pattern matching so as to simulate function overloading with named arguments.

@gvanrossum
Owner Author

FWIW Nick Coghlan thinks that match patterns should default to ignoring extra keys (i.e. agrees with us).

@dmoisset
Collaborator

dmoisset commented Nov 8, 2020

As usual, I'm jumping late to the party, but here's my position on the current discussions:

I appreciate Thomas providing a design rationale for his changes rather than just the changes. I can empathise with the idea of "make possible a future when matching is a generalisation of assignment", and I've considered it previously. I would have pushed for it if I had seen a "clean" way to achieve that goal, so the goal isn't bad but it's more a matter of practicality for me. I would have loved to be able to write rightmost_x = max(x for Point(x, _) in list_of_points) in Python, but I'm not sure it's generally viable and the language provides good alternatives for that anyway.

My rebuttal to this (and possibly to most of Nick's PEP 642) is: it's fine that we may want to use a pattern as a generalised unpacking. But why do we need to do it with an assignment? We could have something else IF we need this feature in the future.

On the topic of mapping totality, I have to confess that I'm inclined to agree with Thomas's position; I think there's no clear winner for when you want each, so forcing explicitness is better. I've used (probably badly designed) APIs where you have two versions of a JSON response with an optional argument, and checking for those could be problematic:

# Option 1, current semantics
match get_response():
    case {'required1': r1, 'required2': r2} as response if 'optional' not in response:
        do_something(r1, r2)
# Option 2, current semantics
match get_response():
    case {'required1': r1, 'required2': r2, 'optional': _}:
        pass # We don't care about this
    case {'required1': r1, 'required2': r2}:
        do_something(r1, r2)

In Thomas's semantics you never have "difficult cases"; the worst that can happen is that you need to add **_. It's also more consistent with our notion of TypedDict (which is total by default). The only ugliness is ending your pattern with a sequence of ASCII noise like , **_} (five symbols/four non-alphanumeric tokens in a row). I would be much happier if this case ended with something like , ...} instead, although I'm not sure it's worth having More than One Way To Do It. But total or not, it's not a hill I'd die on.

Regarding the wildcard, making local _ variables work as in patterns as suggested above would create some weird inconsistencies. For example, if you copy/paste a match statement from a function in your code to a global context (module, REPL, jupyter) it may stop working if you have more than one underscore in a pattern. I find Nick's idea of using a double underscore more tolerable (it is less ugly than ?, more compatible with gettext conventions, and not too far from other languages). Note that even if we do that, most of the time you will still be able to get away with a single underscore ;-)

@gvanrossum
Owner Author

The problem with the totality discussion is that this is a one-way door: once we've picked semantics we are stuck there. So we do indeed need to tread carefully. The good news is that it doesn't look like a big deal either way, so if we want to we can let the SC struggle with this particular question.

FWIW I don't quite get your example: you're saying that if the key 'optional' is present the response is to be ignored? Honestly that seems like a very odd API design.

Regarding the wildcard, making local _ variables work as in patterns as suggested above would create some weird inconsistencies. For example, if you copy/paste a match statement from a function in your code to a global context (module, REPL, jupyter) it may stop working if you have more than one underscore in a pattern.

I'm not following what you're saying here. What does "as suggested above" refer to? Without understanding that I have no idea why a pattern with two underscores would break in a global context. (Is it because of the special meaning of _ in the REPL?)

PS. I think Nick also prefers | over or.

@Tobias-Kohn
Collaborator

May I just briefly reiterate Brandt's earlier point about 'closing the gap'. Even if we did away with all disputed and fancy patterns such as wildcard- and or-patterns, we would certainly still want to retain literal patterns, for instance. This means that we would have to allow assignments of the form 1 = x, which is basically equivalent to an assert(x == 1) (Elixir does that).

Knowing how much novice programmers struggle with understanding that assignments are right-to-left (i.e. x = 1 and not 1 = x), I would absolutely hate to see such syntax suddenly appear in a language that is so widely used for education. Hence, even if we could eventually make assignments and pattern matching consistent, it would come at a very high price---too high as far as I am concerned.

Pattern matching as we propose it is by design a feature that lives in its own little world, by which I mean that you have to actively turn it on with the match statement. It does not leak out and subtly change the syntax or semantics of other long established structures. I would claim that this is not a fault of pattern matching, but a feature that allows people to continue learning Python gradually.

So, long story short, I reject the very premise upon which these ideas of why the wildcard, say, might become a problem, are based. Thomas has crafted a carefully worded and measured email and certainly has given it some thought. Still, I think our pattern matching proposal is held here against an unrealistic bar.

@gvanrossum
Owner Author

That is a very good way of stating it. Do you feel comfortable responding to Thomas? (I repeatedly find that I don't know how to respond because to me everything Thomas says feels just wrong.)

@Tobias-Kohn
Collaborator

Thanks and yes, I can do that.
