Literal comparison for True, False, None #139

stereobutter · 2020-08-17T05:40:10Z

As much as I disagree with @markshannon about the general usefulness of pattern matching, I agree with him about (and previously hadn't thought of) cases like

match key:
    ...
    case True:
        key = 'true'
    case False:
        key = 'false'
    case int():
        key = _intstr(key)
    ...

where for match 1: the case True is selected. The reverse happens when one puts case int() before case True: for match True, case int() is selected.

I also noticed the discussion in #16 which mentions the handling of True and 1 and 1.0 (the latter matching case True is even more surprising imho.)

This is probably not what people expect (I didn't) and also not that helpful when match is going to be used a lot precisely in cases like the above, where a function's return value depends on the type of its argument (and some literal values). I'd expect people to think of True and False more as constants that they want to literally match a value against and less as some_value == True style equality checking. If they wanted equality checking, case x if x == True: is much more explicit about what it does than the current behavior of using equality checking behind the scenes.
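The ordering sensitivity described above can be reproduced without match at all. A minimal sketch, assuming literal patterns are compared with == as in the draft under discussion; select_eq and select_int_first are hypothetical helper names that emulate the two case orderings:

```python
def select_eq(key):
    """Emulate `case True` / `case False` / `case int()` using == comparison."""
    if key == True:  # noqa: E712 - equality check is deliberate here
        return "case True"
    if key == False:  # noqa: E712
        return "case False"
    if isinstance(key, int):
        return "case int()"
    return "no match"


def select_int_first(key):
    """Same idea, but with `case int()` placed before `case True`."""
    if isinstance(key, int):
        return "case int()"
    if key == True:  # noqa: E712
        return "case True"
    return "no match"


print(select_eq(1))            # case True
print(select_eq(1.0))          # case True
print(select_int_first(True))  # case int()
```

Neither ordering keeps 1 and True apart, which is the motivation for comparing True and False by identity.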


stereobutter · 2020-08-17T06:59:49Z

match x:
    case True: ...
    case 1: ...
    case 1.0: ...

would work intuitively if the check for literals were isinstance(x, type(literal)) and x == literal.

def check(match, case):
    return isinstance(match, type(case)) and match == case

[(f'match {match} case {case}', check(match, case)) for match in (True, 1, 1.0) for case in (True, 1, 1.0)]
>>>
[('match True case True', True),
 ('match True case 1', True),
 ('match True case 1.0', False),
 ('match 1 case True', False),
 ('match 1 case 1', True),
 ('match 1 case 1.0', False),
 ('match 1.0 case True', False),
 ('match 1.0 case 1', False),
 ('match 1.0 case 1.0', True)]

Notice though that case True must come before case 1: for x = True, isinstance(True, type(1)) and True == 1 is still true, so True matches case 1. To be able to resolve the ambiguity between True and 1 (and False and 0) one would need to special-case bool, and the check for literals would be something similar to:

def check(match, case):
    if isinstance(match, bool):
        return match is case
    else:
        return isinstance(match, type(case)) and match == case

examples = [False, True, 0, 0.0,  1, 1.0]
[(f'match {match} case {case}', check(match, case)) for match in examples for case in examples]
>>>
[('match False case False', True),
 ('match False case True', False),
 ('match False case 0', False),
 ('match False case 0.0', False),
 ('match False case 1', False),
 ('match False case 1.0', False),
 ('match True case False', False),
 ('match True case True', True),
 ('match True case 0', False),
 ('match True case 0.0', False),
 ('match True case 1', False),
 ('match True case 1.0', False),
 ('match 0 case False', False),
 ('match 0 case True', False),
 ('match 0 case 0', True),
 ('match 0 case 0.0', False),
 ('match 0 case 1', False),
 ('match 0 case 1.0', False),
 ('match 0.0 case False', False),
 ('match 0.0 case True', False),
 ('match 0.0 case 0', False),
 ('match 0.0 case 0.0', True),
 ('match 0.0 case 1', False),
 ('match 0.0 case 1.0', False),
 ('match 1 case False', False),
 ('match 1 case True', False),
 ('match 1 case 0', False),
 ('match 1 case 0.0', False),
 ('match 1 case 1', True),
 ('match 1 case 1.0', False),
 ('match 1.0 case False', False),
 ('match 1.0 case True', False),
 ('match 1.0 case 0', False),
 ('match 1.0 case 0.0', False),
 ('match 1.0 case 1', False),
 ('match 1.0 case 1.0', True)]

JelleZijlstra · 2020-08-17T14:29:50Z

I agree that this behavior is confusing, but your proposed change could also lead to confusing behavior: if I pass say a numpy.float64(1.0), I'd probably expect it to match 1.0, but it's a different type.

Also, this same behavior would happen if you did a series of if x == checks (although to be fair, there you'd have the option of writing x is True for the bool). In practice this may not be a big deal because people aren't terribly likely to mix bools and ints or floats in the same variable.

gvanrossum · 2020-08-17T14:35:50Z

Yeah, I had something like this (an isinstance check based on the type of the literal) in an earlier version, but there were too many surprising corner cases. To start, 'case 3.0' should match int as well as float, according to the numeric tower. I played with isinstance and the types from the numbers module, but those are too slow.

The only part here that does make sense is to use 'is' instead of '==' for True, False and None.
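For readers unfamiliar with the numeric tower point, a small sketch of the facts involved. It uses only the standard numbers module and is not the earlier implementation referred to above, which is not shown in this thread:

```python
import numbers

# Equality follows the numeric tower: an int equals the float of the same value.
print(3 == 3.0)                             # True

# A plain isinstance check against the literal's type does not.
print(isinstance(3, float))                 # False

# The abstract base classes in `numbers` do model the tower.
print(isinstance(3, numbers.Real))          # True
print(isinstance(3.0, numbers.Real))        # True

# bool sits inside the tower as well, which is the True/1 problem.
print(isinstance(True, numbers.Integral))   # True
```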

stereobutter · 2020-08-17T14:39:14Z

@gvanrossum Proposing that is should be used instead of == for True and False is actually why I first opened the thread and I agree that this solves most of the issues because the average python developer is probably aware of the numeric tower (minus the True and False foot gun). I just made the alterations above and showed check to point out that maybe not using the numeric tower is also an option since match is a totally new sublanguage anyway and distinguishing between booleans, ints, floats (how are bytes and strings currently handled btw?) etc. is something where match could really shine (without the developer having to be aware of all the intricacies involved)

gvanrossum · 2020-08-17T14:43:26Z

Let’s add this to the next revision of the PEP, once the SC has sent us their feedback (expected this week).

stereobutter · 2020-08-17T14:46:47Z

if I pass say a numpy.float64(1.0), I'd probably expect it to match 1.0, but it's a different type.

@JelleZijlstra Actually numpy.float64(1.0) would match case 1.0 since issubclass(numpy.float64, float) is true.

JelleZijlstra · 2020-08-17T14:51:27Z

@JelleZijlstra Actually numpy.float64(1.0) would match case 1.0 since issubclass(numpy.float64, float) is true.

Oops, you're right. However, issubclass(numpy.int64, int) is False, so let's pretend I said numpy.int64(1) instead (or numpy.float32(1.0), which is also not a subclass of float). I suppose it's because float64 has the same range as builtins.float, while int64 has a smaller range because builtins.int is arbitrary precision.

I agree with using is for builtin singletons in pattern matching.
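The same surprise can be shown without NumPy installed. A sketch that uses fractions.Fraction purely as a stand-in for a numeric type that is not a subclass of float (an assumption made for illustration; the thread's examples are numpy.int64 and numpy.float32), together with the check function proposed earlier:

```python
from fractions import Fraction


def check(match, case):
    # The isinstance-plus-equality rule proposed earlier in the thread.
    return isinstance(match, type(case)) and match == case


one = Fraction(1, 1)

print(one == 1.0)              # True: equal by value
print(isinstance(one, float))  # False: but not a float
print(check(one, 1.0))         # False: so the proposed rule rejects it
print(check(1.0, 1.0))         # True
```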

stereobutter · 2020-08-17T14:54:39Z

@JelleZijlstra yeah I just noticed that numpy.float16(1.0) would not match case 1.0, which I'd think might actually be a feature, since someone who goes to the length of using numpy-specific types might actually care about distinguishing numpy types from regular types, e.g. to use match to implement a numpy-optimized version of some function if a numpy type is passed, or to use some specific precision logic for numpy types.

stereobutter · 2020-08-17T15:05:08Z

@gvanrossum
Another argument against using the numeric tower would be that match is conceptually much more about destructuring objects and dispatching on types than about comparing values i.e. switch-case semantics. In that light using isinstance checks instead of == for matching literal values does make a ton of sense (to me at least ;) )

gvanrossum · 2020-08-17T16:53:38Z

I'm beginning to think that Rust was right after all to disallow floats in patterns altogether. This made it into the PEP's Rejected ideas section, and I don't think it would be quite right for Python. E.g. what would we do if a value pattern had a float value? Come to think of it, value patterns currently strictly use ==. If we were to change the semantics of case True, would that mean we'd also change the semantics of case x.y if x.y happens to be the constant True?

One more concern with simple isinstance checks vs. the numeric tower: int is not a subclass of float, but Python does a lot of work to make integers acceptable wherever floats are. (E.g. the change to division semantics was motivated by this.) PEP 484 also considers int a subtype of float. So I think it would be uncool if this didn't work:

match 42:
  case 42.0: print("The answer")
  case _: assert False

stereobutter · 2020-08-17T17:20:47Z

I totally forgot about PEP 484 considering int a subtype of float and you are of course right that the distinction between 1 and 1.0 does not matter much in typical python code. For True and False however I think special casing is warranted since treating True and False as integers is rather less common (I'd personally say this is borderline discouraged). Maybe for pedantic people like me Literal[42.0] could be (ab-)used to signal that "yes I literally mean 42.0, not 42 thank you very much".

stereobutter · 2020-08-17T17:26:40Z

curious question: would the parser be able to distinguish float(42.0) and 42.0? Then if someone wanted to be really pedantic about their floats and ints one might be able to write:

match value:
    case float(42.0): ...
    case int(42): ....

brandtbucher · 2020-08-17T18:19:35Z

curious question: would the parser be able to distinguish float(42.0) and 42.0? Then if someone wanted to be really pedantic about their floats and ints one might be able to write:
match value:
    case float(42.0): ...
    case int(42): ....

This already works the way you expect it to.

stereobutter · 2020-08-17T18:33:30Z

@brandtbucher I didn't know that; thanks a lot. Using that syntax it's possible to write

match x:
    case bool(True):...
    case int(1): ...
    case float(1.0): ...

however the issue with x=True matching case int(1) remains so one has to be careful about the order of the case statements (someone should probably tell @markshannon ;) )

... also now that we know how to distinguish (bool), int and float... how does one spell just any old number? case Number(x): ... with Number from numbers doesn't seem to work (of course there is always case int(x) | float(x): ... but I would have really liked it if Number just worked)

brandtbucher · 2020-08-17T18:44:32Z

... also now that we know how to distinguish (bool), int and float... how does one spell just any old number? case Number(x): ... with Number from numbers doesn't seem to work (of course there is always case int(x) | float(x): ... but I would have really liked it if Number just worked)

Not exactly sure what you want without more helpful examples, but perhaps case x := Number() or case Number() if value == x fits your need?

Certain built-ins like bool, int, and float are special-cased for the above behavior, so it won't just work with any class.

stereobutter · 2020-08-17T18:50:29Z

I thought about something like json-serialization, e.g.

def encode(value):
    match value:
        case None: return 'null'
        case bool(x): return str(x).lower()
        case str(x): return x
        case int(x) | float(x): return x
        case {**kwargs}: return {encode(k): encode(v) for k, v in kwargs.items()}
        case [*args]: return [encode(e) for e in args]
        case obj: raise ValueError(f'{obj} is not serializable')

side note: maybe this example, trivial as it is, should be part of the PEP because everyone knows json and it is a good example of the type of application where match really shines.

stereobutter · 2020-08-17T19:15:11Z

maybe this example, trivial as it is, [...]

It actually isn't all that trivial, since there are also float('-inf'), float('inf') and float('nan'). The first two don't currently appear to compare correctly using the syntax case float('inf'), even though == works for them, while for float('nan') comparison with == doesn't work at all.

It would really be a shame if json serialization (with all its weird edge cases) could not be written as something like:

def encode(value):
    match value:
        case None: return 'null'
        case str(x): return x
        case bool(x): return str(x).lower()
        case undef:= float('inf') | float('-inf') | float('nan'): return str(undef)
        case int(x) | float(x): return x
        case {**mapping}: return {encode(k): encode(v) for k, v in mapping.items()}
        case [*iterable]: return [encode(e) for e in iterable]
        case obj: raise ValueError(f'{obj} is not serializable')

JelleZijlstra · 2020-08-17T19:47:44Z

You could do it with a guard, something like case float(x) if math.isnan(x) or math.isinf(x).

gvanrossum · 2020-08-17T19:54:08Z

For True and False however I think special casing is warranted

Yes. Also for None. These are all final types BTW. And I'd say only for these three we should use is instead of ==.

For the other stuff, I believe we should not change anything. And given the dark edge cases of the JSON example I don't think we should add it to the PEP (it's already too long).

brandtbucher · 2020-08-17T19:58:14Z

Agreed. I guess the only remaining question here is:

If we were to change the semantics of case True, would that mean we'd also change the semantics of case x.y if x.y happens to be the constant True?

I'm neutral.

gvanrossum · 2020-08-17T20:46:45Z

If we were to change the semantics of case True, would that mean we'd also change the semantics of case x.y if x.y happens to be the constant True?

I'm neutral.

Yeah, that's a tough one. For case True and so on the code generator can easily generate different code. But I imagine that for case x.y and other value patterns the code generator currently just loads the value and then compares it to the target using == -- it would seem awkward to have to generate special cases for True, False and None there. So I'm inclined not to change the semantics of value patterns, and always use == for those. After all people shouldn't be creating aliases for True or False (let alone None, oh horror :-) to be used as "constant" values; they should be creating enums.

brandtbucher · 2020-08-17T20:58:00Z

Well, in terms of implementation it would probably just be a new opcode that does the correct comparison at runtime, rather than separate compile-time code paths. But your rationale for not wanting to do this is sound, and I think it's easier to explain.

Tobias-Kohn · 2020-08-17T21:12:29Z

Agreed. I would keep the semantics bound to syntax as close as possible, i.e. not use sometimes is and sometimes == when comparing to constants.

The idea that case True: really should mean True and nothing else makes a lot of sense to me. However, when using constants, it also feels more natural that case consts.my_true: matches anything that is ==-equal to whatever my_true is. There is still the option to use something like case bool(consts.my_true): or case x if x is consts.my_true: if the type or is semantics are really important—with the plus that the more verbose syntax stresses the tighter constraints wanted there.

viridia · 2020-08-17T21:18:40Z

One of the early arguments for the (deferred) custom matcher feature was to be able to have matchers of the form equals(x) or isExactly(x), which I think reads more intuitively than bool(True).

stereobutter · 2020-08-17T21:23:09Z

@viridia I think it would be very unfortunate if we had to (import and) use a helper function for dealing with one of the primitive data structures in the language. Not to mention that we'd need to wait for the custom matcher protocol (that I am very much looking forward to)

Tobias-Kohn · 2020-08-17T21:27:41Z

@viridia This probably addresses the numerous ideas brought forward for wanting to have something like case ==x and case is x. However, as much as I like the idea of the custom matcher in general, I think the issue with the NFT-trio here is a slightly different one: as I understood it, the main idea is to make case True etc. more behave as you would expect them, not to avoid syntax like case bool(True): as such. So, I wouldn't want to go down that route with a general is vs == debate right now.

gvanrossum · 2020-09-14T20:35:41Z

I am putting this in the revised PEP (PEP 634, python/peps#1598), so I am labeling this as "accepted" and "fully pepped". None, True and False will be compared using is instead of ==. @brandtbucher Would this be easy to implement in your branch?

brandtbucher · 2020-09-14T20:38:21Z

Yep, just two lines in the compiler. I assume this is only for literals, not constant value patterns that evaluate to True/False/None. Otherwise we'll need a new opcode, but it's still not too bad.

gvanrossum · 2020-09-14T21:28:29Z

Indeed, nothing changes for value patterns.

brandtbucher · 2020-09-16T04:19:31Z

This has been implemented.

dmoisset · 2020-10-14T23:20:33Z

I was just re-reading the spec in PEP-634 and found something which I think is misspecified around this. It's essentially about literal patterns appearing as mapping keys, i.e. a pattern like {True: var}. If I use {1: "foo"} as a subject, does it match?

I'm guessing the reasonable and implementable answer is "yes", although our spec says "A mapping pattern succeeds if every key given in the mapping pattern matches the corresponding item of the subject mapping.", which is False for this example.

gvanrossum · 2020-10-15T03:43:16Z

Oh, good eye for detail! Indeed, that totally didn't get rephrased when we changed how True/False/None are matched. Given how dict key lookup works,

match {1: "foo"}:
    case {True: var}: print(var)

should totally print "foo". Can you add a separate PR to address this in PEP 634?

dmoisset · 2020-10-22T23:08:22Z

Added to python/peps#1675

ghost · 2021-03-17T09:14:06Z

I've looked again at the PEP and also in the issues here on GitHub but I didn't find any mention of Ellipsis (except for some proposals to use ... instead of _ as wildcard). If None, True and False are special-cased, Ellipsis should probably be too? I have on occasion used ... as a placeholder where I couldn't use None because I had to distinguish between no result and the result being None. See also https://stackoverflow.com/a/120055.

gvanrossum · 2021-03-17T15:03:57Z

Ellipsis is not a reserved word in the grammar so it is not special-cased. If you need to match on it, you can do the following to avoid binding a local variable named Ellipsis:

import builtins

match ...:
    case builtins.Ellipsis: ...

brandtbucher · 2021-03-17T15:54:22Z

@sdesch see also this comment.

stereobutter changed the title from "Literal comparison for True and False" to "Literal comparison for True, False, 1, 0, 1.0, and 0.0" Aug 17, 2020

gvanrossum added the "needs more pep" (An issue which needs to be documented in the PEP) and "open" (Still under discussion) labels Aug 17, 2020

gvanrossum changed the title from "Literal comparison for True, False, 1, 0, 1.0, and 0.0" to "Literal comparison for True, False, None" Aug 17, 2020

stereobutter mentioned this issue Aug 17, 2020

Literal comparision for float('inf') and float('-inf') #141

Closed

gvanrossum added the "accepted" (Discussion leading to a final decision to include in the PEP) and "fully pepped" (Issues that have been fully documented in the PEP) labels and removed the "needs more pep" (An issue which needs to be documented in the PEP) and "open" (Still under discussion) labels Sep 14, 2020
