This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

JSON example #155

Open
gvanrossum opened this issue Oct 9, 2020 · 5 comments

@gvanrossum
Owner

gvanrossum commented Oct 9, 2020

What do people think of this example? I don't think it's great (the str(name) etc. grates). However it does validate the JSON (ignoring malformed and extra data -- it's easy to extend it to reject those, but boring).

# Cats and Dogs

import sys
import json
from dataclasses import dataclass

@dataclass
class Animal:
    pass

@dataclass
class Pet(Animal):
    name: str
    breed: str

@dataclass
class Dog(Pet):
    leash_color: str

@dataclass
class Cat(Pet):
    favorite_toy: str

def get_pets(raw: object) -> list[Cat | Dog]:
    match raw:
        case [*raw_pets]:  # List of pets
            return list(filter(None, (get_pet(raw_pet) for raw_pet in raw_pets)))
        case {**raw_pet}:  # Maybe a single pet
            pet = get_pet(raw_pet)
            return [pet] if pet else []

def get_pet(raw_pet: object) -> Cat | Dog | None:
    match raw_pet:
        case {"type": "cat", "name": str(name), "breed": str(breed), "favorite_toy": str(toy)}:
            return Cat(name, breed, toy)
        case {"type": "dog", "name": str(name), "breed": str(breed), "leash_color": str(leash)}:
            return Dog(name, breed, leash)
        case _:
            return None  # Not a known type of pet

def main() -> None:
    raw = json.load(sys.stdin)
    for pet in get_pets(raw):
        print(pet)

if __name__ == "__main__":
    main()
@Tobias-Kohn
Collaborator

The example certainly works, but it looks a wee bit artificial to me, to be perfectly honest.

Yes, the str(name) part is a bit of an ugly spot. The reason why it bothers me a bit is that it is used to validate the data without really handling it. To my mind, patterns should be used to let the interpreter select the 'correct' case clause and extract data. Somehow, having a pattern like { "name": str(name) } suggests to me that there will be another pattern { "name": ... } handling anything that is not a string. But I would not use the str() here just as a kind of type annotation.

So, may I propose to slightly change the example like this:

def get_pet(raw_pet: object) -> Cat | Dog | None:
    match raw_pet:
        case {"type": "cat", "name": str(name), "breed": str(breed), "favorite_toy": str(toy)}:
            return Cat(name, breed, toy)
        case {"type": "dog", "name": str(name), "breed": str(breed), "leash_color": str(leash)}:
            return Dog(name, breed, leash)
        case {"type": "cat" | "dog"}:
            raise malformed_data(repr(raw_pet))
        case _:
            return None  # Not a known type of pet

Perhaps it might also be a nice touch to add a very brief description or title saying something like "transforming data from JSON to Python data classes."

@gvanrossum
Owner Author

Okay, but maybe then the example should validate everything and raise in the default case? That would actually simplify things a bit because we can get rid of the Optional (| None).

We should also add case _: raise ... to get_pets() then.

@Tobias-Kohn
Collaborator

Yes, that sounds good to me. I would still have the additional case then, though, and use perhaps two different exception types. One could be malformed_data and the other not_a_pet, for instance.

@gvanrossum
Owner Author

Okay, then my next version is:

# Cats and Dogs

import sys
import json
from dataclasses import dataclass

@dataclass
class Animal:
    pass

@dataclass
class Pet(Animal):
    name: str
    breed: str

@dataclass
class Dog(Pet):
    leash_color: str

@dataclass
class Cat(Pet):
    favorite_toy: str

def get_pets(raw: object) -> list[Cat | Dog]:
    match raw:
        case [*raw_pets]:  # List of pets
            return [get_pet(raw_pet) for raw_pet in raw_pets]
        case {**raw_pet}:  # Maybe a single pet
            return [get_pet(raw_pet)]
        case _:
            raise TypeError(f"Neither a pet nor a list of pets: {raw}")

def get_pet(raw_pet: object) -> Cat | Dog:
    match raw_pet:
        case {"type": "cat", "name": str(name), "breed": str(breed), "favorite_toy": str(toy)}:
            return Cat(name, breed, toy)
        case {"type": "dog", "name": str(name), "breed": str(breed), "leash_color": str(leash)}:
            return Dog(name, breed, leash)
        case {"type": "cat" | "dog"}:
            raise TypeError(f"Malformed pet: {raw_pet}")
        case _:
            raise TypeError(f"Not a pet: {raw_pet}")

def main() -> None:
    raw = json.load(sys.stdin)
    for pet in get_pets(raw):
        print(pet)

if __name__ == "__main__":
    main()

However, I still find this too long for inclusion in PEP 635 (where I have a TODO suggesting a JSON example), and even if we link to it I'm less than thrilled about using str(name) six times -- the pattern looks too much like the call expression of the same form. If we could instead use

    case {"type": "cat", "name": name, "breed": breed, "favorite_toy": toy}:

it would be more accessible as an example, but of course it fails to validate properly.

@gvanrossum
Owner Author

Cleaned up as https://github.com/gvanrossum/patma/blob/master/examples/jsonpets.py

I changed the structure somewhat, moving "breed" into Dog and changing it to "pattern" for Cat (tuxedo cats aren't a breed, the Internet tells me).

Note that there's no way to validate the type annotations, as no type checker for Python currently supports match statements (nor the PEP 604 | notation for Unions :-).
