Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve custom types based on #928 #941

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Copy link
Contributor

@pushfoo pushfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using TypeVars would make this more expressive since different parts of the API use 0-255 (user-facing) or float (internal OpenGL) for channel values.

pyglet/customtypes.py Outdated Show resolved Hide resolved
pyglet/customtypes.py Outdated Show resolved Hide resolved
shenjackyuanjie and others added 2 commits September 2, 2023 12:25
thanks for @pushfoo

Co-authored-by: Paul <36696816+pushfoo@users.noreply.github.com>
Co-authored-by: Paul <36696816+pushfoo@users.noreply.github.com>
Copy link
Contributor

@pushfoo pushfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

I think Color[int] and Color[float]'s current implementation are our best options right now:

  1. It's better than arguing over ByteColor vs Color255
  2. It's shorter than the options above
  3. There is no good way to generally specify a sequence or iterable of a specific length
    • If I remember correctly, Guido opposed the idea
    • Custom Annotated would be the closest (see Tuple-like Sequence python/mypy#8441)
    • Any custom annotations would need support added to sphinx

@iperov
Copy link

iperov commented Sep 3, 2023

what are type aliases for? usage example?

@pushfoo
Copy link
Contributor

pushfoo commented Sep 3, 2023

what are type aliases for

tl;dr optional static typing which relies on imported modules or external tools

  1. They allow assigning a long type annotation to shorter name
  2. You can use the shorter name to annotate variables and function arguments
  3. This makes the code easier to read for humans, type checkers, and certain imported modules

Instead of:

def zero_on_x_axis(vector: Union[Vec2, Vec2, Vec4, Sequence[Number]]) -> bool:
    return vector[0] == 0

def zero_on_y_axis(vector: Union[Vec2, Vec2, Vec4, Sequence[Number]]) -> bool:
    return vector[1] == 0

We can use an alias:

AnyVector = Union[Vec2, Vec2, Vec4, Sequence[Number]]

def zero_on_x_axis(vector: AnyVector) -> bool:
    return vector[0] == 0

def zero_on_y_axis(vector: AnyVector]) -> bool:
    return vector[1] == 0

By default, the CPython interpreter only parses and stores type annotations. Interpreting this metadata is left to optional libraries (dataclasses, Pydantic) and external tools called type checkers.

There are both stand-alone type checkers, such as mypy, and integrations with editors. For example, PyCharm and Visual Studio Code plugins both have integrations which allow auto-complete to work correctly. Other editors can also benefit from annotations and aliases, especially those with support for the language server protocol.

EDIT: I forgot to mention that there are also stub files (.pyi) which structurally resemble .h files in C/C++. They aren't used for linking, just type checking.

@iperov
Copy link

iperov commented Sep 3, 2023

zero_on_x_axis(vector: Union[Vec2, Vec2, Vec4, Sequence[Number]])

that's looks like shit. Never seen that. Any real example of this func?

@pushfoo
Copy link
Contributor

pushfoo commented Sep 3, 2023

It's an example of how code without aliases can be ugly.

Any real example of this func?

Here's a better example:

Before aliases (terrible)

def point_in_polygon(point: Union[Vec2, Tuple[Number, Number]], Sequence[Union[Vec2, Tuple[Number, Number]]]) -> bool:
     ...  # Implementation omitted; raycasting or winding number algorithm

After aliases (readable)

Point2D = Union[Vec2, Tuple[Number, Number]]

def point_in_polygon(point: Point2D, polygon: Sequence[Point2D]) -> bool:
     ...  # Implementation omitted; raycasting or winding number algorithm

Now any IDE will flag the code below as wrong before it's ever run:

polygon = [(0,0), (2,2), (-2,2)]
point_in_polygon("This should be a 2D point instead of a string", polygon)

If someone tries to make a PR with code like this, mypy or pyright can save time the same way -Wall does in a C compiler:
don't do detailed code review and don't run tests until the submitter fixes it.

@iperov
Copy link

iperov commented Sep 3, 2023

Vec2 is a class and Point/Polygon is not a class, that is ugly.
All should follow the same logic in order not to confuse the user.
Polygon should be a class or dataclass , same as Vec2.

@pushfoo
Copy link
Contributor

pushfoo commented Sep 3, 2023

I agree it's ugly. That's why the PR doesn't have it. If you check the source code for this PR, you'll see this:

Point2D = Tuple[Number, Number]
Point3D = Tuple[Number, Number, Number]

This is because pyglet tends to favor Tuple over the more technically correct Sequence. Although the latter is more correct and compatible, Python doesn't have a good way of expressing the idea of "sequence of length 3".

You asked a specific question:

what are type aliases for? usage example?

I answered with specific, hypothetical examples illustrating the following:

  1. Compatibility-focused type checking is inherently ugly in Python
  2. Aliases help hide long annotations
  3. This has the most impact when hiding ugly annotations you have to keep for compatibility reasons

Polygon should be a class or dataclass , same as Vec2.

Again, the example I offered is hypothetical. If you want to provide input on a potential actual implementation of polygon collision, there's an OOP wrapper being discussed over at #928.

If the maintainers end up making a choice you disagree with, there's always ModernGL. It's newer and generally has cleaner design. I think it may also have numpy integration, so it's probably faster too.

@zhengxyz123
Copy link
Contributor

According to this:

>>> isinstance(1+2j, numbers.Number)
True
>>> isinstance(1+2j, numbers.Real)
False
>>> isinstance(1, numbers.Real)
True
>>> isinstance(3.14, numbers.Real)
True

So I suggest turning all numbers.Number to numbers.Real.

We will never know what the user will enter.

@iperov
Copy link

iperov commented Sep 3, 2023

@pushfoo
I agree that Sequence[T] indicates unknown length of list/tuple of T elements, and Tuple indicates fixed length with particular types of elements.

I'm talking about logic, clarity, obviousness and no hidden behavior.
Capitalized word means a class. it's universally accepted by whole python community.

A user does not have to look at the implementation of every class he uses.

Point2D = Tuple[Number, Number]

this makes non obvious behavior.
User will think that Point2D is class, and instantiation Point2D() will construct an instance of annotation, not a tuple with two Numbers

Possible solutions:

  • code all math classes similar to Vec2 - only right solution
  • rename aliases for example to lowercased point2d or _point2d - I don't like. Type aliases should not be used at all in python.

@pushfoo
Copy link
Contributor

pushfoo commented Sep 4, 2023

tl;dr:

  1. Agreed, Number needs to be changed
  2. The maintainers might reject a class-only solution for performance reasons
  3. I would like to better understand @iperov's stance and recommendations for avoiding type aliases

1. Agreed, Number needs to be changed

So I suggest turning all numbers.Number to numbers.Real.

Clarification: Number = Union[int, float], not numbers.Number.

It's confusing and we should change it. OP and I discussed doing so and other issues on Discord. We wanted to get feedback before proceeding:

Idea Benefit
Replace Number with a TypeVar of the same types but different name Allow specifying Point2D[int] instead of Point2D[float]
Make Vec2 classes Generic Allow specifying int vs float coordinates

2. The maintainers might reject a class-only solution for performance reasons

code all math classes similar to Vec2 - only right solution

I agree with this in theory. I've done so in other projects.

Capitalized word means a class. it's universally accepted by whole python community.

But a class does not always mean a capitalized name:

>>> type([])
<class 'list'>

If Python 3.9 had deprecated lower case type collection names (ie list in favor of List), the situation would be less confusing. Case and class would be a bidirectional implication.

Instead, we have to resolve our current dilemma. The design constraints of pyglet restrict us further:

  1. 3.8+ support
  2. No hard install dependencies means no typing_extensions

As I understand it, performance is another reason pyglet and downstream libraries favor bare sequences:

  1. pyglet is a pure Python library, so optimizing Python code for performance is more worthwhile than usual
  2. Classes have historically run slower

Before #624, the vector classes were tuple subclasses which:

  1. avoided performance penalties
  2. kept types and potential annotations extremely simple

The current Vec classes already use slots, but do not match Sequence. Would it be worth the ambiguity of Sequence to gain simpler annotations by making the Vec classes match it?

Example:

def point_in_polygon(point: Sequence[float], polygon: Sequence[Sequence[float]]) -> bool:

To me, this seems more ambiguous.

3. Reasons for avoiding type aliases

Rename aliases for example to lowercased point2d or _point2d - I don't like.

I also dislike this option, as implied above.

Type aliases should not be used at all in python.

I would like to understand your reasoning and proposed alternatives.

As I understand it, you dislike:

  1. their risk for confusion
  2. how they can hide fundamental design issues

If so, your opposition seems reasonable, especially when Union is involved.

What do you think of the following?

  1. TypeVar and Generic
  2. Protocol types

They could help remove Union and the problems it causes:

Sphinx alias expansion Consequence Example
Disabled Confusing name-only rendering origin: Point2D
Enabled ugly Union[Vec2, Tuple[Union[int, float], Union[int, float]]]

Since they can have docstrings like normal classes, Protocol types could have the following benefits:

  1. Render as single clickable link in an function signature, optionally with hover preview
  2. Allow IDE pop-ups in PyCharm and VS Code
  3. Allow better auto-complete than Union types
  4. Easily centralize data type explanations into a single glossary page

@iperov
Copy link

iperov commented Sep 4, 2023

pyglet is a pure Python library, so optimizing Python code for performance is more worthwhile than usual

  1. when you chose python for this type of libraries, you realized that it is slower than C++

  2. code performance should not affect programmer productivity

  3. buying a slightly faster processor than you have now automatically fixes any performance shortcomings from using classes?

  4. python has gotten more performance updates since 3.10

But a class does not always mean a capitalized name: <class 'list'>

list is built-in name. You don't expect lowercase class names in 3rd party library.

The current Vec classes already use slots, but do not match Sequence.

actually it does

def __getitem__(self, item):
        return (self.x, self.y)[item]
def __len__(self) -> int:
        return 2

and passing list of Vec2 to numpy produces (N,2) shape ndarray

P.S.

I still can't figure out real-world use cases for these kinds of classes.

If I need point array processing, I use numpy.

If I need massive point array with non-linear logic per every element, I use numba.

Direct classes implementing Vec2, Line2, Rect2, etc. primitives are good for small math.

@shenjackyuanjie
Copy link
Contributor Author

well, to my opinion
the customtype.py is for pyglet builtin typing
such as https://github.com/pyglet/pyglet/pull/928/files#diff-12dee3cbc502339c52c9d251bb8dea2e033f7bcf7dfacbaea089a6b05d48e771R9
@iperov

@pushfoo
Copy link
Contributor

pushfoo commented Sep 4, 2023

tl;dr:

  1. Your technical points are correct, especially from the perspective of an expert in ML / AI
  2. Are you interested in pyglet because it lacks annotations?
  3. Your skills are well past the point where Vec2 would be useful.

buying a slightly faster processor than you have now automatically fixes any performance shortcomings from using classes?
code performance should not affect programmer productivity
python has gotten more performance updates since 3.10

I think you misunderstood my point. One of pyglet's goals is acceptable support for the earliest non-EOL Python release. Although we recommend the latest release, this goal implies making reasonable efforts to help people stuck on supported earlier versions as time allows.

The current Vec classes already use slots, but do not match Sequence.

actually it does

def example(a: _typing.Sequence):
    print(a)

example(Vec2(1,1))

The example above runs, but mypy and pyright warn that Vec2 instances don't match Sequence. This implies you aren't using a type checker.

I understand parts of the ML community prefer Python without annotations. Are you interested in pyglet because it lacks them?

If you'd prefer pyglet refrain from adding annotations, Benjamin and caffienepills are the ones to convince, not me. They make the final merge decisions.

I still can't figure out real-world use cases for these kinds of classes.

It's as you said here:

Direct classes implementing Vec2, Line2, Rect2, etc. primitives are good for small math.

They're helpers for people who either aren't ready or don't want to work with numpy or GLSL. Your skill level is well beyond the point where Vec2 would be useful.

@iperov
Copy link

iperov commented Sep 4, 2023

personally I am using annotations in my recent project. For the first 3 years of learning python, I didn't use them.

But now I think they MUST be in any serious project in every function.

Are you interested in pyglet because it lacks annotations?

i am interested in pyglet to experiment with a GUI system.

This implies you aren't using a type checker.

yeah i don't use type checker. I am coding in VSCode, so intellisense type hinting is enough for me.

def example(a: _typing.Sequence):
    print(a)

example(Vec2(1,1))

because it should be

def example( pt : Vec2):
   ...

a = [1,1] 
example( Vec2(*a) )

better explicit than implicit.

Your skill level is well beyond the point where Vec2 would be useful.

Before I used np.ndarray for single math objects (Vec2, Affine matrix), which are used for example for marking objects in a video frame. In this case there is no performance degradation for a small number of objects.

But at some point I thought it was not nice, especially when they are sent as arguments to other functions where the annotation looks like : np.ndarray, so I wrote ready-made classes for them (Vec2, Rect, etc.).

and now it looks much better

class Rect:
    @staticmethod
    def from_3pts(p0 : Vec2, p1 : Vec2, p2 : Vec2):
        """
        Construct Rect from 3 pts
           p0--p2
          /   /
         p1--
        """

@pushfoo
Copy link
Contributor

pushfoo commented Sep 6, 2023

tl;dr:

  1. The best approach for a UI system might be the same one pyglet uses: classes which abstract ctypes bindings of OpenGL state.
  2. Protocol types could help migrate toward the more explicit annotations you've mentioned above

But now I think they MUST be in any serious project in every function.

Agreed.

yeah i don't use type checker.

Outside of the ML community, there seems to be a shift toward not only using mypy and pyright, but relying on annotations during execution. Regardless of our opinions, you can see this reflected in:

  1. Projects running external type checkers as part of CI
  2. The popularity of dataclasses, pydantic, and tools built on top of them

i am interested in pyglet to experiment with a GUI system.

You're not the only one. I'd suggest reading up on the event system. It resembles that of JS events a little bit.

The best approach to implementing your own UI framework may be a more widespread version of the same one pyglet uses for shapes: abstract the tuple-focused GL calls with classes.

This allows you to draw a sharp boundary:

  1. Explicit classes wherever possible at or above the boundary line
  2. Translate calls into ctypes-friendly calls below it

In pyglet's case, classes tend to be reserved for the following cases:

  1. Abstractions of GL to help manage state (Window, Context, etc)
  2. Higher level abstractions (shapes, sprites which accept images / textures)

Otherwise, code uses primitives or built-in collections (tuples) of primitives whenever possible. This simplifies interacting with ctypes bindings. My understanding is that changing pyglet to use classes everywhere would:

  1. require rewriting much of the framework
  2. introduce breaking changes and performance impacts

Protocol types for annotations could be a step on the road to changing this, however. They would allow annotating things with a single class with a consistent API. It can be made a real implementation at a later point.

@iperov
Copy link

iperov commented Sep 6, 2023

You're not the only one. I'd suggest reading up on the event system. It resembles that of JS events a little bit.

:D no thx.

@pushfoo
Copy link
Contributor

pushfoo commented Sep 6, 2023

:D no thx.

lol, I should have said "without some of the irritations".

Even if you choose not to use pyglet, I appreciate that you challenged my assumptions. Reconsidering them will likely lead to better doc and code quality in the future. Thank you for your time and effort.

@pushfoo
Copy link
Contributor

pushfoo commented Sep 11, 2023

The CI failure looks like a fluke failure due to playback time estimation in audio tests being outside the current bounds. I'd touch up anything else which needs it and push further commits.

If it keeps happening, can deal with repeat CI failures of this in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants