improve custom types based on #928 #941

shenjackyuanjie · 2023-09-02T04:16:16Z

https://github.com/pyglet/pyglet/pull/928/files#diff-c6d7de478cd7e0cbe79e4e516884e0f3f0659abd5e243e2635a7bbe0a88d21f9

…28/files#diff-c6d7de478cd7e0cbe79e4e516884e0f3f0659abd5e243e2635a7bbe0a88d21f9

pushfoo

Using TypeVars would make this more expressive since different parts of the API use 0-255 (user-facing) or float (internal OpenGL) for channel values.

pyglet/customtypes.py

@pushfoo

thanks for @pushfoo Co-authored-by: Paul <36696816+pushfoo@users.noreply.github.com>

Co-authored-by: Paul <36696816+pushfoo@users.noreply.github.com>

pushfoo

Looks good to me.

I think Color[int] and Color[float]'s current implementation are our best options right now:

It's better than arguing over ByteColor vs Color255
It's shorter than the options above
There is no good way to generally specify a sequence or iterable of a specific length
- If I remember correctly, Guido opposed the idea
- Custom Annotated would be the closest (see Tuple-like Sequence python/mypy#8441)
- Any custom annotations would need support added to sphinx

iperov · 2023-09-03T05:15:33Z

what are type aliases for? usage example?

pushfoo · 2023-09-03T06:47:33Z

what are type aliases for

tl;dr optional static typing which relies on imported modules or external tools

They allow assigning a long type annotation to shorter name
You can use the shorter name to annotate variables and function arguments
This makes the code easier to read for humans, type checkers, and certain imported modules

Instead of:

def zero_on_x_axis(vector: Union[Vec2, Vec2, Vec4, Sequence[Number]]) -> bool:
    return vector[0] == 0

def zero_on_y_axis(vector: Union[Vec2, Vec2, Vec4, Sequence[Number]]) -> bool:
    return vector[1] == 0

We can use an alias:

AnyVector = Union[Vec2, Vec2, Vec4, Sequence[Number]]

def zero_on_x_axis(vector: AnyVector) -> bool:
    return vector[0] == 0

def zero_on_y_axis(vector: AnyVector]) -> bool:
    return vector[1] == 0

By default, the CPython interpreter only parses and stores type annotations. Interpreting this metadata is left to optional libraries (dataclasses, Pydantic) and external tools called type checkers.

There are both stand-alone type checkers, such as mypy, and integrations with editors. For example, PyCharm and Visual Studio Code plugins both have integrations which allow auto-complete to work correctly. Other editors can also benefit from annotations and aliases, especially those with support for the language server protocol.

EDIT: I forgot to mention that there are also stub files (.pyi) which structurally resemble .h files in C/C++. They aren't used for linking, just type checking.

iperov · 2023-09-03T07:53:27Z

zero_on_x_axis(vector: Union[Vec2, Vec2, Vec4, Sequence[Number]])

that's looks like shit. Never seen that. Any real example of this func?

pushfoo · 2023-09-03T08:41:39Z

It's an example of how code without aliases can be ugly.

Any real example of this func?

Here's a better example:

Before aliases (terrible)

def point_in_polygon(point: Union[Vec2, Tuple[Number, Number]], Sequence[Union[Vec2, Tuple[Number, Number]]]) -> bool:
     ...  # Implementation omitted; raycasting or winding number algorithm

After aliases (readable)

Point2D = Union[Vec2, Tuple[Number, Number]]

def point_in_polygon(point: Point2D, polygon: Sequence[Point2D]) -> bool:
     ...  # Implementation omitted; raycasting or winding number algorithm

Now any IDE will flag the code below as wrong before it's ever run:

polygon = [(0,0), (2,2), (-2,2)]
point_in_polygon("This should be a 2D point instead of a string", polygon)

If someone tries to make a PR with code like this, mypy or pyright can save time the same way -Wall does in a C compiler:
don't do detailed code review and don't run tests until the submitter fixes it.

iperov · 2023-09-03T09:07:10Z

Vec2 is a class and Point/Polygon is not a class, that is ugly.
All should follow the same logic in order not to confuse the user.
Polygon should be a class or dataclass , same as Vec2.

pushfoo · 2023-09-03T11:41:54Z

I agree it's ugly. That's why the PR doesn't have it. If you check the source code for this PR, you'll see this:

Point2D = Tuple[Number, Number]
Point3D = Tuple[Number, Number, Number]

This is because pyglet tends to favor Tuple over the more technically correct Sequence. Although the latter is more correct and compatible, Python doesn't have a good way of expressing the idea of "sequence of length 3".

You asked a specific question:

what are type aliases for? usage example?

I answered with specific, hypothetical examples illustrating the following:

Compatibility-focused type checking is inherently ugly in Python
Aliases help hide long annotations
This has the most impact when hiding ugly annotations you have to keep for compatibility reasons

Polygon should be a class or dataclass , same as Vec2.

Again, the example I offered is hypothetical. If you want to provide input on a potential actual implementation of polygon collision, there's an OOP wrapper being discussed over at #928.

If the maintainers end up making a choice you disagree with, there's always ModernGL. It's newer and generally has cleaner design. I think it may also have numpy integration, so it's probably faster too.

zhengxyz123 · 2023-09-03T14:12:31Z

According to this:

>>> isinstance(1+2j, numbers.Number)
True
>>> isinstance(1+2j, numbers.Real)
False
>>> isinstance(1, numbers.Real)
True
>>> isinstance(3.14, numbers.Real)
True

So I suggest turning all numbers.Number to numbers.Real.

We will never know what the user will enter.

iperov · 2023-09-03T14:45:38Z

@pushfoo
I agree that Sequence[T] indicates unknown length of list/tuple of T elements, and Tuple indicates fixed length with particular types of elements.

I'm talking about logic, clarity, obviousness and no hidden behavior.
Capitalized word means a class. it's universally accepted by whole python community.

A user does not have to look at the implementation of every class he uses.

Point2D = Tuple[Number, Number]

this makes non obvious behavior.
User will think that Point2D is class, and instantiation Point2D() will construct an instance of annotation, not a tuple with two Numbers

Possible solutions:

code all math classes similar to Vec2 - only right solution
rename aliases for example to lowercased point2d or _point2d - I don't like. Type aliases should not be used at all in python.

pushfoo · 2023-09-04T10:29:55Z

tl;dr:

Agreed, Number needs to be changed
The maintainers might reject a class-only solution for performance reasons
I would like to better understand @iperov's stance and recommendations for avoiding type aliases

1. Agreed, `Number` needs to be changed

So I suggest turning all numbers.Number to numbers.Real.

Clarification: Number = Union[int, float], not numbers.Number.

It's confusing and we should change it. OP and I discussed doing so and other issues on Discord. We wanted to get feedback before proceeding:

Idea	Benefit
Replace `Number` with a `TypeVar` of the same types but different name	Allow specifying `Point2D[int]` instead of `Point2D[float]`
Make `Vec2` classes `Generic`	Allow specifying `int` vs `float` coordinates

2. The maintainers might reject a class-only solution for performance reasons

code all math classes similar to Vec2 - only right solution

I agree with this in theory. I've done so in other projects.

Capitalized word means a class. it's universally accepted by whole python community.

But a class does not always mean a capitalized name:

>>> type([])
<class 'list'>

If Python 3.9 had deprecated lower case type collection names (ie list in favor of List), the situation would be less confusing. Case and class would be a bidirectional implication.

Instead, we have to resolve our current dilemma. The design constraints of pyglet restrict us further:

3.8+ support
No hard install dependencies means no typing_extensions

As I understand it, performance is another reason pyglet and downstream libraries favor bare sequences:

pyglet is a pure Python library, so optimizing Python code for performance is more worthwhile than usual
Classes have historically run slower

Before #624, the vector classes were tuple subclasses which:

avoided performance penalties
kept types and potential annotations extremely simple

The current Vec classes already use slots, but do not match Sequence. Would it be worth the ambiguity of Sequence to gain simpler annotations by making the Vec classes match it?

Example:

def point_in_polygon(point: Sequence[float], polygon: Sequence[Sequence[float]]) -> bool:

To me, this seems more ambiguous.

3. Reasons for avoiding type aliases

Rename aliases for example to lowercased point2d or _point2d - I don't like.

I also dislike this option, as implied above.

Type aliases should not be used at all in python.

I would like to understand your reasoning and proposed alternatives.

As I understand it, you dislike:

their risk for confusion
how they can hide fundamental design issues

If so, your opposition seems reasonable, especially when Union is involved.

What do you think of the following?

TypeVar and Generic
Protocol types

They could help remove Union and the problems it causes:

Sphinx alias expansion	Consequence	Example
Disabled	Confusing name-only rendering	`origin: Point2D`
Enabled	ugly	`Union[Vec2, Tuple[Union[int, float], Union[int, float]]]`

Since they can have docstrings like normal classes, Protocol types could have the following benefits:

Render as single clickable link in an function signature, optionally with hover preview
Allow IDE pop-ups in PyCharm and VS Code
Allow better auto-complete than Union types
Easily centralize data type explanations into a single glossary page

iperov · 2023-09-04T10:53:18Z

pyglet is a pure Python library, so optimizing Python code for performance is more worthwhile than usual

when you chose python for this type of libraries, you realized that it is slower than C++
code performance should not affect programmer productivity
buying a slightly faster processor than you have now automatically fixes any performance shortcomings from using classes?
python has gotten more performance updates since 3.10

But a class does not always mean a capitalized name: <class 'list'>

list is built-in name. You don't expect lowercase class names in 3rd party library.

The current Vec classes already use slots, but do not match Sequence.

actually it does

def __getitem__(self, item):
        return (self.x, self.y)[item]
def __len__(self) -> int:
        return 2

and passing list of Vec2 to numpy produces (N,2) shape ndarray

P.S.

I still can't figure out real-world use cases for these kinds of classes.

If I need point array processing, I use numpy.

If I need massive point array with non-linear logic per every element, I use numba.

Direct classes implementing Vec2, Line2, Rect2, etc. primitives are good for small math.

shenjackyuanjie · 2023-09-04T13:03:59Z

well, to my opinion
the customtype.py is for pyglet builtin typing
such as https://github.com/pyglet/pyglet/pull/928/files#diff-12dee3cbc502339c52c9d251bb8dea2e033f7bcf7dfacbaea089a6b05d48e771R9
@iperov

pushfoo · 2023-09-04T17:53:59Z

tl;dr:

Your technical points are correct, especially from the perspective of an expert in ML / AI
Are you interested in pyglet because it lacks annotations?
Your skills are well past the point where Vec2 would be useful.

buying a slightly faster processor than you have now automatically fixes any performance shortcomings from using classes?
code performance should not affect programmer productivity
python has gotten more performance updates since 3.10

I think you misunderstood my point. One of pyglet's goals is acceptable support for the earliest non-EOL Python release. Although we recommend the latest release, this goal implies making reasonable efforts to help people stuck on supported earlier versions as time allows.

The current Vec classes already use slots, but do not match Sequence.

actually it does

def example(a: _typing.Sequence):
    print(a)

example(Vec2(1,1))

The example above runs, but mypy and pyright warn that Vec2 instances don't match Sequence. This implies you aren't using a type checker.

I understand parts of the ML community prefer Python without annotations. Are you interested in pyglet because it lacks them?

If you'd prefer pyglet refrain from adding annotations, Benjamin and caffienepills are the ones to convince, not me. They make the final merge decisions.

I still can't figure out real-world use cases for these kinds of classes.

It's as you said here:

Direct classes implementing Vec2, Line2, Rect2, etc. primitives are good for small math.

They're helpers for people who either aren't ready or don't want to work with numpy or GLSL. Your skill level is well beyond the point where Vec2 would be useful.

iperov · 2023-09-04T18:48:44Z

personally I am using annotations in my recent project. For the first 3 years of learning python, I didn't use them.

But now I think they MUST be in any serious project in every function.

Are you interested in pyglet because it lacks annotations?

i am interested in pyglet to experiment with a GUI system.

This implies you aren't using a type checker.

yeah i don't use type checker. I am coding in VSCode, so intellisense type hinting is enough for me.

def example(a: _typing.Sequence):
    print(a)

example(Vec2(1,1))

because it should be

def example( pt : Vec2):
   ...

a = [1,1] 
example( Vec2(*a) )

better explicit than implicit.

Your skill level is well beyond the point where Vec2 would be useful.

Before I used np.ndarray for single math objects (Vec2, Affine matrix), which are used for example for marking objects in a video frame. In this case there is no performance degradation for a small number of objects.

But at some point I thought it was not nice, especially when they are sent as arguments to other functions where the annotation looks like : np.ndarray, so I wrote ready-made classes for them (Vec2, Rect, etc.).

and now it looks much better

class Rect:
    @staticmethod
    def from_3pts(p0 : Vec2, p1 : Vec2, p2 : Vec2):
        """
        Construct Rect from 3 pts
           p0--p2
          /   /
         p1--
        """

pushfoo · 2023-09-06T05:44:27Z

tl;dr:

The best approach for a UI system might be the same one pyglet uses: classes which abstract ctypes bindings of OpenGL state.
Protocol types could help migrate toward the more explicit annotations you've mentioned above

But now I think they MUST be in any serious project in every function.

Agreed.

yeah i don't use type checker.

Outside of the ML community, there seems to be a shift toward not only using mypy and pyright, but relying on annotations during execution. Regardless of our opinions, you can see this reflected in:

Projects running external type checkers as part of CI
The popularity of dataclasses, pydantic, and tools built on top of them

i am interested in pyglet to experiment with a GUI system.

You're not the only one. I'd suggest reading up on the event system. It resembles that of JS events a little bit.

The best approach to implementing your own UI framework may be a more widespread version of the same one pyglet uses for shapes: abstract the tuple-focused GL calls with classes.

This allows you to draw a sharp boundary:

Explicit classes wherever possible at or above the boundary line
Translate calls into ctypes-friendly calls below it

In pyglet's case, classes tend to be reserved for the following cases:

Abstractions of GL to help manage state (Window, Context, etc)
Higher level abstractions (shapes, sprites which accept images / textures)

Otherwise, code uses primitives or built-in collections (tuples) of primitives whenever possible. This simplifies interacting with ctypes bindings. My understanding is that changing pyglet to use classes everywhere would:

require rewriting much of the framework
introduce breaking changes and performance impacts

Protocol types for annotations could be a step on the road to changing this, however. They would allow annotating things with a single class with a consistent API. It can be made a real implementation at a later point.

iperov · 2023-09-06T08:18:27Z

You're not the only one. I'd suggest reading up on the event system. It resembles that of JS events a little bit.

:D no thx.

pushfoo · 2023-09-06T08:27:24Z

:D no thx.

lol, I should have said "without some of the irritations".

Even if you choose not to use pyglet, I appreciate that you challenged my assumptions. Reconsidering them will likely lead to better doc and code quality in the future. Thank you for your time and effort.

pushfoo · 2023-09-11T13:38:28Z

The CI failure looks like a fluke failure due to playback time estimation in audio tests being outside the current bounds. I'd touch up anything else which needs it and push further commits.

If it keeps happening, can deal with repeat CI failures of this in a separate PR.

improve custom types based on https://github.com/pyglet/pyglet/pull/9…

4ef2e6f

…28/files#diff-c6d7de478cd7e0cbe79e4e516884e0f3f0659abd5e243e2635a7bbe0a88d21f9

pushfoo suggested changes Sep 2, 2023

View reviewed changes

pyglet/customtypes.py Outdated Show resolved Hide resolved

pyglet/customtypes.py Outdated Show resolved Hide resolved

shenjackyuanjie and others added 2 commits September 2, 2023 12:25

Update pyglet/customtypes.py

545e086

thanks for @pushfoo Co-authored-by: Paul <36696816+pushfoo@users.noreply.github.com>

Update pyglet/customtypes.py by @pushfoo

7c39790

Co-authored-by: Paul <36696816+pushfoo@users.noreply.github.com>

pushfoo approved these changes Sep 2, 2023

View reviewed changes

imrove some detail

01cd675

use Channel insted of Number

5fd2fc2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

improve custom types based on #928 #941

improve custom types based on #928 #941

shenjackyuanjie commented Sep 2, 2023

pushfoo left a comment

pushfoo left a comment

iperov commented Sep 3, 2023

pushfoo commented Sep 3, 2023 •

edited

iperov commented Sep 3, 2023

pushfoo commented Sep 3, 2023

iperov commented Sep 3, 2023

pushfoo commented Sep 3, 2023 •

edited

zhengxyz123 commented Sep 3, 2023

iperov commented Sep 3, 2023

pushfoo commented Sep 4, 2023 •

edited

iperov commented Sep 4, 2023

shenjackyuanjie commented Sep 4, 2023

pushfoo commented Sep 4, 2023

iperov commented Sep 4, 2023

pushfoo commented Sep 6, 2023 •

edited

iperov commented Sep 6, 2023

pushfoo commented Sep 6, 2023

pushfoo commented Sep 11, 2023

improve custom types based on #928 #941

Are you sure you want to change the base?

improve custom types based on #928 #941

Conversation

shenjackyuanjie commented Sep 2, 2023

pushfoo left a comment

Choose a reason for hiding this comment

pushfoo left a comment

Choose a reason for hiding this comment

iperov commented Sep 3, 2023

pushfoo commented Sep 3, 2023 • edited

iperov commented Sep 3, 2023

pushfoo commented Sep 3, 2023

iperov commented Sep 3, 2023

pushfoo commented Sep 3, 2023 • edited

zhengxyz123 commented Sep 3, 2023

iperov commented Sep 3, 2023

pushfoo commented Sep 4, 2023 • edited

tl;dr:

1. Agreed, Number needs to be changed

2. The maintainers might reject a class-only solution for performance reasons

3. Reasons for avoiding type aliases

iperov commented Sep 4, 2023

shenjackyuanjie commented Sep 4, 2023

pushfoo commented Sep 4, 2023

iperov commented Sep 4, 2023

pushfoo commented Sep 6, 2023 • edited

iperov commented Sep 6, 2023

pushfoo commented Sep 6, 2023

pushfoo commented Sep 11, 2023

pushfoo commented Sep 3, 2023 •

edited

pushfoo commented Sep 3, 2023 •

edited

pushfoo commented Sep 4, 2023 •

edited

1. Agreed, `Number` needs to be changed

pushfoo commented Sep 6, 2023 •

edited