Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Python #19

Open
ghost opened this issue Mar 14, 2017 · 13 comments
Open

Support for Python #19

ghost opened this issue Mar 14, 2017 · 13 comments

Comments

@ghost
Copy link

ghost commented Mar 14, 2017

Hi, is there any chance for Python support?

@pascaldekloe
Copy link
Owner

After finishing the C port for the upcoming release Rust and Python would be next as far as I'm concerned. Unless someone with Python skills offers a hand don't expect anything soon. 😬

@Kirillism
Copy link

At least you have it on your todo, thank you!

@guilt
Copy link
Contributor

guilt commented Apr 12, 2020

@pascaldekloe Can you review this. Unmarshaller in progress

https://github.com/guilt/colfer-python/

@pascaldekloe
Copy link
Owner

pascaldekloe commented Apr 12, 2020

The floating-point byte order does not necessarily match sys.byteorder. Wikipedia has a summary on the variants. Some people use struct.pack, although good performance seems unlikely.

I believe codecs.encode can prevent all those memory allocations from .encode('utf-8').

These type checks look extremely inefficient. 🥶Can't we use the native class system, and have range checks in the codec? Please consider the following (probably malformed) example for encoding instance attribute TestU32 (as a Colfer uint32).

if isinstance(TestU32, int):
    if TestU32 < 1<<21:
        if TestU32 < 0:
            raise ValueError("TestU32 is negative")
        # TODO: compressed path here
    else:
        if TestU32 > 1<<32-1:
            raise OverflowError("TestU32 exceeds 32-bit range")
        # TODO: fixed width encoding here
elif isinstance(TestU32, float):
    if not TestU32.is_integer():
        raise ValueError("TestU32 has fractions")
    # TODO: etc.
else:
    raise TypeError("TestU32 " + type(TestU32) + " not an integer")

Can we skip the Mixin and Utils classes? 🙏

@guilt
Copy link
Contributor

guilt commented Apr 12, 2020

I'm open to suggestions on removing the typing. The reason I do that at set is because it's cheaper to do it when setting than at any time. Python not only allows us to add dynamic variables to the class, but also change their types during execution.

Doing the check on read is better, because people can mutate after set, which is concerning for lists, especially. But this stuff will be slow and have a penalty.

Python can autopromote integers to numeric, which is why those int64 s / uint16 s may not behave as we like, sadly.

@pascaldekloe
Copy link
Owner

Doing checks on set creates an error scenario, doesn't it? I'd expect attribute modification to be fast, and fail proof. Serialization is slow, and there you can raise an error.

Maybe just measure first. Can I help? How does one benchmark with Python?

@pascaldekloe
Copy link
Owner

If you make the type selection part of the algorithm, then the runtime could optimize each path quite well. No idea how Python deals right now.

@pascaldekloe
Copy link
Owner

The resolution of datetime is a problem. Can't fail on non-zero for any of the 3 nano second decimals. Silently discarding precision is even worse. Imagine how surprised you'd be if reading and writing a message/object would change the timestamps. Worst case we can use a 64-bit integer?

@guilt
Copy link
Contributor

guilt commented May 10, 2020

Well, it's not always a problem because Python 3.7 has this: https://docs.python.org/3.7/library/time.html#time.time_ns but 3.5, 3.6 (which are popular in production still) do not. And, we also support 2.7, which is the reason some of these hacks exist. Granted, we can do better on 3.7+ but there's a lot of baggage that is brought.

@pascaldekloe
Copy link
Owner

Hmmm. Maybe make timestamps a Python 3.7+ feature? Version 3 was released over a decade ago. Version 2 is end-of-life. Out of curiosity: why is 2.7 still taken care of?

@guilt
Copy link
Contributor

guilt commented May 11, 2020

Because even when people wanted to deprecate 2.7 10 years ago, there was resistance. Officially, they decided to deprecate it in Jan 2020. Not all distros have completely moved to 3.x yet, but they are getting there.

There are a lot of companies that use 2.7, so they would decide to stick with it.

@guilt
Copy link
Contributor

guilt commented May 26, 2020

I've uploaded the preliminary version of the package (slow, but somewhat tested) over here
PyPi - colf and I'm working on your code suggestions / improvements. I'm tracking them at https://github.com/guilt/colfer-python/issues

I am yet to write the generator code in Go, but the basic template / usage is there. I want you to at least check the implementation for correctness first, before performance. After that, I am working on removing self/mixins, inlining and one complete pylint pass. Then add the polyfills for nanoseconds. Then hopefully, if everything works, make the type check optional in declare-time.

@pascaldekloe
Copy link
Owner

Thanks Karthik. For correctness we have the golden test cases. Each language uses the same set.

  1. write ./python/test.py (with golden cases)
  2. compose ./python/gen/Colfer.py
  3. generate (2) with ./python.go.
  4. write ./python/bench/

Again, I'm happy to take over from point 2 if you want.

Repository owner deleted a comment Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants