Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to msgspec #1779

Open
davfsa opened this issue Dec 7, 2023 · 4 comments
Open

Switch to msgspec #1779

davfsa opened this issue Dec 7, 2023 · 4 comments
Labels
breaking This change hurts backwards compatibility enhancement New feature or request large Requires a large number of changes and testing, may take several PRs to complete optimization Optimizations to the code

Comments

@davfsa
Copy link
Member

davfsa commented Dec 7, 2023

An idea I have had for a while, msgspec performs insanely better than anything else out there, and it would allow us to simplify models code by an insane magnitude, as well as being able to finally freeze models without a performance penalty, leading to the removal of a lot of code in cache to ensure immutability.

Unfortunately, this would not come easily, as it could potentially be a pretty big breaking change, as we would only be able to provide fields that Discord themselves provide (we will ofc need converter functions, similar to what EntityFactory and EventFactory do, but they would be a lot simpler and for specific cases, since the rest will be handled by msgspec). This might mean the removal of .app and .shard, possibly passing them to the end developer some other way.

Additionally, some nice thing we might be able to implement is automatic checking of our implementation vs the discord spec to detect missing fields and new fields that we need to implement.

Some work on the ports that I have been doing: https://github.com/davfsa/hikari/tree/feature/msgspec

Links to:
#1191
#1110

Any type of comments or discussions are encouraged and welcome

@davfsa davfsa added enhancement New feature or request optimization Optimizations to the code large Requires a large number of changes and testing, may take several PRs to complete breaking This change hurts backwards compatibility labels Dec 7, 2023
@davfsa davfsa pinned this issue Dec 7, 2023
@davfsa davfsa changed the title Consider using msgpec Switch to msgscpec Dec 7, 2023
@davfsa davfsa changed the title Switch to msgscpec Switch to msgspec Dec 11, 2023
@mikeshardmind
Copy link

mikeshardmind commented May 5, 2024

First off, I'm not a user of hikari I came across this as part of a discussion with someone else about more frameworks using type safe decoding, I have no stakes in this one way or another. Just providing a bit of extra info that might be relevant to any decision making process here, as it was not noted in the original post regarding breakage

msgspec is pretty tightly tied to CPython's internals, see: jcrist/msgspec#22

This isn't necessarily bad, and as noted in the issue, several other libraries to accelerate CPython code are also non-portable across implementations. Going with msgspec, unless you keep around a non-msgspec fallback path will break existing pypy or graalpy compatibility (All of your current requirements and hikari itself appear to be compatible with both). While that could be a reasonable choice, I'm unsure of your user-base's expectations here for compatibility with other Python implementations.

@davfsa
Copy link
Member Author

davfsa commented May 5, 2024

Thanks a lot for your input @mikeshardmind. It is insanely welcome!!

Didn't actually consider how using msgspec would limit providing PyPy interoperability. I have not really used PyPy on my daily basis, because I haven't really benchmarked it in real world needs (something on my todo list, just not sure how to do it properly). I have definitely tried to make hikari work with PyPy (pypy/pypy#3763) in case people do find it useful to use instead of CPython and don't have any plans to change in that front.

It is a shame that msgspec doesn't function nicely under PyPy (it doesn't even build the wheel 😞), I was looking forward to trying to provide typesafe objects (in case Discord decide to ever break the type of something in the future that wouldn't get caught by us) and, specially speedups when it comes to deserialization and memory usage (thanks to msgspec special object type).

Curious too, since you mentioned that you ran into this issue while discussing the topic with someone, are there any other alternatives to msgspec you considered (already thought myself of pydantic and similars, but they provide no real speedup vs attrs, only just add validation)?

Thanks again!

@mikeshardmind
Copy link

Curious too, since you mentioned that you ran into this issue while discussing the topic with someone, are there any other alternatives to msgspec you considered

I'm still evaluating options for my own uses where pypy compatibility is desired. Currently, I use msgspec a lot and only recently ran into a request for pypy support that made me aware of how pypy's normal compatibility layer can't handle it even at a performance cost

Right now, I'd pick mashumaro if speed was the primary factor, and the secondary factors included needed to only support one code path and support alternative python implementations.

I think the long-term ideal in this space for libraries is writing the thinnest abstraction over msgspec + either mashumaro or cattrs such that the fastest available option on each supported platform + implementation.

My initial thoughts are that it should be relatively straightforward with minimal duplication if the right tools are leveraged, something like the below comes to mind, but this is off the cuff and untested. There's also missing puzzle pieces here in the branching such as creating a shared encoder/decoder (both libraries encourage this for performance + options, but vary on options)

try:
    import msgspec
except ImportError:
    HAS_MSGSPEC = False
else:
    HAS_MSGSPEC = True

if HAS_MSGSPEC:
    from msgspec import Struct as Base
    deco = lambda x: x
else:
    from mashumaro import DataClassDictMixin as Base
    from dataclasses import dataclass as deco

@deco
class Example(Base):
    field_name: TypeClass

I do intend to explore this further in the future, but won't be doing so today, I'd be happy to follow up with detail after doing so

@davfsa
Copy link
Member Author

davfsa commented May 5, 2024

I was not aware of mashumaro, but thanks for bringing it into my radar. Will be having a look at it and playing around with it soon:tm:

I do intend to explore this further in the future, but won't be doing so today, I'd be happy to follow up with detail after doing so

Would greatly appreciate that! Thanks a lot for your time and for sharing your thoughts with me!

If you want, you can reach out to me on Discord (username is "davfsa"). Would be glad and happy to aid you with that compatibility shim you talked about, which we can both probably greatly leverage in our respective use cases :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change hurts backwards compatibility enhancement New feature or request large Requires a large number of changes and testing, may take several PRs to complete optimization Optimizations to the code
Projects
None yet
Development

No branches or pull requests

2 participants