Slow load for otypes #39

Open
brettviren opened this issue Sep 9, 2022 · 1 comment

@brettviren
Owner

@plasorak reports that loading moo.otypes can take O(10s) for the DAQ.

While perhaps bearable, imo this is really too slow for comfort. I'd want moo's contribution to startup time to be a couple of orders of magnitude smaller.

I suspect the slowness is due to the inherently slow C++ version of the Jsonnet compiler, which is what the Python jsonnet package uses. There is also a Go version which produces a mostly compatible .so shared library and for which the Jsonnet community has done substantial optimization.

In Wire-Cell Toolkit we have very large and complex Jsonnet and see a x10-x100 speed up going from the C++ to the Go version. Wire-Cell has a compile-time option to select which version to build against. Unfortunately I do not know of an equivalent when installing the jsonnet Python module, e.g. via pip.
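For what it's worth, a minimal sketch of how one might prefer a Go-based binding when one happens to be installed; this assumes such a binding exposes the same `evaluate_file()` API as the standard `_jsonnet` module, which may or may not match what pip actually provides:

```python
# Sketch only: prefer a Go-based Jsonnet binding when one is installed,
# assuming it mirrors the evaluate_file() API of the C++ _jsonnet module.
try:
    import _gojsonnet as jsonnet   # hypothetical Go-backed binding, if present
except ImportError:
    import _jsonnet as jsonnet     # the usual C++ binding from the "jsonnet" package

def compile_file(path: str) -> str:
    """Evaluate a .jsonnet file and return the JSON text it produces."""
    return jsonnet.evaluate_file(path)
```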

One check can be done here. It is always possible to precompile the .jsonnet files to .json and then load those. That load should be as fast as Python's json module can manage, and any leftover slowness can be blamed on moo.otypes. The precompilation could be done with the Go version. It is not a great permanent solution, as one then has to track both the .jsonnet source and the .json artifacts.
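For the check, something along these lines would do, assuming the schema has been precompiled once (e.g. `jsonnet schema.jsonnet > schema.json`); the file name here is just a placeholder:

```python
# Timing sketch for the check described above; "schema.json" stands in
# for a .json file precompiled from the .jsonnet source.
import json
import time

def timed_json_load(path="schema.json"):
    t0 = time.perf_counter()
    with open(path) as fp:
        data = json.load(fp)
    print(f"loaded {path} in {time.perf_counter() - t0:.3f} s")
    return data
```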

Moving away from otypes to something new which produces .py files holding Python dataclasses or pydantic classes, written by applying the Jsonnet schema to moo templates (like we do for C++), is another workaround.
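Purely for illustration, the generated .py for a simple schema record might look something like this (a sketch, not what moo currently emits; the record, field names and constraints are invented):

```python
# Hypothetical generated module for an invented schema record.
# Schema constraints map onto pydantic Field arguments (pydantic v2 shown).
from pydantic import BaseModel, Field

class Port(BaseModel):
    number: int = Field(..., ge=1024, lt=65536, multiple_of=2)
    label: str = Field("default", pattern=r"^[a-z][a-z0-9_]*$")
```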

Doing this may also solve #38.

DAQ folks, feel free to add info/complaints on this or other topics.

@plasorak
Contributor

Ok, so I've given it a stab, because I think we'll need to have this at some point.

So, on my fork, in the plasorak/python-codegen branch, there are some sources that are able to generate Python "headers" and code to (de)serialise raw dictionaries. I've also tried to be a bit careful about validation of the classes, so for example it checks multipleOf, maximumExclusive and string patterns/regexes.
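To give a feel for it (this is just an illustrative sketch, not the actual generated code), the kind of checks a generated class could enforce on construction might look roughly like this in plain Python:

```python
# Illustrative only: multipleOf, exclusive-maximum and pattern checks
# applied when an instance is constructed.
import re
from dataclasses import dataclass

@dataclass
class Sample:
    count: int
    name: str

    def __post_init__(self):
        if self.count % 4 != 0:              # multipleOf: 4
            raise ValueError("count must be a multiple of 4")
        if not self.count < 1024:            # exclusive maximum: 1024
            raise ValueError("count must be strictly less than 1024")
        if not re.fullmatch(r"[A-Za-z_]\w*", self.name):  # pattern
            raise ValueError(f"name {self.name!r} fails the pattern check")
```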

This certainly simplifies the client code quite a bit (see for example daqconf), because one now only needs an import to get the class definitions, and I think it should make things faster (although I'll confess I haven't properly measured it).
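A quick way to put a number on it would be something like the following (the module name is a placeholder), or simply `python -X importtime -c "import ..."`:

```python
# Rough measurement sketch; replace "generated_schema" with the real module name.
import importlib
import time

t0 = time.perf_counter()
importlib.import_module("generated_schema")
print(f"import took {time.perf_counter() - t0:.3f} s")
```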
