Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sym and other genned pkgs namespace packages #261

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bradley-solliday-skydio
Copy link
Collaborator

Namespace packages are packages whose __path__ (which tells the python import system to look for sub-packages and sub-modules) has multiple directories, meaning it can have portions spread out across multiple directories.

Previously, sym and the other generated packages were not namespace packages. This caused issues when generated python packages in the sym namespace attempted to access, for example, sym.Rot3 (which they might do if the generated function returns a sym.Rot3). Since the generated package itself was named sym, but Rot3 was not defined locally, an AttributeError would be raised.

While an alternative would have been to instead not use the name sym for generated packages (using, say, sym_gen instead), for reasons I didn't fully look into, we still want to generate our code within the sym namespace (generating into sym.gen was considered as an option but to do so would require the changes in this commit regardless).

While currently we haven't needed to generate any functions returning a sym class in a package named sym, we intend to soon add a sym.util package which will be used in a lot of places. That can't be done until this namespace conflict is resolved.

Note, while a python2 compatible namespace package has multiple __init__.py files for the top-level package spread across different locations, only one of them will be executed. This makes it difficult to re-export the contents of sub-modules into the top-level namespace.

The normal way to re-export a name is to write

from .sub_module import name

However, since the sub-modules can be created dynamically, it is impossible to re-export all names in this manner, as the first __init__.py that is created has no way of knowing what names one might want to re-export from subsequent modules.

It is possible to put all names one wishes to export in a standard file, say _init.py, then dynamically search for such files and execute their contents, but we considered the additional complexity to be too large of a burden (as users would have a harder time understand their generated code, and this would give future maintainers a hard time).

And so, we decided to simply stop re-exporting any names in the __init__.py's of generated code (kind of in the style of pep 420 python3 packages).

This makes loading a generated function more difficult if one uses codegen_util.load_generated_package, as now simply importing a generated package won't give you access to any of the package's contents. However, this is what codegen_util.load_generated_function is for, so hopefully the user experience shouldn't be too negatively impacted.

The one exception to the general ban of re-exporting names is the sym package, as we still wish to be able to do

import sym

sym.Rot3.identity()

However, because all sub-modules we wish to export from the sym package are known at code-gen time, allowing this is not difficult. This only applies to names in the core sym package, and any additional user generated code in the sym package will not be re-rexported in the top-level namespace.

A user can prevent their package from being generated as a namespace package by setting the namespace_package field of their PythonConfig to False. This is useful in our testing as it is the generated code being tested that is imported, not, for example, the checked in sym package code which is being imported.

As a last note, I believe pkgutil.extend_path only checks for portions of the package on the sys.path, and doesn't check for any portions than can only be found by finders on the sys.meta_path (for example, symforce itself is found by a finder on the sys.meta_path but not on the sys.path during an editable pip install). I don't expect this lapse to pose a problem, and addressing it immediately might just make the __init__.pys look more complicated than they need to be, but if this does become a problem, know that the situation can be partially addressed trying to find the spec using the finders, and adding the spec's submodule_search_locations if found to the __path__.

Copy link
Member

@aaron-skydio aaron-skydio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we had decided we were at least interested in a sym.gen package/module for generated functions instead of throwing user-generated functions into the top-level sym package - how does this relate to that?

from .buffer_func import buffer_func
__path__ = __import__("pkgutil").extend_path(__path__, __name__) # type: ignore # mypy issue #1422
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL python/mypy#1422 (comment)

We should include the error type here, ideally also actually link to the issue instead of just commenting like this (so the reader doesn't have to figure out which repo exactly you're talking about)

Seems worth trying mypy 0.920 though if we haven't

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pasted in the web address and made the type ignore specific to the actual error.

Comment on lines 5 to 20
__path__ = __import__("pkgutil").extend_path(__path__, __name__) # type: ignore # mypy issue #1422
{% if pkg_namespace == "sym" %}
from ._init import *
{% endif %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be the same template as __init__.py.jinja, with this bit conditioned on config.namespace_package? Not a strong opinion either way though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Maybe. The way I originally had it (with the complicated __init__.py) it definitely did make sense to have them be two different files, as one contained the exports, and the other contained the namespace package logic, and both would be rendered for every portion of the package.

Comment on lines 7 to 9
"""
Python runtime geometry package.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely still want the sym module docstring to show up properly, should make sure this doesn't break that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Forgot about this. I made sure to check that it was there when initially when I was using the more elaborate __init__.py that allowed re-exporting, but forgot to do that here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Added it in


epsilon = 2.220446049250313e-15
__path__ = __import__("pkgutil").extend_path(__path__, __name__) # type: ignore # mypy issue #1422
from ._init import *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth a comment either here or in sym/_init.py with some mention of what this is doing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I added a comment. Let me know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking to explain the from ._init import * since that's the non-standard part, although the comment that this is a namespace package is good too :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this sound?

# Re-export core geometry and camera types
from ._init import *

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was imagining something describing why we do this as opposed to what it's doing (since generally what it's doing should be clear from the code :) ), something like:

# Import `sym` types onto the top-level module.  Needs to be in a separate `_init` module since
# namespace packages do not evaluate every copy of `__init__.py` , just the first discovered

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I'll adjust accordingly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a preview, I am going with

# Import sym types into the top level of the package. Needs to be done in all portions of
# the package since namespace packages do not evaluate every copy of `__init__.py`, just
# the first discovered. Import must occur after __path__ has been modified to ensure the
# sym._init module can be located.

Just a note, since the only files we are importing are the geo and cam cal classes, it's not necessary to have the import be from _init.py (this was only necessary if any portion of the package could add imports to the __init__.py). We could just do from .rot3 import Rot3 and the like directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I do think it might make sense to just add them back in here directly, although it might be a little confusing if other generated sym modules also have those imports? Not sure, I'm fine either way

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could also go either way. Though, since it is already from ._init import *, absent a compelling argument otherwise, I'm inclined to keep it that way.

@aaron-skydio
Copy link
Member

I see the mention that sym.gen would require the changes in this PR regardless, are we planning to look into that next/later?

@bradley-solliday-skydio
Copy link
Collaborator Author

I see the mention that sym.gen would require the changes in this PR regardless, are we planning to look into that next/later?

I kind of assumed we would. Though, to make the change we'd have to change a lot of the usages and such (and I didn't want to make this PR any more crowded, as it already contains relatively complicated changes).

Namespace packages are packages whose `__path__` (which tells the python
import system to look for sub-packages and sub-modules) has multiple
directories, meaning it can have portions spread out across multiple
directories.

Previously, `sym` and the other generated packages were not namespace
packages. This caused issues when generated python packages in the `sym`
namespace attempted to access, for example, `sym.Rot3` (which they might
do if the generated function returns a `sym.Rot3`). Since the generated
package itself was named `sym`, but `Rot3` was not defined locally, an
`AttributeError` would be raised.

While an alternative would have been to instead not use the name `sym`
for generated packages (using, say, `sym_gen` instead), for reasons I
didn't fully look into, we still want to generate our code within the
`sym` namespace (generating into `sym.gen` was considered as an option
but to do so would require the changes in this commit regardless).

While currently we haven't needed to generate any functions returning a
`sym` class in a package named `sym`, we intend to soon add a `sym.util`
package which will be used in a lot of places. That can't be done until
this namespace conflict is resolved.

Note, while a python2 compatible namespace package has multiple
`__init__.py` files for the top-level package spread across different
locations, only one of them will be executed. This makes it difficult to
re-export the contents of sub-modules into the top-level namespace.

The normal way to re-export a name is to write
``` python3
from .sub_module import name
```
However, since the sub-modules can be created dynamically, it is
impossible to re-export all names in this manner, as the first
`__init__.py` that is created has no way of knowing what names one might
want to re-export from subsequent modules.

It is possible to put all names one wishes to export in a standard file,
say `_init.py`, then dynamically search for such files and execute their
contents, but we considered the additional complexity to be too large of
a burden (as users would have a harder time understand their generated
code, and this would give future maintainers a hard time).

And so, we decided to simply stop re-exporting any names in the
`__init__.py`'s of generated code (kind of in the style of pep 420
python3 packages).

This makes loading a generated function more difficult if one uses
`codegen_util.load_generated_package`, as now simply importing a
generated package won't give you access to any of the package's
contents. However, this is what `codegen_util.load_generated_function`
is for, so hopefully the user experience shouldn't be too negatively
impacted.

The one exception to the general ban of re-exporting names is the `sym`
package, as we still wish to be able to do
``` python3
import sym

sym.Rot3.identity()
```
However, because all sub-modules we wish to export from the `sym`
package are known at code-gen time, allowing this is not difficult.
This only applies to names in the core `sym` package, and any additional
user generated code in the `sym` package will not be re-rexported in the
top-level namespace.

A user can prevent their package from being generated as a namespace
package by setting the `namespace_package` field of their `PythonConfig`
to `False`. This is useful in our testing as it is the generated code
being tested that is imported, not, for example, the checked in `sym`
package code which is being imported.

As a last note, I believe `pkgutil.extend_path` only checks for portions
of the package on the `sys.path`, and doesn't check for any portions
than can only be found by finders on the `sys.meta_path` (for example,
`symforce` itself is found by a finder on the `sys.meta_path` but not on
the `sys.path` during an editable pip install). I don't expect this
lapse to pose a problem, and addressing it immediately might just make
the `__init__.py`s look more complicated than they need to be, but if
this does become a problem, know that the situation can be partially
addressed trying to find the spec using the finders, and adding the
spec's `submodule_search_locations` if found to the `__path__`.
Copy link
Member

@aaron-skydio aaron-skydio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm ready to approve this, the main remaining question is whether we should have a better sense of what's feasible as a follow-up before merging this to reduce any risk of churn (e.g. if it's not possible to make sym.gen a thing, maybe we'd rather "only allow symforce itself to generate things into the sym module", and then this wouldn't be necessary?)

@aaron-skydio aaron-skydio changed the base branch from load_generated_function to main December 3, 2022 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants