Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't treat pth files as data files #885

Open
dalcinl opened this issue Mar 16, 2023 · 2 comments
Open

Don't treat pth files as data files #885

dalcinl opened this issue Mar 16, 2023 · 2 comments

Comments

@dalcinl
Copy link

dalcinl commented Mar 16, 2023

*.pth files are a documented mechanism to alter the Python module search path. They can also be slightly abused for other things. For example, on Windows, you can use them to call os.add_dll_directory() (Py 3.8+) to extend the DLL search path for extension modules.

I decided to take the pth file approach in mpi4py when using Intel MPI on Windows. Intel MPI does not install DLLs on system locations, its setvars.bat script just appends to $PATH, but $PATH is no longer used for DLL search since Py 3.8+. Therefore, the only way for any extension module linked with impi.dll is to properly load is to use before os.add_dll_directory() . Of course, other things could be done, but I'm assuming the worst case scenario: the user don't have enough permissions to mklink/copy DLLs over from the Intel MPI installation to Python's install tree or system locations. Not to mention that there are two DLLs, one debug and the other release, and users may want to use one of or the other, and this is easy via pth files and environment variables (see intelmpi.pth).

For pth files to be useful, they have to be located at the top-level of a site-packages directory, and NOT inside any installed package tree within site-packages. Therefore, to ship a pth file in a wheel and get it properly installed, they have to be located at the top level of the wheel, and not inside a package folder, or stored as data files in the .data folders. Unfortunately, scikit-build classifies *.pth files (generated in CMakeList.txt) as data files (because they are outside any package directory), and store *.pth files in the wheel's dedicated <package>.data folder for package data.

I was able to somehow workaround this issue in mpi4py, but is slightly hacky and it may easily break in future updates to scikit-build. Before attempting to fix this issue on scikit-side and contributing PR, I would like to know if the scikit-team agrees with my view on specially handling *.pth files at the top level to not classify them as data files, and then let them be stored at the top level of wheels.

For reference, this is the way I managed to include *.pth files in the proper location within the wheel.

setup(
    package_data = {
        'mpi4py': [
            '../*.pth', # definitely  a hack
            ...
        ],
        ...
    }
    # The filtering below is just to not have pth files
    # in the <package>.data/ folder within the wheel
    cmake_process_manifest_hook = \
        lambda fl: [f for f in fl if not f.endswith('.pth')]
    ...
)
@henryiii
Copy link
Contributor

FYI, I'm pretty sure (but haven't tested it) that this works correctly in scikit-build-core. The problem here is tied to the fact scikit-build's output is built around a package directory, rather than site-packages, as its primary output. But packages may need to write files or folders to site-packages, not just always inside a dir. I've avoided this from the beginning in scikit-build-core. Fixing it in scikit-build is going to be tricky, though, as it's likely not backward compatible. When I do, I'll also make sure .pth is handled correctly.

Why does package_data = {'': ['*.pth'], ...} not work, by the way?

@dalcinl
Copy link
Author

dalcinl commented May 15, 2023

FYI, I'm pretty sure (but haven't tested it) that this works correctly in scikit-build-core.

Oh, sorry, I forgot to update this issue. I've switched to scikit-build-core, and the issue is gone: installing *.pth files in site-packages (out of the package dir) works just fine.

Why does package_data = {'': ['*.pth'], ...} not work, by the way?

I didn't think of doing it that way. I will try, although I have to go back in my project's git history.

Fixing it in scikit-build is going to be tricky, though, as it's likely not backward compatible. When I do, I'll also make sure .pth is handled correctly.

From my side, I 'm not longer in need for this issue to be fixed, so please feel free to close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants
@dalcinl @henryiii and others