
Creating an AMUSE binary distribution #984

Open
rieder opened this issue Sep 13, 2023 · 16 comments

rieder (Member) commented Sep 13, 2023

A long-standing wish is to have a binary (pre-compiled) distribution of AMUSE.
The main problem in realising this is that AMUSE depends on quite a few libraries, most importantly an MPI distribution, and these need to be the same versions as the ones AMUSE was built with.
One possible solution could be to distribute these libraries pre-built with AMUSE. It would be good to check if this is feasible, both technologically and legally (would there be any license conflicts?).
Ideally, installing AMUSE as a binary would work via Pip, and packages would automatically be built on GitHub when a new release is created.

rieder (Member Author) commented Sep 13, 2023

Solving this would make AMUSE much more accessible and portable, so I think it should be high priority.

rieder (Member Author) commented Sep 13, 2023

I'm not sure if/how mpi4py (used in AMUSE) could be configured to use an MPI library that would be included with AMUSE. This would be an important thing to find out.
Alternatively, we might be able to just use the sockets channel, but this is generally slower and more limited than MPI.
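As a starting point for finding this out, mpi4py can report which MPI library it was linked against, which is exactly what any MPI bundled with AMUSE would have to be compatible with. A minimal stdlib-safe probe (it degrades gracefully when mpi4py is absent; the worker class in the comment is shown for illustration only):

```python
# Check which MPI library mpi4py was linked against; an MPI bundled with
# AMUSE would have to be wire-compatible with this one.
try:
    from mpi4py import MPI
    mpi_library = MPI.Get_library_version()
except ImportError:
    mpi_library = "mpi4py not installed"

print(mpi_library)

# The sockets fallback is selected per worker in AMUSE, along the lines of:
#   code = BHTree(channel_type="sockets")
# (illustrative; the exact worker class depends on the community code used)
```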

rieder (Member Author) commented Sep 13, 2023

A slightly more complex (for the user) but much easier (for us) solution could be to write an installation script that sets up a Python virtual environment with MPI (and other libraries) installed pre-compiled, and to distribute binary AMUSE packages for that specific environment. We couldn't use PyPI for this, since the binaries would fail in any environment other than the specific one they were compiled for.

LourensVeen (Collaborator)

Well, that would certainly be a project 😄

I've been looking at the build system in the past couple of weeks (after MESA failed to compile for me), and I've been going through the community codes (I'm about 1/3 of the way through) looking at languages and build systems and licenses and such. A few things that come to mind:

  • This will be tricky on HPC. HPC machines normally have their own very specific MPI library, and it may simply be impossible to use anything else, or else possible-but-broken. HPC machines also sometimes have CPUs with things like AVX-512 that you do want to use if the code and/or compiler supports it. So for HPC, a binary distribution is not a good idea. EasyBuild support may be a better solution there, to make it easy for admins and users to install. You can install EasyBuild on your laptop too, but the documentation for that isn't brilliant and it takes a day or so to compile your first toolchain so you can do stuff. So not ideal there.

  • AMUSE has a ton of dependencies, both in that it is an amalgamation of libraries and community codes which often have their own dependencies in turn, and in the sense that it and the community codes need all sorts of sometimes rather obscure tools to build (fpx? ndiff? This MESA SDK is also...interesting). I've found some duplicate dependencies already among the community codes. Getting all that to build correctly isn't easy, which is why a binary distribution would be nice, and would have to include everything so that all you need to run it is a Linux kernel (whose ABI is very backwards compatible) and perhaps glibc.

  • Docker is an obvious way of doing this, but would a hard requirement on Docker be acceptable? You'd need Singularity/Apptainer on HPC too, which is not available everywhere. Unless you want to maintain both a containerised and a native build. Also, we'd need to figure out how to get fast MPI between containers.

  • Conda is more widely used, and possibly an option considering that it does Python as well as native languages. Similar to the current PyPI setup but with Conda binary packages. Same HPC considerations.

  • Licensing-wise, AMUSE is interesting 😄. Many of the community codes have no explicit license, although I gather they've been added with help (and so tacit approval) of their authors. Of course, legally speaking, it's not the authors but their employers that own the code, and it's not clear that the authors can speak on behalf of their employer. This is why e.g. the BSD mentions the regents of the university, because they are the only ones who can legally decide to license software made by Berkeley employees, and so they granted the license for BSD Unix.

    See also the Flash license, where the university (correctly!) claims copyright ownership. So that's questionable. And then there are some licenses that strongly suggest, if perhaps not quite legally require, that you cite papers if you publish; those terms are not compatible with the GPL, so that's a conflict depending on how you interpret the license. And then there's the legal gray area of what exactly constitutes a derived work, and where the borderline is between a simple amalgamation of independent packages in an archive and a combined work to which e.g. the GPL applies.

  • A possible alternative could be to set up a wide range of testing environments, with many different operating systems and compilers, and simply keep fixing things until it works everywhere. The CI would help with regressions and ensuring that the users don't find the bugs before we do. Problem may again be HPC, in that you're not going to be able to run the Cray compiler on your CI. This is what I'm doing with MUSCLE3, but that only has the coupling framework and doesn't have any community codes built in, so that makes this a lot more feasible.

LourensVeen (Collaborator)

One more thought: we could go native, and create .deb, .rpm and .dmg packages so that you can install those from a ppa (on Ubuntu or Debian) or the equivalent for Fedora/CentOS/RockyLinux if there is one, or by downloading the package on macos (or perhaps MacPorts or Homebrew makes more sense actually).

There would have to be some infrastructure for building on all the various platforms, probably based on Docker or VMs, and we'd have to figure out how it works, but if you support a couple of recent Ubuntu versions and a couple of recent macos versions then you're probably covering a good chunk of the users.

rieder (Member Author) commented Sep 13, 2023

> Well, that would certainly be a project 😄
>
> I've been looking at the build system in the past couple of weeks (after MESA failed to compile for me), and I've been going through the community codes (I'm about 1/3 of the way through) looking at languages and build systems and licenses and such. A few things that come to mind:
>
> • This will be tricky on HPC. HPC machines normally have their own very specific MPI library, and it may simply be impossible to use anything else, or else possible-but-broken. HPC machines also sometimes have CPUs with things like AVX-512 that you do want to use if the code and/or compiler supports it. So for HPC, a binary distribution is not a good idea. EasyBuild support may be a better solution there, to make it easy for admins and users to install. You can install EasyBuild on your laptop too, but the documentation for that isn't brilliant and it takes a day or so to compile your first toolchain so you can do stuff. So not ideal there.

I agree. HPC will probably always need custom builds, which is fine: that's the way we currently work, and it should always be supported.
However, for users who are just running AMUSE on their laptop or desktop (e.g. students, or people who want to experiment), a binary distribution would be very helpful.

> • AMUSE has a ton of dependencies, both in that it is an amalgamation of libraries and community codes which often have their own dependencies in turn, and in the sense that it and the community codes need all sorts of sometimes rather obscure tools to build (fpx? ndiff? This MESA SDK is also...interesting). I've found some duplicate dependencies already among the community codes. Getting all that to build correctly isn't easy, which is why a binary distribution would be nice, and would have to include everything so that all you need to run it is a Linux kernel (whose ABI is very backwards compatible) and perhaps glibc.

Right. Most of the dependencies are searched for in configure, but indeed especially MESA has its own extra dependencies.

> • Docker is an obvious way of doing this, but would a hard requirement on Docker be acceptable? You'd need Singularity/Apptainer on HPC too, which is not available everywhere. Unless you want to maintain both a containerised and a native build. Also, we'd need to figure out how to get fast MPI between containers.

I would prefer not to have a hard requirement on Docker, since that brings its own issues.

> • Conda is more widely used, and possibly an option considering that it does Python as well as native languages. Similar to the current PyPI setup but with Conda binary packages. Same HPC considerations.

Conda is indeed widely used. We have had lots of issues with Conda (see the many open issues); it would be really good to solve those and to have an AMUSE Conda package. Here too, I think HPC would not be the main target.

> • Licensing-wise, AMUSE is interesting 😄. Many of the community codes have no explicit license, although I gather they've been added with help (and so tacit approval) of their authors. Of course, legally speaking, it's not the authors but their employers that own the code, and it's not clear that the authors can speak on behalf of their employer. This is why e.g. the BSD mentions the regents of the university, because they are the only ones who can legally decide to license software made by Berkeley employees, and so they granted the license for BSD Unix.

Yes... This is something we haven't paid much attention to, but we probably should, especially if/when we start distributing binaries.

> See also the Flash license, where the university (correctly!) claims copyright ownership. So that's questionable. And then there are some licenses that strongly suggest, if perhaps not quite legally require, that you cite papers if you publish; those terms are not compatible with the GPL, so that's a conflict depending on how you interpret the license. And then there's the legal gray area of what exactly constitutes a derived work, and where the borderline is between a simple amalgamation of independent packages in an archive and a combined work to which e.g. the GPL applies.

Flash is explicitly not distributed with AMUSE for exactly this reason. Also, indeed we suggest (but don't require) that people cite our papers / those of the community codes. AMUSE itself uses the Apache license.

> • A possible alternative could be to set up a wide range of testing environments, with many different operating systems and compilers, and simply keep fixing things until it works everywhere. The CI would help with regressions and ensuring that the users don't find the bugs before we do. Problem may again be HPC, in that you're not going to be able to run the Cray compiler on your CI. This is what I'm doing with MUSCLE3, but that only has the coupling framework and doesn't have any community codes built in, so that makes this a lot more feasible.

A diverse testing environment is certainly a good idea. We started setting this up, but so far we're testing only a limited set of environments and codes.

rieder (Member Author) commented Sep 13, 2023

> One more thought: we could go native, and create .deb, .rpm and .dmg packages so that you can install those from a ppa (on Ubuntu or Debian) or the equivalent for Fedora/CentOS/RockyLinux if there is one, or by downloading the package on macos (or perhaps MacPorts or Homebrew makes more sense actually).

> There would have to be some infrastructure for building on all the various platforms, probably based on Docker or VMs, and we'd have to figure out how it works, but if you support a couple of recent Ubuntu versions and a couple of recent macos versions then you're probably covering a good chunk of the users.

Yes, this is also something we considered (see open issues on macports/homebrew and debian). It would be helpful. We need to be careful about requirements but this could probably work.

LourensVeen (Collaborator)

I don't like the idea of a Docker dependency either, so let's drop that option. We could go with Conda for the desktop and EasyBuild for HPC, which would give us two mostly standardised environments to deal with. Although not all HPC machines have EasyBuild, Conda is also somewhat messy, and Mac users may prefer Homebrew or MacPorts.

On the other hand, getting this packaged up for Debian is also very attractive. Many people run a Debian derivative, so having it available by default from the repos would be great. And as a long-time Linux user, being in the Debian repo also feels like your software has officially arrived in the FOSS world. A potential downside is that when running, say, Ubuntu LTS (as I do) you may well end up with a two-year-old version. We could solve that with a PPA along the lines of deadsnakes. Still, one way or another it will have to build in a range of environments.

For inclusion into Debian, the licensing situation needs to be clear for sure, which can be tricky. To really do it properly, we'd have to contact the universities, explain that one of their employees wrote some code several decades ago, explain to them what code is, what copyright is, what an open source license is, and then ask nicely if they're willing to license it to make the status quo of everyone copying it everywhere legal, at which point they'll panic and punt because oh my legal issues. At least, that's the worst case scenario 😄, and there are places that know how to do this, but it's still pretty early days. Of course, this should be resolved anyway, but I think EB and Conda are a bit less strict and we're already redistributing these codes so at least we wouldn't be making it worse.

Perhaps a first step would be to see if we can set up some robust infrastructure? As far as I can see, one CI currently runs Python tests of the core, and the other does a test with two community codes and different MPI versions, but there's no complete build and test with all the community codes, let alone in multiple environments. This may end up stretching the free GitHub resources a bit, but let's see.

Another issue is the setuptools setup. The development environment uses a custom setup.py command, which is deprecated and will disappear from setuptools at some point. It's also very complicated: there's some model-specific stuff in the setuptools code that should perhaps be in the corresponding per-package part of the system, #814 is still partially open, and there's a download.py that's been copy-pasted several times and that, as far as I can see, could be replaced by a few calls to wget or curl in the Makefile (which would also avoid automatically redownloading already-downloaded files).

It seems to me that there could be some room for cleanup there, but I'm completely new to this project and I worry that I'm missing things. So maybe we (I 😄) should focus on tests first? Ensure that they cover everything we need the build system to do, and then get working on making changes. At least that would reduce the chance of regressions, and get the requirements clear.

rieder (Member Author) commented Sep 14, 2023

Yes, focusing on tests first is probably best. We should sit together and discuss how to move forward from there.

rieder (Member Author) commented Sep 14, 2023

LourensVeen (Collaborator)

I've been looking at Python wheels with binary code a bit, as the only ones I've done so far are Python-only ones and those are easy. Note that the below assumes a desktop installation, HPC is another kettle of fish.

Wheels and their limitations

A wheel is a ZIP file with Python files and anything else needed for the package, which for packages that contain native code typically means precompiled dynamically linked libraries. These libraries are not allowed to be dynamically linked against any other libraries, unless those are also included or (for Linux) they are on a very short list of core system libraries.
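To make the "a wheel is a ZIP file" point concrete, here is a toy wheel built and inspected with only the standard library (the package name and metadata are made up; a real wheel would also need RECORD and METADATA files):

```python
# Build a minimal toy wheel and list its contents, showing that the
# format is a plain ZIP archive with a dist-info directory inside.
import os
import tempfile
import zipfile

wheel_path = os.path.join(tempfile.mkdtemp(), "demo-0.1-py3-none-any.whl")
with zipfile.ZipFile(wheel_path, "w") as zf:
    zf.writestr("demo/__init__.py", "")
    zf.writestr("demo-0.1.dist-info/WHEEL", "Wheel-Version: 1.0\n")

with zipfile.ZipFile(wheel_path) as zf:
    names = zf.namelist()

print(names)  # ['demo/__init__.py', 'demo-0.1.dist-info/WHEEL']
```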

In our case, the wheels would contain precompiled workers, for which presumably the same rules would hold. PyPI will refuse wheels that link to libraries outside the wheel that aren't on the exempt list. Even if we can get around this (e.g. by creating an installer script that determines which system it is on and then downloads an appropriate wheel for that specific OS version from a separate server), it's clear that you're not supposed to do that. We'd also have to support a potentially large number of combinations of installed library versions.

MPI and CUDA dependencies

We cannot link our workers against an included MPI library, because the Python side uses mpi4py, which links against the system's MPI on installation, and that may be different and incompatible on the wire. So our workers need to be linked against the system MPI library, which would have to happen during installation; at minimum that requires a compiler and the MPI development packages.
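If workers are compiled against the system MPI at install time, the build configuration that mpi4py itself recorded would be a natural starting point for staying consistent with the Python side. A hedged probe (works whether or not mpi4py is installed):

```python
# mpi4py records how it was built; an install-time worker build could reuse
# the same MPI compiler wrappers to match the library the Python side uses.
try:
    import mpi4py
    build_config = mpi4py.get_config()  # typically includes 'mpicc'/'mpicxx'
except ImportError:
    build_config = {}

print(build_config)
```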

CUDA may be even trickier, as it's proprietary and has a C++ API, and C++ doesn't have a stable ABI. This means any CUDA code would pretty much have to be compiled locally.

We could try to compile some of the code into static archives, which could then be shipped in the wheel and linked with local MPI and CUDA libraries on installation. For MPI this might even work, it having a standardised C API and C having a standard ABI, but more likely it will just break in many hard-to-debug ways. For CUDA this very likely won't work unless we have exactly the same version on either side.

Wheels + AMUSE = a bad idea?

Given the above, it seems that there's no good solution for shipping wheels for use with pip and virtualenv. One way to look at this is that AMUSE is really a bunch of native code with Python bindings, rather than a Python library that has some native bits, and wheels simply aren't designed for that.

So I think it makes more sense to ship binaries using a package manager that's designed for this scenario, such as Anaconda or Homebrew or MacPorts or dpkg or RPM.

Improving the sdist installation

There are likely to always be cases where AMUSE will have to be installed from source: HPC machines come to mind, as does any other place where you cannot use the above package managers or really need to use a virtualenv, development installs, or any system that doesn't have a binary available for some reason.

The current source build on PyPI is error-prone. Users normally expect to just be able to do pip install amuse and get everything working. For Python-only packages or self-contained wheels this works, but we can't provide that, so users have to do a bunch of work first to ensure that the dependencies are installed and findable. If they don't follow the documentation very carefully, or there's something slightly unusual about their system, this fails with a mysterious error message about PEP 517 and no indication that there is more output, let alone where to find it or how to get help. So the current sdist is a rather leaky abstraction.

One option may be to get rid of the packages on PyPI altogether, and have users clone the repo or download a tarball and then use make or something to build a wheel and install it into their virtualenv. That way at least it's clear that they're compiling a native package, in which context it makes sense that you need to install dependencies and that there's a build log somewhere. Of course, compiling native packages is usually a poor user experience to begin with, albeit one that some users are used to suffering through.

Perhaps the least bad solution here would be a separate installer program, written in plain Python, which would detect the OS environment, install any necessary packages in consultation with the user ("I see that we're on a Mac and that you have Homebrew installed, shall I use Homebrew to install the dependencies for you?"), and then build from source. Then you could pip install install-amuse, run install-amuse, and it would guide you towards an as-painless-as-possible source install.
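The detection step of such an installer (the install-amuse name above is hypothetical, and so is this helper) could start as simply as this stdlib-only sketch:

```python
# Detect the operating system and a usable package manager, as a first step
# towards guiding the user through a source install.
import platform
import shutil

def detect_environment():
    """Return (os_name, package_manager); package_manager may be None."""
    os_name = platform.system()  # 'Linux', 'Darwin', ...
    for manager in ("brew", "port", "apt-get", "dnf", "conda"):
        if shutil.which(manager) is not None:
            return os_name, manager
    return os_name, None

os_name, manager = detect_environment()
print(os_name, manager)
```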

jobovy commented Oct 11, 2023

Hi!

I've been spending some time trying to get AMUSE installed on a UNIX server (CentOS) without any success (with either pip or the development version), so I would welcome binaries being available for easier installation! I found this thread and just wanted to mention the existence of tools such as auditwheel and delocate, which copy the necessary libraries into wheels and rewrite the references in the code to point at the bundled copies. This makes it much easier to install packages that require system libraries, because users don't have to pre-install any requirements. I'm not sure whether this would work with your mpi4py usage, but perhaps it could at least be a way to help distribute the community codes.

rieder (Member Author) commented Oct 11, 2023

Thanks Jo!
Happy to help with installing amuse on the server, if you want.

jobovy commented Oct 11, 2023

> Thanks Jo! Happy to help with installing amuse on the server, if you want.

Thanks for the offer! I'm not quite following the recommended installation guidelines (I think), so I will try a bit further on my end, but otherwise will reach out in another issue.

LourensVeen (Collaborator)

Likewise I'll be happy to help.

I'm aware of auditwheel and delocate, but indeed they wouldn't solve the problem of mpi4py compatibility. We'd have to convince the mpi4py maintainers to create a binary wheel with some specific MPI library included, and then bundle and use exactly the same library in AMUSE, carefully keep those synchronised, and then hope that the bundled MPI library doesn't clash at runtime with any MPI already installed on the system.

I had a look at the CUDA EULA and that does actually permit redistribution of some of the libraries, but there's an issue there with some of the codes or dependencies having GPL licenses, and still the potential for version mismatches between a bundled CUDA library and the installed CUDA driver. Although I guess that latter problem will exist with Anaconda too.

LourensVeen (Collaborator)

Related mpi4py issue: mpi4py/mpi4py#28
