Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdshealpix, a Rust HEALPix implementation #128

Open
bmatthieu3 opened this issue May 24, 2019 · 12 comments
Open

cdshealpix, a Rust HEALPix implementation #128

bmatthieu3 opened this issue May 24, 2019 · 12 comments
Assignees
Labels
Milestone

Comments

@bmatthieu3
Copy link
Contributor

@cdeil @fxpineau @astrofrog @tboch - Following our discussion from PyGamma19, a mid-term goal is to replace the current C HEALPix implementation from astrometry.net with the new Rust one (github link) developed by @fxpineau. The assumption behind that being that astropy_healpix will remain an affiliated package (because Rust is currently too recent to be moved to the core of astropy).

On my side, I continued my developments on cdshealpix (github link) and released a version 0.2.0 where:

  • I use the rust-numpy library to interface numpy arrays (and view arrays) between python and rust code.
  • rust-numpy relies on the pyo3 Python-Rust wrapper. Thus, cdshealpix does not rely on cffi anymore meaning that there is no more C code between python and rust (this is safer because I do not manipulate C raw pointers anymore...).

cdshealpix has some features that astropy_healpix does not have:

  • a polygon search that returns all the HEALPix cells in a SkyCoord describing a polygon on the sphere. Self-intersecting polygons are accepted.
  • an elliptical search
  • a method, external_edges_cells that returns the neighbors (i.e. the cells touching) of an HEALPix cell but at a deeper depth (it is a the same as neighbors but with a depth parameter that must be > to the depth of the HEALPix cell from which the neighbors are computed).

And some features from astropy_healpix are missing in cdshealpix. Those are:

  • The ring scheme. Currently, only the nested scheme is implemented.
  • The code responsible for the interpolation
  • The dx, dy offset optional arguments in healpix_to_lonlat and lonlat_to_healpix
  • The step optional argument in boundaries_lonlat (i.e. vertices in cdshealpix)
@bmatthieu3
Copy link
Contributor Author

@cdeil - @fxpineau and I have implemented the basic methods for the ring scheme. Please see the doc of the API for all the methods supported:
https://cds-astro.github.io/cds-healpix-python/api.html

The package has two sub packages nested and ring that each have their own methods:

from cdshealpix.nested import lonlat_to_healpix
from cdshealpix.ring import lonlat_to_healpix

The nested schema methods take a depth param whereas the ring scheme methods take a Nside param which does not necessary have to be a power of two.

There are two new methods in cdshealpix, from_ring and to_ring that allow to switch from the nested to the ring scheme and vice versa.

Finally there is also a new bilinear_interpolation method implemented for the nested scheme.

@cdeil cdeil added the question label Jul 3, 2019
@cdeil cdeil added this to the 0.5 milestone Jul 3, 2019
@cdeil
Copy link
Member

cdeil commented Jul 3, 2019

@bmatthieu3 @fxpineau @tboch - Thank you for all your work on the CDS HEALPix codes!

As I mentioned at PyGamma19, I would consider a C HEALPix lib that we bundle and wrap for astropy-healpix the ideal long-term solution, and also that we put it in the core package at some point as astropy.healpix.

For now, I agree that changing astropy-healpix to be based on the Rust CDS HEALPix lib is the best way to go, since you are wiling to add the missing features we need like interpolation, and the core algorithms are very fast and accurate there, which is not the case for what we have now. If others also feel doing a C translation is a good idea and someone has time or can get funding to do it, this change to a more common setup could then happen any time in the future (like how WCSLib or ERFA or CFITSIO are C libs that Astropy wraps). Or who knows, maybe Rust will become more common and we'll leave it as-is.

@astrofrog @lpsinger - OK to go this way?

@bmatthieu3 - How do you propose we set this up?

Would we depend on and import cdshealpix from astropy-healpix?
Or would you give up the Python cdshealpix and merge it with astropy-healpix?

I guess the second option is more maintainable and better for users (no "should I use cdshealpix or astropy-healpix?), but to do it there's a lot of API and feature questions to be re-discussed, to do it in a way that e.g. allows all features you need for queries to be exposed?

One thing we can do for now is to improve tests a bit further, e.g. add CI builds for the packages that use astropy-healpix here so that we notice if we break them: reproject, hipspy, soon gammapy.

@lpsinger
Copy link
Contributor

lpsinger commented Jul 3, 2019

My main concern is that the core routines are still Numpy ufuncs so that they get broadcasted over indices and types appropriately. I am not familiar with Rust build tools so I don't know how to call Rust functions from a Python C extension.

@bmatthieu3
Copy link
Contributor Author

@lpsinger - The way I do it is in pure Rust because:

  • I have seen that there is a numpy wrapper in Rust allowing me to manipulate ArrayViews in Rust too.
  • I think interfacing Rust code with some C would break the interest of using Rust behind it. We would pass raw pointers from C to Rust and raw pointers are unsafe in Rust meaning that we could potentially have segfault in Rust too when dereferencing these. Instead I do use pyo3 that allows me to write python extensions directly in Rust without doing some C between.

There is a rust wrapper library around numpy called rust-numpy that handles dynamic numpy array types, e.g. this offers the type PyArrayDyn<T> being generic over the type T (float64, uint64, ...).

This is an example of lonlat_to_healpix coded in Rust that calls the Rust cdshealpix lib.

#[pyfn(m, "lonlat_to_healpix")]
fn lonlat_to_healpix(_py: Python,
        depth: u8,
        lon: &PyArrayDyn<f64>,
        lat: &PyArrayDyn<f64>,
        ipix: &PyArrayDyn<u64>,
        dx: &PyArrayDyn<f64>,
        dy: &PyArrayDyn<f64>)
    -> PyResult<()> {
        let lon = lon.as_array();
        let lat = lat.as_array();
        let mut ipix = ipix.as_array_mut();
        let mut dx = dx.as_array_mut();
        let mut dy = dy.as_array_mut();

        let layer = healpix::nested::get_or_create(depth);
        Zip::from(&mut ipix)
            .and(&mut dx)
            .and(&mut dy)
            .and(&lon)
            .and(&lat)
            .par_apply(|p, x, y, &lon, &lat| {
                let r = layer.hash_with_dxdy(lon, lat);
                *p = r.0;
                *x = r.1;
                *y = r.2;
            });

        Ok(())
}

The PyArrayDyn are first converted to ArrayView type objects. And then I zip the array views and iterate over them by computing the hash values with the dx and dy one by one. This is done in parallel by replacing the call to apply by a call to par_apply (this internally use the rayon library handling concurrency over iterators).

This code is then compiled into a .so shared library using setuptools_rust and I can call its methods from python that way:

def lonlat_to_healpix(lon, lat, depth, return_offsets=False):
    lon = np.atleast_1d(lon.to_value(u.rad))
    lat = np.atleast_1d(lat.to_value(u.rad))

    if depth < 0 or depth > 29:
        raise ValueError("Depth must be in the [0, 29] closed range")

    if lon.shape != lat.shape:
        raise ValueError("The number of longitudes does not match with the number of latitudes given")

    num_ipix = lon.shape
    # Allocation of the array containing the resulting HEALPix cell indices
    # This is done in the Python side code to benefit the Python garbage collector
    ipix = np.empty(num_ipix, dtype=np.uint64)
    # And the offsets
    dx = np.empty(num_ipix, dtype=np.float64)
    dy = np.empty(num_ipix, dtype=np.float64)

    # Call the lonlat_to_healpix from the shared library (.so)
    cdshealpix.lonlat_to_healpix(depth, lon, lat, ipix, dx, dy)

    if return_offsets:
        return ipix, dx, dy
    else:
        return ipix

@cdeil
Copy link
Member

cdeil commented Jul 4, 2019

@bmatthieu3 - Interesting. I see that you don't use OpenMP, there's no libopenmp here:

$ otool -L cdshealpix/cdshealpix.cpython-37m-darwin.so  
cdshealpix/cdshealpix.cpython-37m-darwin.so:
	/private/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/pip-req-build-fpcomatb/target/release/deps/libcdshealpix.dylib (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.0.0)
	/usr/lib/libresolv.9.dylib (compatibility version 1.0.0, current version 1.0.0)

So Rust / Rayon really has a way to ship multi-threading binaries that are completely self-contained? Also on Windows?

I think still we need to have a conversation whether to use multi-threading or single-threading.
With the current implementation we used OpenMP and I pushed to remove it for better portability and simpler user experience (see #97).

I just tried cdshealpix quickly, and ran this on my 4-core Macbook:

In [15]: %time healpix_to_lonlat(np.zeros(int(1e9)), depth=1)                                                                                                                                    
^C^C^C^C^C^C^C^C^C^C^C^C^C^C^CTerminated: 15

It was fluctuating one to four cores, not running consistently on four cores.
CTRL + C didn't work, I had to terminate the process from the outside.

Overall I would probably argue again that we just don't use multi-threading to keep it simple.

But that's a detail - the big question is whether to adopt cdshealpix and how to integrate it here and maintain it moving forward. @lpsinger @astrofrog - Please check out cdshealpix and comment when you have some time.

@cdeil
Copy link
Member

cdeil commented Jul 4, 2019

IMO packaging and distribution maintenance effort for the coming years is the main concern with introducing Rust. There's also the fact that pretty much no-one knows Rust yet or is using it in the Astropy community, people know C or C++ or Cython or Julia, but after looking at Rust a bit I think it's fine if you and CDS supports it -- it's complex, but C Numpy Ufunc code is also complex and Cython can be tricky, it's all about the same and it is maintainable, we don't need many people to hack on that part of the code.

So concerning distribution: did you try making a conda package via conda-forge? Do they have any packages using Rust or it's possible to do it? Maybe you could ask on the conda-forge gitter and try?

Currently I think for Astropy and most packages users use conda, i.e. having https://github.com/conda-forge/astropy-healpix-feedstock is important.

Concerning the wheel build, it would also be good to not have a special setup that you maintain, but have it as part of a larger build setup. @astrofrog recently started https://github.com/astropy/wheel-forge - I don't know how much work it would be to build cdshealpix there. I know doing the build and deployment setup is a lot of work - I'm not suggesting you change now when it's not clear which way to go, but long-term we need a low-effort to maintain solution.

@cdeil
Copy link
Member

cdeil commented Jul 4, 2019

I tried to build cdshealpix and for now am stuck here:

$ pip install -r requirements-dev.txt
...
   Compiling pyo3 v0.6.0
     Running `rustc --edition=2018 --crate-name build_script_build /Users/deil/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.6.0/build.rs --color always --crate-type bin --emit=dep-info,link -C opt-level=3 --cfg 'feature="default"' --cfg 'feature="extension-module"' --cfg 'feature="python3"' -C metadata=eccd457f61957a9f -C extra-filename=-eccd457f61957a9f --out-dir /Users/deil/work/code/cds-healpix-python/target/release/build/pyo3-eccd457f61957a9f -L dependency=/Users/deil/work/code/cds-healpix-python/target/release/deps --extern regex=/Users/deil/work/code/cds-healpix-python/target/release/deps/libregex-52739a7ae997f86f.rlib --extern version_check=/Users/deil/work/code/cds-healpix-python/target/release/deps/libversion_check-14d1cf1b2e18726d.rlib --cap-lints allow`
     Running `/Users/deil/work/code/cds-healpix-python/target/release/build/pyo3-eccd457f61957a9f/build-script-build`
error: failed to run custom build command for `pyo3 v0.6.0`
process didn't exit successfully: `/Users/deil/work/code/cds-healpix-python/target/release/build/pyo3-eccd457f61957a9f/build-script-build` (exit code: 101)
--- stderr
Error: pyo3 requires a nightly or dev version of Rust.
Installed version is: 1.34.2 (2019-05-13). Minimum required: 1.34.0-nightly (2019-02-06).
thread 'main' panicked at 'Aborting compilation due to incompatible compiler.', /Users/deil/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.6.0/build.rs:540:17
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.

warning: build failed, waiting for other jobs to finish...
error: build failed
error: cargo failed with code: 101

@bmatthieu3 - Will Rust nightly be required for the forseeable future or will it be possible to compile cdshealpix with stable Rust soon? For now, could you please put the command to use here to change to nightly for people like me that have rust installed and now need to switch to nightly or make a new "env" with nightly if the Rust people have that?
https://github.com/cds-astro/cds-healpix-python#setting-up-your-development-environment

@bmatthieu3
Copy link
Contributor Author

So concerning distribution: did you try making a conda package via conda-forge? Do they have any packages using Rust or it's possible to do it? Maybe you could ask on the conda-forge gitter and try?
Currently I think for Astropy and most packages users use conda, i.e. having https://github.com/conda-forge/astropy-healpix-feedstock is important.

Ok! I will work on that and will keep you in touch.

For the wheel building @astrofrog already said to me about https://github.com/astropy/wheel-forge. I fully agree with you, a special setup as it is for now is not a good long-term solution.

@cdeil - Only test and benchmark features need the nightly version of the rust compiler so that is not core code and thus this should be removed soon. For the moment, you can switch the compiler to the nightly with this command:

rustup default nightly

and go back to the stable with:

rustup default stable

@cdeil
Copy link
Member

cdeil commented Jul 4, 2019

@bmatthieu3 - Now I get this compile error with cdshealpix!?
https://gist.github.com/cdeil/40439c3eb0491f8dae121fc2a03e26a6#file-gistfile1-txt-L241

@lpsinger
Copy link
Contributor

@bmatthieu3, my concern is that the Rust implementation does not broadcast over indices because it is not providing a Numpy ufunc. For example, it chokes on this:

>>> import numpy as np
>>> import astropy_healpix as ah
>>> import cdshealpix as ch
>>> ipix = np.arange(3)
>>> level = np.arange(2)
>>> nside = 2**level
>>> ah.healpix_to_lonlat(ipix[:, np.newaxis], nside[np.newaxis, :])
(<Longitude [[0.78539816, 0.78539816],
            [2.35619449, 2.35619449],
            [3.92699082, 3.92699082]] rad>, <Latitude [[0.72972766, 1.15965846],
           [0.72972766, 1.15965846],
           [0.72972766, 1.15965846]] rad>)
>>> ch.healpix_to_lonlat(ipix[:, np.newaxis], level[np.newaxis, :])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/site-packages/cdshealpix/nested/healpix.py", line 155, in healpix_to_lonlat
    if depth < 0 or depth > 29:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

It's important for our LIGO applications that these functions broadcast correctly.

bmatthieu3 added a commit to cds-astro/cds-healpix-python that referenced this issue Aug 20, 2019
bmatthieu3 added a commit to cds-astro/cds-healpix-python that referenced this issue Aug 20, 2019
@fxpineau
Copy link

cdshealpix is now in conda-forge and does not require nightly builds any more:

conda install -c conda-forge cdshealpix

@lpsinger
Copy link
Contributor

lpsinger commented May 7, 2021

Interestingly, the popular cryptography package uses rust now. I am prepared to lift my objection to including rust as a dependency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants