Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WCS sampling & FITS to TOAST FITS #86

Merged
merged 32 commits into from
Aug 31, 2022

Conversation

imbasimba
Copy link
Member

Mostly addresses #81, but does nothing about masked pixels and does not yet support a collection of FITS input.

The main feature is based around the WCS sampler & filter, which allows us to create a TOAST imageset from any array/wcs input. The tile_fits method now has the ability to create TOAST imagesets, which also has replaced HiPS as the default tiling method for images covering a large area of the sky.

Upgrades to the general process:

  1. toast_base does now accept tile filters to allow processing of a specific area, and skipping every tile outside input bounds
  2. Same for the cascade process
  3. Replaced the parameters force_tan etc. with a TilingMethod enum

Requires WorldWideTelescope/wwt_data_formats#50 to work properly.

@codecov
Copy link

codecov bot commented Aug 19, 2022

Codecov Report

Merging #86 (e9a8774) into master (80a3976) will increase coverage by 1.91%.
The diff coverage is 84.77%.

@@            Coverage Diff             @@
##           master      #86      +/-   ##
==========================================
+ Coverage   75.42%   77.34%   +1.91%     
==========================================
  Files          23       23              
  Lines        3520     3703     +183     
==========================================
+ Hits         2655     2864     +209     
+ Misses        865      839      -26     
Impacted Files Coverage Δ
toasty/cli.py 80.42% <0.00%> (ø)
toasty/collection.py 56.66% <0.00%> (-0.64%) ⬇️
toasty/fits_tiler.py 46.93% <70.00%> (-0.40%) ⬇️
toasty/merge.py 92.81% <88.40%> (-2.59%) ⬇️
toasty/samplers.py 87.57% <88.43%> (+13.62%) ⬆️
toasty/pyramid.py 94.47% <93.33%> (-0.38%) ⬇️
toasty/__init__.py 100.00% <100.00%> (ø)
toasty/builder.py 96.73% <100.00%> (+0.04%) ⬆️
toasty/image.py 86.07% <0.00%> (+0.20%) ⬆️
... and 2 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

…um things like TilingMethod.TOAST actually equals TilingMethod.TOAST. Also added assertion for the resulting imageset's projection type after creation by tests.
@imbasimba
Copy link
Member Author

Now that the tests are properly creating a TOAST from FITS, they are failing because they are not yet using the latest wwt_data_formats WorldWideTelescope/wwt_data_formats#50

@pkgw
Copy link

pkgw commented Aug 23, 2022

wwt_data_formats 0.15.1 is now out with the new code. Closing and opening to re-trigger the CI.

@pkgw pkgw closed this Aug 23, 2022
@pkgw pkgw reopened this Aug 23, 2022
Copy link

@pkgw pkgw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good! Lots of the infrastructure comes together in a nice way.

I've pushed an update to the branch to merge in master and honor parallelism flags throughout the FITS tiling framework, since that couldn't actually be controlled before.

I've run into a serious problem with the _image_bounds() function with the DASCH test image, which (by sheer luck) is a pretty pathological test case. The maximum Dec of the image is in its middle (δ = 90°), not one of its corners, and this also means that it in fact spans all RAs. The np.min() and np.max() approach taken in this function has problems when dealing with longitudes because if, say, the min and max values are 30° and 60°, that could mean an image that spans longitudes 30–60°, or it could mean an image that spans 60–180–360–30°. It could be pretty tricky to make sure that we capture the two possibilities correctly.

docs/cli/view.rst Outdated Show resolved Hide resolved
toasty/__init__.py Outdated Show resolved Hide resolved
toasty/fits_tiler.py Outdated Show resolved Hide resolved
toasty/fits_tiler.py Outdated Show resolved Hide resolved
toasty/fits_tiler.py Outdated Show resolved Hide resolved
toasty/fits_tiler.py Outdated Show resolved Hide resolved
toasty/tests/test_toasty.py Outdated Show resolved Hide resolved
toasty/samplers.py Outdated Show resolved Hide resolved
@pkgw
Copy link

pkgw commented Aug 24, 2022

OK, I have pushed a couple of commits that I think fix the computation of image bounds, and it turns out that the lat/lon sampler also needed fixing to deal with images spanning very large amounts of latitude. I now can fully TOAST the DASCH test image (which I'm doing after reprojecting it to a straight TAN projection to avoid the Astropy WCS bug). They're not fully tested, but I wanted to get them uploaded before I go away for my travel.

Unfortunately the positioning of the image appears to be off after these fixes. I need to look into why that's happening.

@pkgw
Copy link

pkgw commented Aug 24, 2022

Good catch on the min/max mistake, thanks!

In one test image I have, hipsgen produces a "BAD PIXEL CUT" warning and
doesn't emit this information. (I think it's because the file has an
integer data type.) Avoid the crash if this is the case.
@pkgw
Copy link

pkgw commented Aug 24, 2022

OK, I've pushed one more commit that avoids a crash in hipsgen with my DASCH test file.

The "good" news is that my test image is landing in the wrong place because the reproject package is doing something very wrong when converting it from TPV to TAN projection, as far as I can tell. So that's not Toasty's fault.

edit: the reproject problem seems to be a consequence of astropy/astropy#13509 . The image shape looks reasonable, but the image is half-empty.

However, the less-good news is that if I compare the TOAST and hipsgen output, there's still a disagreement. It looks like there's a negated rotation term somewhere. But I wouldn't worry about this too much since the image coordinate system may be generally messed up.

@pkgw
Copy link

pkgw commented Aug 24, 2022

For posterity, here's what I'm seeing. The tnx file is TPV, the tanproj file is reprojected by reproject. The colorscales labeled on the left indicate which image is landing where. The view is pretty much centered on the North Pole.

image

@pkgw
Copy link

pkgw commented Aug 24, 2022

Here's hipsgen vs. TOAST of the TPV file — looks like the AstroPy bug didn't arise here. It's close, but rotated 180°, or something similar (note the area with writing on it, in the two images). The hipsgen version is correct.

image

@imbasimba
Copy link
Member Author

I had planned to add handling of FITS collections to a new PR, but it turned out this functionality could be added with minimal changes (thanks past toasty developers!).

This may not be the most optimal way of toasting multiple FITS files, but it seems to work quite well. Let's discuss here or on Monday @pkgw

imbasimba and others added 4 commits August 25, 2022 09:31
As noted in the comments, my DASCH sample file that crosses the pole has
WCS headers that seem to have another problem beyond their specification
of the TPV polynomial distortions. I need to set LONPOLE explicitly to
get the correct orientation on the sky.
@pkgw
Copy link

pkgw commented Aug 29, 2022

I figured out the problem with the DASCH test image, and it appears to be something outside of all of the TOAST infrastructure: the file should have a LONPOLE header. Based on my reading of Calabretta & Greisen 2002, the headers in the file are not correct. Forcing LONPOLE to 180 fixes the celestial alignment for this test image.

The "initiate readiness state" algorithm repeatedly tests whether
positions are in the tiles_to_process set. If large numbers of tiles
(tens of millions) are being processed, the algorithm basically fails
because such tests have O(n) scaling with lists. With a set, things
become tractable.

There may well be further room for improvement here, but this one change
makes an enormous difference for big problems.
In processing a FITS-to-TOAST dataset going down to level 14, with about
24 million tiles to process, I discovered that the tile filter function
was a non-trivial bottleneck -- it took about 10 minutes just to count
the tiles that needed to be processed! Some profiling showed that most
of the time was actually in the Numpy min/max functions. Here we
Cythonize the function and avoid the Numpy functions. My measurements
indicate a speedup of a factor of around 20.
@pkgw
Copy link

pkgw commented Aug 31, 2022

OK, there is still some more work to do here — in the DECaPS2 test processing I am running into some scalability issues since there are about 28 million tiles to process. But the code here seems to basically be working reliably, and I think it makes the most sense to close out this PR and continue evolving in separate ones.

Docs build is failing here due to an expired cert for dotnetfoundation.org. Merging anyway.

@pkgw pkgw merged commit 9c49c5c into WorldWideTelescope:master Aug 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants