Way forward for co-registration module #435

rhugonnet · 2023-09-06T22:07:42Z

We currently have 25 issues open on co-registration 🙀, and a lot of notes and plans for the way forward dispersed everywhere. I initially tried to organize this in a project: https://github.com/orgs/GlacioHack/projects/3/views/2, but I think that the level of detail of the issues is too inconsistent for this, so I'm writing this summary instead!

Architecture:
We already did a lot in #158 and #329 to re-organize the coding structure to be more consistent and robust, but some points are left:

A detailed list in Re-organization of coreg.py #327 especially about cleaning + having consistent subfunctions of class methods + side-features (see next point),
Additional cleaning mentioned in Clean coreg.py #266,
The discussion of Revamp and simplify the raster-point/P-R/P-P functionality #402 about merging point methods into a single fit and apply, and call methods depending on Point-Point, Point-Raster or Raster-Raster support, which would also completely solve Allow point clouds for Coreg #134.

Features:
We've been needing for a while to have consistent:

Plotting: Enable optional plots for coregistration functions #139 (also in Re-organization of coreg.py #327),
Statistics (in Re-organization of coreg.py #327),
Weight support: Add weights in coreg functions #59 ,
Subsampling: Add random_state argument to Coreg class for subsample cases #243, Allow custom subsampling for each CoregPipeline step separately. #137 and mainly the long discussion in Argument subsample does not work for a CoregPipeline #428.

Most of them are not too far since we introduced a consistent structure for optimizers or binning in #158. It makes weights almost directly supported (but we'll need to raise proper warnings for methods that currently ignore, such as ICP), and will make plotting more easy by consistently treating the type of methods (binning, fit, or both) and the dimensionality of the variables (1D, 2D, ND), which can be re-used for Tilt or NuthKaab fits in the affine functions.

Tests:
Some tests are still a bit old or slow, several related issues could be solved all at once:

Consistent data fixtures in test modules: Use pytest.fixture to provide consistent test data across all test modules? #427,
More varied test data Add DEMs with NaNs (or synthesise them) for tests #64, Improve tests for Coreg, in particular with NaNs #140 and Add example data set with different projections geoutils#310, with synthetic data examples from GeoUtils here: Disable numba during CI tests to get adequate coverage stats #358 (comment).

Performance:
⚠️ We really need to think ahead for a structure that will allow memory-efficient computations using Dask:

For reading/writing with georeferencing, this will depend directly on rioxarray, and Add Xarray accessor gu geoutils#383 and Add Xarray accessor dem #392, so not too much to think about!
For Coreg.fit, as a regression requires all samples at once, it cannot be combined from solutions of different chunks (except in a blockwise way using BlockwiseCoreg). So it's all about the subsample which is fairly easy to deal with (read subsample per chunk + return concatenated vector). There's also the computation of derivatives needed, which are also straightforward (slope using overlapping chunks, bias_vars using normal chunks), see the thoughts in Argument subsample does not work for a CoregPipeline #428 (comment). Most other methods residuals, plot can be based on the same logic as they use subsample.
It leaves the logic for Coreg.apply, which might not be trivial. For bias corrections, the solution from the optimized function is applicable independently to every pixel given their bias_vars (coordinates, angle, etc), so very easy to apply to chunk. However, for affine methods, applying a 3D affine matrix in 4x4 format lazily to independent chunks won't work directly... it would also require a definition of the rotation center of the matrix, and maybe other things... Any thoughts on how to address this @erikmannerfelt? Or maybe @friedrichknuth has some insights?
We could also improve affine pipelines performance with Improve CoregPipeline.apply() and .fit() performance by merging matrices #79,
And improve memory performance by adding a default resampling logic, see Provide default downsampling logic in Coreg pre-processing for performance #437.

Bugs:
Here there's a lot, but they might solve themselves (or become irrelevant) after changing architecture + tests:
#423, #422, #404, #326, #232, #193

Idea of plan moving forward:

Think ahead on logic for Performance: already done a bit in Argument subsample does not work for a CoregPipeline #428 (comment). I think we can use the structure proposed there which should work eventually 🤞! For apply, this should be adaptable down the line...
Focus first on things all related to Architecture: to avoid the effort of adapting all the rest if we did it first (new features, bug fixes and tests). A lot of lessons learned from Add BiasCorr classes and rename previous coreg.BiasCorr in coreg.VerticalShift #158 and Re-organize coreg.py #329.
Add basic architectural Tests to ensure 2. is working as intended with what we already have, and add new data for consistent (and quick!) tests (this specific point might require its own thread of discussion).
Add support for all new Features: should be made easier by the consistent architecture!
Add new Tests in parallel of each feature in 4.
See if the Bugs are still relevant, and fix them if they are! 😄
Celebrate, for we would have reached quite a way! 🍾

Any thoughts @adehecq @erikmannerfelt? 😄

The text was updated successfully, but these errors were encountered:

rhugonnet · 2023-09-07T03:33:36Z

To clarify on the Coreg.apply for chunks:

The chunked writing of the transformed DEM would be fairly easy...
... But we'd need an apply_matrix_coords function that can transform the 4x4 matrix into a Callable that takes in 2D coordinates as input, and outputs the affine transformation at any given pixel (like is currently done in BiasCorr classes with fit_func).

Is that something we can do currently @erikmannerfelt?

adehecq · 2023-09-07T12:20:14Z

Any thoughts @adehecq @erikmannerfelt? 😄

Wow, I'm a bit overwhelmed with the list of things left to do to have any useful feedback for the moment.
It's great you made an exhaustive list of all the things to be considered for coreg. Now we need two things:

time ! There are a lot of changes to be made so we need to coordinate efforts on this. I'm happy to actively contribute to the code writing, but either way, someone will have to review the code so it's better if we do it in parallel rather than having to go through a huge PR
split this huge list into a sequence of steps that we can go through.

What was your idea regarding this project. Did you want to work on it in the coming days/weeks?
We should try to set some blocks of time to work together on this. We can coordinate over Slack or face to face.

rhugonnet · 2023-09-07T20:13:09Z

Yes it is a bit overwhelming like this haha, but thankfully some of these points are fairly small! 🙂

We'll need more detailed lists, but I think those can live within each PR that will address a new step/feature listed above. The architecture work will have to be one big PR, as everything is inter-dependent on the structure.

With all the discussions and the work already done in #158 trying to make the structure generic and robust, I have a clear vision of the technical steps to go through to reach every point. And I don't want to wait too long and lose it! I'd be happy to write a more detailed plan, discuss it, and combine efforts if you both can block some days for it, but can otherwise push on my own in the frame of the coreg study! 😉

My idea was to start on this full time end-of-September (for the co-registration study to reach a good stage before AGU).

rhugonnet mentioned this issue Sep 25, 2023

Improve subsample across Coreg subclasses and pipelines #436

Merged

rhugonnet added enhancement Feature improvement or request priority Needs to be fixed rapidly labels Jan 31, 2024

rhugonnet mentioned this issue Mar 7, 2024

Improve tests in test_affine once core class restructured with BiasCorr parameters #485

Open

This was referenced Mar 26, 2024

adapt changes to geouilts #519 and #501 #497

Closed

Add GDC(Gradient denscent coregistration) in documentation #495

Open

adehecq mentioned this issue Apr 12, 2024

Add Deramp and Blockwise to the documentation (coregistration section) #505

Open

rhugonnet mentioned this issue May 24, 2024

Re-structure AffineCoreg subclasses to be more modular #530

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Way forward for co-registration module #435

Way forward for co-registration module #435

rhugonnet commented Sep 6, 2023 •

edited

rhugonnet commented Sep 7, 2023

adehecq commented Sep 7, 2023

rhugonnet commented Sep 7, 2023

Way forward for co-registration module #435

Way forward for co-registration module #435

Comments

rhugonnet commented Sep 6, 2023 • edited

rhugonnet commented Sep 7, 2023

adehecq commented Sep 7, 2023

rhugonnet commented Sep 7, 2023

rhugonnet commented Sep 6, 2023 •

edited