Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to new Cython-based RGB to CIE LCH_ab conversion for colorizing #122

Merged
merged 50 commits into from
Aug 24, 2023

Conversation

djhoese
Copy link
Member

@djhoese djhoese commented Jan 30, 2023

This is a replacement for #121 that adds a cython extension to trollimage. This does mean a lot more complexity to the building of trollimage, but opens up a lot of doors for more optimizations in the future.

I commented on #79 about the performance I was seeing from #121 and how it was worse. I'm happy to say that with this new version the say test that took 1.68s in the version in main now takes 479ms. I still have some optimizations though. This newest implementation is faster but only barely. I will try optimizing it further in another PR.

This work is heavily taken from the rio-color package: https://github.com/mapbox/rio-color

TODO:

  • Decide/determine if the failing tests are just different numbers and not necessarily wrong

  • Add wheel building and upload to CI

  • Implement rgb2hcl in Cython (currently only have hcl2grb)

  • Refactor, refactor, refactor

  • Closes Add more accurate RGB <-> HCL conversion for colorize interpolation #121, closes hcl2rgb bottleneck in colorize methods #79 (remove if there is no corresponding issue, which should only be the case for minor changes)

  • Tests added (for all bug fixes or enhancements)

  • Tests passed (for all non-documentation changes)

  • Passes git diff origin/master **/*py | flake8 --diff (remove if you did not edit any Python files)

  • Fully documented (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)

@djhoese djhoese requested a review from mraspaud January 30, 2023 03:16
@djhoese djhoese self-assigned this Jan 30, 2023
@djhoese
Copy link
Member Author

djhoese commented Jan 30, 2023

For performance, this script:

from trollimage.colormap import Colormap
import numpy as np
cmap = Colormap(values=list(range(4)), colors=[(1.0, 1.0, 0.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)])
data = np.random.random((3000, 3000))
colorized_image = cmap.colorize(data)

I run:

kernprof -l  profile_colorize.py
python -m line_profiler profile_colorize.py.lprof

And get:

Timer unit: 1e-06 s

Total time: 0.428572 s
File: /home/davidh/repos/git/trollimage/trollimage/colormap.py
Function: _interpolate_rgb_colors at line 92

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    92                                           @profile
    93                                           def _interpolate_rgb_colors(arr, colors, values):
    94         1          8.4      8.4      0.0      interp_xp_coords = np.array(values)
    95         1         35.8     35.8      0.0      interp_y_coords = rgb2hcl(colors)
    96         1          2.6      2.6      0.0      if values[0] > values[-1]:
    97                                                   # monotonically decreasing
    98                                                   interp_xp_coords = interp_xp_coords[::-1]
    99                                                   interp_y_coords = interp_y_coords[::-1]
   100                                               # handling NaN Hue values for interpolation
   101         1          0.7      0.7      0.0      new_hues = interp_y_coords[..., 0]
   102         1         35.5     35.5      0.0      if np.isnan(new_hues).all():
   103                                                   new_hues[..., :] = 0
   104                                               else:
   105         1          1.7      1.7      0.0          if np.isnan(new_hues[..., 0]):
   106                                                       new_hues[..., 0] = new_hues[~np.isnan(new_hues)][0]
   107         1          6.8      6.8      0.0          while np.isnan(new_hues).any():
   108                                                       new_hues[..., 1:] = np.where(np.isnan(new_hues[..., 1:]), new_hues[..., :-1], new_hues[..., 1:])
   109         1          1.4      1.4      0.0      interp_y_coords[..., 0] = new_hues
   110         1         12.4     12.4      0.0      interp_hcl = np.zeros(arr.shape + (3,), dtype=interp_y_coords.dtype)
   111         1     111445.3 111445.3     26.0      interp_hcl[..., 0] = np.interp(arr, interp_xp_coords, interp_y_coords[..., 0])
   112         1      54063.3  54063.3     12.6      interp_hcl[..., 1] = np.interp(arr, interp_xp_coords, interp_y_coords[..., 1])
   113         1      51841.0  51841.0     12.1      interp_hcl[..., 2] = np.interp(arr, interp_xp_coords, interp_y_coords[..., 2])
   114         1     211114.2 211114.2     49.3      new_rgb = hcl2rgb(interp_hcl)
   115         1          2.9      2.9      0.0      return [new_rgb[..., 0], new_rgb[..., 1], new_rgb[..., 2]]

So hcl2rgb is still the slowest individual call.

@djhoese
Copy link
Member Author

djhoese commented Feb 11, 2023

So I started analyzing the failures of test_colorize_with_interpolation and tried to figure out which algorithm is more "right". So one of the results the test is supposed to product is 50%/50% between the first two colors of the colormap which are RGB (1.0, 1.0, 0.0) and (0.0, 1.0, 1.0). So yellow and cyan.

Here's what I get when I convert them to HCL:

In [3]: rgb2hcl_old(1.0, 1.0, 0.0)
Out[3]: (80.17623543714927, 0.00501416179526073, -0.14337082984424737)

In [4]: rgb2hcl_old(0.0, 1.0, 1.0)
Out[4]: (169.421806448465, 0.0034564219537122592, -0.1440018813512837)

In [5]: rgb2hcl(np.array([[1.0, 1.0, 0.0]]))
Out[5]: array([[1.04719755, 0.66666667, 0.5       ]])

In [6]: rgb2hcl(np.array([[0.0, 1.0, 1.0]]))
Out[6]: array([[3.14159265, 0.66666667, 0.5       ]])

What I see here is that the new algorithm is much more consistent. For example, why is C in the old version different for two "full" colors on the HCL colorspace. Why is luminance negative? Here's the paper for this PR's algorithm if you want the full algorithm:

https://web.archive.org/web/20190220074017/http://pdfs.semanticscholar.org/206c/a4c4bb4a5b6c7b614b8a8f4461c0c6b87710.pdf

but here's the colorspace diagram:

image

Let me see if I can generate the two versions of the colormaps as colorbars...

@djhoese
Copy link
Member Author

djhoese commented Feb 11, 2023

Old:

colorbar_test_old

New:

colorbar_test_new

Hm that cyan to white in the new one definitely doesn't look right.

Edit: ...but I'm pretty sure I fixed this in the pure-python version 😕

@djhoese
Copy link
Member Author

djhoese commented Feb 11, 2023

Newest commit fixes this:

colorbar_test_new2

@djhoese
Copy link
Member Author

djhoese commented Feb 12, 2023

I have another branch based on this one which completely reworks docs. I'd like to save that work for another PR.

@codecov
Copy link

codecov bot commented Feb 12, 2023

Codecov Report

Merging #122 (2d97595) into main (0041bf5) will decrease coverage by 2.21%.
The diff coverage is 68.84%.

@@            Coverage Diff             @@
##             main     #122      +/-   ##
==========================================
- Coverage   93.37%   91.17%   -2.21%     
==========================================
  Files          11       11              
  Lines        3624     3852     +228     
==========================================
+ Hits         3384     3512     +128     
- Misses        240      340     +100     
Flag Coverage Δ
unittests 91.17% <68.84%> (-2.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
trollimage/_colorspaces.pyx 63.89% <63.89%> (ø)
trollimage/colormap.py 100.00% <100.00%> (ø)
trollimage/colorspaces.py 100.00% <100.00%> (ø)
trollimage/tests/test_colormap.py 100.00% <100.00%> (ø)
trollimage/tests/test_image.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

@djhoese
Copy link
Member Author

djhoese commented Feb 12, 2023

So I had to update some other existing expected test results. The colormap used in that is brbg and I noticed a very small difference in the overall look.

Image with old colorize:

colorize_la_rgb_old.png

Image with new colorize:

colorize_la_rgb_new.png

Ugh, github is not rendering these nicely.

@djhoese djhoese marked this pull request as ready for review February 12, 2023 20:16
@djhoese
Copy link
Member Author

djhoese commented Feb 15, 2023

From what I can tell lost coverage is due to no tests for colorspaces.py and now that the hcl functions from that module aren't being used there are no tests touching them.

@mraspaud
Copy link
Member

Re YlGnBu, we should definitely add more intermediate colors, maybe these: https://colorbrewer2.org/#type=sequential&scheme=YlGnBu&n=9

@djhoese
Copy link
Member Author

djhoese commented Aug 21, 2023

Yes, absolutely. Even just adding the middle color (which matches the other colors I think) made things better in this PR. I should note that the color produced by main is not accurate with color brewer.

Do you have any other thoughts on the other colormap changes? Are any of those unacceptable to you if this PR were to be adopted/merged?

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for putting in all the effort, it's just a bit disappointing it didn't improve the performance that much.

@mraspaud
Copy link
Member

Tests are failing though...

@djhoese
Copy link
Member Author

djhoese commented Aug 22, 2023

I'll spend an hour or so today seeing if I flip the logic around if it changes performance: I'm thinking if I do the for loop on the outer convert function and then passing every pixel in to the sub-functions that it should improve performance based on memory locality. I think for simplicity of code I'm going to pass pixel colors by reference instead of by value. Or at least the output. That way all functions can remain void for returns and I don't have to worry about creating stuff on the stack...I think. Do you have other ideas @mraspaud? The rio-color project did a struct that it assigned to and then returned. I'm not sure I want to go down that road but I don't have a strong reason either.

@djhoese
Copy link
Member Author

djhoese commented Aug 22, 2023

Oh and the tests I will fix. They are just expected color comparison failures. I'll triple check that they aren't completely different colors before changing the expected values.

@mraspaud
Copy link
Member

I'll spend an hour or so today seeing if I flip the logic around if it changes performance: I'm thinking if I do the for loop on the outer convert function and then passing every pixel in to the sub-functions that it should improve performance based on memory locality. I think for simplicity of code I'm going to pass pixel colors by reference instead of by value. Or at least the output. That way all functions can remain void for returns and I don't have to worry about creating stuff on the stack...I think. Do you have other ideas @mraspaud? The rio-color project did a struct that it assigned to and then returned. I'm not sure I want to go down that road but I don't have a strong reason either.

I have to admit I can't think of much right now....

One more thing I saw now when looking at the code again: could we extract methods for the rgb to and from srgb conversions? I think it would make the code a bit nicer.

@mraspaud
Copy link
Member

also opencv and corresponding python bindings could make things much faster if we want to test https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html

@djhoese
Copy link
Member Author

djhoese commented Aug 22, 2023

One more thing I saw now when looking at the code again: could we extract methods for the rgb to and from srgb conversions? I think it would make the code a bit nicer.

Which ones are you referring to?

@djhoese
Copy link
Member Author

djhoese commented Aug 22, 2023

Ah sure. I could make a little inline function for the individual component operation. I think I'll leave the simplified version as it is.

@djhoese
Copy link
Member Author

djhoese commented Aug 22, 2023

Oh and no I haven't looked at those other libraries/tools because I was trying to avoid another dependency for only a small portion of what this library does and for such a small portion of what those libraries does (lots of extra functionality we aren't using).

@mraspaud
Copy link
Member

Oh and no I haven't looked at those other libraries/tools because I was trying to avoid another dependency for only a small portion of what this library does and for such a small portion of what those libraries does (lots of extra functionality we aren't using).

true.

@djhoese djhoese merged commit 3550c58 into pytroll:main Aug 24, 2023
22 of 24 checks passed
@djhoese djhoese deleted the hcl-cython branch August 24, 2023 01:47
@djhoese
Copy link
Member Author

djhoese commented Aug 24, 2023

As mentioned on slack, I will follow this PR up with an optimization PR and a PR to auto-generate the colormap images used in the docs.

@djhoese djhoese changed the title Add new Cython-based RGB to HCL conversion Switch to new Cython-based RGB to CIE LCH_ab conversion for colorizing Aug 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hcl2rgb bottleneck in colorize methods
2 participants