Use binary search for classify tools in CuPy case #762

Closed
wants to merge 3 commits into from


thuydotm
Contributor

@thuydotm thuydotm commented Jan 19, 2023

Fixes #761

  • Try binary search in _gpu_bin().
  • Benchmark to see whether performance improves.
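
For context, the idea can be sketched in plain Python. This is only an illustration, not the code in xrspatial/classify.py: the function names `bin_linear` and `bin_binary` are hypothetical, and the sketch assumes `bins` holds ascending, upper-inclusive bin edges and that a value above the last edge (or NaN) gets no bin. Both functions use only loops and scalar comparisons, so the same logic could plausibly be moved into a GPU device function.

```python
def bin_linear(value, bins):
    """Linear scan: index of the first bin whose upper edge is >= value, else -1."""
    for i in range(len(bins)):
        if value <= bins[i]:
            return i
    return -1


def bin_binary(value, bins):
    """Binary search returning the same index as bin_linear in O(log n) comparisons."""
    # NaN compares False against everything; handle it explicitly so the
    # result matches the linear scan (no bin).
    if value != value:
        return -1
    lo, hi = 0, len(bins)
    while lo < hi:
        mid = (lo + hi) // 2
        if bins[mid] < value:
            lo = mid + 1   # value lies strictly above this edge
        else:
            hi = mid       # this edge is a candidate upper bound
    return lo if lo < len(bins) else -1


# Hypothetical bin edges, for illustration only.
edges = [1.0, 2.5, 4.0, 10.0]
for v in (0.0, 2.5, 3.9, 10.1):
    print(v, bin_linear(v, edges), bin_binary(v, edges))
```

With only a handful of bins, the linear scan already needs very few comparisons per cell, so the saving from binary search is small per element and may not show up in end-to-end timings.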

@codecov-commenter

Codecov Report

Base: 79.90% // Head: 79.90% // No change to project coverage 👍

Coverage data is based on head (4f0e1d8) compared to base (e47c278).
Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #762   +/-   ##
=======================================
  Coverage   79.90%   79.90%           
=======================================
  Files          19       19           
  Lines        4175     4175           
=======================================
  Hits         3336     3336           
  Misses        839      839           
Impacted Files Coverage Δ
xrspatial/classify.py 78.43% <0.00%> (ø)


@thuydotm thuydotm added the WIP Work in progress label Jan 19, 2023
@thuydotm
Contributor Author

I compared benchmarks with and without a naive binary search in _gpu_bin(). Performance does not seem to improve: all ratios below fall between 0.91 and 1.10. We should investigate further whether binary search can be implemented better for the GPU, or simply not use it.

$ asv compare b6c683b e47c278

All benchmarks:

                                      ratio
     [b6c683b5]       [e47c2784]
     <classify_binary_search_gpu>       <classify_binary_search_gpu~3>
      2.14±0.08ms      2.10±0.06ms     0.98  classify.EqualInterval.time_equal_interval(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       2.07±0.1ms      2.13±0.09ms     1.02  classify.EqualInterval.time_equal_interval(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      2.63±0.09ms      2.71±0.08ms     1.03  classify.EqualInterval.time_equal_interval(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       2.54±0.2ms       2.75±0.2ms     1.08  classify.EqualInterval.time_equal_interval(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      82.4±0.03ms      82.4±0.09ms     1.00  classify.EqualInterval.time_equal_interval(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      82.3±0.03ms      82.4±0.04ms     1.00  classify.EqualInterval.time_equal_interval(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      2.13±0.06ms      2.06±0.07ms     0.97  classify.EqualInterval.time_equal_interval(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       2.10±0.1ms       2.20±0.1ms     1.05  classify.EqualInterval.time_equal_interval(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       9.09±0.1ms      9.84±0.08ms     1.08  classify.EqualInterval.time_equal_interval(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       9.11±0.1ms      9.87±0.08ms     1.08  classify.EqualInterval.time_equal_interval(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed           failed      n/a  classify.NaturalBreaks.time_natural_breaks(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed           failed      n/a  classify.NaturalBreaks.time_natural_breaks(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      2.86±0.09ms       2.88±0.1ms     1.01  classify.Quantile.time_quantile(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       2.92±0.1ms       2.99±0.1ms     1.03  classify.Quantile.time_quantile(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      3.21±0.06ms       3.19±0.1ms     1.00  classify.Quantile.time_quantile(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      3.11±0.06ms      3.20±0.06ms     1.03  classify.Quantile.time_quantile(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      30.1±0.04ms      30.1±0.09ms     1.00  classify.Quantile.time_quantile(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      30.1±0.02ms      30.1±0.07ms     1.00  classify.Quantile.time_quantile(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       2.97±0.1ms      2.95±0.06ms     0.99  classify.Quantile.time_quantile(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      2.97±0.05ms       2.93±0.1ms     0.99  classify.Quantile.time_quantile(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
      5.08±0.06ms       5.23±0.1ms     1.03  classify.Quantile.time_quantile(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
       5.06±0.1ms      5.14±0.06ms     1.02  classify.Quantile.time_quantile(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      1.62±0.09ms      1.71±0.08ms     1.05  classify.Reclassify.time_reclassify(100, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      1.55±0.09ms       1.71±0.1ms     1.10  classify.Reclassify.time_reclassify(100, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      1.65±0.09ms      1.82±0.07ms    ~1.10  classify.Reclassify.time_reclassify(1000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      1.71±0.08ms       1.71±0.1ms     1.00  classify.Reclassify.time_reclassify(1000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      5.75±0.02ms      5.79±0.05ms     1.01  classify.Reclassify.time_reclassify(10000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      5.76±0.03ms      5.77±0.06ms     1.00  classify.Reclassify.time_reclassify(10000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      1.60±0.05ms      1.73±0.08ms     1.09  classify.Reclassify.time_reclassify(300, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
       1.59±0.1ms      1.70±0.07ms     1.07  classify.Reclassify.time_reclassify(300, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      2.08±0.07ms      1.90±0.04ms     0.91  classify.Reclassify.time_reclassify(3000, 10, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
           failed              n/a      n/a  classify.Reclassify.time_reclassify(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit10.1-cupy-pyct]
      2.00±0.06ms       1.93±0.1ms     0.97  classify.Reclassify.time_reclassify(3000, 100, 'cupy') [nvidia-ngc-base-test-b-1-vm/conda-py3.9-cudatoolkit11.2-cupy-pyct]
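
As a reading aid for the table above, the ratio column is the second timing divided by the first. Because the command was `asv compare b6c683b e47c278`, the first column is the binary-search branch and the second is the commit three steps earlier, so a ratio above 1.00 means the earlier commit was slower. The sketch below recomputes a few ratios from rows in the table. The helper names are hypothetical, and the 1.1 threshold assumes asv's default `--factor`; asv's own significance check also uses the ± spread, which this sketch ignores.

```python
def ratio(first_ms, second_ms):
    """Second column divided by first, matching the ratio column."""
    return second_ms / first_ms


def within_factor(r, factor=1.1):
    """True if the change is smaller than the given factor in either direction."""
    return 1.0 / factor < r < factor


# (benchmark, first column ms, second column ms), copied from the table above.
rows = [
    ("time_equal_interval(3000, 10, 'cupy')", 9.09, 9.84),
    ("time_reclassify(3000, 10, 'cupy')", 2.08, 1.90),
    ("time_quantile(10000, 10, 'cupy')", 30.1, 30.1),
]

for name, first, second in rows:
    r = ratio(first, second)
    print(f"{name}: ratio={r:.2f} within_factor={within_factor(r)}")
```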

@brendancol
Contributor

brendancol commented Feb 16, 2023

@thuydotm Is there any reason not to mark this as ready for review? Or, since there isn't a big performance gain, should we close this PR? Happy to follow your lead on this.

@brendancol
Contributor

> investigate more to see whether we can better implement binary search for GPU

Added an issue for this here: #767

@thuydotm
Contributor Author

Closing this PR, as the proposed implementation does not improve performance.

@thuydotm thuydotm closed this May 25, 2023
Labels
WIP Work in progress
Development

Successfully merging this pull request may close these issues.

Classify tools: use binary search in _gpu_bin()
3 participants