Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add cpp code #12

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft

add cpp code #12

wants to merge 8 commits into from

Conversation

mpadge
Copy link

@mpadge mpadge commented May 10, 2020

For #9, with benchmark something like this:

library(concaveman)
xy <- sf::st_coordinates(points)
library(rbenchmark)
benchmark(
          res <- concaveman_rcpp (xy),
               res <- concaveman (xy),
          replications = 10)
#>                         test replications elapsed relative user.self sys.self
#> 1 res <- concaveman_rcpp(xy)           10   0.106  1.000     0.106        0.000
#> 2      res <- concaveman(xy)           10   0.770  7.264     1.148        0.042
#>   user.child sys.child
#> 1          0         0
#> 2          0         0

Created on 2020-05-10 by the reprex package (v0.3.0)

This early PR only implements are barebones interface with the C++ code that accepts a two-column matrix input only.

TODO

  • Add Stanislaw Adaszewski as (aut,cph)
  • Add me as (aut) - if you're okay with that?
  • Mirror current dispatch methods for sf and matrix for rcpp code - or just replace current workhorse dispatch with much simpler call to rcpp_concaveman().

If you opt for complete replacement, then

  • ditch V8 Dep and all associated json round-trip stuff

I've only done a draft PR at this stage, as we'll definitely need to workshop this a bit prior to merging. Finally, if it helps to have an external opinion, the V8 Dep of current implementation is an absolute game killer for me, and so presumably for a number of others. I can't even install V8 on my current system setup, so can't directly use current package. Having pure C++ would not only allow a speed up of close to an order of magnitude, but would make installation requirements hugely easier.

@dcooley Would you care to at least offer an opinion or two in this regard, as well as to contribute however you might see fit, maybe including suggestions for how you would design the R-side interface? sf and matrix dispatch methods only, as in current version, or do you think it'd be worthwhile extending to other classes such as data.frame? Any other thoughts?


Oh, and one more thing: the two approaches will never give comparable results, because of underlying differences in the ways the hulls are initially constructed. The C++ code needs an initial convex hull, which i generate via grDevices::chull, which the .js does its own internally. The starting points are therefore different, and so are the end points. The parameters also have different effects because of this, so it will never be possible to compare results of the two. But you can easily visually inspect them to confirm that they are visually mighty similar, regardless of irreconcilable numeric differences.

@joelgombin
Copy link
Owner

Thanks so much @mpadge! I'm completely sold on replacing the js backend by your C++ implementation. However, because I think it needs some more work (in particular in terms of documentation - see the CI results above), I think I'll ship the current version (which is mainly fixes to accomodate upgrades from other packages - sf and dplyr mainly) to CRAN, and we'll make a new major upgrade with your implementation in a few weeks. Does that sound right to you?
And no problem of course to add Stanislaw Adaszewski and you as authors and cph - I'm not even sure I should stay among authors!

@joelgombin
Copy link
Owner

And yes I'd go for full replacement, I don't really see a point in keeping the two implementations, unless some people are attached to the js one? In which case we could make V8 a Suggests rather than Imports?

@dcooley
Copy link

dcooley commented May 10, 2020

a few thoughts

  1. Move concaveman.h into /inst/include/concaveman/ (and add Makevars so it can be found). This means other packages can link to this header also.

  2. Potentially make a /inst/include/concaveman/api.hpp file, where most of your current .cpp code will live. This means other packages can link and call this code directly through C++.

  3. This one may be biased, but I would also remove sf and replace with sfheaders, to remove the requirement to have GDAL, GEOS and PROJ.4 as SystemRequirements.

If you're happy with these suggestions, what's the best way to update this PR; do I fork mpadge/concaveman#rcpp and update that?

@mpadge
Copy link
Author

mpadge commented May 11, 2020

Thanks heaps @dcooley, and yeah, I'd agree with your suggestion for how to do this: You fork my fork, PR to that, we back-and-forth with @joelgombin, and he approves final merge. I also completely agree with your suggestion of dumping sf in favour of sfheaders. If you want to do those bits, I'll tend to the parts listed above. I also want to play around a bit with the template interface, especially to work out whether it's computationally worthwhile providing distinct <int> and <double> template instances. Is the approach always to provide an exact copy of ./src/* in ./inst/include/concaveman/.h(pp)?

And @joelgombin, no worries of course about you doing a release in the meantime with V8 - probably a good strategy. Finally, I of course entirely disagree about your joking comment of not staying "among authors" - this package was your idea, so you will always own it, and for that we'll remain grateful!

@dcooley
Copy link

dcooley commented May 11, 2020

Cool - I'll have a go at it tomorrow.

Is the approach always to provide an exact copy of ./src/* in ./inst/include/concaveman/.h(pp)?

No, generally, with a templated header approach, your ./src/ .cpp files will be bare-bones interfaces to the full implementations in the .hpp files.

I think a good example of this is, I recently updated decido by making a decido.hpp file, moved your original cpp earcut() code into there, and made another overloaded function with the same name that accepts a single SEXP polygon object.

Then the .cpp file now only needs to include that .hpp, then call one of the overloaded earcut() functions.

@joelgombin
Copy link
Owner

I didn't know about sfheaders but it seems great to be able not to depend on sf. It will definitively make my/our job easier for the next CRAN release... I can have a look at replacing sf by sfheader.

@mpadge
Copy link
Author

mpadge commented May 11, 2020

@dcooley done most ./inst stuff now - thanks for the example of your fine decido work - excellent templating there! Speaking of which ... that's the one thing yet to be implemented here; current code is just a dumb hard-coded interface (./inst/include/concaveman/api.hpp). The way I see it, we can dispatch from R on only 2 forms of .matrix and .data.frame (with suitable column names, indices, or whatever info is necessary). I still want to first check out the <int> vs <double> stuff before doing any of that ...


EDIT: (using hacky templating for now):

library(concaveman)
xy <- sf::st_coordinates (points)
xy_int <- round (xy * 1000)
storage.mode (xy_int) <- "integer"
library (rbenchmark)
benchmark (
           h <- concaveman_rcpp (xy_int, concavity = 3),
           h <- concaveman_rcpp (xy, concavity = 3),
           replications = 10) [, 1:4]
#>                                          test replications elapsed relative
#> 1 h <- concaveman_rcpp(xy_int, concavity = 3)           10   0.021  1.000
#> 2     h <- concaveman_rcpp(xy, concavity = 3)           10   0.052  2.476

Created on 2020-05-11 by the reprex package (v0.3.0)

This is because the most of the algorithmic operations are subtractions, and these are way more efficient for <int>. Definitely worth distinguishing these two cases via templates!

@dcooley
Copy link

dcooley commented May 11, 2020

@mpadge There's a PR waiting in your repo.

@joelgombin the sfheaders code can either be incorporated into your existing R functions as a replacement for the sf code, or we can integrate it into the C++ code

@mpadge
Copy link
Author

mpadge commented Aug 4, 2022

@joelgombin Sorry for complete silence on this one 🙏 - i do still intend to finish this. In the meantime, ping @mdsumner who has also been playing with this code in case he has anything useful to contribute. Joel: Would this PR still be useful to you?

@joelgombin
Copy link
Owner

joelgombin commented Aug 4, 2022 via email

@mdsumner
Copy link

mdsumner commented Aug 4, 2022

oh I didn't read this thread enough ... 😅 all good, and fwiw don't mind where the code ends up ... I translated to cpp11 here just yesterday

https://github.com/mdsumner/cchull

as headers would be better, happy to be involved 👌

@mdsumner
Copy link

mdsumner commented Aug 4, 2022

I'd suggest keeping the core code in a dev package, and dep on that - I definitely want this independent of sf - geos has its own now but it's obviously higher level than this raw algo, would have to ensure validity etc - but I consider that a user-responsibility with the available params

@mpadge
Copy link
Author

mpadge commented Aug 5, 2022

sf is only used here for the very lightweight concaveman.sf() conversion, which could easily be handled by headers. There'd be no CRS validation, but i totally agree: user-responsibility. So ditching current sf hard-dep would be easy, and should be done.

@mdsumner
Copy link

mdsumner commented Aug 5, 2022

yes, we really need some community effort on getting basic algorithms organised in R

currently everyone doing it over and over in different ways is a disaster 🙏

but how, we could end up with three copies of this code now just because we don't collectively have a framework to build upon, I know it's off topic here ... but it's really become a huge problem for me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants