Make ggparcoord (and helper methods) into a separate package #499

Open
schloerke opened this issue Apr 23, 2024 · 4 comments

Comments

@schloerke
Member

schloerke commented Apr 23, 2024

ggparcoord() depends upon {scagnostics} which is difficult to install. It is such a pain to deal with!

If we made a new package ({ggparcoord}) that listed {scagnostics} under Suggests, then (by default) installing {GGally} would no longer require {scagnostics}.

Users would be happier if they did not have to install {scagnostics}. Those who need the package would be opting into installing it.
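
As a rough sketch of what "opting in" could look like at run time: a package listed under Suggests is not installed automatically, so the code that needs it has to check for it first. The helper name `check_suggested()` below is hypothetical and is not the package's actual code; it only illustrates the pattern with base R's `requireNamespace()`.

```r
# Hypothetical helper (an assumption for illustration, not GGally's code):
# stop with a clear message only when an optional package is really needed.
check_suggested <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(
      "Package '", pkg, "' is needed for this ordering. ",
      "Please install it and try again.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# An installed package passes silently.
check_suggested("stats")

# A missing package produces the error; here it is caught so the script continues.
msg <- tryCatch(
  check_suggested("notARealPackage12345"),
  error = function(e) conditionMessage(e)
)
```

With this pattern, only users who ask for a scagnostics-based ordering would ever see the install prompt.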

cc @92amartins

@schloerke
Member Author

What about using {cassowaryr} instead? https://github.com/numbats/cassowaryr

@92amartins
Collaborator

Hey, @harriet-mason!

We are considering using your package (cassowaryr) to replace the scagnostics calculation currently provided by the scagnostics package.

Do you think that would work well?

We specifically use the package in this block of code:

ggally/R/ggparcoord.R

Lines 467 to 473 in 1f58feb

  } else if (order %in% c(
    "Outlying", "Skewed", "Clumpy", "Sparse", "Striated", "Convex", "Skinny",
    "Stringy", "Monotonic"
  )) {
    require_namespaces("scagnostics")
    scag <- scagnostics::scagnostics(saveData2)
    data.m$variable <- factor(data.m$variable, levels = scag_order(scag, names(saveData2), order))

@harriet-mason

Hey @92amartins,

So, cassowaryr can calculate those scagnostics; however, you will likely get different results from the scagnostics package.

Unlike scagnostics, cassowaryr does not perform binning, which means points can get very close together, leading to infinitesimally small MST (minimum spanning tree) edge lengths. As a result, any scagnostic that uses MST lengths in the denominator of its calculation tends to be quite volatile and give unpredictable results. We tried to design more robust scagnostics to prevent these issues (such as clumpy2 and striated2); however, the calculations used by those scagnostics are fundamentally different from those in the original scagnostics paper by Leland Wilkinson and colleagues. Binning is something we have been hoping to implement, but we haven't had time yet.

Additionally, you would need to use the most recent development version of the package. We had a series of issues with changing dependencies that broke the package a couple of times, so the version on CRAN may throw errors for some scatter plots that the scagnostics package would have had no issue with. The current GitHub version should not have this problem, and we are going to do some additional checks before re-submitting the package to CRAN. Ultimately, whether or not cassowaryr would work well here depends on whether these trade-offs are better or worse than trying to install scagnostics haha.
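
To make the binning point above concrete, here is a small base-R sketch on made-up data. It relies on one fact: the shortest pairwise distance in a point set is always an edge of its minimum spanning tree. Snapping points to a square grid is only a stand-in for real binning (an assumption for illustration; it is not how either package bins), and the `1 / edge` ratio is a hypothetical quantity, not an actual scagnostic formula.

```r
# Made-up data: two nearly coincident points plus three spread-out ones.
xy <- rbind(
  c(0.100000, 0.100000),
  c(0.100001, 0.100000),  # almost on top of the first point
  c(0.5, 0.4),
  c(0.9, 0.2),
  c(0.3, 0.8)
)

# The shortest pairwise distance is always an MST edge, so this is the
# length of the shortest MST edge.
shortest_edge <- function(pts) min(dist(pts))

# Stand-in for binning (assumption): snap to a square grid, drop duplicates.
bin_points <- function(pts, step = 0.05) {
  unique(round(pts / step) * step)
}

raw_edge    <- shortest_edge(xy)
binned_edge <- shortest_edge(bin_points(xy))

# Hypothetical ratio with an MST edge length in the denominator.
ratio_raw    <- 1 / raw_edge     # very large, driven by the near-duplicate pair
ratio_binned <- 1 / binned_edge  # bounded, since edges cannot be shorter than `step`
```

Without binning, a single near-duplicate pair dominates the ratio; after binning, the pair collapses into one point and the shortest edge can never fall below the grid step.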

@92amartins
Collaborator

Good to know. Thanks for the input on that!


3 participants