-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve corspearman performance (Matrix & Matrix-Matrix cases) #850
Conversation
if n = size(x,2) then this release reduces the number of calls made by corspearman(x) to tiedrank from n + n(n-1)/2 to n. Hence better speed.
Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
My testing showed that this suggested change does indeed reduce allocations and results in the method `tiedrank_nan` being a bit faster. Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
Thank you @devmotion for looking at this PR. My real interest is to improve the performance of |
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Hello @nalimilan and other maintainers of StatsBase, Is it possible for this PR to be accepted while there are failures against Julia nightly? I'd like to make a PR for changes to corkendall and it would be simpler to do that if this PR had gone through, since the relevant code is in the same file. Thanks! |
I plan to submit an alternative PR in which |
if
n = size(x,2)
then this PR reduces the number of calls made bycorspearman(x)
totiedrank
fromn + n(n-1)/2
ton
. Likewise, ifm = size(y,2)
, the number of calls made bycorspearman(x,y)
falls fromn + n*m
ton + m
. Hence a big speedup for multi-column input.In the tests below, the new version of
corspearman
is in a package KendallTau.After the change, the tests in
test/rankcorr.jl
all pass. The desired handling ofNaN
values is maintained too.