
Median difference vs. difference in medians for the paired plot #105

Open
Generalized opened this issue May 21, 2021 · 1 comment

@Generalized

With paired data, the quantity of interest is usually the median difference (the median of the within-pair differences), not the difference in medians.
The mean of the differences always equals the difference in means, but the same does not hold for medians in general.
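A minimal sketch of the distinction, using three made-up pairs (the values are purely illustrative and not taken from any dataset): the two mean-based quantities agree, while the two median-based quantities do not.

```r
# Three hypothetical paired observations (illustrative values only)
a <- c(1, 2, 3)
b <- c(3, 1, 2)

mean_of_diffs   <- mean(a - b)            # mean of the paired differences
diff_of_means   <- mean(a) - mean(b)      # difference of the two group means
median_of_diffs <- median(a - b)          # median of the paired differences
diff_of_medians <- median(a) - median(b)  # difference of the two group medians

print(c(mean_of_diffs = mean_of_diffs, diff_of_means = diff_of_means))
print(c(median_of_diffs = median_of_diffs, diff_of_medians = diff_of_medians))
```

Here the paired differences are -2, 1, 1, so their median is 1, whereas both group medians are 2 and their difference is 0.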

> set.seed(100)
> a <- rnorm(100, mean = 1)
> b <- rnorm(100, mean=10)
> median(a-b)
[1] -8.987909
> median(a) - median(b)
[1] -8.986542
# Close, but not exactly equal (finite sampling)

> set.seed(100)
> a <- runif(100, 0, 10)
> b <- runif(100, 10, 20)
> median(a-b)
[1] -9.998856
> median(a) - median(b)
[1] -9.968535
# Close, but not exactly equal

> set.seed(100)
> a <- rlnorm(100)
> b <- rlnorm(100, meanlog = 1)
> median(a) - median(b)
[1] -1.585291
> median(a-b)
[1] -1.586252
# very close

> set.seed(100)
> a <- rlnorm(100)
> b <- rlnorm(100, meanlog = 1, sdlog = 3)
> median(a-b)
[1] -1.275271
> median(a) - median(b)
[1] -1.245357
# quite close

> set.seed(100)
> a <- rbeta(100, 10, 1)
> b <- rbeta(100, 1, 10)
> median(a)-median(b)
[1] 0.8524503
> median(a-b)
[1] 0.8345763
# quite close

But

> set.seed(100)
> a <- runif(100, 0, 10)
> b <- rnorm(100, 5, 1)
> median(a-b)
[1] -0.04137754
> median(a) - median(b)
[1] 0.1726379

> set.seed(100)
> a <- rnorm(100, 1, 1)
> b <- rlnorm(100, 0, 1)
> median(a-b)
[1] -0.1147998
> median(a)-median(b)
[1] 0.0107118

> set.seed(100); a <- runif(100, -1, 3); b <- rlnorm(100, 0, 2); median(a-b); median(a)-median(b)
[1] -0.2062541
[1] 0.2443354

> set.seed(100); a <- c(runif(50, -4, -1), rnorm(50, 2, 4)); b <- rnorm(100, 0, 4); median(a-b); median(a)-median(b);
[1] 0.2922939
[1] -1.24893

and so on.

Another example:
http://onbiostatistics.blogspot.com/2015/12/median-of-differences-versus-difference.html

Actually, I have never seen the difference in medians reported for paired data. What gets reported is rather the Hodges–Lehmann estimator of the pseudo-median (which the median change approximates when the distribution of changes is symmetric), or the relative effect.
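For reference, a rough base-R sketch of the one-sample Hodges–Lehmann estimator applied to paired differences: it is the median of all pairwise (Walsh) averages of the differences. The helper name `hl_pseudomedian` is made up for this illustration; `wilcox.test(a, b, paired = TRUE, conf.int = TRUE)` also reports a "(pseudo)median" estimate, which may differ slightly when it uses the normal approximation.

```r
# Hypothetical helper: Hodges-Lehmann estimate of the pseudo-median of d,
# computed as the median of all Walsh averages (d[i] + d[j]) / 2 with i <= j.
hl_pseudomedian <- function(d) {
  walsh <- outer(d, d, "+") / 2
  median(walsh[upper.tri(walsh, diag = TRUE)])
}

# Same data as one of the examples above (asymmetric differences)
set.seed(100)
a <- rnorm(100, 1, 1)
b <- rlnorm(100, 0, 1)
d <- a - b

print(hl_pseudomedian(d))  # Hodges-Lehmann estimate of the pseudo-median
print(median(d))           # plain median of the paired differences
```

Note that `outer()` builds an n-by-n matrix, so this sketch is only meant for moderate sample sizes.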

I looked at the code, and it computes the difference in medians:
median(treatment) - median(control)

Would you consider adding the median difference too?
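Something along these lines might work as a starting point. This is only a sketch: the function name `paired_median_difference` and its arguments are hypothetical, not part of the package's actual code, and it assumes `control` and `treatment` are numeric vectors in matching pair order.

```r
# Hypothetical helper (not the package's API): median of the paired differences.
# Assumes control[i] and treatment[i] belong to the same subject.
paired_median_difference <- function(control, treatment) {
  stopifnot(is.numeric(control), is.numeric(treatment),
            length(control) == length(treatment))
  # Drop pairs where either measurement is missing
  keep <- !is.na(control) & !is.na(treatment)
  median(treatment[keep] - control[keep])
}

# Usage with illustrative values
control   <- c(3, 1, 2)
treatment <- c(1, 2, 3)
print(paired_median_difference(control, treatment))
print(median(treatment) - median(control))
```

Dropping incomplete pairs is a design choice made for this sketch; the package may prefer to raise an error instead.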

@adamcc
Collaborator

adamcc commented May 22, 2021

Many thanks for the note; we'll aim to correct this.

I hope the summer interns can get to it by August. But if you can see a way to do it yourself, please let us know, and send a pull request when ready.
