Dot sizes are inconsistent in faceted quantile dot plots. #191

Open
groceryheist opened this issue Aug 5, 2023 · 4 comments


@groceryheist

I want to use stat_dotsinterval in a faceted plot. The variance of the data is different within each facet, but comparing across facets is important. The dots in the higher-variance facets are larger than the dots in the lower-variance facets. I would like to choose the scale so that the dots are the same size in every facet and as large as they can be while still fitting the data from all plots in each row.

Reproducible example.

library(ggplot2)
library(ggdist)

x <- c(rep(rnorm(500,sd=1),3), rep(rnorm(500,sd=0.5),3), rep(rnorm(500,sd=0.1),3))
group.x <- c(1,2,3)
group.y <- c('a','b','c')
y <- c('i','ii','iii')

df <- expand.grid(x=rnorm(500),group.x=group.x, group.y=group.y, y=y)
df[['x']] <- x

ggplot(aes(x=x, y=y), data=df) +
  stat_dotsinterval(quantiles=50, linewidth=0.5, point_size=1, shape=1) +
  facet_grid(group.x ~ group.y, as.table=FALSE)

Here's the result:
[Image: Screenshot_2023-08-05_14-26-03]

As a simple workaround, I can set binwidth in stat_dotsinterval, but I am making several similar plots with different data, so their scales will be inconsistent.

@mjskay
Owner

mjskay commented Aug 5, 2023

Yeah, unfortunately you can't automatically set the same binwidth across facets at the moment. This is due to a limitation of ggplot2: to automatically determine binwidths, we have to know the plot dimensions. However, plot dimensions are not known when graphical objects (grobs) used to display the data are created by ggplot2 geometries. Thus, geom_dots uses a custom grob to draw dotplots that calculates a good binwidth for all the dotplots in a single facet. ggplot2 draws grobs at a facet level, so a single grob can't be used to draw dotplots across multiple facets (so far as I can tell), thus automatic binwidth selection can't be used across facets.

(though it does occur to me in writing this that it might be possible to use a hacky approach to layer manipulation to get around the limitations of ggplot2 and allow binwidths to be shared across facets, based on some approaches I used in ggblend... but not sure if/when I'll implement it...)

There are some workarounds:

  1. Pass verbose = TRUE to stat_dotsinterval() and it will tell you the binwidth of each dots layer, which you can use to set a binwidth manually. This could be hard to automate if you are doing a large number of charts.
  2. Manually create the data for the dot positions using find_dotplot_binwidth, bin_dots, and coord_fixed(). Takes a bit more work, but should be possible to automate.
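
A minimal sketch of what workaround 2 might look like, assuming one raw sample per facet. The `samples` list, the `max_height` value, and the choice of `facet_wrap()` are illustrative assumptions, not something taken from the issue. The idea is to ask `find_dotplot_binwidth()` for the largest binwidth that fits each facet, take the smallest of those, and then compute dot positions for every facet with `bin_dots()` using that one shared binwidth.

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Hypothetical data: one sample per facet, each with a different spread
samples <- list(
  a = rnorm(50, sd = 1),
  b = rnorm(50, sd = 0.5),
  c = rnorm(50, sd = 0.1)
)

# Assumed maximum stack height, in the same data units as x
max_height <- 2

# Largest binwidth that fits each facet; the smallest one fits all of them
binwidths <- sapply(samples, find_dotplot_binwidth, maxheight = max_height)
shared_binwidth <- min(binwidths)

# Dot positions for every facet, all computed with the shared binwidth
dots <- do.call(rbind, lapply(names(samples), function(g) {
  d <- bin_dots(samples[[g]], y = 0, binwidth = shared_binwidth)
  d$facet <- g
  d
}))

# coord_fixed() keeps x and y on the same scale so the stacks are not distorted;
# the geom_point() size is in mm and still has to be tuned by hand to match
p <- ggplot(dots, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~ facet) +
  coord_fixed()
```

Taking the minimum mirrors the goal in the original question: dots of equal size everywhere, as large as possible while still fitting the most crowded facet.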

@groceryheist
Author

Thanks for the response! I am going to try (1), automating it by capturing and parsing stdout, because it seems like option (2) doesn't do quantile dot plots. I'm not 100% sure of the latter. Is that right?

@mjskay
Owner

mjskay commented Aug 7, 2023

Option 2 can do quantile dotplots if you construct them manually, which is generally not too much of a pain: e.g., if you have a sample x that you want a 100-dot quantile dotplot for, you just plot a dotplot of quantile(x, ppoints(100)).
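
As a small base-R illustration of that reduction (the sample `x` here is made up for the example), `ppoints(100)` gives 100 evenly spaced probabilities and `quantile()` maps them to 100 representative values, which can then be passed to whatever draws the dotplot:

```r
set.seed(1)
# Hypothetical sample standing in for your data or posterior draws
x <- rnorm(5000, mean = 10, sd = 2)

# 100 evenly spaced probabilities: 0.005, 0.015, ..., 0.995
p <- ppoints(100)

# The 100 quantiles that become the dots of the quantile dotplot
q <- quantile(x, p)

length(q)  # 100 values, one per dot
```

Plotting `q` instead of `x` is what turns an ordinary dotplot into a quantile dotplot, so each dot represents roughly 1% of the distribution.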

@groceryheist
Author

groceryheist commented Aug 7, 2023

That seems like a better solution, but I already have the hackier version of Option 1. Pasting below for others looking for a quick fix.


library(ggplot2)
library(ggdist)

x <- c(rep(rnorm(500,sd=1),3), rep(rnorm(500,sd=0.5),3), rep(rnorm(500,sd=0.1),3))
group.x <- c(1,2,3)
group.y <- c('a','b','c')
y <- c('i','ii','iii')

df <- expand.grid(x=rep(0,500),group.x=group.x, group.y=group.y, y=y)
df[['x']] <- x
# First pass: render with verbose=TRUE so the automatic binwidths are reported as messages
ptemp <- ggplot(aes(x=x, y=y), data=df) +
  stat_dotsinterval(quantiles=50, linewidth=0.5, point_size=1, shape=1, verbose=TRUE) +
  facet_grid(group.x ~ group.y, as.table=FALSE)
plotmessages <- capture.output({ggsave(plot=ptemp, filename="tmpplot.pdf", width=7, height=7*.65)}, type="message")
# Parse the reported binwidths out of the captured messages
matches <- regexec("binwidth = ([[:digit:]\\.]*)", plotmessages)
binwidths <- sapply(regmatches(plotmessages, matches), function(m) m[2])
binwidths <- as.numeric(binwidths[!is.na(binwidths)])
# Use the smallest binwidth so the dots fit in every facet
binwidth <- min(binwidths)
# Second pass: redraw with the shared binwidth
p <- ggplot(aes(x=x, y=y), data=df) +
  stat_dotsinterval(binwidth=binwidth, quantiles=50, linewidth=0.5, point_size=1, shape=1) +
  facet_grid(group.x ~ group.y, as.table=FALSE)

ggsave(plot=p, filename='equal_binwidth.pdf', width=7, height=7*.65)
