Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't make Figure 10 style RainCloudPlot for multiple variables with huge difference in absolute value #21

Open
yanxianl opened this issue Oct 20, 2018 · 0 comments

Comments

@yanxianl
Copy link

Hi there,

Thank you for developing the RainCloudPlot, which has been my favorite way of presenting data ever since I saw it in twitter.

I was trying to make a RainCloudPlot in a similar style of Figure 10 (Repeated Measures Factorial Rainclouds) but without success. Any help is greatly appreciated!

Below are the dataframe and code to reproduce my problem.

Yanxian

# Before start, run the RainCloudPlot source code to get the function for making geom_flat_violin

### This script creates an R function to generate raincloud plots, then simulates 
### data for plots. If using for your own data, you only need lines 1-80. 
### It relies largely on code previously written by David Robinson 
### (https://gist.github.com/dgrtwo/eb7750e74997891d7c20) and ggplot2 by H Wickham

library(tidyverse)
#> Warning: ³Ì¼­°ü'tidyverse'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'ggplot2'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'tibble'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'tidyr'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'readr'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'purrr'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'dplyr'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'stringr'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: ³Ì¼­°ü'forcats'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ

# Defining the geom_flat_violin function. Note: the below code modifies the 
# existing github page by removing a parenthesis in line 50

"%||%" <- function(a, b) {
  if (!is.null(a)) a else b
}

geom_flat_violin <- function(mapping = NULL, data = NULL, stat = "ydensity",
                             position = "dodge", trim = TRUE, scale = "area",
                             show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    data = data,
    mapping = mapping,
    stat = stat,
    geom = GeomFlatViolin,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      trim = trim,
      scale = scale,
      ...
    )
  )
}

@rdname ggplot2-ggproto
@Format NULL
@Usage NULL
@export

GeomFlatViolin <-
  ggproto("GeomFlatViolin", Geom,
          setup_data = function(data, params) {
            data$width <- data$width %||%
              params$width %||% (resolution(data$x, FALSE) * 0.9)
            
            # ymin, ymax, xmin, and xmax define the bounding rectangle for each group
            data %>%
              group_by(group) %>%
              mutate(ymin = min(y),
                     ymax = max(y),
                     xmin = x,
                     xmax = x + width / 2)
            
          },
          
          draw_group = function(data, panel_scales, coord) {
            # Find the points for the line to go all the way around
            data <- transform(data, xminv = x,
                              xmaxv = x + violinwidth * (xmax - x))
            
            # Make sure it's sorted properly to draw the outline
            newdata <- rbind(plyr::arrange(transform(data, x = xminv), y),
                             plyr::arrange(transform(data, x = xmaxv), -y))
            
            # Close the polygon: set first and last point the same
            # Needed for coord_polar and such
            newdata <- rbind(newdata, newdata[1,])
            
            ggplot2:::ggname("geom_flat_violin", GeomPolygon$draw_panel(newdata, panel_scales, coord))
          },
          
          draw_key = draw_key_polygon,
          
          default_aes = aes(weight = 1, colour = "grey20", fill = "white", size = 0.5,
                            alpha = NA, linetype = "solid"),
          
          required_aes = c("x", "y")
  )

# Create a dataframe
Diet <- rep(rep(c("REF", "IM"), each = 18), 3)
Gut_segment <- rep(c("PI", "MI", "DI"), each = 36)
OS_index <- c(rnorm(36, mean=3, sd=1), rnorm(36, mean=0.2, sd=0.05), rnorm(36, mean=0.3, sd=0.08))
df <- data.frame(Diet, Gut_segment, OS_index)
head (df)
#>   Diet Gut_segment OS_index
#> 1  REF          PI 2.657419
#> 2  REF          PI 3.355020
#> 3  REF          PI 1.895698
#> 4  REF          PI 2.569957
#> 5  REF          PI 3.105802
#> 6  REF          PI 1.941960

# Define the desired order of "Diet" and "Gut_segment"
df$Diet <- factor(df$Diet, levels = c("REF", "IM"))
df$Gut_segment <- factor(df$Gut_segment, levels = c("PI", "MI", "DI"))

# make raincloud plot #######################################################################################

library(cowplot)
#> Warning: ³Ì¼­°ü'cowplot'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> 
#> ÔØÈë³Ì¼­°ü£º'cowplot'
#> The following object is masked from 'package:ggplot2':
#> 
#>     ggsave

# Make a function for setting the number of decimals shown on the graph
fmt_dcimals <- function(decimals = 0){
  function(x) format(x, nsmall = decimals,scientific = FALSE)
}

# First try with facet_wrap #################################################################################
ggplot(df, aes(x = Gut_segment, y = OS_index, fill = Diet)) +
  geom_flat_violin(aes(fill = Diet),
                   position = position_nudge(x = .1, y = 0), 
                   adjust = 1.5, 
                   trim = FALSE, 
                   alpha = .5, 
                   colour = NA) +
  geom_point(aes(x = as.numeric(Gut_segment)-.15, y = OS_index, colour = Diet),
             position = position_jitter(width = .05), 
             size = 1, 
             shape = 20) +
  geom_boxplot(aes(x = Gut_segment, y = OS_index, fill = Diet),
               outlier.shape = NA, 
               alpha = .5, 
               width = .1, 
               colour = "black") +
  scale_y_continuous(limits = c(0, NA), expand = c(0,0), labels = fmt_dcimals(1)) +
  ylab('Organosomatic index (%)') +
  facet_wrap(~Gut_segment, nrow = 1, scales="free_y") +
  scale_colour_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2")
#> Warning: ³Ì¼­°ü'bindrcpp'ÊÇÓÃR°æ±¾3.4.4 À´½¨ÔìµÄ
#> Warning: Removed 17 rows containing missing values (geom_flat_violin).

### COMMENTS:                                                   
### Using "Gut_segment" as independent variable resulted in showing all the gut segments on x axis in each plot.  
### If use "Diet" as independent variable, can't get the Figure 10 style.                                

# Second try: split the dataframe, make raincloud plot one by one and combine them together ################# 
# Split the dataframe by "Gut_segment" 
df_spl <- split(df, f = df$Gut_segment)

# make a raincloud plot for each data frame 
plist <- lapply(
  df_spl, 
  function(x) 
  {
    ggplot(x, aes(x = Gut_segment, y = OS_index, fill = Diet)) +
      geom_flat_violin(aes(fill = Diet),
                       position = position_nudge(x = .1, y = 0), 
                       adjust = 1.5, 
                       trim = FALSE, 
                       alpha = .5, 
                       colour = NA) +
      geom_point(aes(x = as.numeric(Gut_segment)-.15, y = OS_index, colour = Diet), 
                 position = position_jitter(width = .05), 
                 size = 1, 
                 shape = 20) +
      geom_boxplot(aes(x = Gut_segment, y = OS_index, fill = Diet),
                   outlier.shape = NA, 
                   alpha = .5, 
                   width = .1, 
                   colour = "black") +
      scale_y_continuous(limits = c(0, NA), expand = c(0,0), labels = fmt_dcimals(1)) +
      ylab('Organosomatic index (%)') +
      scale_colour_brewer(palette = "Dark2") +
      scale_fill_brewer(palette = "Dark2")
  }
)

plot_grid(plotlist = plist, ncol = 3)
#> Warning: Removed 17 rows containing missing values (geom_flat_violin).

### COMMENTS:                                                 
### Somehow, "Gut_segment" was converted to differnet numbers for different dataframe in the list, resulting
### in wrong alignments for the geom_point. Could not figure out a workaround.                                

Created on 2018-10-20 by the reprex package (v0.2.1)

Session info
devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                         
#>  version  R version 3.4.3 (2017-11-30)  
#>  os       Windows 10 x64                
#>  system   x86_64, mingw32               
#>  ui       RTerm                         
#>  language (EN)                          
#>  collate  Chinese (Simplified)_China.936
#>  ctype    Chinese (Simplified)_China.936
#>  tz       Europe/Berlin                 
#>  date     2018-10-20                    
#> 
#> - Packages --------------------------------------------------------------
#>  package      * version date       lib source        
#>  assertthat     0.2.0   2017-04-11 [1] CRAN (R 3.4.4)
#>  backports      1.1.2   2017-12-13 [1] CRAN (R 3.4.3)
#>  base64enc      0.1-3   2015-07-28 [1] CRAN (R 3.4.1)
#>  bindr          0.1.1   2018-03-13 [1] CRAN (R 3.4.4)
#>  bindrcpp     * 0.2.2   2018-03-29 [1] CRAN (R 3.4.4)
#>  broom          0.5.0   2018-07-17 [1] CRAN (R 3.4.3)
#>  callr          3.0.0   2018-08-24 [1] CRAN (R 3.4.4)
#>  cellranger     1.1.0   2016-07-27 [1] CRAN (R 3.4.4)
#>  cli            1.0.1   2018-09-25 [1] CRAN (R 3.4.4)
#>  colorspace     1.3-2   2016-12-14 [1] CRAN (R 3.4.4)
#>  cowplot      * 0.9.3   2018-07-15 [1] CRAN (R 3.4.4)
#>  crayon         1.3.4   2017-09-16 [1] CRAN (R 3.4.4)
#>  curl           3.2     2018-03-28 [1] CRAN (R 3.4.4)
#>  debugme        1.1.0   2017-10-22 [1] CRAN (R 3.4.4)
#>  desc           1.2.0   2018-05-01 [1] CRAN (R 3.4.4)
#>  devtools       2.0.0   2018-10-19 [1] CRAN (R 3.4.3)
#>  digest         0.6.18  2018-10-10 [1] CRAN (R 3.4.4)
#>  dplyr        * 0.7.7   2018-10-16 [1] CRAN (R 3.4.4)
#>  evaluate       0.12    2018-10-09 [1] CRAN (R 3.4.4)
#>  forcats      * 0.3.0   2018-02-19 [1] CRAN (R 3.4.4)
#>  fs             1.2.6   2018-08-23 [1] CRAN (R 3.4.4)
#>  ggplot2      * 3.0.0   2018-07-03 [1] CRAN (R 3.4.4)
#>  glue           1.3.0   2018-07-17 [1] CRAN (R 3.4.4)
#>  gtable         0.2.0   2016-02-26 [1] CRAN (R 3.4.4)
#>  haven          1.1.2   2018-06-27 [1] CRAN (R 3.4.4)
#>  hms            0.4.2   2018-03-10 [1] CRAN (R 3.4.4)
#>  htmltools      0.3.6   2017-04-28 [1] CRAN (R 3.4.4)
#>  httr           1.3.1   2017-08-20 [1] CRAN (R 3.4.4)
#>  jsonlite       1.5     2017-06-01 [1] CRAN (R 3.4.4)
#>  knitr          1.20    2018-02-20 [1] CRAN (R 3.4.4)
#>  labeling       0.3     2014-08-23 [1] CRAN (R 3.4.1)
#>  lattice        0.20-35 2017-03-25 [2] CRAN (R 3.4.3)
#>  lazyeval       0.2.1   2017-10-29 [1] CRAN (R 3.4.4)
#>  lubridate      1.7.4   2018-04-11 [1] CRAN (R 3.4.4)
#>  magrittr       1.5     2014-11-22 [1] CRAN (R 3.4.4)
#>  memoise        1.1.0   2017-04-21 [1] CRAN (R 3.4.4)
#>  mime           0.6     2018-10-05 [1] CRAN (R 3.4.4)
#>  modelr         0.1.2   2018-05-11 [1] CRAN (R 3.4.4)
#>  munsell        0.5.0   2018-06-12 [1] CRAN (R 3.4.4)
#>  nlme           3.1-137 2018-04-07 [1] CRAN (R 3.4.4)
#>  pillar         1.3.0   2018-07-14 [1] CRAN (R 3.4.4)
#>  pkgbuild       1.0.2   2018-10-16 [1] CRAN (R 3.4.3)
#>  pkgconfig      2.0.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  pkgload        1.0.1   2018-10-11 [1] CRAN (R 3.4.4)
#>  plyr           1.8.4   2016-06-08 [1] CRAN (R 3.4.4)
#>  prettyunits    1.0.2   2015-07-13 [1] CRAN (R 3.4.4)
#>  processx       3.2.0   2018-08-16 [1] CRAN (R 3.4.4)
#>  ps             1.1.0   2018-08-10 [1] CRAN (R 3.4.4)
#>  purrr        * 0.2.5   2018-05-29 [1] CRAN (R 3.4.4)
#>  R6             2.3.0   2018-10-04 [1] CRAN (R 3.4.4)
#>  RColorBrewer   1.1-2   2014-12-07 [1] CRAN (R 3.4.1)
#>  Rcpp           0.12.19 2018-10-01 [1] CRAN (R 3.4.4)
#>  readr        * 1.1.1   2017-05-16 [1] CRAN (R 3.4.4)
#>  readxl         1.1.0   2018-04-20 [1] CRAN (R 3.4.4)
#>  remotes        2.0.1   2018-10-19 [1] CRAN (R 3.4.3)
#>  rlang          0.2.2   2018-08-16 [1] CRAN (R 3.4.4)
#>  rmarkdown      1.10    2018-06-11 [1] CRAN (R 3.4.4)
#>  rprojroot      1.3-2   2018-01-03 [1] CRAN (R 3.4.4)
#>  rvest          0.3.2   2016-06-17 [1] CRAN (R 3.4.4)
#>  scales         1.0.0   2018-08-09 [1] CRAN (R 3.4.4)
#>  sessioninfo    1.1.0   2018-09-25 [1] CRAN (R 3.4.4)
#>  stringi        1.1.7   2018-03-12 [1] CRAN (R 3.4.4)
#>  stringr      * 1.3.1   2018-05-10 [1] CRAN (R 3.4.4)
#>  testthat       2.0.1   2018-10-13 [1] CRAN (R 3.4.4)
#>  tibble       * 1.4.2   2018-01-22 [1] CRAN (R 3.4.4)
#>  tidyr        * 0.8.1   2018-05-18 [1] CRAN (R 3.4.4)
#>  tidyselect     0.2.5   2018-10-11 [1] CRAN (R 3.4.4)
#>  tidyverse    * 1.2.1   2017-11-14 [1] CRAN (R 3.4.4)
#>  usethis        1.4.0   2018-08-14 [1] CRAN (R 3.4.4)
#>  withr          2.1.2   2018-03-15 [1] CRAN (R 3.4.4)
#>  xml2           1.2.0   2018-01-24 [1] CRAN (R 3.4.4)
#>  yaml           2.2.0   2018-07-25 [1] CRAN (R 3.4.4)
#> 
#> [1] C:/Users/ljt89/Documents/R/win-library/3.4
#> [2] C:/Program Files/R/R-3.4.3/library
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant