Group-wise color in scatterplot #113

Open
araikes opened this issue Apr 19, 2019 · 4 comments

araikes commented Apr 19, 2019

Hello,
First, thank you for your awesome product. I really appreciate the data exploration tool you've put together.

I'm trying to figure out if certain functionality exists. I'm running DataExplorer v 0.8.0.

I have the following dataframe:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':	32 obs. of  7 variables:
 $ group          : Factor w/ 2 levels "amber","blue": 1 2 2 1 2 1 1 2 2 2 ...
 $ ess_score      : num  13 14 9 12 1 3 11 8 5 8 ...
 $ rpcsq_rpq3     : num  2 4 2 3 5 0 6 4 0 0 ...
 $ rpcsq_rpq13    : num  4 16 19 2 16 4 15 23 12 16 ...
 $ rpcsq_cognitive: num  0 4 8 0 6 2 2 10 6 7 ...
 $ rpcsq_somatic  : num  6 14 13 3 15 2 12 10 6 6 ...
 $ rpcsq_emotional: num  0 2 0 2 0 0 7 7 0 3 ...

I'd like to produce scatterplots of ess_score against each of the five rpcsq variables (5 scatterplots), with the points colored by group. I've tried the following:

> plot_scatterplot(tmp, by = "ess_score")

This works but doesn't color the points. The following calls, however, all fail to produce the plots:

> plot_scatterplot(tmp, by = "ess_score", geom_point_args = list(col = "group"))
Error in grDevices::col2rgb(colour, TRUE) : invalid color name 'group'
> plot_scatterplot(tmp, by = "ess_score", geom_point_args = list(col = group))
Error in do.call("geom_point", geom_point_args) : 
  object 'group' not found
> plot_scatterplot(tmp, by = "ess_score", geom_point_args = list(group = "group", col = "group"))
Error in grDevices::col2rgb(colour, TRUE) : invalid color name 'group'

Does the functionality I'm looking for exist in the current version of DataExplorer? Thanks for any help you can offer.
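
A note on the errors above: the second error message shows geom_point_args being handed to geom_point through do.call, so col = "group" is read as a literal colour name rather than as a column. A minimal sketch of the difference in plain ggplot2, assuming a small stand-in data frame (toy is hypothetical, built from the first four values in the str() output above):

```r
library(ggplot2)

# Hypothetical stand-in for `tmp`: first four rows from the str() output.
toy <- data.frame(
  group      = factor(c("amber", "blue", "blue", "amber")),
  ess_score  = c(13, 14, 9, 12),
  rpcsq_rpq3 = c(2, 4, 2, 3)
)

# Constant: every point is drawn in the literal colour "red".
# Passing "group" here instead would fail, since it is not a colour name.
p_const <- ggplot(toy, aes(x = ess_score, y = rpcsq_rpq3)) +
  geom_point(colour = "red")

# Mapping: colour is looked up row by row in the `group` column.
p_mapped <- ggplot(toy, aes(x = ess_score, y = rpcsq_rpq3)) +
  geom_point(aes(colour = group))
```

Only the second form ties colour to the data, which is why a column name passed as a plain argument is rejected.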

boxuancui (Owner) commented Apr 19, 2019

Thanks for using DataExplorer. For this, you will need to tweak the source code a little. Copy and paste the following function, and you should be able to pass a grouping column through the group argument.

library(DataExplorer)
library(data.table)
library(ggplot2)

# Variant of plot_scatterplot() with an extra `group` argument:
# the name of the column used to color the points.
plot_scatterplot2 <- function(data, by, group, sampled_rows = nrow(data),
                              geom_point_args = list(), title = NULL,
                              ggtheme = theme_gray(), theme_config = list(),
                              nrow = 3L, ncol = 3L, parallel = FALSE) {
  variable <- NULL  # placeholder for the data.table column referenced below
  if (!is.data.table(data)) data <- data.table(data)
  # Optionally work on a random subset of rows
  if (sampled_rows < nrow(data)) data <- data[sample.int(nrow(data), sampled_rows)]
  # Reshape to long format, keeping both `by` and `group` as id columns
  # (melt() dispatches to the data.table method)
  dt <- suppressWarnings(melt(data, id.vars = c(by, group), variable.factor = FALSE))
  feature_names <- unique(dt[["variable"]])
  # Internal DataExplorer helpers: split the features into pages and build one plot per page
  layout <- DataExplorer:::.getPageLayout(nrow, ncol, length(feature_names))
  plot_list <- DataExplorer:::.lapply(
    parallel = parallel,
    X = layout,
    FUN = function(x) {
      # `color = group` maps the grouping column to point color
      ggplot(dt[variable %in% feature_names[x]], aes_string(x = by, y = "value", color = group)) +
        do.call("geom_point", geom_point_args) +
        coord_flip() +
        xlab(by)
    }
  )
  class(plot_list) <- c("multiple", class(plot_list))
  plotDataExplorer(
    plot_obj = plot_list,
    page_layout = layout,
    title = title,
    ggtheme = ggtheme,
    theme_config = theme_config,
    facet_wrap_args = list(
      "facet" = ~ variable,
      "nrow" = nrow,
      "ncol" = ncol,
      "scales" = "free_x",
      "shrink" = FALSE
    )
  )
}

Then just do:

plot_scatterplot2(tmp, by = "ess_score", group = "group")

I tested it on iris and it works fine:

plot_scatterplot2(iris, by = "Sepal.Length", group = "Species")
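
The key change in plot_scatterplot2 is the reshape step: group is listed in id.vars alongside by, so it survives as its own column and can be mapped to color. A small sketch of that step in isolation, assuming a stand-in table (toy is hypothetical, built from the first four values in the str() output earlier in the thread):

```r
library(data.table)

# Hypothetical stand-in for `tmp`.
toy <- data.table(
  group       = factor(c("amber", "blue", "blue", "amber")),
  ess_score   = c(13, 14, 9, 12),
  rpcsq_rpq3  = c(2, 4, 2, 3),
  rpcsq_rpq13 = c(4, 16, 19, 2)
)

# Keep ess_score and group as id columns; stack the remaining columns
# into a `variable` / `value` pair.
long <- melt(toy, id.vars = c("ess_score", "group"), variable.factor = FALSE)

# One row per (observation, measured variable), with group still attached.
print(long)
```

If group were left out of id.vars, it would be stacked into value together with the numeric scores and would no longer be available as a color variable.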

boxuancui (Owner) commented

Please also keep this issue open. I might be able to add this in a future version, but I can't promise which one.

boxuancui added the "type: enhancement" label (Improvement on an existing feature) on Apr 19, 2019
boxuancui self-assigned this on Apr 19, 2019

araikes (Author) commented Apr 19, 2019

Thanks @boxuancui. I'll give it a shot.

khughitt commented Jun 1, 2019

In general, the ability to color (or set other ggplot2 aesthetics) by groups defined in a column would be quite useful for many of the plotting functions (plot_density, etc.).
