Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feed formulas into summarise() as objects #3544

Closed
Prometheus77 opened this issue Apr 25, 2018 · 4 comments
Closed

Feed formulas into summarise() as objects #3544

Prometheus77 opened this issue Apr 25, 2018 · 4 comments

Comments

@Prometheus77
Copy link

Prometheus77 commented Apr 25, 2018

I've spent a decent amount of time reading (and rereading and rereading) various explanations of NSE and dplyr programming, but have not been able to solve what seems like it should be a simple problem. I'd like to define summarization formulas in advance, and then write a function that feeds those formulas into summarise() for evaluation.

The motivation for this is to build a package specific to a set of data that sets up complicated aggregations of that data in a standard way that doesn't need to be remembered by the users, but allows them to quickly and easily pull the data and aggregate it properly, by only remembering the name of the aggregation.

Here's an example of what I mean:

price_per_carat <- ~ sum(price) / sum(carat)
price_depth_ratio <- ~ sum(price) / sum(depth)

summarise_as <- function(.data, ...) {
  # some kind of black magic
}

diamonds %>%
  group_by(cut, clarity)
  summarise_as(price_per_carat, price_depth_ratio)

# A tibble: 40 x 4
# Groups:   cut [?]
     cut clarity price_per_carat price_depth_ratio
   <ord>   <ord>           <dbl>             <dbl>
 1  Fair      I1        2721.185          56.37119
 2  Fair     SI2        4297.840          80.31302
 3  Fair     SI1        4362.573          65.84666
 4  Fair     VS2        4715.875          65.60159
 5  Fair     VS1        4734.064          66.17081
 6  Fair    VVS2        4843.546          53.29967
 7  Fair    VVS1        5824.159          64.07653
 8  Fair      IF        4030.679          31.83685
 9  Good      I1        2989.670          57.94697
10  Good     SI2        4424.404          73.58696
# ... with 30 more rows

In case that wasn't perfectly clear, I'm basically trying to find a way to get the summarise_as(price_per_carat, price_depth_ratio) to behave as I had typed summarise(price_per_carat = sum(price) / sum(carat), price_depth_ratio = sum(price) / sum(depth)).

I've tried all sorts of forms of quo(), enquo(), as.formula, deparse(substitute()), etc. to try to get it to work, and all have failed. What's the proper way to do this?

@romainfrancois romainfrancois self-assigned this Apr 25, 2018
@romainfrancois
Copy link
Member

If you want to use formulas, here is one way:

library(tidyverse)
library(rlang)
#> 
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#> 
#>     %@%, %||%, as_function, flatten, flatten_chr, flatten_dbl,
#>     flatten_int, flatten_lgl, invoke, list_along, modify, prepend,
#>     rep_along, splice

summarise_as <- function(.data, ...) {
  dots  <- quos(...)
  names <- map_chr(dots, ~as.character(quo_get_expr(.x)))
  names(dots) <- names
  
  get_expr <- function(q){
    formula <- eval_tidy(quo_get_expr(q), quo_get_env(q))
    expr(!!formula[[2]])
  }
  
  exprs <- map( dots, get_expr)
  summarise(.data, !!!exprs)
}

price_per_carat <- ~ sum(price) / sum(carat)
price_depth_ratio <- ~ sum(price) / sum(depth)

diamonds %>%
  group_by(cut, clarity) %>%
  summarise_as(price_per_carat, price_depth_ratio)
#> # A tibble: 40 x 4
#> # Groups:   cut [?]
#>    cut   clarity price_per_carat price_depth_ratio
#>    <ord> <ord>             <dbl>             <dbl>
#>  1 Fair  I1                2721.              56.4
#>  2 Fair  SI2               4298.              80.3
#>  3 Fair  SI1               4363.              65.8
#>  4 Fair  VS2               4716.              65.6
#>  5 Fair  VS1               4734.              66.2
#>  6 Fair  VVS2              4844.              53.3
#>  7 Fair  VVS1              5824.              64.1
#>  8 Fair  IF                4031.              31.8
#>  9 Good  I1                2990.              57.9
#> 10 Good  SI2               4424.              73.6
#> # ... with 30 more rows

maybe @lionel- has a better approach

@romainfrancois romainfrancois removed their assignment Apr 25, 2018
@romainfrancois
Copy link
Member

Maybe that's easier:

library(tidyverse)
library(rlang)
#> 
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#> 
#>     %@%, %||%, as_function, flatten, flatten_chr, flatten_dbl,
#>     flatten_int, flatten_lgl, invoke, list_along, modify, prepend,
#>     rep_along, splice

exprs <- list( 
  price_per_carat = expr(sum(price) / sum(carat)), 
  price_depth_ratio = expr(sum(price) / sum(depth))
)

diamonds %>%
  group_by(cut, clarity) %>%
  summarise(!!!exprs)
#> # A tibble: 40 x 4
#> # Groups:   cut [?]
#>    cut   clarity price_per_carat price_depth_ratio
#>    <ord> <ord>             <dbl>             <dbl>
#>  1 Fair  I1                2721.              56.4
#>  2 Fair  SI2               4298.              80.3
#>  3 Fair  SI1               4363.              65.8
#>  4 Fair  VS2               4716.              65.6
#>  5 Fair  VS1               4734.              66.2
#>  6 Fair  VVS2              4844.              53.3
#>  7 Fair  VVS1              5824.              64.1
#>  8 Fair  IF                4031.              31.8
#>  9 Good  I1                2990.              57.9
#> 10 Good  SI2               4424.              73.6
#> # ... with 30 more rows

Created on 2018-04-25 by the reprex package (v0.2.0).

@romainfrancois
Copy link
Member

romainfrancois commented Apr 25, 2018

I'll close this now. Perhaps consider https://community.rstudio.com for things that are questions more than issues.

@lock
Copy link

lock bot commented Oct 22, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Oct 22, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants