Creating multiple one or n-way tables #36

mbcann01 opened this issue Nov 20, 2020 · 2 comments

Comments

mbcann01 commented Nov 20, 2020

Currently using purrr::map_df. Here is an example from the L2C quarterly report:

# Loop over all categorical vars
cat_stats <- purrr::map_df(
  quos(gender_f, race_3cat_f, hispanic_f), 
  function(x) {
    demographics %>%
      filter(screened_in == 1) %>% 
      freq_table({{x}}) %>%
      freq_format(recipe = "n (percent)", digits = 1) %>%
      select(var, cat, formatted_stats) %>%
      # Add a row with the var name only
      add_row(var = quo_name(x), .before = 1) %>% 
      # Add blank row below
      add_row(var = "", cat = "", formatted_stats = "")
  }
)

I should either:

  1. Create a wrapper function to make this easier to read.
  2. Document using purrr::map_df really well.
  3. Both.
@mbcann01 mbcann01 moved this from To do to In progress in Bug fixes and enhancements Jun 1, 2021
@mbcann01 mbcann01 removed this from In progress in Bug fixes and enhancements Jun 1, 2021

mbcann01 commented Jul 10, 2022

Make a wrapper?

freq_tables <- function(.data, ...) {
  purrr::map(
    .x = enquos(...),
    .f = ~ .data %>% freq_table({{ .x }})
  )
}

mtcars %>% 
  group_by(am) %>% 
  freq_tables(cyl, vs)

This returns a list of data frames.

@mbcann01

Multiple 2-way tables when multiple variables are passed to freq_table()

Previously, passing two variable names to the freq_table() function would create a two-way table. After issue #40, it still does, but not for much longer. After a year or so, we are going to remove ... from freq_table(), which will make that functionality go away. At that point, passing two variables to freq_table() could create multiple two-way tables. Here is some code demonstrating what that might look like.

cyl is the outcome variable of interest.

mtcars %>% 
  freq_table(cyl)
  var cat  n n_total percent       se   t_crit      lcl      ucl
1 cyl   4 11      32  34.375 8.530513 2.039513 19.49961 53.11130
2 cyl   6  7      32  21.875 7.424859 2.039513 10.34883 40.44691
3 cyl   8 14      32  43.750 8.909831 2.039513 27.09672 61.94211

Now, cyl within levels of am

mtcars %>% 
  group_by(am) %>%
  freq_table(cyl)
  row_var row_cat col_var col_cat     n n_row n_total percent_total se_total t_crit_total lcl_total ucl_total percent_row se_row
  <chr>   <chr>   <chr>   <chr>   <int> <int>   <int>         <dbl>    <dbl>        <dbl>     <dbl>     <dbl>       <dbl>  <dbl>
1 am      0       cyl     4           3    19      32          9.38     5.24         2.04      2.86      26.7        15.8   8.59
2 am      0       cyl     6           4    19      32         12.5      5.94         2.04      4.51      30.2        21.1   9.61
3 am      0       cyl     8          12    19      32         37.5      8.70         2.04     22.0       56.1        63.2  11.4 
4 am      1       cyl     4           8    13      32         25        7.78         2.04     12.5       43.7        61.5  14.0 
5 am      1       cyl     6           3    13      32          9.38     5.24         2.04      2.86      26.7        23.1  12.2 
6 am      1       cyl     8           2    13      32          6.25     4.35         2.04      1.45      23.2        15.4  10.4 
# … with 3 more variables: t_crit_row <dbl>, lcl_row <dbl>, ucl_row <dbl>

That is the result we want. However, this works too.

mtcars %>% 
  freq_table(am, cyl)
  row_var row_cat col_var col_cat     n n_row n_total percent_total se_total t_crit_total lcl_total ucl_total percent_row se_row
  <chr>   <chr>   <chr>   <chr>   <int> <int>   <int>         <dbl>    <dbl>        <dbl>     <dbl>     <dbl>       <dbl>  <dbl>
1 am      0       cyl     4           3    19      32          9.38     5.24         2.04      2.86      26.7        15.8   8.59
2 am      0       cyl     6           4    19      32         12.5      5.94         2.04      4.51      30.2        21.1   9.61
3 am      0       cyl     8          12    19      32         37.5      8.70         2.04     22.0       56.1        63.2  11.4 
4 am      1       cyl     4           8    13      32         25        7.78         2.04     12.5       43.7        61.5  14.0 
5 am      1       cyl     6           3    13      32          9.38     5.24         2.04      2.86      26.7        23.1  12.2 
6 am      1       cyl     8           2    13      32          6.25     4.35         2.04      1.45      23.2        15.4  10.4 
# … with 3 more variables: t_crit_row <dbl>, lcl_row <dbl>, ucl_row <dbl>

If that didn't work, what would we want it to return instead? A list of one-way tables?

I think I also want to get rid of some of this output. It's too much.

# Multiple n-way tables
freq_table2 <- function(.data, .freq_var, drop = FALSE) {
  
  # ===========================================================================
  # Get within group counts
  # .drop = FALSE creates an explicit n = 0 for unobserved factor levels
  # ===========================================================================
  out <- dplyr::count(.data, {{ .freq_var }}, .drop = drop)
  
  # Return tibble of results
  out
}

# For testing
# mtcars %>% 
#   group_by(am) %>% 
#   freq_table2(cyl)

# And if you want more than one table
purrr::map(
  .x = quos(cyl, vs),
  .f = ~ mtcars %>% group_by(am) %>% freq_table2({{ .x }})
)

# Make a wrapper?
freq_tables <- function(.data, ...) {
  dot_syms <- ensyms(...)
  # map_chr returns a character vector, which is what set_names() expects
  dot_names <- purrr::map_chr(dot_syms, rlang::as_name)
  purrr::map(
    .x = enquos(...),
    .f = ~ .data %>% freq_table2({{ .x }})
  ) %>% 
    rlang::set_names(dot_names)
}

mtcars %>% 
  group_by(am) %>% 
  freq_tables(cyl, vs)

This creates a named list of freq tables, one element per variable passed to freq_tables().
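
For comparison, here is a minimal base R sketch of the same idea: one count table per variable, returned as a list named by variable. It does not use freq_table2() or purrr. The helper count_tables() and its arguments vars and by are hypothetical names made up for this illustration, and table() is swapped in for dplyr::count(), so the output columns differ from what freq_table2() returns.

```r
# Hypothetical helper for illustration only; not part of freqtables.
# vars: character vector of variables to tabulate
# by:   optional character vector of grouping variables
count_tables <- function(data, vars, by = NULL) {
  tables <- lapply(vars, function(v) {
    # table() keeps every combination of observed levels, including n = 0
    as.data.frame(
      table(data[c(by, v)]),
      responseName = "n",
      stringsAsFactors = FALSE
    )
  })
  # Name each list element after the variable it summarizes
  stats::setNames(tables, vars)
}

out <- count_tables(mtcars, vars = c("cyl", "vs"), by = "am")
names(out)
out$cyl
```

Because the list is named, each table can be pulled out by variable name (for example out$cyl), which is the same access pattern the freq_tables() wrapper above is meant to give.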

@mbcann01 mbcann01 removed this from To do in Bug fixes and enhancements Jul 31, 2022