Add ability to make n-way tables #22

Open
5 tasks
mbcann01 opened this issue Feb 15, 2020 · 6 comments

Comments

@mbcann01
Member

mbcann01 commented Feb 15, 2020

Overview

Currently, freqtables only creates one- and two-way tables; it cannot create n-way tables. We want to add that ability.

What I had in mind was something like:

demo_nih %>% 
    freqtables::freq_table(ethnicity_nih, race_nih, sex_nih)

However, what I've been doing in the meantime is:

make_table_section <- function(cat) {
  demo_nih %>% 
    filter(ethnicity_nih == cat) %>% 
    freqtables::freq_table(race_nih, sex_nih) 
}

And then:

purrr::map_dfc(
  .x = c("Not Hispanic or Latino", "Hispanic or Latino", "Unknown/Not Reported"),
  .f = make_table_section
)

Obviously, this is more verbose, but it gets the job done and is very versatile (e.g., the user can return a list instead of a data frame). However, the spirit of freqtables isn't really to be the most "versatile" package. It's to be the easiest to use "out of the box" for 85%+ of normal use. Give this some thought.

The suggestion from a user on RStudio Community could also be useful:

mtcars %>% 
  gather(variable, category, cyl, vs, am, factor_key = TRUE) %>%
  group_by(variable, category) %>%
  summarize(n = n())
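
Note that the gather() approach above stacks one-way counts for each variable; it does not count joint cells. As a rough sketch only (not how freqtables does or will do it), joint n-way cell counts could be obtained with dplyr::count(). The column names n_total and percent_total are assumptions chosen for illustration, and mtcars stands in for real data.

```r
library(dplyr)

# One row per observed cyl x vs x am combination.
# count() returns an ungrouped data frame, so sum(n) below is the overall N.
n_way <- mtcars %>%
  count(cyl, vs, am, name = "n") %>%
  mutate(
    n_total       = sum(n),            # overall number of rows
    percent_total = 100 * n / n_total  # cell percent of the overall total
  )

n_way
```

Combinations that never occur in the data are dropped by count(), which may or may not be the desired behavior for a published table.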

Left off at

2023-03-17: Working on test.Rmd as part of #40.

  • I created two data files for comparing freqtables with Stata and SAS.
  • The data files are called /inst/extdata/freq_study.dta and /inst/extdata/freq_study.xpt.
  • These data files are created using data-raw/study.R.
  • I also created a do file - /inst/extdata/compare_freqtables.do - and a SAS script - /inst/extdata/compare_freqtables.sas.
  • I added all of these files to buildignore.

2020-06-11: Created test.Rmd on the plane to Minnesota to test out different ways of doing this. test.Rmd is git ignored and build ignored.

Tasks

  • Complete one, two, and n-way tables in Stata (/inst/extdata/compare_freqtables.do). Use them for comparison.
  • Complete one, two, and n-way tables in SAS (/inst/extdata/compare_freqtables.sas). Use them for comparison.
  • Figure out how you want freq_tbl to treat n-way tables.
  • Figure out how you want freq_table to treat n-way tables.
  • Figure out how you want freq_test to calculate stats for n-way tables.
@mbcann01
Member Author

Here's how Stata does it: Screen Shot 2020-06-14 at 11.04.33 AM.png

@mbcann01
Member Author

2020-06-15:
We may want to distinguish between an n-way freq_table (shows overall n, prop, ci like Stata) and a grouped_by n-way freq_table that uses row n's and percents instead.
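
To make the distinction concrete, here is a minimal sketch of the two denominators, assuming dplyr and using mtcars as stand-in data. The column names (n_total, percent_total, n_row, percent_row) are illustrative assumptions, and confidence intervals are left out entirely.

```r
library(dplyr)

# Joint cell counts for a three-way table
cells <- mtcars %>% count(am, vs, cyl, name = "n")

# "Overall" flavor: every cell is a share of the total N
overall <- cells %>%
  mutate(
    n_total       = sum(n),
    percent_total = 100 * n / n_total
  )

# "Grouped" flavor: each cell is a share of its am x vs group
grouped <- cells %>%
  group_by(am, vs) %>%
  mutate(
    n_row       = sum(n),
    percent_row = 100 * n / n_row
  ) %>%
  ungroup()
```

The cell counts are identical in both versions; only the denominator, and therefore the percentages, changes.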

@mbcann01
Member Author

2020-06-11 - Notes from while I was on the plane:

  • If you allow group_by to work again, then you may need to change the descriptive analysis vignette.
  • If you allow 3-way tables then you will have to change some of the wording in the descriptive analysis vignette. Specifically, in Bivariate percentages and 95% log transformed confidence intervals.
  • If you change the row/column terminology to group/subgroup terminology then you will have to change some of the wording in the descriptive analysis vignette.

Row/column labels

  • What is the best way to create a contingency table? Then row/column makes sense.
  • group level 1, group level 2, group level 3
  • group, subgroup 1, subgroup 2

@mbcann01
Member Author

Do this in a new branch

@mbcann01
Member Author

So, meantables uses group_by. Then, the output is labeled "response_var" and "group_var". It might be worth considering keeping this consistent.

One var

mtcars %>% freq_table(am)

Two+ vars

mtcars %>% group_by(mpg) %>% freq_table(am)

Could even have "response_var" (or something similar) and "group_var" in table of results.
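
A rough sketch of what a group_by-aware interface might return, assuming dplyr and rlang. The function name freq_grouped_sketch and the output columns response_var, n_group, and percent_group are hypothetical; this is not the freqtables or meantables implementation.

```r
library(dplyr)

# Hypothetical helper: counts `response` within whatever groups are
# already set on .data via group_by().
freq_grouped_sketch <- function(.data, response) {
  response_name <- rlang::as_name(rlang::enquo(response))
  .data %>%
    count({{ response }}, name = "n") %>%  # count() keeps the existing groups
    mutate(
      response_var  = response_name,       # label for the response variable
      n_group       = sum(n),              # group size (within each group)
      percent_group = 100 * n / n_group
    ) %>%
    ungroup()
}

out <- mtcars %>%
  group_by(cyl) %>%
  freq_grouped_sketch(am)

out
```

With no group_by() call, the same helper falls back to a plain one-way table, because sum(n) is then taken over all rows.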

@mbcann01
Member Author

mbcann01 commented Aug 19, 2020

Needed it on the Sun Study for this (as an example):

map_student %>% 
  filter(!is.na(ss_application_f)) %>% 
  freq_table(period_f, teacher_f, ss_application_f)

Tested this in Stata with: by period_f, sort : tabulate teacher_f ss_application_f, chi2. It returns this:

Screen Shot 2020-08-19 at 2.19.12 PM.png

Tried it in SAS using:

proc freq data=map_student;
	tables period_f * teacher_f * ss_application_f;
run;

Which returned:

Screen Shot 2020-08-19 at 3.12.33 PM.png

I also tried proc surveyfreq, but that won't return chisq for three-way tables.
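
For comparison, a base R sketch that loosely mirrors the Stata by-group call: one two-way table and one chi-square test per stratum. This only illustrates the idea with mtcars (am as the stratum, cyl by vs as the table); it is an assumption about how one might do it, not a proposal for freq_test.

```r
# Split the data by the stratifying variable, then tabulate and test each piece.
strata <- split(mtcars, mtcars$am)

results <- lapply(strata, function(d) {
  tab <- table(cyl = d$cyl, vs = d$vs)
  list(
    table = tab,
    # Small expected counts make chisq.test() warn that the approximation
    # may be incorrect; the warning is suppressed here only to keep the
    # sketch quiet.
    test = suppressWarnings(chisq.test(tab))
  )
})

results[["0"]]$table
results[["0"]]$test
```

A single flat three-way display, closer to the SAS tables statement, is available via ftable(xtabs(~ am + cyl + vs, data = mtcars)).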

@mbcann01 mbcann01 added this to In progress in Bug fixes and enhancements Jun 1, 2021
@mbcann01 mbcann01 added this to In progress in n-way tables Jun 1, 2021
@mbcann01 mbcann01 removed this from In progress in Bug fixes and enhancements Jun 1, 2021
@mbcann01 mbcann01 removed this from To do in Bug fixes and enhancements Jul 31, 2022
@mbcann01 mbcann01 changed the title Add ability to make 3-way tables Add ability to make n-way tables Mar 18, 2023