sarxhlwalker/roses-2024

roses-2024

This project contains the code for our study of the association of sex and of race and ethnicity with intravenous sedation use in patients receiving invasive ventilation.

Installation

First, gain access to MIMIC-IV through PhysioNet, specifically the BigQuery datasets.

Create your own BigQuery dataset to store intermediate tables with the following shell command: bq mk --dataset PROJECT_ID:DATASET_ID.
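As a minimal sketch of how the placeholders in that command are filled in, the helper below builds the argument list for bq mk. The function name and the example IDs (my-project, roses_intermediate) are hypothetical and only serve as illustration; substitute your own project and dataset IDs.

```python
import shlex
from typing import List


def bq_mk_dataset_argv(project_id: str, dataset_id: str) -> List[str]:
    """Build the argv for `bq mk --dataset PROJECT_ID:DATASET_ID`."""
    if not project_id or not dataset_id:
        raise ValueError("project_id and dataset_id must be non-empty")
    return ["bq", "mk", "--dataset", f"{project_id}:{dataset_id}"]


# Hypothetical IDs, for illustration only.
argv = bq_mk_dataset_argv("my-project", "roses_intermediate")
print(shlex.join(argv))
```

The printed line is the shell command to run; the same DATASET_ID is later passed to the generate_tables scripts.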

You will also need to install the required Python packages by running pip3 install -r requirements.txt.

Cohort Creation

There are two main cohorts. The first is used for the primary analysis and for the sensitivity analysis of interaction effects between sex and race and ethnicity. The second, called all_patients, is used for the sensitivity analysis that includes patients who were previously excluded because no race or ethnicity was listed.

See the scripts in create_cohorts/.

Primary Cohort

For the primary analysis and the sensitivity analyses of interaction effects, run ./generate_tables.sh DATASET_ID.

All_Patients Cohort

Run ./generate_tables_all_patients.sh DATASET_ID.

You will have to manually download the CSV files created by the generate_tables scripts for use in the subsequent steps.
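Before moving on, it can help to confirm that each downloaded CSV is present and non-empty. The following is a rough sketch, not part of the repository: the function name csv_headers is hypothetical, and the file names in the commented usage are placeholders, since the actual names depend on how you export the tables.

```python
import csv
from pathlib import Path
from typing import Dict, List


def csv_headers(paths: List[str]) -> Dict[str, List[str]]:
    """Return the header row of each CSV; raise if a file is missing or empty."""
    headers: Dict[str, List[str]] = {}
    for p in paths:
        path = Path(p)
        if not path.is_file():
            raise FileNotFoundError(f"expected downloaded CSV not found: {path}")
        with path.open(newline="") as f:
            # The first row of the file is taken to be the header.
            header = next(csv.reader(f), None)
        if not header:
            raise ValueError(f"{path} has no header row")
        headers[str(path)] = header
    return headers


# Usage (placeholder file names, assumed for illustration):
# csv_headers(["baseline.csv", "time_varying.csv", "baseline_time_varying.csv"])
```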

Data Processing

See files in data_processing/.

Note that some paths in these files may need to be changed before the scripts run correctly from the command line.

Primary Cohort

Run ./process_data.sh OUTCOME_PATH IMPUTED_DATA_PATH BASELINE_PATH TIME_VARYING_PATH BASELINE_TIME_VARYING_PATH.

The first two arguments, OUTCOME_PATH and IMPUTED_DATA_PATH, are paths to where you would like the processed data to be stored.

The last three arguments, BASELINE_PATH, TIME_VARYING_PATH, and BASELINE_TIME_VARYING_PATH, are the paths to the tables generated by generate_tables.sh.

All_Patients Cohort

The same as above, but run ./process_data_all_patients.sh OUTCOME_PATH IMPUTED_DATA_PATH BASELINE_PATH TIME_VARYING_PATH BASELINE_TIME_VARYING_PATH instead.
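Both scripts take the same five positional arguments, so the order is easy to get wrong. The sketch below assembles the argument list with keyword-only parameters so that each path is named explicitly. The helper name process_data_argv and the paths in the commented usage are hypothetical; only the script names and the argument order come from the instructions above.

```python
from typing import List


def process_data_argv(
    *,
    outcome_path: str,
    imputed_data_path: str,
    baseline_path: str,
    time_varying_path: str,
    baseline_time_varying_path: str,
    all_patients: bool = False,
) -> List[str]:
    """Build the argv for process_data.sh or process_data_all_patients.sh."""
    script = "./process_data_all_patients.sh" if all_patients else "./process_data.sh"
    # Two output locations first, then the three downloaded input tables.
    return [
        script,
        outcome_path,
        imputed_data_path,
        baseline_path,
        time_varying_path,
        baseline_time_varying_path,
    ]


# Usage (placeholder paths; running from data_processing/ is an assumption):
# import subprocess
# subprocess.run(process_data_argv(outcome_path="out/outcome.csv", ...), check=True)
```

Keyword-only parameters are a design choice here: a call that swaps two paths by position fails immediately instead of silently writing to the wrong location.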

Analysis

Tables

For tables describing the data, you can run python3 create_tables.py BASELINE_PATH TIME_VARYING_PATH BASELINE_TIME_VARYING_PATH OUTPUT_BASELINE OUTPUT_DRUG OUTPUT_MISSING.

The first three arguments are the same three files generated by generate_tables.sh.

The last three arguments are the paths to your desired output locations.

Models

You can run the models in analyses/. For example, Rscript {outcome}.R TYPE 1, where TYPE=2 runs the main analysis, TYPE=3 includes interaction effects, and TYPE=4 runs the all_patients sensitivity analysis.

Note that you may need to update the data paths in the scripts so that they point to the IMPUTED_DATA_PATH used in the data processing stage.
