This project contains the code for our study of the association of sex and of race and ethnicity with intravenous sedation use in patients receiving invasive ventilation.
First, gain access to MIMIC-IV through PhysioNet, specifically the BigQuery datasets.
Then run the shell command bq mk --dataset PROJECT_ID:DATASET_ID to create your own BigQuery dataset for storing intermediate tables.
You will also need to install a number of Python packages, which can be done with pip3 install -r requirements.txt.
There are two main cohorts: one for the primary analysis and the sensitivity analysis looking at interaction effects between sex and race and ethnicity, and one for the sensitivity analysis including patients who were previously excluded due to not having a race or ethnicity listed (called all_patients).
See the scripts in create_cohorts/.
For the primary analysis and the sensitivity analysis of interaction effects, run ./generate_tables.sh DATASET_ID.
For the all_patients sensitivity analysis, run ./generate_tables_all_patients.sh DATASET_ID.
You will need to manually download the CSV files created by the generate_tables scripts for the subsequent steps.
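The download can also be scripted rather than done by hand. The following is a minimal sketch, assuming the bq command-line tool is installed and authenticated. The table name, output path, and row limit are hypothetical placeholders, not names taken from this repository. It builds a bq query command that prints one table as CSV and redirects the output to a local file.

```python
import subprocess
from pathlib import Path


def build_export_command(project_id, dataset_id, table, max_rows=10_000_000):
    """Build a bq command that prints every row of one table as CSV.

    The table name is whatever the generate_tables scripts created in your
    dataset; this sketch does not assume any particular name.
    """
    sql = f"SELECT * FROM `{project_id}.{dataset_id}.{table}`"
    return [
        "bq",
        "--quiet",                  # global flag: suppress job status messages
        "--format=csv",             # global flag: print results as CSV
        "query",
        "--use_legacy_sql=false",   # use standard SQL
        f"--max_rows={max_rows}",   # bq query prints only 100 rows by default
        sql,
    ]


def export_table(project_id, dataset_id, table, out_path):
    """Run the export and write the CSV to out_path (hypothetical helper)."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    command = build_export_command(project_id, dataset_id, table)
    with out_path.open("w") as fh:
        subprocess.run(command, stdout=fh, check=True)
    return out_path
```

For example, export_table("PROJECT_ID", "DATASET_ID", "some_table", "data/some_table.csv") would write one table to disk. For very large tables, exporting to Cloud Storage with bq extract may be a better fit than printing rows through bq query.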
See the files in data_processing/.
Note that some paths may need to be changed for the scripts to run smoothly from the command line.
Run ./process_data.sh OUTCOME_PATH IMPUTED_DATA_PATH BASELINE_PATH TIME_VARYING_PATH BASELINE_TIME_VARYING_PATH.
The first two arguments, OUTCOME_PATH and IMPUTED_DATA_PATH, are the paths where you would like the processed data to be stored.
The last three arguments, BASELINE_PATH, TIME_VARYING_PATH, and BASELINE_TIME_VARYING_PATH, are the paths to the tables generated by generate_tables.sh.
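Because the five positional arguments are easy to mix up, a small wrapper can make the order explicit. The sketch below is an illustration under stated assumptions, not part of this repository: the function names are hypothetical, and it assumes the three input tables are regular files on disk. It checks that the inputs exist before building the command in the order described above.

```python
import subprocess
from pathlib import Path


def process_data_command(outcome_path, imputed_data_path, baseline_path,
                         time_varying_path, baseline_time_varying_path,
                         script="./process_data.sh"):
    """Return the command list with arguments in the order the script expects:
    two output paths first, then the three input tables."""
    return [
        script,
        str(outcome_path),
        str(imputed_data_path),
        str(baseline_path),
        str(time_varying_path),
        str(baseline_time_varying_path),
    ]


def missing_inputs(paths):
    """Return the subset of paths that are not existing files."""
    return [str(p) for p in paths if not Path(p).is_file()]


def run_process_data(outcome_path, imputed_data_path, baseline_path,
                     time_varying_path, baseline_time_varying_path,
                     script="./process_data.sh"):
    """Validate the input tables, then run the processing script."""
    inputs = [baseline_path, time_varying_path, baseline_time_varying_path]
    missing = missing_inputs(inputs)
    if missing:
        raise FileNotFoundError("Input tables not found: " + ", ".join(missing))
    command = process_data_command(
        outcome_path, imputed_data_path, baseline_path,
        time_varying_path, baseline_time_varying_path, script=script,
    )
    subprocess.run(command, check=True)
```

Passing script="./process_data_all_patients.sh" would reuse the same wrapper for the all_patients cohort, since that script takes the same five arguments.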
For the all_patients sensitivity analysis, follow the same steps as above, but run ./process_data_all_patients.sh OUTCOME_PATH IMPUTED_DATA_PATH BASELINE_PATH TIME_VARYING_PATH BASELINE_TIME_VARYING_PATH instead.
For tables describing the data, you can run python3 create_tables.py BASELINE_PATH TIME_VARYING_PATH BASELINE_TIME_VARYING_PATH OUTPUT_BASELINE OUTPUT_DRUG OUTPUT_MISSING.
The first three arguments are the same three tables generated by generate_tables.sh.
The last three arguments are the paths to your desired output locations.
You can run the models in analyses/. For example, Rscript {outcome}.R TYPE 1, where TYPE=2 runs the main analysis, TYPE=3 includes interaction effects, and TYPE=4 runs the all_patients sensitivity analysis.
Note that you may need to update the data paths in these scripts to match the IMPUTED_DATA_PATH used in the data processing stage.