omic/jailbreak

We jailbreak Electronic Medical Records (EMRs) to help solve COVID-19.

Under HIPAA, a U.S. federal law, you have the right to access your health records. This project lets you contribute your anonymized data to help researchers around the globe discover optimal treatment paths.

Join the fight.

supported providers

how it works

The current process is quite simple. It will become more complicated in the future, especially once multi-omics data is considered.

  1. Enter your healthcare provider's alias (see the table below) and your portal credentials.
  2. Run the jailbreak.sh script. Make sure you have npm installed. The script then:
    1. Logs in to the provider's portal.
    2. Navigates to and downloads your C-CDA (medical) records.
    3. Parses the records into JSON format, anonymizing the data in the process.
    4. Converts the data to OMOP CDM.
    5. Publishes the anonymized, transformed data to a central database for researchers to analyze.

# Add provider credentials to an untracked secrets directory.
mkdir -p ~/.secrets/
echo "[healthcare-alias],[portal-username],[portal-password]" >> ~/.secrets/creds.csv
# Restrict the credentials file to your own user.
chmod 600 ~/.secrets/creds.csv

# Make some waves.
./jailbreak.sh
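
How jailbreak.sh reads creds.csv isn't shown here, so the following is only a rough sketch of one way to parse that file safely. The function name `parse_credentials` and the `SUPPORTED_ALIASES` constant are made up for illustration; the aliases come from the providers marked "Y" in the table below. Using a real CSV parser instead of splitting on commas means a password that contains a comma still works, as long as it is quoted.

```python
import csv
import io

# Hypothetical constant: the aliases marked "Y" in the provider table.
SUPPORTED_ALIASES = {"uw", "kaiser"}


def parse_credentials(text):
    """Parse 'alias,username,password' rows, skipping rows we cannot use."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # blank or malformed line
        alias, username, password = row
        alias = alias.strip().lower()
        if alias not in SUPPORTED_ALIASES:
            continue  # provider not supported yet
        # The password is kept byte-for-byte; stripping it could change it.
        entries.append({"alias": alias, "username": username, "password": password})
    return entries
```

For example, `parse_credentials('uw,alice,"p,w"\n')` yields one entry whose password is `p,w`, while a row for `aetna` is skipped because that provider isn't supported yet.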

Currently supported and not-yet-supported healthcare providers:

| Provider | Supported? | Alias | Portal |
|---|---|---|---|
| UW Medicine | Y | uw | - |
| Kaiser Permanente | Y | kaiser | - |
| Aetna | N | aetna | - |
| Johnson & Johnson | N | johnson2 | - |
| UnitedHealth Group | N | united | - |
| Cardinal Health | N | cardinal | - |
| Anthem | N | anthem | - |
| CVS Health | N | cvs | - |
| AmerisourceBergen | N | ameriberg | - |
| Express Scripts Holdings | N | express | - |

...and more.

todo (contributions welcome)

  • Write a Git hook to prevent engineers from stupidly committing highly sensitive secrets.
  • Prompt users to reset their password to help them stay extra secure.
  • Compile a list of the most closely related repos and integrate some of their insights.
  • Secure compression and archiving of data for more thorough extraction later.
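
As a starting point for the Git hook item above, here is a minimal sketch of a pre-commit check written in Python. Nothing like it exists in the repo yet, so every name in it (`find_secret_lines`, `SECRET_PATTERNS`, `BLOCKED_PATH_PARTS`) is hypothetical, and the patterns are deliberately crude. It scans the staged diff for files under the secrets directory and for added lines that look like passwords or SSNs.

```python
import re
import subprocess

# Hypothetical heuristics; tune these before relying on them.
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(password|passwd|secret|token)\b\s*[:=]\s*\S+"),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-shaped number
]
BLOCKED_PATH_PARTS = (".secrets/", "creds.csv")


def find_secret_lines(diff_text):
    """Return (path, reason) pairs for suspicious content in a unified diff."""
    findings = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # File header, e.g. "+++ b/src/app.py".
            current_file = line[4:]
            if current_file.startswith("b/"):
                current_file = current_file[2:]
            if any(part in current_file for part in BLOCKED_PATH_PARTS):
                findings.append((current_file, "blocked path"))
        elif line.startswith("+") and current_file is not None:
            added = line[1:]
            if any(pattern.search(added) for pattern in SECRET_PATTERNS):
                # Report only the file, never the matching text itself.
                findings.append((current_file, "secret-like line"))
    return findings


def main():
    """Check the staged changes; a non-zero return value should block the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = find_secret_lines(diff)
    for path, reason in findings:
        print(f"refusing to commit {path}: {reason}")
    return 1 if findings else 0
```

To try it, save the code as an executable `.git/hooks/pre-commit` with a Python shebang line and end the file with `raise SystemExit(main())`. Regex checks like these catch careless mistakes, not determined ones, so keeping credentials outside the repository is still the main defense.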

disclaimer

First off, if the approach is not secure, we won't do it. We believe we can protect the privacy of data donors and simultaneously further scientific research for the common good. Current approaches for patients to willingly donate (anonymized) EMR data to further scientific discoveries -- especially given the COVID-19 pandemic -- are incredibly antiquated and nearly nonexistent.

So we're trying something different. There is absolutely no guarantee that this will work.

research

Our first research target is to determine the most effective treatment paths for individuals, based on what doctors are already trying and testing.

making contributions

consent

Starting off, patients must give us consent to crawl and anonymize their EMRs for them. The current approach relies on them providing us with login credentials so that we can peruse their healthcare provider's portal. This is likely not the production solution, because who the hell would do that?

Look into working a very clear DocuSign agreement in here, along with a privacy policy, so people know we're not evil.

extract

Crawlers run in secure headless browsers, automating the process of extracting patient records from healthcare provider portals. Is there a better way?

transform

The extracted data must then go through:

  • anonymization -- should happen as soon as possible after obtaining the data. At the basic level, we need to remove SSNs, names, and probably birthdates.
  • normalization -- tools such as White Rabbit are meant to help at this step.
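
To make the anonymization step concrete, here is a minimal sketch that assumes the parsed record is plain JSON-style data (dicts, lists, strings). The key names in `DIRECT_IDENTIFIER_KEYS` are invented for illustration; real parsed C-CDA output will use different ones. It drops identifier fields wherever they appear and masks SSN-shaped numbers left in free text.

```python
import re

# Hypothetical key names; adjust to whatever the C-CDA parser really emits.
DIRECT_IDENTIFIER_KEYS = {"patient_name", "given_name", "family_name", "ssn", "birth_date"}
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def anonymize(value):
    """Return a copy with identifier keys dropped and SSN-shaped text masked."""
    if isinstance(value, dict):
        return {
            key: anonymize(item)
            for key, item in value.items()
            if key.lower() not in DIRECT_IDENTIFIER_KEYS
        }
    if isinstance(value, list):
        return [anonymize(item) for item in value]
    if isinstance(value, str):
        return SSN_PATTERN.sub("[REDACTED]", value)
    return value
```

The function builds a new structure instead of editing in place, so the raw record can be discarded explicitly once the anonymized copy exists. Dropping these fields is only the basic level described above: the HIPAA Safe Harbor method lists 18 identifier types, and names inside free-text notes would slip past this sketch.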

load

Data is currently stored in the OMOP format in Redshift tables, so the data from the prior step should fit without much fuss. TBD.
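
Since this step is still TBD, the following is only an illustration of what "fitting" into OMOP could look like for the `person` table. The input keys (`gender`, `birth_year`) and the function name `to_omop_person` are assumptions, not part of this project. The output columns are the required fields of the OMOP CDM `person` table, and 8507 and 8532 are the standard OMOP gender concept IDs.

```python
# Standard OMOP gender concepts; 0 means "no matching concept".
GENDER_CONCEPT_IDS = {"male": 8507, "female": 8532}


def to_omop_person(record, person_id):
    """Build a row for the OMOP `person` table from a hypothetical parsed record."""
    gender = str(record.get("gender", "")).lower()
    return {
        "person_id": person_id,  # surrogate key, not a portal or provider ID
        "gender_concept_id": GENDER_CONCEPT_IDS.get(gender, 0),
        "year_of_birth": record.get("birth_year"),  # year only, no full birthdate
        "race_concept_id": 0,       # not mapped in this sketch
        "ethnicity_concept_id": 0,  # not mapped in this sketch
    }
```

One thing to decide: OMOP requires `year_of_birth`, so if birthdates are removed during anonymization, the year would need to be kept (or generalized) before this step.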
