Accelerating Interoperability With Databricks Lakehouse

From FHIR ingestion to patient outcomes analysis

In this solution accelerator, we demonstrate how to use the lakehouse approach for an in-depth analysis of patient outcomes using EHR data. Consider a scenario in which we have a collection of FHIR bundles and want to explore the effect of different factors on COVID-19 outcomes. The FHIR standard, however, is designed primarily for the exchange of information and is not optimized for analytics. To solve this problem, we need to flatten the bundles (stored as nested JSON files) and extract resources such as patients, encounters, and conditions, so that we can create a dataset that is ready for exploratory data analysis. We can decompose this process into 3 main steps:

Data ingestion (on the left)
- Simplify ingestion from all kinds of sources. As an example, we use the Databricks Labs dbignite library to ingest FHIR bundles, in one line, as tables ready to be queried in SQL.
- Query and explore the ingested data
- Optionally, secure data access
Exploratory Analysis / Data Curation (flow on the top)
- Create cohorts
- Create a patient-level data structure (a patient dashboard) from the bundles
- Investigate the rate of hospital admissions among COVID-19 patients and explore correlations among factors such as SDOH, disease history, and hospital admission
Data Science / Advanced Analytics (bottom)
- Create patient features
- Create a training dataset to build a model for predicting and analyzing outcomes in our cohort
- Use SHAP to explain the effect of different features on the outcome under study
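
To make the first two steps more concrete, here is a minimal plain-Python sketch of the idea: flatten a few bundles into one list per resource type, build a COVID-19 cohort, and compute the share of that cohort with an inpatient encounter. It is not the accelerator's implementation (the notebooks use dbignite and Spark for this), and it ignores the timing of admissions relative to diagnosis. The helper names (`flatten_bundles`, `admission_rate`, `make_bundle`) and the three tiny bundles are hypothetical and hand-written for illustration; they are not taken from the Synthea dataset.

```python
from collections import defaultdict

COVID_SNOMED = "840539006"  # SNOMED CT code for COVID-19
INPATIENT = "IMP"           # HL7 v3 ActCode for an inpatient encounter


def flatten_bundles(bundles):
    """Group the resources of many bundles into one list per resourceType."""
    tables = defaultdict(list)
    for bundle in bundles:
        for entry in bundle.get("entry", []):
            resource = entry.get("resource") or {}
            if "resourceType" in resource:
                tables[resource["resourceType"]].append(resource)
    return dict(tables)


def patient_ref(resource):
    """Return the patient reference of a Condition or Encounter, or None."""
    return resource.get("subject", {}).get("reference")


def has_code(condition, code):
    """True if any coding attached to the condition carries the given code."""
    codings = condition.get("code", {}).get("coding", [])
    return any(c.get("code") == code for c in codings)


def admission_rate(tables, code=COVID_SNOMED):
    """Share of patients with the given condition who have an inpatient encounter."""
    cohort = {patient_ref(c) for c in tables.get("Condition", []) if has_code(c, code)}
    cohort.discard(None)
    if not cohort:
        return 0.0
    admitted = {
        patient_ref(e)
        for e in tables.get("Encounter", [])
        if e.get("class", {}).get("code") == INPATIENT
    } & cohort
    return len(admitted) / len(cohort)


def make_bundle(pid, condition_code, encounter_class):
    """Build a tiny hand-written bundle for illustration (not Synthea output)."""
    subject = {"reference": f"Patient/{pid}"}
    return {"resourceType": "Bundle", "entry": [
        {"resource": {"resourceType": "Patient", "id": pid}},
        {"resource": {"resourceType": "Condition", "subject": subject,
                      "code": {"coding": [{"code": condition_code}]}}},
        {"resource": {"resourceType": "Encounter", "subject": subject,
                      "class": {"code": encounter_class}}},
    ]}


bundles = [
    make_bundle("p1", COVID_SNOMED, "IMP"),  # COVID-19, admitted
    make_bundle("p2", COVID_SNOMED, "AMB"),  # COVID-19, ambulatory only
    make_bundle("p3", "38341003", "IMP"),    # a different condition, not in the cohort
]
tables = flatten_bundles(bundles)
rate = admission_rate(tables)
print(f"{rate:.2f}")
```

Keeping one list per resource type mirrors the "one table per resource" shape that makes the data queryable; on real data the same grouping would be done at scale in Spark rather than in Python lists.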

Data

The data used in this demo is generated using Synthea. We used the COVID-19 infections module, which incorporates patient risk factors such as diabetes, hypertension, and SDOH in determining outcomes. The data is available at s3://hls-eng-data-public/data/synthea/fhir/fhir/.

License ⚖️

Copyright / License info of the notebook. Copyright Databricks, Inc. [2022]. The source in this notebook is provided subject to the Databricks License. All included or referenced third party libraries are subject to the licenses set forth below.

Library Name	Library License	Library License URL	Library Source URL
Synthea	Apache License 2.0	https://github.com/synthetichealth/synthea/blob/master/LICENSE	https://github.com/synthetichealth/synthea
The Book of OHDSI	Creative Commons Zero v1.0 Universal license.	https://ohdsi.github.io/TheBookOfOhdsi/index.html#license	https://ohdsi.github.io/TheBookOfOhdsi/

Disclaimers

Databricks Inc. (“Databricks”) does not dispense medical, diagnosis, or treatment advice. This demo (“tool”) is for informational purposes only and may not be used as a substitute for professional medical advice, treatment, or diagnosis. This tool may not be used within Databricks to process Protected Health Information (“PHI”) as defined in the Health Insurance Portability and Accountability Act of 1996, unless you have executed with Databricks a contract that allows for processing PHI, an accompanying Business Associate Agreement (BAA), and are running this notebook within a HIPAA Account. Please note that if you run this notebook within Azure Databricks, your contract with Microsoft applies.
