BoXHED Fuse is a clinical note embedding pipeline designed for BoXHED 2.0, a software package for nonparametrically estimating hazard functions via gradient-boosted trees. BoXHED Fuse aims to improve the models that BoXHED 2.0 trains by enriching its training data with clinical note embeddings.
As a repository, BoXHED Fuse functions independently of BoXHED 2.0. It transforms clinical notes from MIMIC-IV electronic health records (EHR) into note embeddings and merges them with BoXHED 2.0 time-series data. The resulting augmented dataset can be used directly as input to BoXHED 2.0.
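To make the merge step concrete, here is a minimal sketch of one way note embeddings could be attached to time-series rows. It is not the BoXHED Fuse implementation: the field names (`subject`, `t_start`, `t_end`), the function `attach_latest_embedding`, and the rule "use the latest note written at or before the start of each interval, zero-fill otherwise" are all assumptions made for illustration.

```python
from bisect import bisect_right


def attach_latest_embedding(rows, notes, dim):
    """Attach to each row the latest note embedding at or before its t_start.

    rows  : list of dicts with hypothetical keys "subject", "t_start", "t_end"
    notes : list of (subject, time, embedding) tuples (hypothetical layout)
    dim   : embedding length, used to zero-fill rows with no earlier note
    """
    # Group note times and embeddings per subject, sorted by time.
    by_subject = {}
    for subject, time, emb in sorted(notes, key=lambda n: (n[0], n[1])):
        times, embs = by_subject.setdefault(subject, ([], []))
        times.append(time)
        embs.append(emb)

    merged = []
    for row in rows:
        times, embs = by_subject.get(row["subject"], ([], []))
        # Index of the last note whose time is <= t_start, so that a row
        # never sees a note written after its interval begins.
        i = bisect_right(times, row["t_start"]) - 1
        emb = embs[i] if i >= 0 else [0.0] * dim
        out = dict(row)
        out.update({f"emb_{k}": v for k, v in enumerate(emb)})
        merged.append(out)
    return merged


# Toy data: two subjects, two notes for subject 1, none for subject 2.
rows = [
    {"subject": 1, "t_start": 0.0, "t_end": 2.0},
    {"subject": 1, "t_start": 2.0, "t_end": 5.0},
    {"subject": 2, "t_start": 0.0, "t_end": 3.0},
]
notes = [
    (1, 1.5, [0.1, 0.2]),
    (1, 4.0, [0.3, 0.4]),
]
merged = attach_latest_embedding(rows, notes, dim=2)
```

In this toy run, the first row of subject 1 has no earlier note and is zero-filled, the second row picks up the note written at time 1.5, and subject 2 is zero-filled throughout. Restricting each row to notes available at its start time is one way to avoid leaking future information into the covariates; the actual pipeline may handle alignment differently.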
For the BoXHED 2.0 repository, click here.
For the BoXHED 2.0 paper, refer to Pakbin et al. (2023).
- TODO: add BoXHED Fuse paper
- Pakbin, Wang, Mortazavi, Lee (2023): BoXHED2.0: Scalable boosting of dynamic survival analysis
- Python (=3.10)
- conda (we recommend using the free Anaconda distribution)
- Set up a dedicated virtual environment for BoXHED Fuse. First, create a virtual environment called BoXHED_Fuse:
conda create -n BoXHED_Fuse python=3.10
Then activate it:
conda activate BoXHED_Fuse