A Multimodal Dataset for Assessing Emotion, Stress,
and Emotional Workload in Interpersonal Work Scenario
📌 This repository contains supplementary code and technical validation materials for the manuscript
"EmoWork: A Multimodal Dataset for Assessing Emotion, Stress, and Emotional Workload in Interpersonal Work Scenario" (under review).
The dataset itself is available on Zenodo (EmoWork).
TECHNICAL_VALIDATION/
├── Dataset_Records.ipynb # Data source summary and preprocessing overview
├── Label_Analysis.ipynb # Label distribution, missing data, and correlation analysis
├── ML_analysis.ipynb # Machine learning model implementation and evaluation
└── utils/ # Utility scripts
RESULTS/
├── Condition/ # Session classification results (GT = session)
│   └── [model_name]/ # e.g., DecisionTree, RandomForest, ...
│       ├── all_runs_results.csv
│       └── summary_5runs.csv
└── Perceived/ # Label prediction results (GT = perceived_*)
    └── [label_name]/ # e.g., perceived_arousal, perceived_stress, ...
        └── [model_name]/ # e.g., XGBoost, SVM, ...
            ├── all_runs_results.csv
            └── summary_5runs.csv
EmoWork/ # Directory to store the dataset files downloaded from Zenodo
├── META/ # Metadata files
├── LABELS/ # Ground-truth labels (e.g., perceived stress, arousal)
└── SENSORS/ # Multimodal sensor data
figures/
├── sensor_data/ # Visualizations from Dataset_Records.ipynb
├── label_analysis/ # Visualizations from Label_Analysis.ipynb
└── model_results/ # Visualizations from ML_analysis.ipynb
LICENSE
README.md
requirements.txt
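The RESULTS layout above can be navigated programmatically. The following is a minimal sketch, not part of the repository: the helper name `summary_path` and the constant `RESULTS_ROOT` are hypothetical, and the paths are assumed to be relative to the repository root. The model and label names are the examples listed in the tree.

```python
from pathlib import Path
from typing import Optional

# Assumption: the script runs from the repository root.
RESULTS_ROOT = Path("RESULTS")


def summary_path(model_name: str, label_name: Optional[str] = None) -> Path:
    """Return the path to summary_5runs.csv for one model (hypothetical helper).

    label_name=None selects the session classification results (Condition/);
    otherwise the results for that perceived label (Perceived/<label_name>/).
    """
    if label_name is None:
        return RESULTS_ROOT / "Condition" / model_name / "summary_5runs.csv"
    return RESULTS_ROOT / "Perceived" / label_name / model_name / "summary_5runs.csv"


print(summary_path("RandomForest").as_posix())
print(summary_path("XGBoost", "perceived_stress").as_posix())
```

Swapping the file name for all_runs_results.csv gives the per-run results in the same folders.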
We recommend Python 3.10: all notebooks were developed and tested with it, and some dependencies may not be fully compatible with Python 3.11.
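To confirm which interpreter a notebook kernel is using, a small check like the one below may help. It is an illustrative sketch only; the function name `matches_recommended` is hypothetical and not part of the repository.

```python
import sys

# Version the notebooks were developed and tested with, per this README.
RECOMMENDED = (3, 10)


def matches_recommended(version_info=sys.version_info) -> bool:
    """Return True when the major.minor version equals the recommended one."""
    return tuple(version_info[:2]) == RECOMMENDED


if not matches_recommended():
    print(
        f"Warning: running Python {sys.version_info.major}.{sys.version_info.minor}; "
        "the notebooks were tested with Python 3.10."
    )
```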
- Clone this repository:
git clone https://github.com/Kaist-ICLab/EmoWork.git
- Install dependencies:
pip install -r requirements.txt
- Run the notebooks in the TECHNICAL_VALIDATION folder:
Dataset_Records.ipynb summarizes the dataset structure and gives a high-level overview of data sources and preprocessing steps. It includes:
- Data collection protocol details
- Data quality checks
- Missing data analysis
- Data synchronization procedures
Figure: Example of a heart rate signal collected from the Polar H10.
Additional visualizations generated from this notebook are available in the figures/sensor_data/ directory.
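As an illustration of the missing data analysis mentioned above, the sketch below computes the fraction of missing entries per column of a CSV file using only the standard library. The column names (`timestamp`, `heart_rate`), the sample values, and the set of tokens treated as missing are all assumptions made for this example; they are not taken from the dataset or the notebook.

```python
import csv
import io

# Synthetic sample; column names and values are illustrative only.
SAMPLE = """timestamp,heart_rate
0,72
1,
2,75
3,NaN
"""

# Assumption: these tokens mark a missing value (compared case-insensitively).
MISSING_TOKENS = {"", "nan", "na", "null"}


def missing_ratio(rows, columns):
    """Return the fraction of missing entries for each column."""
    counts = {col: 0 for col in columns}
    total = 0
    for row in rows:
        total += 1
        for col in columns:
            value = row.get(col)
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[col] += 1
    return {col: (counts[col] / total if total else 0.0) for col in columns}


reader = csv.DictReader(io.StringIO(SAMPLE))
ratios = missing_ratio(reader, reader.fieldnames)
print(ratios)
```

With pandas, `df.isna().mean()` yields the same per-column fraction once the missing tokens have been parsed as NaN.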
Label_Analysis.ipynb analyzes the distribution of self-reported labels (e.g., perceived arousal, stress, suppression, and valence), investigates missing values, and explores correlations and group differences (e.g., by gender or role).
Figure: Distribution of perceived arousal and valence across all participants.
Additional visualizations generated from this notebook are available in the figures/label_analysis/ directory.
ML_analysis.ipynb builds machine learning models to predict perceived emotional states (arousal, stress, suppression, valence) using five classifiers: Decision Tree, Random Forest, SVM, XGBoost, and kNN.
Model performance is evaluated with standard metrics including Accuracy, F1 score, Precision, Recall, and ROC-AUC.
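For reference, the threshold-based metrics listed above follow directly from the confusion counts. The sketch below shows the binary case on synthetic predictions; the function name `binary_metrics` is hypothetical, and the notebook's own averaging scheme is not stated here. scikit-learn provides equivalent functions (`accuracy_score`, `precision_score`, `recall_score`, `f1_score`).

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Synthetic example: 2 true positives, 2 true negatives, 1 false positive, 1 false negative.
metrics = binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
print(metrics)
```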
Figure: Participant-wise AUC scores for session classification using a Random Forest model.
Additional visualizations generated from this notebook are available in the figures/model_results/ directory.
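A participant-wise AUC, as in the figure above, can be obtained by grouping predictions by participant and computing ROC-AUC within each group. The sketch below is dependency-free and uses the pairwise definition of ROC-AUC (the probability that a positive sample is scored above a negative one, with ties counted as one half). The record format, the participant IDs, and the scores are assumptions for illustration; the notebook may compute this differently.

```python
from collections import defaultdict


def roc_auc(y_true, scores):
    """ROC-AUC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_participant(records):
    """records: iterable of (participant_id, true_label, score) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for pid, label, score in records:
        grouped[pid][0].append(label)
        grouped[pid][1].append(score)
    return {pid: roc_auc(labels, scores) for pid, (labels, scores) in grouped.items()}


# Synthetic predictions; IDs and scores are illustrative only.
records = [
    ("P01", 1, 0.9), ("P01", 0, 0.2), ("P01", 1, 0.6), ("P01", 0, 0.7),
    ("P02", 1, 0.8), ("P02", 0, 0.1),
]
per_participant = auc_by_participant(records)
print(per_participant)
```

Returning `None` for a participant with a single class keeps undefined scores visible instead of silently reporting a number.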
We welcome contributions to improve the code and documentation. Please feel free to submit issues and pull requests.
This project is licensed under the terms of the LICENSE file included in the repository.