Model Development with the DAiSEE Dataset for an Intelligent Tutoring System recognizing Comprehension Problems during Learning

This project is meant to develop an intelligent tutoring system monitoring the learning and comprehension process of complex material during e.g. online classes or e-learning environments. In this case, the intelligent tutoring system was testes on a language grammar learning task. A novel stimuli was presented to learn the basic Finnish plural in a short video. The learner must take a comprehension test directly after the explanation video, unless he or she showed signs of comprehension difficulties. In this case, the tutorial system suggests a new explanation in which the plural rules are shown again in summary.

Model Development

Dataset

The DAiSEE Dataset is a freely available dataset of multi-label video classification capturing engagement and emotional states (frustration, confusion and boredom) of students watching MOOCs. DAiSEE can be downloaded here but can be also directly loaded into Python via deeplake.

Download the dataset, create the folder 'data/raw' and put the dataset into that folder.

Data Extraction

Download the dataset and keep the given data structure. Extract relevant facial parameters from the DAiSEE dataset with OpenFace with 'make_dataset.py'. The resulting CSV file will be saved in the same folder under the respective data set.

Make sure that OpenFace is installed correctly according to your operating system. E.g. for Windows see the Wiki.

Run the following Python script:

$ make_dataset.py

Data Preprocessing

All features are calculated based on Action Units, landmarks and other parameters provided by OpenFace. For more information see the project report. The resulting data file is saved as a single CSV file containing all data in the folder data\interim\calculated_features.

Run the following Python script:

$ feature_calculation.py

Since the label were set new th

$ merge_labels2data.py

The calculated features need to be normalized and further data manipulation has to be done. Figures of all features will be stored in 'model_development\preprocessing\figures\data_distribution' that show the data distribution before and after data manipulation. The resulting data will be saved in the following folder: 'data\interim\data_preprocessed'

$ data_preprocessing.py

Features are selected based on variance analysis, feature correlation or recursive feature elimination. Some plots such as feature importance plots are saved in the folder 'model_development\preprocessing\figures\feature_selection'.

The best features are saved in a list in 'Selected_Features.json' in the same folder.

$ man_feature_selection.py

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
data/external		data/external
fig		fig
model_development		model_development
project_report		project_report
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data/external

data/external

fig

fig