EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important temporal segments in educational videos.
Download the dataset and place it in the "data" folder.
Video Annotation Tool (VAT) is a web-based tool for annotating video datasets for use in machine learning tasks.
Python version >= 3.6
# clone the repository
git clone git@github.com:Junaid112/EDUVSUM-Educational-Video-Summarization.git
cd EDUVSUM-Educational-Video-Summarization
pip install -r requirements.txt
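After installation, it can help to confirm that the two prerequisites mentioned above are met: a Python interpreter at version 3.6 or newer, and a populated "data" folder. The snippet below is a minimal sketch of such a check. The function name `check_setup` is hypothetical and not part of the repository, and the sketch only assumes that the dataset lives in a folder named "data" relative to the current directory; it makes no assumption about the files inside.

```python
import sys
from pathlib import Path


def check_setup(data_dir="data", min_version=(3, 6)):
    """Return a list of problems found; an empty list means the setup looks fine.

    Hypothetical helper, not part of the EDUVSUM repository.
    """
    problems = []

    # The README asks for Python >= 3.6.
    if sys.version_info < min_version:
        problems.append(
            "Python >= {}.{} is required, found {}.{}".format(
                min_version[0], min_version[1],
                sys.version_info[0], sys.version_info[1],
            )
        )

    # The README asks for the dataset to be placed in the "data" folder.
    data_path = Path(data_dir)
    if not data_path.is_dir():
        problems.append("Folder '{}' does not exist".format(data_dir))
    elif not any(data_path.iterdir()):
        problems.append("Folder '{}' is empty".format(data_dir))

    return problems


if __name__ == "__main__":
    found = check_setup()
    for problem in found:
        print("Problem:", problem)
    if not found:
        print("Setup looks fine.")
```

Run it from the repository root so that the relative "data" path resolves to the intended folder.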
@inproceedings{ghauri2020eduvsum,
title={Classification of Important Segments in Educational Videos using Multimodal Features},
author={Ghauri, Junaid Ahmed and Hakimov, Sherzod and Ewerth, Ralph},
booktitle={International Workshop on Investigating Learning During Web Search (IWILDS 2020) co-located with CIKM},
year={2020}
}