MUSIED

Dataset and baselines for the paper "MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts".

Data

The dataset is available in the “data” folder. The data format is described in this document.

Run preprocessing.py to generate the sentence-level input for the models. The results are saved in the data directory:

├── data
│     ├── train_sentence.json
│     ├── dev_sentence.json
│     └── test_sentence.json
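
To check that preprocessing produced the expected files, a minimal sketch like the one below may help. It only relies on the three file names listed above. The record layout inside each file is not assumed, and because the exact serialization is not stated here, the hypothetical helper `load_records` accepts either a single JSON document or one JSON object per line. The names `SPLIT_FILES`, `load_records`, and `load_splits` are illustrative and not part of the released code.

```python
import json
from pathlib import Path

# File names taken from the directory listing above.
SPLIT_FILES = {
    "train": "train_sentence.json",
    "dev": "dev_sentence.json",
    "test": "test_sentence.json",
}


def load_records(path):
    """Load a file that holds one JSON document or JSON Lines (assumption)."""
    text = Path(path).read_text(encoding="utf-8")
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to one JSON object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Normalize a single top-level object to a one-element list.
    return data if isinstance(data, list) else [data]


def load_splits(data_dir="data"):
    """Return {split: records} for every split file found in data_dir."""
    splits = {}
    for split, name in SPLIT_FILES.items():
        path = Path(data_dir) / name
        if path.exists():
            splits[split] = load_records(path)
    return splits


if __name__ == "__main__":
    # Print the number of records per split as a quick sanity check.
    for split, records in load_splits().items():
        print(split, len(records))
```

Splits whose files are missing are simply skipped, so an empty result suggests that preprocessing.py has not been run yet or wrote its output elsewhere.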

We release the source code for the baselines, including

sentence-level models

document-level models