Supersense Sequence Labelling

About

Many models specialise in the NER (Named Entity Recognition) task, generally optimising their results for the CoNLL-2003 shared task on NER. If SuperSeq (Supersense Sequence) Labelling is considered an extension of NER, the models that achieve 90%+ F1 scores on that data fail to perform at a comparable level. This project evaluates the SOTA NER models on the SuperSeq Labelling task and investigates which additional features need to be captured so as to extend NER to this problem.

The project was carried out under the guidance of Prof. Oier at UPV-EHU in May–June 2019.

Data for Training and Evaluation

TODO: Edit the data here, with links

Models Investigated

  1. Perceptron Model - This model is considered both the baseline and the SOTA for the SuperSeq Labelling task. As described in the paper [1], the authors used a set of hand-refined features to build a perceptron tagger for the data. Running this tagger yields an F1 score of 69.46 on the test data.

  2. Model 1 (word-level bi-LSTM + CRF) - The model suggested in the paper [2] by Huang et al. The implementation was borrowed from a GitHub repository¹ and then tuned on the data.

  3. Model 2 (word-level bi-LSTM + character-level bi-LSTM + CRF) - The model suggested in the paper [3] by Lample et al. The implementation was borrowed from a GitHub repository¹ and then tuned on the data.

  4. Model 3 (convNet on characters + word-level bi-LSTM + CRF) - The model suggested in the paper [4] by Ma and Hovy. The implementation was borrowed from a GitHub repository¹ and then tuned on the data. (A sketch of the bi-LSTM + CRF backbone shared by Models 1–3 follows this list.)
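
All three neural models share the same backbone: token representations feed a bidirectional LSTM whose per-token emission scores are decoded by a CRF layer that captures tag-transition constraints. The following PyTorch sketch shows that backbone for the word-level case (Model 1). It is an illustration with hypothetical dimensions, not the borrowed implementation, and it uses the third-party pytorch-crf package for the CRF layer.

```python
# Illustrative sketch only: hypothetical dimensions; the CRF layer comes
# from the third-party `pytorch-crf` package (pip install pytorch-crf).
import torch
import torch.nn as nn
from torchcrf import CRF


class BiLSTMCRFTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # e.g. initialised from GloVe
        self.bilstm = nn.LSTM(embed_dim, hidden_dim // 2,
                              batch_first=True, bidirectional=True)
        self.emit = nn.Linear(hidden_dim, num_tags)       # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)        # learns tag transitions

    def _scores(self, tokens):
        hidden, _ = self.bilstm(self.embed(tokens))
        return self.emit(hidden)

    def loss(self, tokens, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(self._scores(tokens), tags, mask=mask, reduction='mean')

    def predict(self, tokens, mask):
        # Viterbi decoding of the best tag sequence per sentence.
        return self.crf.decode(self._scores(tokens), mask=mask)


# Toy usage: batch of 2 sentences, 5 tokens each, 7 possible tags.
model = BiLSTMCRFTagger(vocab_size=1000, num_tags=7)
tokens = torch.randint(0, 1000, (2, 5))
tags = torch.randint(0, 7, (2, 5))
mask = torch.ones(2, 5, dtype=torch.bool)
print(model.loss(tokens, tags, mask))   # training objective
print(model.predict(tokens, mask))      # two lists of predicted tag ids
```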

Statistics

Table 1: Training on train+dev data, tuning and results on test data

| Model Name | Embeddings Used | Data Format | F1 Score |
|------------|-----------------|-------------|----------|
| Perceptron | -               | IOB         | 69.46    |
| Model 1    | GloVe           | IOBES       | 67.68    |
| Model 2    | GloVe           | IOBES       | 67.48    |
| Model 3    | GloVe           | IOBES       | 66.73    |
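
The Data Format column refers to the chunk-encoding scheme of the tags: IOB marks tokens as B- (begin), I- (inside), or O (outside a chunk), while IOBES additionally marks chunk-final tokens (E-) and single-token chunks (S-), giving the neural models an explicit boundary signal. A minimal converter sketch (the supersense labels in the example are illustrative):

```python
# IOB vs. IOBES for a toy supersense-tagged span. Assumes IOB2-style
# input, i.e. every chunk starts with a B- tag; labels are illustrative.
def iob_to_iobes(tags):
    iobes = []
    for i, tag in enumerate(tags):
        if tag == 'O':
            iobes.append(tag)
            continue
        prefix, label = tag.split('-', 1)
        nxt = tags[i + 1] if i + 1 < len(tags) else 'O'
        chunk_ends = nxt != 'I-' + label  # chunk ends unless next token continues it
        if prefix == 'B':
            iobes.append(('S-' if chunk_ends else 'B-') + label)
        else:  # prefix == 'I'
            iobes.append(('E-' if chunk_ends else 'I-') + label)
    return iobes


# A two-token noun.act chunk, an O token, then a single-token verb.contact chunk:
print(iob_to_iobes(['B-noun.act', 'I-noun.act', 'O', 'B-verb.contact']))
# -> ['B-noun.act', 'E-noun.act', 'O', 'S-verb.contact']
```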

Evaluation of Additional Models

TODO: Add links

The following embeddings were used to tune the models, with the associated hyperparameters as listed in the next section:

  1. ELMo Embeddings
  2. Flair Embeddings
  3. BERT Embeddings
  4. Stack1 - ELMo + Flair
  5. Stack2 - ELMo + BERT
  6. Stack3 - Flair + BERT
  7. Stack4 - ELMo + Flair + BERT

For each of the embeddings listed above, a variant combining the embedding with character-based embeddings was also tried (see the sketch below).
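
Assuming the experiments use the Flair library (whose embedding classes are named below; the specific model choices such as 'news-forward' are placeholders, not confirmed from this repository), a stack such as Stack4 can be built as follows:

```python
# Sketch of building "Stack4" (ELMo + Flair + BERT) with the Flair library
# (circa version 0.4); model names are Flair defaults, not taken from this
# repository. ELMoEmbeddings additionally requires the allennlp package.
from flair.embeddings import (BertEmbeddings, CharacterEmbeddings,
                              ELMoEmbeddings, FlairEmbeddings,
                              StackedEmbeddings)

stack4 = StackedEmbeddings([
    ELMoEmbeddings(),                 # contextual, character-convolution based
    FlairEmbeddings('news-forward'),  # contextual string embeddings, both directions
    FlairEmbeddings('news-backward'),
    BertEmbeddings(),                 # defaults to bert-base-uncased
])

# The "+ character-based embeddings" variant appends task-trained
# character features to the same stack:
stack4_char = StackedEmbeddings([
    ELMoEmbeddings(),
    FlairEmbeddings('news-forward'),
    FlairEmbeddings('news-backward'),
    BertEmbeddings(),
    CharacterEmbeddings(),
])
```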

Hyperparameters Tuned

Table 2: Hyperparameters tuned for each embedding configuration

| Parameter       | Search Type     | Limits                 |
|-----------------|-----------------|------------------------|
| Hidden Layers   | Random integers | 0–400                  |
| RNN Layers      | Choice          | 1, 2                   |
| Dropout         | Uniform         | 0–0.5                  |
| Learning Rate   | Choice          | 0.05, 0.10, 0.15, 0.20 |
| Mini Batch Size | Choice          | 16, 32                 |
| use_CRF         | Choice          | True, False            |
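
The search types above map directly onto hyperopt distributions. A sketch of this search space, assuming Flair's hyperopt-based parameter selection was used (the data paths, column format, and tag type 'supersense' below are hypothetical, not taken from this repository):

```python
# Sketch of the Table 2 search space via flair.hyperparameter.param_selection
# (Flair circa 0.4); file names, column format, and the 'supersense' tag
# type are assumptions for illustration.
from hyperopt import hp
from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings
from flair.hyperparameter.param_selection import (
    OptimizationValue, Parameter, SearchSpace, SequenceTaggerParamSelector)

corpus = ColumnCorpus('data/', {0: 'text', 1: 'supersense'},
                      train_file='train.conll', dev_file='dev.conll',
                      test_file='test.conll')

search_space = SearchSpace()
search_space.add(Parameter.EMBEDDINGS, hp.choice,
                 options=[[WordEmbeddings('glove')]])           # swap in the stacks above
search_space.add(Parameter.HIDDEN_SIZE, hp.randint, upper=400)  # random integers 0-400
search_space.add(Parameter.RNN_LAYERS, hp.choice, options=[1, 2])
search_space.add(Parameter.DROPOUT, hp.uniform, low=0.0, high=0.5)
search_space.add(Parameter.LEARNING_RATE, hp.choice,
                 options=[0.05, 0.10, 0.15, 0.20])
search_space.add(Parameter.MINI_BATCH_SIZE, hp.choice, options=[16, 32])
search_space.add(Parameter.USE_CRF, hp.choice, options=[True, False])

selector = SequenceTaggerParamSelector(
    corpus, 'supersense', 'resources/tuning',
    optimization_value=OptimizationValue.DEV_SCORE)
selector.optimize(search_space, max_evals=50)
```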

Statistics

Table 3: Training on train data, tuning on dev data, results on test data

| Embedding Type | F1 Score | Tuned Hyperparameters |
|----------------|----------|-----------------------|
| ELMo           | ?        | ?                     |
| Flair          | ?        | ?                     |
| BERT           | ?        | ?                     |
| Stack1         | ?        | ?                     |
| Stack2         | ?        | ?                     |
| Stack3         | ?        | ?                     |
| Stack4         | ?        | ?                     |

Table 4: Training on train data, tuning and results on test data

| Embedding Type | F1 Score | Tuned Hyperparameters |
|----------------|----------|-----------------------|
| ELMo           | ?        | ?                     |
| Flair          | ?        | ?                     |
| BERT           | ?        | ?                     |
| Stack1         | ?        | ?                     |
| Stack2         | ?        | ?                     |
| Stack3         | ?        | ?                     |
| Stack4         | ?        | ?                     |

References

[1]: Ciaramita, M., & Altun, Y. (2006). Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 594–602). https://doi.org/10.3115/1610075.1610158
[2]: Huang, Z., Xu, W., & Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging. arXiv preprint. http://arxiv.org/abs/1508.01991
[3]: Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K., & Dyer, C. (2016). Neural architectures for named entity recognition. arXiv preprint. http://arxiv.org/abs/1603.01360
[4]: Ma, X., & Hovy, E. (2016). End-to-end sequence labeling via bi-directional LSTM-CNNs-CRF. arXiv preprint. http://arxiv.org/abs/1603.01354

Footnotes

  1. GitHub repository by Guillaume Genthial
