The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks

This repository contains the code and dataset for the paper

The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks. Brihi Joshi, Neil Shah, Francesco Barbieri, Leonardo Neves

accepted at The 28th International Conference on Computational Linguistics (COLING’20).

If you end up using this code or the data, please cite our paper:

@inproceedings{joshi-etal-2020-devil,
    title = "The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks",
    author = "Joshi, Brihi  and
      Shah, Neil  and
      Barbieri, Francesco  and
      Neves, Leonardo",
    booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (Online)",
    publisher = "International Committee on Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.coling-main.326",
    pages = "3652--3659",
    abstract = "Contextual embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks such as question answering, sentiment analysis, and textual similarity in recent years. Extensive work shows how accurately such models can represent abstract, semantic information present in text. In this expository work, we explore a tangent direction and analyze such models{'} performance on tasks that require a more granular level of representation. We focus on the problem of textual similarity from two perspectives: matching documents on a granular level (requiring embeddings to capture fine-grained attributes in the text), and an abstract level (requiring embeddings to capture overall textual semantics). We empirically demonstrate, across two datasets from different domains, that despite high performance in abstract document matching as expected, contextual embeddings are consistently (and at times, vastly) outperformed by simple baselines like TF-IDF for more granular tasks. We then propose a simple but effective method to incorporate TF-IDF into models that use contextual embeddings, achieving relative improvements of up to 36{\%} on granular tasks.",
}

About

Figure: An example pair of articles from the News Dedup dataset. Both report the same news event, and are thus similar on a granular level; the colored text indicates fine-grained details associated with this determination. Both articles are also of the "sports" topic, and are thus similar on an abstract level.

Contextual embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks such as question answering, sentiment analysis, and textual similarity in recent years. Extensive work shows how accurately such models can represent abstract, semantic information present in text. In this expository work, we explore a tangent direction and analyze such models' performance on tasks that require a more granular level of representation. We focus on the problem of textual similarity from two perspectives: matching documents on a granular level (requiring embeddings to capture fine-grained attributes in the text), and an abstract level (requiring embeddings to capture overall textual semantics). We empirically demonstrate, across two datasets from different domains, that despite high performance in abstract document matching as expected, contextual embeddings are consistently (and at times, vastly) outperformed by simple baselines like TF-IDF for more granular tasks. We then propose a simple but effective method to incorporate TF-IDF into models that use contextual embeddings, achieving relative improvements of up to 36% on granular tasks.
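The exact combination method used in the paper lives in the experiment notebooks. Purely as orientation, the sketch below shows one plausible way to blend the two signals: a convex combination of TF-IDF cosine similarity and contextual-embedding cosine similarity. It assumes scikit-learn is installed; embed_fn, blended_similarity, and the weight alpha are hypothetical names for illustration, not the paper's API.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def blended_similarity(docs_a, docs_b, embed_fn, alpha=0.5):
    """Blend TF-IDF and contextual-embedding similarity (illustrative only).

    docs_a, docs_b: equal-length lists of raw document strings, compared pairwise.
    embed_fn: any callable mapping a list of strings to a 2-D array of
        contextual embeddings (e.g. a BERT-based sentence encoder).
    alpha: weight on the TF-IDF signal; a hypothetical hyperparameter.
    """
    # TF-IDF channel: fit on all documents, then compare each pair.
    tfidf = TfidfVectorizer().fit(docs_a + docs_b)
    sim_tfidf = cosine_similarity(tfidf.transform(docs_a),
                                  tfidf.transform(docs_b)).diagonal()

    # Contextual-embedding channel.
    sim_emb = cosine_similarity(embed_fn(docs_a), embed_fn(docs_b)).diagonal()

    # Simple convex combination of the two similarity signals.
    return alpha * sim_tfidf + (1 - alpha) * sim_emb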

License

Copyright (c) Snap Inc. 2020. This sample code is made available by Snap Inc. for informational purposes only. It is provided as-is, without warranty of any kind, express or implied, including any warranties of merchantability, fitness for a particular purpose, or non-infringement. In no event will Snap Inc. be liable for any damages arising from the sample code or your use thereof.

Quick Start

Requirements

  • Python 3.5.x

To install the dependencies used in the code, use the requirements.txt file as follows:

pip install -r requirements.txt

Installing baselines and datasets

  1. To use the SIF baseline, follow the installation steps given here and place it at the code/SIF location (a brief sketch of the SIF weighting scheme follows this list).
  2. To access the Bugrepo dataset, download it from this LogPAI Bugrepo repository.
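As background for step 1: SIF (Arora et al., 2017) embeds a sentence as a weighted average of its word vectors, down-weighting frequent words, and then removes the first principal component across sentences. Below is a minimal numpy sketch of the weighting step; word_vecs and word_prob are assumed inputs, and the experiments use the installed code/SIF package rather than this sketch.

import numpy as np

def sif_embedding(tokens, word_vecs, word_prob, a=1e-3):
    # Each word vector is weighted by a / (a + p(w)), so rare words count more.
    vecs = [word_vecs[w] * (a / (a + word_prob[w]))
            for w in tokens if w in word_vecs]
    return np.mean(vecs, axis=0) if vecs else None

# The full SIF method additionally subtracts each sentence's projection onto
# the first principal component of the embedding matrix (omitted for brevity).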

Running the code

The code is organised as follows.

├── code
│   ├── utils/ # This folder contains all the necessary pre-processing and skeleton code for the models. 
│   ├── SIF/ # This folder contains the SIF baseline requirements, installed as per the above instructions.
│   ├── news_dedup_experiments/ # This folder contains the experiments done with the News Dedup dataset
│   └── bug_data_experiments/ # This folder contains the experiments done with the Bugrepo dataset
└── README.md

To run the code for a specific experiment, open the corresponding Jupyter notebook and run its cells to train the models.

For example, to run the TF-IDF experiments on the Bugrepo dataset, run the following:

cd code/bug_data_experiments/
jupyter notebook

and open the tf_idf_classification_bugrepo.ipynb notebook.
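If you want a feel for what the TF-IDF baseline computes before opening the notebook, the core is a standard scikit-learn pipeline along these lines. The toy bug reports below are made up for illustration; the notebook defines the real preprocessing and evaluation.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy bug reports: the first two describe the same crash and should score high.
reports = [
    "App crashes when opening the settings page",
    "Crash on settings screen after latest update",
    "Feature request: dark mode for the editor",
]

X = TfidfVectorizer(stop_words="english").fit_transform(reports)

# Pairwise cosine similarities; high off-diagonal scores suggest duplicates.
print(cosine_similarity(X).round(2))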

Contact

If you face any problems running this code, you can contact us at brihi16142[at]iiitd[dot]ac[dot]in or open an issue in this repository.

For license information, see LICENSE
