
Strong-AI-Lab/Logical-Reasoning-Reading-Comprehension-ReClor


Logical-Reasoning-Reading-Comprehension-ReClor

This repository contains the code for our #5 submission to the ReClor Logical Reasoning Reading Comprehension Leaderboard (2021/07/28).


It also contains the code for our #6 submission to the ReClor Logical Reasoning Reading Comprehension Leaderboard (2021/07/27).


The ReClor leaderboard is linked below; our team name is qbao775. Our method is RoBERTa-large fine-tuned on the MNLI dataset. For the first submission, we used the RoBERTa-large-mnli model from Hugging Face.

ReClor Leaderboard

We also fine-tuned a RoBERTa-large model on MNLI ourselves. The fine-tuning code and hyperparameters follow the Hugging Face text-classification example (https://github.com/huggingface/transformers/tree/master/examples/pytorch/text-classification). You need to clone the transformers repository (https://github.com/huggingface/transformers) first.

MNLI Project page

How to run the code?

Environment setup

  • Python3.5+
  • PyTorch 1.0+
  • Transformers 2.3.0
  • apex - Nvidia apex, used for mixed precision training

Install Python3.5+, PyTorch 1.0+, Transformers and apex.
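Before running the scripts, it can help to confirm which of these packages are actually installed. The following is a minimal sketch, not part of this repository: it assumes Python 3.8+ (for `importlib.metadata`) and assumes the packages are registered under the distribution names `torch`, `transformers` and `apex`. The helper name `installed_versions` is hypothetical.

```python
from importlib import metadata  # standard library in Python 3.8+

# Assumed distribution names; adjust them if your installation differs.
REQUIRED = ["torch", "transformers", "apex"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'not installed'}")
```

A package reported as "not installed" here may still need to be checked by hand, since the sketch only looks at installed package metadata.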

Load existing RoBERTa-large-mnli from Huggingface

Our #5 submission code (2021/07/28) is run_roberta_large_MNLI_PARARULE_Plus_reclor.sh, which is located in the scripts folder. Run it as follows:

  1. Before running run_roberta_large_MNLI_PARARULE_Plus_reclor.sh, first run run_roberta_large_MNLI_PARARULE_Plus.sh, then use its latest output model as the initialization model for run_roberta_large_MNLI_PARARULE_Plus_reclor.sh.
  2. Run the script from the main directory, for example: sh scripts/run_roberta_large_MNLI_PARARULE_Plus_reclor.sh
  3. The test predictions are written to test_preds.npy. Submit this file to the ReClor leaderboard.
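Step 1 asks for the latest output model of the first script. As a rough sketch, assuming the output directory contains subdirectories named checkpoint-<step> (the naming used by the Hugging Face training scripts; this has not been verified against the scripts in this repository), the most recent one could be located like this. The function name `latest_checkpoint` is hypothetical.

```python
import re
from pathlib import Path


def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> subdirectory with the highest step, or None."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    best_step, best_path = -1, None
    for child in Path(output_dir).iterdir():
        match = pattern.match(child.name)
        # Compare steps numerically: sorted as strings, "checkpoint-90"
        # would wrongly come after "checkpoint-1000".
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path
```

The returned path would then be supplied as the initialization model in run_roberta_large_MNLI_PARARULE_Plus_reclor.sh. If the first script writes its final model directly into the output directory, that directory can be used instead.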

Our #6 submission code (before 2021/07/28) is run_roberta_large_mnli.sh, which is located in the scripts folder. You can run it directly.

  1. Run the script from the main directory, for example: sh scripts/run_roberta_large.sh
  2. The test predictions are written to test_preds.npy. Submit this file to the ReClor leaderboard.

The test prediction results (test_preds.npy) submitted to the leaderboard and the models can be found here.
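Before submitting, it may be worth inspecting test_preds.npy. The sketch below assumes the file holds a one-dimensional array of predicted option indices in the range 0 to 3 (ReClor questions have four answer options); that layout is an assumption about the output format rather than something stated here. The function name `summarize_predictions` is hypothetical.

```python
from collections import Counter

import numpy as np

NUM_OPTIONS = 4  # ReClor questions have four answer options


def summarize_predictions(path):
    """Load a predictions file and return (count, per-option frequency)."""
    preds = np.load(path)
    # Assumed layout: one predicted option index per test question.
    if preds.ndim != 1:
        raise ValueError(f"expected a 1-D array, got shape {preds.shape}")
    if preds.size and (preds.min() < 0 or preds.max() >= NUM_OPTIONS):
        raise ValueError("prediction index outside the range 0..3")
    return len(preds), Counter(int(p) for p in preds)


if __name__ == "__main__":
    total, counts = summarize_predictions("test_preds.npy")
    print(f"{total} predictions, distribution: {dict(sorted(counts.items()))}")
```

A heavily skewed distribution, such as nearly all predictions falling on one option, can be an early sign that the wrong model was loaded.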

Finetune a RoBERTa-large-mnli by yourself

  1. Clone the transformers repository (https://github.com/huggingface/transformers): git clone https://github.com/huggingface/transformers.git

  2. Run cd transformers and then pip install -e .

  3. Run cd ./examples/pytorch/text-classification/ and then run the script as shown in its README.md. You only need to set TASK_NAME to mnli, as shown below. The script downloads and loads the MNLI dataset automatically.

export TASK_NAME=mnli

python run_glue.py \
  --model_name_or_path roberta-large \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 3 \
  --output_dir /tmp/$TASK_NAME/
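To get a feel for how long this run is, the number of optimizer steps can be estimated from the hyperparameters above. This is a back-of-the-envelope sketch that assumes no gradient accumulation and uses the MNLI training split size of 392,702 examples; the function name `total_optimizer_steps` and the `num_devices` parameter are illustrative, not part of run_glue.py.

```python
import math

MNLI_TRAIN_EXAMPLES = 392_702  # size of the MNLI training split


def total_optimizer_steps(num_examples, per_device_batch_size, epochs, num_devices=1):
    """Estimate optimizer steps, assuming no gradient accumulation."""
    effective_batch = per_device_batch_size * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # Batch size 32 and 3 epochs, as in the command above, on a single device.
    print(total_optimizer_steps(MNLI_TRAIN_EXAMPLES, 32, 3))  # prints 36816
```

On a single device this gives 12,272 steps per epoch and 36,816 steps in total; with more devices the per-step batch grows and the step count shrinks accordingly.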

Built With

  • Torch - library used to train and run models
  • Transformers - Huggingface library used to implement models
  • Sklearn - library used to implement and evaluate models
  • Matplotlib - main plotting library
  • Seaborn - helper plotting library for some charts
  • NumPy - main numerical library for data vectorisation
  • Pandas - helper data manipulation library
  • Jsonlines - helper jsonl data manipulation library
  • Apex - Nvidia library used for mixed precision training

Acknowledgement

Thanks to the ReClor group for providing the benchmark source code: https://github.com/yuweihao/reclor

PARARULE Plus: A Larger Deep Multi-Step Reasoning Dataset over Natural Language https://github.com/Strong-AI-Lab/PARARULE-Plus