Question Answering on SQuAD

I started this project to explore the domain of Question Answering, especially attention mechanisms and how they affect the overall performance of a model. SQuAD is a reading comprehension dataset: the model receives a paragraph and a question about that paragraph as input. The answer to the question is a contiguous span of the paragraph, i.e. we have to predict the start and end indices of the answer. More info about the dataset can be found in data_visualization.py and in the paper https://arxiv.org/pdf/1606.05250.pdf. I have used the template provided for the default final project of Stanford's CS224n course.
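
To make the prediction target concrete, here is a minimal sketch of one SQuAD-style example and of how the character offset of the answer (which is what the dataset stores) maps to the token start and end indices the model must predict. The whitespace tokenizer is a deliberate simplification; the real preprocessing lives in squad_preprocess.py.

```python
# One SQuAD-style example: the answer is a contiguous span of the context.
context = "Super Bowl 50 was an American football game played in Santa Clara."
question = "Where was Super Bowl 50 played?"
answer_text = "Santa Clara"
answer_char_start = context.index(answer_text)  # SQuAD stores this character offset

# Naive whitespace tokenization (a simplification of squad_preprocess.py).
tokens = context.split()
offsets, pos = [], 0
for tok in tokens:                    # record each token's character offset
    pos = context.index(tok, pos)
    offsets.append(pos)
    pos += len(tok)

# Convert the character-level answer span to token start/end indices.
start_idx = max(i for i, off in enumerate(offsets) if off <= answer_char_start)
end_idx = max(i for i, off in enumerate(offsets)
              if off < answer_char_start + len(answer_text))
print(start_idx, end_idx, tokens[start_idx:end_idx + 1])
# 10 11 ['Santa', 'Clara.']  (the trailing '.' is an artifact of naive tokenization)
```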

Papers I have tried to implement (in part)

The blog post "Challenges of reproducing R-NET" helped me gain a lot of insight into R-NET.

About the code

  • get_started.sh: A script to install requirements and download and preprocess the data.
  • requirements.txt: Used by get_started.sh to install requirements.
  • code/: A directory containing all the code:
    – preprocessing/: Code to preprocess the SQuAD data so it is ready for training:
      • download_wordvecs.py: Downloads and stores the pretrained word vectors (GloVe).
      • squad_preprocess.py: Downloads and preprocesses the official SQuAD train and dev sets and writes the preprocessed versions to file.
    – data_batcher.py: Reads the preprocessed data from file and processes it into padded batches for training (a simplified sketch of this batching follows the list).
    – main.py: The top-level entry point. You can run this file to train the model, view examples from the model and evaluate the model.
    – modules.py: Contains components for the different models.
    – pretty_print.py: Contains code to visualize model output.
    – qa_model.py: Contains the model definition.
    – vocab.py: Contains code to read GloVe embeddings from file and build an embedding matrix.
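
As a rough illustration of what the batching step does, here is a minimal NumPy sketch that pads variable-length token-id sequences to a common length and builds a mask. The function name, the PAD_ID value and the exact interface are assumptions for illustration, not the actual API of data_batcher.py.

```python
import numpy as np

PAD_ID = 0  # assumed padding id; the real id comes from the vocabulary in vocab.py

def pad_batch(token_id_lists):
    """Pad a batch of variable-length token-id lists to the batch maximum.

    Returns the padded (batch_size, max_len) matrix and a 0/1 mask marking
    real tokens. A simplified stand-in for data_batcher.py, not its real code.
    """
    max_len = max(len(ids) for ids in token_id_lists)
    batch = np.full((len(token_id_lists), max_len), PAD_ID, dtype=np.int32)
    mask = np.zeros_like(batch)
    for i, ids in enumerate(token_id_lists):
        batch[i, :len(ids)] = ids
        mask[i, :len(ids)] = 1
    return batch, mask

# Example: two contexts of different lengths end up in one rectangular batch.
batch, mask = pad_batch([[4, 8, 15, 16], [23, 42]])
print(batch)  # [[ 4  8 15 16], [23 42  0  0]]
print(mask)   # [[ 1  1  1  1], [ 1  1  0  0]]
```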

Results

I experimented with the techniques I have learnt and tried to understand their effect on performance as far as I could.

  • The various models were compared against a baseline consisting of an RNN encoder with simple attention and a softmax decoder, which gave an F1 score of about 40% on the dev set (a sketch of this kind of basic attention follows the list).
  • With coattention and a softmax decoder I managed to get an F1 score of 63% on the dev set.
  • With self attention (R-NET) and a softmax decoder I managed to get an F1 score of 62% on the dev set.
  • With the complete R-NET architecture (i.e. self attention and answer pointer), and with coattention and answer pointer, I managed to get an F1 score of 64%.

These are some images of questions answered by the model, together with the right answer and F1 score: Result1 Result2 Result3
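
For readers unfamiliar with the baseline's "simple attention", here is a minimal NumPy sketch of context-to-question dot-product attention followed by concatenation, one common way such a baseline blends question information into each context position. This is an illustrative assumption about the architecture, not the TensorFlow code in modules.py.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def basic_attention(context_hiddens, question_hiddens):
    """Context-to-question dot-product attention (illustrative sketch).

    context_hiddens:  (N, h) encoder states for the N context tokens
    question_hiddens: (M, h) encoder states for the M question tokens
    """
    scores = context_hiddens @ question_hiddens.T  # (N, M) similarity scores
    attn = softmax(scores, axis=1)                 # each context token attends over the question
    blended = attn @ question_hiddens              # (N, h) question summary per context token
    # Concatenate each context state with its blended question representation;
    # a softmax decoder over positions can then score answer start/end indices.
    return np.concatenate([context_hiddens, blended], axis=1)  # (N, 2h)

reps = basic_attention(np.random.randn(30, 8), np.random.randn(12, 8))
print(reps.shape)  # (30, 16)
```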

Future Work

  • Analyze the results produced by the model to understand where it goes wrong, so that I can make changes to improve on those areas (like smart span!).
  • Improve the R-NET model, as I am not satisfied with its F1 score. The fault may lie in my implementation of the answer pointer, which is not giving the expected results.
  • Implement a GUI that accepts a question and a passage and displays the answer.
  • Experiment more. I have just started exploring the question answering domain and want to implement and experiment with more papers as I read them.
