We built a question answering (QA) system using BERT fine-tuned on SQuAD 2.0. On a benchmark dataset that we designed for a specific task, it scored 40% on one question bank and 28% on the other.
- This QA system is topic-agnostic: it has no built-in context. You can ask questions about whatever context you feed it (our task adds a little structure, listed below).
- It selects the top 3 documents in a corpus and outputs the answer with the highest confidence score.
- It checks the spelling (Damerau-Levenshtein distance) and grammar (t5-base-grammar-correction) of the question before feeding it to BERT.
- It remembers context history as long as you keep talking about one subject. If you switch between subjects and then refer to the new subject's history, it gets confused :( You can check some of our older issues for more information.
- If you ask about something that it does not have in context, it will respond with "Unable to answer. Please try again." (most of the time).
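
To make the answer selection above more concrete, here is a minimal sketch of picking the top 3 documents, running QA on each, and keeping the answer with the highest confidence score. It is not the code from this repo: `rank_documents`, `answer_question`, `min_score`, and the word-overlap ranking are assumptions made for illustration, and `qa_fn` stands in for whatever BERT call produces an answer and a score.

```python
from typing import Callable, Dict, List

FALLBACK = "Unable to answer. Please try again."


def rank_documents(question: str, documents: List[str], top_k: int = 3) -> List[str]:
    """Return the top_k documents by word overlap with the question.

    Word overlap is only a stand-in retriever (an assumption for this sketch).
    """
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def answer_question(
    question: str,
    documents: List[str],
    qa_fn: Callable[[str, str], Dict],
    min_score: float = 0.2,  # hypothetical confidence threshold
) -> str:
    """Run qa_fn on each top-ranked document and keep the most confident answer."""
    best = {"answer": "", "score": 0.0}
    for context in rank_documents(question, documents):
        result = qa_fn(question, context)  # expected: {"answer": str, "score": float}
        if result["score"] > best["score"]:
            best = result
    # Fall back when the best answer is empty or not confident enough.
    if not best["answer"].strip() or best["score"] < min_score:
        return FALLBACK
    return best["answer"]
```

With Hugging Face `transformers`, `qa_fn` could wrap a `pipeline("question-answering", ...)` object, whose result is a dict containing `answer` and `score`. Which model and threshold to use depends on your setup.
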
We used a specific corpus for our task, but you can adapt the system to your own needs. The current structure is:
- you are prompted to choose from 3 different corpora
- all further questions based on selected corpus
- once you decide to stop asking questions, the system prompts whether you want to learn more
- if you decide yes, then three more choices are offered
- if no, conversation ends
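
The flow above can be sketched as a small loop. This is an illustration only, not the `begin_conversation` implementation: `run_conversation`, `answer_fn`, `ask`, `say`, the `stop` keyword, and the prompt wording are all hypothetical, and the sketch assumes that "learn more" simply re-offers the same corpus choices.

```python
from typing import Callable, Dict, List


def run_conversation(
    corpora: Dict[str, List[str]],
    answer_fn: Callable[[str, List[str]], str],
    ask: Callable[[str], str] = input,
    say: Callable[[str], None] = print,
) -> None:
    """Choose a corpus, answer questions about it, then offer to learn more."""
    names = list(corpora)
    while True:
        # Step 1: prompt for one of the available corpora.
        choice = ask(f"Choose a corpus {names}: ").strip()
        if choice not in corpora:
            say("Please choose one of the listed corpora.")
            continue
        documents = corpora[choice]

        # Step 2: all further questions use the selected corpus.
        while True:
            question = ask("Your question (type 'stop' to finish): ").strip()
            if question.lower() == "stop":
                break
            say(answer_fn(question, documents))

        # Step 3: offer the choices again, or end the conversation.
        more = ask("Would you like to learn more? (yes/no): ").strip().lower()
        if more != "yes":
            break
    say("Conversation ended.")
```

Passing `ask` and `say` as parameters keeps the loop easy to script in tests; by default they are the built-in `input` and `print`.
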
- Install the following:
  - `pip install transformers` (for BERT for QA)
  - `pip install happytransformer` (for grammar checking, which uses a T5 model)
  - `pip install symspellpy` (for spell checking)
- Feed your corpus as a dataframe to `current_phase_df`
- Use the `begin_conversation` method in `question_answer_bot.py` to get started!