Skip to content

Latest commit

 

History

History
14 lines (9 loc) · 1.31 KB

score_and_generate_gap_fill_questions_candidates.md

File metadata and controls

14 lines (9 loc) · 1.31 KB

Finding Candidate Sentences for Gap-Fill Question Creation

The first step in the question generation process is to find sentences within the corpus that can be transformed into a multiple-choice, fill-in-the-blank style question. To this end, the idea is to use the BTM topic model that we trained earlier to find such sentences within the corpus.

First, we assign a score to every sentence in the corpus. For this step, we compute the top three most probable topics for each sentence, weight each posterior probabiltiy by 0.5, 0.3, and 0.2, respectively, and sum the resulting values. We use the ScoreSentences program to perform this computation.

Next, we filter out all low-scoring sentences. This step is arbitrary, as in one must manually specify a threshold value. In our experiments with the biology textbook corpus, we use a value of 0.4. The SelectSentencesForQuestions program performs this step.

Note

Before using any of the following programs, ensure that:

  • the sentence data is present and in the correct format.
  • the BTM topic model has been trained on the relevant data.
  • all of the code has been compiled and executable scripts have been created using sbt pack.