Paper-Analysis

External repos used

###Description: The main aim of this project is to find the most frequently repeated topics in previous years' exam papers, a task that might otherwise take some time.

Deliverables: This project is still incomplete, as I am waiting for the scanned copies of the previous years' papers from the university where I am pursuing my B.E.

###Things done

  • Scans the PDF files and extracts text data from them.
  • Tokenizes the text files into words.
  • Deletes irrelevant words from the corpus, such as determiners, verbs, etc.
  • Gives the frequency distribution of the most frequent words in the sample document.
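
The tokenizing, filtering, and counting steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the sample sentence, and the tiny `STOP_WORDS` set are all assumptions made for the example. Removing words by part of speech (determiners, verbs) would need a tagger, so a fixed word list stands in for that step here.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; a real run would need
# a much fuller list or a part-of-speech tagger.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "and", "to",
              "what", "explain", "define"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count every token that is not in the stop-word list."""
    return Counter(t for t in tokenize(text) if t not in stop_words)

# Made-up sample text standing in for text extracted from a paper.
sample = "Explain the stack. Define a queue and a stack. What is a stack?"
freq = word_frequencies(sample)
print(freq.most_common(2))
```

`Counter.most_common(n)` returns the `n` highest counts in descending order, which matches the "most frequent words" output described above.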

###Things to do

  • Get digitized copies of previous year question papers.
  • Make a list of all the topics in the syllabus.
  • Cross-check whether the tokenized list contains the topics in the syllabus and build a frequency distribution table accordingly.
  • Plot the results in a graph.
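
One possible way to approach the cross-check is sketched below. It is only an assumption about how this unfinished step might work: the syllabus list, the sample text, and the helper names are hypothetical. Topics can span several words, so the sketch matches each topic as a consecutive run of tokens, and a plain-text bar chart stands in for the eventual graph.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def topic_frequencies(paper_text, topics):
    """Count how often each syllabus topic (one or more words) occurs."""
    tokens = tokenize(paper_text)
    counts = Counter()
    for topic in topics:
        words = tokenize(topic)
        n = len(words)
        if n == 0:
            continue  # skip empty topic strings
        # Slide a window of n tokens over the paper and count exact matches.
        counts[topic] = sum(
            1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words
        )
    return counts

def print_table(counts):
    """Print topics by descending count with a simple text bar."""
    for topic, count in counts.most_common():
        print(f"{topic:<20} {count:>3} {'#' * count}")

# Hypothetical syllabus topics and paper text, for illustration only.
syllabus = ["linked list", "stack", "binary tree"]
paper = "Explain a linked list. Implement a stack using a linked list. Define stack."
table = topic_frequencies(paper, syllabus)
print_table(table)
```

Topics that never appear are kept in the table with a count of zero, so gaps in coverage stay visible alongside the frequently asked topics.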