###Description: The main aim of this project is to find the most frequently repeated topics in previous years' exam papers, a task that might take some time.
Deliverables: This project is still incomplete, as I am waiting for scanned copies of the previous years' papers from the university where I am doing my B.E.
###Things done
- Scans the PDF files and extracts text data from them.
- Tokenizes the text files into words.
- Deletes irrelevant words, such as determiners and verbs, from the corpus.
- Gives the frequency distribution of the most frequent words in the sample document.
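
The tokenizing, filtering and counting steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not necessarily the code in this project. The `STOP_WORDS` set, the function names and the sample text are all assumptions made for the example. PDF text extraction is left out, and removing verbs would need a part-of-speech tagger, which this sketch does not include.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list (an assumption for this sketch);
# a real run would need a much fuller list.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "or", "is", "are",
              "what", "with", "to", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop list."""
    return [t for t in tokens if t not in stop_words]


def most_frequent(tokens, n=5):
    """Return the n most common tokens as (word, count) pairs."""
    return Counter(tokens).most_common(n)


# Made-up sample text standing in for text extracted from a paper.
sample = ("Explain the stack. What is a queue? "
          "Explain the stack with an example of a queue and a stack.")

tokens = remove_stop_words(tokenize(sample))
top_words = most_frequent(tokens, n=3)
print(top_words)
```

Keeping each step in its own function means the stop list or the tokenizer can be swapped out later without touching the counting code.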
###Things to do
- Get digitized copies of previous year question papers.
- Make a list of all the topics in the syllabus.
- Cross-check whether the tokenized list contains the topics in the syllabus and build a frequency distribution table accordingly.
- Plot the results in a graph.
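
One possible way to approach the cross-check and the frequency table is sketched below. It is only an outline under stated assumptions: the topic list, the sample text and the function names are invented for illustration, and a plain-text bar chart stands in for the graph. Matching is exact after lowercasing, so plural forms such as "stacks" would not be counted under "stack" without stemming.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase the text and collapse it to single-spaced alphanumeric words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def topic_frequencies(paper_text, topics):
    """Count whole-phrase occurrences of each syllabus topic in the paper text."""
    haystack = normalize(paper_text)
    counts = Counter()
    for topic in topics:
        # \b keeps "stack" from matching inside a longer word like "stacks".
        pattern = r"\b" + re.escape(normalize(topic)) + r"\b"
        counts[topic] = len(re.findall(pattern, haystack))
    return counts


def print_bar_chart(counts):
    """Print one row per topic, most frequent first, with a bar of '#' marks."""
    for topic, count in counts.most_common():
        print(f"{topic:<15} {'#' * count} {count}")


# Hypothetical syllabus topics and made-up question text.
syllabus_topics = ["binary tree", "stack", "queue", "hashing"]
paper_text = ("Explain binary tree traversal. Define a stack. "
              "Compare a stack and a queue. What is a binary tree?")

freq = topic_frequencies(paper_text, syllabus_topics)
print_bar_chart(freq)
```

Topics that never appear, such as "hashing" in this sample, stay in the table with a count of zero, which makes gaps in coverage visible.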