rawat-sushil/vertical_search_engine

vertical_search_engine

File Names:

  1. CoventryUni_main_20March.ipynb - Main notebook: builds the inverted index, integrates the search engine with the subject classifier, and builds the UI
  2. TextClassification.ipynb - Code to build and train the subject classifier model
  3. Crawler.ipynb - Code to crawl the Google Scholar website
  4. subject_Classification_NB.sav - Pre-trained subject classifier
  5. Training_data.npy - Training data used to train the classifier
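
For readers unfamiliar with the term, an inverted index maps each term to the set of documents that contain it, so a query can be answered by intersecting those sets. The following is a minimal sketch of the idea only, not the code in CoventryUni_main_20March.ipynb; the function names (`tokenize`, `build_inverted_index`, `search`) and the sample titles are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it.

    `docs` is a dict of {doc_id: text}; this layout is an assumption
    made for illustration.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical publication titles, used only to exercise the sketch.
docs = {
    0: "Deep learning for image search",
    1: "Search engine indexing methods",
    2: "Image classification with naive Bayes",
}
index = build_inverted_index(docs)
matches = search(index, "image search")
```

Here the query is treated as a boolean AND over its terms; a real search engine would typically also rank the matches, for example by TF-IDF.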

Crawler Output:
  1. CoventryUni_Data_Scraped_profile.csv - Contains all the profile information
  2. CoventryUni_Data_Scraped_articles.csv - Contains all the publication information

Folder:
Templates - Contains index.html, which is rendered for the UI

User Interface:

[Two screenshots of the user interface]