
Web Crawler with Integrated Query Recommender System based on BERT


zar-e/Information-Retrieval-System


Information Retrieval System

A web crawler that scrapes data from a website and provides semantically accurate recommendations based on an input query.

Incorporates a TF-IDF vectorizer and BERT.

See documentation for additional info

Brought to you by the Islanders:

Avlonitis Ektor, Lazarevic Milos, Vavakas Alexandros

Usage:

Currently works only on azlyrics.com.

Run the Python files from the command line, or use an IDE of your choice.

Dependencies:

Downloading the Data: • Urllib • Requests • BeautifulSoup • Time • Random • CSV

Preprocessing, TF-IDF: • Pandas • scikit-learn • NumPy • NLTK

Processing, BERT: • Seaborn • Matplotlib • FAISS • Sentence_Transformers
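
To give a rough idea of what the TF-IDF stage does, here is a minimal sketch of ranking documents against a query by TF-IDF cosine similarity. It is not the project's code: it uses only the Python standard library as a stand-in for the scikit-learn vectorizer listed above, and the function names (`tokenize`, `build_tfidf`, `vectorize`, `recommend`) and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def vectorize(tokens, idf):
    """Build an L2-normalised TF-IDF vector (as a dict) from a token list.

    Tokens that never appeared in the indexed documents are ignored.
    """
    counts = Counter(t for t in tokens if t in idf)
    vec = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {term: w / norm for term, w in vec.items()} if norm else {}


def build_tfidf(documents):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [vectorize(tokens, idf) for tokens in tokenized]
    return idf, vectors


def recommend(query, documents, top_k=3):
    """Return up to top_k (document, score) pairs, best match first.

    Documents sharing no terms with the query (score 0) are left out.
    """
    idf, vectors = build_tfidf(documents)
    q = vectorize(tokenize(query), idf)
    # Both vectors are unit length, so the dot product is the cosine similarity.
    scores = [
        (sum(w * vec.get(term, 0.0) for term, w in q.items()), i)
        for i, vec in enumerate(vectors)
    ]
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    return [(documents[i], score) for score, i in ranked[:top_k] if score > 0]


if __name__ == "__main__":
    # Hypothetical sample texts, not scraped data.
    sample_docs = [
        "rain falls on the city streets",
        "sunshine over the mountain road",
        "dancing in the summer rain",
    ]
    for doc, score in recommend("summer rain", sample_docs):
        print(f"{score:.3f}  {doc}")
```

TF-IDF only matches on shared words, so a query phrased with different vocabulary scores zero against a relevant document. That limitation is presumably why the project pairs it with BERT sentence embeddings.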