Information-Retrieval-Data-Mining

Coursework 1: With so much data everywhere, information retrieval systems are very common nowadays. Efforts are being made to develop more accurate retrieval systems that return relevant results for a given query. In this report, I describe how I approached the problem of building an information retrieval system that returns ranked results for a given query. For this, I used cosine similarity for the vector space model and BM25 (a probabilistic model). Finally, I built query-likelihood models using Laplace smoothing, Lidstone correction and Dirichlet smoothing, and compared them.
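As an illustration of how the three smoothing schemes differ, here is a minimal sketch of query-likelihood scoring. It is not taken from task1.py to task4.py; the function names, the tokenised-list inputs and the default values of `eps` and `mu` are all assumptions made for this example.

```python
import math
from collections import Counter


def laplace_log_prob(query, doc, vocab_size):
    """Log P(query | doc) with add-one (Laplace) smoothing."""
    tf = Counter(doc)
    n = len(doc)
    return sum(math.log((tf[t] + 1) / (n + vocab_size)) for t in query)


def lidstone_log_prob(query, doc, vocab_size, eps=0.1):
    """Log P(query | doc) with Lidstone correction (add eps instead of 1).

    eps=0.1 is an illustrative default, not a value from the report.
    """
    tf = Counter(doc)
    n = len(doc)
    return sum(math.log((tf[t] + eps) / (n + eps * vocab_size)) for t in query)


def dirichlet_log_prob(query, doc, collection_tf, collection_len, mu=50.0):
    """Log P(query | doc) with Dirichlet smoothing.

    Each term probability interpolates the document model with the
    collection model: (tf + mu * P(t | collection)) / (|d| + mu).
    mu=50.0 is an illustrative default, not a value from the report.
    """
    tf = Counter(doc)
    n = len(doc)
    score = 0.0
    for t in query:
        p_coll = collection_tf[t] / collection_len
        p = (tf[t] + mu * p_coll) / (n + mu)
        if p == 0.0:
            # Term unseen in the whole collection: the query has zero likelihood.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection of two tokenised documents (hypothetical data).
docs = [["a", "b", "a"], ["b", "c"]]
collection_tf = Counter(t for d in docs for t in d)
collection_len = sum(len(d) for d in docs)
vocab_size = len(collection_tf)

query = ["a", "c"]
ranking = sorted(
    range(len(docs)),
    key=lambda i: dirichlet_log_prob(query, docs[i], collection_tf, collection_len),
    reverse=True,
)
```

Laplace and Lidstone spread the extra probability mass uniformly over the vocabulary, while Dirichlet smoothing distributes it according to how frequent each term is in the whole collection, with shorter documents smoothed more heavily.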

Coursework 2: The basic ranking process in information retrieval is as follows. Documents are indexed and stored. The user query is used to retrieve the top-k documents. These k documents are passed to a ranking model that has been trained on similar data with a learning algorithm. After ranking, the results are displayed to the user on a results page in ranked order. In this report, I used similarities based on query and document embeddings to rank documents for a given query. These rankings were compared using Normalized Discounted Cumulative Gain (NDCG) and mean Average Precision (mAP). I used cosine similarity, document length and query length as features for logistic regression, LambdaMART and neural network models.
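The two evaluation metrics named above can be sketched as follows. This is an illustrative version, not the code used for the report: the function names are hypothetical, the exponential gain `2**rel - 1` is one common choice among several, and both metrics assume that each ranked list already contains every relevant document for its query.

```python
import math


def dcg_at_k(rels, k):
    """Discounted cumulative gain over the top-k relevance labels.

    Uses the exponential gain 2**rel - 1 and a log2(rank + 1) discount.
    """
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(rels[:k]))


def ndcg_at_k(rels, k):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0


def average_precision(rels):
    """Average of precision@rank taken at each relevant position.

    Assumes binary relevance (rel > 0 counts as relevant) and that the
    list holds all relevant documents for the query.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of average precision over a list of per-query rankings."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Relevance labels in ranked order for two hypothetical queries.
rankings = [[1, 0, 1], [0, 1]]
scores = {
    "mAP": mean_average_precision(rankings),
    "NDCG@2": sum(ndcg_at_k(r, 2) for r in rankings) / len(rankings),
}
```

NDCG rewards placing highly relevant documents near the top and supports graded labels, whereas mAP treats relevance as binary and averages precision at each relevant hit.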

Folders and files (31 commits):
Learning to Rank - CW2
LICENSE
README.md
Report.pdf
bm25.csv
dirichlet.csv
laplace.csv
lidstone.csv
task1.py
task2.py
task3.py
task4.py
tfidf.csv

Anjali001/Information-Retrieval-Data-Mining

Folders and files

Latest commit

History

Repository files navigation

Information-Retrieval-Data-Mining

About

Topics

Resources

License

Stars

Watchers

Forks

Languages