Information Retrieval on Cranfield 1400 and NF-Corpus with Vector Space Model and Query Likelihood Model
Updated Feb 28, 2022 - Jupyter Notebook
An Information Retrieval System with three models and three datasets from the ir_datasets library.
An advanced version of the previously implemented search engine: an information retrieval system over the Cranfield collection of 1,400 documents that also applies a stemming algorithm. It is otherwise much the same as the earlier SearchEngine project.
Python-based Information Retrieval system built on the Binary Independence Model (BIM), a probabilistic retrieval model. Features include free-form text queries, relevance feedback, and pseudo-relevance feedback. Performance is evaluated with precision/recall, mean average precision, and R-precision on the standard Cranfield dataset of aerodynamics documents.
🔍 A Lucene demo for searching the Cranfield collection.
Lucene search engine for the Cranfield Collection
Assignment from my MAI module 'Information Retrieval and Web Search' where I index the Cranfield Collection
Search Engine for the Cranfield Collection
Performs tokenization, stemming, lemmatization, index creation, index compression and ranked retrieval of Cranfield documents
Implementation of the Salton and Buckley paper using two methods: TF-IDF and Best Weighted Probabilistic
Implementation of a Vector Space Retrieval Model using TF-IDF and cosine similarity on the Cranfield document corpus