This repository integrates the following functions:
- Vector Space IR
- Boolean IR (basic syntax: term1 and term2, term1 or term2)
- Probabilistic IR
- Query Expansion based on Association Rule Mining
- User feedback Rocchio algorithm
- Co-Author
For the other functions mentioned in the report, refer to the corresponding folders.
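To illustrate the basic Boolean syntax listed above (term1 and term2, term1 or term2), here is a minimal sketch of how such a query could be evaluated against an inverted index. It is not the code used in this repository: the names `build_index` and `boolean_query` and the sample documents are hypothetical, and only a single `and`/`or` operator between two terms is handled.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased token to the set of document ids containing it.

    `docs` is a hypothetical dict of {doc_id: text}; tokens are split on whitespace.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_query(index, query):
    """Evaluate 'term', 'term1 and term2', or 'term1 or term2'."""
    tokens = query.lower().split()
    if len(tokens) == 1:
        return set(index.get(tokens[0], set()))
    if len(tokens) != 3 or tokens[1] not in ("and", "or"):
        raise ValueError("expected 'term1 and term2' or 'term1 or term2'")
    left = index.get(tokens[0], set())
    right = index.get(tokens[2], set())
    # 'and' is a set intersection, 'or' is a set union
    return left & right if tokens[1] == "and" else left | right


# Illustrative documents only; not data from this project.
docs = {
    1: "vector space retrieval model",
    2: "probabilistic retrieval model",
    3: "association rule mining",
}
index = build_index(docs)
both = boolean_query(index, "retrieval and vector")
either = boolean_query(index, "vector or mining")
```

With these sample documents, `both` contains only document 1 and `either` contains documents 1 and 3.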
To run the project properly, you need 64-bit Python 2.7.10 installed. It also relies on the following packages:
- scikit-learn
- nltk (including corpus)
- Scipy (http://www.lfd.uci.edu/~gohlke/pythonlibs/xmshzit7/scipy-0.16.1-cp27-none-win_amd64.whl)
- Matplotlib (which depends on numpy)
- ipython
- pandas
- sympy
- nose
- flask
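
A quick way to confirm that the packages above are importable is a small check script like the sketch below. It is not part of this repository: the helper name `find_missing` is hypothetical, and the import names (for example `sklearn` for scikit-learn and `IPython` for ipython) are the usual module names for these packages. The NLTK corpus data is not checked here.

```python
from __future__ import print_function

import importlib

# Usual import names for the packages listed above (assumed, not taken from the project).
REQUIRED_MODULES = [
    "sklearn", "nltk", "scipy", "matplotlib", "numpy",
    "IPython", "pandas", "sympy", "nose", "flask",
]


def find_missing(modules):
    """Return the module names from `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Any name reported as missing can then be installed with pip before starting the server.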
Once everything is installed (this is easy with pip), run 'python index.py' from a command prompt or shell. You can then explore these functions in a browser at 127.0.0.1:5000.
Thanks!