We'll use Amazon's data for our project. You can either download the data directly from Kaggle or download it using Kaggle's API.
The idea of the project is to build a Hybrid Recommender System using the Cosine and Jaccard similarity methods.
- Product clustering: We'll use product_id, Discounted_Price, Subcategory, and rating to compute product similarity.
- User clustering: We'll calculate user similarity using Jaccard similarity, with input features such as user_id, product_id, Subcategory, and rating.
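The two similarity measures above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy rows are made up, the helper names `product_similarity`, `jaccard`, and `user_similarity` are hypothetical, and it assumes cosine similarity is the measure applied to products. For brevity, the user side compares only the sets of product_id values each user interacted with and leaves out Subcategory and rating.

```python
import numpy as np
import pandas as pd

# Toy data: column names follow the README, values are invented for illustration.
products = pd.DataFrame({
    "product_id": ["P1", "P2", "P3"],
    "Discounted_Price": [100.0, 120.0, 900.0],
    "Subcategory": ["Cables", "Cables", "Televisions"],
    "rating": [4.0, 4.2, 3.5],
})
interactions = pd.DataFrame({
    "user_id": ["U1", "U1", "U2", "U2", "U3"],
    "product_id": ["P1", "P2", "P2", "P3", "P3"],
})


def product_similarity(df):
    """Cosine similarity between products (hypothetical helper)."""
    # Min-max scale the numeric columns so price does not dominate rating.
    numeric = df[["Discounted_Price", "rating"]].to_numpy(dtype=float)
    low = numeric.min(axis=0)
    span = numeric.max(axis=0) - low
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    numeric = (numeric - low) / span
    # One-hot encode the categorical column.
    categories = pd.get_dummies(df["Subcategory"]).to_numpy(dtype=float)
    features = np.hstack([numeric, categories])
    # Cosine similarity is the dot product of unit-length feature vectors.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = features / norms
    ids = df["product_id"].tolist()
    return pd.DataFrame(unit @ unit.T, index=ids, columns=ids)


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_similarity(df):
    """Jaccard similarity between users' product sets (hypothetical helper)."""
    baskets = {user: set(items) for user, items in df.groupby("user_id")["product_id"]}
    users = sorted(baskets)
    matrix = [[jaccard(baskets[a], baskets[b]) for b in users] for a in users]
    return pd.DataFrame(matrix, index=users, columns=users)


product_sim = product_similarity(products)
user_sim = user_similarity(interactions)
print(product_sim.round(3))
print(user_sim.round(3))
```

In this toy data, the two cable products score as far more similar to each other than either does to the television, and users who share a product get a non-zero Jaccard score. scikit-learn's `sklearn.metrics.pairwise.cosine_similarity` computes the same cosine matrix and could replace the manual normalization.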
Dependencies: You'll need to install the dependencies below to run this project.
- numpy: 1.18.1
- pandas: 1.0.1
- matplotlib: 3.5.3
- seaborn: 0.10.0
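- re: 2.2.1 (part of the Python standard library; no installation needed)
- sklearn: 0.22.1 (installed as the scikit-learn package)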
- xlsxwriter: 1.2.7
- scipy: 1.4.1
This project is open-source and distributed under the MIT License. Feel free to use and modify the code as needed.
If you encounter any issues or have suggestions for improvement, please open an issue in the Issues section of this repository.
The code has been tested on Windows. It should work on other operating systems, but this has not yet been tested. If you run into any issue with installation or otherwise, please contact me on LinkedIn.
I've been working as a Data Scientist for a very long time now. I've worked with various NLP, machine learning, and cutting-edge deep learning frameworks to solve business problems. Please feel free to check out my personal website, TowardsMachineLearning.Org, where I cover an array of topics including machine learning, NLP, and deep learning.