Swapneel01/URL-Based-Spam-Classification-Using-Machine-Learning

URL Based Spam Classification Using Machine Learning

This project is an attempt to implement the paper "Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs" by Justin Ma, Lawrence K. Saul, Stefan Savage, and Geoffrey M. Voelker of the Department of Computer Science and Engineering at the University of California, San Diego.

You can download the paper here.

The models used in the paper are Logistic Regression, SVM, and Naive Bayes. We have tried to extend the paper with additional models: Gradient Boosting, Random Forest, and Decision Trees.
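As a rough illustration of how these six model families can be compared side by side, here is a minimal sketch using scikit-learn. It is not the code from the notebooks: the data is a synthetic stand-in produced by `make_classification`, and the hyperparameters, the train/test split, and the choice of `GaussianNB` and a linear-kernel `SVC` are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a URL feature matrix (assumption: NOT the repository's data).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The three model families named in the paper, plus the three extensions.
# All hyperparameters here are illustrative defaults, not tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM (linear kernel)": SVC(kernel="linear"),
    "Naive Bayes": GaussianNB(),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training split and record held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

# Print the models from most to least accurate on the held-out split.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} accuracy = {acc:.3f}")
```

With real URL data, `X` would be replaced by the extracted feature matrix and `y` by the spam/benign labels; accuracies on the synthetic data say nothing about performance on actual URLs.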

To run the code you will need Jupyter Notebook, which is included in the Anaconda Python distribution. Once it is installed, open the notebooks in the initial implementation and final implementation folders.