This tutorial was created by Baligh Mnassri and is inspired by the one written by Ben Sadeghi. It improves and extends that work and adapts it to run on the Databricks cloud. It presents six classifiers, which are compared in the cross-validation part. I will explain how to compute the different evaluation metrics in the binary classification case.
The studied classifiers are:
- Logistic regression
- Naive Bayes
- Linear Support Vector Machine
- Decision tree classifier
- Random forest classifier
- Gradient-boosted tree classifier
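
As a rough illustration of the binary-classification metrics mentioned above, here is a minimal sketch in plain Python. It is not taken from the notebooks: the names `BinaryMetrics` and `binary_metrics` are hypothetical, the toy labels are made up for the example, and the positive class is assumed to be encoded as `1`.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Container for common binary-classification metrics (hypothetical helper)."""
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(labels, predictions, positive=1):
    """Compute accuracy, precision, recall and F1 from paired labels and predictions.

    Assumption: `positive` marks the positive class; any other value is negative.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    # Count the four cells of the confusion matrix.
    tp = fp = tn = fn = 0
    for y, p in zip(labels, predictions):
        if p == positive and y == positive:
            tp += 1
        elif p == positive and y != positive:
            fp += 1
        elif p != positive and y == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + tn + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy data, invented for this example only.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(toy_labels, toy_predictions))
```

With these toy inputs the confusion matrix has 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so all four metrics equal 0.75. The same function could be applied to the predictions of each classifier in the list to compare them on equal terms.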
Two notebooks are available:
- The first one is published on the Databricks cloud at the following link: https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/8143187682226564/612737040150280/3186001515933643/latest.html. It can also be viewed on nbviewer.
- The second one runs on Google Colab and can also be viewed on nbviewer.