P&S Lab Assignment 1: Naive Bayes Classifier

Task

Determine which class, fake or credible, an observation most likely belongs to, using Bayes' formula.

Dataset - fake news
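
For reference, the Bayes formula mentioned in the task can be written out for a message made of words $w_1, \dots, w_n$ and a class $c$ (fake or credible). The symbols here are introduced only for illustration and are not taken from the lab code. The first line is Bayes' formula itself; the second line adds the "naive" assumption that words are conditionally independent given the class.

```latex
P(c \mid w_1, \dots, w_n) = \frac{P(c)\, P(w_1, \dots, w_n \mid c)}{P(w_1, \dots, w_n)}

P(w_1, \dots, w_n \mid c) \approx \prod_{i=1}^{n} P(w_i \mid c)
\quad\Longrightarrow\quad
\hat{c} = \arg\max_{c \in \{\text{fake},\, \text{credible}\}} P(c) \prod_{i=1}^{n} P(w_i \mid c)
```

The denominator is the same for both classes, so it can be dropped when the two probabilities are only compared.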

Algorithm

Steps taken to perform the task:

Analyze the dataset and process it into a bag-of-words representation
Use Bayes' formula to predict the class of a message
- Calculate the probability that the bag-of-words belongs to the fake class
- Calculate the probability that the bag-of-words belongs to the credible class
- Compare them and choose the more probable class
Compute the success rate of the predictions
Calculate the metrics to evaluate the effectiveness of the classifier
Form the confusion matrix to represent the effectiveness of the method
Build the diagrams to visualize the statistics
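
The first two steps above can be sketched roughly as follows. This is a minimal Python illustration, not the code from the lab notebook. The function names, the toy messages, the add-one (Laplace) smoothing, and the use of log-probabilities are all assumptions made for the example; the README does not say how the lab handles unseen words or numeric underflow.

```python
import math
import re
from collections import Counter


def tokenize(text, stop_words=frozenset()):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stop_words]


def train(messages, labels, stop_words=frozenset()):
    """Build a bag-of-words per class and count the messages per class."""
    word_counts = {}                # class -> Counter of word frequencies
    class_counts = Counter(labels)  # class -> number of training messages
    for text, label in zip(messages, labels):
        word_counts.setdefault(label, Counter()).update(tokenize(text, stop_words))
    vocabulary = set().union(*word_counts.values())
    return word_counts, class_counts, vocabulary


def log_score(text, label, model, stop_words=frozenset()):
    """log P(label) + sum of log P(word | label), with add-one smoothing."""
    word_counts, class_counts, vocabulary = model
    total_words = sum(word_counts[label].values())
    score = math.log(class_counts[label] / sum(class_counts.values()))
    for word in tokenize(text, stop_words):
        if word not in vocabulary:
            continue  # assumption: words never seen in training are ignored
        smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
        score += math.log(smoothed)
    return score


def predict(text, model, stop_words=frozenset()):
    """Compare the class scores and return the more probable class."""
    class_counts = model[1]
    return max(class_counts, key=lambda label: log_score(text, label, model, stop_words))


# Hypothetical toy data, not taken from the fake news dataset.
messages = [
    "shocking secret cure doctors hate",
    "you won't believe this shocking trick",
    "parliament passes annual budget",
    "central bank publishes annual report",
]
labels = ["fake", "fake", "credible", "credible"]
model = train(messages, labels)

print(predict("shocking secret trick", model))
print(predict("annual budget report", model))
```

Summing logarithms instead of multiplying probabilities gives the same ranking of the classes while avoiding very small floating-point products on long messages.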

Statistics and summary

Accuracy: approximately 93% (0.9289)
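
As an illustration of how an accuracy figure like this relates to a confusion matrix, here is a small Python sketch. The labels below are hypothetical toy values and do not reproduce the 0.9289 result. Treating "fake" as the positive class, and reporting precision, recall, and F1 alongside accuracy, are assumptions for the example; the README does not list which metrics the lab computes.

```python
from collections import Counter


def confusion_matrix(actual, predicted, positive="fake"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive:
            counts["tp" if p == positive else "fn"] += 1
        else:
            counts["fp" if p == positive else "tn"] += 1
    return counts


def metrics(cm):
    """Derive accuracy, precision, recall, and F1 from the four counts."""
    total = sum(cm.values())
    accuracy = (cm["tp"] + cm["tn"]) / total
    predicted_pos = cm["tp"] + cm["fp"]
    actual_pos = cm["tp"] + cm["fn"]
    precision = cm["tp"] / predicted_pos if predicted_pos else 0.0
    recall = cm["tp"] / actual_pos if actual_pos else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight messages.
actual = ["fake", "fake", "fake", "credible", "credible", "credible", "credible", "fake"]
predicted = ["fake", "fake", "credible", "credible", "credible", "fake", "credible", "fake"]

cm = confusion_matrix(actual, predicted)
result = metrics(cm)
print(dict(cm))
print(result)
```

Accuracy is the share of messages on the diagonal of the confusion matrix (true positives plus true negatives) out of all messages.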

Pros of the Naive Bayes approach:

Simple and easy to implement. Naive Bayes relies on basic probability formulas and concepts and has a clear algorithm.
Comparatively fast, thanks to the assumption that the features are independent. That is one of the reasons why Naive Bayes is used on big datasets.

Cons:

Assumes that the features are independent and loses accuracy when they are not. Hardly any real dataset has completely independent features, so Naive Bayes is best used when speed is valued more than accuracy.

For the given dataset, we assumed that word frequencies are uncorrelated, and the method still produced good results. However, it could be good practice to take the dependencies between word occurrences into account. Even without that, Naive Bayes achieves good classification accuracy.
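
To make the effect of ignored dependencies concrete, here is a small Python sketch with made-up numbers (the likelihoods 0.8 and 0.4 and the equal priors are assumptions chosen for illustration, not values from the dataset). It compares the posterior for one word with the posterior when a second, perfectly correlated word is counted as if it were independent evidence.

```python
def posterior_fake(likelihoods_fake, likelihoods_credible, prior_fake=0.5):
    """Posterior P(fake | words) under the naive independence assumption."""
    joint_fake = prior_fake
    joint_credible = 1.0 - prior_fake
    for p_fake, p_credible in zip(likelihoods_fake, likelihoods_credible):
        joint_fake *= p_fake          # multiply in P(word | fake)
        joint_credible *= p_credible  # multiply in P(word | credible)
    return joint_fake / (joint_fake + joint_credible)


# One word that is twice as likely in fake messages as in credible ones.
once = posterior_fake([0.8], [0.4])

# The same evidence counted twice, as happens when two words always occur
# together but the model treats them as independent.
twice = posterior_fake([0.8, 0.8], [0.4, 0.4])

print(round(once, 4), round(twice, 4))
```

The second posterior is higher even though no new information was added, so correlated words make the classifier overconfident. The predicted class often stays the same, which is consistent with the good accuracy observed here.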

License

The MIT License (MIT)

andylvua/NaiveBayesClassifier
