In the first (left in the above flowchart) part of the project, we constructed a Multinomial Navie Bayes Classifier to investigate the document (comment) classification problem. On the other hand, we built 5 fully connected neural networks with different vectorization techniques to fit the same real world dataset. Lastly, both approaches have been assessed through common metric, including accuracy, precision, recall, and losses.
In the second (right in the above flowchart) part of this project, we simplified the multinormial classification into binomial and applied the previous multinomial Navie Bayes Classifier to generate synthetic data. The data was used to trained both Navie Bayes and neural network approaches. We evaluated the results qualitatively and quantitatively. Situations where each approach performs well and poorly were highlighted.
Note:
- The data used in the project comes from kaggle competition.
- Python packages used in the project requirements.txt.