Quora is a platform that empowers people to learn from each other. On Quora, people can ask questions and connect with others who contribute unique insights and quality answers. However, some questions asked on the platform are disparaging, are intended to insult or disrespect a particular community, use sexual content to cause a nuisance, are founded upon false premises, or are intended to make a statement rather than to seek helpful answers. Such questions are considered insincere.
-
A key challenge is to weed out insincere questions to make the platform a trustworthy and genuine source of information for all the people who use it.
-
This project classifies questions as sincere or insincere using several ML algorithms and ensembles, combined with different embedding techniques. Of the approaches tried, logistic regression with TF-IDF features performed best, with an F1 score of 0.6320.
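For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score is computed from predictions. It is not the project's evaluation code: the helper name `f1_score`, the convention that label `1` means "insincere", and the toy labels are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels only (assumption: 1 = insincere, 0 = sincere).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(accuracy, 4))                   # 0.8
print(round(f1_score(y_true, y_pred), 4))   # 0.6667
```

In this toy case accuracy is 0.8 while F1 is about 0.67, which illustrates why F1 tends to be the more informative metric when the positive class is rare.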
-
This repository contains a detailed report on this project along with its working code.
Dataset link: https://www.kaggle.com/c/quora/data