Quora-Question-Pair-Similarity

Problem Statement

  • Identify which questions asked on Quora are duplicates of questions that have already been asked.
  • This could be useful to instantly provide answers to questions that have already been answered.
  • The task is to predict whether a given pair of questions is a duplicate or not.

Data Overview

Source of data: https://www.kaggle.com/c/quora-question-pairs/data

  • Train.csv contains 5 columns in addition to the row id: qid1, qid2, question1, question2, is_duplicate
  • Size of Train.csv: 60 MB
  • Number of rows in Train.csv: 404,290
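
The figures above can be checked with a short script. The following is a minimal sketch, not part of the original project: the helper name `summarize_pairs` is hypothetical, and it assumes the file is a standard comma-separated CSV with a header row containing the column names listed above. It uses only the Python standard library.

```python
import csv
from collections import Counter

# Column names as listed in the Data Overview (assumed to match the CSV header).
EXPECTED_COLUMNS = ["qid1", "qid2", "question1", "question2", "is_duplicate"]


def summarize_pairs(csv_file):
    """Return (row count, label counts) for an open question-pair CSV file.

    Hypothetical helper for illustration; `csv_file` is any text file object.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    n_rows = 0
    labels = Counter()
    for row in reader:
        n_rows += 1
        # is_duplicate is read as text; count each distinct label value.
        labels[row["is_duplicate"]] += 1
    return n_rows, dict(labels)
```

To try it on the real data, open the file with `open("Train.csv", newline="", encoding="utf-8")` and pass the file object to `summarize_pairs`. If the download matches the overview above, the row count should come out as 404,290.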