Personalized-Cancer-Diagnosis

Problem Statement

Classify the given genetic variations/mutations based on evidence from text-based clinical literature.

Data Overview

Source of data: https://www.kaggle.com/c/msk-redefining-cancer-treatment/data

There are two data files: one contains information about the genetic mutations, and the other contains the clinical evidence (text) that human experts/pathologists use to classify those mutations.

  • Both data files share a common column called ID
  • File contents: training_variants (ID, Gene, Variations, Class) and training_text (ID, Text)
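
Since the two files are linked only through ID, a first step is usually to join them into one record per mutation. The following is a minimal sketch of that join using only the Python standard library, not a definitive loader. It assumes that training_variants is comma-separated with a header row and that training_text has one header line followed by `ID||Text` rows; neither layout is stated above, so check them against the downloaded files. The function name `load_and_join` and all sample values are hypothetical.

```python
import csv
import io


def load_and_join(variants_src, text_src):
    """Return one dict per variant row, with the matching Text attached by ID."""
    # Assumption: the text file has one header line, then "ID||Text" rows.
    next(text_src)  # skip the header line
    text_by_id = {}
    for line in text_src:
        row_id, _, text = line.rstrip("\n").partition("||")
        text_by_id[row_id] = text

    # Assumption: the variants file is comma-separated with a header row
    # that includes the ID column.
    joined = []
    for row in csv.DictReader(variants_src):
        row["Text"] = text_by_id.get(row["ID"], "")  # empty if no text for this ID
        joined.append(row)
    return joined


# Made-up sample rows standing in for the real files (all values hypothetical).
variants_sample = io.StringIO(
    "ID,Gene,Variations,Class\n"
    "0,GENE_A,VAR_1,1\n"
    "1,GENE_B,VAR_2,2\n"
)
text_sample = io.StringIO(
    "ID,Text\n"
    "0||Made-up evidence text for variant 0.\n"
    "1||Made-up evidence text for variant 1.\n"
)

joined = load_and_join(variants_sample, text_sample)
print(joined[0])
```

To run this on the real data, pass open file handles (for example `open("training_variants", newline="")`) in place of the `io.StringIO` samples.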