Exploring Gene Expression Features through Factor and Cluster Analysis in Breast Cancer Patients.

lacodyle/breast_cancer_gene_expression

Breast Cancer Clinical and Gene Expression Data Analysis

Data Analysis Project for DSC424: Advanced Data Analysis at DePaul University
Highlights: Feature Dimensionality Reduction with PCA and Factor Analysis

Analysis of breast cancer clinical features and gene expression features to explore their relationship using principal component analysis (PCA), factor analysis, and cluster analysis. In this group project, my focus was on preprocessing the data and reducing the dimensionality of the gene expression features from 663 to 25 using PCA. Principal factor analysis and common factor analysis were then performed to discover patterns in the gene expression features, and cluster analysis was performed as a further exploration.
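
As a rough illustration of the PCA step described above, the sketch below reduces a 663-column matrix to 25 principal component scores. It is not the project's actual code: the function name `pca_reduce`, the standardization step, and the synthetic random data standing in for the gene expression features are all assumptions made for this example.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its first n_components principal components.

    Hypothetical helper for illustration; columns are standardized first,
    which is an assumption rather than a documented project choice.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    Z = (X - mean) / std

    # SVD of the standardized data: rows of Vt are the principal axes,
    # and squared singular values are proportional to explained variance.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained_ratio[:n_components]


# Synthetic stand-in with the shape quoted in this README
# (1,904 patients x 663 gene expression features); NOT the METABRIC data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1904, 663))

scores, components, explained_ratio = pca_reduce(X, n_components=25)
print(scores.shape)      # reduced feature matrix
print(components.shape)  # loadings of each component on the original features
```

With real data, the cumulative sum of `explained_ratio` is one common way to judge how many components to keep.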

The dataset was acquired from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database through Kaggle. It contains 1,904 instances and 693 features; each instance represents a breast cancer patient, with the patient's clinical attributes and gene expression attributes.
