Binary income classification :

Build a model that predicts whether an individual earns more than $50,000 per year, based on anonymized census data.

Goal :

Understand the factors that influence income inequality and potentially inform targeted social programs.

Data Cleaning or Refinement :

1- Deal with missing values.
2- Figure out why the data is missing.
3- Eliminate extra variables.
4- Eliminate duplicates.
5- Detect and remove outliers (a box plot can confirm whether the data contains outliers).
6- Scale and normalize the data.
7- Eliminate blank spaces or missing information (SimpleImputer can be used to handle missing values).
8- Arrange the data logically and sequentially so that it is easy to visualize.
9- Group the data in rows and columns to help with arrangement and visualization.
10- Deal with inconsistent data entry.
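Several of the cleaning steps above can be sketched with pandas and scikit-learn. This is a minimal illustration, not the notebook's actual code: the toy data, the column names, the "?" placeholder for missing entries, and the helper names `clean` and `iqr_mask` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data; column names and values are illustrative only.
raw = pd.DataFrame({
    "age": [25, 38, 38, np.nan, 52, 29, 95],
    "hours_per_week": [40, 50, 50, 45, 60, 38, 40],
    "workclass": [" Private", "private ", "private ", "?",
                  "Self-emp", "Private", "Private"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Fix inconsistent entries, drop duplicates, and impute missing values."""
    df = df.copy()
    text_cols = df.select_dtypes(exclude="number").columns
    num_cols = df.select_dtypes(include="number").columns

    for col in text_cols:
        # Inconsistent data entry: trim blank spaces and unify letter case.
        normalized = df[col].str.strip().str.lower()
        # Assumed convention: "?" marks a missing entry.
        df[col] = normalized.mask(normalized == "?")

    # Remove exact duplicate rows.
    df = df.drop_duplicates().reset_index(drop=True)

    # Numeric columns: fill gaps with the column median via SimpleImputer.
    df[num_cols] = SimpleImputer(strategy="median").fit_transform(df[num_cols])
    # Text columns: fill gaps with the most frequent value.
    for col in text_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])
    return df


def iqr_mask(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True for values inside the box-plot whiskers (IQR rule)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.between(q1 - k * iqr, q3 + k * iqr)


cleaned = clean(raw)
no_outliers = cleaned[iqr_mask(cleaned["age"])]
```

The IQR rule used here is the same criterion a box plot draws as whiskers, so the rows it drops are the points the plot would show as outliers.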

Exploratory data analysis (EDA):

How is one variable related to the other? What sort of relationship exists between two different variables? What kind of trend is the data following? Can a dataset be divided into smaller parts?
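Two of these questions map directly onto pandas operations. The sketch below assumes a small made-up sample with hypothetical column names; it shows a pairwise correlation (how variables relate) and a group-wise summary (dividing the dataset into smaller parts).

```python
import pandas as pd

# Hypothetical sample; the column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, 38, 52, 29, 46, 33],
    "hours_per_week": [40, 50, 60, 38, 45, 40],
    "education": ["HS", "Bachelors", "Masters", "HS", "Bachelors", "HS"],
    "income_over_50k": [0, 1, 1, 0, 1, 0],
})

# How is one variable related to another? Pairwise Pearson correlation.
corr = df[["age", "hours_per_week", "income_over_50k"]].corr()

# Can the dataset be divided into smaller parts?
# Share of high earners within each education level.
rate_by_education = df.groupby("education")["income_over_50k"].mean()
```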

Visualization:

Basic visualization methods were built with Plotly and cufflinks rather than matplotlib and seaborn:
1- Line plots.
2- Area plots.
3- Histograms.
4- Bar charts.
5- Pie charts.
6- Box plots.
7- Scatter plots.
8- Bubble plots.

Feature Engineering:

Dimensionality reduction (PCA), encoding (one-hot and normal), and scaling.
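One way these three steps can be chained is a scikit-learn pipeline. The sketch below is an assumption about how they might fit together, not the project's confirmed setup: the columns are made up, only one-hot encoding is shown, and the choice of two principal components is arbitrary.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical features; the column names and values are illustrative only.
X = pd.DataFrame({
    "age": [25, 38, 52, 29, 46, 33],
    "hours_per_week": [40, 50, 60, 38, 45, 40],
    "education": ["HS", "Bachelors", "Masters", "HS", "Bachelors", "HS"],
})

preprocess = ColumnTransformer(
    transformers=[
        # Scaling: zero mean and unit variance for numeric columns.
        ("scale", StandardScaler(), ["age", "hours_per_week"]),
        # Encoding: one binary column per category.
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["education"]),
    ],
    sparse_threshold=0.0,  # always return a dense array for PCA
)

# Dimensionality reduction: project the 5 prepared columns onto 2 components.
features = Pipeline([
    ("preprocess", preprocess),
    ("pca", PCA(n_components=2)),
])

X_reduced = features.fit_transform(X)
```

Fitting the scaler, encoder, and PCA inside one pipeline keeps the transformations learned on training data reusable on new data through `features.transform`.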

Build model:

Seven models are evaluated with several metrics, including accuracy, precision, recall, and ROC.
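A comparison of this kind can be sketched as a loop over candidate classifiers. The README does not name the seven models, so the three below are stand-ins chosen for illustration, and the data is synthetic (generated with `make_classification`) rather than the census data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the census features.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative candidates; the project's actual seven models are not listed.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    # ROC AUC needs scores, so use the predicted probability of class 1.
    scores = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted, zero_division=0),
        "recall": recall_score(y_test, predicted, zero_division=0),
        "roc_auc": roc_auc_score(y_test, scores),
    }
```

Each entry in `results` holds the four metrics for one model, which makes it straightforward to load into a DataFrame with `pd.DataFrame(results).T` and compare the models side by side.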

Manar20575/Data-Science-Project

Folders and files

Latest commit

History

Repository files navigation

Binary income classification :

Goal :

Data Cleaning or Refinement :

Exploratory data analysis (EDA):

Visualization:

Feature Engineering:

Build model:

About

Topics

Resources

Stars

Watchers

Forks

Languages