ML project for two datasets: House Prices and Fashion-MNIST

hossamAhmedSalah/ML_PROJECT

ML_PROJECT

Fashion MNIST

👕 Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from 10 classes. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and the same structure of training and testing splits.

Summary

  • Representing the images in 3 ways
    • Pixel feature representation: 784 pixels (28×28), with a column for the label
    • HOG feature representation: 70 features
    • PCA representation: 40 features
    • Mixed dataset combining the HOG and PCA features
  • Building a Logistic Regression model on each dataset, with hyperparameter tuning using GridSearch and cross-validation
  • K-means (removing the label from a subset of the dataset containing only 5 classes)
  • Visualising K-means using PCA to create 3 components that can be plotted as 3D clusters
  • Building a simple GUI -> GUI
    • You need to download the model -> model
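
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's notebook code: it assumes scikit-learn is installed, uses scikit-learn's bundled 8x8 digits dataset as an offline stand-in for Fashion-MNIST, leaves out the HOG features, and picks a small, assumed grid for the regularisation strength `C`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): the bundled 8x8 digits, 64 pixel features.
# Swap in the 784-pixel Fashion-MNIST arrays to mirror the project.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scale -> PCA (40 components) -> logistic regression,
# tuned with a cross-validated grid search.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=40, random_state=0)),
    ("clf", LogisticRegression(max_iter=2000)),
])
param_grid = {"clf__C": [0.1, 1.0, 10.0]}  # assumed grid, not the project's
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

# K-means on a 5-class subset with the labels dropped, then a 3-component
# PCA projection whose columns can be drawn as a 3D scatter plot.
mask = y < 5
X_subset = X[mask]
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_subset)
coords_3d = PCA(n_components=3, random_state=0).fit_transform(X_subset)
n_clusters_found = len(np.unique(cluster_ids))

print(f"best C: {search.best_params_['clf__C']}, test accuracy: {test_accuracy:.3f}")
print(f"clusters found: {n_clusters_found}, 3D coords shape: {coords_3d.shape}")
```

Putting the scaler and PCA inside the pipeline means they are refit on each cross-validation fold, so the grid search does not see information from the held-out fold.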

House Pricing Regression

💡 House Prices is a regression dataset: predict sales prices and practice feature engineering and random forests (RFs). The training set has 81 columns and the test set has 79 columns.

Cleaning and Preparing the data

  • Feature Selection
  • Dealing with missing data
  • Dealing with outliers
  • Feature Encoding
  • Model Building and Enhancing
    • Linear Regression
    • KNN Regressor
  • You can find the notebook -> code
  • The report of what we have done -> report and model evaluation
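
As a rough sketch of how the preparation steps and the two models listed above could fit together, assuming pandas and scikit-learn are installed: the table below is synthetic, the column names (`area`, `quality`, `zone`, `price`) are hypothetical, and the IQR outlier rule and imputation strategies are illustrative choices rather than the ones used in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in table (assumption); column names are hypothetical.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area": rng.normal(1500, 300, n),
    "quality": rng.integers(1, 11, n).astype(float),
    "zone": rng.choice(["A", "B", "C"], n),
})
df["price"] = 100 * df["area"] + 5000 * df["quality"] + rng.normal(0, 10000, n)
df.loc[::15, "area"] = np.nan    # inject missing numeric values
df.loc[5::20, "zone"] = np.nan   # inject missing categories

# Outliers: drop rows whose target lies outside 1.5 * IQR of the quartiles.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Missing data and encoding: median/most-frequent imputation,
# scaling for numeric columns, one-hot encoding for categories.
numeric, categorical = ["area", "quality"], ["zone"]
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

# Compare Linear Regression and a KNN regressor with 5-fold cross-validation.
X, y = df[numeric + categorical], df["price"]
scores = {}
for name, model in [("linear", LinearRegression()),
                    ("knn", KNeighborsRegressor(n_neighbors=5))]:
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="r2").mean()

print(scores)
```

Scaling matters more for the KNN regressor than for Linear Regression, since KNN distances are dominated by whichever feature has the largest numeric range.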