This was our final project for the 3rd-semester Introduction to Data Science course (UE18CS203). The dataset, with slight preprocessing applied, is hosted on my Google Drive: https://drive.google.com/file/d/1nLCVDfQy8NUu9NnD55meyj54ynkxiowp/view
- pandas
- numpy
- sklearn
- statsmodels
- seaborn
- matplotlib
- mpl_toolkits
- scipy
- cython
- pydrive
- oauth2client
- Data Cleaning
- Data Normalization - using StandardScaler()
- Visualizations
  - Box plot - Check for outliers
  - Histogram - Check for normality
  - q-q plot - Check for normality
  - Map visualizations - Heat map of meteorite landing sites
  - Pie charts - Distribution of the different types of meteorites
  - Heat map - Correlation matrix for the correlation graph
  - Scatter plot - Visualization for the correlation graph
- Correlation Graph - Find correlations between columns using the generated heat map
- Hypothesis testing
  - H0: The difference between the mean mass of the sample and the mean mass of the population is a statistical fluctuation.
  - H1: The difference between the mean mass of the sample and the mean mass of the population is significant and not merely a statistical fluctuation.
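The normalization and hypothesis-testing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the mass values are synthetic stand-ins for the dataset's mass column, the sample size of 200 and the significance level of 0.05 are assumptions, and a one-sample t-test from scipy is used here although the project may have used a different test (for example a z-test).

```python
import numpy as np
from scipy import stats
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the meteorite mass column (assumption: the real
# values would come from the dataset linked at the top of this README).
rng = np.random.default_rng(seed=0)
population_mass = rng.lognormal(mean=5.0, sigma=1.5, size=10_000)

# Data normalization: StandardScaler expects a 2-D array of shape
# (n_samples, n_features), so the 1-D column is reshaped first.
scaled_mass = StandardScaler().fit_transform(population_mass.reshape(-1, 1)).ravel()

# Hypothesis testing: draw a random sample and compare its mean mass
# with the population mean mass.
sample_mass = rng.choice(population_mass, size=200, replace=False)  # assumed sample size
population_mean = population_mass.mean()
t_stat, p_value = stats.ttest_1samp(sample_mass, popmean=population_mean)

alpha = 0.05  # assumed significance level
reject_h0 = bool(p_value < alpha)

print(f"sample mean = {sample_mass.mean():.2f}, population mean = {population_mean:.2f}")
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

If the p-value falls below the chosen significance level, H0 is rejected in favour of H1; otherwise the difference in means is treated as a statistical fluctuation.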
- Shubham Gupta
- Sindhu Rao
- Sinduja Mullangi