This portfolio is continually updated with new projects as they become available. It is organized into sections covering different types of techniques and analyses, such as exploration, clustering, classification, time series, and more.
Enjoy 😃
Thanks,
Ali
The goal of this project is to show the performance of the S&P 500 over the last five years in terms of price, returns, and the distributions of both of these attributes. Finally, there is a comparison between major indices with a brief explanation of their relative performance.
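The core return calculation can be sketched as follows. This is a minimal illustration using a short hypothetical price series, not the project's actual S&P 500 data:

```python
import pandas as pd

# Hypothetical closing prices standing in for real S&P 500 data
prices = pd.Series([4500.0, 4530.0, 4512.0, 4550.0, 4601.0, 4580.0])

# Daily simple returns: percentage change between consecutive closes
returns = prices.pct_change().dropna()

# Summary statistics describe the return distribution
print(returns.mean(), returns.std())
```

From here, a histogram of `returns` would show the distribution the project visualizes.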
This project estimates a linear model that can be used to predict sales based on spending on TV, newspaper, and radio media. In building the model, we test the statistical significance of the model coefficients using t-tests, p-values, and the F-statistic, and assess the quality of the fit using R-squared.
This notebook contains various regression models built to predict house values and car seat sales from two different datasets. The first dataset, used to predict house values, is the Boston housing dataset; the second, used to predict child car seat sales at 400 different stores, is called Carseats. Both datasets are available through the MASS and ISLR libraries in R. The models are tested for statistical significance using p-values, F-statistics, and ANOVA tests. To enhance model accuracy and fit, techniques such as non-linear transformations and interaction terms are applied.
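Non-linear transformations and interaction terms can be sketched with the `statsmodels` formula API (the notebook itself is in R, where the equivalent formula syntax applies). The data below is synthetic, with hypothetical predictors named after Boston housing columns:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "lstat": rng.uniform(1, 35, n),   # stand-in for % lower-status population
    "rm": rng.uniform(4, 9, n),       # stand-in for rooms per dwelling
})
# Synthetic response containing a quadratic term and an interaction
df["medv"] = (40 - 1.5 * df["lstat"] + 0.03 * df["lstat"] ** 2
              + 2 * df["rm"] + 0.05 * df["lstat"] * df["rm"]
              + rng.normal(0, 2, n))

# I(...) adds a non-linear transformation; ':' adds an interaction term
fit = smf.ols("medv ~ lstat + I(lstat**2) + rm + lstat:rm", data=df).fit()
print(fit.rsquared)
```

Comparing this fit against the plain linear model with an ANOVA test (e.g. `statsmodels.stats.anova.anova_lm`) mirrors the model-comparison step described above.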
This project uses heart mortality data retrieved from Data.gov. The aim of this project is to build a classifier to predict the gender of observations in the dataset, segmented by state and mortality rate.
This project was built in collaboration with Brandon Moragne and Maycie McKay as our final practicum for the Master's in Data Science at Lipscomb University. Algorithms used in this project include Logistic Regression, Naive Bayes, Random Forest, and K-Nearest Neighbors.
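Comparing those four algorithms can be sketched with scikit-learn. The features below come from `make_classification` as a synthetic stand-in for the actual practicum data, so the accuracies are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data standing in for the real dataset
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}
results = {}
for name, clf in models.items():
    # Fit each model and score it on the held-out test split
    results[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {results[name]:.3f}")
```

Holding the train/test split fixed across models, as above, keeps the accuracy comparison fair.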
This two-part project addresses:
- Default prediction on loans
- Daily stock market percentage change
Classification Models:
- Logistic Regression
- Linear Discriminant Analysis
- Quadratic Discriminant Analysis
- KNN
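The discriminant-analysis models in that list can be sketched with scikit-learn. The two synthetic classes below are given different covariances, which is exactly the situation where QDA's quadratic boundary can beat LDA's linear one:

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(2)
# Class 0: tight cluster at the origin; class 1: wider cluster offset by 1.5
X0 = rng.normal(0.0, 1.0, (300, 2))
X1 = rng.normal(1.5, 2.0, (300, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 300 + [1] * 300)

# LDA assumes a shared covariance; QDA fits one per class
lda = LinearDiscriminantAnalysis().fit(X, y)
qda = QuadraticDiscriminantAnalysis().fit(X, y)
print(lda.score(X, y), qda.score(X, y))
```

The same `fit`/`score` pattern extends to the logistic regression and KNN models listed above.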
This project forecasts the level of an air pollutant in Seoul, Korea. The data is first cleaned by removing outliers using z-scores with a threshold of 3. This is followed by a correlation analysis of the various pollutants in Seoul's atmosphere. The target pollutant is then isolated from the dataset and tested for stationarity, autocorrelation, and partial autocorrelation. Finally, its level is forecasted using a SARIMA model.
This project demonstrates how to visually present a collection of nodes in a graph, followed by the application of Karger's Minimum Cut contraction algorithm to find the minimum cut of the graph. The minimum cut of a graph can help spot vulnerabilities in a network. These vulnerabilities are represented as edges (connections) between nodes (endpoints). Addressing such vulnerabilities can help strengthen the network.
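A compact sketch of Karger's randomized contraction algorithm (the project's own implementation may differ in its graph representation and trial count):

```python
import random

def karger_min_cut(edges, trials=100, seed=0):
    """Estimate the minimum cut of an undirected multigraph, given as a
    list of (u, v) edges, by repeated random edge contraction."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        # Union-find structure tracking which super-node each vertex joined
        parent = {}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        nodes = {v for e in edges for v in e}
        for v in nodes:
            parent[v] = v
        remaining = len(nodes)
        while remaining > 2:
            # Pick a random edge; contract it if it joins two super-nodes
            u, v = edges[rng.randrange(len(edges))]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                remaining -= 1
        # Cut size = number of edges crossing the two final super-nodes
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best

# Example: two triangles joined by a single bridge edge (true min cut = 1)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(edges, trials=300))
```

Because each trial succeeds only with some probability, the algorithm is repeated many times and the smallest cut seen is reported; more trials raise the chance of finding the true minimum.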
- Quick Sort implementation in Python
- To run the code above, use the QuickSort.txt file.
- Save both files (code and data) in the same location before executing the code.
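For reference, a minimal quicksort sketch in Python; the repository's implementation, which reads its input from QuickSort.txt, may differ in pivot choice and structure:

```python
def quicksort(arr):
    """Recursive quicksort: partition around a pivot, then sort each side."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]    # elements smaller than the pivot
    mid = [x for x in arr if x == pivot]    # elements equal to the pivot
    right = [x for x in arr if x > pivot]   # elements larger than the pivot
    return quicksort(left) + mid + quicksort(right)

print(quicksort([9, 3, 7, 1, 8, 2]))  # → [1, 2, 3, 7, 8, 9]
```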
This dashboard contains three separate tabs with visuals describing the state of different economic indicators as measured by Trading Economics. The indicators are analyzed by region and country. They reflect the overall health of economies through measures such as GDP, FDI inflows, tax burdens, business freedom, and more. The images are PNG screenshots of the actual interactive dashboard.