This portfolio is continually updated with new projects as they become available. It is organized into sections covering different types of techniques and analyses, such as exploration, clustering, classification, time series, and more.
Enjoy 😃
Thanks,
Ali
The goal of this project is to show the performance of the S&P 500 over the last five years in terms of price, returns, and the distributions of both of these attributes. Finally, there is a comparison between major indices with a brief explanation of their relative performance.
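The core return calculation can be sketched as follows. This is a minimal illustration using a short hypothetical price series, not the project's actual S&P 500 data:

```python
import pandas as pd

# Hypothetical closing prices standing in for real S&P 500 data
prices = pd.Series([4500.0, 4530.0, 4512.0, 4550.0, 4601.0, 4580.0])

# Daily simple returns: percentage change between consecutive closes
returns = prices.pct_change().dropna()

# Summary statistics describe the return distribution
print(returns.mean(), returns.std())
```

From here, a histogram of `returns` would show the distribution the project visualizes.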
This project estimates a linear model that can be used to predict sales based on spending on TV, newspaper, and radio media. In building the model, we test the statistical significance of the model coefficients using t-tests, p-values, and the F-statistic, and assess the quality of the fit using R-squared.
This notebook contains various regression models built to predict house values and car seat sales from two different datasets. The first dataset, used to predict house values, is the Boston housing dataset; the second, used to predict child car seat sales at 400 different stores, is called Carseats. Both datasets are available through the MASS and ISLR libraries in R. The models are tested for statistical significance using p-values, F-statistics, and ANOVA tests. To enhance model accuracy and fit, techniques such as non-linear transformations and interaction terms are applied.
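Non-linear transformations and interaction terms can be sketched with the `statsmodels` formula API (the notebook itself is in R, where the equivalent formula syntax applies). The data below is synthetic, with hypothetical predictors named after Boston housing columns:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "lstat": rng.uniform(1, 35, n),   # stand-in for % lower-status population
    "rm": rng.uniform(4, 9, n),       # stand-in for rooms per dwelling
})
# Synthetic response containing a quadratic term and an interaction
df["medv"] = (40 - 1.5 * df["lstat"] + 0.03 * df["lstat"] ** 2
              + 2 * df["rm"] + 0.05 * df["lstat"] * df["rm"]
              + rng.normal(0, 2, n))

# I(...) adds a non-linear transformation; ':' adds an interaction term
fit = smf.ols("medv ~ lstat + I(lstat**2) + rm + lstat:rm", data=df).fit()
print(fit.rsquared)
```

Comparing this fit against the plain linear model with an ANOVA test (e.g. `statsmodels.stats.anova.anova_lm`) mirrors the model-comparison step described above.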
This project uses heart mortality data retrieved from Data.gov. The aim of this project is to build a classifier to predict the gender of observations in the dataset, segmented by state and mortality rate.
This project was built in collaboration with Brandon Moragne and Maycie McKay as our final practicum for the Master's in Data Science at Lipscomb University. Algorithms used in this project include Logistic Regression, Naive Bayes, Random Forest, and K-Nearest Neighbors.
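Comparing those four algorithms can be sketched with scikit-learn. The features below come from `make_classification` as a synthetic stand-in for the actual practicum data, so the accuracies are illustrative only:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data standing in for the real dataset
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "random_forest": RandomForestClassifier(random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}
results = {}
for name, clf in models.items():
    # Fit each model and score it on the held-out test split
    results[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {results[name]:.3f}")
```

Holding the train/test split fixed across models, as above, keeps the accuracy comparison fair.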
This two-part project addresses:
- Default prediction on loans
- Daily stock market percentage change
Classification Models:
- Logistic Regression
- Linear Discriminant Analysis
- Quadratic Discriminant Analysis
- KNN
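The discriminant-analysis models in that list can be sketched with scikit-learn. The two synthetic classes below are given different covariances, which is exactly the situation where QDA's quadratic boundary can beat LDA's linear one:

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(2)
# Class 0: tight cluster at the origin; class 1: wider cluster offset by 1.5
X0 = rng.normal(0.0, 1.0, (300, 2))
X1 = rng.normal(1.5, 2.0, (300, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 300 + [1] * 300)

# LDA assumes a shared covariance; QDA fits one per class
lda = LinearDiscriminantAnalysis().fit(X, y)
qda = QuadraticDiscriminantAnalysis().fit(X, y)
print(lda.score(X, y), qda.score(X, y))
```

The same `fit`/`score` pattern extends to the logistic regression and KNN models listed above.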
This project forecasts the level of an air pollutant in Seoul, Korea. The data is first cleaned by removing outliers using z-scores with a threshold of 3. This is followed by a correlation analysis of the various pollutants in Seoul's atmosphere. The target pollutant is then isolated from the dataset and tested for stationarity, autocorrelation, and partial autocorrelation. Finally, its level is forecasted using a SARIMA model.
This project demonstrates how to visually present a collection of nodes in a graph, followed by the application of Karger's Minimum Cut contraction algorithm to find the minimum cut of the graph. The minimum cut of a graph can help spot vulnerabilities in a network. These vulnerabilities are represented as edges (connections) between nodes (endpoints). Addressing such vulnerabilities can help strengthen the network.
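A compact sketch of Karger's randomized contraction algorithm (the project's own implementation may differ in its graph representation and trial count):

```python
import random

def karger_min_cut(edges, trials=100, seed=0):
    """Estimate the minimum cut of an undirected multigraph, given as a
    list of (u, v) edges, by repeated random edge contraction."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        # Union-find structure tracking which super-node each vertex joined
        parent = {}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        nodes = {v for e in edges for v in e}
        for v in nodes:
            parent[v] = v
        remaining = len(nodes)
        while remaining > 2:
            # Pick a random edge; contract it if it joins two super-nodes
            u, v = edges[rng.randrange(len(edges))]
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                remaining -= 1
        # Cut size = number of edges crossing the two final super-nodes
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best

# Example: two triangles joined by a single bridge edge (true min cut = 1)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(karger_min_cut(edges, trials=300))
```

Because each trial succeeds only with some probability, the algorithm is repeated many times and the smallest cut seen is reported; more trials raise the chance of finding the true minimum.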
- Quick Sort implementation in Python
- To run the code above, use the QuickSort.txt file.
- Save both files (code and data) in the same location before executing the code.
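For reference, a minimal quicksort sketch in Python; the repository's implementation, which reads its input from QuickSort.txt, may differ in pivot choice and structure:

```python
def quicksort(arr):
    """Recursive quicksort: partition around a pivot, then sort each side."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]    # elements smaller than the pivot
    mid = [x for x in arr if x == pivot]    # elements equal to the pivot
    right = [x for x in arr if x > pivot]   # elements larger than the pivot
    return quicksort(left) + mid + quicksort(right)

print(quicksort([9, 3, 7, 1, 8, 2]))  # → [1, 2, 3, 7, 8, 9]
```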
This dashboard contains three separate tabs with visuals describing the state of different economic indicators as measured by Trading Economics. The indicators are analyzed by region and country. They reflect the overall health of economies through measures such as GDP, FDI inflows, tax burdens, business freedom, and more. The images are PNG screenshots of the actual interactive dashboard.