nataberishvili/nataberishvili.github.io


Data Science Portfolio by Nata Berishvili

This portfolio is a collection of notebooks that I created for data analysis and machine learning projects.

Automated & Interactive Reporting Infrastructure (R Shiny, flexdashboard, CSS)


Automated reporting infrastructure can benefit every part of a business, from marketing and sales to fulfillment and customer service. Automated reports shorten reporting timelines, combine data from different sources, and can be generated within minutes. Some advantages of the automated reporting approach:

  • Decreased expenses  
  • Reduced time
  • Greater control and consistency 
  • Decreased risk associated with human/manual error  
  • Accurate and reliable results

The interactive sales dashboard lets us drill down into sales data, quickly identify trends, monitor changes, and make timely strategic decisions. The app is available at this link.

Create an Interactive Dashboard with Shiny, Flexdashboard, and Plotly


Interactive dashboards empower users to gain valuable insight into key metrics and make data-driven decisions. Interactivity helps optimize the use of dashboard space and updates visualizations automatically as the user changes inputs. Flexdashboard is an R package for building dashboards from R Markdown documents, which can be either static or dynamic. By combining flexdashboard with Shiny, you can write dynamic web applications without any knowledge of HTML, CSS, or JavaScript, using only R and R Markdown.

A detailed post and brief tutorial are available on my Medium.

App for Machine Learning - House Price Prediction with Linear Regression


The app is available at the following link (under development). With it you can:

  • Choose Independent Variables
  • Train the Model
  • Visualize Predictions (Actual numbers vs Predicted numbers)
  • Visualize Residuals
  • Download Predictions
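The workflow in the list above (fit a model, compare actual and predicted values, inspect residuals) can be sketched in a few lines. This is a minimal illustration in plain Python, not the app's code: the function names are hypothetical, the data is made up, and it uses a single independent variable, whereas the app lets you choose several.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x):
    """Return the fitted value for each x."""
    return [intercept + slope * xi for xi in x]


def residuals(actual, predicted):
    """Residual = actual value minus predicted value."""
    return [a - p for a, p in zip(actual, predicted)]


# Made-up illustrative data: living area (sq ft) and sale price (USD thousands).
area = [1000, 1500, 2000, 2500]
price = [200, 290, 410, 500]

intercept, slope = fit_simple_ols(area, price)
predicted = predict(intercept, slope, area)
resid = residuals(price, predicted)

for a, p, r in zip(price, predicted, resid):
    print(f"actual={a:.1f}  predicted={p:.1f}  residual={r:+.1f}")
```

Residuals from a least-squares fit with an intercept sum to zero, so a residual plot that drifts away from zero in one region suggests the linear form is missing something.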

Exploratory Data Analysis - Medicare

For this project I use Medicare Provider Utilization and Payment Data for more than 3,000 U.S. hospitals that receive Medicare payments. I explore and visualize the data to compare hospital-level charges and payments within local markets and nationwide. Questions I want to answer:

  • Which Diagnostic Related Groups cost Medicare the most?
  • What are the most common hospital discharges?
  • What is the trend over the last 3 years?
  • Which states and hospitals charge the most?

For more details, the code can be found here.

Happiness Around the World - Data Visualization with R

The data was scraped from Wikipedia. The World Happiness Report is an annual publication of the United Nations Sustainable Development Solutions Network. It explains happiness using six factors. According to the 2019 Happiness Index, Finland is the happiest country in the world. An interactive version of the map (Plotly) can be found here.


For more details, the code can be found here.

Predicting Loan Default - Binary Classification

The main purpose of this project is to build supervised machine learning models with h2o in Python and experiment with hyperparameters. For more details, the code can be found here.

Interactive Maps with R leaflet

Interactive data visualization enhances exploratory data analysis and is a great way to engage both technical and non-technical audiences. R's leaflet package is a powerful tool for creating visually compelling interactive maps. In this post I will show how to create a choropleth map with leaflet. Choropleth maps use color to show how a value varies across regions.

I will build a choropleth map using data for the 2019 Novel Coronavirus published by Johns Hopkins University Center for Systems Science and Engineering. To make it easy to follow through the steps, we will work with state-level data.
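The core of a choropleth is the mapping from a region's value to a fill color. The sketch below shows one way to do that with fixed bin edges, in plain Python rather than R. It is only meant to illustrate the idea: `make_bin_palette`, the bin edges, the hex colors, and the placeholder state counts are all hypothetical and are not taken from the Johns Hopkins data. R's leaflet offers `colorBin` for the same purpose.

```python
from bisect import bisect_right


def make_bin_palette(breaks, colors):
    """Return a function that maps a value to a color by bin.

    breaks must be sorted ascending; colors needs one more entry than
    breaks (one color per bin, including the open-ended first and last bins).
    """
    if list(breaks) != sorted(breaks):
        raise ValueError("breaks must be sorted in ascending order")
    if len(colors) != len(breaks) + 1:
        raise ValueError("need exactly len(breaks) + 1 colors")

    def color_for(value):
        # bisect_right counts how many break points are <= value,
        # which is the index of the bin the value falls into.
        return colors[bisect_right(breaks, value)]

    return color_for


# Hypothetical bin edges and a light-to-dark sequential palette.
breaks = [100, 1000, 10000]
colors = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]
color_for = make_bin_palette(breaks, colors)

# Placeholder values, not real case counts.
cases_by_state = {"State A": 50, "State B": 100, "State C": 4200, "State D": 25000}
fill = {state: color_for(count) for state, count in cases_by_state.items()}

for state, color in fill.items():
    print(state, color)
```

Fixed bins keep the legend readable when counts span several orders of magnitude. Quantile bins are an alternative when the goal is an even spread of colors across regions.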


Explain Loan Probability of Default With SHAP Values

Some models are easy to interpret, such as linear or logistic regression (each feature has a weight, so we know its exact contribution and whether its effect is positive or negative) and single decision trees. Other models, such as ensembles, are harder to interpret: it is difficult to understand the role of each feature. Ensembles come with "feature importance" scores, but these do not tell us whether a feature affects the decision positively or negatively.
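SHAP values are based on Shapley values, which assign each feature a signed contribution to one prediction relative to a baseline. The sketch below computes exact Shapley values by brute force for a toy scoring function. It is an illustration under stated assumptions, not the project's code and not the SHAP library's API: `toy_default_score`, its coefficients, and the feature values are all made up, and brute-force enumeration is only feasible for a handful of features.

```python
from itertools import permutations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one prediction.

    Averages, over every feature ordering, the change in model output
    when a feature is switched from its baseline value to its actual value.
    """
    n = len(instance)
    totals = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = instance[i]
            new_output = model(current)
            totals[i] += new_output - previous_output
            previous_output = new_output
    return [total / factorial(n) for total in totals]


def toy_default_score(features):
    """Hypothetical risk score with an interaction term (made-up coefficients)."""
    debt_ratio, years_employed, late_payments = features
    return (
        0.5 * debt_ratio
        - 0.02 * years_employed
        + 0.1 * late_payments
        + 0.05 * debt_ratio * late_payments
    )


names = ["debt_ratio", "years_employed", "late_payments"]
baseline = [0.3, 5, 0]   # made-up reference applicant
applicant = [0.6, 10, 2]  # made-up applicant to explain

phi = shapley_values(toy_default_score, applicant, baseline)
for name, value in zip(names, phi):
    print(f"{name}: {value:+.3f}")
```

The sign of each value shows the direction of the effect: here longer employment lowers the score, while a higher debt ratio and late payments raise it. The contributions also add up to the difference between the applicant's score and the baseline score, which is what a plain feature importance ranking cannot provide.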

