Demo notebook and data for Spark Summit Dublin 2017: One-Pass Data Science with Generative T-Digests
Updated Oct 21, 2017 - Jupyter Notebook
Given enough data, could we predict whether a terrorist attack will be successful? This analysis aims to do just that using Decision Trees and Random Forests created with scikit-learn. (Python)
Predict the outcome of childbirth from a data set containing socio-economic data on the mother-to-be and records from previous antenatal care checkups
Apply supervised machine learning techniques and an analytical mind on data collected for the U.S. census to help CharityML (a fictitious charity organization) identify people most likely to donate to their cause
A Comprehensive Guide to Titanic Machine Learning from Disaster
Predicting the ideological direction of Supreme Court decisions: ensemble vs. unified case-based model
Customer churn analysis for a telecommunication company
Used the Global Terrorism Database to Explore Features of Suicide Bombings
Udacity Data Scientist Nanodegree Project - Employ supervised algorithms to accurately model individuals' income
The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset
Feature selector based on a self-selected algorithm, loss function, and validation method
Analyzing the features that lead to heart disease and visualizing the models' performance and important features using eli5, shap and pdp.
Computing feature importance for categorical variables by converting them into dummy variables (one-hot encoding) can produce skewed or hard-to-interpret results. Here I present a method to get around this problem using H2O.
Given information about a network connection, the model predicts whether the connection contains an intrusion. Binary classification of good and bad connections is then extended to multi-class classification, with the main emphasis on feature importance analysis.
Build an algorithm to best identify potential donors of CharityML
Machine Learning Nanodegree Project: help a charity organization identify people most likely to donate to their cause
Variance-based Feature Importance in Neural Networks
Nowadays, sports events live above all from their media coverage, which includes cheering on winners and writing off losers. Statistics are used to underpin the argumentation in these reports. But is there any cherry-picking here? Are only those statistics used that make the report or commentary look completely logical? In order to give an ini…