Demo of 1-pass variable importance using t-digests
Updated Jun 18, 2017 - Jupyter Notebook
🎯🎓 An introductory workshop lecture on a generalized framework for Targeted Learning using the tmle3 R package
Identify the most important variables that contribute to the level of sustainability of schools using supervised machine learning. Use the importance to assign a weight to each characteristic and calculate a sustainability index for each school.
Prediction of equipment failures
Predict prices of diamond data in ggplot2
Analyzed 34 attributes to build a pen portrait of 395 students, identify the factors driving alcohol consumption, and determine which factors most affected final exam grades
Simulating Supervised Learning Data
Final Project: Predictive and descriptive analysis, interpretations, and recommendations based on Austin Animal Center data on cats housed at the shelter. The main focus of this analysis is to determine which factors affect a cat's chances of being adopted or returned to their owner.
Team project, second semester of 2022: analysis of a plating-process dataset (K-AI Manufacturing Data Analysis Competition)
Code and tutorials for implementing the GlObal And Local Score
Comprehensive dimensionality reduction and cluster analysis toolset
📦 🔬 R/methyvim: Targeted, Robust, and Model-free Differential Methylation Analysis
R script to rank and select variables based on their importance/predictive power
Code for Master Thesis
SVC and KNN were used to predict whether mushrooms are poisonous or edible from their properties. Random forest and chi-square variable selection were applied, and the models were re-estimated with 10-fold cross-validation to compute F1 scores. Finally, the models were compared.
Builds predictive models to classify Trump's vote share and clusters counties by demographic and economic variables. Findings are reported in a PDF with detailed methodology, model assessments, and the R code for the project.
Important variable selection
Demo notebook and data for Spark Summit Dublin 2017: One-Pass Data Science with Generative T-Digests
💬 Talk on causal inference and variable importance with stochastic interventions under two-phase sampling
In this project, I use 3 machine learning models (CART, Random Forest and ANN) to predict the claim frequency for a travel insurance firm. I also evaluate which of the three models is most suitable for our dataset.