This repository predicts baseball statistics using MLB Statcast metrics.
Goals
- Using MLB Statcast Metrics, summarize and examine baseball statistics.
Classification
- Build and train models to predict home runs and extra-base hits implementing the following approaches:
  - Logistic Regression
  - k-Nearest Neighbors
  - Decision-Classification Tree
  - Random Forest Classification
  - Support Vector Machine Classification
  - XGBoost Classification
- Implement over-sampling for imbalanced data to improve the quality of predictive modeling (i.e., generalizability).
- Apply regularization and cross-validation techniques for model evaluation, selection, and optimization.
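
The over-sampling step above can be illustrated with a minimal sketch. This README does not say which over-sampling method the repository uses, so the example assumes plain random over-sampling (duplicating minority-class rows). The function name `random_oversample`, the feature pair, and the toy values are hypothetical and are not taken from the repository or from real Statcast data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement to fill the gap up to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy training split (made-up values, for illustration only):
# each row is (launch_speed_mph, launch_angle_deg); label 1 = home run.
train_rows = [(88.1, 12.0), (95.3, 4.0), (101.7, 27.0), (79.4, 45.0),
              (92.0, -3.0), (105.2, 29.0), (85.6, 18.0), (90.9, 8.0)]
train_labels = [0, 0, 1, 0, 0, 1, 0, 0]

balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Over-sampling is usually applied to the training split only, so that validation and test data keep the original class balance and duplicated rows do not leak into the evaluation.
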
Regression
- Build and train models to predict hit distance implementing the following approaches:
  - Linear Regression
  - Decision-Regression Tree
  - Random Forest Regression
- Apply regularization (Ridge, Lasso, Elastic Net) and cross-validation (k-fold) techniques for model evaluation, selection, and optimization.
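
As a rough illustration of how regularization and k-fold cross-validation fit together, the sketch below picks a Ridge penalty by 5-fold cross-validation on a single feature. It is a standard-library toy under stated assumptions, not the repository's code: the helper names `fit_ridge_1d` and `k_fold_mse`, the candidate penalties, and the synthetic launch-speed and hit-distance values are all hypothetical.

```python
import random


def fit_ridge_1d(xs, ys, alpha):
    """Ridge fit for one feature: the slope is shrunk by alpha,
    the intercept is left unpenalized."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    return slope, y_mean - slope * x_mean


def k_fold_mse(xs, ys, alpha, k=5, seed=0):
    """Mean squared error averaged over k held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_ridge_1d(
            [xs[i] for i in train], [ys[i] for i in train], alpha)
        errors = [(ys[i] - (slope * xs[i] + intercept)) ** 2
                  for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data (made up for illustration, not real Statcast values).
rng = random.Random(42)
launch_speed = [rng.uniform(70.0, 110.0) for _ in range(60)]
hit_distance = [4.0 * s - 120.0 + rng.gauss(0.0, 25.0) for s in launch_speed]

# Score each candidate penalty and keep the one with the lowest CV error.
alphas = [0.0, 1.0, 10.0, 100.0, 1000.0]
cv_scores = {a: k_fold_mse(launch_speed, hit_distance, a) for a in alphas}
best_alpha = min(cv_scores, key=cv_scores.get)
print(best_alpha, round(cv_scores[best_alpha], 1))
```

The same loop applies to Lasso and Elastic Net by swapping the fitting function. With multiple features, a library implementation would normally replace the closed-form single-feature fit used here.
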