neural_network_stacked_generalization_xgboost_glm

by Lennart Wallentin, lennartwallentin@gmail.com

In my project 'neural_network_stacked_generalization_xgboost_glm_lennart_wallentin.ipynb', the first objective is to show that I have a broad set of machine learning skills, since I use Logistic Regression, XGBoost and Neural Networks. The second objective is to investigate whether a stacked generalization with a neural network (built with Keras and TensorFlow) as meta learner yields a higher AUC score than a stacked generalization with XGBoost as meta learner. The benchmark AUC scores I compare against come from my other stacked generalization project, https://github.com/lennartwallentin/passenger_satisfaction_stacking_anova, so the two projects are similar, with one important distinction: in this project I also show that I have a good understanding of neural networks and of the Keras and TensorFlow libraries.
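
To make the comparison criterion concrete, here is a minimal sketch of what an AUC score measures: the share of (positive, negative) pairs in which the model ranks the positive example higher. The function name `auc_score` and all labels and predictions are hypothetical and made up for illustration; they are not results from the notebook, which may well compute AUC with a library function instead.

```python
def auc_score(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities from two meta learners
# (illustrative values only, not taken from the project).
y_true = [1, 0, 1, 1, 0, 0]
scores_meta_a = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1]
scores_meta_b = [0.9, 0.3, 0.4, 0.7, 0.5, 0.1]

auc_a = auc_score(y_true, scores_meta_a)
auc_b = auc_score(y_true, scores_meta_b)
print(f"meta learner A: {auc_a:.3f}, meta learner B: {auc_b:.3f}")
```

The pairwise form is quadratic in the number of rows, so it is only meant to show the definition; on a full dataset a rank-based implementation is the usual choice.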

In addition to that I also demonstrate:

Machine learning - I use different Python libraries such as Scikit-learn, XGBoost and Keras (which is built on top of TensorFlow) to build machine learning models. For the neural network, I create the models with Keras' more flexible Functional API and tune the hyperparameters with KerasTuner using Bayesian optimization. I also describe in depth important neural network parts such as layers, activation functions (including ReLU), batch normalization and the instantiation of a neural network model.
For all models, including the standalone models and the stacked generalization base and meta learner models, I validate the model's strength and approximate its optimal parameters. I also explain the model's output in a clear and interpretable way.
Business knowledge - At the end of my project, in section 5.1 'A higher AUC score, so what? - Business context', I bring up something many data scientists struggle with: describing their findings to stakeholders in a business-friendly way. In that section I explain how changes to the AUC score also affect other evaluation metrics. I use the false negative rate (FNR) as an example and show how changes in FNR affect a core business metric, the number of satisfied or neutral/dissatisfied airline passengers.
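
The link between FNR and a passenger count can be sketched as follows. This is a minimal illustration under stated assumptions: the class coded as 1 is treated as the positive class, the helper names `false_negative_rate` and `missed_positives` are hypothetical, and the labels and the passenger count are made up rather than taken from the notebook.

```python
def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP): share of actual positives the model misses."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    if fn + tp == 0:
        raise ValueError("FNR is undefined without actual positives")
    return fn / (fn + tp)


def missed_positives(fnr, n_actual_positives):
    """Expected number of positive-class passengers the model misclassifies."""
    return round(fnr * n_actual_positives)


# Hypothetical labels and hard predictions (illustrative values only).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 1]

fnr = false_negative_rate(y_true, y_pred)

# Assumed number of positive-class passengers, chosen for illustration.
n_positive_passengers = 10_000
print(f"FNR = {fnr:.2f}")
print(f"Passengers misclassified: {missed_positives(fnr, n_positive_passengers)}")
```

Expressed this way, a change in FNR translates directly into a number of passengers, which is usually easier for stakeholders to act on than a change in AUC.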

The dataset used for this project is in the CSV file 'data_neural_network_stacked_generalization_xgboost_glm_lennart_wallentin.csv'.
