General Workshop Improvements #33

Open
stemlock opened this issue Mar 3, 2022 · 0 comments

stemlock commented Mar 3, 2022

  • Replace Iris dataset
  • There is no baseline model for classification. A decision tree or logistic regression could serve as one.
  • Some sections lack explanations (e.g., feature importances, comparing different algorithms), and there appear to be no ROC curves.
  • Cover other types of hyperparameter tuning (random search, Bayesian search).
  • XGBoost is widely regarded as a strong default among shallow (non-deep-learning) models. Replace AdaBoost with it?
  • The code could be cleaned up in general and could use more comments.
  • The regression section would be a great place to introduce a general modeling pipeline: data cleaning, feature transformation, feature engineering (maybe not applicable here), model training, hyperparameter tuning/cross-validation, and model evaluation.
  • No need for a separate dummyencoder class; this can be handled with scikit-learn's OneHotEncoder or even pandas get_dummies.
  • If we are going to use transformers and pipelines, we should consider adding the model object to the pipeline as well. This is generally better practice, since you can then save the entire model pipeline rather than just the feature transformation pipeline.
  • I typically see KNN used for simple classification rather than regression. Not sure it is necessary to include.
  • We don't cover Naive Bayes in the classification section. It is a canonical algorithm that could be introduced.
  • There is no mention of dimensionality reduction or latent variable techniques for clustering, which seems like a gap.
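
To illustrate the point about putting the model object in the pipeline, here is a minimal sketch. It assumes scikit-learn and pandas; the toy DataFrame, its column names, and the choice of LogisticRegression are hypothetical stand-ins, not the workshop's actual data or model. It also shows OneHotEncoder doing the job of a custom dummy encoder.

```python
import pickle

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; the workshop's real dataset and columns will differ.
df = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green", "red", "blue"],
    "size": [1.0, 2.5, 3.1, 2.2, 0.9, 3.3, 1.2, 2.4],
    "label": [0, 1, 1, 1, 0, 1, 0, 1],
})
X, y = df[["color", "size"]], df["label"]

# OneHotEncoder replaces a hand-written dummy encoder; handle_unknown="ignore"
# encodes categories unseen during fit as all zeros instead of raising.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["color"]),
    ("num", StandardScaler(), ["size"]),
])

# The estimator is the last step, so fit/predict run the whole chain.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression()),
])
model.fit(X, y)

# Serializing the pipeline keeps the fitted encoder, scaler, and classifier together.
restored = pickle.loads(pickle.dumps(model))
new_rows = pd.DataFrame({"color": ["purple", "red"], "size": [2.0, 1.0]})
predictions = restored.predict(new_rows)
```

Because the encoder lives inside the saved object, new rows can be passed in raw form at prediction time, which is harder to guarantee when get_dummies is applied separately to training and scoring data.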
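
For the baseline and algorithm-comparison points, one possible shape is sketched below. It assumes scikit-learn and uses synthetic data from make_classification purely as a placeholder for whatever replaces Iris; the model choices (a prior-only dummy, logistic regression, Gaussian Naive Bayes) are illustrative, not a recommendation from the workshop authors.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic binary classification data (placeholder for the real dataset).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

candidates = {
    "dummy": DummyClassifier(strategy="prior"),  # ignores the features entirely
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
}

# Mean ROC AUC over 5 folds gives a single comparable number per model.
scores = {
    name: cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
    for name, clf in candidates.items()
}

for name, auc in scores.items():
    print(f"{name}: ROC AUC = {auc:.3f}")
```

The dummy classifier gives the floor any real model should beat, which makes the comparison between algorithms easier to explain to attendees.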
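
For the hyperparameter tuning point, a random search could look roughly like the sketch below. It assumes scikit-learn and SciPy; the estimator, parameter ranges, and synthetic data are arbitrary choices for illustration. Bayesian search is not shown because it needs a third-party package beyond scikit-learn.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data as a placeholder for the workshop dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Each iteration samples one value per parameter from these distributions,
# rather than trying every combination as a grid search would.
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions={
        "max_depth": randint(2, 10),
        "min_samples_leaf": randint(1, 20),
    },
    n_iter=10,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

The n_iter argument caps the number of sampled configurations, so the cost stays fixed even if more parameters are added to the search space.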
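
For the clustering gap, one way to introduce dimensionality reduction is to place PCA in front of the clustering step. This is a sketch only, assuming scikit-learn; the synthetic blobs, the two retained components, and the three clusters are illustrative assumptions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 6-dimensional data with three groups (placeholder dataset).
X, _ = make_blobs(n_samples=150, centers=3, n_features=6, random_state=0)

# Scale, project onto two principal components, then cluster in that space.
cluster_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("kmeans", KMeans(n_clusters=3, n_init=10, random_state=0)),
])
labels = cluster_pipeline.fit_predict(X)

# Share of variance kept by the two components, useful when explaining the trade-off.
explained = cluster_pipeline.named_steps["pca"].explained_variance_ratio_.sum()
print(f"variance retained: {explained:.2f}")
```

The two-dimensional projection also doubles as a convenient way to plot the clusters during the workshop.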