minor edits to the text of logistic chapter
debnolan committed Apr 22, 2023
1 parent c37e8b0 commit 63f5f39
Showing 8 changed files with 360 additions and 275 deletions.
141 changes: 100 additions & 41 deletions content/ch/19/class_dr.ipynb

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions content/ch/19/class_example.ipynb

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions content/ch/19/class_intro.ipynb
@@ -36,14 +36,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In this chapter we'll expand our view of modeling. Instead of predicting numeric outcomes, these models predict nominal outcomes. For example, these models enable banks to predict whether a credit card transaction is fraudulent or not, doctors to classify tumors as benign or malignant, and your email service to identify spam and set it aside from your usual emails. This type of modeling is called *classification* and occurs widely in data science."
"In this chapter we expand our view of modeling. Instead of predicting numeric outcomes, we build models to predict nominal outcomes. These sorts of models enable banks to predict whether a credit card transaction is fraudulent or not, doctors to classify tumors as benign or malignant, and your email service to identify spam and set it aside from your usual emails. This type of modeling is called *classification* and occurs widely in data science."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Just as with linear regression, we formulate a model, choose a loss function, fit the model by minimizing average loss for our data, and assess the fitted model. But unlike linear regression, our model is not linear, the loss function is not squared error, and our assessment compares different kinds of classification errors. Despite these differences, the overall structure of model fitting carries over to this setting. Together, regression and classification compose the primary approaches for _supervised learning_, the general task of learning a model based on observed outcomes and covariates. "
"Just as with linear regression, we formulate a model, choose a loss function, fit the model by minimizing average loss for our data, and assess the fitted model. But unlike linear regression: our model is not linear; the loss function is not squared error; and our assessment compares different kinds of classification errors. Despite these differences, the overall structure of model fitting carries over to this setting. Together, regression and classification compose the primary approaches for _supervised learning_, the general task of fitting models based on observed outcomes and covariates. "
]
},
{
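
The revised paragraph above outlines the classification workflow: formulate a model, choose a loss, fit by minimizing average loss, and assess the kinds of errors the fitted model makes. Below is a minimal sketch of that workflow; the synthetic data and the scikit-learn calls are illustrative assumptions, not code from the book's notebooks.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, log_loss

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 1))                 # one explanatory variable
p = 1 / (1 + np.exp(-(0.5 + 2.0 * X[:, 0])))  # true P(y = 1 | x)
y = rng.binomial(1, p)                        # nominal (0/1) outcomes

# Fitting minimizes the average log loss, not squared error
model = LogisticRegression().fit(X, y)

print(log_loss(y, model.predict_proba(X)))    # the loss that was minimized
print(confusion_matrix(y, model.predict(X)))  # false positives vs. false negatives
```
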
75 changes: 46 additions & 29 deletions content/ch/19/class_log_model.ipynb

Large diffs are not rendered by default.

29 changes: 18 additions & 11 deletions content/ch/19/class_loss.ipynb

Large diffs are not rendered by default.

23 changes: 15 additions & 8 deletions content/ch/19/class_pred.ipynb

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions content/ch/19/class_summary.ipynb
@@ -26,14 +26,14 @@
"(sec:class_summary)=\n",
"# Summary\n",
"\n",
"In this chapter, we only fit simple logistic regressions with one explanatory variable, but we can easily include other variables in the model by adding more features to our design matrix. For example, if some predictors are categorical, we can include them as one-hot encoded features. These ideas carry over directly from {numref}`Chapter %s <ch:linear>`. The technique of regularization ({numref}`Chapter %s <ch:risk>`) also applies to logistic regression. We integrate all of these modeling techniques---including using a test-train split to assess the model and cross-validation to choose the threshold---in the case study in {numref}`Chapter %s <ch:fake_news>` that develops a model to classify fake news."
"In this chapter, we fit simple logistic regressions with one explanatory variable, but we can easily include other variables in the model by adding more features to our design matrix. For example, if some predictors are categorical, we can include them as one-hot encoded features. These ideas carry over directly from {numref}`Chapter %s <ch:linear>`. The technique of regularization ({numref}`Chapter %s <ch:risk>`) also applies to logistic regression. We integrate all of these modeling techniques---including using a test-train split to assess the model and cross-validation to choose the threshold---in the case study in {numref}`Chapter %s <ch:fake_news>` that develops a model to classify fake news."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Logistic regression is a cornerstone in machine learning since it naturally extends to more complex models. For example, logistic regression is one of the basic components of a neural network. When the response variable has more than two categories, logistic regression can be extended to multinomial logistic regression. Another extension for counts is Poisson regression. These different forms of regression are related to maximum likelihood, where the underlying model for the response is binomial, multinomial, or Poisson, respectively, and the goal is to optimize the likelihood of the data over the parameters of the respective distribution. This family of models is also known as generalized linear models. In all of these scenarios, closed form solutions for minimizing loss don't exist, so optimization of the average loss relies on numerical methods, which we'll cover in the next chapter."
"Logistic regression is a cornerstone in machine learning since it naturally extends to more complex models. For example, logistic regression is one of the basic components of a neural network. When the response variable has more than two categories, logistic regression can be extended to multinomial logistic regression. Another extension for counts is Poisson regression. These different forms of regression are related to maximum likelihood, where the underlying model for the response is binomial, multinomial, or Poisson, respectively, and the goal is to optimize the likelihood of the data over the parameters of the respective distribution. This family of models is also known as generalized linear models. In all of these scenarios, closed form solutions for minimizing loss don't exist, so optimization of the average loss relies on numerical methods, which we cover in the next chapter."
]
},
{
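
The summary's first paragraph mentions adding features to the design matrix, one-hot encoding categorical predictors, regularization, and a train-test split. A hedged sketch of those pieces working together follows; the data frame and column names are hypothetical, not the book's fake-news case study.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical data: one numeric and one categorical predictor
df = pd.DataFrame({
    "age": [23, 45, 31, 52, 37, 29, 48, 33],
    "plan": ["basic", "pro", "basic", "pro", "free", "free", "pro", "basic"],
    "churned": [0, 1, 0, 1, 0, 0, 1, 0],
})

# One-hot encode the categorical predictor into the design matrix
X = pd.get_dummies(df[["age", "plan"]], columns=["plan"])
y = df["churned"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# C is the inverse regularization strength; scikit-learn applies an
# L2 penalty by default, the regularization idea from the risk chapter
model = LogisticRegression(C=1.0).fit(X_train, y_train)
print(model.score(X_test, y_test))  # assess on held-out data
```
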
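The second paragraph notes that closed-form solutions for minimizing the loss don't exist, so fitting relies on numerical methods. A small sketch of that idea, minimizing the average cross-entropy loss numerically on synthetic data; the use of scipy.optimize here is an assumption for illustration, not the book's approach.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = rng.binomial(1, 1 / (1 + np.exp(-(1.0 + 3.0 * x))))

def avg_log_loss(theta):
    """Average cross-entropy loss for intercept theta[0], slope theta[1]."""
    t = theta[0] + theta[1] * x
    # log(1 + e^t) - y*t is a numerically stable form of the loss
    return np.mean(np.logaddexp(0, t) - y * t)

# No closed form exists, so we minimize numerically (BFGS by default)
result = minimize(avg_log_loss, x0=np.zeros(2))
print(result.x)  # fitted intercept and slope, near (1.0, 3.0)
```

The same numerical approach carries over to the multinomial and Poisson cases the paragraph mentions, with the corresponding likelihood in place of the binary log loss.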
