In sec 20.1, the comment about scipy.minimize where you say "we don't even need to compute the gradient" may be misleading. As you know, by default it uses numerical differentiation to compute the gradient if the user does not supply a grad function, so this is likely to be slow. You may want to mention automatic differentiation libraries like JAX and PyTorch, which solve this problem for you. (Also, scipy.minimize defaults to BFGS, not GD, and chooses the step size automagically :) Since this book is trying to demonstrate "best practice" for DS (e.g. the nice way you use dataframe.pipe for reproducible wrangling), maybe you should show how to use scipy.minimize on your example problem?
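Something along these lines, perhaps (a minimal sketch: the toy least-squares objective and data here are made up, standing in for the book's actual example problem):

```python
# Minimal sketch: scipy.minimize with an exact gradient from JAX,
# instead of relying on slow finite-difference approximations.
import jax
import jax.numpy as jnp
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data standing in for the book's example problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

def loss(w):
    # Mean squared error of a linear model.
    return jnp.mean((jnp.dot(X, w) - y) ** 2)

grad = jax.grad(loss)  # exact gradient via automatic differentiation

# BFGS is scipy.minimize's default for smooth unconstrained problems;
# passing jac= avoids numerical differentiation entirely.
res = minimize(lambda w: float(loss(w)),
               x0=np.zeros(3),
               jac=lambda w: np.asarray(grad(w)),
               method="BFGS")
print(res.x)
```

This would also let you show the finite-difference vs autodiff contrast directly: drop the jac= argument and the same call silently falls back to numerical gradients.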
In sec 20.2, the first two paragraphs need rewriting to avoid repetition/redundancy.
In sec 20.3, maybe mention that for twice-differentiable functions, convexity is equivalent to the second derivative being non-negative (Hessian positive semidefinite in the multivariate case), so the function has a bowl shape. This condition is easier to check in practice than the definition of convexity. It's probably also worth mentioning some examples of convex and non-convex loss functions encountered in the book.
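For concreteness, the second-order condition could be displayed as something like this (a sketch of the statement for twice-differentiable functions):

```latex
% Second-order characterization of convexity for twice-differentiable f:
%   1-D:      f convex  <=>  f''(x) >= 0 for all x
%   general:  f convex  <=>  the Hessian is positive semidefinite everywhere
f \text{ is convex} \;\iff\; f''(x) \ge 0 \;\; \forall x
\qquad \text{(in general, } \nabla^2 f(x) \succeq 0 \text{)}.
```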
Maybe mention SAGA and other variance-reduced SGD methods, since SAGA is used in sec 21.4.1?
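For reference, a minimal sketch of the SAGA update (assuming a hypothetical per-example gradient function grad_i; in 21.4.1 the method is presumably invoked through a library solver rather than hand-rolled):

```python
# Sketch of SAGA (variance-reduced SGD). grad_i(w, i) is an assumed
# user-supplied function returning the gradient of example i's loss at w.
import numpy as np

def saga(grad_i, w0, n, lr=0.01, n_steps=10_000, seed=0):
    rng = np.random.default_rng(seed)
    w = w0.copy()
    # Table of the most recently evaluated gradient for each example,
    # plus their running average.
    table = np.array([grad_i(w, i) for i in range(n)])
    avg = table.mean(axis=0)
    for _ in range(n_steps):
        j = rng.integers(n)
        g_new = grad_i(w, j)
        # SAGA direction: unbiased estimate of the full gradient, with
        # variance that shrinks as the iterates approach the optimum.
        w -= lr * (g_new - table[j] + avg)
        avg += (g_new - table[j]) / n  # keep the running average in sync
        table[j] = g_new
    return w
```

If you add this, it may be worth a sentence on the design trade-off: the gradient table costs O(n·d) memory, which is the usual objection to SAGA versus plain SGD.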
murphyk changed the title from "sec 20.2 scipy.minimize comment" to "ch 20 (optimization): a few small issues" on Jul 7, 2023.