Interactive flashcards and quizzes, as well as additional tutorials, animations, and code, for "Foundations of Data Science with Python" by John M. Shea

jmshea/Foundations-of-Data-Science-with-Python

Foundations of Data Science with Python

by John M. Shea

Learn data visualization, statistics, probability, and dimensionality reduction using a computational-first approach, without giving up mathematical rigor. A great textbook for an Introduction to Data Science or Engineering Statistics class.

Cover of the book *Foundations of Data Science with Python*

Buy on Amazon. [Affiliate link]

This repository is the source for the book's website (fdsp.net).

About the book

This book introduces the foundations of data science, including data visualization, statistics, probability, and dimensionality reduction. It is written for engineers and scientists, but it should be accessible to anyone who knows basic calculus and the basics of computer programming. By building on that background, the book fills a distinct niche among books on data science and statistics:

  • This book applies a modern, computational approach to working with data and, in particular, uses simulations (an approach called resampling) to answer statistical questions.
    • Many books on statistics (especially those for engineers) teach a theoretical approach to answering statistical questions that many learners find difficult to understand. Most learners find resampling far easier to follow than an arcane formula.

  • This text provides a basic, but rigorous, introduction to probability and its application to statistics.
    • Some other books that use the resampling approach to statistics omit the mathematical foundations because they are aimed at a broader audience that may not have the mathematical background of engineers and scientists.

  • This book provides an introduction to some of the most important libraries in the Python data stack, including NumPy, SciPy, Matplotlib, and Pandas.

  • Real data sets are used wherever practical.
    • Many statistics books use contrived examples so that problems can be solved with a calculator, but most of the data sets used in this book are analyzed using computer programs.

  • The data sets and the questions asked are chosen to appeal to a broad audience.
    • Although the approach taken and the material covered are targeted toward engineers and scientists, I try to investigate questions that will appeal to most readers, and especially to college students.

  • The book has a unique set of interactive materials, including interactive quizzes and animated flashcards.
    • These are available on the book's website, fdsp.net. This GitHub repository contains the source files for that site.
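As a rough illustration of the resampling idea mentioned above, here is a minimal sketch of a permutation test for a difference in means. It is not taken from the book: the function name `permutation_test`, its parameters, and the two small data sets are all hypothetical, and the sketch uses only the Python standard library rather than NumPy so that it runs without extra installs.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, num_sims=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in means by resampling.

    Hypothetical helper for illustration only (not from the book).
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    observed = abs(mean(group_a) - mean(group_b))

    # Under the null hypothesis the group labels are interchangeable,
    # so pool the data and repeatedly reshuffle it into two new groups.
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    count = 0
    for _ in range(num_sims):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            count += 1

    # Fraction of shuffles with a difference at least as large as observed
    return count / num_sims


# Made-up measurements, purely for demonstration
a = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
b = [11.0, 12.4, 11.9, 12.8, 13.1, 11.6]

p_value = permutation_test(a, b, num_sims=2000)
print(f"Estimated p-value: {p_value:.3f}")
```

The estimated p-value is simply the fraction of random relabelings that produce a difference at least as extreme as the one observed, which is the sense in which a simulation stands in for a formula.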

About the website

The website (fdsp.net) contains material that could not be included in the book itself, including:

  • Interactive tools to help students learn the material, including:
    • Interactive self-assessment quizzes via JupyterQuiz
    • Interactive flashcards to aid in learning terminology via JupyterCards
    • Animations and interactive visualizations
  • Problem sets for homework or additional practice (Coming soon!)
  • Errata for the book (When available)
  • A list of websites and books for those who want to continue their learning: Next Steps

As an Amazon Associate I earn from qualifying purchases.
