antononcube/HowToBeADataScientistImpostor-book

"How to be a data scientist impostor?" book

This repository has chapters, code, and organizational materials for the book "How to be a data scientist impostor?"

The book is not finished yet; it is a work in progress.

Mission statement

The purpose of this book is to give an overview and examples of philosophical and mathematical methodologies and software programming techniques that allow the reader to practice Data Science almost as successfully as seasoned practitioners with solid backgrounds in Statistics or Machine Learning. (Or better than them.)

Programming languages

The programming languages used are Wolfram Language (WL) and R.

Almost all code is available in both WL and R.

(WL is the primary language. Also, "Wolfram Language" and "Mathematica" are used as synonyms in this book.)

Exposition plan

  1. We start with Data Science market diagnosis and general strategies for problem solving.

    • Here we also discuss the kinds of people we are going to collaborate with, argue with, be examined by, and be hired by.
  2. Then we proceed with didactic chapters for:

    • doing data analysis, and

    • explanations of fundamental Machine Learning (ML) algorithms.

  3. Then we give practical know-how for tackling certain ML problems. Variations of those problems often occur in "real life."

  4. Finally, we show some "shock and awe" projects.

See this mind-map or this org-mode file for a more detailed order of the book's parts and chapters.

Generally speaking, I am very interested in comparisons of the abilities of theories, methodologies, programming languages, algorithms, and concrete implementations to solve problems encountered in practice. This book presents a fair amount of such comparisons.

(And yes, those comparisons include WL/Mathematica versus R.)

Who is the intended reader?

This book is for the smart and audacious. (Definitely not for dummies…)

The reader is expected to have at least one fairly well-developed relevant skill, such as one of the following:

  • Programming ability.
  • Mathematical maturity and reasoning abilities.
  • Mathematical modeling abilities.
  • Ability to express processes through equations and formulas.
  • Systems operations knowledge.
  • Strong Physics or Physical Sciences engineering background.
    • For example, Mechanical Engineering, Electrical Engineering, Chemical Engineering …
    • (Software Engineering does not count here.)

We assume the reader is inquisitive and willing to jump into the water without knowing how to swim.

Code and related repositories

Most of the practical ML know-how projects come from MathematicaVsR at GitHub.

Many of the chapters were previously published in MathematicaForPrediction at WordPress or MathematicaForPrediction at GitHub.

This book outsources the detailed explanations of the core Machine Learning workflows to the book "Simplified Machine Learning Workflows", which in turn outsources the explanations of software architecture methods to the book "Software Design Methods with Wolfram Language".

Here is a diagram that shows the dependencies between the books and code repositories:

BookDependencies

Videos

Below is a list of videos of my presentations that discuss some of the topics in this book. The videos most relevant to this book are listed first.


Anton Antonov
Windermere, Florida, USA
2019-07-15
