Skip to content

PacktPublishing/Thoughtful-Data-Science

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Thoughtful Data Science

This is the code repository for Thoughtful Data Science, published by Packt. It contains all the supporting project files necessary to work through the book from start to finish.

About the Book

Thoughtful Data Science brings new strategies and a carefully crafted programmer's toolset to work with modern, cutting-edge data analysis. This new approach is designed specifically to give developers more efficiency and power to create cutting-edge data analysis and artificial intelligence insights.

Industry expert David Taieb bridges the gap between developers and data scientists by creating a modern open-source, Python-based toolset that works with Jupyter Notebook, and PixieDust. You'll find the right balance of strategic thinking and practical projects throughout this book, with extensive code files and Jupyter projects that you can integrate with your own data analysis.

David Taieb introduces four projects designed to connect developers to important industry use cases in data science. The first is an image recognition application with TensorFlow, to meet the growing importance of AI in data analysis. The second analyses social media trends to explore big data issues and natural language processing. The third is a financial portfolio analysis application using time series analysis, pivotal in many data science applications today. The fourth involves applying graph algorithms to solve data problems. Taieb wraps up with a deep look into the future of data science for developers and his views on AI for data science.

Instructions and Navigation

All of the code is organized into folders. Each folder starts with a number followed by the application name. For example, Chapter02.

The code will look like the following:

import pandas
data_url = "https://data.cityofnewyork.us/api/views/e98g-f8hy/rows.csv?accessType=DOWNLOAD"
building_df = pandas.read_csv(data_url)
building_df
  • Most of the software needed to follow the example is open source and therefore free to download. Instructions are provided throughout the book, starting with installing anaconda which includes the Jupyter Notebook server.
  • In Chapter 7, Big Data Twitter Sentiment Analysis, the sample application requires the use of IBM Watson cloud services including NLU and Streams Designer. These services come with a free tier plan, which is suffiient to follow the example along.

Related Products