Skip to content

IBMDeveloperUK/Python-Pandas-Workshop

Repository files navigation

Data analysis in Python using pandas

You might think that Python is only for developers and people with computer science degrees. However, Python is great for beginners, even those with little coding experience because it’s free, open source, and runs on any platform. The Python packages documentation is great, and after an introductory course, you have a good foundation to build on.

Python is a general purpose and high-level programming language that is used for more than working with data. For example, it’s good for developing desktop GUI applications, websites, and web applications. However, this tutorial focuses on the data and only goes through getting started with data.

pandas is an open source Python Library that provides high-performance data manipulation and analysis. With the combination of Python and pandas, you can accomplish five typical steps in the processing and analysis of data, regardless of the origin of data: load, prepare, manipulate, model, and analyze.

There are many options when working with the data using pandas. The following list shows some of the things that can be done using pandas.

  • Cleaning data by removing or replacing missing values
  • Converting data formats
  • Sorting rows
  • Deleting or adding rows and columns
  • Merging or joining DataFrames
  • Summarizing data by pivoting or reshaping
  • Creating visualizations

This list is far from complete. See the pandas documentation for more of what you can do.

This workshop walks you though some of the most interesting features of pandas using structured data that contains information about the boroughs in London. You can download the data used in the tutorial from data.gov.uk.

Getting Started with Jupyter Notebooks

Instead of writing code in a text file and then running the code with a Python command in the terminal, you can do all of your data analysis in one place. Code, output, tables, and charts can all be edited and viewed in one window in any web browser with Jupyter Notebooks. As the name suggests, this is a notebook to keep all of your ideas and data explorations in one place. In this tutorial, you use IBM Watson Studio to run a notebook. For this, you need a free IBM Cloud account. The following steps show you how sign up and get started. When you have the notebook up and running, you can go through the notebook.

Jupyter notebooks are an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.

In this workshop we will use IBM Watson Studio to run a notebook. For this you will need an IBM Cloud account. The following steps will show you how sign up and get started. When you have the notebook up and running we will go through the notebook.

IBM Cloud

  • Use this link to sign up for an IBM Cloud account

  • When you are signed up click Create Resource at the top of the Resources page. You can find the resources under the hamburger menu at the top left:

  • or Search for Watson Studio and click on the tile:

  • Select the Lite plan and click Create.
  • Go back to the Resources list and click on your Watson Studio service and then click Get Started.

IBM Watson Studio

Create a new Project

  • You should now be in Watson Studio.
  • Create a new project by clicking on Get Started and New Project, or Create Project

  • Give your Project a name.
  • Select an Object Storage from the drop-down menu or create a new one for free. This is used to store the notebooks and data. Do not forget to click refresh when returning to the Project page.

  • click Create.

4. Load and run a notebook

  • Add a new notebook. Click Add to project and choose Notebook:

  • Choose new notebook From URL. Give your notebook a name and copy the URL https://github.com/IBMDeveloperUK/python-pandas-workshop/blob/master/Working-with-structured-data-in-Python-using-Pandas.ipynb
  • Select the default runtime and click Create Notebook.
  • The notebook will load. You are now ready to follow along with the workshop in the notebook!

References

Find the tutorial on IBM Developer.

About

This workshop walks you though some of the most interesting features of pandas using structured data that contains information about the boroughs in London.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published