Introduction to web scraping

Overview

This is a one-hour beginner's introduction to web scraping using Python. We'll work through a complete example of scraping a university website that lists course information, resulting in a dataset of almost 10,000 university courses. We'll focus on the concepts involved in web scraping rather than on memorizing Python syntax.

What you'll learn

  • Why you'd want to scrape data from the web in the first place
  • A high-level view of how the web works
  • How to make an HTTP request in Python
  • How to parse HTML in Python
  • Why you need to read a website's Terms of Service before you scrape it
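
To give a feel for the two Python steps in the list above, here is a minimal sketch that uses only the standard library. It is an illustration, not the workshop's own code: the workshop may use different libraries, and the names `fetch_html`, `HeadingParser`, `extract_headings` and `SAMPLE_HTML`, the `User-Agent` string, and the course titles in the sample are all made up for this example. The sketch also assumes that each course title sits in an `<h2>` element, which will not be true of every site.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Make an HTTP GET request and return the response body as text."""
    # The User-Agent value is an arbitrary, made-up identifier.
    request = Request(url, headers={"User-Agent": "intro-scraping-example"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class HeadingParser(HTMLParser):
    """Collect the text inside every <h2> element."""

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_heading = False

    def handle_data(self, data):
        # Text nested inside other tags within the <h2> is kept as well.
        if self.in_heading:
            self.headings[-1] += data


def extract_headings(html):
    """Parse an HTML string and return the stripped text of each <h2>."""
    parser = HeadingParser()
    parser.feed(html)
    parser.close()
    return [heading.strip() for heading in parser.headings]


# Invented sample page, so the parsing step runs without network access.
SAMPLE_HTML = """
<html><body>
  <h2>Example Course 101</h2>
  <p>Four units.</p>
  <h2>Example Course 102</h2>
</body></html>
"""

if __name__ == "__main__":
    print(extract_headings(SAMPLE_HTML))
```

To try it on a live page, pass the result of `fetch_html(url)` to `extract_headings`, but only for a site whose Terms of Service allow scraping.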

Prerequisites

Anyone is welcome at this workshop, whatever their level of programming experience, because we'll focus on the concepts behind web scraping more than on specific syntax. The workshop will be most useful to people who have some familiarity with Python but have never done web scraping before.

IOKN2K

It's OK Not To Know! That's our motto at D-Lab. D-Lab is open to researchers and professionals from all disciplines and levels of experience. Ask any questions.

Contributing

If you spot a problem with these materials, please open an issue describing the problem.

Author

  • Geoff Bacon

Acknowledgments

  • Chris Hench
