Note: This workshop was given at Access 2019 on October 2, 2019.
You can find the exercise files in Python and R.
The Web has become a source of data for both everyday work and scientific research. Although there are many initiatives to facilitate data exchange, most Web content is still written in plain HTML. This workshop introduces three approaches to web scraping, ordered from simple to advanced (Google Sheets, Python, and R), for extracting web data into a standard format such as CSV, XML, or JSON, and shows how these techniques can be applied to daily work and research.
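To make the idea concrete, here is a minimal sketch in Python of turning an HTML table into CSV and JSON. It is not taken from the workshop exercises: the sample markup, the `TableParser` class, and the `scrape_table` and `to_csv` helpers are all made up for illustration, and it uses only the standard library so that it runs without any installation. A real scraper would first download the page and would likely rely on a dedicated HTML parsing library.

```python
import csv
import io
import json
from html.parser import HTMLParser

# Made-up sample markup (an assumption for illustration only).
# A real scraper would download the HTML from a web page first.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, if any
        self._cell = None   # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def to_csv(records):
    """Serialize the records as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = scrape_table(SAMPLE_HTML)
csv_text = to_csv(records)
json_text = json.dumps(records, indent=2)

print(csv_text)
print(json_text)
```

The same list of records feeds both output formats, which is the main point of the sketch: once the HTML is reduced to plain rows, exporting to CSV or JSON is a one-step conversion.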
You can find the slides I used for this workshop on Zenodo.