John Little 2021-06-17

A workshop case study on webscraping

Using the rvest library to learn about web crawling and HTML parsing in R.

  • Introduce just enough HTML/CSS
  • Introduce the rvest package, loaded with library(rvest), for harvesting websites/HTML
  • Tidyverse iteration with purrr::map
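
The three topics above can be sketched together in a few lines. This is a minimal illustration, not the workshop's own code: the HTML snippets, the `staff` and `role` class names, and the `get_role()` helper are all hypothetical, and the markup is inlined so the example runs without network access. For a live site, `read_html()` on a URL would take the place of `minimal_html()`.

```r
library(rvest)
library(purrr)

# Hypothetical page fragment, inlined so the example runs offline.
page <- minimal_html('
  <ul class="staff">
    <li><a href="/people/ada">Ada</a></li>
    <li><a href="/people/grace">Grace</a></li>
  </ul>
')

# CSS selector: every <a> nested inside <ul class="staff">.
links <- html_elements(page, "ul.staff a")

staff_names <- html_text2(links)        # visible link text
staff_paths <- html_attr(links, "href") # value of each href attribute

# Hypothetical "detail pages", one HTML string per person.
# In a real crawl these would be fetched from the paths above.
detail_pages <- list(
  ada   = '<h1>Ada</h1><p class="role">Engineer</p>',
  grace = '<h1>Grace</h1><p class="role">Admiral</p>'
)

# Helper (hypothetical): parse one page and pull out a single field.
get_role <- function(html) {
  doc  <- minimal_html(html)
  node <- html_element(doc, "p.role")
  html_text2(node)
}

# purrr::map_chr applies the helper to every page and
# returns a named character vector.
roles <- map_chr(detail_pages, get_role)
```

The pattern of writing a small function that handles one page and then mapping it over many pages keeps the parsing logic testable on a single example before scaling up.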

Workshop Video: https://youtu.be/8ISc8V9GDAg

See also: What to know about law & ethics when archiving & mining data, by Rachael Samberg, J.D., MLIS, and Timothy Vollmer, MIS, along with the UC Berkeley Office of Scholarly Communication Services YouTube playlists on navigating intellectual property, copyright, and fair use. Please note that the Samberg/Vollmer slides are in this GitHub repo's slides folder and are redistributed with permission from the slide authors.


License

John Little https://JohnLittle.info https://Rfun.library.duke.edu https://library.duke.edu/data

CC BY-NC

Creative Commons Attribution-NonCommercial https://creativecommons.org/licenses/by-nc/4.0