
A bot written in R that extracts a table from a pdf file, processes the data and saves everything to a csv file


jlomako/pdfscraper-R


A bot written in R that extracts a table from a PDF file, processes the data, and saves the result to a CSV file. It runs every day around noon local time. It is somewhat slow and has been replaced by a faster bot that does exactly the same thing in Python; see pdfscraper.
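The extract, process, and save steps could look roughly like the sketch below. This is not the bot's actual code: the function names `parse_table_lines` and `scrape_pdf`, the file paths, and the assumption that table columns are separated by two or more spaces are all hypothetical. Only `pdftools::pdf_text()`, which returns one character string per page, is a real pdftools call.

```r
# Split the text of one PDF page into table rows (hypothetical helper).
# Assumes columns are separated by runs of two or more whitespace characters
# and that every table row has exactly n_cols fields.
parse_table_lines <- function(page_text, n_cols) {
  lines <- trimws(unlist(strsplit(page_text, "\n", fixed = TRUE)))
  lines <- lines[nzchar(lines)]                      # drop empty lines
  fields <- strsplit(lines, "[[:space:]]{2,}")       # split into columns
  rows <- fields[lengths(fields) == n_cols]          # keep table-shaped lines only
  if (length(rows) == 0) {
    return(data.frame())
  }
  # Columns are named V1, V2, ...; a header line in the PDF ends up as a data row
  as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
}

# Read every page of the PDF, parse it, and write all rows to one CSV file
# (hypothetical wrapper; needs the pdftools package to be installed).
scrape_pdf <- function(pdf_path, csv_path, n_cols) {
  pages <- pdftools::pdf_text(pdf_path)              # one string per page
  tables <- lapply(pages, parse_table_lines, n_cols = n_cols)
  result <- do.call(rbind, tables)
  write.csv(result, csv_path, row.names = FALSE)
  invisible(result)
}
```

A call such as `scrape_pdf("report.pdf", "table.csv", n_cols = 3)` would then produce the CSV file; both file names are placeholders.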

pdfscraper_R

Notes to myself about some problems I ran into:

  • Loading the R package "pdftools" resulted in errors. Solution: use the runner "macos-10.15" and install XQuartz before pdftools is installed by adding run: brew install xquartz --cask to the yml file.
  • GitHub stopped supporting macos-10.15 in summer 2022; the workflow runs on macos-11 now.
  • ggplot stopped working and has been deactivated until I find a solution.
  • Created a new yml file that loads packages from renv.lock (still runs on macos-11; ubuntu is not working).
  • If a run fails with exit code 128, check the workflow permissions under Settings > Actions > General: read and write permissions need to be enabled.
  • The bot has been retired and replaced by a faster version. It was still operational as of February 8th, 2023.
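
Put together, the notes above suggest a workflow file along the lines of the sketch below. It is an illustration and not the repository's actual yml file: the file name, the cron time, the script name `scraper.R`, the action versions, and the commit step are all assumptions. The `permissions` block is one possible alternative to changing the repository-wide setting.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/scrape.yml
name: pdfscraper

on:
  schedule:
    # Cron times are in UTC; 16:00 UTC is an assumed stand-in for "around noon local time"
    - cron: "0 16 * * *"
  workflow_dispatch:

# Lets the job push commits; without write access the push step can fail with exit code 128
permissions:
  contents: write

jobs:
  scrape:
    runs-on: macos-11
    steps:
      - uses: actions/checkout@v3
      # XQuartz has to be installed before pdftools
      - name: Install XQuartz
        run: brew install xquartz --cask
      - uses: r-lib/actions/setup-r@v2
      # Restores the packages recorded in renv.lock
      - uses: r-lib/actions/setup-renv@v2
      - name: Run the scraper
        run: Rscript scraper.R   # hypothetical script name
      - name: Commit the updated csv
        run: |
          git config user.name "github-actions"
          git config user.email "github-actions@users.noreply.github.com"
          git add -A
          git commit -m "update data" || echo "nothing to commit"
          git push
```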
