ScrapPaper

About this project

ScrapPaper is a web scraping method that extracts journal information from PubMed and Google Scholar search results using a Python script. Users need to install Python 3 and the required modules, then run the scrappaper.py script. Refer to the published paper for detailed instructions. This side project was completed on March 8, 2022 by @rafsanlab. Follow me on Twitter: https://twitter.com/rafsanlab

Paper to cite:

Rafsanjani, M. R. (2022). ScrapPaper: A web scrapping method to extract journal information from PubMed and Google Scholar search result using Python. In bioRxiv (p. 2022.03.08.483427). https://doi.org/10.1101/2022.03.08.483427

System requirements

Python (version 3 or above)
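The following Python modules: requests, csv, re, time, random, pandas, sys, bs4 (csv, re, time, random, and sys are part of the Python standard library; requests, pandas, and bs4 must be installed separately, with bs4 provided by the beautifulsoup4 package)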
Operating system (current code was tested on Windows 10)
Command Prompt (on Windows) or a terminal
The link to the first page of search results from PubMed or Google Scholar
A text editor or spreadsheet software to open the results
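
Before running the script, it can help to confirm that the modules listed above are importable. The following is a minimal sketch, not part of scrappaper.py; the function name missing_modules and the REQUIRED_MODULES list are introduced here for illustration only.

```python
import importlib.util

# Modules named in the requirements list. csv, re, time, random and sys ship
# with Python; requests, pandas and bs4 are third-party packages.
REQUIRED_MODULES = ["requests", "csv", "re", "time", "random", "pandas", "sys", "bs4"]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If bs4 is reported as missing, note that the package to install is named beautifulsoup4.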

Simplified instructions

Download the scrappaper.py script and, in a terminal, cd to the directory that contains it.
Copy the link to the first page of search results from PubMed or Google Scholar.
Run the script and paste the link when prompted.
When it finishes, open the results in a text editor or spreadsheet software.
Refer to the published paper for detailed instructions.
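
Since the script expects a link from one of two sites, a quick check of the pasted link can catch copy mistakes early. The sketch below is an illustration only and is not taken from scrappaper.py; the helper name detect_source is hypothetical, and the hostnames are assumed to be pubmed.ncbi.nlm.nih.gov for PubMed and any scholar.google.* domain for Google Scholar.

```python
from urllib.parse import urlparse


def detect_source(url):
    """Guess which supported site a search link points to.

    Returns "pubmed", "scholar", or None for anything else.
    Hypothetical helper; the real script may handle links differently.
    """
    host = (urlparse(url.strip()).hostname or "").lower()
    if host == "pubmed.ncbi.nlm.nih.gov":
        return "pubmed"
    if host.startswith("scholar.google."):
        return "scholar"
    return None


if __name__ == "__main__":
    link = input("Paste the search link: ")
    source = detect_source(link)
    print(source or "Not a PubMed or Google Scholar link")
```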

Disclaimer

Web scraping might get you blocked by the server, so run the script at your own risk. So far, we have scraped 28 pages of Google Scholar results with no issues.
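
One common way to lower the chance of being blocked is to pause for a random interval between page requests. The sketch below shows the general idea using the time and random modules from the requirements list. It is an assumption about good practice, not a description of how scrappaper.py paces its requests; the function name paced and the default intervals are hypothetical.

```python
import random
import time


def paced(items, min_seconds=2.0, max_seconds=6.0, sleep=time.sleep):
    """Yield items one by one, pausing a random interval between them.

    No pause is made before the first item. The `sleep` argument can be
    swapped out, for example to record waits instead of actually sleeping.
    """
    for index, item in enumerate(items):
        if index > 0:
            # Randomised pause so requests are not sent at a fixed rhythm.
            sleep(random.uniform(min_seconds, max_seconds))
        yield item
```

For example, looping with `for page_number in paced(range(1, 29)):` would visit 28 page numbers with a pause before each one after the first. Pausing does not guarantee that a server will not block the requests.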
