dbscrape: scrape html tables from drugbank

Export HTML tables from DrugBank to dataframes. This function can come in handy in cases where downloading and processing the full database is not viable. Code was kept as simple as possible, using only the xpath of each table and spliting respective urls in a manner that takes pagination into account.

Note

After fetching is finished, some tables may need further proccesing for information to be presentable -I suggest using regex to fix those issues.

Install:

if (!requireNamespace("remotes")) install.packages("remotes")
remotes::install_github("ptogias/dbscrape")

Overview

There are three parameters used in dbscrape

1. std_url

Character. The URL post pagination (stops to 'page=').

2. pages

Numeric. Pagination range (i.e. '1:88'). Must start from 1 and end to the respective query end page (search for this manually).

3. postfix

[Optional] Character. Applied on-page filters. These are indicated after the "page=" url part. Defaults to an empty char vector. See second example.

Examples

std_url <- "https://go.drugbank.com/pharmaco/metabolomics?page="
pages <- 1:124
postfix <- ""
dbscrape::dbscrape(std_url, pages, postfix)

std_url <- "https://go.drugbank.com/categories?approved=0&ca=0&eu=1&experimental=1&illicit=0&investigational=0&nutraceutical=0&page="
pages <- 1:36
postfix <- "&q[description]=&q[drug_count]=&q[target_count]=&q[title]=&us=0&withdrawn=0"
dbscrape::dbscrape(std_url, pages, postfix)

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
R		R
man		man
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.md		README.md
dbscrape.Rproj		dbscrape.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

R

R

man

man

DESCRIPTION

DESCRIPTION

NAMESPACE

NAMESPACE

README.md

README.md

dbscrape.Rproj

dbscrape.Rproj

Repository files navigation

dbscrape: scrape html tables from drugbank

Note

Overview

1. std_url

2. pages

3. postfix

Examples

About

Releases 1

Languages

ptogias/dbscrape

Folders and files

Latest commit

History

Repository files navigation

dbscrape: scrape html tables from drugbank

Note

Overview

1. std_url

2. pages

3. postfix

Examples

About

Topics

Resources

Stars

Watchers

Forks

Languages