SPARQLchunks

Coding in R is useless without interesting research questions, and even the best questions remain unanswered without data. RStudio provides a number of convenient ways to access data, among them the ability to write SQL code chunks in R Markdown, run them, and assign the query result directly to a variable of your choice. No such facility exists yet for SPARQL queries, the ones that let you navigate the gigantic knowledge graphs that incarnate the conscience of the semantic web. This is where the SPARQLchunks package steps in.

This package allows you to query SPARQL endpoints in two different ways:

It allows you to run SPARQL chunks in R Markdown files.
It provides inline functions to send SPARQL queries to a user-defined endpoint and retrieve data in dataframe form (sparql2df) or list form (sparql2list).

Endpoints can be reached from behind corporate firewalls on Windows machines thanks to automatic proxy detection. See Execute SPARQL chunks in R Markdown.

Installation

Most users can install the package by running this command:

remotes::install_github("aourednik/SPARQLchunks", build_vignettes = TRUE)

If you are behind a corporate firewall on a Windows machine, direct access to GitHub might be blocked. If that is the case, run this installation code instead:

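# Look up the proxy that Windows is configured to use for GitHub
proxy_url <- curl::ie_get_proxy_for_url("https://github.com")
# Route httr requests through that proxy
httr::set_config(httr::use_proxy(proxy_url))
# Install from the zip archive of the master branch
remotes::install_url("https://github.com/aourednik/SPARQLchunks/archive/refs/heads/master.zip", build_vignettes = TRUE)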

Use

To use the full potential of the package you need to load the library and tell knitr that a SPARQL engine exists:

library(SPARQLchunks)
knitr::knit_engines$set(sparql = SPARQLchunks::eng_sparql)
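
A quick way to confirm the registration, sketched here on the assumption that the package is already installed, is to look the engine up in knitr's engine registry. In an R Markdown document these lines would typically sit in a setup chunk near the top of the file.

```r
library(SPARQLchunks)

# Register the SPARQL engine with knitr (same call as above)
knitr::knit_engines$set(sparql = SPARQLchunks::eng_sparql)

# TRUE if knitr now knows an engine named "sparql"
"sparql" %in% names(knitr::knit_engines$get())
```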

Once you have done so, you can run SPARQL chunks:

Chunks

Retrieve a data frame

output.var: the name of the data frame you want to store the results in

endpoint: the URL of the SPARQL endpoint

autoproxy: whether to attempt automatic proxy detection

auth: authentication information for the SPARQL endpoint (an httr authentication object; optional)
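
As a minimal sketch of the auth option, and assuming the endpoint uses HTTP basic authentication, the authentication object could be created in an ordinary R chunk with httr::authenticate() and then referenced by name in the SPARQL chunk header (auth=endpoint_auth). The environment variable names SPARQL_USER and SPARQL_PASSWORD and the variable name endpoint_auth are placeholders chosen for illustration; the package does not prescribe them.

```r
# Hypothetical credentials, read from environment variables so that they
# do not end up in the .Rmd file (the variable names are assumptions).
endpoint_auth <- httr::authenticate(
  user = Sys.getenv("SPARQL_USER"),
  password = Sys.getenv("SPARQL_PASSWORD"),
  type = "basic"
)
```

Since knitr evaluates chunk options as R expressions, writing auth=endpoint_auth in the chunk header should hand this object to the SPARQL engine.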

Example 1 (Swiss administration endpoint)

```{sparql output.var="queryres_df", endpoint="https://lindas.admin.ch/query"}
PREFIX schema: <http://schema.org/>
SELECT * WHERE {
  ?sub a schema:DataCatalog .
  ?subtype a schema:DataType .
}
```

Example 2 (Uniprot endpoint)

Note the attempt at automatic proxy detection (autoproxy=TRUE).

```{sparql output.var="tes5", endpoint="https://sparql.uniprot.org/sparql", autoproxy=TRUE}
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?taxon
FROM <http://sparql.uniprot.org/taxonomy>
WHERE {
	?taxon a up:Taxon .
} LIMIT 500
```

Example 3 (Wikidata endpoint)

```{sparql output.var="res.df", endpoint="https://query.wikidata.org/sparql"}
SELECT DISTINCT ?item ?itemLabel ?country ?countryLabel ?linkTo ?linkToLabel
WHERE {
    ?item wdt:P1142 ?linkTo .
    ?linkTo wdt:P31 wd:Q12909644 .
    VALUES ?type { wd:Q7278  wd:Q24649 }
    ?item wdt:P31 ?type .
    ?item wdt:P17 ?country .
    MINUS { ?item wdt:P576 ?abolitionDate }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
```

Retrieve a list

output.var: the name of the list you want to store the results in

endpoint: the URL of the SPARQL endpoint

output.type: when set to "list", retrieves a list (tree structure) instead of a data frame

autoproxy: whether to attempt automatic proxy detection

```{sparql output.var="queryres_list", endpoint="https://lindas.admin.ch/query", output.type="list"}
PREFIX schema: <http://schema.org/>
SELECT * WHERE {
  ?sub a schema:DataCatalog .
  ?subtype a schema:DataType .
}
```

Inline code

The inline functions sparql2df and sparql2list take the same two required arguments, a SPARQL endpoint and a SPARQL query, plus the optional autoproxy flag shown below. Queries can be multi-line:

endpoint <- "https://lindas.admin.ch/query"
query <- "PREFIX schema: <http://schema.org/>
  SELECT * WHERE {
  ?sub a schema:DataCatalog .
  ?subtype a schema:DataType .
}"
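
Because the query is an ordinary character string, it can also be assembled programmatically. The sketch below shows one way to do so; the helper build_type_query, its arguments, and the variable catalog_query are hypothetical names introduced for illustration and are not part of the package.

```r
# Hypothetical helper (not part of SPARQLchunks): builds a SELECT query
# that lists resources of one schema.org class, capped at `limit` rows.
build_type_query <- function(class_name, limit = 10L) {
  paste0(
    "PREFIX schema: <http://schema.org/>\n",
    "SELECT ?sub WHERE {\n",
    "  ?sub a schema:", class_name, " .\n",
    "} LIMIT ", as.integer(limit)
  )
}

catalog_query <- build_type_query("DataCatalog", limit = 5)
cat(catalog_query)
```

The resulting string can be passed to sparql2df or sparql2list in the same way as a hand-written query.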

Retrieve a data frame

result_df <- sparql2df(endpoint, query)

The same, but with an attempt at automatic proxy detection:

result_df <- sparql2df(endpoint, query, autoproxy = TRUE)

Retrieve a list

result_list <- sparql2list(endpoint, query)

The same, but with an attempt at automatic proxy detection:

result_list <- sparql2list(endpoint, query, autoproxy = TRUE)
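
Remote endpoints can be slow or unreachable, so a script may want to fail softly. The following is a sketch only, assuming that a failed request surfaces as an ordinary R error; it wraps the call in tryCatch and falls back to NULL.

```r
library(SPARQLchunks)

endpoint <- "https://lindas.admin.ch/query"
query <- "PREFIX schema: <http://schema.org/>
SELECT * WHERE {
  ?sub a schema:DataCatalog .
} LIMIT 10"

# Return NULL instead of stopping the script if the request raises an error
result_df <- tryCatch(
  sparql2df(endpoint, query),
  error = function(e) {
    message("SPARQL request failed: ", conditionMessage(e))
    NULL
  }
)

# Inspect the result only if something came back
if (!is.null(result_df)) {
  str(result_df)
}
```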
