
Potentially unintended consequences of limit in dbpedia_get_wikidata_uris() #29

Open
ChristophLeonhardt opened this issue Feb 21, 2024 · 1 comment

Comments

@ChristophLeonhardt
Collaborator

The limit parameter can have unintended consequences, at least when following the current documentation of the dbpedia_get_wikidata_uris() function.

Take this as an example:

wikidata_uris <- dbpedia_get_wikidata_uris(
  x = c("http://dbpedia.org/resource/London", "http://dbpedia.org/resource/Washington,_D.C."),
  endpoint = "https://dbpedia.org/sparql/",
  wait = 5,
  limit = 2,
  progress = TRUE
)

In this example, the two queries are processed as one chunk, i.e. in one query sent to the endpoint. Although both items have Wikidata IDs associated with them in DBpedia, only Wikidata IDs for the first item are returned.

At first glance, this might be expected behavior. The limit argument is used as a parameter of the query and controls the number of results returned by the server. If it is set to 2, and the first item in the query has more than one Wikidata ID in the "sameAs" property (which it does in this example), then all returned Wikidata IDs will be for this first item only.
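To make this concrete, here is a hedged sketch of how a per-chunk query with a LIMIT clause might be assembled (this is a reconstruction for illustration, not the package's actual code, and the real query may differ). With two owl:sameAs Wikidata values for London, LIMIT 2 is exhausted before any row for Washington, D.C. is returned:

```r
# Hypothetical reconstruction of a per-chunk SPARQL query; the real query
# built by dbpedia_get_wikidata_uris() may differ in structure and prefixes.
uris <- c(
  "http://dbpedia.org/resource/London",
  "http://dbpedia.org/resource/Washington,_D.C."
)

# All URIs of the chunk go into one VALUES clause ...
values_clause <- paste(sprintf("<%s>", uris), collapse = " ")

# ... but a single LIMIT caps the total number of result rows,
# not the number of rows per input item.
query <- sprintf(
  "SELECT ?item ?wikidata_uri WHERE {
     VALUES ?item { %s }
     ?item owl:sameAs ?wikidata_uri .
     FILTER(STRSTARTS(STR(?wikidata_uri), 'http://www.wikidata.org/'))
   } LIMIT %d",
  values_clause, 2L
)
cat(query)
```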

However, limit is also used to split the input vector x into chunks, and this is how the argument is documented in the package. As a result, both URIs are passed in a single SPARQL query, and the single LIMIT value applies to the combined results for both items.
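The chunking side of the behavior can be illustrated in base R (a sketch under the assumption that the input is split into chunks of size limit; the package's actual implementation may differ):

```r
# Hypothetical sketch: splitting the input vector into chunks of size
# `limit`, so that each chunk becomes one SPARQL query.
uris <- c(
  "http://dbpedia.org/resource/London",
  "http://dbpedia.org/resource/Washington,_D.C."
)
limit <- 2
chunks <- split(uris, ceiling(seq_along(uris) / limit))
length(chunks)  # with limit = 2, both URIs land in a single chunk
```

This is why, in the example above, both URIs share one query and therefore one LIMIT.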

In the case above, a larger value for limit would solve the problem as it would allow all values for both items to be returned.

But I think that using limit for both purposes (as the LIMIT clause of the query and as the chunk size for the input vector) might be confusing and should be reconsidered.

@ablaette
Contributor

Just want to report that after introducing a distinction between 'chunksize' and 'limit', diverging values resulted in join problems: essentially, the two values should be identical!
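The join problem described above can be sketched as follows (a minimal illustration with made-up result rows, assuming a chunk of two items where the server-side LIMIT truncates the results after the first item; Q84 is London's Wikidata ID):

```r
# Hypothetical illustration: if LIMIT cuts off results mid-chunk, a later
# join of results back onto the input leaves gaps for truncated items.
input <- data.frame(item = c("London", "Washington,_D.C."))

# Suppose the truncated response contains rows for the first item only.
returned <- data.frame(item = "London", wikidata = "Q84")

merged <- merge(input, returned, by = "item", all.x = TRUE)
merged  # Washington,_D.C. ends up with NA for its Wikidata ID
```

Keeping the chunk size and the query LIMIT identical (or making LIMIT generous enough for all sameAs values in a chunk) avoids this truncation.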
