The parameter `limit` might have unintended consequences, at least when following the current documentation of the `dbpedia_get_wikidata_uris()` function. Take this as an example:

In this example, the two input URIs are processed as one chunk, i.e. in one query sent to the endpoint. Although both items have Wikidata IDs associated with them in DBpedia, only Wikidata IDs for the first item are returned.
At first glance, this might look like expected behavior. The `limit` argument is used as a parameter of the query and controls the number of results returned by the server. If it is set to 2, and the first item in the query has more than one Wikidata ID in its `sameAs` property (which it does in this example), then all returned Wikidata IDs will belong to this first item only.
However, `limit` is also used to split the input vector, i.e. the URIs in `x`, into chunks, which is how the argument is documented in the package. This is why both URIs are passed in a single SPARQL query that includes a single LIMIT clause covering both items.
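To make the interaction concrete, here is a minimal simulation in Python (not the package's actual R code; the URIs and `sameAs` data are made up) of what happens when one `limit` value both sizes the chunks and caps the rows returned per query:

```python
# Simulated owl:sameAs links in DBpedia: the first item has two Wikidata IDs.
SAME_AS = {
    "http://dbpedia.org/resource/A": ["wd:Q1", "wd:Q2"],
    "http://dbpedia.org/resource/B": ["wd:Q3"],
}

def chunk(uris, limit):
    """Split the input vector of URIs into chunks of size `limit`."""
    return [uris[i:i + limit] for i in range(0, len(uris), limit)]

def run_query(uris, limit):
    """Simulate a SPARQL endpoint: collect all matches, then apply LIMIT."""
    rows = [(u, wd) for u in uris for wd in SAME_AS[u]]
    return rows[:limit]

uris = list(SAME_AS)
limit = 2

for c in chunk(uris, limit):
    print(run_query(c, limit))
# Both URIs land in one chunk, and the query for that chunk also says LIMIT 2:
# the two returned rows both belong to item A, and item B's ID is cut off.
```

Running this prints only `('…/A', 'wd:Q1')` and `('…/A', 'wd:Q2')`, mirroring the behavior reported above.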
In the case above, a larger value for `limit` would solve the problem, as it would allow all values for both items to be returned.
But I think that using `limit` for both purposes, in the query and for chunking the input vector, might be confusing and should be reconsidered.
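One possible separation, sketched here as a hypothetical Python simulation (the parameter names `chunksize` and `rows_per_item` are illustrative, not the package's API), would let `chunksize` split the input while the SPARQL LIMIT is set generously enough that no rows for a chunk are truncated:

```python
# Simulated owl:sameAs data; the first item has two Wikidata IDs.
SAME_AS = {
    "A": ["wd:Q1", "wd:Q2"],
    "B": ["wd:Q3"],
}

def run_query(items, limit):
    """Simulate a SPARQL endpoint: collect all matches, then apply LIMIT."""
    rows = [(item, wd) for item in items for wd in SAME_AS.get(item, [])]
    return rows[:limit]

def fetch(items, chunksize, rows_per_item=10):
    """Chunk the input by `chunksize`; set LIMIT independently of it."""
    results = []
    for i in range(0, len(items), chunksize):
        part = items[i:i + chunksize]
        # LIMIT is decoupled from the chunk size: a generous multiple of it,
        # so all sameAs values for every item in the chunk come back.
        results.extend(run_query(part, limit=chunksize * rows_per_item))
    return results

# With chunksize == 2 but a decoupled LIMIT of 20, all three IDs are returned:
print(fetch(["A", "B"], chunksize=2))
# [('A', 'wd:Q1'), ('A', 'wd:Q2'), ('B', 'wd:Q3')]
```

This is only a sketch of the separation idea; as the follow-up below notes, keeping two related values in sync brings its own pitfalls.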
Just want to report that after introducing a distinction between `chunksize` and `limit`, a divergence between the two values resulted in join problems. Essentially, the two values should be identical!