Current Behavior
We're trying to get a description of an ontology and all terms within with this query:
The result is attached: query-result.txt (rename to .nt):
wc -l query-result.nt
15087 total lines
sort query-result.nt|uniq|wc -l
2280 unique lines (7x duplication on average)
sort query-result.nt|uniq -c|grep -v " 1 "|sort -r -n|less
to view the top repeated lines
sort query-result.nt|uniq -c|grep " 114 "|grep -v isDefinedBy
to see that the most repeated lines (114x) are all rdfs:isDefinedBy
sort query-result.nt|uniq -c|grep " 114 "|sort|uniq|wc -l
113: to see that there are 113 terms, and so the duplication is caused by some Cartesian product that produces O(N^2) results
sort query-result.nt|uniq -c|grep " 2 "|less
to see that the other duplicated lines are of the form <https://swapi.co/vocabulary/Clawdite> a <https://swapi.co/vocabulary/Species>
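The separate grep steps above can also be collapsed into one histogram of repetition counts. This is only a sketch: the function name dup_histogram is made up for illustration, and it assumes the result file has one triple per line (as N-Triples does).

```shell
# dup_histogram FILE
# Prints one row per repetition count, in the form:
#   <number of distinct lines> <times each of them occurs>
# (dup_histogram is a hypothetical helper name, not part of any tool.)
dup_histogram() {
  sort "$1" |            # group identical lines together
    uniq -c |            # prefix each distinct line with its count
    awk '{ print $1 }' | # keep only the count
    sort -n |            # order the counts numerically
    uniq -c |            # count how many distinct lines share each count
    awk '{ print $1, $2 }'
}

# Run it on the attached file if it is present in the current directory.
if [ -f query-result.nt ]; then
  dup_histogram query-result.nt
fi
```

A row whose second column is 114 would correspond to the most repeated lines, and a row whose second column is 2 to the lines that are duplicated only once.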
Note: The ontology is a bit unusual since it has a "metaclass" Character and then per-species classes that describe typical characteristics of that species, see below.
But I think that's beside the point since the major duplication comes from DESCRIBE swapi: ?term
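As a rough cross-check that the 114x lines account for most of the duplication, the counts already reported (15087 total lines, 113 lines repeated 114 times each) can be multiplied out. The variable names are illustrative only, and the sketch assumes the 113 lines matched by the grep are exactly the rdfs:isDefinedBy triples.

```shell
total=15087    # total lines, from wc -l
terms=113      # distinct lines that are repeated 114 times
repeats=114    # repetition count of those lines

dup_lines=$((terms * repeats))        # lines taken up by the 114x triples
share=$((dup_lines * 100 / total))    # integer percentage of the whole file

echo "$dup_lines"   # 12882
echo "$share%"      # 85%
```

So roughly 85% of the output consists of the 113 isDefinedBy triples repeated 114 times each.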
Expected Behavior
No duplicates, or some triples duplicated only twice (e.g. each isDefinedBy from the point of view of subject and object)
Steps To Reproduce
Version
rdf4j 4.3.8 (GraphDB 10.4.2)
Are you interested in contributing a solution yourself?
None
Anything else?
No response