feat(gravsearch): Optimise Gravsearch queries using topological sort (DSP-1327) #1813

Merged
merged 42 commits into from Mar 2, 2021
Changes from 36 commits
Commits
7f4aa5a
feat(gravsearch): Start implementation of topological sort.
Feb 4, 2021
a431f02
Merge branch 'main' into wip/DSP-1327-gravsearch
Feb 5, 2021
84938b9
feat(gravsearch) create a graph from statement patterns
SepidehAlassi Feb 8, 2021
b9da0ac
Merge branch 'main' into wip/DSP-1327-gravsearch
SepidehAlassi Feb 8, 2021
c01c796
fix (grvsearch): immutable graph
SepidehAlassi Feb 8, 2021
2d2afd7
fix (gravsearch) is not cyclic, sort the graph
SepidehAlassi Feb 8, 2021
41c8be2
fix (gravsearch): use input query for sorting
SepidehAlassi Feb 8, 2021
e514050
feat(gravsearch) use directed hyper edge
SepidehAlassi Feb 8, 2021
2778e57
feat(gravearch) change to DiHyperedge
SepidehAlassi Feb 9, 2021
8aaccdb
Merge branch 'main' into wip/DSP-1327-gravsearch
SepidehAlassi Feb 9, 2021
41147b5
feat(gravsearch): sort statements
SepidehAlassi Feb 9, 2021
106f335
fix (gravsearch): correctly preserve the order of statements as indic…
SepidehAlassi Feb 9, 2021
6d7168a
test (gravsearch) test the recursive function for sorting statements …
SepidehAlassi Feb 10, 2021
68a3bbd
fix the failing test
SepidehAlassi Feb 10, 2021
ce490e1
feat(gravsearch): break cycles in graph
SepidehAlassi Feb 10, 2021
29aba9d
refactor (gravsearch) clean up
SepidehAlassi Feb 11, 2021
d0290a0
move the topological sort to prequery generator
SepidehAlassi Feb 15, 2021
4eca3b5
style(gravsearch): Clean up a few things.
Feb 15, 2021
cfbc750
test(gravsearch): Improve test.
Feb 15, 2021
cde90b6
feat(gravsearch): Add utility for finding all topological orders of a…
Feb 16, 2021
f092bf3
feat(gravsearch): Fix topological sort bugs, add tests.
Feb 16, 2021
1a07c99
Merge branch 'main' into wip/DSP-1327-gravsearch
Feb 17, 2021
431bd75
test(gravsearch): Fix test.
Feb 17, 2021
bb5e48b
feat(gravsearch): Prefer topological orders that don't put rdf:type s…
Feb 17, 2021
9497151
Merge branch 'main' into wip/DSP-1327-gravsearch
Feb 17, 2021
f33a90d
fix(gravsearch): Correctly handle standoff classes in optimisations.
Feb 18, 2021
5e96dba
Merge branch 'main' into wip/DSP-1327-gravsearch
Feb 18, 2021
1339b0c
test(gravsearch): Update toplogical reordering tests.
Feb 18, 2021
47e74d4
test(gravsearch): Fix test.
Feb 19, 2021
2bf4b0a
feat(gravsearch): Add feature toggle for topological sort optimisation.
Feb 19, 2021
018dd47
test(gravsearch): Clean up test.
Feb 19, 2021
4eb807e
style(gravsearch): Use import wildcard.
Feb 19, 2021
35587e5
style(test): Add copyright.
Feb 23, 2021
420f873
style(gravsearch): Add comment.
Feb 23, 2021
2f84b69
Merge branch 'main' into wip/DSP-1327-gravsearch
Feb 25, 2021
14f0648
Merge branch 'main' into wip/DSP-1327-gravsearch
Feb 25, 2021
75e9023
feat (gravsearch) get all permutations of topological order according…
SepidehAlassi Mar 1, 2021
94f577b
Merge branch 'wip/DSP-1327-gravsearch' of https://github.com/dasch-sw…
SepidehAlassi Mar 1, 2021
f3d8de6
fix(gravsearch) fix the failing test
SepidehAlassi Mar 1, 2021
885bb33
style(TopologicalSortUtil): Improve style a little bit.
Mar 1, 2021
f9fb4aa
doc (gravsearch) documentation about optimization of gravsearches wit…
SepidehAlassi Mar 1, 2021
b2272c6
style(docs): Improve style a bit.
Mar 2, 2021
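
The commits above describe building a graph from statement patterns, sorting it topologically, and breaking cycles. The following is a minimal sketch of that idea under stated assumptions, not the PR's implementation: it uses Kahn's algorithm in plain Scala instead of the scala-graph library this PR adds, and the `Statement` case class, the subject-to-object edge direction, and the ordering of statements by reversed topological rank are all hypothetical simplifications. On a cycle it simply keeps the original order, whereas the PR breaks cycles.

```scala
object TopologicalSortSketch {
  // Hypothetical, simplified stand-in for a Gravsearch statement pattern.
  final case class Statement(subj: String, pred: String, obj: String)

  /** Kahn's algorithm. Edges run from the first element of each pair to the second.
    * Returns None if the graph contains a cycle.
    */
  def topologicalOrder(edges: Seq[(String, String)]): Option[Vector[String]] = {
    val nodes: Vector[String] = edges.flatMap { case (from, to) => Seq(from, to) }.distinct.toVector

    val successors: Map[String, Seq[String]] =
      edges.groupBy(_._1).map { case (from, es) => from -> es.map(_._2) }

    val initialInDegree: Map[String, Int] =
      edges.foldLeft(nodes.map(_ -> 0).toMap) { case (acc, (_, to)) => acc.updated(to, acc(to) + 1) }

    @scala.annotation.tailrec
    def loop(ready: Vector[String], inDegree: Map[String, Int], sorted: Vector[String]): Vector[String] =
      ready match {
        case node +: rest =>
          // Remove the node's outgoing edges; successors whose in-degree drops to 0 become ready.
          val (newInDegree, newlyReady) =
            successors.getOrElse(node, Seq.empty).foldLeft((inDegree, Vector.empty[String])) {
              case ((degrees, readyAcc), succ) =>
                val d = degrees(succ) - 1
                (degrees.updated(succ, d), if (d == 0) readyAcc :+ succ else readyAcc)
            }
          loop(rest ++ newlyReady, newInDegree, sorted :+ node)

        case _ => sorted
      }

    val sorted = loop(nodes.filter(initialInDegree(_) == 0), initialInDegree, Vector.empty)

    // If some nodes were never emitted, they lie on a cycle.
    if (sorted.size == nodes.size) Some(sorted) else None
  }

  /** Reorders statements so that statements about nodes nothing else depends on come first.
    * Assumption: each statement contributes one edge from its subject to its object.
    */
  def sortStatements(statements: Seq[Statement]): Seq[Statement] =
    topologicalOrder(statements.map(s => s.subj -> s.obj)) match {
      case Some(order) =>
        val rank: Map[String, Int] = order.reverse.zipWithIndex.toMap
        statements.sortBy(s => rank(s.subj)) // sortBy is stable, so ties keep their input order

      case None => statements // cyclic graph: leave the order unchanged in this sketch
    }
}
```

For example, given the patterns `?letter linkTo ?person`, `?person hasName ?name`, and `?letter hasDate ?date` (made-up names), `sortStatements` moves the `?person` pattern ahead of the two `?letter` patterns and keeps those two in their original relative order.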
4 changes: 4 additions & 0 deletions third_party/dependencies.bzl
@@ -137,6 +137,9 @@ def dependencies():
# Additional Selenium libraries besides the ones pulled in during init
# of io_bazel_rules_webtesting
"org.seleniumhq.selenium:selenium-support:3.141.59",

+# Graph for Scala
+"org.scala-graph:graph-core_2.12:1.13.1",
],
repositories = [
"https://repo.maven.apache.org/maven2",
@@ -187,6 +190,7 @@ BASE_TEST_DEPENDENCIES = [
"@maven//:org_scalatest_scalatest_shouldmatchers_2_12",
"@maven//:org_scalatest_scalatest_compatible",
"@maven//:org_scalactic_scalactic_2_12",
+"@maven//:org_scala_graph_graph_core_2_12",
]

BASE_TEST_DEPENDENCIES_WITH_JSON = BASE_TEST_DEPENDENCIES + [
15 changes: 15 additions & 0 deletions webapi/src/main/resources/application.conf
@@ -292,6 +292,21 @@ app {
"Benjamin Geer <benjamin.geer@dasch.swiss>"
]
}

+gravsearch-dependency-optimisation {
+description = "Optimise Gravsearch queries by reordering query patterns according to their dependencies."
+
+available-versions = [ 1 ]
+default-version = 1
+enabled-by-default = yes
+override-allowed = yes
+expiration-date = "2021-12-01T00:00:00Z"
+
+developer-emails = [
+"Sepideh Alassi <sepideh.alassi@dasch.swiss>"
+"Benjamin Geer <benjamin.geer@dasch.swiss>"
+]
+}
}

shacl {
@@ -44,5 +44,6 @@ scala_library(
"@maven//:org_scala_lang_scala_reflect",
"@maven//:org_slf4j_slf4j_api",
"@maven//:org_springframework_security_spring_security_core",
+"@maven//:org_scala_graph_graph_core_2_12",
],
)
@@ -55,10 +55,6 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,
// suffix appended to variables that are returned by a SPARQL aggregation function.
protected val groupConcatVariableSuffix = "__Concat"

-// A set of types that can be treated as dates by the knora-api:toSimpleDate function.
-private val dateTypes: Set[IRI] =
-Set(OntologyConstants.KnoraApiV2Complex.DateValue, OntologyConstants.KnoraApiV2Complex.StandoffTag)

/**
* A container for a generated variable representing a value literal.
*
@@ -352,14 +348,14 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,

}

-val (maybeSubjectTypeIri: Option[SmartIri], subjectIsResource: Boolean) =
+val maybeSubjectType: Option[NonPropertyTypeInfo] =
typeInspectionResult.getTypeOfEntity(statementPattern.subj) match {
-case Some(NonPropertyTypeInfo(subjectTypeIri, isResourceType, _)) => (Some(subjectTypeIri), isResourceType)
-case _ => (None, false)
+case Some(nonPropertyTypeInfo: NonPropertyTypeInfo) => Some(nonPropertyTypeInfo)
+case _ => None
}

// Is the subject of the statement a resource?
-if (subjectIsResource) {
+if (maybeSubjectType.exists(_.isResourceType)) {
// Yes. Is the object of the statement also a resource?
if (propertyTypeInfo.objectIsResourceType) {
// Yes. This is a link property. Make sure that the object is either an IRI or a variable (cannot be a literal).
@@ -490,7 +486,7 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,
if (querySchema == ApiV2Complex) {
// Yes. If the subject is a standoff tag and the object is a resource, that's an error, because the client
// has to use the knora-api:standoffLink function instead.
-if (maybeSubjectTypeIri.contains(OntologyConstants.KnoraApiV2Complex.StandoffTag.toSmartIri) && propertyTypeInfo.objectIsResourceType) {
+if (maybeSubjectType.exists(_.isStandoffTagType) && propertyTypeInfo.objectIsResourceType) {
throw GravsearchException(
s"Invalid statement pattern (use the knora-api:standoffLink function instead): ${statementPattern.toSparql.trim}")
} else {
@@ -769,7 +765,7 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,
case xsdLiteral: XsdLiteral if xsdLiteral.datatype.toString == OntologyConstants.KnoraApiV2Simple.ListNode =>
xsdLiteral.value

-case other =>
+case _ =>
throw GravsearchException(s"Invalid type for literal ${OntologyConstants.KnoraApiV2Simple.ListNode}")
}

@@ -1259,7 +1255,7 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,
val langLiteral: XsdLiteral = compareExpression.rightArg match {
case strLiteral: XsdLiteral if strLiteral.datatype == OntologyConstants.Xsd.String.toSmartIri => strLiteral

-case other =>
+case _ =>
throw GravsearchException(
s"Right argument of comparison statement must be a string literal for use with 'lang' function call")
}
@@ -1821,9 +1817,8 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,
val standoffTagVar = functionCallExpression.getArgAsQueryVar(pos = 1)

typeInspectionResult.getTypeOfEntity(standoffTagVar) match {
-case Some(nonPropertyTypeInfo: NonPropertyTypeInfo)
-if nonPropertyTypeInfo.typeIri.toString == OntologyConstants.KnoraApiV2Complex.StandoffTag =>
-()
+case Some(nonPropertyTypeInfo: NonPropertyTypeInfo) if nonPropertyTypeInfo.isStandoffTagType => ()

case _ =>
throw GravsearchException(
s"The second argument of ${functionIri.toSparql} must represent a knora-api:StandoffTag")
@@ -1930,7 +1925,7 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,

typeInspectionResult.getTypeOfEntity(dateBaseVar) match {
case Some(nonPropInfo: NonPropertyTypeInfo) =>
-if (!dateTypes.contains(nonPropInfo.typeIri.toString)) {
+if (!(nonPropInfo.isStandoffTagType || nonPropInfo.typeIri.toString == OntologyConstants.KnoraApiV2Complex.DateValue)) {
throw GravsearchException(
s"${dateBaseVar.toSparql} must represent a knora-api:DateValue or a knora-api:StandoffDateTag")
}
@@ -2042,53 +2037,4 @@ abstract class AbstractPrequeryGenerator(constructClause: ConstructClause,
}

}

-/**
-* Optimises a query by removing `rdf:type` statements that are known to be redundant. A redundant
-* `rdf:type` statement gives the type of a variable whose type is already restricted by its
-* use with a property that can only be used with that type (unless the property
-* statement is in an `OPTIONAL` block).
-*
-* @param patterns the query patterns.
-* @return the optimised query patterns.
-*/
-protected def removeEntitiesInferredFromProperty(patterns: Seq[QueryPattern]): Seq[QueryPattern] = {
-
-// Collect all entities which are used as subject or object of an OptionalPattern.
-val optionalEntities = patterns
-.filter {
-case OptionalPattern(_) => true
-case _ => false
-}
-.flatMap {
-case optionalPattern: OptionalPattern =>
-optionalPattern.patterns.flatMap {
-case pattern: StatementPattern =>
-GravsearchTypeInspectionUtil.maybeTypeableEntity(pattern.subj) ++ GravsearchTypeInspectionUtil
-.maybeTypeableEntity(pattern.obj)
-case _ => None
-}
-case _ => None
-}
-
-// remove statements whose predicate is rdf:type, type of subject is inferred from a property, and the subject is not in optionalEntities.
-val optimisedPatterns = patterns.filter {
-case statementPattern: StatementPattern =>
-statementPattern.pred match {
-case iriRef: IriRef =>
-val subject = GravsearchTypeInspectionUtil.maybeTypeableEntity(statementPattern.subj)
-subject match {
-case Some(typeableEntity) =>
-!(iriRef.iri.toString == OntologyConstants.Rdf.Type && typeInspectionResult.entitiesInferredFromProperties.keySet
-.contains(typeableEntity)
-&& !optionalEntities.contains(typeableEntity))
-case _ => true
-}
-
-case _ => true
-}
-case _ => true
-}
-optimisedPatterns
-}
}