feat(api-v2): Add an RDF processing façade (DSP-1020) #1754

Merged
merged 36 commits into from Nov 17, 2020
Changes from 30 commits
533976c
feat: Start adding RDF API and Jena implementation.
Nov 6, 2020
32ac622
Merge branch 'main' into wip/DSP-1020-rdf-api
Nov 6, 2020
fb7b8c9
feat(rdf-api): Add RDF4J implementation.
Nov 9, 2020
34a4ad6
test(rdf-api): Start adding tests.
Nov 9, 2020
ae92ab5
test(rdf-api): Add tests.
Nov 9, 2020
648ffe1
Merge branch 'main' into wip/DSP-1020-rdf-api
Nov 9, 2020
472126f
style(rdf-api): Fix filename.
Nov 9, 2020
325e218
feat(api-v2): Add façade for RDF parsing/formatting (doesn't even com…
Nov 10, 2020
52f8427
fix: Make everything compile again.
Nov 11, 2020
e39e3ce
fix(rdf-api): Add tests, fix bugs.
Nov 11, 2020
f7c8b68
feat(store): Use RDF API in HttpTriplestoreConnector, cause lots of c…
Nov 11, 2020
d77cb8c
feat: Fix a lot of compile errors (many more to go)
Nov 12, 2020
e1bf122
fix: Fix many compile errors, many more to go.
Nov 12, 2020
1b8570d
test: Make tests compile again.
Nov 13, 2020
7d00c70
fix: Fix bugs.
Nov 13, 2020
7c29912
test(FeatureToggleR2RSpec): Fix test.
Nov 13, 2020
0f02ffe
test: Fix tests.
Nov 13, 2020
79edf9a
tests: Update E2E and R2R tests.
Nov 13, 2020
2e070fb
Merge branch 'main' into wip/DSP-1020-rdf-api
Nov 13, 2020
dbd8182
test: Fix tests.
Nov 13, 2020
949f0bd
fix(rdf-api): Fix bugs.
Nov 13, 2020
8887872
docs(rdf-api): Add design doc.
Nov 13, 2020
5fa52e3
test(rdf-api): Add test for prefixes and custom datatypes.
Nov 13, 2020
6fdb70e
refactor(rdf-api): Simplify code.
Nov 14, 2020
f5a2f5c
refactor(JsonLDUtil): Simplify and clarify code.
Nov 14, 2020
3eb9b22
Merge branch 'main' into wip/DSP-1020-rdf-api
Nov 16, 2020
16a81d1
chore(build): Clean up dependencies.
Nov 16, 2020
0187e96
style(RdfFormatUtil): Clarify code.
Nov 16, 2020
a0ea4a0
refactor(rdf-api): Always use the singleton instances of the node fac…
Nov 16, 2020
c200281
style(rdf-api): Fix typo in method name.
Nov 16, 2020
e6ff4ab
style(JsonLDUtil): Clarify comments.
Nov 17, 2020
52aa1c3
style(JsonLDUtil): Clarify method names and comments.
Nov 17, 2020
624c301
test(JsonLDUtil): Add test for circular reference.
Nov 17, 2020
5cc9534
test: Optimise imports.
Nov 17, 2020
6a4af8c
style(JsonLDUtil): Add comments.
Nov 17, 2020
263efa7
style(test): Rearrange code.
Nov 17, 2020
1 change: 1 addition & 0 deletions docs/05-internals/design/principles/index.md
@@ -23,6 +23,7 @@ License along with Knora. If not, see <http://www.gnu.org/licenses/>.
- [Futures with Akka](futures-with-akka.md)
- [HTTP Module](http-module.md)
- [Store Module](store-module.md)
- [RDF Processing API](rdf-api.md)
- [Triplestore Updates](triplestore-updates.md)
- [Consistency Checking](consistency-checking.md)
- [Authentication](authentication.md)
111 changes: 111 additions & 0 deletions docs/05-internals/design/principles/rdf-api.md
@@ -0,0 +1,111 @@
<!---
Copyright © 2015-2019 the contributors (see Contributors.md).

This file is part of Knora.

Knora is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published
by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

Knora is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public
License along with Knora. If not, see <http://www.gnu.org/licenses/>.
-->

# RDF Processing API

Knora provides an API for parsing and formatting RDF data and
for working with RDF graphs. This allows Knora developers to use a single,
idiomatic Scala API as a façade for a Java RDF library.
By using a feature toggle, you can choose either
[Jena](https://jena.apache.org/tutorials/rdf_api.html)
or
[RDF4J](https://rdf4j.org/documentation/programming/)
as the underlying implementation.


## Overview

The API is in the package `org.knora.webapi.messages.util.rdf`. It includes:

- `RdfModel`, which represents a set of RDF graphs (a default graph and/or one or more named graphs).
A model can be constructed from scratch, modified, and searched.

- `RdfNode` and its subclasses, which represent RDF nodes (IRIs, blank nodes, and literals).

- `Statement`, which represents a triple or quad.

- `RdfNodeFactory`, which creates nodes and statements.

- `RdfModelFactory`, which creates empty RDF models.

- `RdfFormatUtil`, which parses and formats RDF models.

- `JsonLDUtil`, which provides specialised functionality for working
with RDF in JSON-LD format, and for converting between RDF models
and JSON-LD documents. `RdfFormatUtil` uses `JsonLDUtil` when appropriate.

To work with RDF models, start with `RdfFeatureFactory`, which returns instances
of `RdfNodeFactory`, `RdfModelFactory`, and `RdfFormatUtil` according to the
feature toggle configuration.

`JsonLDUtil` does not need a feature factory.
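
The façade-plus-factory arrangement described above can be illustrated with a minimal, self-contained sketch. Everything in it is a simplified stand-in: the trait and method signatures are assumptions made for illustration and are not Knora's actual API, `SketchRdfFeatureFactory`, `StatementMatcher` and `FacadeDemo` are hypothetical names, and the two in-memory back ends merely take the place of the Jena and RDF4J wrappers.

```scala
import scala.collection.mutable

// Simplified, hypothetical stand-ins for the façade types.
// These are NOT Knora's real signatures.
sealed trait RdfNode
final case class IriNode(iri: String) extends RdfNode
final case class LiteralNode(value: String) extends RdfNode
final case class Statement(subj: IriNode, pred: IriNode, obj: RdfNode)

// Callers depend only on these two traits, never on a concrete back end.
trait RdfModel {
  def addStatement(statement: Statement): Unit
  // A `None` argument acts as a wildcard.
  def find(subj: Option[IriNode], pred: Option[IriNode], obj: Option[RdfNode]): Set[Statement]
}

trait RdfModelFactory {
  def makeEmptyModel: RdfModel
}

object StatementMatcher {
  def matches(s: Statement, subj: Option[IriNode], pred: Option[IriNode], obj: Option[RdfNode]): Boolean =
    subj.forall(_ == s.subj) && pred.forall(_ == s.pred) && obj.forall(_ == s.obj)
}

// Back end A: stands in for a wrapper around one Java RDF library.
final class SetBackedModel extends RdfModel {
  private val statements = mutable.Set.empty[Statement]

  override def addStatement(statement: Statement): Unit = { statements += statement; () }

  override def find(subj: Option[IriNode], pred: Option[IriNode], obj: Option[RdfNode]): Set[Statement] =
    statements.filter(s => StatementMatcher.matches(s, subj, pred, obj)).toSet
}

// Back end B: stands in for a wrapper around the other library.
final class ListBackedModel extends RdfModel {
  private var statements = List.empty[Statement]

  override def addStatement(statement: Statement): Unit =
    if (!statements.contains(statement)) statements = statement :: statements

  override def find(subj: Option[IriNode], pred: Option[IriNode], obj: Option[RdfNode]): Set[Statement] =
    statements.filter(s => StatementMatcher.matches(s, subj, pred, obj)).toSet
}

// Hypothetical factory: a Boolean stands in for the feature toggle configuration.
object SketchRdfFeatureFactory {
  def getRdfModelFactory(toggleOn: Boolean): RdfModelFactory =
    new RdfModelFactory {
      override def makeEmptyModel: RdfModel =
        if (toggleOn) new SetBackedModel else new ListBackedModel
    }
}

object FacadeDemo {
  private val label = IriNode("http://www.w3.org/2000/01/rdf-schema#label")

  // Client code written once against the façade; it works with either back end.
  def countLabels(factory: RdfModelFactory): Int = {
    val model = factory.makeEmptyModel
    val thing = IriNode("http://example.org/thing")
    model.addStatement(Statement(thing, label, LiteralNode("a thing")))
    model.addStatement(Statement(thing, label, LiteralNode("une chose")))
    model.addStatement(Statement(thing, label, LiteralNode("a thing"))) // duplicate, ignored
    model.find(Some(thing), Some(label), None).size
  }
}
```

The point of the design is that `FacadeDemo.countLabels` never names a back end, so switching implementations only changes which factory is handed to it.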


## Implementations

- The Jena-based implementation, in package `org.knora.webapi.messages.util.rdf.jenaimpl`.

- The RDF4J-based implementation, in package `org.knora.webapi.messages.util.rdf.rdf4jimpl`.


## Feature toggle

For an overview of feature toggles, see [Feature Toggles](feature-toggles.md).

The RDF API uses the feature toggle `jena-rdf-library`:

- `on`: use the Jena implementation.

- `off` (the default): use the RDF4J implementation.


The default setting is used on startup, e.g. to read ontologies from the
repository. After startup, the per-request setting is used.
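
As a rough illustration of how a startup default and a per-request setting can be layered, here is a minimal sketch. The class names `SettingsToggleConfig` and `RequestToggleConfig` are hypothetical simplifications, not the real `FeatureFactoryConfig` classes, and the sketch assumes an unknown toggle simply counts as off, which the real code may handle differently.

```scala
// Hypothetical, simplified stand-ins for the feature toggle configuration classes.
trait ToggleConfig {
  def isOn(toggleName: String): Boolean
}

// Startup configuration: only the defaults from the application settings.
// Assumption: a toggle that is not configured counts as off.
final class SettingsToggleConfig(defaults: Map[String, Boolean]) extends ToggleConfig {
  override def isOn(toggleName: String): Boolean = defaults.getOrElse(toggleName, false)
}

// Per-request configuration: request overrides win, and anything that is
// not overridden falls back to the parent configuration.
final class RequestToggleConfig(overrides: Map[String, Boolean], parent: ToggleConfig) extends ToggleConfig {
  override def isOn(toggleName: String): Boolean =
    overrides.getOrElse(toggleName, parent.isOn(toggleName))
}

object ToggleDemo {
  // `jena-rdf-library` is off by default, as in the documentation above.
  val startup: ToggleConfig = new SettingsToggleConfig(Map("jena-rdf-library" -> false))

  def libraryFor(config: ToggleConfig): String =
    if (config.isOn("jena-rdf-library")) "Jena" else "RDF4J"
}
```

With this layering, code that runs at startup is given `ToggleDemo.startup`, while request-handling code is given a `RequestToggleConfig` whose parent is the startup configuration.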


## What still uses RDF4J directly

Before this API was added, Knora mainly used the RDF4J API directly, and still does
in some places:

- Code that uses RDF4J's streaming API to process large amounts of data, especially to
avoid constructing a large string in TriG format:

- `ProjectsResponderADM.projectDataGetRequestADM`

- `HttpTriplestoreConnector.turtleToTrig`

- `RepositoryUpdater`

- The repository update plugin tests, which use SPARQL.

- `TEIHeader`: uses XSLT that depends on the exact format of the RDF/XML generated by RDF4J.
The XSLT would need to be improved to handle `rdf:Description`.

- `GravsearchParser`: uses RDF4J's SPARQL parser. This is probably
not worth changing.


## TODO

- SHACL validation.

- SPARQL querying.

- A streaming parsing/formatting API for processing large graphs.
6 changes: 1 addition & 5 deletions webapi/BUILD.bazel
@@ -2,7 +2,6 @@ package(default_visibility = ["//visibility:public"])

load("@io_bazel_rules_scala//scala:scala.bzl", "scala_binary", "scala_library", "scala_repl", "scala_test")


# alias added for convenience. To call, use: bazel run //webapi:GenerateContributorsFile
alias(
name = "GenerateContributorsFile",
@@ -156,6 +155,7 @@ scala_library(
"//webapi/src/main/scala/org/knora/webapi/http/handler",
"//webapi/src/main/scala/org/knora/webapi/instrumentation",
"//webapi/src/main/scala/org/knora/webapi/messages",
"//webapi/src/main/scala/org/knora/webapi/feature",
"//webapi/src/main/scala/org/knora/webapi/responders",
"//webapi/src/main/scala/org/knora/webapi/routing",
"//webapi/src/main/scala/org/knora/webapi/settings",
@@ -167,17 +167,14 @@ scala_library(
# Test Libs
"@maven//:com_typesafe_akka_akka_testkit_2_12",
"@maven//:com_typesafe_akka_akka_http_testkit_2_12",
"@maven//:com_jsuereth_scala_arm_2_12",
"@maven//:com_typesafe_akka_akka_actor_2_12",
"@maven//:com_typesafe_akka_akka_http_2_12",
"@maven//:com_typesafe_akka_akka_http_core_2_12",
"@maven//:com_typesafe_akka_akka_http_spray_json_2_12",
"@maven//:com_typesafe_akka_akka_stream_2_12",
"@maven//:com_typesafe_config",
"@maven//:io_spray_spray_json_2_12",
"@maven//:org_eclipse_rdf4j_rdf4j_client",
"@maven//:org_scalactic_scalactic_2_12",
"@maven//:org_scalatest_scalatest_2_12",
"@maven//:org_scalatest_scalatest_core_2_12",
"@maven//:org_scalatest_scalatest_wordspec_2_12",
"@maven//:org_scalatest_scalatest_matchers_core_2_12",
@@ -236,7 +233,6 @@ scala_library(
"@maven//:org_scala_lang_scala_library",
"@maven//:org_scala_lang_scala_reflect",
"@maven//:org_scalactic_scalactic_2_12",
"@maven//:org_scalatest_scalatest_2_12",
"@maven//:org_scalatest_scalatest_core_2_12",
"@maven//:org_scalatest_scalatest_wordspec_2_12",
"@maven//:org_scalatest_scalatest_matchers_core_2_12",
4 changes: 2 additions & 2 deletions webapi/src/it/scala/org/knora/webapi/ITKnoraLiveSpec.scala
@@ -36,8 +36,8 @@ import org.knora.webapi.exceptions.AssertionException
import org.knora.webapi.messages.StringFormatter
import org.knora.webapi.messages.app.appmessages.{AppStart, AppStop, SetAllowReloadOverHTTPState}
import org.knora.webapi.messages.store.triplestoremessages.{RdfDataObject, TriplestoreJsonProtocol}
import org.knora.webapi.messages.util.{JsonLDDocument, JsonLDUtil}
import org.knora.webapi.settings.{KnoraDispatchers, KnoraSettings, KnoraSettingsImpl, _}
import org.knora.webapi.messages.util.rdf.{JsonLDDocument, JsonLDUtil}
import org.knora.webapi.settings._
import org.knora.webapi.util.StartupUtils
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpecLike
@@ -31,6 +31,7 @@ import org.knora.webapi.exceptions.AssertionException
import org.knora.webapi.messages.IriConversions._
import org.knora.webapi.messages.store.triplestoremessages.TriplestoreJsonProtocol
import org.knora.webapi.messages.util._
import org.knora.webapi.messages.util.rdf.{JsonLDArray, JsonLDKeywords, JsonLDDocument, JsonLDObject, JsonLDValue}
import org.knora.webapi.messages.v2.routing.authenticationmessages._
import org.knora.webapi.messages.{OntologyConstants, SmartIri, StringFormatter}
import org.knora.webapi.sharedtestdata.SharedTestDataADM
@@ -157,11 +158,11 @@ class KnoraSipiIntegrationV2ITSpec extends ITKnoraLiveSpec(KnoraSipiIntegrationV
private def getValueFromResource(resource: JsonLDDocument,
propertyIriInResult: SmartIri,
expectedValueIri: IRI): JsonLDObject = {
val resourceIri: IRI = resource.requireStringWithValidation(JsonLDConstants.ID, stringFormatter.validateAndEscapeIri)
val resourceIri: IRI = resource.requireStringWithValidation(JsonLDKeywords.ID, stringFormatter.validateAndEscapeIri)
val propertyValues: JsonLDArray = getValuesFromResource(resource = resource, propertyIriInResult = propertyIriInResult)

val matchingValues: Seq[JsonLDObject] = propertyValues.value.collect {
case jsonLDObject: JsonLDObject if jsonLDObject.requireStringWithValidation(JsonLDConstants.ID, stringFormatter.validateAndEscapeIri) == expectedValueIri => jsonLDObject
case jsonLDObject: JsonLDObject if jsonLDObject.requireStringWithValidation(JsonLDKeywords.ID, stringFormatter.validateAndEscapeIri) == expectedValueIri => jsonLDObject
}

if (matchingValues.isEmpty) {
@@ -620,7 +621,7 @@ class KnoraSipiIntegrationV2ITSpec extends ITKnoraLiveSpec(KnoraSipiIntegrationV

val request = Post(s"$baseApiUrl/v2/resources", HttpEntity(RdfMediaTypes.`application/ld+json`, jsonLdEntity)) ~> addCredentials(BasicHttpCredentials(anythingUserEmail, password))
val responseJsonDoc: JsonLDDocument = getResponseJsonLD(request)
val resourceIri: IRI = responseJsonDoc.body.requireStringWithValidation(JsonLDConstants.ID, stringFormatter.validateAndEscapeIri)
val resourceIri: IRI = responseJsonDoc.body.requireStringWithValidation(JsonLDKeywords.ID, stringFormatter.validateAndEscapeIri)
csvResourceIri.set(responseJsonDoc.body.requireIDAsKnoraDataIri.toString)

// Get the resource from Knora.
@@ -768,7 +769,7 @@ class KnoraSipiIntegrationV2ITSpec extends ITKnoraLiveSpec(KnoraSipiIntegrationV

val request = Post(s"$baseApiUrl/v2/resources", HttpEntity(RdfMediaTypes.`application/ld+json`, jsonLdEntity)) ~> addCredentials(BasicHttpCredentials(anythingUserEmail, password))
val responseJsonDoc: JsonLDDocument = getResponseJsonLD(request)
val resourceIri: IRI = responseJsonDoc.body.requireStringWithValidation(JsonLDConstants.ID, stringFormatter.validateAndEscapeIri)
val resourceIri: IRI = responseJsonDoc.body.requireStringWithValidation(JsonLDKeywords.ID, stringFormatter.validateAndEscapeIri)
xmlResourceIri.set(responseJsonDoc.body.requireIDAsKnoraDataIri.toString)

// Get the resource from Knora.
13 changes: 13 additions & 0 deletions webapi/src/main/resources/application.conf
@@ -279,6 +279,19 @@ app {
"Benjamin Geer <benjamin.geer@dasch.swiss>"
]
}

jena-rdf-library {
description = "Use the Jena API for RDF processing. If turned off, use the RDF4J API."

available-versions = [ 1 ]
default-version = 1
enabled-by-default = no
override-allowed = yes

developer-emails = [
"Benjamin Geer <benjamin.geer@dasch.swiss>"
]
}
}

print-extended-config = false // If true, an extended list of configuration parameters will be printed out at startup.
8 changes: 8 additions & 0 deletions webapi/src/main/scala/org/knora/webapi/RdfMediaTypes.scala
@@ -48,13 +48,21 @@ object RdfMediaTypes {
fileExtensions = List("rdf")
)

val `application/trig`: MediaType.WithFixedCharset = MediaType.customWithFixedCharset(
mainType = "application",
subType = "trig",
charset = HttpCharsets.`UTF-8`,
fileExtensions = List("trig")
)

/**
* A map of MIME types (strings) to supported RDF media types.
*/
val registry: Map[String, MediaType.NonBinary] = Set(
`application/json`,
`application/ld+json`,
`text/turtle`,
`application/trig`,
`application/rdf+xml`
).map {
mediaType => mediaType.toString -> mediaType
@@ -32,6 +32,7 @@ import com.typesafe.scalalogging.LazyLogging
import kamon.Kamon
import org.knora.webapi.core.LiveActorMaker
import org.knora.webapi.exceptions.{InconsistentTriplestoreDataException, SipiException, UnexpectedMessageException, UnsupportedValueException}
import org.knora.webapi.feature.{FeatureFactoryConfig, KnoraSettingsFeatureFactoryConfig}
import org.knora.webapi.http.handler
import org.knora.webapi.http.version.ServerVersion
import org.knora.webapi.messages.admin.responder.KnoraRequestADM
@@ -105,6 +106,11 @@ class ApplicationActor extends Actor with Stash with LazyLogging with AroundDire
*/
implicit val knoraSettings: KnoraSettingsImpl = KnoraSettings(system)

/**
* The default feature factory configuration, which is used during startup.
*/
val defaultFeatureFactoryConfig: FeatureFactoryConfig = new KnoraSettingsFeatureFactoryConfig(knoraSettings)

/**
* Provides the actor materializer (akka-http)
*/
@@ -353,7 +359,10 @@ class ApplicationActor extends Actor with Stash with LazyLogging with AroundDire

/* load ontologies request */
case LoadOntologies() =>
responderManager ! LoadOntologiesRequestV2(KnoraSystemInstances.Users.SystemUser)
responderManager ! LoadOntologiesRequestV2(
featureFactoryConfig = defaultFeatureFactoryConfig,
requestingUser = KnoraSystemInstances.Users.SystemUser
)

/* load ontologies response */
case SuccessResponseV2(_) =>
3 changes: 3 additions & 0 deletions webapi/src/main/scala/org/knora/webapi/app/BUILD.bazel
@@ -10,6 +10,7 @@ scala_library(
"//webapi/src/main/scala/org/knora/webapi",
"//webapi/src/main/scala/org/knora/webapi/core",
"//webapi/src/main/scala/org/knora/webapi/exceptions",
"//webapi/src/main/scala/org/knora/webapi/feature",
"//webapi/src/main/scala/org/knora/webapi/http/handler",
"//webapi/src/main/scala/org/knora/webapi/http/version",
"//webapi/src/main/scala/org/knora/webapi/instrumentation",
@@ -54,6 +55,8 @@ scala_binary(
"@maven//:ch_qos_logback_logback_core",
"@maven//:com_typesafe_akka_akka_slf4j_2_12",
"@maven//:org_slf4j_log4j_over_slf4j",
"@maven//:org_glassfish_jakarta_json",
"@maven//:org_scala_lang_modules_scala_java8_compat_2_12",
],
)

@@ -467,6 +467,14 @@ case class TestConfigurationException(message: String) extends ApplicationConfig
*/
case class FeatureToggleException(message: String, cause: Option[Throwable] = None) extends ApplicationConfigurationException(message)

/**
* Indicates that RDF processing failed.
*
* @param message a description of the error.
* @param cause the original exception representing the cause of the error, if any.
*/
case class RdfProcessingException(message: String, cause: Option[Throwable] = None) extends InternalServerException(message)

/**
* Helper functions for error handling.
*/
@@ -482,7 +490,7 @@ object ExceptionUtil {
SerializationUtils.serialize(e)
true
} catch {
case serEx: SerializationException => false
case _: SerializationException => false
}
}

3 changes: 0 additions & 3 deletions webapi/src/main/scala/org/knora/webapi/feature/BUILD.bazel
@@ -9,12 +9,9 @@ scala_library(
deps = [
"//webapi/src/main/scala/org/knora/webapi",
"//webapi/src/main/scala/org/knora/webapi/exceptions",
"//webapi/src/main/scala/org/knora/webapi/messages",
"//webapi/src/main/scala/org/knora/webapi/settings",
"@maven//:com_typesafe_akka_akka_actor_2_12",
"@maven//:com_typesafe_akka_akka_http_2_12",
"@maven//:com_typesafe_akka_akka_http_core_2_12",
"@maven//:com_typesafe_scala_logging_scala_logging_2_12",
"@maven//:org_apache_jena_apache_jena_libs",
],
)
@@ -23,12 +23,12 @@ import akka.http.scaladsl.model.{HttpHeader, HttpResponse}
import akka.http.scaladsl.model.headers.RawHeader
import akka.http.scaladsl.server.RequestContext
import org.knora.webapi.exceptions.{BadRequestException, FeatureToggleException}
import org.knora.webapi.messages.StringFormatter
import org.knora.webapi.settings.KnoraSettings.FeatureToggleBaseConfig
import org.knora.webapi.settings.KnoraSettingsImpl

import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}
import scala.util.control.Exception._

/**
* A tagging trait for module-specific factories that produce implementations of features.
@@ -362,8 +362,7 @@ object RequestContextFeatureFactoryConfig {
* @param parent the parent [[FeatureFactoryConfig]].
*/
class RequestContextFeatureFactoryConfig(requestContext: RequestContext,
parent: FeatureFactoryConfig)(implicit stringFormatter: StringFormatter) extends OverridingFeatureFactoryConfig(parent) {

parent: FeatureFactoryConfig) extends OverridingFeatureFactoryConfig(parent) {
import FeatureToggle._
import RequestContextFeatureFactoryConfig._

@@ -392,7 +391,7 @@ class RequestContextFeatureFactoryConfig(requestContext: RequestContext,
}

val maybeVersion: Option[Int] = featureNameAndVersion.drop(1).headOption.map {
versionStr => stringFormatter.validateInt(versionStr, throw BadRequestException(s"Invalid version number '$versionStr' in feature toggle $featureName"))
versionStr: String => allCatch.opt(versionStr.toInt).getOrElse(throw BadRequestException(s"Invalid version number '$versionStr' in feature toggle $featureName"))
}

featureName -> FeatureToggle(
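
The hunk above replaces `stringFormatter.validateInt` with `allCatch.opt(versionStr.toInt)`, which removes the dependency on `StringFormatter`. The following stand-alone sketch shows the standard-library idiom in isolation. The object and helper names are hypothetical, and the second helper is only an alternative, not something this pull request uses.

```scala
import scala.util.control.Exception.{allCatch, catching}

object ToggleVersionParsing {
  // Same idiom as in the hunk above: any exception thrown by `toInt`
  // turns into `None` instead of propagating.
  def parseVersion(versionStr: String): Option[Int] =
    allCatch.opt(versionStr.toInt)

  // A narrower alternative (not used in this pull request) that only
  // swallows the exception `toInt` throws on malformed input.
  def parseVersionStrict(versionStr: String): Option[Int] =
    catching(classOf[NumberFormatException]).opt(versionStr.toInt)
}
```

A caller can then map `None` to an error of its choice, as the changed line does with `getOrElse(throw BadRequestException(...))`.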
@@ -26,7 +26,7 @@ import com.typesafe.scalalogging.LazyLogging
import org.knora.webapi.exceptions.{InternalServerException, RequestRejectedException}
import org.knora.webapi.http.status.{ApiStatusCodesV1, ApiStatusCodesV2}
import org.knora.webapi.messages.OntologyConstants
import org.knora.webapi.messages.util.{JsonLDDocument, JsonLDObject, JsonLDString}
import org.knora.webapi.messages.util.rdf.{JsonLDDocument, JsonLDObject, JsonLDString}
import org.knora.webapi.settings.KnoraSettingsImpl
import spray.json.{JsNumber, JsObject, JsString, JsValue}
