Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: add upgrade plugin that fixes invalid date serialisations #2081

Merged
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion .gitignore
Expand Up @@ -36,7 +36,7 @@ sipi/test
*.bak
*.rdb
.sbtrc
*.trig
/*.trig

dependencies.txt
/client-test-data.zip
Expand Down
6 changes: 6 additions & 0 deletions Makefile
Expand Up @@ -278,6 +278,12 @@ init-db-test-from-ls-test-server: db_ls_test_server_dump.trig init-db-test-empty
@curl -X POST -H "Content-Type: application/sparql-update" -d "DROP ALL" -u "admin:test" "http://localhost:3030/knora-test"
@curl -X POST -H "Content-Type: application/trig" --data-binary "@${CURRENT_DIR}/db_ls_test_server_dump.trig" -u "admin:test" "http://localhost:3030/knora-test"

.PHONY: init-db-test-from-ls-test-server-trig-file
init-db-test-from-ls-test-server-trig-file: init-db-test-empty ## init local database with data from a local ls-test-server dump
@echo $@
@curl -X POST -H "Content-Type: application/sparql-update" -d "DROP ALL" -u "admin:test" "http://localhost:3030/knora-test"
@curl -X POST -H "Content-Type: application/trig" --data-binary "@${CURRENT_DIR}/db_ls_test_server_dump.trig" -u "admin:test" "http://localhost:3030/knora-test"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we could have above init-db-test-from-ls-test-server make command split into 2 steps anyway

  • dump the data first
  • load it from that file

So basically command introduced by you stays and the other one could be reduced. Reason for that is the huge amount of data and the issue with that file if the download dies before it ends.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my initial thought. (Additionally, once downloaded, you don't need the password anymore, which is nice.)

But I didn't want to mess with the other make target, so I thought I'd add a separate one. Do you think it makes a difference?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can leave it like that for now, I might fo it later on. I had some problems with dumping the data and then you need to remember to remove the file otherwise another error appears. This could be overall improved but not the scope of this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you quickly add what Marcin suggested? It follows the pattern we are using in other cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

#################################
## Other
#################################
Expand Down
2 changes: 1 addition & 1 deletion knora-ontologies/knora-base.ttl
Expand Up @@ -19,7 +19,7 @@
rdf:type owl:Ontology ;
rdfs:label "The Knora base ontology"@en ;
:attachedToProject knora-admin:SystemProject ;
:ontologyVersion "knora-base v21" .
:ontologyVersion "knora-base v22" .
BalduinLandolt marked this conversation as resolved.
Show resolved Hide resolved


#################################################################
Expand Down
36 changes: 36 additions & 0 deletions test_data/upgrade/pr2081.trig
@@ -0,0 +1,36 @@
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix knora-base: <http://www.knora.org/ontology/knora-base#> .

# resources with old date serialization

<http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9ma>
a knora-base:Resource ;
knora-base:attachedToUser <http://rdfh.ch/users/9XBCrDV3SRa7kS1WwynB4Q> ;
knora-base:attachedToProject <http://rdfh.ch/projects/0001> ;
knora-base:hasPermissions "V knora-admin:UnknownUser|M knora-admin:ProjectMember" ;
knora-base:creationDate "2019-11-29T10:00:00.673298+00:00"^^xsd:dateTime ;
rdfs:label "a thing" ;
knora-base:isDeleted false .

<http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mb>
a knora-base:Resource ;
knora-base:attachedToUser <http://rdfh.ch/users/9XBCrDV3SRa7kS1WwynB4Q> ;
knora-base:attachedToProject <http://rdfh.ch/projects/0001> ;
knora-base:hasPermissions "V knora-admin:UnknownUser|M knora-admin:ProjectMember" ;
knora-base:creationDate "2018-08-07T13:27:18.518+00:00"^^xsd:dateTime ;
knora-base:lastModificationDate "2019-11-29T10:00:01.673298+00:00"^^xsd:dateTime ;
rdfs:label "a thing" ;
knora-base:isDeleted false .

<http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mc>
a knora-base:Resource ;
knora-base:attachedToUser <http://rdfh.ch/users/9XBCrDV3SRa7kS1WwynB4Q> ;
knora-base:attachedToProject <http://rdfh.ch/projects/0001> ;
knora-base:hasPermissions "V knora-admin:UnknownUser|M knora-admin:ProjectMember" ;
knora-base:creationDate "2018-08-07T13:27:18.518123Z"^^xsd:dateTime ;
knora-base:lastModificationDate "2019-11-29T10:00:01.673298+02:00"^^xsd:dateTime ;
knora-base:deleteDate "2020-04-07T14:59:28.960124+00:00"^^xsd:dateTime ;
rdfs:label "a thing" ;
knora-base:isDeleted false .
2 changes: 1 addition & 1 deletion webapi/src/main/scala/org/knora/webapi/package.scala
Expand Up @@ -11,7 +11,7 @@ package object webapi {
* The version of `knora-base` and of the other built-in ontologies that this version of Knora requires.
* Must be the same as the object of `knora-base:ontologyVersion` in the `knora-base` ontology being used.
*/
val KnoraBaseVersion: String = "knora-base v21"
val KnoraBaseVersion: String = "knora-base v22"

/**
* `IRI` is a synonym for `String`, used to improve code readability.
Expand Down
Expand Up @@ -59,6 +59,11 @@ object RepositoryUpdatePlan {
versionNumber = 21,
plugin = new UpgradePluginPR2079(featureFactoryConfig, log),
prBasedVersionString = Some("PR 2079")
),
PluginForKnoraBaseVersion(
versionNumber = 22,
plugin = new UpgradePluginPR2081(featureFactoryConfig, log),
prBasedVersionString = Some("PR 2081")
)
)

Expand Down
@@ -0,0 +1,52 @@
/*
* Copyright © 2021 - 2022 Swiss National Data and Service Center for the Humanities and/or DaSCH Service Platform contributors.
* SPDX-License-Identifier: Apache-2.0
*/

package org.knora.webapi.store.triplestore.upgrade.plugins

import com.typesafe.scalalogging.Logger
import org.knora.webapi.feature.FeatureFactoryConfig
import org.knora.webapi.messages.OntologyConstants
import org.knora.webapi.messages.util.rdf._
import org.knora.webapi.store.triplestore.upgrade.UpgradePlugin
import java.time.Instant

/**
* Transforms a repository for Knora PR 2079.
* Adds missing datatype ^^<http://www.w3.org/2001/XMLSchema#anyURI> and/or value to valueHasUri
BalduinLandolt marked this conversation as resolved.
Show resolved Hide resolved
*/
class UpgradePluginPR2081(featureFactoryConfig: FeatureFactoryConfig, log: Logger) extends UpgradePlugin {
private val nodeFactory: RdfNodeFactory = RdfFeatureFactory.getRdfNodeFactory(featureFactoryConfig)

override def transform(model: RdfModel): Unit = {
val statementsToRemove: collection.mutable.Set[Statement] = collection.mutable.Set.empty
val statementsToAdd: collection.mutable.Set[Statement] = collection.mutable.Set.empty
Comment on lines +23 to +24
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks familiarly inelegant 🤣

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you dislike it too! xD


val newObjectValue: String => DatatypeLiteral = (in: String) =>
nodeFactory.makeDatatypeLiteral(Instant.parse(in).toString, OntologyConstants.Xsd.DateTime)
val shouldTransform: DatatypeLiteral => Boolean = (literal: DatatypeLiteral) =>
(literal.datatype == OntologyConstants.Xsd.DateTime &&
literal.value != newObjectValue(literal.value).value)

for (statement: Statement <- model) {
statement.obj match {
case literal: DatatypeLiteral if shouldTransform(literal) =>
val newValue = newObjectValue(literal.value)
log.debug(s"Transformed ${literal.value} => ${newValue.value}")
statementsToRemove += statement
statementsToAdd += nodeFactory.makeStatement(
subj = statement.subj,
pred = statement.pred,
obj = newValue,
context = statement.context
)
case other => ()
}
}

model.removeStatements(statementsToRemove.toSet)
model.addStatements(statementsToAdd.toSet)
}

}
Expand Up @@ -36,7 +36,7 @@ DELETE {
GRAPH ?dataNamedGraph {
@maybeLastModificationDate match {
case Some(lastModificationDate) => {
?resource knora-base:lastModificationDate "@lastModificationDate"^^xsd:dateTime .
?resource knora-base:lastModificationDate ?lastModificationDate .
}

case None => {}
Expand Down Expand Up @@ -90,7 +90,7 @@ WHERE {

@maybeLastModificationDate match {
case Some(lastModificationDate) => {
?resource knora-base:lastModificationDate "@lastModificationDate"^^xsd:dateTime .
?resource knora-base:lastModificationDate ?lastModificationDate .

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the last modification date still typed in the triplestore with this change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I'm just using whatever is in the triplestore already rather than the value passed to he template, when deleting the old last modification date. (The idea is to make this independent of the serialization in the triplestore)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I misread it. It's only inside the DELETE and WHERE clause, the INSERT still has the typed version! Ignore my comment then.

}

case None => {
Expand Down
@@ -0,0 +1,76 @@
/*
* Copyright © 2021 - 2022 Swiss National Data and Service Center for the Humanities and/or DaSCH Service Platform contributors.
* SPDX-License-Identifier: Apache-2.0
*/

package org.knora.webapi.store.triplestore.upgrade.plugins

import com.typesafe.scalalogging.LazyLogging
import dsp.errors.AssertionException
import org.knora.webapi.messages.OntologyConstants
import org.knora.webapi.messages.util.rdf._

class UpgradePluginPR2081Spec extends UpgradePluginSpec with LazyLogging {
private val nodeFactory: RdfNodeFactory = RdfFeatureFactory.getRdfNodeFactory(defaultFeatureFactoryConfig)

private def getDateValue(model: RdfModel, subj: IriNode, pred: IriNode): String = {
val statement = model.find(subj = Some(subj), pred = Some(pred), obj = None).toSet.head
statement.obj match {
case literal: DatatypeLiteral if literal.datatype == OntologyConstants.Xsd.DateTime =>
literal.value
case other => throw AssertionException(s"Unexpected object for $pred: $other")
}
}

"Upgrade plugin PR2081" should {

"fix invalid date serializations" in {
val resource1 = nodeFactory.makeIriNode("http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9ma")
val resource2 = nodeFactory.makeIriNode("http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mb")
val resource3 = nodeFactory.makeIriNode("http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mc")
val creationDate = nodeFactory.makeIriNode(OntologyConstants.KnoraBase.CreationDate)
val lastModificationDate = nodeFactory.makeIriNode(OntologyConstants.KnoraBase.LastModificationDate)
val deletionDate = nodeFactory.makeIriNode(OntologyConstants.KnoraBase.DeleteDate)

// Parse the input file.
val model: RdfModel = trigFileToModel("../test_data/upgrade/pr2081.trig")

// Store previous values
val resource1CreationDate = getDateValue(model, resource1, creationDate)
val resource2CreationDate = getDateValue(model, resource2, creationDate)
val resource2LastModificationDate = getDateValue(model, resource2, lastModificationDate)
val resource3CreationDate = getDateValue(model, resource3, creationDate) // only this one should stay the same
val resource3LastModificationDate = getDateValue(model, resource3, lastModificationDate)
val resource3DeletionDate = getDateValue(model, resource3, deletionDate)

// Use the plugin to transform the input.
val plugin = new UpgradePluginPR2081(defaultFeatureFactoryConfig, log)
plugin.transform(model)

// get the new values after transformation
val newResource1CreationDate = getDateValue(model, resource1, creationDate)
val newResource2CreationDate = getDateValue(model, resource2, creationDate)
val newResource2LastModificationDate = getDateValue(model, resource2, lastModificationDate)
val newResource3CreationDate =
getDateValue(model, resource3, creationDate) // only this one should have stayed the same
val newResource3LastModificationDate = getDateValue(model, resource3, lastModificationDate)
val newResource3DeletionDate = getDateValue(model, resource3, deletionDate)

// Check that the dates were fixed.
newResource1CreationDate should not equal (resource1CreationDate)
newResource1CreationDate should endWith("Z")

newResource2CreationDate should not equal (resource2CreationDate)
newResource2CreationDate should endWith("Z")
newResource2LastModificationDate should not equal (resource2CreationDate)
newResource2LastModificationDate should endWith("Z")

newResource3CreationDate should equal(resource3CreationDate) // is equal!
newResource3CreationDate should endWith("Z")
newResource3LastModificationDate should not equal (resource3CreationDate)
newResource3LastModificationDate should endWith("Z")
newResource3DeletionDate should not equal (resource3CreationDate)
newResource3DeletionDate should endWith("Z")
}
}
}