New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix!: add upgrade plugin that fixes invalid date serialisations #2081
Changes from 14 commits
0496f6d
5475fa8
fa8a657
fe8b31f
db7475c
2244d8d
2c2f412
eb2e5de
390af2c
6df7de2
d863ded
1de5f2d
6cc4b54
f956eb5
b08fadb
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,7 +36,7 @@ sipi/test | |
*.bak | ||
*.rdb | ||
.sbtrc | ||
*.trig | ||
/*.trig | ||
|
||
dependencies.txt | ||
/client-test-data.zip | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . | ||
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . | ||
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> . | ||
@prefix knora-base: <http://www.knora.org/ontology/knora-base#> . | ||
|
||
# resources with old date serialization | ||
|
||
<http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9ma> | ||
a knora-base:Resource ; | ||
knora-base:attachedToUser <http://rdfh.ch/users/9XBCrDV3SRa7kS1WwynB4Q> ; | ||
knora-base:attachedToProject <http://rdfh.ch/projects/0001> ; | ||
knora-base:hasPermissions "V knora-admin:UnknownUser|M knora-admin:ProjectMember" ; | ||
knora-base:creationDate "2019-11-29T10:00:00.673298+00:00"^^xsd:dateTime ; | ||
rdfs:label "a thing" ; | ||
knora-base:isDeleted false . | ||
|
||
<http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mb> | ||
a knora-base:Resource ; | ||
knora-base:attachedToUser <http://rdfh.ch/users/9XBCrDV3SRa7kS1WwynB4Q> ; | ||
knora-base:attachedToProject <http://rdfh.ch/projects/0001> ; | ||
knora-base:hasPermissions "V knora-admin:UnknownUser|M knora-admin:ProjectMember" ; | ||
knora-base:creationDate "2018-08-07T13:27:18.518+00:00"^^xsd:dateTime ; | ||
knora-base:lastModificationDate "2019-11-29T10:00:01.673298+00:00"^^xsd:dateTime ; | ||
rdfs:label "a thing" ; | ||
knora-base:isDeleted false . | ||
|
||
<http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mc> | ||
a knora-base:Resource ; | ||
knora-base:attachedToUser <http://rdfh.ch/users/9XBCrDV3SRa7kS1WwynB4Q> ; | ||
knora-base:attachedToProject <http://rdfh.ch/projects/0001> ; | ||
knora-base:hasPermissions "V knora-admin:UnknownUser|M knora-admin:ProjectMember" ; | ||
knora-base:creationDate "2018-08-07T13:27:18.518123Z"^^xsd:dateTime ; | ||
knora-base:lastModificationDate "2019-11-29T10:00:01.673298+02:00"^^xsd:dateTime ; | ||
knora-base:deleteDate "2020-04-07T14:59:28.960124+00:00"^^xsd:dateTime ; | ||
rdfs:label "a thing" ; | ||
knora-base:isDeleted false . |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
/* | ||
* Copyright © 2021 - 2022 Swiss National Data and Service Center for the Humanities and/or DaSCH Service Platform contributors. | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.knora.webapi.store.triplestore.upgrade.plugins | ||
|
||
import com.typesafe.scalalogging.Logger | ||
import org.knora.webapi.feature.FeatureFactoryConfig | ||
import org.knora.webapi.messages.OntologyConstants | ||
import org.knora.webapi.messages.util.rdf._ | ||
import org.knora.webapi.store.triplestore.upgrade.UpgradePlugin | ||
import java.time.Instant | ||
|
||
/** | ||
* Transforms a repository for Knora PR 2081. | ||
* Fixes wrong date serialisations (all `xsd:dateTime` in the database should end on `Z` rather than specifying a time zone). | ||
*/ | ||
class UpgradePluginPR2081(featureFactoryConfig: FeatureFactoryConfig, log: Logger) extends UpgradePlugin { | ||
private val nodeFactory: RdfNodeFactory = RdfFeatureFactory.getRdfNodeFactory(featureFactoryConfig) | ||
|
||
override def transform(model: RdfModel): Unit = { | ||
val statementsToRemove: collection.mutable.Set[Statement] = collection.mutable.Set.empty | ||
val statementsToAdd: collection.mutable.Set[Statement] = collection.mutable.Set.empty | ||
Comment on lines
+23
to
+24
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looks familiarly inelegant 🤣 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm glad you dislike it too! xD |
||
|
||
val newObjectValue: String => DatatypeLiteral = (in: String) => | ||
nodeFactory.makeDatatypeLiteral(Instant.parse(in).toString, OntologyConstants.Xsd.DateTime) | ||
val shouldTransform: DatatypeLiteral => Boolean = (literal: DatatypeLiteral) => | ||
(literal.datatype == OntologyConstants.Xsd.DateTime && | ||
literal.value != newObjectValue(literal.value).value) | ||
|
||
for (statement: Statement <- model) { | ||
statement.obj match { | ||
case literal: DatatypeLiteral if shouldTransform(literal) => | ||
val newValue = newObjectValue(literal.value) | ||
log.debug(s"Transformed ${literal.value} => ${newValue.value}") | ||
statementsToRemove += statement | ||
statementsToAdd += nodeFactory.makeStatement( | ||
subj = statement.subj, | ||
pred = statement.pred, | ||
obj = newValue, | ||
context = statement.context | ||
) | ||
case other => () | ||
} | ||
} | ||
|
||
model.removeStatements(statementsToRemove.toSet) | ||
model.addStatements(statementsToAdd.toSet) | ||
} | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -36,7 +36,7 @@ DELETE { | |
GRAPH ?dataNamedGraph { | ||
@maybeLastModificationDate match { | ||
case Some(lastModificationDate) => { | ||
?resource knora-base:lastModificationDate "@lastModificationDate"^^xsd:dateTime . | ||
?resource knora-base:lastModificationDate ?lastModificationDate . | ||
} | ||
|
||
case None => {} | ||
|
@@ -90,7 +90,7 @@ WHERE { | |
|
||
@maybeLastModificationDate match { | ||
case Some(lastModificationDate) => { | ||
?resource knora-base:lastModificationDate "@lastModificationDate"^^xsd:dateTime . | ||
?resource knora-base:lastModificationDate ?lastModificationDate . | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the last modification date still typed in the triplestore with this change? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, I'm just using whatever is in the triplestore already rather than the value passed to he template, when deleting the old last modification date. (The idea is to make this independent of the serialization in the triplestore) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I misread it. It's only inside the DELETE and WHERE clause, the INSERT still has the typed version! Ignore my comment then. |
||
} | ||
|
||
case None => { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
/* | ||
* Copyright © 2021 - 2022 Swiss National Data and Service Center for the Humanities and/or DaSCH Service Platform contributors. | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.knora.webapi.store.triplestore.upgrade.plugins | ||
|
||
import com.typesafe.scalalogging.LazyLogging | ||
import dsp.errors.AssertionException | ||
import org.knora.webapi.messages.OntologyConstants | ||
import org.knora.webapi.messages.util.rdf._ | ||
|
||
class UpgradePluginPR2081Spec extends UpgradePluginSpec with LazyLogging { | ||
private val nodeFactory: RdfNodeFactory = RdfFeatureFactory.getRdfNodeFactory(defaultFeatureFactoryConfig) | ||
|
||
private def getDateValue(model: RdfModel, subj: IriNode, pred: IriNode): String = { | ||
val statement = model.find(subj = Some(subj), pred = Some(pred), obj = None).toSet.head | ||
statement.obj match { | ||
case literal: DatatypeLiteral if literal.datatype == OntologyConstants.Xsd.DateTime => | ||
literal.value | ||
case other => throw AssertionException(s"Unexpected object for $pred: $other") | ||
} | ||
} | ||
|
||
"Upgrade plugin PR2081" should { | ||
|
||
"fix invalid date serializations" in { | ||
val resource1 = nodeFactory.makeIriNode("http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9ma") | ||
val resource2 = nodeFactory.makeIriNode("http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mb") | ||
val resource3 = nodeFactory.makeIriNode("http://rdfh.ch/0001/55UrkgTKR2SEQgnsLWI9mc") | ||
val creationDate = nodeFactory.makeIriNode(OntologyConstants.KnoraBase.CreationDate) | ||
val lastModificationDate = nodeFactory.makeIriNode(OntologyConstants.KnoraBase.LastModificationDate) | ||
val deletionDate = nodeFactory.makeIriNode(OntologyConstants.KnoraBase.DeleteDate) | ||
|
||
// Parse the input file. | ||
val model: RdfModel = trigFileToModel("../test_data/upgrade/pr2081.trig") | ||
|
||
// Store previous values | ||
val resource1CreationDate = getDateValue(model, resource1, creationDate) | ||
val resource2CreationDate = getDateValue(model, resource2, creationDate) | ||
val resource2LastModificationDate = getDateValue(model, resource2, lastModificationDate) | ||
val resource3CreationDate = getDateValue(model, resource3, creationDate) // only this one should stay the same | ||
val resource3LastModificationDate = getDateValue(model, resource3, lastModificationDate) | ||
val resource3DeletionDate = getDateValue(model, resource3, deletionDate) | ||
|
||
// Use the plugin to transform the input. | ||
val plugin = new UpgradePluginPR2081(defaultFeatureFactoryConfig, log) | ||
plugin.transform(model) | ||
|
||
// get the new values after transformation | ||
val newResource1CreationDate = getDateValue(model, resource1, creationDate) | ||
val newResource2CreationDate = getDateValue(model, resource2, creationDate) | ||
val newResource2LastModificationDate = getDateValue(model, resource2, lastModificationDate) | ||
val newResource3CreationDate = | ||
getDateValue(model, resource3, creationDate) // only this one should have stayed the same | ||
val newResource3LastModificationDate = getDateValue(model, resource3, lastModificationDate) | ||
val newResource3DeletionDate = getDateValue(model, resource3, deletionDate) | ||
|
||
// Check that the dates were fixed. | ||
newResource1CreationDate should not equal (resource1CreationDate) | ||
newResource1CreationDate should endWith("Z") | ||
|
||
newResource2CreationDate should not equal (resource2CreationDate) | ||
newResource2CreationDate should endWith("Z") | ||
newResource2LastModificationDate should not equal (resource2CreationDate) | ||
newResource2LastModificationDate should endWith("Z") | ||
|
||
newResource3CreationDate should equal(resource3CreationDate) // is equal! | ||
newResource3CreationDate should endWith("Z") | ||
newResource3LastModificationDate should not equal (resource3CreationDate) | ||
newResource3LastModificationDate should endWith("Z") | ||
newResource3DeletionDate should not equal (resource3CreationDate) | ||
newResource3DeletionDate should endWith("Z") | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we could have above
init-db-test-from-ls-test-server
make command split into 2 steps anywaySo basically command introduced by you stays and the other one could be reduced. Reason for that is the huge amount of data and the issue with that file if the download dies before it ends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my initial thought. (Additionally, once downloaded, you don't need the password anymore, which is nice.)
But I didn't want to mess with the other make target, so I thought I'd add a separate one. Do you think it makes a difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can leave it like that for now, I might fo it later on. I had some problems with dumping the data and then you need to remember to remove the file otherwise another error appears. This could be overall improved but not the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you quickly add what Marcin suggested? It follows the pattern we are using in other cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done