feat(xmlupload): enable migration of resource creation date (DEV-1402) #238

Merged
merged 24 commits into from Oct 18, 2022
18 changes: 9 additions & 9 deletions Makefile
@@ -38,7 +38,7 @@ docs-serve: ## serve docs for local viewing
mkdocs serve --dev-addr=0.0.0.0:7979

.PHONY: install-requirements
install-requirements: ## install requirements
install-requirements: ## install Python dependencies from the diverse requirements.txt files
python3 -m pip install --upgrade pip
pip3 install -r requirements.txt
pip3 install -r docs/requirements.txt
@@ -50,25 +50,25 @@ install: ## install from source (runs setup.py)
pip3 install -e .

.PHONY: test
test: dsp-stack ## run all tests
pytest test/
test: dsp-stack ## run all tests located in the "test" folder (intended for local usage)
-pytest test/
$(MAKE) stack-down

.PHONY: test-no-stack
test-no-stack: ## run tests without starting the stack (if a dsp-stack is already running)
test-no-stack: ## run all tests located in the "test" folder, without starting the stack (intended for local usage)
pytest test/

.PHONY: test-end-to-end
test-end-to-end: dsp-stack ## run e2e tests
pytest test/e2e/
test-end-to-end: dsp-stack ## run e2e tests (intended for local usage)
-pytest test/e2e/
$(MAKE) stack-down

.PHONY: test-end-to-end-ci
test-end-to-end-ci: dsp-stack ## run e2e tests on GitHub CI, where it isn't possible nor necessary to remove .tmp
test-end-to-end-ci: dsp-stack ## run e2e tests (intended for GitHub CI, where it isn't possible nor necessary to remove .tmp)
pytest test/e2e/

.PHONY: test-end-to-end-no-stack
test-end-to-end-no-stack: ## run e2e tests without starting the dsp-stack (if a dsp-stack is already running)
test-end-to-end-no-stack: ## run e2e tests without starting the dsp-stack (intended for local usage)
pytest test/e2e/

.PHONY: test-unittests
@@ -77,7 +77,7 @@ test-unittests: ## run unit tests

.PHONY: clean
clean: ## clean local project directories
@rm -rf dist/ build/ site/ dsp_tools.egg-info/
@rm -rf dist/ build/ site/ dsp_tools.egg-info/ id2iri_*_mapping_*.json stashed_*_properties_*.txt

.PHONY: help
help: ## show this help
Binary file modified docs/assets/images/img-excel2xml.png
5 changes: 4 additions & 1 deletion docs/dsp-tools-excel.md
@@ -220,4 +220,7 @@ Some notes:

- The special tags `<annotation>`, `<link>`, and `<region>` are represented as resources of restype `Annotation`,
`LinkObj`, and `Region`.
- The columns "ark" and "iri" are only used for DaSCH-internal data migration.
- The columns "ark", "iri", and "creation_date" are only used for DaSCH-internal data migration.
- If `file` is provided, but no `file permissions`, an attempt is made to deduce them from the resource
permissions (`res-default` --> `prop-default` and `res-restricted` --> `prop-restricted`). If this attempt is not
successful, a `BaseError` is raised.
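
The deduction rule described in the last note can be sketched as follows. This is a minimal illustration, not the dsp-tools implementation: the function name `deduce_file_permissions` and the local `BaseError` stand-in are assumptions made for this example; only the two mappings come from the text above.

```python
from typing import Optional


class BaseError(Exception):
    """Stand-in for the dsp-tools BaseError (assumption: a plain exception carrying a message)."""


# Mapping stated in the docs: resource permissions -> file permissions
_RES_TO_PROP = {
    "res-default": "prop-default",
    "res-restricted": "prop-restricted",
}


def deduce_file_permissions(resource_permissions: Optional[str],
                            file_permissions: Optional[str] = None) -> str:
    """Return the explicit file permissions, or deduce them from the resource permissions."""
    if file_permissions:
        # Explicit file permissions always win
        return file_permissions
    try:
        return _RES_TO_PROP[resource_permissions]
    except KeyError:
        raise BaseError(
            f"Cannot deduce file permissions from resource permissions '{resource_permissions}'"
        ) from None
```

With this sketch, `deduce_file_permissions("res-default")` yields `prop-default`, while any resource permission outside the two listed ones raises the stand-in `BaseError`.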
20 changes: 12 additions & 8 deletions docs/dsp-tools-xmlupload.md
@@ -201,14 +201,18 @@ To take `KnownUser` as example:

A `<resource>` element contains all necessary information to create a resource. It has the following attributes:

- `label`: a human-readable, preferably meaningful short name of the resource (required)
- `restype`: the resource type as defined within the ontology (required)
- `id`: a unique, arbitrary string providing a unique ID to the resource in order to be referencable by other resources;
the ID is only used during the import process and later replaced by the IRI used internally by DSP (required)
- `permissions`: a reference to a permission set; the permissions will be applied to the created resource (optional)
- `iri`: a custom IRI used when migrating existing resources (optional)
- `ark`: a version 0 ARK used when migrating existing resources from salsah.org to DSP (optional), it is not possible to
use `iri` and `ark` in the same resource. When `ark` is used, it overrides `iri`.
- `label` (required): a human-readable, preferably meaningful short name of the resource
- `restype` (required): the resource type as defined within the ontology
- `id` (required): a unique, arbitrary string providing a unique ID to the resource in order to be referencable by other
resources; the ID is only used during the import process and later replaced by the IRI used internally by DSP
- `permissions` (optional): a reference to a permission set; the permissions will be applied to the created resource.
If omitted, users who are lower than a `ProjectAdmin` have no permissions at all, not even view rights.
- `iri` (optional): a custom IRI, used when migrating existing resources (DaSCH-internal only)
- `ark` (optional): a version 0 ARK, used when migrating existing resources. It is not possible
to use `iri` and `ark` in the same resource. When `ark` is used, it overrides `iri` (DaSCH-internal only).
- `creation_date` (optional): the creation date of the resource, used when migrating existing resources.
It must be formatted according to the constraints of [xsd:dateTimeStamp](https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp),
which means that the timezone is required, e.g.: `2005-10-23T13:45:12.502951+02:00` (DaSCH-internal only)
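
One way to produce a `creation_date` value in this format is to serialize a timezone-aware `datetime`. The following is a minimal sketch using only the Python standard library; the helper name `to_datetimestamp` is hypothetical and not part of dsp-tools.

```python
from datetime import datetime, timedelta, timezone


def to_datetimestamp(dt: datetime) -> str:
    """Serialize a timezone-aware datetime as an ISO 8601 string with UTC offset."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        # xsd:dateTimeStamp does not allow values without a timezone
        raise ValueError("xsd:dateTimeStamp requires a timezone")
    return dt.isoformat()


# The example value from the docs: 23 October 2005, 13:45:12.502951, UTC+2
created = datetime(2005, 10, 23, 13, 45, 12, 502951, tzinfo=timezone(timedelta(hours=2)))
print(to_datetimestamp(created))  # 2005-10-23T13:45:12.502951+02:00
```

Rejecting naive datetimes early mirrors the requirement stated above that the timezone must be present.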

A complete `<resource>` element may look as follows:

2 changes: 1 addition & 1 deletion docs/index.md
@@ -20,7 +20,7 @@ dsp-tools helps you with the following tasks:
a DSP server and writes it into a JSON file.
- [`dsp-tools xmlupload`](./dsp-tools-usage.md#upload-data-to-a-dsp-server) uploads data from an XML file (bulk
data import) and writes the mapping from internal IDs to IRIs into a local file.
- [`dsp-tools excel`](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files)
- [`dsp-tools excel2lists`](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files)
creates the "lists" section of a JSON project file from one or several Excel files. The resulting section can be
integrated into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2resources`](./dsp-tools-usage.md#create-the-resources-section-of-a-json-project-file-from-an-excel-file)
76 changes: 41 additions & 35 deletions knora/dsplib/models/helpers.py
@@ -2,7 +2,6 @@
import sys
from dataclasses import dataclass
from enum import Enum, unique
from traceback import format_exc
from typing import NewType, Optional, Any, Tuple, Union, Pattern

from pystrict import strict
@@ -63,7 +62,7 @@ def __str__(self) -> str:
Convert to string
:return: stringyfied error message
"""
return self._message + "\n\n" + format_exc()
Review comment by the PR author (collaborator):

I don't have a clue what format_exc() was good for in the past, but I know that it causes problems. It was the source of this strange behaviour that I already had in the past: when a BaseError occurs while testing, Python prints an infinitely long stacktrace full of riddles, and then crashes.

I found out that I can just remove format_exc()

return self._message
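
The effect of this change can be illustrated with a stand-in class. This is a sketch, not the dsp-tools `BaseError` (the names `SketchError` and `failing_call` are made up): `__str__` returns only the message, and a caller that wants the stack trace can still request it explicitly inside the `except` block.

```python
import traceback


class SketchError(Exception):
    """Hypothetical stand-in for BaseError: __str__ yields the bare message."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self._message = message

    def __str__(self) -> str:
        # No traceback.format_exc() here: converting to a string has no side effects
        return self._message


def failing_call() -> None:
    raise SketchError("Invalid xsd:dateTimeStamp: 'foo'")


try:
    failing_call()
except SketchError as err:
    short = str(err)                # message only, suitable for user-facing output
    trace = traceback.format_exc()  # full trace, requested explicitly by the handler
```

Keeping the trace out of `__str__` leaves the decision about how much detail to print to the code that handles the error.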

@property
def message(self) -> str:
@@ -423,67 +422,74 @@ def print(self) -> None:
print(a[0] + ': "' + a[1].iri + '"')


class LastModificationDate:
class DateTimeStamp:
"""
Class to hold and process the last modification date of a ontology
Class to hold and process an xsd:dateTimeStamp
"""
_last_modification_date: str
_dateTimeStamp: str
_validation_regex = r"^-?([1-9][0-9]{3,}|0[0-9]{3})" \
r"-(0[1-9]|1[0-2])" \
r"-(0[1-9]|[12][0-9]|3[01])" \
r"T(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]+)?|(24:00:00(\.0+)?))" \
r"(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))$"

def __init__(self, val: Any):
"""
The constructor works for different inputs:
- a string holding the modification date
- an instance of "LastModificationDate"
- a string
- an instance of "DateTimeStamp"
- json-ld construct of the form { "@type": "xsd:dateTimeStamp", "@value": "date-str" }
:param val: datetimestamp as string, instance of "LastModificationDate" or json-ld construct
:param val: xsd:dateTimeStamp as string, instance of "DateTimeStamp" or json-ld construct
"""
if isinstance(val, str):
self._last_modification_date = val
elif isinstance(val, LastModificationDate):
self._last_modification_date = str(val)
if not re.search(self._validation_regex, val):
raise BaseError(f"Invalid xsd:dateTimeStamp: '{val}'")
self._dateTimeStamp = val
elif isinstance(val, DateTimeStamp):
self._dateTimeStamp = str(val)
else:
if val.get("@type") is not None and val.get("@type") == "xsd:dateTimeStamp":
self._last_modification_date = val["@value"]
if val.get("@type") == "xsd:dateTimeStamp" and re.search(self._validation_regex, str(val.get("@value"))):
self._dateTimeStamp = val["@value"]
else:
raise BaseError("Invalid LastModificationDate")
raise BaseError(f"Invalid xsd:dateTimeStamp: '{val}'")
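
To see what the new constructor accepts, the validation pattern can be tried in isolation. The sketch below copies the regular expression from this diff; the helper `is_datetimestamp` is hypothetical and only wraps `re.search`.

```python
import re

# Pattern copied from DateTimeStamp._validation_regex in this diff
DATETIMESTAMP_REGEX = (
    r"^-?([1-9][0-9]{3,}|0[0-9]{3})"
    r"-(0[1-9]|1[0-2])"
    r"-(0[1-9]|[12][0-9]|3[01])"
    r"T(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]+)?|(24:00:00(\.0+)?))"
    r"(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))$"
)


def is_datetimestamp(value: str) -> bool:
    """Return True if the string matches the xsd:dateTimeStamp pattern."""
    return re.search(DATETIMESTAMP_REGEX, value) is not None


samples = {
    "2005-10-23T13:45:12.502951+02:00": True,   # example from the docs
    "2022-10-18T12:00:00Z": True,               # UTC designator
    "2022-10-18T12:00:00": False,               # timezone missing
    "2022-13-01T00:00:00Z": False,              # month out of range
    "2022-02-30T00:00:00Z": True,               # syntax only: the calendar is not checked
}
for value, expected in samples.items():
    assert is_datetimestamp(value) is expected
```

As the last sample shows, the pattern checks the lexical form only, so a date such as 30 February passes validation.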

def __eq__(self, other: Union[str, 'LastModificationDate']) -> bool:
def __eq__(self, other: Union[str, 'DateTimeStamp']) -> bool:
if isinstance(other, str):
other = LastModificationDate(other)
return self._last_modification_date == other._last_modification_date
other = DateTimeStamp(other)
return self._dateTimeStamp == other._dateTimeStamp

def __lt__(self, other: 'LastModificationDate') -> bool:
def __lt__(self, other: 'DateTimeStamp') -> bool:
if isinstance(other, str):
other = LastModificationDate(other)
return self._last_modification_date < other._last_modification_date
other = DateTimeStamp(other)
return self._dateTimeStamp < other._dateTimeStamp

def __le__(self, other: 'LastModificationDate') -> bool:
def __le__(self, other: 'DateTimeStamp') -> bool:
if isinstance(other, str):
other = LastModificationDate(other)
return self._last_modification_date <= other._last_modification_date
other = DateTimeStamp(other)
return self._dateTimeStamp <= other._dateTimeStamp

def __gt__(self, other: 'LastModificationDate') -> bool:
def __gt__(self, other: 'DateTimeStamp') -> bool:
if isinstance(other, str):
other = LastModificationDate(other)
return self._last_modification_date > other._last_modification_date
other = DateTimeStamp(other)
return self._dateTimeStamp > other._dateTimeStamp

def __ge__(self, other: 'LastModificationDate') -> bool:
def __ge__(self, other: 'DateTimeStamp') -> bool:
if isinstance(other, str):
other = LastModificationDate(other)
return self._last_modification_date >= other._last_modification_date
other = DateTimeStamp(other)
return self._dateTimeStamp >= other._dateTimeStamp

def __ne__(self, other: 'LastModificationDate') -> bool:
def __ne__(self, other: 'DateTimeStamp') -> bool:
if isinstance(other, str):
other = LastModificationDate(other)
return self._last_modification_date != other._last_modification_date
other = DateTimeStamp(other)
return self._dateTimeStamp != other._dateTimeStamp

def __str__(self: 'LastModificationDate') -> Union[None, str]:
return self._last_modification_date
def __str__(self: 'DateTimeStamp') -> Union[None, str]:
return self._dateTimeStamp

def toJsonObj(self):
return {
"@type": "xsd:dateTimeStamp",
"@value": self._last_modification_date
"@value": self._dateTimeStamp
}
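
Note that the comparison methods above compare the raw strings. This matches chronological order only when both values use the same timezone offset and the same fractional-second precision. Where mixed offsets can occur, one option is to parse the values before comparing them. A sketch using the standard library follows; the helper `chronologically_before` is hypothetical and not part of this PR.

```python
from datetime import datetime


def chronologically_before(a: str, b: str) -> bool:
    """Compare two timestamps as instants rather than as strings."""
    # Assumption: inputs carry an explicit numeric offset such as +02:00;
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on.
    return datetime.fromisoformat(a) < datetime.fromisoformat(b)


early = "2022-10-18T12:00:00+02:00"  # 10:00 UTC
late = "2022-10-18T11:00:00+00:00"   # 11:00 UTC

print(early < late)                         # False: plain string comparison
print(chronologically_before(early, late))  # True: comparison of instants
```

For timestamps that all end in `Z` and share one precision, the string comparison and the parsed comparison agree.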


20 changes: 10 additions & 10 deletions knora/dsplib/models/ontology.py
@@ -7,7 +7,7 @@
from pystrict import strict

from .connection import Connection
from .helpers import Actions, BaseError, Context, LastModificationDate, OntoIri, WithId
from .helpers import Actions, BaseError, Context, DateTimeStamp, OntoIri, WithId
from .model import Model
from .project import Project
from .propertyclass import PropertyClass
@@ -64,7 +64,7 @@ class Ontology(Model):
_name: str
_label: str
_comment: str
_lastModificationDate: LastModificationDate
_lastModificationDate: DateTimeStamp
_resource_classes: list[ResourceClass]
_property_classes: list[PropertyClass]
_context: Context
@@ -77,7 +77,7 @@ def __init__(self,
name: Optional[str] = None,
label: Optional[str] = None,
comment: Optional[str] = None,
lastModificationDate: Optional[Union[str, LastModificationDate]] = None,
lastModificationDate: Optional[Union[str, DateTimeStamp]] = None,
resource_classes: list[ResourceClass] = [],
property_classes: list[PropertyClass] = [],
context: Context = None):
@@ -92,10 +92,10 @@ def __init__(self,
self._comment = comment
if lastModificationDate is None:
self._lastModificationDate = None
elif isinstance(lastModificationDate, LastModificationDate):
elif isinstance(lastModificationDate, DateTimeStamp):
self._lastModificationDate = lastModificationDate
else:
self._lastModificationDate = LastModificationDate(lastModificationDate)
self._lastModificationDate = DateTimeStamp(lastModificationDate)
self._resource_classes = resource_classes
self._property_classes = property_classes
self._context = context if context is not None else Context()
@@ -144,12 +144,12 @@ def comment(self, value: str):
self._changed.add('comment')

@property
def lastModificationDate(self) -> LastModificationDate:
def lastModificationDate(self) -> DateTimeStamp:
return self._lastModificationDate

@lastModificationDate.setter
def lastModificationDate(self, value: Union[str, LastModificationDate]):
self._lastModificationDate = LastModificationDate(value)
def lastModificationDate(self, value: Union[str, DateTimeStamp]):
self._lastModificationDate = DateTimeStamp(value)

@property
def resource_classes(self) -> list[ResourceClass]:
@@ -250,7 +250,7 @@ def fromJsonObj(cls, con: Connection, json_obj: Any) -> 'Ontology':
project = json_obj[knora_api + ':attachedToProject']['@id']
tmp = json_obj.get(knora_api + ':lastModificationDate')
if tmp is not None:
last_modification_date = LastModificationDate(json_obj.get(knora_api + ':lastModificationDate'))
last_modification_date = DateTimeStamp(json_obj.get(knora_api + ':lastModificationDate'))
else:
last_modification_date = None
resource_classes = None
@@ -303,7 +303,7 @@ def __oneOntologiesFromJsonObj(cls, con: Connection, json_obj: Any, context: Con
project = json_obj[knora_api + ':attachedToProject']['@id']
tmp = json_obj.get(knora_api + ':lastModificationDate')
if tmp is not None:
last_modification_date = LastModificationDate(json_obj.get(knora_api + ':lastModificationDate'))
last_modification_date = DateTimeStamp(json_obj.get(knora_api + ':lastModificationDate'))
else:
last_modification_date = None
label = json_obj.get(rdfs + ':label')
18 changes: 9 additions & 9 deletions knora/dsplib/models/propertyclass.py
@@ -6,7 +6,7 @@
from pystrict import strict

from .connection import Connection
from .helpers import Actions, BaseError, Context, LastModificationDate, WithId
from .helpers import Actions, BaseError, Context, DateTimeStamp, WithId
from .langstring import Languages, LangString
from .listnode import ListNode
from .model import Model
@@ -301,7 +301,7 @@ def fromJsonObj(cls, con: Connection, context: Context, json_obj: Any) -> Any:
editable=editable,
linkvalue=linkvalue)

def toJsonObj(self, lastModificationDate: LastModificationDate, action: Actions, what: Optional[str] = None) -> Any:
def toJsonObj(self, lastModificationDate: DateTimeStamp, action: Actions, what: Optional[str] = None) -> Any:

def resolve_propref(resref: str):
tmp = resref.split(':')
@@ -379,14 +379,14 @@ def resolve_propref(resref: str):

return tmp

def create(self, last_modification_date: LastModificationDate) -> Tuple[LastModificationDate, 'PropertyClass']:
def create(self, last_modification_date: DateTimeStamp) -> Tuple[DateTimeStamp, 'PropertyClass']:
jsonobj = self.toJsonObj(last_modification_date, Actions.Create)
jsondata = json.dumps(jsonobj, cls=SetEncoder, indent=2)
result = self._con.post(PropertyClass.ROUTE, jsondata)
last_modification_date = LastModificationDate(result['knora-api:lastModificationDate'])
last_modification_date = DateTimeStamp(result['knora-api:lastModificationDate'])
return last_modification_date, PropertyClass.fromJsonObj(self._con, self._context, result['@graph'])

def update(self, last_modification_date: LastModificationDate) -> Tuple[LastModificationDate, 'ResourceClass']:
def update(self, last_modification_date: DateTimeStamp) -> Tuple[DateTimeStamp, 'ResourceClass']:
#
# Note: Knora is able to change only one thing per call, either label or comment!
#
@@ -396,23 +396,23 @@ def update(self, last_modification_date: LastModificationDate) -> Tuple[LastModi
jsonobj = self.toJsonObj(last_modification_date, Actions.Update, 'label')
jsondata = json.dumps(jsonobj, cls=SetEncoder, indent=4)
result = self._con.put(PropertyClass.ROUTE, jsondata)
last_modification_date = LastModificationDate(result['knora-api:lastModificationDate'])
last_modification_date = DateTimeStamp(result['knora-api:lastModificationDate'])
something_changed = True
if 'comment' in self._changed:
jsonobj = self.toJsonObj(last_modification_date, Actions.Update, 'comment')
jsondata = json.dumps(jsonobj, cls=SetEncoder, indent=4)
result = self._con.put(PropertyClass.ROUTE, jsondata)
last_modification_date = LastModificationDate(result['knora-api:lastModificationDate'])
last_modification_date = DateTimeStamp(result['knora-api:lastModificationDate'])
something_changed = True
if something_changed:
return last_modification_date, PropertyClass.fromJsonObj(self._con, self._context, result['@graph'])
else:
return last_modification_date, self
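
The pattern in `create()` and `update()` is that every write sends the current `lastModificationDate` and receives a new one, which is then used for the next write. A toy illustration with an in-memory stand-in for the server follows. The class `FakeOntologyServer`, its `write` method, and the rejection of stale timestamps are assumptions of this sketch; the real calls go through `Connection`.

```python
from datetime import datetime, timedelta, timezone


class FakeOntologyServer:
    """Invented stand-in: rejects writes that carry a stale timestamp."""

    def __init__(self) -> None:
        self._clock = datetime(2022, 10, 18, 12, 0, 0, tzinfo=timezone.utc)
        self.last_modification_date = self._clock.isoformat()

    def write(self, last_modification_date: str, change: str) -> str:
        if last_modification_date != self.last_modification_date:
            raise ValueError(f"stale lastModificationDate for change '{change}'")
        # Every successful write advances the timestamp
        self._clock += timedelta(seconds=1)
        self.last_modification_date = self._clock.isoformat()
        return self.last_modification_date


server = FakeOntologyServer()
first = server.last_modification_date
lmd = first
for change in ("label", "comment"):  # one change per call, as in update()
    lmd = server.write(lmd, change)  # always continue with the returned timestamp
```

Reusing `first` for a later write would fail in this sketch, which is why each method in the diff returns the updated timestamp to its caller.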

def delete(self, last_modification_date: LastModificationDate) -> LastModificationDate:
def delete(self, last_modification_date: DateTimeStamp) -> DateTimeStamp:
result = self._con.delete(PropertyClass.ROUTE + '/' + quote_plus(self._id) + '?lastModificationDate=' + str(
last_modification_date))
return LastModificationDate(result['knora-api:lastModificationDate'])
return DateTimeStamp(result['knora-api:lastModificationDate'])
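
In `delete()`, the property IRI is percent-encoded with `quote_plus`, while the timestamp is appended to the query string as-is. That is unproblematic for timestamps ending in `Z`, but a `+` in a numeric offset may be decoded as a space by query-string parsers. A defensive variant could encode both parts. This is a sketch, not the code of this PR; `build_delete_url`, the route value, and the IRI are assumptions for illustration.

```python
from urllib.parse import quote_plus


def build_delete_url(route: str, iri: str, last_modification_date: str) -> str:
    """Build the DELETE path with both the IRI and the timestamp percent-encoded."""
    return (
        f"{route}/{quote_plus(iri)}"
        f"?lastModificationDate={quote_plus(last_modification_date)}"
    )


url = build_delete_url(
    "/v2/ontologies/properties",                         # assumed route, for illustration only
    "http://example.org/ontology/0001/onto/v2#hasName",  # made-up IRI
    "2022-10-18T12:00:00+02:00",
)
print(url)
```

In the resulting URL, the `+` of the offset appears as `%2B`, so it cannot be mistaken for an encoded space.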

def createDefinitionFileObj(self, context: Context, shortname: str):
"""