feat(xmlupload): enable migration of resource creation date (DEV-1402) #238

Merged
merged 24 commits into from Oct 18, 2022
Changes from 20 commits
18 changes: 9 additions & 9 deletions Makefile
Expand Up @@ -38,7 +38,7 @@ docs-serve: ## serve docs for local viewing
mkdocs serve --dev-addr=0.0.0.0:7979

.PHONY: install-requirements
install-requirements: ## install requirements
install-requirements: ## install Python dependencies from the diverse requirements.txt files
python3 -m pip install --upgrade pip
pip3 install -r requirements.txt
pip3 install -r docs/requirements.txt
Expand All @@ -50,25 +50,25 @@ install: ## install from source (runs setup.py)
pip3 install -e .

.PHONY: test
test: dsp-stack ## run all tests
pytest test/
test: dsp-stack ## run all tests located in the "test" folder (intended for local usage)
-pytest test/
$(MAKE) stack-down

.PHONY: test-no-stack
test-no-stack: ## run tests without starting the stack (if a dsp-stack is already running)
test-no-stack: ## run all tests located in the "test" folder, without starting the stack (intended for local usage)
pytest test/

.PHONY: test-end-to-end
test-end-to-end: dsp-stack ## run e2e tests
pytest test/e2e/
test-end-to-end: dsp-stack ## run e2e tests (intended for local usage)
-pytest test/e2e/
$(MAKE) stack-down

.PHONY: test-end-to-end-ci
test-end-to-end-ci: dsp-stack ## run e2e tests on GitHub CI, where it isn't possible nor necessary to remove .tmp
test-end-to-end-ci: dsp-stack ## run e2e tests (intended for GitHub CI, where it isn't possible nor necessary to remove .tmp)
pytest test/e2e/

.PHONY: test-end-to-end-no-stack
test-end-to-end-no-stack: ## run e2e tests without starting the dsp-stack (if a dsp-stack is already running)
test-end-to-end-no-stack: ## run e2e tests without starting the dsp-stack (intended for local usage)
pytest test/e2e/

.PHONY: test-unittests
Expand All @@ -77,7 +77,7 @@ test-unittests: ## run unit tests

.PHONY: clean
clean: ## clean local project directories
@rm -rf dist/ build/ site/ dsp_tools.egg-info/
@rm -rf dist/ build/ site/ dsp_tools.egg-info/ id2iri_*_mapping_*.json stashed_*_properties_*.txt

.PHONY: help
help: ## show this help
Expand Down
Binary file modified docs/assets/images/img-excel2xml.png
5 changes: 4 additions & 1 deletion docs/dsp-tools-excel.md
Expand Up @@ -220,4 +220,7 @@ Some notes:

- The special tags `<annotation>`, `<link>`, and `<region>` are represented as resources of restype `Annotation`,
`LinkObj`, and `Region`.
- The columns "ark" and "iri" are only used for DaSCH-internal data migration.
- The columns "ark", "iri", and "creation_date" are only used for DaSCH-internal data migration.
- If `file` is provided, but no `file permissions`, dsp-tools tries to deduce them from the resource
permissions (`res-default` --> `prop-default` and `res-restricted` --> `prop-restricted`). If this attempt is not
successful, a `BaseError` will be raised.
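
The fallback described in the last note can be pictured as a small lookup. The following is only a sketch under stated assumptions: the helper name `deduce_file_permissions` is hypothetical, and `ValueError` stands in for the `BaseError` of dsp-tools so that the snippet stays self-contained. It is not the actual implementation.

```python
from typing import Optional

# Mapping taken from the note: resource permissions -> file permissions.
_RES_TO_PROP = {
    "res-default": "prop-default",
    "res-restricted": "prop-restricted",
}


def deduce_file_permissions(file_permissions: Optional[str], resource_permissions: Optional[str]) -> str:
    """Hypothetical helper: return explicit file permissions, or deduce them from the resource permissions."""
    if file_permissions:
        return file_permissions
    try:
        return _RES_TO_PROP[resource_permissions]
    except KeyError:
        # dsp-tools raises a BaseError in this situation; ValueError is a stand-in for this sketch.
        raise ValueError(f"Cannot deduce file permissions from '{resource_permissions}'") from None
```

An explicit value always wins; the lookup is only consulted when the `file permissions` column is empty.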
20 changes: 12 additions & 8 deletions docs/dsp-tools-xmlupload.md
Expand Up @@ -201,14 +201,18 @@ To take `KnownUser` as example:

A `<resource>` element contains all necessary information to create a resource. It has the following attributes:

- `label`: a human-readable, preferably meaningful short name of the resource (required)
- `restype`: the resource type as defined within the ontology (required)
- `id`: a unique, arbitrary string providing a unique ID to the resource in order to be referencable by other resources;
the ID is only used during the import process and later replaced by the IRI used internally by DSP (required)
- `permissions`: a reference to a permission set; the permissions will be applied to the created resource (optional)
- `iri`: a custom IRI used when migrating existing resources (optional)
- `ark`: a version 0 ARK used when migrating existing resources from salsah.org to DSP (optional), it is not possible to
use `iri` and `ark` in the same resource. When `ark` is used, it overrides `iri`.
- `label` (required): a human-readable, preferably meaningful short name of the resource
- `restype` (required): the resource type as defined within the ontology
- `id` (required): a unique, arbitrary string providing a unique ID to the resource in order to be referencable by other
resources; the ID is only used during the import process and later replaced by the IRI used internally by DSP
- `permissions` (optional, but if omitted, users who are lower than a `ProjectAdmin` have no permissions at all, not
even view rights): a reference to a permission set; the permissions will be applied to the created resource
- `iri` (optional): a custom IRI, used when migrating existing resources (DaSCH-internal only)
- `ark` (optional): a version 0 ARK, used when migrating existing resources. It is not possible
to use `iri` and `ark` in the same resource. When `ark` is used, it overrides `iri` (DaSCH-internal only).
- `creation_date` (optional): the creation date of the resource, used when migrating existing resources.
It must be formatted according to the constraints of [xsd:dateTimeStamp](https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp),
which means that the timezone is required, e.g.: `2005-10-23T13:45:12.502951+02:00` (DaSCH-internal only)
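
For illustration, a value in the required format can be produced with Python's standard library. This is a sketch, not part of dsp-tools: the attribute values `Example resource`, `:ExampleThing`, and `example_1` are made up, and `xml.etree.ElementTree` is used only to keep the snippet dependency-free.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

# A timezone-aware datetime: xsd:dateTimeStamp requires the offset (+02:00 here).
created = datetime(2005, 10, 23, 13, 45, 12, 502951, tzinfo=timezone(timedelta(hours=2)))
creation_date = created.isoformat()

# Hypothetical attribute values, for illustration only.
resource = ET.Element("resource", {
    "label": "Example resource",
    "restype": ":ExampleThing",
    "id": "example_1",
    "creation_date": creation_date,
})
print(ET.tostring(resource, encoding="unicode"))
```

Note that `isoformat()` on a naive `datetime` (one without `tzinfo`) omits the offset, so the result would not satisfy the timezone requirement.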

A complete `<resource>` element may look as follows:

Expand Down
2 changes: 1 addition & 1 deletion docs/index.md
Expand Up @@ -20,7 +20,7 @@ dsp-tools helps you with the following tasks:
a DSP server and writes it into a JSON file.
- [`dsp-tools xmlupload`](./dsp-tools-usage.md#upload-data-to-a-dsp-server) uploads data from an XML file (bulk
data import) and writes the mapping from internal IDs to IRIs into a local file.
- [`dsp-tools excel`](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files)
- [`dsp-tools excel2lists`](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files)
creates the "lists" section of a JSON project file from one or several Excel files. The resulting section can be
integrated into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2resources`](./dsp-tools-usage.md#create-the-resources-section-of-a-json-project-file-from-an-excel-file)
Expand Down
3 changes: 1 addition & 2 deletions knora/dsplib/models/helpers.py
Expand Up @@ -2,7 +2,6 @@
import sys
from dataclasses import dataclass
from enum import Enum, unique
from traceback import format_exc
from typing import NewType, Optional, Any, Tuple, Union, Pattern

from pystrict import strict
Expand Down Expand Up @@ -63,7 +62,7 @@ def __str__(self) -> str:
Convert to string
:return: stringified error message
"""
return self._message + "\n\n" + format_exc()
Review comment by the PR author:

I don't have a clue what format_exc() was good for in the past, but I know that it causes problems. It was the source of this strange behaviour that I already had in the past: when a BaseError occurs while testing, Python prints an infinitely long stacktrace full of riddles, and then crashes.

I found out that I can just remove format_exc()

return self._message

@property
def message(self) -> str:
Expand Down
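
The effect of this change can be illustrated with a simplified stand-in for `BaseError`. The class `SketchError` and the function `describe_failure` below are hypothetical and exist only for this sketch; the real class has more to it. The point is that `__str__` yields the message alone, while a caller who wants the traceback can still request it explicitly inside an `except` block.

```python
import traceback


class SketchError(Exception):
    """Simplified stand-in for BaseError (hypothetical, for illustration)."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self._message = message

    def __str__(self) -> str:
        # Only the message; no traceback is appended.
        return self._message


def describe_failure() -> tuple[str, str]:
    """Raise and catch the error, returning (short message, full traceback)."""
    try:
        raise SketchError("resource could not be created")
    except SketchError as err:
        # The traceback is still available on demand while the exception is being handled.
        return str(err), traceback.format_exc()
```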
12 changes: 12 additions & 0 deletions knora/dsplib/models/resource.py
Expand Up @@ -71,6 +71,7 @@ class ResourceInstance(Model):
_iri: Optional[str]
_ark: Optional[str]
_version_ark: Optional[str]
_creation_date: Optional[str]
_label: Optional[str]
_permissions: Optional[Permissions]
_user_permission: Optional[PermissionValue]
Expand All @@ -82,6 +83,7 @@ def __init__(self,
iri: Optional[str] = None,
ark: Optional[str] = None,
version_ark: Optional[str] = None,
creation_date: Optional[str] = None,
label: Optional[str] = None,
permissions: Optional[Permissions] = None,
user_permission: Optional[PermissionValue] = None,
Expand All @@ -93,6 +95,7 @@ def __init__(self,
self._iri = iri
self._ark = ark
self._version_ark = version_ark
self._creation_date = creation_date
self._label = label
self._permissions = permissions
self._user_permission = user_permission
Expand Down Expand Up @@ -181,6 +184,10 @@ def iri(self) -> str:
def ark(self) -> str:
return self._ark

@property
def creation_date(self) -> str:
return self._creation_date

@property
def vark(self) -> str:
return self._version_ark
Expand Down Expand Up @@ -286,6 +293,11 @@ def toJsonLdObj(self, action: Actions) -> Any:
tmp[property_name] = value.toJsonLdObj(action)

tmp['@context'] = self.context
if self._creation_date:
tmp['knora-api:creationDate'] = {
'@type': 'xsd:dateTimeStamp',
'@value': self._creation_date
}
return tmp

def create(self) -> 'ResourceInstance':
Expand Down
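
As a rough picture of what the `toJsonLdObj` hunk adds to the request body, the sketch below attaches the same `knora-api:creationDate` object to a plain dictionary. The helper name `add_creation_date` and the sample payload are assumptions made for this example; the real method builds the payload from the resource's own state.

```python
from typing import Any, Optional


def add_creation_date(payload: dict[str, Any], creation_date: Optional[str]) -> dict[str, Any]:
    """Hypothetical helper mirroring the hunk: attach the date only if one was given."""
    result = dict(payload)  # shallow copy, so the caller's dict is left untouched
    if creation_date:
        result["knora-api:creationDate"] = {
            "@type": "xsd:dateTimeStamp",
            "@value": creation_date,
        }
    return result


# Hypothetical payload, for illustration only.
print(add_creation_date({"rdfs:label": "Example"}, "2005-10-23T13:45:12.502951+02:00"))
```

When `creation_date` is `None` or empty, the key is omitted entirely.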
9 changes: 8 additions & 1 deletion knora/dsplib/models/xmlresource.py
Expand Up @@ -2,10 +2,10 @@

from lxml import etree

from knora.dsplib.models.xmlbitstream import XMLBitstream
from knora.dsplib.models.helpers import BaseError
from knora.dsplib.models.permission import Permissions
from knora.dsplib.models.value import KnoraStandoffXml
from knora.dsplib.models.xmlbitstream import XMLBitstream
from knora.dsplib.models.xmlproperty import XMLProperty


Expand All @@ -18,6 +18,7 @@ class XMLResource:
_label: str
_restype: str
_permissions: Optional[str]
_creation_date: Optional[str]
_bitstream: Optional[XMLBitstream]
_properties: list[XMLProperty]

Expand All @@ -35,6 +36,7 @@ def __init__(self, node: etree.Element, default_ontology: str) -> None:
self._id = node.attrib['id']
self._iri = node.attrib.get('iri')
self._ark = node.attrib.get('ark')
self._creation_date = node.attrib.get('creation_date')
self._label = node.attrib['label']
# get the resource type, which has the format namespace:resourcetype, e.g. rosetta:Image
tmp_res_type = node.attrib['restype'].split(':')
Expand Down Expand Up @@ -74,6 +76,11 @@ def ark(self) -> Optional[str]:
"""The custom ARK of the resource"""
return self._ark

@property
def creation_date(self) -> Optional[str]:
"""The creation date of the resource"""
return self._creation_date

@property
def label(self) -> str:
"""The label of the resource"""
Expand Down
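
The difference between `node.attrib['id']` and `node.attrib.get('creation_date')` is what makes the new attribute optional. The sketch below shows this behaviour; it uses `xml.etree.ElementTree` from the standard library rather than `lxml` to stay self-contained (the `attrib` mapping behaves the same way for this purpose), and the two sample elements are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample elements, for illustration only.
with_date = ET.fromstring(
    '<resource label="A" restype=":Thing" id="a" creation_date="2005-10-23T13:45:12.502951+02:00"/>'
)
without_date = ET.fromstring('<resource label="B" restype=":Thing" id="b"/>')

# .get() returns None for a missing attribute instead of raising KeyError.
print(with_date.attrib.get("creation_date"))
print(without_date.attrib.get("creation_date"))
```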
12 changes: 11 additions & 1 deletion knora/dsplib/schemas/data.xsd
Expand Up @@ -410,9 +410,10 @@
<xs:attribute name="label" type="xs:string" use="required"/>
<xs:attribute name="restype" type="xs:string" use="required"/>
<xs:attribute name="id" type="xs:ID" use="required"/>
<xs:attribute name="iri" type="xs:string" use="optional"/>
<xs:attribute name="permissions" type="xs:NCName" use="optional"/>
<xs:attribute name="iri" type="xs:string" use="optional"/>
<xs:attribute name="ark" type="xs:string" use="optional"/>
<xs:attribute name="creation_date" type="xs:dateTime" use="optional"/>
</xs:complexType>

<!-- annotation tag -->
Expand All @@ -424,6 +425,9 @@
<xs:attribute name="label" type="xs:string" use="required"/>
<xs:attribute name="id" type="xs:ID" use="required"/>
<xs:attribute name="permissions" type="xs:NCName" use="optional"/>
<xs:attribute name="iri" type="xs:string" use="optional"/>
<xs:attribute name="ark" type="xs:string" use="optional"/>
<xs:attribute name="creation_date" type="xs:dateTime" use="optional"/>
</xs:complexType>

<!-- region tag -->
Expand All @@ -437,6 +441,9 @@
<xs:attribute name="label" type="xs:string" use="required"/>
<xs:attribute name="id" type="xs:ID" use="required"/>
<xs:attribute name="permissions" type="xs:NCName" use="optional"/>
<xs:attribute name="iri" type="xs:string" use="optional"/>
<xs:attribute name="ark" type="xs:string" use="optional"/>
<xs:attribute name="creation_date" type="xs:dateTime" use="optional"/>
</xs:complexType>

<!-- link tag -->
Expand All @@ -448,6 +455,9 @@
<xs:attribute name="label" type="xs:string" use="required"/>
<xs:attribute name="id" type="xs:ID" use="required"/>
<xs:attribute name="permissions" type="xs:NCName" use="optional"/>
<xs:attribute name="iri" type="xs:string" use="optional"/>
<xs:attribute name="ark" type="xs:string" use="optional"/>
<xs:attribute name="creation_date" type="xs:dateTime" use="optional"/>
</xs:complexType>

<!-- data type for knora shortcode -->
Expand Down
22 changes: 22 additions & 0 deletions knora/dsplib/utils/validation.py
@@ -0,0 +1,22 @@
import regex

from knora.dsplib.models.helpers import BaseError


def validate_resource_creation_date(creation_date: str, err_msg: str) -> None:
"""
Checks if creation_date is a valid https://www.w3.org/TR/xmlschema11-2/#dateTimeStamp.

Args:
creation_date: the attribute "creation_date" from the <resource> tag in the XML
err_msg: the message of the BaseError that is raised if validation fails

Raises:
BaseError: if creation_date is not a valid dateTimeStamp
"""
_regex = r"-?([1-9][0-9]{3,}|0[0-9]{3})" \
r"-(0[1-9]|1[0-2])" \
r"-(0[1-9]|[12][0-9]|3[01])" \
r"T(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]+)?|(24:00:00(\.0+)?))" \
r"(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))"
if not regex.search(_regex, creation_date):
raise BaseError(err_msg)
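
One observation on this validator: `search` accepts a match anywhere in the string, so a value with stray leading or trailing characters would appear to pass. If whole-string validation is intended (an assumption on my part), `fullmatch` would express that. The sketch below reuses the same pattern with the standard-library `re` module instead of `regex`; the names `DATE_TIME_STAMP` and `is_date_time_stamp` are hypothetical.

```python
import re

# Same pattern as in validation.py, compiled with the standard library.
DATE_TIME_STAMP = re.compile(
    r"-?([1-9][0-9]{3,}|0[0-9]{3})"
    r"-(0[1-9]|1[0-2])"
    r"-(0[1-9]|[12][0-9]|3[01])"
    r"T(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]+)?|(24:00:00(\.0+)?))"
    r"(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))"
)


def is_date_time_stamp(value: str) -> bool:
    """Hypothetical helper: True only if the whole string matches the pattern."""
    return DATE_TIME_STAMP.fullmatch(value) is not None
```

The XSD hunk declares the attribute as `xs:dateTime`, a type in which the timezone is optional, so a check of this kind seems to be what enforces the timezone requirement stated in the documentation.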