docs: clarify docs of onto creation (DEV-1164) (#225)
jnussbaum committed Sep 9, 2022
1 parent 575f974 commit f64d2cf
Showing 7 changed files with 343 additions and 337 deletions.
468 changes: 250 additions & 218 deletions docs/dsp-tools-create-ontologies.md

Large diffs are not rendered by default.

28 changes: 15 additions & 13 deletions docs/dsp-tools-create.md
@@ -1,20 +1,22 @@
[![PyPI version](https://badge.fury.io/py/dsp-tools.svg)](https://badge.fury.io/py/dsp-tools)

# JSON data model definition format
# JSON project definition format

This document describes the structure of a data model (ontology) used by DSP. According to Wikipedia,
the [data model](https://en.wikipedia.org/wiki/Data_model) is "an abstract model that organizes elements of data and
standardizes how they relate to one another and to the properties of real-world entities. [...] A data model explicitly
determines the structure of data. Data models are typically specified by a data specialist, data librarian, or a digital
humanities scholar in a data modeling notation". The following sections describe the notation for ontologies in the
context of DSP.
This document describes the structure of a JSON project definition file that can be uploaded to a DSP server. The
command to do so is [documented here](./dsp-tools-usage.md#create-a-project-on-a-dsp-server).

A data model as described in this document can be uploaded to a DSP server. The command to do so is described
[here](./dsp-tools-usage.md#create-a-data-model-on-a-dsp-server).
A project on a DSP server is like a container for data. It defines some basic metadata, the data model(s) and optionally
the user(s) who will be able to access the data. After the creation of a project, data can be uploaded that conforms
with the data model(s).

This documentation is divided into two parts:

- Overview of the project description file (this page)
- The "ontologies" section [explained in detail](./dsp-tools-create-ontologies.md)

## A short overview

A complete data model definition for DSP looks like this:
A complete project definition looks like this:

```json
{
@@ -444,15 +446,15 @@ The `users` element is optional. If not used, it should be omitted.

`"ontologies": [<ontology-definition>, <ontology-definition>, ...]`

Inside the `ontologies` section all resources and properties are described. A project may have multiple ontologies. It
requires the following data fields:
Inside the `ontologies` section, all resource classes and properties are defined. A project may have multiple
ontologies. It requires the following fields:

- `name`
- `label`
- `properties`
- `resources`
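
The four required fields listed above can be checked before a project definition is uploaded. The following is a minimal sketch, not part of dsp-tools: the function name `missing_ontology_fields` and the sample values are hypothetical, and the real tool may validate the file differently.

```python
# Required fields of one entry in the "ontologies" array, as listed above.
REQUIRED_ONTOLOGY_FIELDS = ("name", "label", "properties", "resources")


def missing_ontology_fields(ontology: dict) -> list[str]:
    """Return the required fields that are absent from one ontology definition."""
    return [field for field in REQUIRED_ONTOLOGY_FIELDS if field not in ontology]


# Hypothetical ontology definition that lacks the "resources" field.
incomplete = {"name": "my-onto", "label": "My ontology", "properties": []}
print(missing_ontology_fields(incomplete))
```

An empty result means that all four fields are present; it says nothing about whether their contents are valid.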

A detailed description of `ontologies` can be found [here](dsp-tools-create-ontologies.md)
The `ontologies` section is [documented here](./dsp-tools-create-ontologies.md).

## Fully fleshed out example ontology

2 changes: 1 addition & 1 deletion docs/dsp-tools-excel.md
@@ -104,7 +104,7 @@ Only Excel files with file extension `.xlsx` are considered. All Excel files hav
When calling the `excel` command, this folder is provided as an argument to the call. The language of the labels has
to be provided in the Excel file's file name after an underline and before the file extension, e.g.
`Beschreibung_de.xlsx` would be considered a list with German (`de`) labels, `description_en.xlsx` a list with
English (`en`) labels. The language has to be one of {de, en, fr, it}.
English (`en`) labels. The language has to be one of {de, en, fr, it, rm}.
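
The naming rule can be illustrated with a short sketch that reads the language code from a file name. This is an assumption-laden illustration, not the code dsp-tools uses: the helper `language_from_filename` is hypothetical, and the set of languages is taken from the sentence above.

```python
from pathlib import Path

# Language codes allowed according to the paragraph above.
ALLOWED_LANGUAGES = {"de", "en", "fr", "it", "rm"}


def language_from_filename(filename: str) -> str:
    """Return the language code of an Excel file name such as 'description_en.xlsx'."""
    path = Path(filename)
    if path.suffix != ".xlsx":
        raise ValueError(f"not an .xlsx file: {filename}")
    # The language code sits between the last underscore and the file extension.
    _name, separator, language = path.stem.rpartition("_")
    if not separator or language not in ALLOWED_LANGUAGES:
        raise ValueError(f"no valid language code in file name: {filename}")
    return language


print(language_from_filename("Beschreibung_de.xlsx"))
print(language_from_filename("description_en.xlsx"))
```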

The following example shows how to create a JSON list from two Excel files which are in a directory called `listfolder`.
The output is written to the file `list.json`.
42 changes: 18 additions & 24 deletions docs/dsp-tools-usage.md
@@ -24,10 +24,10 @@ pip3 install --upgrade dsp-tools



## Create a data model on a DSP server
## Create a project on a DSP server

```bash
dsp-tools create [options] data_model_definition.json
dsp-tools create [options] project_definition.json
```

The following options are available:
@@ -40,21 +40,20 @@ The following options are available:
- `-v` | `--verbose`: If set, more information about the progress is printed to the console.
- `-d` | `--dump`: If set, dump test files for DSP-API requests.

The command is used to read the definition of a data model (provided in a JSON file) and create it on the DSP server.
The following example shows how to load the ontology defined in `data_model_definition.json` onto the DSP
server `https://api.dsl.server.org` provided with the `-s` option. The username `root@example.com` and the password
`test` are used.
The command is used to read the definition of a project with its data model(s) (provided in a JSON file) and create it
on the DSP server. The following example shows how to upload the project defined in `project_definition.json` to the DSP
server `https://admin.dasch.swiss`:

```bash
dsp-tools create -s https://api.dsl.server.org -u root@example.com -p test data_model_definition.json
dsp-tools create -s https://api.dasch.swiss -u root@example.com -p test project_definition.json
```

The description of the expected JSON format can be found [here](./dsp-tools-create.md).
The expected JSON format is [documented here](./dsp-tools-create.md).




## Get a data model from a DSP server
## Get a project from a DSP server

```bash
dsp-tools get [options] output_file.json
@@ -69,16 +68,15 @@ The following options are available:
[IRI](https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier) of the project (mandatory)
- `-v` | `--verbose`: If set, some information about the progress is printed to the console.

The command is used to get the definition of a data model from a DSP server and write it into a JSON file. This JSON
file could then be used to upload the data model to another DSP server. The following example shows how to get the data
model from a DSP server `https://test.dasch.swiss` provided with the `-s` option. The username `root@example.com` and
the password `test` are used. The data model is saved into the output file `output_file.json`.
The command is used to get the definition of a project with its data model(s) from a DSP server and write it into a JSON
file. This JSON file can then be used to create the same project on another DSP server. The following example shows how
to get a project from the DSP server `https://admin.dasch.swiss`.

```bash
dsp-tools get -s https://api.test.dasch.swiss -u root@example.com -p test -P my_project output_file.json
dsp-tools get -s https://api.dasch.swiss -u root@example.com -p test -P my_project output_file.json
```

The description of the JSON format can be found [here](./dsp-tools-create.md).
The expected JSON format is [documented here](./dsp-tools-create.md).



@@ -101,16 +99,13 @@ The following options are available:
- `-v` | `--verbose`: If set, more information about the uploaded resources is printed to the console.

The command is used to upload data defined in an XML file onto a DSP server. The following example shows how to upload
data from an XML file `xml_data_file.xml` onto the DSP server `https://api.dsl.server.org` provided with the `-s`
option. The username `root@example.com` and the password `test` are used. The interface for the SIPI IIIF server is
provided with the `-S`
option (`https://iiif.dsl.server.org`).
data from the XML file `xml_data_file.xml` to the DSP server `https://admin.dasch.swiss`:

```bash
dsp-tools xmlupload -s https://api.dsl.server.org -u root@example.com -p test -S https://iiif.dsl.server.org xml_data_file.xml
dsp-tools xmlupload -s https://api.dasch.swiss -u root@example.com -p test -S https://iiif.dasch.swiss xml_data_file.xml
```

The description of the expected XML format can be found [here](./dsp-tools-xmlupload.md).
The expected XML format is [documented here](./dsp-tools-xmlupload.md).

An internal ID is used in the `<resptr>` tag of an XML file to reference resources inside the same XML file. Once data
is uploaded to DSP, it cannot be referenced by this internal ID anymore. Instead, the resource's IRI has to be used.
@@ -144,9 +139,8 @@ The following example shows how to create a JSON list from Excel files in a dire
dsp-tools excel lists list.json
```

The description of the expected Excel format can be found [here](./dsp-tools-create.md#lists-from-excel). More
information about the usage of this command can be
found [here](./dsp-tools-excel.md#create-a-list-from-one-or-several-excel-files).
The expected Excel format is [documented here](./dsp-tools-create.md#lists-from-excel). More information about the usage
of this command can be found [here](./dsp-tools-excel.md#create-a-list-from-one-or-several-excel-files).



120 changes: 49 additions & 71 deletions docs/dsp-tools-xmlupload.md
@@ -244,7 +244,6 @@ The following property elements exist:
- `<geometry-prop>`: contains JSON geometry definitions for a region
- `<geoname-prop>`: contains [geonames.org](https://www.geonames.org/) location codes
- `<list-prop>`: contains list element labels
- `<iconclass-prop>`: contains [iconclass.org](http://iconclass.org/) codes (not yet implemented)
- `<integer-prop>`: contains integer values
- `<interval-prop>`: contains interval values
- `<period-prop>`: contains time period values (not yet implemented)
@@ -268,13 +267,13 @@ Note:
Supported file extensions:

| Representation | Supported formats |
| --------------------------- |----------------------------------------|
| `ArchiveRepresentation` | ZIP, TAR, GZ, Z, TAR.GZ, TGZ, GZIP, 7Z |
| `AudioRepresentation` | MP3, MP4, WAV |
| `DocumentRepresentation` | PDF, DOC, DOCX, XLS, XLSX, PPT, PPTX |
| `MovingImageRepresentation` | MP4 |
| `StillImageRepresentation` | JPG, JPEG, PNG, TIF, TIFF, JP2 |
| `TextRepresentation` | TXT, CSV, XML, XSL, XSD |
|-----------------------------|----------------------------------------|
| `ArchiveRepresentation` | ZIP, TAR, GZ, Z, TAR.GZ, TGZ, GZIP, 7Z |
| `AudioRepresentation` | MP3, MP4, WAV |
| `DocumentRepresentation` | PDF, DOC, DOCX, XLS, XLSX, PPT, PPTX |
| `MovingImageRepresentation` | MP4 |
| `StillImageRepresentation` | JPG, JPEG, PNG, TIF, TIFF, JP2 |
| `TextRepresentation` | TXT, CSV, XML, XSL, XSD |
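
The table can also be read as a lookup from file extension to representation class. The sketch below only mirrors the table; the names `SUPPORTED_FORMATS` and `representations_for` are hypothetical, and the way dsp-tools itself checks file extensions is not shown in this document.

```python
# Extensions per representation, copied from the table above (lower-cased).
SUPPORTED_FORMATS = {
    "ArchiveRepresentation": {"zip", "tar", "gz", "z", "tar.gz", "tgz", "gzip", "7z"},
    "AudioRepresentation": {"mp3", "mp4", "wav"},
    "DocumentRepresentation": {"pdf", "doc", "docx", "xls", "xlsx", "ppt", "pptx"},
    "MovingImageRepresentation": {"mp4"},
    "StillImageRepresentation": {"jpg", "jpeg", "png", "tif", "tiff", "jp2"},
    "TextRepresentation": {"txt", "csv", "xml", "xsl", "xsd"},
}


def representations_for(filename: str) -> list[str]:
    """Return all representation classes whose formats match the file name."""
    name = filename.lower()
    return sorted(
        representation
        for representation, extensions in SUPPORTED_FORMATS.items()
        if any(name.endswith("." + extension) for extension in extensions)
    )


print(representations_for("scan.TIFF"))
print(representations_for("clip.mp4"))
```

Note that MP4 is listed for both `AudioRepresentation` and `MovingImageRepresentation`, so the lookup returns a list rather than a single class.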

For more details, please consult the [API docs](https://docs.dasch.swiss/latest/DSP-API/01-introduction/file-formats/).

@@ -373,8 +372,17 @@ calendar:epoch:yyyy-mm-dd:epoch:yyyy-mm-dd
- `mm`: month with two digits (optional, e.g. 01, 02, ..., 12)
- `dd`: day with two digits (optional, e.g. 01, 02, ..., 31)

If two dates are provided, the date is defined as range between the two dates. If the day is omitted, then the precision
it _month_, if also the month is omitted, the precision is _year_.
Notes:

- If the day is omitted, the precision is month; if the month is also omitted, the precision is year.
- Internally, a date is always represented as a start and end date.
- If start and end date match, it's an exact date.
- If start and end date don't match, it's a range.
- If the end date is omitted, it's a range from the earliest possible beginning of the start date to the latest possible
end of the start date. For example:
- "1893" will be expanded to a range from January 1st 1893 to December 31st 1893.
- "1893-01" will be expanded to a range from January 1st 1893 to January 31st 1893.
- "1893-01-01" will be expanded to the exact date January 1st 1893 to January 1st 1893 (technically also a range).
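
The expansion rule from the notes above can be sketched as follows. This is a simplified illustration, not the DSP implementation: the function `expand_date` is hypothetical, it handles only the `yyyy-mm-dd` part of a single date (no calendar or epoch prefix), and it relies on Python's `datetime.date`, which always uses the proleptic Gregorian calendar.

```python
import calendar
from datetime import date


def expand_date(value: str) -> tuple[date, date]:
    """Expand 'yyyy', 'yyyy-mm' or 'yyyy-mm-dd' to a (start, end) pair of dates."""
    parts = [int(part) for part in value.split("-")]
    year = parts[0]
    if len(parts) == 1:
        # Year precision: the whole year.
        return date(year, 1, 1), date(year, 12, 31)
    month = parts[1]
    if len(parts) == 2:
        # Month precision: first to last day of the month.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    # Day precision: start and end date are identical.
    exact = date(year, month, parts[2])
    return exact, exact


for text in ("1893", "1893-01", "1893-01-01"):
    start, end = expand_date(text)
    print(text, "->", start.isoformat(), "to", end.isoformat())
```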

Attributes:

@@ -510,21 +518,18 @@ Example (city of Vienna):
```


### &lt;list-prop&gt;
### &lt;integer-prop&gt;

The `<list-prop>` element is used as entry point into a list (list node). List nodes are identified by their `name`
attribute that was given when creating the list nodes (which must be unique within each list!). It must contain at least
one `<list>` element.
The `<integer-prop>` element is used for integer values. It must contain at least one `<integer>` element.

Attributes:

- `name`: name of the property as defined in the ontology (required)
- `list`: name of the list as defined in the ontology (required)


#### &lt;list&gt;
#### &lt;integer&gt;

The `<list>` element references a node in a (pull-down or hierarchical) list.
The `<integer>` element contains an integer value.

Attributes:

@@ -534,55 +539,27 @@ Attributes:
Example:

```xml
<list-prop list="category" name=":hasCategory">
<list>physics</list>
</list-prop>
```


### &lt;iconclass-prop&gt; (_not yet implemented_)

The `<iconclass-prop>` element is used for [iconclass.org](http://iconclass.org) ID. It must contain at least one
`<iconclass>` element.

For example: `92E112` stands
for `(story of) Aurora (Eos); 'Aurora' (Ripa) - infancy, upbringing Aurora · Ripa · air · ancient history · child · classical antiquity · goddess · gods · heaven · history · infancy · mythology · sky · upbringing · youth`

Attributes:

- `name`: name of the property as defined in the ontology (required)


#### &lt;iconclass&gt; (_not yet implemented_)

References an [iconclass.org](https://iconclass.org) ID.

Attributes:

- `permissions`: Permission ID (optional, but if omitted, users who are lower than a `ProjectAdmin` have no permissions at all, not even view rights)
- `comment`: a comment for this specific value (optional)

Usage:

```xml
<iconclass-prop name=":hasIcon">
<iconclass>92E112</iconclass>
</iconclass-prop>
<integer-prop name=":hasInteger">
<integer>4711</integer>
</integer-prop>
```


### &lt;integer-prop&gt;
### &lt;interval-prop&gt;

The `<integer-prop>` element is used for integer values. It must contain at least one `<integer>` element.
The `<interval-prop>` element is used for intervals with a start and an end point on a timeline, e.g. relative to the beginning of an audio or video file.
An `<interval-prop>` must contain at least one `<interval>` element.

Attributes:

- `name`: name of the property as defined in the ontology (required)


#### &lt;integer&gt;
#### &lt;interval&gt;

The `<integer>` element contains an integer value.
A time interval is represented by plain decimal numbers (=seconds), without a special notation for minutes and hours.
The `<interval>` element contains two decimals separated by a colon (`:`). The places before the decimal point are
seconds, and the places after the decimal point are fractions of a second.

Attributes:

@@ -592,27 +569,28 @@ Attributes:
Example:

```xml
<integer-prop name=":hasInteger">
<integer>4711</integer>
</integer-prop>
<interval-prop name=":hasInterval">
<interval>60.5:120.5</interval> <!-- 0:01:00.5 - 0:02:00.5 -->
<interval>61:3600</interval> <!-- 0:01:01 - 1:00:00 -->
</interval-prop>
```
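
To make the interval notation concrete, the following sketch splits an `<interval>` value into seconds and renders each boundary as `h:mm:ss.s`, in the style of the XML comments in the example. Both helpers, `parse_interval` and `format_seconds`, are hypothetical and not part of dsp-tools.

```python
def parse_interval(text: str) -> tuple[float, float]:
    """Split an interval such as '60.5:120.5' into start and end, both in seconds."""
    start_text, end_text = text.split(":")
    return float(start_text), float(end_text)


def format_seconds(seconds: float) -> str:
    """Render a number of seconds as h:mm:ss.s for human readers."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{int(hours)}:{int(minutes):02d}:{secs:04.1f}"


start, end = parse_interval("60.5:120.5")
print(format_seconds(start), "-", format_seconds(end))
```

The sketch does not check that the start lies before the end; whether DSP enforces this is not stated here.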


### &lt;interval-prop&gt;
### &lt;list-prop&gt;

The `<interval-prop>` element is used for intervals with a start and an end point on a timeline, e.g. relative to the beginning of an audio or video file.
An `<interval-prop>` must contain at least one `<interval>` element.
The `<list-prop>` element is used as entry point into a list (list node). List nodes are identified by their `name`
attribute that was given when creating the list nodes (which must be unique within each list!). It must contain at least
one `<list>` element.

Attributes:

- `name`: name of the property as defined in the ontology (required)
- `list`: name of the list as defined in the ontology (required)


#### &lt;interval&gt;
#### &lt;list&gt;

A time interval is represented by plain decimal numbers (=seconds), without a special notation for minutes and hours.
The `<interval>` element contains two decimals separated by a colon (`:`). The places before the decimal point are
seconds, and the places after the decimal point are fractions of a second.
The `<list>` element references a node in a (pull-down or hierarchical) list.

Attributes:

@@ -622,10 +600,9 @@ Attributes:
Example:

```xml
<interval-prop name=":hasInterval">
<interval>60.5:120.5</interval> <!-- 0:01:00.5 - 0:02:00.5 -->
<interval>61:3600</interval> <!-- 0:01:01 - 1:00:00 -->
</interval-prop>
<list-prop list="category" name=":hasCategory">
<list>physics</list>
</list-prop>
```


@@ -721,7 +698,7 @@ conform to the special format `IRI:[res-id]:IRI` where [res-id] is the resource

### &lt;time-prop&gt;

The `<time-prop>` element is used for time values. It must contain at least one `<time>` element.
The `<time-prop>` element is used for time values in the Gregorian calendar. It must contain at least one `<time>` element.

Attributes:

@@ -734,7 +711,8 @@ The `<time>` element represents an exact datetime value in the form of `yyyy-mm-
following abbreviations describe this form:

- `yyyy`: a four-digit numeral that represents the year. The value cannot start with a minus (-) or a plus (+) sign.
0001 is the lexical representation of the year 1 of the Common Era (also known as 1 AD). The value cannot be 0000.
0001 is the lexical representation of the year 1 of the Common Era (also known as 1 AD). The value cannot be 0000. The
calendar is always the Gregorian calendar.
- `mm`: a two-digit numeral that represents the month
- `dd`: a two-digit numeral that represents the day
- `hh`: a two-digit numeral representing the hours. Must be between 0 and 23