docs(excel2json): use rosetta as example data (DEV-1478) #254

Merged
merged 9 commits into from Nov 16, 2022
Binary file added docs/assets/images/img-excel2xml-closeup.png
Binary file modified docs/assets/images/img-properties-example.png
Binary file modified docs/assets/images/img-resources-example-1.png
Binary file modified docs/assets/images/img-resources-example-2.png
10 changes: 5 additions & 5 deletions docs/dsp-tools-excel2json.md
@@ -1,6 +1,6 @@
[![PyPI version](https://badge.fury.io/py/dsp-tools.svg)](https://badge.fury.io/py/dsp-tools)

# `excel2json`: Create a data model (JSON project file) from Excel
# excel2json

With dsp-tools, a JSON project file can be created from Excel files. The command for this is documented
[here](./dsp-tools-usage.md#create-a-json-project-file-from-excel-files).
@@ -57,8 +57,8 @@ this is documented [here](./dsp-tools-usage.md#create-the-resources-section-of-a
Only `XLSX` files are allowed. The `resources` section can be inserted into the ontology file and then be uploaded onto
a DSP server.

**An Excel file template can be found [here](assets/data_model_templates/onto_name (onto_label)/resources.xlsx). It is recommended to work from
the template.**
**An Excel file template can be found [here](assets/data_model_templates/rosetta (rosetta)/resources.xlsx). It is recommended to work from
the template.**

The expected worksheets of the Excel file are:

@@ -99,7 +99,7 @@ this is documented [here](./dsp-tools-usage.md#create-the-properties-section-of-
Only the first worksheet of the Excel file is considered and only XLSX files are allowed. The `properties` section can
be inserted into the ontology file and then be uploaded onto a DSP server.

**An Excel file template can be found [here](assets/data_model_templates/onto_name (onto_label)/properties.xlsx). It is recommended to work
**An Excel file template can be found [here](assets/data_model_templates/rosetta (rosetta)/properties.xlsx). It is recommended to work
from the template.**

The Excel sheet must have the following structure:
@@ -127,7 +127,7 @@ For further information about properties, see [here](./dsp-tools-create-ontologi

## "lists" section

With dsp-tools, the "lists" section of a JSON project file can be created from one or several Excel files. The lists can
With dsp-tools, the `lists` section of a JSON project file can be created from one or several Excel files. The lists can
then be inserted into a JSON project file and uploaded to a DSP server. The command for this is documented
[here](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files).

28 changes: 23 additions & 5 deletions docs/dsp-tools-excel2xml.md
@@ -1,14 +1,32 @@
[![PyPI version](https://badge.fury.io/py/dsp-tools.svg)](https://badge.fury.io/py/dsp-tools)

# Module `excel2xml`: Convert a data source to XML
# excel2xml

This page is about the module `excel2xml` that can be imported into a custom Python script that transforms any tabular
data into an XML.
## Two use cases - two approaches

There is also a CLI command `dsp-tools excel2xml` that creates an XML file from an Excel/CSV file which is already
structured according to the DSP specifications. The CLI command is documented
There are two kinds of Excel files that can be transformed into an XML file:

| structure | provenance | tool | example screenshot |
|------------------|-------------|--------------------------|----------------------------------------------------------|
| custom structure | customer | module `excel2xml` | ![](./assets/images/img-excel2xml-raw-data-category.png) |
| DSP structure | DSP server | CLI command `excel2xml` | ![](./assets/images/img-excel2xml-closeup.png) |

The first use case is the most frequent: The DaSCH receives a data export from a research project. Every project uses
different software, so every project will deliver their data in a different structure. The screenshot is just a
simplified example. For this use case, it is necessary to write a Python script that transforms the data from an
undefined state X into a DSP-conforming XML file that can be uploaded with `dsp-tools xmlupload`. For this, you need to
import the module `excel2xml` into your Python script.
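
The general shape of such a transformation script can be sketched as follows. This is only an illustration under stated assumptions: it uses the Python standard library instead of the helper functions of the `excel2xml` module, the column names `inventory_no` and `title` are an invented customer export, the shortcode `0123`, the ontology name `import`, the resource type `:Object` and the property `:hasTitle` are placeholders, and the element layout is a simplified approximation of the DSP XML format that has not been validated against the schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical customer export: the column names are invented for this sketch.
RAW_EXPORT = """inventory_no,title
A-001,Rosetta Stone
A-002,Book of Kells
"""


def rows_to_xml(raw_csv: str, shortcode: str, ontology: str) -> ET.Element:
    """Map every row of a custom-structured table to one <resource> element."""
    # Root element with placeholder attributes (assumed layout, not schema-validated).
    root = ET.Element("knora", {"shortcode": shortcode, "default-ontology": ontology})
    for row in csv.DictReader(io.StringIO(raw_csv)):
        resource = ET.SubElement(
            root,
            "resource",
            {"label": row["title"], "restype": ":Object", "id": row["inventory_no"]},
        )
        # One text property per resource; ":hasTitle" is a placeholder property name.
        prop = ET.SubElement(resource, "text-prop", {"name": ":hasTitle"})
        text = ET.SubElement(prop, "text", {"encoding": "utf8"})
        text.text = row["title"]
    return root


root = rows_to_xml(RAW_EXPORT, shortcode="0123", ontology="import")
xml_string = ET.tostring(root, encoding="unicode")
```

In a real import script, the per-row mapping is where the project-specific logic lives (cleaning values, splitting cells, resolving list nodes), and the resulting file should be produced and checked with the `excel2xml` module before it is passed to `dsp-tools xmlupload`.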

The second use case is less frequent: We migrate data DaSCH-internally from one server to another. In this case, the
data already has the correct structure, and can automatically be transformed to XML. This can be done with the CLI
command `dsp-tools excel2xml` which is documented
[here](./dsp-tools-usage.md#use-the-module-excel2xml-to-convert-a-data-source-to-xml).

**This page is about the module `excel2xml`.**


## Module `excel2xml`: Convert a data source to XML

To demonstrate the usage of the `excel2xml` module, there is a GitHub repository named `0123-import-scripts`. It
contains:

5 changes: 3 additions & 2 deletions docs/dsp-tools-xmlupload.md
@@ -266,8 +266,9 @@ Notes:

- There is only _one_ `<bitstream>` element allowed per representation.
- The `<bitstream>` element must be the first element.
- The path is relative to the working directory where `dsp-tools xmlupload` is executed in. It is recommended to
choose the project folder as working directory, `my_project` in the example below:
- By default, the path is relative to the working directory in which `dsp-tools xmlupload` is executed. This behaviour
can be modified with the flag [`--imgdir`](./dsp-tools-usage.md#upload-data-to-a-dsp-server). If you keep the default,
it is recommended to choose the project folder as working directory, `my_project` in the example below:

```
my_project
28 changes: 14 additions & 14 deletions docs/index.md
@@ -2,7 +2,7 @@

# DSP-TOOLS documentation

dsp-tools is a command line tool that helps you to interact with the DaSCH Service Platform server (DSP server).
dsp-tools is a command line tool that helps you to interact with a DaSCH Service Platform (DSP) server.

In order to archive your data on the DaSCH Service Platform, you need a data model (ontology) that describes your data.
The data model is defined in a JSON project definition file which has to be transmitted to the DSP server. If the DSP
@@ -22,21 +22,21 @@ dsp-tools helps you with the following tasks:
data import) and writes the mapping from internal IDs to IRIs into a local file.
- [`dsp-tools excel2json`](./dsp-tools-usage.md#create-a-json-project-file-from-excel-files) creates an entire JSON
project file from a folder with Excel files in it.
- [`dsp-tools excel2lists`](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files)
creates the "lists" section of a JSON project file from one or several Excel files. The resulting section can be
integrated into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2resources`](./dsp-tools-usage.md#create-the-resources-section-of-a-json-project-file-from-an-excel-file)
creates the "resources" section of a JSON project file from an Excel file. The resulting section can be integrated
into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2properties`](./dsp-tools-usage.md#create-the-properties-section-of-a-json-project-file-from-an-excel-file)
creates the "properties" section of a JSON project file from an Excel file. The resulting section can be integrated
into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools id2iri`](./dsp-tools-usage.md#replace-internal-ids-with-iris-in-xml-file)
takes an XML file for bulk data import and replaces referenced internal IDs with IRIs. The mapping has to be provided
with a JSON file.
- [`dsp-tools excel2lists`](./dsp-tools-usage.md#create-the-lists-section-of-a-json-project-file-from-excel-files)
creates the "lists" section of a JSON project file from one or several Excel files. The resulting section can be
integrated into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2resources`](./dsp-tools-usage.md#create-the-resources-section-of-a-json-project-file-from-an-excel-file)
creates the "resources" section of a JSON project file from an Excel file. The resulting section can be integrated
into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2properties`](./dsp-tools-usage.md#create-the-properties-section-of-a-json-project-file-from-an-excel-file)
creates the "properties" section of a JSON project file from an Excel file. The resulting section can be integrated
into a JSON project file and then be uploaded to a DSP server with `dsp-tools create`.
- [`dsp-tools excel2xml`](./dsp-tools-usage.md#create-an-xml-file-from-excelcsv) transforms a data source to XML if it
is already structured according to the DSP specifications.
- [The module excel2xml](./dsp-tools-usage.md#use-the-module-excel2xml-to-convert-a-data-source-to-xml) provides helper
- [The module `excel2xml`](./dsp-tools-usage.md#use-the-module-excel2xml-to-convert-a-data-source-to-xml) provides helper
methods that can be used in a Python script to convert data from a tabular format into XML.
- [`dsp-tools id2iri`](./dsp-tools-usage.md#replace-internal-ids-with-iris-in-xml-file)
takes an XML file for bulk data import and replaces referenced internal IDs with IRIs. The mapping has to be provided
with a JSON file.
- [`dsp-tools start-api / stop-api / start-app`](./dsp-tools-usage.md#start-a-dsp-stack-on-your-local-machine-for-dasch-internal-use-only)
assist you in running a DSP software stack on your local machine.