docs: improve docs (DEV-1478) #249

Merged
merged 5 commits into from Nov 9, 2022
17 changes: 17 additions & 0 deletions README.md
@@ -1,10 +1,13 @@
[![PyPI version](https://badge.fury.io/py/dsp-tools.svg)](https://badge.fury.io/py/dsp-tools)

# DSP-TOOLS - DaSCH Service Platform Tools

dsp-tools is a command line tool that helps you interact with the DaSCH Service Platform API.
Go to [Full Documentation](https://docs.dasch.swiss/latest/DSP-TOOLS)


## Information for developers

There is a `Makefile` for all the following tasks (and more). Type `make` to print the available targets.

For a quick start, use:
@@ -22,7 +25,9 @@ make install-requirements
make install
```


## Pipenv

We use pipenv for our dependency management. There are two ways to get started:
- `pipenv install --dev` installs all dependencies, while giving them the opportunity to update themselves
- `pipenv install --ignore-pipfile` is used to get a deterministic build in production
@@ -54,7 +59,9 @@ For security reasons, the maintainer regularly executes
without pipenv, you can freeze your requirements with `pip3 freeze > requirements.txt` and update `setup.py`
manually.


### Pipenv setup in PyCharm

- Go to Add Interpreter > Pipenv Environment
- Base Interpreter: PyCharm auto-detects one of your system-wide installed Pythons as base interpreter.
- Pipenv executable: auto-detected
@@ -63,15 +70,19 @@ manually.
If you already initialized a pipenv-environment via command line, you can add its interpreter in PyCharm,
but this will create the pipenv-environment again.


## Testing

Please note that testing requires launching the complete DSP API stack, which is based on Docker images.
Therefore, we recommend installing the [docker desktop client](https://www.docker.com/products).
To run the complete test suite:
```bash
make test
```


## Code style

When contributing to the project please make sure you use the same code style rules as we do. We use
[autopep8](https://pypi.org/project/autopep8/) and [mypy](https://pypi.org/project/mypy/). The
configuration is defined in `pyproject.toml` in the root directory of the project.
@@ -89,7 +100,9 @@ In VSCode, both mypy and autopep8 can be set up as default linter and formatter

For formatting Markdown files (*.md) we use the default styling configuration provided by PyCharm.


## Publishing

Publishing is automated with GitHub Actions and should _not_ be done manually. Please follow the
[Pull Request Guidelines](https://docs.dasch.swiss/latest/developers/dsp/contribution/#pull-request-guidelines). If done
correctly, when merging a pull request into `main`, the `release-please` action will create or update a pull request for
@@ -99,7 +112,9 @@ create a release on GitHub, on PyPI and the docs.

Please ensure you have only one pull request per feature.


## Publishing manually

Publishing is automated with GitHub Actions and should _not_ be done manually. If you still need to do it, follow the
steps below.

@@ -129,7 +144,9 @@ For local development:
python3 setup.py develop
```


## Contributing to the documentation

The documentation is a collection of [markdown](https://en.wikipedia.org/wiki/Markdown) files in the `docs` folder.
After updates of the files, build and check the result with the following command:

8 changes: 5 additions & 3 deletions docs/dsp-tools-create-ontologies.md
@@ -24,7 +24,7 @@ resource or not. The cardinality definitions are explained [further below](#card

Example of an `ontologies` object:

```json
```
Collaborator

I think with that you remove syntax highlighting, which is good to have in my opinion

{
"ontologies": [
{
@@ -653,6 +653,7 @@ Example:


### Link-properties

Link properties do not follow the pattern of the previous data types, because they do not connect to a final value but
to an existing resource. Thus, the `object` denotes the resource class the link will point to.

@@ -752,7 +753,7 @@ directly as cardinalities in a resource. The example below shows both possibili

Example:

```json
```
"properties": [
{
"name": "partOfBook",
@@ -845,7 +846,7 @@ they can be used directly as cardinalities in a resource. The example below show

Example:

```json
```
"properties": [
{
"name": "sequenceOfAudio",
@@ -1096,6 +1097,7 @@ it is necessary to reference entities that are defined elsewhere. The following


## DSP base resources / base properties to be used directly in the XML file

There are a number of DSP base resources that must not be subclassed in a project ontology. They are directly available
in the XML data file:

42 changes: 6 additions & 36 deletions docs/dsp-tools-create.md
@@ -20,7 +20,7 @@ This documentation is divided into two parts:

A complete project definition looks like this:

```json
```
{
"prefixes": {
"foaf": "http://xmlns.com/foaf/0.1/",
@@ -32,10 +32,12 @@ A complete project definition looks like this:
"shortname": "BiZ",
"longname": "Bildung in Zahlen",
"descriptions": {
...
"en": "This is a simple example project",
"de": "Dies ist ein einfaches Beispielprojekt"
},
"keywords": [
...
"example",
"simple"
],
"lists": [
...
@@ -113,38 +115,6 @@ The following fields are optional (if one or more of these fields are not used,
- groups
- users

A simple example definition of the `project` object looks like this:
Collaborator Author

A complete example of a project is already given some lines above, so it doesn't make sense to repeat this here.


```json
{
  "project": {
    "shortcode": "0809",
    "shortname": "test",
    "longname": "Test Example",
    "descriptions": {
      "en": "This is a simple example project",
      "de": "Dies ist ein einfaches Beispielprojekt"
    },
    "keywords": [
      "example",
      "simple"
    ],
    "lists": [
      ...
    ],
    "groups": [
      ...
    ],
    "users": [
      ...
    ],
    "ontologies": [
      ...
    ]
  }
}
```



## "project" object in detail
@@ -426,7 +396,7 @@ example, the list "colors" could be imported as follows:
"en": "A list with categories"
},
"nodes": [
...
Collaborator Author

This and other similar changes are only there so that PyCharm doesn't show that many problems. In the future, I'd like to look more frequently at the problems that are detected by PyCharm. But for this, I need to kick out the false positives.

Collaborator

Understandable (but confirms my dislike for PyCharm ;-)... in any case, I'm sure you could deactivate specific warnings).
Another thing you could consider would be annotating the code block as `jsonc` instead of `json`, so that comments are allowed, and then have `// ...` where the ellipsis is.

"..."
]
}
]
6 changes: 3 additions & 3 deletions docs/dsp-tools-excel.md
@@ -142,7 +142,7 @@ The output of the above command, with the template files, is:
"en": "red"
}
},
...
"..."
]
},
{
@@ -163,7 +163,7 @@ The output of the above command, with the template files, is:
"en": "artwork"
}
},
...
"..."
]
},
{
@@ -184,7 +184,7 @@ The output of the above command, with the template files, is:
"en": "Faculty of Science"
}
},
...
"..."
]
}
]
21 changes: 19 additions & 2 deletions docs/dsp-tools-excel2xml.md
@@ -1,6 +1,7 @@
[![PyPI version](https://badge.fury.io/py/dsp-tools.svg)](https://badge.fury.io/py/dsp-tools)

# `excel2xml`: Convert a data source to XML

dsp-tools assists you in converting a data source in CSV/XLS(X) format to an XML file.

| **Hint** |
@@ -48,15 +49,18 @@ These steps are now explained in-depth:


## 1. Read in your data source

In the first paragraph of the sample script, insert your ontology name, project shortcode, and the path to your data
source. If necessary, activate one of the lines that are commented out.


## 2. Create root element `<knora>`

Then, the root element is created, which represents the `<knora>` tag of the XML document.


## 3. Append the permissions

As first children of `<knora>`, some standard permissions are added. At the end, please carefully check the permissions
of the finished XML file to ensure that they meet your requirements, and adapt them if necessary.

@@ -69,6 +73,7 @@ here](./dsp-tools-xmlupload.md#how-to-use-the-permissions-attribute-in-resources


## 4. Create list mappings

Let's assume that your data source has a column containing list values named after the "label" of the JSON project list,
instead of the "name" which is needed for the `dsp-tools xmlupload`. You need a way to get the names from the labels.
If your data source uses the labels correctly, this is an easy task: The method `create_json_list_mapping()` creates a
@@ -137,10 +142,12 @@ used.
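
The idea behind a label-to-name mapping can be sketched in plain Python. This is a minimal illustration, not the implementation of `create_json_list_mapping()`: the function `label_to_name_mapping()` and the sample list `colors` are hypothetical, and the node shape (`name`, `labels`, `nodes`) is assumed to follow the JSON project lists.

```python
from typing import Any


def label_to_name_mapping(nodes: list[dict[str, Any]], language: str = "en") -> dict[str, str]:
    """Recursively map each node's label (in one language) to its name."""
    mapping: dict[str, str] = {}
    for node in nodes:
        label = node.get("labels", {}).get(language)
        if label is not None:
            mapping[label] = node["name"]
        # Descend into child nodes, if there are any.
        mapping.update(label_to_name_mapping(node.get("nodes", []), language))
    return mapping


# Hypothetical list nodes in the assumed shape of a JSON project list.
colors = [
    {"name": "red", "labels": {"en": "Red", "de": "Rot"}},
    {
        "name": "blue",
        "labels": {"en": "Blue", "de": "Blau"},
        "nodes": [{"name": "navy", "labels": {"en": "Navy blue"}}],
    },
]

mapping = label_to_name_mapping(colors)
print(mapping["Navy blue"])  # navy
```

With such a dict, a label found in a cell of the data source can be translated to the node name with a single lookup.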


## 5. Iterate through the rows of your data source

With the help of Pandas, you can then iterate through the rows of your Excel/CSV, and create resources and properties.


### 6. Create the `<resource>` tag

There are four kinds of resources that can be created:

| super | tag | method |
@@ -154,6 +161,7 @@ There are four kinds of resources that can be created:
here](./dsp-tools-xmlupload.md#dsp-base-resources--base-properties-to-be-used-directly-in-the-xml-file).

#### Resource ID

Special care is needed when the ID of a resource is created. Every resource must have an ID that is unique in the file,
and it must meet the constraints of xsd:ID. The simplest way to achieve this is to use the method `make_xsd_id_compatible()`.
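
To illustrate what the xsd:ID constraints mean in practice, here is a minimal sketch. It is not the code of `make_xsd_id_compatible()`: the function `to_xsd_id()` is hypothetical, it only handles the ASCII subset of the characters that xsd:ID allows, and appending a random suffix for uniqueness is an assumption.

```python
import re
import uuid


def to_xsd_id(text: str) -> str:
    """Turn an arbitrary string into a unique string that is a valid xsd:ID (ASCII subset)."""
    # Replace every character outside [A-Za-z0-9_.-] with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_.\-]", "_", text)
    # An xsd:ID must start with a letter or an underscore.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    # Assumption: a random suffix keeps IDs distinct even for equal inputs.
    return f"{cleaned}_{uuid.uuid4()}"


# Because the suffix is random, remember each ID if you need to refer to it later.
ids = {"1. Book: Title!": to_xsd_id("1. Book: Title!")}
print(ids["1. Book: Title!"])
```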

@@ -162,6 +170,7 @@ ID in a dict, so that you can retrieve it later. The example script contains an


### 7. Append the properties

For every property, there is a helper function that explains itself when you hover over it. So you no longer need to
worry about how to construct a certain XML value for a certain property.

@@ -180,6 +189,7 @@ Here's how the Docstrings assist you:


#### Fine-tuning with `PropertyElement`

There are two ways to create a property: The value can be passed as it is, or as `PropertyElement`. If it
is passed as it is, the `permissions` are assumed to be `prop-default`, texts are assumed to be encoded as `utf8`, and
the value won't have a comment:
@@ -212,6 +222,7 @@ make_text_prop(


#### Supported boolean formats

For `make_boolean_prop(cell)`, the following formats are supported:

- true: True, "true", "True", "1", 1, "yes", "Yes"
@@ -220,7 +231,7 @@ For `make_boolean_prop(cell)`, the following formats are supported:
N/A-like values will raise an Error. So if your cell is empty, this method will not count it as false, but will raise an
Error. If you want N/A-like values to be counted as false, you may use a construct like this:

```python
```
Comment on lines -225 to +235
Collaborator

again, this removes syntax highlighting I think, which reduces readability

if excel2xml.check_notna(cell):
    # the cell contains usable content
    excel2xml.make_boolean_prop(":hasBoolean", cell)
@@ -230,6 +241,7 @@ else:
```

#### Supported text values

DSP's only restriction on text-properties is that the string must have a length greater than 0. It is, for example, possible to
upload the following property:
```xml
@@ -241,22 +253,26 @@ upload the following property:

`excel2xml` allows you to create such a property, but text values that don't meet the requirements of
[`excel2xml.check_notna()`](#check-if-a-cell-contains-a-usable-value) will trigger a warning, for example:
```python
```
excel2xml.make_text_prop(":hasText", " ") # OK, but triggers a warning
excel2xml.make_text_prop(":hasText", "-") # OK, but triggers a warning
```


### 8. Append the resource to root

At the end of the for-loop, it is important not to forget to append the finished resource to the root.


## 9. Save the file

At the very end, save the file under a name that you can choose yourself.
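
The flow of steps 1 to 9 can be sketched end to end as follows. To stay free of dependencies, this sketch swaps in Python's `csv` and `xml.etree.ElementTree` modules for Pandas and the `excel2xml` helper methods, so it is not how `excel2xml` builds the file. The column names, the resource class `:Book`, the property `:hasTitle`, the attribute names on `<knora>` and `<resource>`, and the `<text-prop>`/`<text>` elements are assumptions for illustration; permissions (step 3) and XML namespaces are left out.

```python
import csv
import io
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical data source; in practice this would be your Excel/CSV file.
DATA = "id,title\nbook_1,Moby Dick\nbook_2,Walden\n"


def build_xml(rows: list[dict[str, str]], ontology_name: str, shortcode: str) -> ET.Element:
    # Create the root element (the attribute names are assumptions).
    root = ET.Element("knora", {"shortcode": shortcode, "default-ontology": ontology_name})
    for row in rows:
        # Create one <resource> per row of the data source.
        resource = ET.Element(
            "resource", {"label": row["title"], "restype": ":Book", "id": row["id"]}
        )
        # Append a text property to the resource.
        prop = ET.SubElement(resource, "text-prop", {"name": ":hasTitle"})
        text = ET.SubElement(prop, "text", {"encoding": "utf8", "permissions": "prop-default"})
        text.text = row["title"]
        # Do not forget to append the finished resource to the root.
        root.append(resource)
    return root


rows = list(csv.DictReader(io.StringIO(DATA)))
root = build_xml(rows, ontology_name="my-onto", shortcode="0001")

# Save the file under a name of your choice (here: in the temp directory).
output_path = Path(tempfile.gettempdir()) / "data.xml"
ET.ElementTree(root).write(output_path, encoding="utf-8", xml_declaration=True)
```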


## Other helper methods

### Check if a cell contains a usable value

The method `check_notna(cell)` checks whether a value is usable in the context of data archiving. A value is considered
usable if it is

@@ -306,6 +322,7 @@ In contrast, `check_notna(cell)` will return the expected value for all cases in


### Calendar date parsing

The method `find_date_in_string(string)` tries to find a calendar date in a string. If successful, it
returns the DSP-formatted date string.
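
As a rough illustration of the idea, the following sketch finds a single ISO date (`YYYY-MM-DD`) in a string. It is not the implementation of `find_date_in_string()`: the function `find_iso_date()` is hypothetical, it recognizes only this one notation, and the target format `GREGORIAN:CE:start:CE:end` is an assumption about what a DSP-formatted date string looks like.

```python
import re
from typing import Optional


def find_iso_date(string: str) -> Optional[str]:
    """Return a DSP-style date string for the first YYYY-MM-DD date found, else None."""
    match = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", string)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    # Reject impossible months and days (a coarse check, not a full calendar validation).
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None
    iso = f"{year:04d}-{month:02d}-{day:02d}"
    # Assumed target format: calendar, era, start date, era, end date.
    return f"GREGORIAN:CE:{iso}:CE:{iso}"


print(find_iso_date("Published on 1849-10-07 in Baltimore"))
```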

2 changes: 2 additions & 0 deletions docs/dsp-tools-usage.md
@@ -189,6 +189,7 @@ More information about the usage of this command can be found


## Create an XML file from Excel/CSV

```bash
dsp-tools excel2xml data-source.xlsx project_shortcode ontology_name
```
@@ -210,6 +211,7 @@ described in the next paragraph.


## Use the module `excel2xml` to convert a data source to XML

dsp-tools assists you in converting a data source in CSV/XLS(X) format to an XML file. Unlike the other features of
dsp-tools, this doesn't work via command line, but via helper methods that you can import into your own Python script.
Because every data source is different, there is no single algorithm to convert them to DSP-conforming XML. Every user