feat: use docker commands for stack handling (DEV-1530) (#261)
jnussbaum committed Dec 7, 2022
1 parent f4822dc commit c11edc5
Showing 10 changed files with 292 additions and 200 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -70,6 +70,7 @@ stashed*
**/~$*.*
testdata/tmp/
testdata/test-list.json
knora/dsplib/docker/sipi.docker-config.lua

# for testing in development
tmp/
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -4,3 +4,4 @@ include knora/dsplib/schemas/lists-only.json
include knora/dsplib/schemas/resources-only.json
include knora/dsplib/schemas/properties-only.json
include knora/dsplib/schemas/data.xsd
include knora/dsplib/docker/*
106 changes: 68 additions & 38 deletions docs/dsp-tools-usage.md
@@ -2,14 +2,16 @@

# Installation and usage

The following paragraphs give you an overview of how to install and use dsp-tools.
DSP-TOOLS is a Python package with a command line interface that helps you interact with a DSP server. The DSP server
you interact with can be on a remote server, or on your local machine. The following paragraphs give you an overview of
how to install and use dsp-tools.




## Installation

To install the latest version run:
To install the latest version, run:

```bash
pip3 install dsp-tools
@@ -255,61 +257,89 @@ In order to upload data incrementally the procedure described [here](dsp-tools-x



## Start a DSP-stack on your local machine (for DaSCH-internal use only)
## Start a DSP stack on your local machine

For testing purposes, it is sometimes necessary to run DSP-API and DSP-APP on a local machine. But the startup
and shutdown of API and APP can be complicated: Both repos need to be cloned locally, a `git pull` has to be executed
from time to time to stay up to date, and then there are several commands for each repository to remember.
DSP-API is the heart of the DaSCH service platform. It is a server application for storing data from the humanities.
DSP-APP is a generic user interface for looking at and working with the data stored in DSP-API. It is a server
application, too. For testing purposes, it is sometimes necessary to run DSP-API and DSP-APP on a local machine.
There are two ways to do this:

Another challenge is the software that DSP depends upon: JDK, node, npm, Angular, etc. should be kept up to date. And
it might happen that a dependency is replaced, e.g. JDK 11 Zulu by JDK 17 Temurin. A non-developer can quickly get lost
in this jungle.
- simple: run `dsp-tools start-stack`
- advanced: execute commands from within the DSP-API/DSP-APP repositories

That's why dsp-tools offers some commands to facilitate the handling of API and APP. These commands
Here's an overview of the two ways:

- clone the repos to `~/.dsp-tools`, and keep them up to date.
- check every time if the dependencies are up to date, and give you advice on how to update them, if necessary.
- pass on the right commands to APP and API, even if the correct usage of these commands changes over time.
- make sure that the repos don't get cluttered with old files over time.
- log their activity in `~/.dsp-tools`, so you can check the logs for troubleshooting, if necessary.
| | simple | advanced |
|-----------------------------|-----------------------------|--------------------------------------------------------------------------|
| target group | researchers, RDU employees | developers of DSP-API or DSP-APP |
| how it works | run `dsp-tools start-stack` | execute commands from within locally cloned DSP-API/DSP-APP repositories |
| software dependencies | Docker, Python, dsp-tools | XCode command line tools, Docker, sbt, Java, Angular, node, yarn |
| OS | Windows, Mac OS, Linux | Mac OS, Linux |
| mechanism in the background | run pre-built Docker images | build DSP-API and DSP-APP from a branch in the repository |
| available versions | latest released version | any branch, or locally modified working tree |
| caveats | | dependencies must be kept up to date |

The only requirements for these commands are:

- the initial installation of all software that you accomplished when you started working at DaSCH
- Docker must be running (for DSP-API only)

Please note that these commands were developed for DaSCH-internal use only. They only work on Macs that have the
required software installed that makes it possible to run the API and APP. We don't offer support or troubleshooting
for these commands.
### Simple way: `dsp-tools start-stack`


### Start DSP-API
This command runs Docker images with the latest released versions of DSP-API and DSP-APP, i.e. the versions that are
running on [https://admin.dasch.swiss](https://admin.dasch.swiss). The only prerequisite for this is that Docker
is running, and that you have Python and dsp-tools installed. Just type:

```
dsp-tools start-api
dsp-tools start-stack
```

This command makes a clone of the [DSP-API repository](https://github.com/dasch-swiss/dsp-api) into `~/.dsp-tools`. If
it finds an existing clone there, it runs `git pull` instead. If the API is already running, it shuts down the old
instance, deletes all data that was in it, and starts a new one. If the dependencies are outdated or not installed, a
warning is printed to the console.
**dsp-tools will ask you for permission to clean Docker with a `docker system prune`. This will remove all unused
containers, networks and images. If you don't know what that means, just type `y` ("yes") and then `Enter`.**

The following options are available:

- `--max_file_size=int` (optional, default: `250`): max. multimedia file size allowed by SIPI, in MB (max: 100'000)
- `--prune` (optional): if set, execute `docker system prune` without asking the user
- `--no-prune` (optional): if set, don't execute `docker system prune` (and don't ask)
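The interplay of `--prune` and `--no-prune` can be sketched in a few lines (an illustrative sketch of the behavior described above; the function name and signature are hypothetical, not taken from the dsp-tools source):

```python
def should_run_docker_prune(prune: bool, no_prune: bool, user_answer: str = "") -> bool:
    """Decide whether to run `docker system prune`.

    `prune` forces the cleanup, `no_prune` suppresses it; if neither flag
    is set, the user's interactive answer ("y" for yes) decides.
    """
    if prune:
        return True
    if no_prune:
        return False
    return user_answer.strip().lower() == "y"
```

If both flags were set, this sketch lets `--prune` win; the real command may handle that combination differently.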

Example: If you start the stack with `dsp-tools start-stack --max_file_size=1000`, it will be possible to upload files
of up to 1 GB. If a file larger than `max_file_size` is uploaded, SIPI will reject it.
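The size limit itself amounts to a simple comparison (an illustrative sketch, assuming decimal megabytes; SIPI's actual implementation is not part of this commit):

```python
MB = 1000 * 1000  # assuming decimal megabytes

def upload_allowed(file_size_bytes: int, max_file_size_mb: int = 250) -> bool:
    """True if a file of this size stays within the configured SIPI limit."""
    return file_size_bytes <= max_file_size_mb * MB
```

With `--max_file_size=1000`, a 900 MB file would pass and a 1.1 GB file would be rejected.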

### Shut DSP-API down
When your work is done, shut down DSP-API and DSP-APP with

```
dsp-tools stop-api
dsp-tools stop-stack
```

This command shuts DSP-API down, deletes all Docker volumes, and removes temporary files.
This command deletes all Docker volumes, and removes all data that was in the database.

Some notes:

### Start DSP-APP
- As long as you want to keep the data in the database, don't execute `dsp-tools stop-stack`.
- It is possible to leave DSP-API up for a long time. If you want to save power, you can pause Docker. When you resume
  it, DSP-API will still be running, in the state you left it in.
- You can also send your computer to sleep while the DSP stack is running. For this, you don't even need to pause
Docker.
- This command was developed for DaSCH-internal use only. We don't offer support or troubleshooting for it.

```
dsp-tools start-app
```

This command makes a clone of the [DSP-APP repository](https://github.com/dasch-swiss/dsp-app) into `~/.dsp-tools`. If
it finds an existing clone there, it runs `git pull` instead. Then, it installs the `npm` dependencies and runs DSP-APP.
You must keep the terminal window open as long as you work with the APP. When you are done, press `Ctrl` + `C` to stop DSP-APP.
#### When should I restart DSP-API?
After creating a data model and adding some data in your local DSP stack, you can work on DSP as if it were the live
platform. But there are certain actions that are irreversible or can only be executed once, e.g. uploading the same JSON
project file. If you edit your data model in the JSON file and then want to upload it a second time, DSP-API will
refuse to create the same project again. So, you might want to restart the stack and start over from a clean setup.

It is possible, however, to modify the XML data file and upload it again and again. But after some uploads, DSP is
cluttered with data, so you might want to restart the stack.



### Advanced way

If you want to run a specific branch of DSP-API / DSP-APP, or to modify them yourself, you need to:

- install the dependencies (see [https://github.com/dasch-swiss/dsp-api](https://github.com/dasch-swiss/dsp-api) and
  [https://github.com/dasch-swiss/dsp-app](https://github.com/dasch-swiss/dsp-app) for how to do it)
- keep the dependencies up to date (keep in mind that dependencies might be replaced over time)
- clone the repositories from GitHub
- keep them up to date with `git pull`
- execute commands from within the repositories (`make` for DSP-API, `angular` for DSP-APP)
- take care that the repositories don't get cluttered with old data over time
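The clone-and-update steps in this list could be scripted roughly as follows (a sketch only; the helper is hypothetical, and the clone location is up to you):

```python
import os

def update_command(repo_url: str, dest_dir: str) -> list[str]:
    """Return the git command to run: clone on first use, pull afterwards."""
    if os.path.isdir(os.path.join(dest_dir, ".git")):
        return ["git", "-C", dest_dir, "pull"]
    return ["git", "clone", repo_url, dest_dir]
```

The returned list could then be passed to `subprocess.run()` for each of the two repositories.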
19 changes: 12 additions & 7 deletions docs/index.md
@@ -2,17 +2,22 @@

# DSP-TOOLS documentation

dsp-tools is a command line tool that helps you to interact with a DaSCH Service Platform (DSP) server.
DSP-TOOLS is a Python package with a command line interface that helps you interact with a DSP server. The DSP server
you interact with can be on a remote server, or on your local machine. The two main tasks that DSP-TOOLS serves for are:

In order to archive your data on the DaSCH Service Platform, you need a data model (ontology) that describes your data.
**Create a project with its data model(s), described in a JSON file, on a DSP server**
In order to archive your data on the DaSCH Service Platform, you need a data model that describes your data.
The data model is defined in a JSON project definition file which has to be transmitted to the DSP server. If the DSP
server is aware of the data model for your project, conforming data can be uploaded into the DSP repository.

Often, data is initially added in large quantities. Therefore, dsp-tools allows you to perform bulk imports of your
data. In order to do so, the data has to be described in an XML file. dsp-tools is able to read the XML file and upload
**Upload data, described in an XML file, to a DSP server that has a project with a matching data model**
Sometimes, data is added in large quantities. Therefore, DSP-TOOLS allows you to perform bulk imports of your
data. In order to do so, the data has to be described in an XML file. DSP-TOOLS is able to read the XML file and upload
all data to the DSP server.

dsp-tools helps you with the following tasks:
All of DSP-TOOLS' functionality revolves around these two basic tasks.

DSP-TOOLS provides the following functionalities:

- [`dsp-tools create`](./dsp-tools-usage.md#create-a-project-on-a-dsp-server) creates the project with its data model(s)
on a DSP server from a JSON file.
@@ -38,5 +43,5 @@ dsp-tools helps you with the following tasks:
- [`dsp-tools id2iri`](./dsp-tools-usage.md#replace-internal-ids-with-iris-in-xml-file)
takes an XML file for bulk data import and replaces referenced internal IDs with IRIs. The mapping has to be provided
with a JSON file.
- [`dsp-tools start-api / stop-api / start-app`](./dsp-tools-usage.md#start-a-dsp-stack-on-your-local-machine-for-dasch-internal-use-only)
assist you in running a DSP software stack on your local machine.
- [`dsp-tools start-stack / stop-stack`](./dsp-tools-usage.md#start-a-dsp-stack-on-your-local-machine)
assist you in running a DSP stack on your local machine.
60 changes: 21 additions & 39 deletions knora/dsp_tools.py
@@ -4,14 +4,9 @@
import argparse
import datetime
import os
import re
import subprocess
import sys
from importlib.metadata import version

import requests
import yaml

from knora.dsplib.utils.excel_to_json_lists import excel2lists, validate_lists_section_with_schema
from knora.dsplib.utils.excel_to_json_project import excel2json
from knora.dsplib.utils.excel_to_json_properties import excel2properties
@@ -22,6 +17,7 @@
from knora.dsplib.utils.onto_get import get_ontology
from knora.dsplib.utils.onto_validate import validate_project
from knora.dsplib.utils.shared import validate_xml_against_schema
from knora.dsplib.utils.stack_handling import start_stack, stop_stack
from knora.dsplib.utils.xml_upload import xml_upload
from knora.excel2xml import excel2xml

@@ -151,19 +147,21 @@ def program(user_args: list[str]) -> None:
parser_excel2xml.add_argument('shortcode', help='Shortcode of the project that this data belongs to')
parser_excel2xml.add_argument('default_ontology', help='Name of the ontology that this data belongs to')

# startup DSP-API
parser_stackup = subparsers.add_parser('start-api', help='Startup a local instance of DSP-API')
parser_stackup.set_defaults(action='start-api')
# startup DSP stack
parser_stackup = subparsers.add_parser('start-stack', help='Startup a local instance of the DSP stack (DSP-API and '
'DSP-APP)')
parser_stackup.set_defaults(action='start-stack')
parser_stackup.add_argument('--max_file_size', type=int, default=None,
help="max. multimedia file size allowed by SIPI, in MB (default: 250, max: 100'000)")
parser_stackup.add_argument('--prune', action='store_true',
help='if set, execute "docker system prune" without asking the user')
parser_stackup.add_argument('--no-prune', action='store_true',
help='if set, don\'t execute "docker system prune" (and don\'t ask)')

# shutdown DSP-API
parser_stackdown = subparsers.add_parser('stop-api', help='Shut down the local instance of DSP-API, delete '
'volumes, clean SIPI folders')
parser_stackdown.set_defaults(action='stop-api')

# startup DSP-APP
parser_dsp_app = subparsers.add_parser('start-app', help='Startup a local instance of DSP-APP')
parser_dsp_app.set_defaults(action='start-app')

parser_stackdown = subparsers.add_parser('stop-stack', help='Shut down the local instance of the DSP stack, and '
'delete all data in it')
parser_stackdown.set_defaults(action='stop-stack')


# call the requested action
@@ -239,29 +237,13 @@ def program(user_args: list[str]) -> None:
excel2xml(datafile=args.datafile,
shortcode=args.shortcode,
default_ontology=args.default_ontology)
elif args.action == 'start-api' and not sys.platform.startswith('win'):
try:
response = requests.get("https://raw.githubusercontent.com/dasch-swiss/dsp-api/main/.github/actions/preparation/action.yml")
action = yaml.safe_load(response.content)
for step in action.get("runs", {}).get("steps", {}):
if re.search("(JDK)|(Java)", step.get("name", "")):
distribution = step.get("with", {}).get("distribution", "").lower()
java_version = step.get("with", {}).get("java-version", "").lower()
except:
distribution = "temurin"
java_version = "17"
subprocess.run(['/bin/bash', os.path.join(current_dir, 'dsplib/utils/start-api.sh'), distribution, java_version])
elif args.action == 'stop-api' and not sys.platform.startswith('win'):
subprocess.run(['/bin/bash', os.path.join(current_dir, 'dsplib/utils/stop-api.sh')])
elif args.action == 'start-app' and not sys.platform.startswith('win'):
try:
subprocess.run(['/bin/bash', os.path.join(current_dir, 'dsplib/utils/start-app.sh')])
except KeyboardInterrupt:
print("\n\n"
"================================\n"
"You successfully stopped the APP\n"
"================================")
exit(0)
elif args.action == 'start-stack':
start_stack(max_file_size=args.max_file_size,
enforce_docker_system_prune=args.prune,
suppress_docker_system_prune=args.no_prune)
elif args.action == 'stop-stack':
stop_stack()



def main() -> None:
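The subcommand wiring visible in this file's diff can be reproduced in isolation, to see how `set_defaults(action=...)` routes the dispatch (a self-contained sketch, not the full dsp-tools parser; note that argparse turns `--no-prune` into the attribute `no_prune`):

```python
import argparse

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()

# each subparser stamps the parsed args with its own action name
p_up = subparsers.add_parser("start-stack")
p_up.set_defaults(action="start-stack")
p_up.add_argument("--max_file_size", type=int, default=None)
p_up.add_argument("--prune", action="store_true")
p_up.add_argument("--no-prune", action="store_true")

p_down = subparsers.add_parser("stop-stack")
p_down.set_defaults(action="stop-stack")

args = parser.parse_args(["start-stack", "--max_file_size=1000", "--no-prune"])
```

After parsing, `args.action` tells the dispatch code which `elif` branch to take, exactly as in the `program()` function above.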
70 changes: 70 additions & 0 deletions knora/dsplib/docker/docker-compose.yml
@@ -0,0 +1,70 @@
version: '3.7'

services:

app:
image: daschswiss/dsp-app:v10.11.0-11-g4356dea # after every deployment (fortnightly), check latest tag at https://hub.docker.com/r/daschswiss/dsp-app/tags
ports:
- "4200:4200"
networks:
- knora-net

db:
image: daschswiss/apache-jena-fuseki:2.0.10 # after every deployment (fortnightly), check latest tag at https://hub.docker.com/r/daschswiss/apache-jena-fuseki/tags
ports:
- "3030:3030"
networks:
- knora-net
environment:
- TZ=Europe/Zurich
- ADMIN_PASSWORD=test
- JVM_ARGS=-Xmx3G

sipi:
image: daschswiss/knora-sipi:24.0.8-18-gb8eaadf # after every deployment (fortnightly), check latest tag at https://hub.docker.com/r/daschswiss/knora-sipi/tags
ports:
- "1024:1024"
volumes:
- .:/docker
networks:
- knora-net
environment:
- TZ=Europe/Zurich
- SIPI_EXTERNAL_PROTOCOL=http
- SIPI_EXTERNAL_HOSTNAME=0.0.0.0
- SIPI_EXTERNAL_PORT=1024
- SIPI_WEBAPI_HOSTNAME=api
- SIPI_WEBAPI_PORT=3333
- KNORA_WEBAPI_KNORA_API_EXTERNAL_HOST=0.0.0.0
- KNORA_WEBAPI_KNORA_API_EXTERNAL_PORT=3333
command: --config=/docker/sipi.docker-config.lua

api:
image: daschswiss/knora-api:25.0.0 # after every deployment (fortnightly), check latest tag at https://hub.docker.com/r/daschswiss/knora-api/tags
depends_on:
- sipi
- db
ports:
- "3333:3333"
networks:
- knora-net
environment:
- TZ=Europe/Zurich
- KNORA_AKKA_LOGLEVEL=DEBUG
- KNORA_AKKA_STDOUT_LOGLEVEL=DEBUG
- KNORA_WEBAPI_TRIPLESTORE_HOST=db
- KNORA_WEBAPI_TRIPLESTORE_DBTYPE=fuseki
- KNORA_WEBAPI_SIPI_INTERNAL_HOST=sipi
- KNORA_WEBAPI_TRIPLESTORE_FUSEKI_REPOSITORY_NAME=knora-test
- KNORA_WEBAPI_TRIPLESTORE_FUSEKI_USERNAME=admin
- KNORA_WEBAPI_TRIPLESTORE_FUSEKI_PASSWORD=test
- KNORA_WEBAPI_CACHE_SERVICE_ENABLED=true
- KNORA_WEBAPI_CACHE_SERVICE_REDIS_HOST=redis
- KNORA_WEBAPI_CACHE_SERVICE_REDIS_PORT=6379
- KNORA_WEBAPI_ALLOW_RELOAD_OVER_HTTP=true
- KNORA_WEBAPI_KNORA_API_EXTERNAL_HOST=0.0.0.0
- KNORA_WEBAPI_KNORA_API_EXTERNAL_PORT=3333

networks:
knora-net:
name: knora-net
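One property of the compose file above worth checking is that each service publishes a distinct host port (4200, 3030, 1024, 3333); if two services claimed the same port, `docker compose up` would fail. A quick sketch of that invariant (the check itself is illustrative, not part of the commit):

```python
# Host ports published by the docker-compose.yml above
PUBLISHED_PORTS = {"app": 4200, "db": 3030, "sipi": 1024, "api": 3333}

def has_port_conflict(ports: dict[str, int]) -> bool:
    """True if two services try to publish the same host port."""
    return len(set(ports.values())) != len(ports)
```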
