diff --git a/.github/workflows/ckeck-pr-title.yml b/.github/workflows/ckeck-pr-title.yml
index f407700c8..7d2aafa67 100644
--- a/.github/workflows/ckeck-pr-title.yml
+++ b/.github/workflows/ckeck-pr-title.yml
@@ -13,8 +13,8 @@ jobs:
       - uses: deepakputhraya/action-pr-title@master
         with:
           regex: '([a-z])+(\(([a-z\-_ ])+\))?!?: [a-z]([a-zA-Z-\.\d \(\)\[\]#_])+$' # Regex the title should match.
-          allowed_prefixes: "fix,refactor,feat,docs,chore,style,test" # title should start with the given prefix
-          disallowed_prefixes: "feature,hotfix,doc" # title should not start with the given prefix
+          allowed_prefixes: 'fix,refactor,feat,docs,chore,style,test' # title should start with the given prefix
+          disallowed_prefixes: 'feature,hotfix' # title should not start with the given prefix
           prefix_case_sensitive: true # title prefix are case insensitive
           min_length: 7 # Min length of the title
           max_length: 120 # Max length of the title
diff --git a/docs/dsp-tools-create.md b/docs/dsp-tools-create.md
index e3c8bb5de..70142afff 100644
--- a/docs/dsp-tools-create.md
+++ b/docs/dsp-tools-create.md
@@ -1,15 +1,17 @@
+[![PyPI version](https://badge.fury.io/py/dsp-tools.svg)](https://badge.fury.io/py/dsp-tools)
+
 # JSON data model definition format
 
 ## Introduction
-This document contains all the information you need to create an data model that's used by DSP. According to
-Wikipedia, da [data model](https://en.wikipedia.org/wiki/Data_model) is "_is an abstract model that organizes elements
+This document contains all the information you need to create a data model that can be used by DSP. According to
+Wikipedia, the [data model](https://en.wikipedia.org/wiki/Data_model) is "_an abstract model that organizes elements
 of data and standardizes how they relate to one another and to the properties of real-world entities._" Further it
 states: "_A data model explicitly determines the structure of data. Data models are typically specified by a data
-specialist, data librarian, or a digital humanities scholar in a data modeling notation_". In this section we will
-describe one of the notations that is used by the _dsp-tools_ to create a data model in the dsp repository. The dsp
-repository is loosely based on [Linked Open Data](https://en.wikipedia.org/wiki/Linked_data) where also the term
-_"ontology"_ is used for the data model. It should be noted that in this context an ontology is not used in the
-philosophical sense.
+specialist, data librarian, or a digital humanities scholar in a data modeling notation_".
+
+In this section, we will describe one of the notations that is used by dsp-tools to create a data model in the DSP
+repository. The DSP repository is loosely based on [Linked Data](https://en.wikipedia.org/wiki/Linked_data) where also
+the term _ontology_ is used.
 
 In the first section you find a rough overview of the data model definition, all the necessary components with a
 definition and a short example of the definition.
@@ -31,22 +33,21 @@ A complete data model definition looks like this:
     "shortcode": "0123",
     "shortname": "BiZ",
     "longname": "Bildung in Zahlen",
-    "descriptions": {},
-    "keywords": [],
-    "lists": [],
-    "groups": [],
-    "users": [],
-    "ontologies": []
+    "descriptions": {...},
+    "keywords": [...],
+    "lists": [...],
+    "groups": [...],
+    "users": [...],
+    "ontologies": [...]
   }
 }
 ```
 
 As you can see, only two umbrella terms define our ontology: the "prefixes" object and the "project" object. In the
-following we take a deeper look into both of them since, as you can see in the example above,
-both objects have further fine grained definition levels.
-
+following we take a deeper look into both of them since, as you can see in the example above, both objects have further
+fine-grained definition levels.
 ### "Prefixes" object
-`"prefixes": { "prefix": ", ...}`
+`"prefixes": { "prefix": "", ...}`
 
 The "prefixes" object contains - as you may already have guessed by the name - the `prefixes` of *external* ontologies
 that are also used in the current project. All prefixes are composed of a keyword, followed by its iri. This is used as
@@ -54,7 +55,7 @@ a shortcut for later so that you don't always have to specify the full qualified
 keyword instead. That means that e.g. instead of addressing a property called "familyname" via
 `http://xmlns.com/foaf/0.1/familyName` you can simply use foaf:familyName.
 
-As you can see in the example below, you can have more then one prefix too. In the example we have "foaf" as well as
+As you can see in the example below, you can have more than one prefix too. In the example we have "foaf" as well as
 "dcterms" as our prefixes.
 
 ```json
@@ -67,13 +68,13 @@ As you can see in the example below, you can have more then one prefix too. In t
 ```
 
 ### "Project" object
-`"project": {"key": ", ...}`
+`"project": {"key": "", ...}`
 
 Right after the "prefix" object the "project" object has to follow, which contains all resources and properties of the
 ontology. The "project" object is the bread and butter of the ontology. All its important properties are specified therein.
-As you saw in the complete ontology definition in the beginning, the project definitions `requires` ***exactly*** all
-of the following datafields:
+As you saw in the complete ontology definition in the beginning, the project definitions requires all the following
+data fields:
 
 - shortcode
 - shortname
@@ -81,30 +82,31 @@ of the following datafields:
 - keywords
 - ontologies
 
-Whereas the following fields are `optional` (if one or more of these fields are not
-used, it must be omitted):
+Whereas the following fields are optional (if one or more of these fields are not used, it must be omitted):
 
 - descriptions
 - lists
 - groups
 - users
 
-So a simple example definition of the "project" object could look like this:
+So, a simple example definition of the "project" object could look like this:
 
 ```json
-"project": {
-  "shortcode": "0809",
-  "shortname": "test" ,
-  "longname": "Test Example",
-  "descriptions": {
-    "en": "This is a simple example project",
-    "de": "Dies ist ein einfaches Beispielprojekt"
-  }
-  "keywords": ["example", "simple"],
-  "lists": […],
-  "groups": […],
-  "users": […],
-  "ontology": […]
+{
+  "project": {
+    "shortcode": "0809",
+    "shortname": "test" ,
+    "longname": "Test Example",
+    "descriptions": {
+      "en": "This is a simple example project",
+      "de": "Dies ist ein einfaches Beispielprojekt"
+    },
+    "keywords": ["example", "simple"],
+    "lists": [...],
+    "groups": [...],
+    "users": [...],
+    "ontology": [...]
+  }
 }
 ```
 
@@ -113,18 +115,18 @@ At that point we will go through all of this step by step and take a more in dep
 "project" object. The first four fields of the "project" object are "key"/"value" pairs. Therefore they are quite simple.
 
 ### Shortcode
-`"shortcode": "<4-hex-characters>`
+`"shortcode": "<4-hex-characters>"`
 
-It's a hexadecimal string in the range between "0000" and "FFFF" that's used to uniquely identifying the project. The
+It's a hexadecimal string in the range between "0000" and "FFFF" that's used to uniquely identify the project. The
 shortcode has to be provided by the DaSCH.
 
 ### Shortname
-`"shortname": ""`
+`"shortname": ""`
 
-This is a short name (string) for the project. It's ment to be like a nickname. If the name of the project is e.g.
+This is a short name (string) for the project. It's meant to be like a nickname. If the name of the project is e.g.
 "Albus Percival Wulfric Dumbledore", then the shortname for it could be "Albi". It should be in the form of a
 [xsd:NCNAME](https://www.w3.org/TR/xmlschema11-2/#NCName), that is a name without blanks and special characters like
-":", ";", "&", "%" etc., but "-" and "_" are allowed
+`:`, `;`, `&`, `%` etc., but `-` and `_` are allowed.
 
 ### Longname
 `"longname": ""`
@@ -212,17 +214,16 @@ example a classification of disciplines in the Humanities might look like follow
     - Ontology
     - Philosophy of mind
     - Teleology
-
 
 DSP allows to define such controlled vocabularies or thesauri. They can be arranged "flat" or in "hierarchies" (as the
 given example about the disciplines in Humanities is). The definition of these entities are called "lists" in the DSP.
 Thus, the list object is used to give the resources of the ontology a taxonomic quality. A taxonomy makes it possible to
 categorize a resource. The big advantage of a taxonomic structure as it is implemented by the DSP is that the user can
 subcategorize the objects. This allows the user to formulate his search requests more or less
-specifically as desired. Thus, in the example above a search for "Vocal music" would result in all works that ate
+specifically as desired. Thus, in the example above a search for "Vocal music" would result in all works that are
 characterized by a subelement of "Vocal music". However a search for "Masses" would retrun only works that have been
 characterized as such. The number of hierarchy levels is not limited, but for practical reasons
-should not exceed 3-4 levels
+it should not exceed 3-4 levels.
 Thus, a taxonomy is a hierarchical list of categories in a tree-like structure. The taxonomy must be complete. This means
 that the entire set of resources must be mappable to the sub-categorization of the taxonomy. To come back to the previous
@@ -250,194 +251,201 @@ A resource can be assigned to a taxonomic node within its properties. So a resou
 title "La Traviata" would have the property/attribute "musical-genre" with the value "Grand opera". Within the DSP, each
 property or attribute has an assigned cardinality. Sometimes, a taxonomy allows that an object may belong to different
 categories at the same time (e.g. an image which depicts several categories at the same time). In these cases,
-a cardinality > 1 allows to add multiple attributes
-of the same time. See further below the description of the [cardinalities](#cardinalities)
+a cardinality greater than 1 allows adding multiple attributes of the same time. See further below the description of the
+[cardinalities](#cardinalities).
 
 A node of the Taxonomy may have the following elements:
 
-- _name_: Name of the node. This should be unique for the given list. The name-element is _optional_ but highly
-  recommended].
-- _labels_: Language dependent labels in the form ```{ "": "