cgrevisse/swift-vocabulary

Swift Vocabulary

This repository contains reference serializations and a generation script for the Swift Vocabulary, a SKOS-based vocabulary on the Swift programming language. The work is to appear in the "Resources Track" of ISWC 2020.

Requirements & Usage

To run the generate.py script, you first need to install a couple of Python packages:

pip install beautifulsoup4 spacy spacy-lookups-data rdflib
python3 -m spacy download en_core_web_sm

Now you can run the script:

python3 generate.py

The script was tested on Ubuntu 20.04 with Python 3.8 and on macOS Catalina (10.15.6) with Python 3.6. Execution takes less than a minute.

Features

The generation script provides the following features:

  • Extraction of concepts & resources from the "Swift book"
  • Cleansing of concept names (removing non-alphabetical characters, applying camel case)
  • Creation of the RDF graph (metadata, concept scheme, concepts & labels, resources)
  • Serialization to Turtle & XML format
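
The cleansing step listed above can be pictured with a minimal sketch. It is not taken from generate.py: the function name cleanse_concept_name is hypothetical, and the choice of upper camel case (rather than lower camel case) is an assumption.

```python
import re


def cleanse_concept_name(raw: str) -> str:
    """Strip non-alphabetical characters and join the words in camel case.

    Hypothetical helper for illustration; the actual script may differ.
    """
    # Replace every run of non-alphabetical characters with a space,
    # then split into the remaining alphabetical words.
    words = re.sub(r"[^A-Za-z]+", " ", raw).split()
    # Upper camel case (an assumption): capitalise the first letter of
    # each word and keep the rest of the word unchanged.
    return "".join(word[0].upper() + word[1:] for word in words)


print(cleanse_concept_name("Optional chaining"))
print(cleanse_concept_name("In-Out Parameters"))
```

A name cleansed this way contains only ASCII letters, which makes it convenient as the local part of a concept URI.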

Manual Tasks

Some manual curation is needed for the generated files. This can be done, e.g., using Protégé.

  • Selection of final concepts
  • Selection of best resources per concept
  • Creation of associative links (skos:related) and hierarchical links (skos:broader)
  • Determination of the scheme's top concepts (skos:hasTopConcept)
  • Alignment with DBpedia (cf. dbpedia.txt for common programming-related DBpedia concepts)

Reference serializations in Turtle and XML format after manual curation can be found in this repository.

Further References

Citation

As a canonical citation, please use:

Grévisse, C. and Rothkugel, S. (2020). Swift Vocabulary. http://purl.org/lu/uni/alma/swift

The citation to the paper will be provided after its publication.
