DCAT DRY

DCAT-AP Dataset Relationship Indexer. It indexes linked data and the relationships between datasets.

Features:

index a distribution or a SPARQL endpoint
extract and index distributions from a DCAT catalog
extract a DCAT catalog from SPARQL endpoint and index distributions from it
generate a dataset profile
show related datasets based mainly on DataCube and SKOS vocabularies
index sameAs identities and related concepts

Build & run with Docker

For the DCAT-DRY service only:

docker build . -t dcat-dry
docker run -p 80:8000 --name dcat-dry dcat-dry

For the full environment use docker-compose:

docker-compose up --build

Build & run manually

CPython 3.8+ is supported.

Install a Redis server first. The following example assumes it runs on localhost, port 6379, and that DB 0 is used.

Set up a PostgreSQL server as well. The following example assumes it runs on localhost, port 5432, the database is postgres, and the user/password is postgres:example.

You will also need these libraries installed: libxml2-dev, libxslt-dev, libleveldb-dev, libsqlite3-dev and sqlite3.

Run the following commands to bootstrap your environment:

git clone https://github.com/eghuro/dcat-dry
cd dcat-dry
poetry install --with robots,gevent --without dev
# Start redis and postgres servers

# Export environment variables
export REDIS_CELERY=redis://localhost:6379/1
export REDIS=redis://localhost:6379/0
export DB=postgresql+psycopg2://postgres:example@localhost:5432/postgres

# Set up the database
alembic upgrade head

# Run the following three processes concurrently (e.g. in separate terminals)
celery -A tsa.celery worker -l debug -Q high_priority,default,query,low_priority -c 4
gunicorn -w 4 -b 0.0.0.0:8000 --log-level debug app:app
nice -n 10 celery -l info -A tsa.celery beat
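
Before starting the processes, it can help to confirm that the three connection variables are set and well-formed. The following is a minimal sketch, not part of DCAT-DRY itself: the variable names come from the export commands above, while the helper name check_environment and the expected URL schemes are assumptions based on the example values.

```python
import os
from urllib.parse import urlsplit

# Variable names come from the export commands above; the expected
# schemes are an assumption based on the example values.
EXPECTED_SCHEMES = {
    "REDIS_CELERY": "redis",
    "REDIS": "redis",
    "DB": "postgresql+psycopg2",
}


def check_environment(environ=os.environ):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, scheme in EXPECTED_SCHEMES.items():
        value = environ.get(name)
        if not value:
            problems.append(f"{name} is not set")
            continue
        parts = urlsplit(value)
        if parts.scheme != scheme:
            problems.append(f"{name} should use the {scheme}:// scheme")
        elif not parts.hostname:
            problems.append(f"{name} has no host")
    return problems


if __name__ == "__main__":
    # Print each problem on its own line; prints nothing when all is well.
    for problem in check_environment():
        print(problem)
```

The check only inspects the URLs; it does not try to connect to Redis or PostgreSQL.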

In general, before running shell commands, set the FLASK_APP and FLASK_DEBUG environment variables:

export FLASK_APP=autoapp.py
export FLASK_DEBUG=1

Deployment

To deploy:

export FLASK_DEBUG=0
# Follow commands above to bootstrap the environment

In your production environment, make sure the FLASK_DEBUG environment variable is unset or is set to 0, so that ProdConfig is used.
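
The rule above can be illustrated with a short sketch. This is not the project's actual code: the DevConfig class, the select_config helper and the contents of both classes are hypothetical, and only the ProdConfig name and the "unset or 0" rule come from the text above.

```python
import os


class ProdConfig:
    """Hypothetical production settings."""

    DEBUG = False


class DevConfig:
    """Hypothetical development settings."""

    DEBUG = True


def select_config(environ=os.environ):
    """Pick a config class from the FLASK_DEBUG environment variable."""
    # Unset or "0" selects ProdConfig, as described above. Treating
    # "false" and "no" as off as well is an assumption of this sketch.
    flag = environ.get("FLASK_DEBUG", "0").strip().lower()
    return ProdConfig if flag in ("", "0", "false", "no") else DevConfig
```

Defaulting to ProdConfig when the variable is missing means a forgotten export cannot accidentally enable debug mode.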

Shell

To open the interactive shell, run:

flask shell

By default, you will have access to the Flask app.

Running Tests

To run all tests, run:

flask test

Before execution

# Prepare CouchDB:

curl -X PUT http://admin:password@127.0.0.1:5984/_users
curl -X PUT http://admin:password@127.0.0.1:5984/_replicator
curl -X PUT http://admin:password@127.0.0.1:5984/_global_changes

# Migrate the database:

alembic upgrade head

API

To start a batch scan, run:

flask batch -g /tmp/graphs.txt -s http://10.114.0.2:8890/sparql

Get the full result:

/api/v1/query/analysis

Query a dataset:

/api/v1/query/dataset?iri=http://abc
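
Because the dataset IRI is itself a URL, it is safest to percent-encode it when building the query string. The sketch below shows one way to do that with the standard library. The base URL is an assumption (the gunicorn command above binds to port 8000), and dataset_query_url is a hypothetical helper, not part of DCAT-DRY.

```python
from urllib.parse import urlencode

# Assumed base URL: the gunicorn command above binds to port 8000.
BASE_URL = "http://localhost:8000"


def dataset_query_url(iri, base_url=BASE_URL):
    """Build the dataset query URL with the IRI percent-encoded."""
    return f"{base_url}/api/v1/query/dataset?{urlencode({'iri': iri})}"


print(dataset_query_url("http://abc"))
# prints http://localhost:8000/api/v1/query/dataset?iri=http%3A%2F%2Fabc
```

The resulting URL can be passed to curl or any HTTP client; the response format is not described here.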
