Marduk

Marduk orchestrates the timetable data import pipeline.

Data input channels

Marduk receives datasets from the following channels:

File upload from the operator portal (Bel)
File upload from the administration console (Ninkasi)
File transfer via a REST API
Internal PubSub messaging for flight data (Extime)

Input data are primarily NeTEx datasets. GTFS datasets are still in use but will ultimately be migrated to NeTEx.

Data import workflow

Marduk performs basic validation checks on the input files (check that the file is a valid zip archive, simple data format check) and then initiates the import workflow:

Pre-validation of NeTEx data (Antu)
Antu runs a set of validation rules against the NeTEx dataset (XML Schema validation, XPath queries, ...).
Import of NeTEx data into the data provider timetable database (Chouette)
Chouette imports the NeTEx data into an intermediate, work-in-progress database.
Validation of the imported data (Chouette)
Chouette runs a second set of validation rules on the intermediate database.
Transfer from the data provider database to the central timetable database (Chouette)
Chouette copies the timetable data from the intermediate database to the central database. The central database contains only validated data.
Validation of the transferred timetable data (Chouette)
Chouette runs the same set of validation rules as in step 3.
NeTEx export (Chouette)
Chouette exports the timetable data into a NeTEx archive.
NeTEx export post-validation (Antu)
Antu runs the same set of validation rules as in step 1. against the exported NeTEx dataset.
Publication of validated NeTEx datasets
Marduk publishes the validated datasets on Entur Open Data Portal
Update of the journey planner graph
Marduk triggers a rebuild of the journey planner graph so that it reflects the latest version of timetable data (OTP)

Periodic timetable data revalidation

Some NeTEx validation rules are time-dependent, in particular those that rely on external reference data such as the Norwegian Stop Place Register. It is necessary to revalidate periodically the imported datasets to guarantee that they still refer to valid stop places. Revalidation allows also for pruning expired data, such as trip whose effective date is in the past.
Marduk schedules a nightly revalidation of every dataset which triggers a regeneration of each NeTEx export file. Expired data are removed from the new exports.

GTFS export

In addition to orchestrating NeTEx data export, Marduk triggers also an export of GTFS data (Damu)

Integration with flexible transport data

Marduk merges the NeTEx datasets containing flexible timetables generated in NPlan (Uttu and Enki) with those generated in Chouette.

Journey planner street graph update

OpenTripPlanner relies on OpenStreetMap data to calculate the first/last leg of a journey (walk from start point or to destination point). Marduk schedules a nightly download of OpenStreetMap data that in turn is used by OpenTripPlanner to build an updated street graph.

Local environment configuration

A minimal local setup requires a database, a Google PubSub emulator and access to a providers repository service (Baba)

Marduk database

Marduk uses a database to store the history of imported file names and checksums.
This is used by the idempotent filter to reject files that have been already imported. A Docker PostgreSQL database instance can be used for local testing:

docker run -p 5432:5432 --name marduk-database -e POSTGRES_PASSWORD=myPostgresPassword postgres:13

A test database can be created with the following commands from the psql client:

create database marduk;
create user marduk with password 'mypassword';
ALTER ROLE marduk SUPERUSER;

The database configuration is specified in the Spring Boot application.properties file:

# Datasource
spring.datasource.driver-class-name=org.postgresql.Driver
spring.datasource.url=jdbc:postgresql://localhost:5432/marduk
spring.datasource.username=marduk
spring.datasource.password=mypassword
spring.flyway.enabled=true

When setting the property spring.flyway.enabled=true, the database will be auto-created at application startup.

Google PubSub emulator

See https://cloud.google.com/pubsub/docs/emulator for details on how to install the Google PubSub emulator.
The emulator is started with the following command:

gcloud beta emulators pubsub start

and will listen by default on port 8085.

The emulator port must be set in the Spring Boot application.properties file as well:

spring.cloud.gcp.pubsub.emulatorHost=localhost:8085
camel.component.google-pubsub.endpoint=localhost:8085

Access to the providers repository service

Access to the providers database is configured in the Spring Boot application.properties file:

providers.api.url=http://localhost:11101/services/providers/

Spring boot configuration file

The application.properties file used in unit tests src/test/resources/application.properties can be used as a template.
The Kubernetes configmap helm/marduk/templates/configmap.yaml can also be used as a template.

Starting the application locally

Run mvn package to generate the Spring Boot jar.
The application can be started with the following command line:
java -Xmx500m -Dspring.config.location=/path/to/application.properties -Dfile.encoding=UTF-8 -jar target/marduk-0.0.1-SNAPSHOT.jar

Name		Name	Last commit message	Last commit date
Latest commit History 2,153 Commits
.circleci		.circleci
api		api
conf		conf
helm/marduk		helm/marduk
src		src
terraform		terraform
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE.txt		LICENSE.txt
README.md		README.md
dependencycheck-suppression.xml		dependencycheck-suppression.xml
pom.xml		pom.xml
renovate.json		renovate.json

License

entur/marduk

Folders and files

Latest commit

History

Repository files navigation

Marduk

Data input channels

Data import workflow

Periodic timetable data revalidation

GTFS export

Integration with flexible transport data

Journey planner street graph update

Local environment configuration

Marduk database

Google PubSub emulator

Access to the providers repository service

Spring boot configuration file

Starting the application locally

About

Topics

Resources

License

Stars

Watchers

Forks

Languages