AutoRDF2GML

AutoRDF2GML is a framework that converts RDF data into graph representations suitable for graph-based machine learning methods such as Graph Neural Networks (GNNs). It generates content-based features from RDF datatype properties and topology-based features from RDF object properties, connecting Semantic Web technologies with graph machine learning.

Key Features

Content-based Node Features: Automatically extract node features from RDF datatype properties.
Topology-based Edge Features: Derive edge features from RDF object properties.
User-friendly Interface: Features a modular design with automatic feature selection for simplicity and ease of use.
Graph ML Integration: The output can be used with graph machine learning frameworks such as PyTorch Geometric and DGL.
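
The distinction between the two feature sources can be illustrated with a minimal sketch. This is not the framework's implementation: the toy triples, the `Literal` wrapper, and the function name `split_triples` are hypothetical stand-ins, and a real pipeline would read the RDF file with an RDF library. The sketch only shows the idea that literal-valued (datatype) properties describe a node, while IRI-valued (object) properties link two nodes.

```python
from collections import defaultdict
from typing import NamedTuple, Union


class Literal(NamedTuple):
    """Hypothetical wrapper marking an RDF literal value."""
    value: Union[str, int, float]


# Toy triples (assumed data): objects are either IRIs (str) or Literals.
TRIPLES = [
    ("ex:paper1", "ex:title", Literal("Graph learning on RDF")),
    ("ex:paper1", "ex:year", Literal(2023)),
    ("ex:paper1", "ex:hasAuthor", "ex:author1"),
    ("ex:author1", "ex:name", Literal("A. Author")),
]


def split_triples(triples):
    """Separate datatype-property values from object-property links."""
    node_attributes = defaultdict(dict)  # subject -> {property: literal value}
    edges = []                           # (subject, property, object) links
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            node_attributes[subject][predicate] = obj.value
        else:
            edges.append((subject, predicate, obj))
    return dict(node_attributes), edges


node_attributes, edges = split_triples(TRIPLES)
print(node_attributes)
print(edges)
```

In this toy input, `ex:title`, `ex:year`, and `ex:name` end up as node attributes, while `ex:hasAuthor` becomes the only edge.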

Quick User Guide

For a step-by-step guide on using the framework, see our example and example-topologyfeatures directories.

Usage

To start using AutoRDF2GML, you need (1) an RDF file and (2) a config file describing the configuration for the transformation. In the config file, define the RDF classes and properties needed for your project. Once configured, execute the AutoRDF2GML script to generate a heterogeneous graph dataset suitable for your machine learning applications.
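
Since the configuration is an INI file, it can be inspected with Python's standard `configparser` module before running the transformation. The sketch below only illustrates that mechanism. The section names (`INPUT`, `NODES`, `EDGES`) and key names are hypothetical placeholders, not the real keys; the actual ones are documented in config-template.ini.

```python
import configparser

# Hypothetical configuration text; the section and key names below are
# placeholders, not the real keys from config-template.ini.
EXAMPLE_CONFIG = """
[INPUT]
rdf_file = data/graph.nt

[NODES]
classes = ex:Paper, ex:Author

[EDGES]
object_properties = ex:hasAuthor
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)

# Read single values and split comma-separated lists.
rdf_file = parser["INPUT"]["rdf_file"]
classes = [c.strip() for c in parser["NODES"]["classes"].split(",")]
object_properties = [
    p.strip() for p in parser["EDGES"]["object_properties"].split(",")
]

print(rdf_file, classes, object_properties)
```

Checking the parsed classes and properties this way can catch typos in the config before a long transformation run.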

The output can then be used for various machine learning tasks, including node classification, link prediction, and graph classification, and it can be loaded into common graph machine learning frameworks. For example, the repository includes a script (see the use-with-pyg directory) that loads the output from AutoRDF2GML into a PyTorch Geometric HeteroData object. The structure of the loaded PyG HeteroData object is available both as a directed graph and as an undirected graph.
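
PyTorch Geometric stores the edges of each edge type as pairs of integer node indices in COO layout (an `edge_index` of shape `[2, num_edges]`). The following sketch shows, in plain Python, how node identifiers could be mapped to per-type indices and how an undirected variant can be obtained by adding reverse edge types. The toy data, the function name `build_edge_index`, and the `rev_` prefix are assumptions made for illustration; the sketch does not assume anything about the actual file layout that AutoRDF2GML produces.

```python
# Toy typed edges (assumed data): (source node, relation, target node).
NODE_TYPES = {
    "paper": ["ex:paper1", "ex:paper2"],
    "author": ["ex:author1"],
}
EDGES = {
    ("paper", "hasAuthor", "author"): [
        ("ex:paper1", "ex:author1"),
        ("ex:paper2", "ex:author1"),
    ],
}


def build_edge_index(node_types, edges, undirected=False):
    """Return {edge_type: [source_indices, target_indices]} in COO layout."""
    # Each node type gets its own contiguous index space starting at 0.
    index = {
        ntype: {node: i for i, node in enumerate(nodes)}
        for ntype, nodes in node_types.items()
    }
    edge_index = {}
    for (src_type, rel, dst_type), pairs in edges.items():
        sources = [index[src_type][s] for s, _ in pairs]
        targets = [index[dst_type][t] for _, t in pairs]
        edge_index[(src_type, rel, dst_type)] = [sources, targets]
        if undirected:
            # Add the flipped edges under a separate reverse edge type.
            edge_index[(dst_type, f"rev_{rel}", src_type)] = [targets, sources]
    return edge_index


directed = build_edge_index(NODE_TYPES, EDGES)
undirected = build_edge_index(NODE_TYPES, EDGES, undirected=True)
print(directed)
print(undirected)
```

In PyG, each `[sources, targets]` pair would be converted to a `torch.long` tensor and assigned to `data[edge_type].edge_index` on a `HeteroData` object.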

Feature Configuration

Content-based Node Features

Quick example for Content-based Node Features Transformation: example directory.

AutoRDF2GML with content-based node features is implemented in the Python script autordf2gml-cb.py. The configuration file template and its documentation are provided in config-template.ini. The default model for computing embeddings from natural language descriptions is SciBERT, but other Hugging Face BERT variants (e.g., bert-base) can also be used.
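
To give an intuition for how datatype property values can become numeric node features, here is a minimal sketch. It assumes a simple encoding rule (booleans to 0/1, numbers kept as floats, other values one-hot encoded) and is not the framework's actual feature selection logic; the function name `encode_column` and the sample values are hypothetical. Embedding free-text descriptions with a language model such as SciBERT is left out because it requires downloading a pretrained model.

```python
def encode_column(values):
    """Encode one datatype property as a list of feature vectors.

    Assumed rule for this sketch: booleans map to 0/1, numbers are kept
    as floats, and any other value is treated as a category and
    one-hot encoded.
    """
    if all(isinstance(v, bool) for v in values):
        return [[1.0 if v else 0.0] for v in values]
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return [[float(v)] for v in values]
    categories = sorted({str(v) for v in values})
    position = {c: i for i, c in enumerate(categories)}
    vectors = []
    for v in values:
        one_hot = [0.0] * len(categories)
        one_hot[position[str(v)]] = 1.0
        vectors.append(one_hot)
    return vectors


years = encode_column([2021, 2023])
open_access = encode_column([True, False])
venues = encode_column(["ISWC", "ESWC", "ISWC"])
print(years, open_access, venues)
```

Concatenating the vectors of all selected properties for a node would yield one fixed-length feature vector per node.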

Topology-based Node Features

Quick example for Topology-based Node Features Transformation: example-topologyfeatures directory.

AutoRDF2GML with topology-based node features is implemented in the Python script autordf2gml-tb.py. The configuration file template and its documentation are provided in config-template.ini. The topology-based features can be computed with the following KG embedding models: TransE, DistMult, ComplEx, and RotatE. The default parameters (e.g., a hidden channel size of 128) are defined and commented in the implementation.
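
KG embedding models learn a vector per entity and relation by scoring triples (head, relation, tail). As a rough illustration of what two of the listed models compute, the sketch below implements the TransE score (here with the L2 norm) and the DistMult score in plain Python. It uses toy 4-dimensional vectors rather than the default hidden channel size of 128, omits ComplEx and RotatE (which use complex-valued embeddings), and is not the framework's training code.

```python
import math


def transe_score(head, relation, tail):
    """TransE: negative L2 distance between (head + relation) and tail."""
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def distmult_score(head, relation, tail):
    """DistMult: sum of element-wise products of the three vectors."""
    return sum(h * r * t for h, r, t in zip(head, relation, tail))


# Toy embeddings (assumed values).
head = [1.0, 0.0, 2.0, 0.0]
relation = [0.0, 1.0, -1.0, 0.0]
tail = [1.0, 1.0, 1.0, 0.0]

print(transe_score(head, relation, tail))
print(distmult_score(head, relation, tail))
```

Higher scores indicate more plausible triples under each model. After training, the learned entity vectors are what would serve as topology-based node features.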

Contributing

Contributions to AutoRDF2GML are welcome!

License

AutoRDF2GML is made available under the MIT License.


davidlamprecht/AutoRDF2GML
