charlesvardeman/dGit-REU

# dGit: Semantic Data Management Based on git

dGit is a rough prototype that uses git as a source of provenance chain information to populate a linked open data (RDF) graph, using the W3C PROV ontology to describe entities in the repository. This software was developed as part of the NSF-funded Computational Science Research Experiences for Undergraduates (REU) program hosted in the Center for Research Computing at the University of Notre Dame. Undergraduate researchers India Stewart and Judy Long authored this software during the summer of 2013 under the supervision of Dr. Charles Vardeman as part of the NSF-funded Data and Software Preservation for Open Science (DASPOS) project. An overview poster from the project's final presentation is available in the doc directory of this repository and presents the vision for how the dGit system will eventually be implemented.

### Brief Overview

The software is structured to capture git command-line commands using Python, which allows provenance information to be gathered. This information is cast into the RDF formalism using the Python rdflib module and written in Turtle serialization to a .dgit directory located in the repository. Metadata stored in the .dgit directory is committed as part of the git repository so that the metadata stays connected to the data stored in the repository. Additionally, users may annotate repository entities by attaching triples to the repository object using the describe command. This allows users to attach any metadata to any file contained in the repository. Simple SPARQL searches were tested as a proof of concept for extracting metadata about objects in the repository. Merge, push, and pull git operations are also implemented as a proof of concept.
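
To make the idea concrete, here is a minimal sketch of how one commit and one user annotation could be expressed as PROV triples in Turtle. It is not the dGit implementation: dGit uses rdflib, whereas this sketch builds the Turtle text by hand with only the standard library so that it runs without extra dependencies. The `http://example.org/dgit/` namespace, the function names (`iri`, `literal`, `commit_to_turtle`, `describe`), and the sample commit data are all hypothetical.

```python
from urllib.parse import quote

BASE = "http://example.org/dgit/"  # hypothetical namespace, not taken from dGit
PROV = "http://www.w3.org/ns/prov#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def iri(kind, name):
    """Build an IRI such as <BASE + 'file/data/run1.csv'> for a repository object."""
    return "<{}{}/{}>".format(BASE, kind, quote(name, safe="/"))


def literal(text):
    """Escape a Python string as a Turtle string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"{}"'.format(escaped)


def commit_to_turtle(commit_hash, author, timestamp, files):
    """Describe one commit as a prov:Activity that generated each listed file."""
    commit = iri("commit", commit_hash)
    agent = iri("agent", author)
    lines = [
        "@prefix prov: <{}> .".format(PROV),
        "@prefix xsd: <{}> .".format(XSD),
        "",
        "{} a prov:Agent .".format(agent),
        "{} a prov:Activity ;".format(commit),
        "    prov:wasAssociatedWith {} ;".format(agent),
        '    prov:endedAtTime "{}"^^xsd:dateTime .'.format(timestamp),
    ]
    for path in files:
        lines.append("{} a prov:Entity ;".format(iri("file", path)))
        lines.append("    prov:wasGeneratedBy {} .".format(commit))
    return "\n".join(lines) + "\n"


def describe(path, predicate_iri, value):
    """Attach one user-supplied triple to a file, in the spirit of the describe command."""
    return "{} <{}> {} .".format(iri("file", path), predicate_iri, literal(value))


# Sample data is invented for illustration.
turtle = commit_to_turtle(
    "abc123", "Jane Doe", "2013-07-01T12:00:00Z", ["data/run1.csv"]
)
note = describe(
    "data/run1.csv", "http://purl.org/dc/terms/description", "Raw detector output"
)
print(turtle + note)
```

In a real setup the commit hash, author, and file list would come from git itself, and the resulting text would be written to a file under .dgit and committed alongside the data. Using rdflib instead of hand-built strings would also make the graph directly queryable with SPARQL.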
