Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rdf-api): Add a general-purpose SHACL validation utility (DSP-930) #1762

Merged
merged 4 commits into from Dec 1, 2020

Conversation

benjamingeer
Copy link

@benjamingeer benjamingeer commented Nov 25, 2020

resolves DSP-930.

  • Add SHACL validation API in ShaclValidator.scala.
  • Add implementations:
    • JenaShaclValidator
    • RDF4JShaclValidator
  • Return these from RdfFeatureFactory.
  • Add tests.
  • Update docs.

@benjamingeer benjamingeer self-assigned this Nov 25, 2020
@benjamingeer benjamingeer marked this pull request as draft November 25, 2020 15:42
@benjamingeer benjamingeer added API/V2 enhancement improve existing code or new feature labels Nov 25, 2020
@benjamingeer benjamingeer marked this pull request as ready for review November 26, 2020 11:50
Copy link
Contributor

@SepidehAlassi SepidehAlassi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks for implementing this.

* Recursively loads graphs of SHACL shapes from a directory and its subdirectories.
*
* @param baseDir the base directory that SHACL graphs are loaded from.
* @param dir the base directory or a subdirectory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the Shacl shapes are actually loaded from dir directory and its subdirectories, and baseDir is only used to get its path to make the relativePath. Can you please change the comments for baseDir and dir in scaladocs?

I also wonder, if only the path of the baseDir is needed here, why not just using its path as the parameter of this function, I mean

 private def  loadShaclGraphs(baseDirPath: Path, dir: File) : Map[Path, ShaclGraphT] = {...}

two directory parameters are a bit confusing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the Shacl shapes are actually loaded from dir directory and its subdirectories, and baseDir is only used to get its path to make the relativePath.... I also wonder, if only the path of the baseDir is needed here, why not just using its path as the parameter of this function

Actually, the SHACL shapes are loaded from baseDir and its subdirectories. File and Path are basically the same thing, but File is the old java.io API which I'm more used to, and Path is the newer java.nio API. The consensus in the Java community seems to be that seems that Path should be used instead of File for everything nowadays (https://stackoverflow.com/a/26658436). I guess we should do a PR sometime to use java.nio instead of java.io everywhere. For now, I've refactored AbstractShaclValidator to use java.nio to walk the directory tree, which seems clearer.

val graphToValidate: jena.graph.Graph = rdfModel.asJenaDataset.asDatasetGraph.getDefaultGraph

// Validate the data and get Jena's validation report.
val validationReport: jena.shacl.ValidationReport = shaclValidator.validate(shaclGraph, graphToValidate)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Jena Shacl validate is like pyShacl getting the same parameters and returning report and conforms in the same way.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standards are nice. :)

@benjamingeer benjamingeer merged commit bfd3192 into main Dec 1, 2020
@benjamingeer benjamingeer deleted the wip/DSP-930-shacl branch December 1, 2020 09:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API/V2 enhancement improve existing code or new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants