
Feature request: It should be possible to figure out why a check failed from the quality report page #293

Open
amoeba opened this issue Oct 5, 2021 · 3 comments

amoeba commented Oct 5, 2021

If you're an end-user looking at a quality report with one or more failed checks, it's pretty hard to figure out why a check failed or get any actionable advice on how to fix it. Take an example failed check, "A distribution contact identifier was not found":

[Screenshot: the quality report popover for the failed check "A distribution contact identifier was not found"]

My first question would be how this report determines whether or not I have a distribution contact identifier; my second step would be to go off and fix my metadata. Helping users see specifically why they're failing a check is a key part of metadata improvement, and I don't think we're quite helping users enough here.

A useful answer is contained in https://github.com/NCEAS/metadig-checks/blob/c8173af26d07774772be900d7222819409f79d27/src/checks/resource.distributionContactIdentifier.present.xml but you'd really have to know the system pretty well to get there and come up with a solid answer.

I think it'd be really nice if the popover in the above screenshot just went on to say:

A distribution contact identifier is defined as having content at either /eml/*/contact/userId or /eml/*/associatedParty/role[RoleType='distributor'].

This is actually kinda hard to do automatically right now with the current check schema and Metadig design because check success and failure don't depend just on the XPaths but also on the code that's run.

Curious to hear what others think, and any ideas for how we might improve things a bit here.

mbjones commented Oct 6, 2021

Good point, and one we've recognized too. At one point our web report linked the code for each check into the report, but we removed that after some usability testing in which people were not understanding the technical details. Probably needs a UI for tech folks somehow. But there is more info on each check available in the API.

We've also talked about, and prototyped, collection-level summaries showing which checks fail most often and where things are improving over time.

gothub commented Oct 6, 2021

@bryce thx for logging this issue from the LTER folks.

Is updating the assessment report schema to include an element that has a targeted message sufficient? What would that element be? Currently we just have <output>, which is fairly generic and can contain error messages or simple explanations.

Should the assessment report contain a reference to a 'best practices' document for FAIR metadata, similar to the LTER best practices?

amoeba commented Oct 7, 2021

Probably needs a UI for tech folks somehow.

I think this might be a good way to go, rather than bringing more information into the current UI. Could we write a couple of XSLTs to turn suites and checks into HTML and include those pages in an overall Metadig docs site? We could even link that from the quality reports for folks that are prepared to dive in.
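To sketch what one of those XSLTs might look like, here's a minimal, hypothetical stylesheet that renders a single check definition as an HTML page. The element names (name, description, and a list of xpath elements with a dialect attribute) are assumptions borrowed from the schema idea further down in this comment, not the current check schema, so it would need adapting to whatever we actually generate from:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>
  <!-- Hypothetical sketch: element names (name, description, xpaths/xpath)
       are assumed and would need to match the real check schema -->
  <xsl:template match="/check">
    <html>
      <body>
        <h1><xsl:value-of select="name"/></h1>
        <p><xsl:value-of select="description"/></p>
        <h2>What this check looks at</h2>
        <ul>
          <xsl:for-each select="xpaths/xpath">
            <li><code><xsl:value-of select="."/></code> (dialect: <xsl:value-of select="@dialect"/>)</li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>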

Is updating the assessment report schema to include an element that has a targeted message sufficient?

Maybe. Though the thing that immediately pops into my head is that the targeted message might get out of sync with the check itself. The current check schema makes my idea a bit harder, but we could consider changing the schema to look something like:

<check>
  <xpaths>
    <xpath dialect="eml">/eml/*/associatedParty/role[RoleType='distributor']/text()[normalize-space()]</xpath>
    <xpath dialect="eml">/eml/*/contact/userId/text()[normalize-space()]</xpath>
    <xpath dialect="isotc211">//*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='distributor']/party/*/partyIdentifier/MD_Identifier/code</xpath>
    <xpath dialect="isotc211">//*/distributionInfo/MD_Distribution/distributor/MD_Distributor/distributorContact/CI_ResponsibleParty[normalize-space(role/CI_RoleCode)='pointOfContact']/party/*/partyIdentifier/MD_Identifier/code</xpath>
  </xpaths>
  <dialect>
    <id>datacite</id>
    <name>DataCite 3.1</name>
    <xpath>boolean(/*[local-name() = 'resource'])</xpath>
  </dialect>
  <dialect>
    <id>dryad</id>
    <name>Dryad Data Package and Data File Modules</name>
    <xpath>boolean(/*[local-name() = 'DryadDataFile' or local-name() = 'DryadDataPackage'])</xpath>
  </dialect>
  <dialect>
    <id>eml</id>
    <name>Ecological Metadata Language</name>
    <xpath>boolean(/*[local-name() = 'eml'])</xpath>
  </dialect>
  <dialect>
    <id>isotc211</id>
    <name>ISO 19115 and ISO 19115-2 / ISO 19139, ISO 19139-2, ISO 19115-1, ISO 19115-3</name>
    <xpath>boolean(/*[local-name() = 'MI_Metadata' or local-name() = 'MD_Metadata'])</xpath>
  </dialect>
</check>

This way we could pull out the set of XPath expressions we actually ran for that check.
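As a rough sketch of that, and assuming the proposed schema above, a tiny XSLT (or the equivalent engine code) could take the detected dialect as a parameter and emit just the expressions that applied to the document being assessed:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Dialect detected by the engine, e.g. by evaluating each
       /check/dialect/xpath boolean against the metadata document
       until one returns true -->
  <xsl:param name="dialect" select="'eml'"/>
  <xsl:template match="/check">
    <!-- Emit only the expressions that were actually run for this dialect -->
    <xsl:for-each select="xpaths/xpath[@dialect = $dialect]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

The same selection (xpaths/xpath[@dialect = $dialect]) is all the report UI would need in order to show a user the exact expressions behind a failed check.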

Should the assessment report contain a reference to a 'best practices' document for FAIR metadata, similar to the LTER best practices?

I think that sounds pretty reasonable. It's hard to figure out what "The Fair Suite" is or what it's based on from the current quality report display.

To summarize my ideas here (they're not mutually exclusive):

  1. Build a documentation site for advanced users that's automatically generated from the suites/checks. Send this to users when they ask for more info, and consider linking to it from the quality report web pages.
  2. Make a schema change that makes it easier to programmatically pull the relevant XPaths for the document we just ran the quality report on. Show the XPaths in the current report UI.
