DHS: namespace in output N42s has no definition. #14

Open
jpbrodsky opened this issue Jun 8, 2022 · 3 comments

@jpbrodsky

InterSpec v1.0.10 rc2

I used the InterSpec "Export" function to produce an N42 (2012) file. This file looks like this (with some sections removed):

<?xml version="1.0" encoding="utf-8"?>
<RadInstrumentData n42DocUUID="92fc70f2-0f2d-4ac3-8018-2cf53b5b7b3c" xmlns="http://physics.nist.gov/N42/2011/N42" n42DocDateTime="2022-06-08T21:09:54Z">
  <Remark>Source of intrinsic activity:Cesium137</Remark>
  <RadInstrumentDataCreatorName>InterSpec</RadInstrumentDataCreatorName>
  ...
  <RadMeasurement id="Sample1">
    ...
  </RadMeasurement>
  ...
  <DHS:InterSpec version="1">
    <DisplayedSampleNumbers>2</DisplayedSampleNumbers>
    ...
  </DHS:InterSpec>
</RadInstrumentData>

The "DHS" namespace prefix in <DHS:InterSpec> is never declared in the document, as the XML namespaces standard requires (i.e. with an xmlns:DHS="..." attribute). This makes the file unparseable by the Python package lxml, which rejects it for non-compliance with the standard. As lxml is based on libxml2, presumably that library will also have trouble parsing these files.

lxml error message:
Namespace prefix DHS on Interspec is not defined, line xx, column yy

I suggest this issue be corrected by defining the DHS namespace in the output xml. While different parsers may be more or less tolerant of this issue, my understanding is that lxml is correct here in objecting to the use of an undefined namespace (even if I wish it might "loosen up a little" and parse the file regardless of this issue).
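
For illustration, the failure and the suggested fix can be reproduced without an InterSpec file. The sketch below uses Python's standard-library xml.etree.ElementTree (expat-based) rather than lxml, and a hand-written stand-in for the exported document. The URI bound to DHS is a made-up placeholder, not an official namespace, and `make_doc` and `parses` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

N42_NS = "http://physics.nist.gov/N42/2011/N42"
# Hypothetical URI, chosen only for this example; not an official DHS namespace.
DHS_NS = "urn:example:dhs-placeholder"

BODY = (
    '<DHS:InterSpec version="1">'
    "<DisplayedSampleNumbers>2</DisplayedSampleNumbers>"
    "</DHS:InterSpec>"
)


def make_doc(declare_dhs: bool) -> str:
    """Build a tiny stand-in for an exported N42 file."""
    dhs_attr = f' xmlns:DHS="{DHS_NS}"' if declare_dhs else ""
    return f'<RadInstrumentData xmlns="{N42_NS}"{dhs_attr}>{BODY}</RadInstrumentData>'


def parses(xml_text: str) -> bool:
    """Return True if a namespace-aware parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(parses(make_doc(declare_dhs=False)))  # prefix never declared
    print(parses(make_doc(declare_dhs=True)))   # prefix declared on the root
```

The first call reports False (expat raises an "unbound prefix" error) and the second reports True, so a single xmlns:DHS attribute on the root element is enough to satisfy a strict parser.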

@wcjohns
Collaborator

wcjohns commented Jun 8, 2022

Thanks for reporting this Jason.

This has been on my TODO list for quite a while. However, I wasn't aware that it affected anyone (all the other spectroscopy programs seem to read the files without issues), so it has been low priority. I'll bump it up in priority, but I can't promise a date for when it will be done.

A few things worth noting are:

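  • Everything inside the <DHS:InterSpec> tag is InterSpec-specific information, so it can likely be removed safely using something like sed or a regex (sorry, I know this is a pain!). For example, the following seems to work:
import re
from lxml import etree

# Read the exported N42 file as text.
with open("temp.n42", "r", encoding="utf-8") as n42file:
    n42_data = n42file.read()

# Drop the InterSpec-specific block, which uses the undeclared DHS prefix.
# Non-greedy match with DOTALL, so each match covers a single block.
clean_n42_data = re.sub(r'<DHS:InterSpec\b.*?</DHS:InterSpec>', '', n42_data, flags=re.DOTALL)

# lxml rejects str input that carries an encoding declaration, so pass bytes.
root = etree.XML(clean_n42_data.encode('utf-8'))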
  • The SpecUtils library has Python bindings (but you have to compile it from source, and you probably already have your code set up, so it may be too late to switch to using it to parse files).
  • In addition, even excluding the DHS namespace part, I wouldn't be surprised if there were one or two other small deviations from the N42 standard here or there (but at least the XML should be valid).

thanks again,
-will

@jpbrodsky
Author

Thanks, Will!

At this point, we're not planning on modifying our software to specifically support the output of InterSpec, but being able to read it alongside other N42s would be a nice bonus. SpecUtils may be a good solution for that, but for the reasons you mention it's a relatively large job to solve a somewhat small problem.

Best,
Jason

@Am6er

Am6er commented Aug 8, 2022

Or just do something like this until the fix arrives:

// Requires: using System.Xml;
// Bind the DHS prefix before parsing, for InterSpec compatibility.
XmlDocument xmldoc = new XmlDocument();
XmlReaderSettings settings = new XmlReaderSettings { NameTable = new NameTable() };
XmlNamespaceManager xmlns = new XmlNamespaceManager(settings.NameTable);
// The parser only needs the prefix to be bound to some URI.
xmlns.AddNamespace("DHS", "http://www.w3.org/2001/XMLSchema-instance");
XmlParserContext context = new XmlParserContext(null, xmlns, "", XmlSpace.Default);

RadInstrumentData radInstrumentData = new RadInstrumentData();
using (XmlReader reader = XmlReader.Create(filename, settings, context))
{
    // ... read or deserialize the document from `reader` here ...
}
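
For Python users, a rough equivalent of this idea is to declare the prefix in the text before handing it to the parser, which keeps the InterSpec block instead of deleting it. The following is only a sketch: `declare_dhs_prefix` is a hypothetical helper, the URI is an arbitrary placeholder (the thread does not name an official DHS namespace), and the standard-library parser stands in for lxml.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical placeholder URI; any URI lets the parser bind the prefix.
DHS_NS = "urn:example:dhs-placeholder"

# Hand-written stand-in for an exported file, not real InterSpec output.
SAMPLE = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<RadInstrumentData xmlns="http://physics.nist.gov/N42/2011/N42">'
    "<RadInstrumentDataCreatorName>InterSpec</RadInstrumentDataCreatorName>"
    '<DHS:InterSpec version="1">'
    "<DisplayedSampleNumbers>2</DisplayedSampleNumbers>"
    "</DHS:InterSpec>"
    "</RadInstrumentData>"
)


def declare_dhs_prefix(xml_text: str, uri: str = DHS_NS) -> str:
    """Add an xmlns:DHS declaration to the root start tag if it is missing."""
    if "xmlns:DHS=" in xml_text:
        return xml_text
    # \b keeps the pattern from matching <RadInstrumentDataCreatorName>.
    return re.sub(
        r"<RadInstrumentData\b",
        lambda m: f'{m.group(0)} xmlns:DHS="{uri}"',
        xml_text,
        count=1,
    )


if __name__ == "__main__":
    patched = declare_dhs_prefix(SAMPLE)
    root = ET.fromstring(patched.encode("utf-8"))
    # The InterSpec block is now reachable under the placeholder namespace.
    interspec = root.find(f"{{{DHS_NS}}}InterSpec")
    print(interspec.get("version"))
```

Patching the text rather than stripping the block means the InterSpec-specific settings stay available to code that wants them, at the cost of binding the prefix to a URI that InterSpec itself never declared.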
