Canonical measurements for container dimensions #1

Open
rpgoldman opened this issue Jun 21, 2021 · 18 comments

Comments

@rpgoldman
Collaborator

It's hard to provide clear specifications in a world that uses the measurement ontology, because measurements are things and if they are in the wrong properties, etc., we can get bad results.

But we would like the flexibility of putting arbitrary measurements on the containers.

One suggestion would be to have data properties on each of the containers that are measured in a canonical unit (probably millimetres), but also permit measurements and provide rules of some sort that will relate the canonical dimension properties to the properties that are measurement valued. That should allow us to, for example, automatically determine whether a plate will fit into a centrifuge, even if the plate's dimensions were recorded in the ontology in centimeters, or imperial units instead of metric.
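A minimal sketch of the fit check described above, assuming a hand-written table of conversion factors to millimetres. The unit names, the `TO_MM` table, the helper functions, and the slot dimensions are all hypothetical illustrations, not part of the container ontology or OM.

```python
from fractions import Fraction

# Hypothetical conversion factors to the canonical unit (millimetres).
# The unit names are illustrative, not identifiers from the OM ontology.
TO_MM = {
    "millimetre": Fraction(1),
    "centimetre": Fraction(10),
    "metre": Fraction(1000),
    "inch": Fraction(254, 10),  # 1 inch is exactly 25.4 mm
}


def to_canonical_mm(value: str, unit: str) -> Fraction:
    """Convert a recorded measurement to millimetres without rounding."""
    return Fraction(value) * TO_MM[unit]


def fits(plate_dims: dict, slot_dims: dict) -> bool:
    """True if every plate dimension is no larger than the matching slot dimension."""
    return all(
        to_canonical_mm(*plate_dims[axis]) <= to_canonical_mm(*slot_dims[axis])
        for axis in slot_dims
    )


# A plate recorded in centimetres, checked against a slot recorded in inches.
# The slot dimensions are made up for the example.
plate = {"length": ("12.776", "centimetre"), "width": ("8.548", "centimetre")}
slot = {"length": ("5.1", "inch"), "width": ("3.4", "inch")}
plate_fits = fits(plate, slot)
```

Both sides are normalized to the canonical unit before comparison, so the modeler can record dimensions in whichever unit is convenient.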

Alternatively, we could demand that modelers enter dimensions in millimetres (but permit them to add other measurements in different units, if they so please).

Thoughts? @jakebeal

@jakebeal
Member

If we're going to have a canonical unit, I believe it should be the actual canonical unit of meters, liters, etc.

Since people will want to enter their values in other units like mm, uL, and mL, I think that we should then add rules to support conversion. Maybe OM has them already too?

@danbryce
Collaborator

I had the same issue with time. We are planning to use a library to do the conversion before checking the temporal constraints. Jack found this: https://pint.readthedocs.io/en/stable/

@rpgoldman
Collaborator Author

@jakebeal -- I am a little worried about using the canonical unit of meters given potential floating point issues. Indeed, arguably we should use nanometers and integers instead of meters or millimeters and floats. But I don't think it matters what the canonical unit is, since the user should not have to see that.

Related to this, two further notes, which perhaps should be separate issues somewhere else:

  1. While the Measure ontology gives us a flexible way to enter specifications into the ontology, I don't see what it does for us on the property inspection side: if we have a length property in Python, what should it return? A unit-qualified value using whatever measure information is contained in the data structure? Or a numerical value, without units, that uses some canonical unit of measure?
  2. Most of the resources I have seen so far for the container ontology have had tolerances, so we get expressions like:
Measure and
  (hasNumericalValue only xsd:float[>= 127.51f , <= 128.01f]) and
  (hasUnit value millimetre)

Those are OK for using classification to decide whether one Individual or all Individuals of a given Class match a requirement, but will obviously cause problems for SBOL-factory. I imagine one could just use the central value for these. More generally, we could add an annotation that specifies what the typical value is so that, for example, we could make a guess when instantiating the ThermoFisher_EnduraPlate_96Well class.
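As a rough illustration of the central-value idea, here is a small sketch. The `ToleranceRange` class and its method names are hypothetical and do not come from the container ontology or SBOL-factory.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ToleranceRange:
    """A closed interval such as xsd:float[>= 127.51f , <= 128.01f]."""

    low: Decimal
    high: Decimal
    unit: str

    def contains(self, value: Decimal) -> bool:
        """Classification-style check: is the value inside the tolerance?"""
        return self.low <= value <= self.high

    def central_value(self) -> Decimal:
        """Midpoint of the range, used as a guess for the typical value."""
        return (self.low + self.high) / 2


plate_length = ToleranceRange(Decimal("127.51"), Decimal("128.01"), "millimetre")
typical_length = plate_length.central_value()
```

An explicit annotation for the typical value, if one were added, could simply take precedence over the computed midpoint.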


@danbryce We used pint a little on the LogX program and it seemed to behave well. It was nice that it integrated with numpy and pandas.

@jakebeal
Member

Floating point issues matter for precision, not scale, so we shouldn't have any issue there.
Similarly, I see no reason that we can't subclass measure to add a precision.
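
The point about precision versus scale can be checked with a short sketch. It compares the relative spacing of double-precision floats for the same physical length written in three units. The variable names are illustrative; `math.ulp` needs Python 3.9 or later.

```python
import math

# The same physical length expressed in three different units.
length_nm = 127_760_000.0
length_mm = 127.76
length_m = 0.12776

# math.ulp(x) is the gap between x and the next representable float.
# Dividing by x gives the relative spacing, which for normal doubles
# always lies between 2**-53 and 2**-52, whatever the unit.
relative_spacing = {
    "nanometre": math.ulp(length_nm) / length_nm,
    "millimetre": math.ulp(length_mm) / length_mm,
    "metre": math.ulp(length_m) / length_m,
}
```

This only addresses scale. A decimal literal such as 127.51 is still not exactly representable in binary floating point in any of these units, which is where integer or decimal representations would differ.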

@rpgoldman
Collaborator Author

Let's return to the example we have above for dimension requirements (here given more completely):

length some
  (Measure and
    (hasNumericalValue only xsd:float[>= 127.51f , <= 128.01f]) and
    (hasUnit value millimetre))

So we are saying that the length property must have some value that is a Measure object with the right units and its numerical value must be in the specified range.

This is automatically checkable (and note that this works in both directions -- if you tell me that something is an rdfs:subClassOf ANSI-SLAS-2004-1-compliant, then an OWL engine will know that it must have such a Measure in its length property).

But this also means that the length property cannot be functional -- because there can be multiple Measure objects in the length property, some with millimetre as the unit and others with different units of length.

I'd argue that being able to say:

canonicalLength only xsd:float[>= 127.51f , <= 128.01f]

is both a lot simpler, and captures the fact that plates have only a single length, not multiple ones (that OWL has no way of ensuring will be consistent).

We can still use Measure objects, just also build from them classes that have properties in canonical units. Unfortunately, I believe that can only be done outside of the OWL engine, but it is possible that using SWRL rules would make this possible; I haven't checked.

An alternative might be to have rules like the original one, and just ensure that any Measure in non-canonical units is automatically augmented by a measure in canonical units. So if we got a Measure denominated in meters, for example, we would store it, but also store another inferred Measure denominated in millimeters. This would impose a requirement on any processing engine.
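
A minimal sketch of that augmentation step, done outside the OWL engine. The `Measure` dataclass, the `inferred` flag, and the factor table are hypothetical stand-ins, not the OM classes themselves.

```python
from dataclasses import dataclass
from fractions import Fraction

CANONICAL_UNIT = "millimetre"
# Hypothetical factors to the canonical unit; not taken from OM.
FACTOR_TO_CANONICAL = {
    "millimetre": Fraction(1),
    "centimetre": Fraction(10),
    "metre": Fraction(1000),
}


@dataclass(frozen=True)
class Measure:
    value: Fraction
    unit: str
    inferred: bool = False  # True when added by the processing engine


def augment_with_canonical(measures: list) -> list:
    """Return the measures plus an inferred canonical one when none is present."""
    result = list(measures)
    if measures and not any(m.unit == CANONICAL_UNIT for m in measures):
        source = measures[0]
        converted = source.value * FACTOR_TO_CANONICAL[source.unit]
        result.append(Measure(converted, CANONICAL_UNIT, inferred=True))
    return result


# A length entered in metres is kept, and a millimetre Measure is added.
lengths = augment_with_canonical([Measure(Fraction("0.12776"), "metre")])
```

The original Measure is preserved, and running the step again on its own output adds nothing further.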

@jakebeal
Member

In SBOL, the pattern we've used is to attach a Measure to an object, then have the Measure indicate the property that it quantifies. UML uses a similar pattern. That would deal with the multiplicity issue, because you'd have a Measure pointing to a property rather than the property pointing to a set of Measures.

SHACL can be used to check whether an object has a Measure that points to the required property with the canonical units --- that's the sort of pattern it's good at.
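
To make the shape of that check concrete, here is a plain-Python stand-in. It is not SHACL and does not use the SBOL data model; the class names, the `quantifies` field, and the property names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Measure:
    value: float
    unit: str
    quantifies: str  # name of the property this measure describes


@dataclass
class Container:
    name: str
    measures: list = field(default_factory=list)


def violations(container: Container, required: dict) -> list:
    """List the required (property, unit) pairs that do not have exactly one measure."""
    problems = []
    for prop, unit in required.items():
        matches = [
            m for m in container.measures
            if m.quantifies == prop and m.unit == unit
        ]
        if len(matches) != 1:
            problems.append((prop, unit, len(matches)))
    return problems


plate = Container(
    "plate1",
    [Measure(0.011, "metre", "wellDepth"), Measure(200.0, "microlitre", "wellVolume")],
)
# wellVolume is present, but not in the required canonical unit.
report = violations(plate, {"wellDepth": "metre", "wellVolume": "litre"})
```

Requiring exactly one matching measure per property is what addresses the multiplicity concern in this sketch.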

I do want to push back on the idea of using anything other than one of the two standard metric unit systems (either MKS, on which SI is based, or CGS would be OK, though I tend to default to MKS). The reasons that I think we should stick with one of these two systems are:

  1. The bias towards small units comes from the particular protocols we happen to be working with right now. Larger scale protocols will use much larger quantities, e.g., when scaling up to a 500 liter tank.
  2. We're eventually likely to be dealing with a bunch of other units as well for more specialized quantities, e.g., specifications of capacitance or transmittance or elasticity. If we stray off of SI, then even though we may be able to automatically calculate, we are likely to have multiplying bugs because nobody will have good intuitions about how the alternative derived units compare.

@rpgoldman
Collaborator Author

For the purposes of classification, we cannot use the SBOL pattern. We can’t use something that points to a container as a basis for categorizing that container as one that meets the requirements of a standard, a protocol, etc.

(Sorry if this is curt: I’m answering from my phone)

@jakebeal
Member

When you get a chance to answer more fully, I would be interested in the blocker. I also think you may be slightly misunderstanding the pattern: the container would point to a set of measures that in turn indicate the properties of the container that they measure (e.g., well depth, well volume)

@rpgoldman
Collaborator Author

> When you get a chance to answer more fully, I would be interested in the blocker. I also think you may be slightly misunderstanding the pattern: the container would point to a set of measures that in turn indicate the properties of the container that they measure (e.g., well depth, well volume)

Sorry I didn't get around to answering this earlier. Here's an example of a class expression I use to define a plate that satisfies the constraints of SLAS-4-2004 for a 96-well plate:

Plate
 and (columnCount value 12)
 and (rowCount value 8)
 and (wellCount value 96)

I don't see how to do this kind of thing with the property axioms made "external" to the class definition as you suggest. At best it would be extremely cumbersome involving reifications in a way that couldn't easily be hidden from the programmer.

Where canonical units are concerned, I have axiomatizations like this:

Plate and
  (height some
    (Measure
     and (hasUnit value millimetre)
     and (hasNumericalValue only xsd:float[>= 13.59f , <= 15.11f])))

I hope it's clear what's challenging about this -- we still require the modeler to supply height measured in millimetres, or the class definition will malfunction.

I don't have a great solution for this problem.

@rpgoldman
Collaborator Author

For now I do all reasoning about plate heights using millimetre as the canonical unit. I wonder if there's a way I can force such heights to be present. Probably not. It will be up to the import process to make sure that any non-mm depth measurements are complemented by ones whose unit is millimetre.

@jakebeal
Member

Can I suggest that it would be better to have the canonical unit be the base unit meter? I've seen terrible pain and failures around different people having different assumptions about what the "normal" unit is.

@rpgoldman
Collaborator Author

@jakebeal I could move to using base units. Could you give me some advice about using OM to do that? In particular, I'm struggling to figure out how one uses OM2 to choose the right base unit.

It seems like I can find all the candidate units by matching a Dimension against the unit's hasDimension property. Then I can choose a SystemOfMeasurement and narrow down to the baseUnit for the Dimension in question. Is that correct?
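
The two-step lookup in that question could be sketched as follows. The tables are a hand-written stand-in invented for illustration; they say nothing about how OM2 actually organizes its Dimension, hasDimension, or SystemOfMeasurement data, and the function names are hypothetical.

```python
# Hypothetical unit data: unit name -> dimension it measures.
UNIT_DIMENSION = {
    "metre": "length",
    "centimetre": "length",
    "millimetre": "length",
    "inch": "length",
    "kilogram": "mass",
    "gram": "mass",
    "second": "time",
}

# Hypothetical base units per system of measurement.
BASE_UNITS = {
    "SI": {"metre", "kilogram", "second"},
    "CGS": {"centimetre", "gram", "second"},
}


def candidate_units(dimension: str) -> list:
    """Step 1: every unit whose dimension matches."""
    return sorted(u for u, d in UNIT_DIMENSION.items() if d == dimension)


def base_unit(dimension: str, system: str) -> str:
    """Step 2: narrow the candidates to the base unit of the chosen system."""
    matches = [u for u in candidate_units(dimension) if u in BASE_UNITS[system]]
    if len(matches) != 1:
        raise LookupError(f"expected one base unit for {dimension} in {system}")
    return matches[0]


length_base_si = base_unit("length", "SI")
length_base_cgs = base_unit("length", "CGS")
```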

An example in the OM-2 README suggests that I should be using both a Quantity and a Measure:

ex:_10Centimetres rdf:type om:Measure ;
  om:hasNumericalValue "10"^^xsd:double ;
  om:hasUnit om:centimetre .
ex:diameterOfApple1 om:hasValue ex:_10Centimetres ;
  a om:Diameter ;
  om:hasPhenomenon ex:apple1 .
ex:apple1 rdf:type ex:Apple .

So, this would suggest something like:

:_11mm a om:Measure ;
  om:hasNumericalValue "0.011"^^xsd:decimal ;
  om:hasUnit om:metre .
:_Corning96WellPlateHeight a om:Height ;
  om:hasPhenomenon cont:Corning96WellPlate ;
  om:hasValue :_11mm .
cont:Corning96WellPlate a cont:SLAS-4-2004-96WellPlate ;
  cont:hasWellHeight :_Corning96WellPlateHeight .

and in this case the relationship of cont:hasWellHeight would be made a subPropertyOf the inverse of om:hasPhenomenon (assuming there is such an inverse relation; I'm not sure).

Does this sound right? And how does it line up with how OM is being used in SBOL and PAML?

@rpgoldman rpgoldman reopened this Oct 15, 2021
@jakebeal
Member

I haven't attempted to do this reasoning automatically yet.

@bbartley
Collaborator

Hi @rpgoldman, whether correct or not, we didn't import either om:Dimension or om:Quantity into SBOL.

How I think we might express the relationships in your last example is more like this:

:_11mm a om:Measure ;
  om:hasNumericalValue "0.011"^^xsd:decimal ;
  om:hasUnit om:metre ;
  sbol:type om:Diameter .

Then I would probably use a SHACL rule to enforce the desired base unit on Measures of type Diameter.

That is probably wrong from the perspective of using an OWL reasoner, but it is how I naively approach the problem.

@rpgoldman
Collaborator Author

OK, I think the Right Thing for the container ontology will be to use the OM ontology as above, and then we will have some API that supports turning the container ontology entities into something in Python.

The container ontology already can't be translated by SBOL Factory, because it uses multiple inheritance and classification to decide whether or not a container is suitable. E.g.:

cont:ClearPlate and
 cont:SLAS-4-2004 and
 (cont:wellVolume some
    ((om:hasUnit value om:microlitre) and
     (om:hasNumericalValue only xsd:decimal[>= "200"^^xsd:decimal])))

Perhaps the easiest thing will be for us to have a micro-container OWL representation in PAML that can be translated back and forth from the container ontology.

That would also be the right place to deal with what works for numeric types in OWL (for some value of "works"), and what is appropriate for Python (translate all the numerical values into Python floats).

@photocyte
Member

Hi there, sorry to bump this relatively old thread. I just happened to notice the discussion of the pint Python library and the floating point concerns. Would the Python decimal module (https://docs.python.org/3/library/decimal.html) be helpful?

@rpgoldman
Collaborator Author

@photocyte I'm not sure if this would help, unless there would be some way to combine using exact decimals (rationals, really) with using pint to manage units.
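
For reference, a minimal sketch of exact decimal arithmetic combined with a hand-rolled unit table, using only the standard library. It does not use pint; the factor table and `convert` function are hypothetical.

```python
from decimal import Decimal

# Hypothetical factors to the base unit (metre), written as exact decimals.
FACTOR_TO_METRE = {
    "metre": Decimal("1"),
    "centimetre": Decimal("0.01"),
    "millimetre": Decimal("0.001"),
}


def convert(value: Decimal, from_unit: str, to_unit: str) -> Decimal:
    """Convert between units by going through the base unit."""
    return value * FACTOR_TO_METRE[from_unit] / FACTOR_TO_METRE[to_unit]


# 0.011 m converts to exactly 11 mm, and back again without drift.
height_mm = convert(Decimal("0.011"), "metre", "millimetre")
height_m = convert(height_mm, "millimetre", "metre")
```

This stays exact only while every factor is a power of ten. A factor such as 0.0254 for inches makes the division round to the Decimal context precision, which is where true rationals (`fractions.Fraction`) would be needed.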

Honestly, I'm not sure how the PAML library deals with units of measure; this would be more an issue for them.

For the container ontology, I believe it would be enough to translate all input measurements to canonical units. Note that this would not require us to remove the input measurements, whatever they were, just add the canonical measurement objects, as well.

@rpgoldman
Collaborator Author

I was thinking of using SWRL in the container server to add the canonical measurement Individuals, but I'm not entirely sure how to make this whole ball of OWL-wax stick together...
