Using UN CEFACT Codes

MichaelAndrews-RM edited this page Jul 8, 2018 · 3 revisions

Schema contains abbreviations of units. I couldn't find any reference on the web that gave these abbreviations.

Many properties that take quantitative values use UN/CEFACT codes to indicate the unit of measurement of the value.

For example, when specifying values relating to the properties of automobiles, the CEFACT codes cover both metric and non-metric units. The values of https://auto.schema.org/unitCode are codes, not "abbreviations". Codes are unique sequences of symbols that computers can process more reliably because there is exactly one sequence per meaning (i.e. no syntactic/lexical variability) and the very same sequence always refers to the same meaning (i.e. no code collision).

The full list of approximately 1000 CEFACT codes is available as a PDF from the United Nations: https://www.unece.org/fileadmin/DAM/cefact/recommendations/rec20/rec20_rev3_Annex3e.pdf

A shorter list of the most commonly used metric CEFACT codes is available at http://wiki.goodrelations-vocabulary.org/Documentation/UN/CEFACT_Common_Codes

When unitCode is used, the value is assumed to be a CEFACT code, unless it includes a prefix followed by a colon indicating that it is another kind of code.
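
As a sketch of the prefixed form, the snippet below gives the unit as the UCUM code for kilogram. The prefix "ucum" is a hypothetical choice made for illustration; it is an assumption here, not a prefix defined on this page.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "QuantitativeValue",
"value": "1490",
"unitCode": "ucum:kg"
}
</script>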

How do I use the unitCode to indicate the units of measurement?

We will provide two examples. In the first one, we want to indicate the weight of an automobile using the weight property.

It may not be obvious from the documentation, but the range of this property is

https://auto.schema.org/QuantitativeValue

The unit is encoded using the property

https://auto.schema.org/unitCode

So you can use any suitable UN/CEFACT Common Code to give the unit of a weight. For our example, we will specify the weight in kilograms using the CEFACT code, although we could have chosen another measure such as pounds (LBR). Note that the CEFACT code will be different from some common abbreviations. In CEFACT, kilogram is KGM, not KG (which represents a "keg" in CEFACT). The potential for such confusion is an important reason to use a standard code such as CEFACT.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Car",
"name": "Audi A3",
"weight": {
"@type": "QuantitativeValue",
"value": "1490",
"unitCode": "KGM"
}
}
</script>
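
For comparison, the same weight could be published in pounds using the LBR code. This is only a sketch: the value 3285 is a rounded conversion of 1490 kg, computed for this illustration rather than taken from any manufacturer data.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Car",
"name": "Audi A3",
"weight": {
"@type": "QuantitativeValue",
"value": "3285",
"unitCode": "LBR"
}
}
</script>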

The unitCode is also used when the data publisher uses an additionalProperty entity to express more complex values. For example, to indicate the voltage range of an electronic device, you will need to indicate the name of the value, the minimum and maximum values, and the units of the value (using the CEFACT code for volts, which is VLT).

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Product",
"name": "XYZ Smartphone Charger",
"additionalProperty": {
"@type": "PropertyValue",
"maxValue": "250",
"minValue": "100",
"name": "Operating Voltage",
"unitCode": "VLT"
}
}
</script>    

Why did schema.org adopt the CEFACT codes, and not something else?

Schema.org reused standard codes for measurement developed by the United Nations. Reusing existing codes reduces the effort for specifying conceptual elements in the vocabulary. CEFACT codes are used internationally to support electronic commerce. They are widely used in business, which means that they are often already included in corporate databases.

CEFACT covers around 1000 different units of measure, and will likely meet the needs of most data publishers.

Although other national and industry codes for measurement units exist, they are not as comprehensive as CEFACT, and they sometimes present naming conflicts.

Codes are machine readable. CEFACT common codes are short alphanumeric strings of at most three characters, such as KGM, LBR, or C62.

Codes don't contain the characters that are commonly used when abbreviating units, such as hyphens, slashes, mathematical operators such as multiplication signs, and superscripts and subscripts. Unit abbreviations that rely on such characters are hard to encode in a standardized manner, which makes them difficult for machines to parse.

Can I use a different code?

Data publishers are encouraged to use UN/CEFACT codes where available for optimal precision.

For units of quantitative data, you can use any unit of measurement you want, as long as there is either a UN/CEFACT Common Code or a URI for it, or if you can establish a standard prefix plus unique identifiers for it.
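
As a sketch of the URI option, the snippet below identifies the unit by a URI instead of a CEFACT code. The Wikidata entity URI for kilogram is used here purely as an assumed example of a unit URI; this page does not recommend any particular URI scheme.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "QuantitativeValue",
"value": "1490",
"unitCode": "http://www.wikidata.org/entity/Q11570"
}
</script>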

How do I indicate units if there is no CEFACT code for the measurement unit I want to specify?

You can use the unitText property. unitText is a string or text indicating the unit of measurement. It is useful if you cannot provide a standard unit code for unitCode.

For example, page sizes are not covered by CEFACT, although they are standard units of measurement defined by the International Organization for Standardization (ISO). You can indicate the ISO designation of the value through the unitText.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": ["Product", "Book"],
"name": "ABC Notebook",
"numberOfPages": "200",
"additionalProperty": {
"@type": "PropertyValue",
"value": "A5",
"name": "Paper size",
"unitText": "ISO"
}
}    
</script>

When using unitText, try to use standard abbreviations.

The unit of measure I need to use doesn't have a standard code or abbreviation

In some cases, data publishers will need to specify a unit for which there is no CEFACT code. This is most likely for measures that are historical, unique to a country, or aren't widely used internationally. In this example, we specify the size of a bottle of sparkling wine as "Magnum."

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Product",
"name" : "White sparkling wine",
"additionalProperty": {
"@type": "PropertyValue",
"name": "Bottle size",
"value": "Magnum",
"sameAs": "https://www.wikidata.org/wiki/Q23502"
}
}
 </script>   

If there's any possibility that the quantity can be interpreted in more than one way, it's helpful to indicate the units separately from the name and from the value. This avoids the problem of needing to indicate the units as part of the name (e.g. "Size expressed in X units"), or within the value itself ("6 units").
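
A minimal sketch of that separation, assuming the publisher states the bottle's volume instead of its traditional size name: a magnum holds 1.5 litres, so the name, the value, and the unit (CEFACT code LTR for litre) each get their own property. The property name "Bottle volume" is an illustrative choice.

<script type="application/ld+json">
{
"@context": "http://schema.org",
"@type": "Product",
"name": "White sparkling wine",
"additionalProperty": {
"@type": "PropertyValue",
"name": "Bottle volume",
"value": "1.5",
"unitCode": "LTR"
}
}
</script>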