rwilson4/unit_parser

unit_parser: Unit Parser and Conversions

Installing

Install from PyPI:

$ pip install unit_parser

Usage

This library is primarily for parsing strings representing physical quantities, like "5 feet" or "88 miles_per_hour". It can also be used for converting between compatible units and doing basic arithmetic operations.

The parsing function does double duty as a method for converting between units and is thus called "convert".

  >>> from unit_parser import unit_parser
  >>> up = unit_parser()
  >>> up.convert("3 gallons", "liters")
    11.356235352

It may seem a little strange to have the number as part of the string, but keep in mind this function is used in the context of converting text inputs. For convenience, the following (perhaps more intuitive) syntax works as well:

  >>> up.convert(3, "gallons", "liters")
    11.356235352

Note that the unit parser must be initialized before use by calling the unit_parser() function without any arguments. This loads the built-in unit specification file, which defines the units recognized by this library. If a unit you need is not supported, you can create your own unit specification file and provide the file name to this function.

The next thing we see is that physical quantities and units are represented by strings. I find this to be the most intuitive way of interacting with physical quantities. (As an aside, something like "3 gallons" is a physical quantity, while "liters" is a unit.)

Since we represent physical quantities by strings, the convert function makes it easy to accept an input in unknown units and ensure it ends up in the units the program needs. For example, suppose we have a program that does calculations on volumes of water, and an input (e.g. from a JSON file) specifies the volume of water to use. The user is free to specify the input in whatever units are most convenient, e.g. "2 gallons". The code that parses this input might call:

  >>> import json
  >>> from unit_parser import unit_parser
  >>> up = unit_parser()
  >>> with open('example.json', 'r') as f:
  ...     config = json.load(f)
  >>> water_volume = config['water_volume']
  >>> water_volume_liters = up.convert(water_volume, "liters")

In this way, no assumptions need to be made about what units the input is in. The code requires the water volume to be expressed in liters, but it doesn't need to know or care how it was specified.

Although the above example seems trivial, perhaps the most interesting feature is the flexible unit specification parsing. Units may be specified by a definition file (built-in, or provide your own!), or by combinations of units defined in that file.

For example, if 'second', 'meter', and 'kilogram' are defined by the file, the specification 'kilogram_meter_per_second_squared' is valid and parsed as expected. These compound specifications consist of tokens separated by underscores. Tokens include previously defined units like 'kilogram' as well as the special keywords 'per', 'squared', and 'cubed'.

The keyword 'per' may be used at most once per specification and directs that all subsequent tokens belong in the denominator of the unit.

The keywords 'squared' and 'cubed' indicate that the preceding token should be repeated once or twice more, respectively. They cannot be daisy-chained: 'meters_squared_squared' is not permitted, but 'meters_squared_meters_squared' or 'meters_cubed_meters' would be. They also apply only to the preceding token, so 'second_second_meter_meter' is equivalent to 'second_squared_meter_squared' but not 'second_meter_squared'.

Finally, 'second', 'seconds', and 'sec' are not automatically treated as equivalent units, but the unit definition file can and does create these as if they were aliases. For example, 'seconds' is defined as '1 second'.
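The token rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the library's actual parser: the function name parse_compound and its return value (two dictionaries of token counts) are assumptions made for this example, and it does not check that each token is a defined unit.

```python
from collections import Counter


def parse_compound(spec):
    """Split a compound unit spec into numerator and denominator token counts.

    Hypothetical helper for illustration; not part of unit_parser.
    """
    numerator, denominator = Counter(), Counter()
    target = numerator
    previous = None   # last unit token seen, eligible for 'squared'/'cubed'
    seen_per = False
    for token in spec.split("_"):
        if token == "per":
            if seen_per:
                raise ValueError("'per' may be used at most once")
            seen_per = True
            target = denominator  # all later tokens go in the denominator
            previous = None
        elif token in ("squared", "cubed"):
            if previous is None:
                raise ValueError(f"'{token}' must directly follow a unit token")
            # Repeat the preceding token once more (squared) or twice more (cubed).
            target[previous] += 1 if token == "squared" else 2
            previous = None  # forbids chaining such as 'meters_squared_squared'
        else:
            target[token] += 1
            previous = token
    return dict(numerator), dict(denominator)


print(parse_compound("kilogram_meter_per_second_squared"))
```

Under these assumptions, 'second_second_meter_meter' and 'second_squared_meter_squared' produce the same counts, while 'meters_squared_squared' raises a ValueError.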

We also permit simple arithmetic operations on units. There are functions "add", "subtract", "multiply", and "divide". Each function takes three arguments: two physical quantities, and the desired units of the answer.

  >>> from unit_parser import unit_parser
  >>> up = unit_parser()
  >>> up.add("5 meters", "2 feet", "yards")
    6.13473315836
  >>> up.subtract("5 meters", "2 feet", "yards")
    4.80139982502
  >>> up.multiply("5 meters_per_sec_squared", "2 kg", "pounds")
    2.248089431
  >>> up.divide("5 meters", "2 sec", "mph")
    5.59234073014
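To see what the add and subtract calls above are doing, the same arithmetic can be reproduced by hand: convert both quantities to a common base unit, combine them, then convert to the requested unit. The sketch below does not use unit_parser at all. The TO_METERS table and the helper names add_lengths and subtract_lengths are hypothetical, and the only outside facts used are the standard definitions 1 foot = 0.3048 m and 1 yard = 0.9144 m.

```python
# Hypothetical lookup table: each length unit expressed in meters.
TO_METERS = {"meters": 1.0, "feet": 0.3048, "yards": 0.9144}


def add_lengths(a, b, out_unit):
    """Add two (value, unit) lengths and express the sum in out_unit."""
    total_m = a[0] * TO_METERS[a[1]] + b[0] * TO_METERS[b[1]]
    return total_m / TO_METERS[out_unit]


def subtract_lengths(a, b, out_unit):
    """Subtract length b from length a and express the result in out_unit."""
    diff_m = a[0] * TO_METERS[a[1]] - b[0] * TO_METERS[b[1]]
    return diff_m / TO_METERS[out_unit]


print(add_lengths((5, "meters"), (2, "feet"), "yards"))
print(subtract_lengths((5, "meters"), (2, "feet"), "yards"))
```

The two printed values should agree with the add and subtract results shown above to within floating-point rounding.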

As mentioned above, this library ships with a unit specification file. It contains many of the most common units, but you may find some glaring omissions. For your particular use case, you may prefer to create your own unit specification file and include that with your application.

The unit specification file syntax is simple, if not intuitive. The file is plain-text, with key-value pairs separated by a colon. Comments may be included and begin with a pound symbol (#). Units are either specified as primitives, or in terms of other units. A primitive definition consists of specifying the signature of the unit, which is represented as a vector of integers (derived units may have negative entries). For example:

second: [1 0 0]

The entries of this vector correspond to the exponents of units in an arbitrary order that nonetheless needs to be consistent throughout the application. We might arbitrarily decide that time units go first, then length, then mass. Then since force has dimension mass times length divided by time squared, all units of force have signature [-2 1 1]. The way we define the second really just tells the program which index (the first) corresponds to a particular primitive. A more complete specification might look like this:

second: [1 0 0]
meter: [0 1 0]
kilogram: [0 0 1]
minute: 60 second
hour: 60 minute
# We can define a newton either like this:
newton: 1 kilogram_meter_per_second_squared
# or like this:
# newton: [-2 1 1]
# (first way preferred).

Once we have defined the "primitive" units, it is simple and intuitive to define other units recursively in terms of previously specified units. For example, the newton could have been defined in terms of its signature, but it is better to define it in terms of kilograms, meters, and seconds. A pound (of force) could not have been defined in terms of its signature, because although it has the same signature as the newton, it has a different magnitude. That is, a pound is not equal to one kilogram meter per second squared, and defining a unit by its signature implicitly assumes the magnitude is one.
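The recursive scheme described above can be sketched as a small loader that reduces every unit to a (factor, signature) pair relative to the primitives. This is an assumption-laden illustration, not the library's implementation: the names load_units and resolve are hypothetical, the [time length mass] ordering follows the example above, and the lbf entry (1 pound-force = 4.4482216152605 newtons) is added here only to show a unit that shares the newton's signature with a different magnitude.

```python
SPEC = """
second: [1 0 0]
meter: [0 1 0]
kilogram: [0 0 1]
minute: 60 second
hour: 60 minute
# Derived units refer to units defined earlier in the file.
newton: 1 kilogram_meter_per_second_squared
lbf: 4.4482216152605 newton
"""


def resolve(spec, units, dims=3):
    """Reduce a compound spec to (factor, signature) using already known units."""
    factor, signature = 1.0, [0] * dims
    sign, previous = 1, None
    for token in spec.split("_"):
        if token == "per":
            sign, previous = -1, None  # later tokens go in the denominator
            continue
        if token in ("squared", "cubed"):
            if previous is None:
                raise ValueError(f"'{token}' must directly follow a unit token")
            name, repeats = previous, (1 if token == "squared" else 2)
            previous = None
        else:
            name, repeats, previous = token, 1, token
        unit_factor, unit_sig = units[name]
        factor *= unit_factor ** (sign * repeats)
        signature = [s + sign * repeats * u for s, u in zip(signature, unit_sig)]
    return factor, signature


def load_units(text):
    """Parse 'name: definition' lines into a {name: (factor, signature)} table."""
    units = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, definition = (part.strip() for part in line.split(":", 1))
        if definition.startswith("["):
            # Primitive: a signature vector with an implied magnitude of one.
            units[name] = (1.0, [int(x) for x in definition.strip("[]").split()])
        else:
            quantity, spec = definition.split()
            factor, signature = resolve(spec, units)
            units[name] = (float(quantity) * factor, signature)
    return units


units = load_units(SPEC)
print(units["hour"], units["newton"], units["lbf"])
```

In this sketch the newton and lbf entries end up with the same signature but different factors, which is exactly why the pound of force cannot be defined by a signature alone.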

The included unit specification file uses MKS (meters, kilograms, seconds) as the base, but as long as the internal specification is consistent, that is transparent to the user of such a file. A unit specification file using imperial units as the base would be just as valid, and the end user would never even notice.

If you are the sort of person who enjoys reading esoteric Wikipedia pages, the flexible syntax gives you the fun opportunity to look up the official definitions of units and use them in the file. For example, it turns out the slug, which is a unit of mass in the imperial system, is actually defined as one pound of force times one second squared per foot. So this unit of mass is actually defined in terms of a unit of force, even though conceptually mass seems like the more primitive notion! The included unit file tries to be faithful to these definitions. The reader may quite reasonably ask if it makes any difference. The answer is no, except perhaps for the fun of it. (If you are not the sort of person who enjoys reading esoteric Wikipedia articles, this whole paragraph makes me sound like a weirdo.)

Finally, this library ships with a command line utility called "convert". This can be run from the command line like so:

$ convert 5 feet to meters
1.524

(The "to" is optional, but I find it more intuitive to include it.)
