Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalars on units on right-hand of -> are silently ignored #305

Open
Quelklef opened this issue Jan 30, 2024 · 9 comments
Open

Scalars on units on right-hand of -> are silently ignored #305

Quelklef opened this issue Jan 30, 2024 · 9 comments
Milestone

Comments

@Quelklef
Copy link

Hi! If I write a scalar multiple on the right-hand side of a unit conversion, such as in 1 MB -> (B * 100), the 100 seems to be silently ignored. See below:

>>> 1 MB -> B

  1 megabyte ➞ byte

      = 1_000_000 B    [DigitalInformation]
>>> 1 MB -> (B * 37)

  1 megabyte ➞ byte × 37

      = 1_000_000 B    [DigitalInformation]
>>> 1 MB -> B * 37

  1 megabyte ➞ byte × 37

      = 1_000_000 B    [DigitalInformation]

I would expect 1 MB -> B * 37 to be evaluated as (1 MB -> B) * 37

Apologies if this has already been reported; I wasn't able to find an existing issue

@Quelklef
Copy link
Author

Observed on numbat.dev as of 2024-01-30, FWIW

@Quelklef
Copy link
Author

Also, thanks for the lovely tool :-). It eases the pain on many day-to-day computations for me!

@eminence
Copy link
Contributor

Hi, can you expand a bit on what your expectations are?

Is 1 MB -> B * 37 the same thing as 1 MB * 37 -> B ?

@Quelklef
Copy link
Author

Hi! I would expect that one of three things would happen:

  1. 1 MB -> B * 37 would be equivalent to (1 MB -> B) * 37, which current seems to evaluate as (1 MB * 37) -> B
  2. 1 MB -> B * 37 would be parsed as 1 MB -> (B * 37), where B * 37 is the unit "37 bytes", so the output would be something like27027.027 (37 B) [DigitalInformation]
  3. 1 MB -> B * 37 would be an error of some kind

@eminence
Copy link
Contributor

eminence commented Jan 31, 2024

For me personally, I would expect 1 MB -> B * 37 to be an error. It's valid syntax, but semantically I don't think it makes much sense.

Option 2 is interesting, as it's a way to define anonymous adhoc units. Today you'd have to do something like:

>>> unit ThirtySevenBytes = 37 B

  unit ThirtySevenBytes: DigitalInformation = 37 byte

>>> 1 MB -> ThirtySevenBytes

  1 megabyte ➞ ThirtySevenBytes

    = 27027.0 ThirtySevenBytes    [DigitalInformation]

Edit: If we wanted to allow option 2, then maybe a more explicit syntax would be better: 1 MB -> unit 37 B

@sharkdp
Copy link
Owner

sharkdp commented Jan 31, 2024

I have thought about this multiple times in the past.

In my opinion, there are two approaches. The first approach is how Numbat currently behaves. The second is what you propose (if I understand correctly):

  1. a -> b converts the expression a to the unit of expression b (ignore the value of b)
  2. a -> b converts the expression a to value_of(a/b) × b (take the value of b into account as well)

To illustrate this, imagine you want to know how many 45 min slots (b) you can fit into 6 hours (a). Yes, of course: you can just divide a / b and get your answer (8), but I think it's reasonable to want to phrase this as a conversion problem.

In approach (1), we have something like

  6 hours ➞ 45 min

    = 360 min    # the '45' is ignored completely, we simply convert to minutes

In approach (2), we would probably print something like

  6 hours ➞ 45 min

    = 8 × 45 min

unless b has a value of 1, in which case we would skip the ... × 1 ... part and just print the same thing that we had in approach 1.

I think there is an advantage here compared to just typing 6 hours / 45 min and getting a plain 8 as a result. It's pretty clear in this example, but in the division case, someone might accidentally use the wrong order, type b / a when they actually wanted a / b, and get the inverse result without any error (still results in a scalar). However, if you accidentally type 45 min -> 6 hours, you would get 0.125 × 6 hours and immediately realize the mistake.

The other advantages are:

  • It still works the same way as approach 1 in most use cases, but can fulfill additional ones.
  • We can use this (or something very similar) to convert to constants, something that has been requested a few times already. You would just ask for 500 km/s -> speed_of_light and it would result in 4.6e-7 speed_of_light, instead of the (correct but) completely useless 138.889 m/s result that we currently print.

Just a curiosity: we can implement something similar to approach in Numbat already:

>>> fn convert<T>(a: T, b: T) = if value_of(b) == 1 then "{a -> b}" else "{value_of(a/b->1)} × {b}"
>>> convert(6 hours, min)

    = 360 min    [String]

>>> convert(6 hours, 45 min)

    = 8 × 45 min    [String]

@bsidhom
Copy link

bsidhom commented Feb 21, 2024

I think the use case for actually using scalars on the right is more obvious if you consider rate units. For example, 1/(30 mpg) -> L/(100 km). It would be really nice to have the correct units pop out (in L/100 km). I assume the reason for this behavior is so that you can put a variable on the right hand side and have units of the left hand side match that (I recall reading about this somewhere in the docs). Would it be possible and reasonable to support both use cases? IMO, it would make more sense to respect scales in the units (those do semantically change the units scale just line SI prefixes) and have some auxiliary mechanism to extract just the base units from a variable if that’s what’s desired. Then you can use this base unit on the right hand side.

On the other hand, rendering of the scalar might be confusing without a proper rational type in the units. For example, it wouldn’t be ideal to display the above example in 0.01 L/km. Is there any support for rationals at the moment?

@bsidhom
Copy link

bsidhom commented Feb 21, 2024

(Note that the above is what GNU Units does when producing output units.)

@sharkdp
Copy link
Owner

sharkdp commented Feb 21, 2024

I think the use case for actually using scalars on the right is more obvious if you consider rate units. For example, 1/(30 mpg) -> L/(100 km).

That is a good example, yes!

I assume the reason for this behavior is so that you can put a variable on the right hand side and have units of the left hand side match that (I recall reading about this somewhere in the docs).

Kind of the other way around. The current implementation treats lhs -> rhs as a binary operator between two expressions. And it simply takes the unit of the rhs expression and converts to that. Ignoring whatever value rhs has.

It was easy to implement it like this and I kind of liked that both sides are simply expressions. No special parsing rules, no additional AST nodes, etc.

And that implementation happened to fulfill this … -> variable use case.. which I found by accident and then put into the documentation. But it's not like this was intentionally designed this way.

It has been increasingly clear to me that we want -> to be something more powerful. And in fact, in the latest release, we already have a special feature where x -> f is a synonym for f(x), if f is of function type Fn((type x) -> _). We do this to support things like 2^31 -> hex or 120 hours -> human.

Would it be possible and reasonable to support both use cases? IMO, it would make more sense to respect scales in the units (those do semantically change the units scale just line SI prefixes) and have some auxiliary mechanism to extract just the base units from a variable if that’s what’s desired.

Yes! I think with the feature proposed here, we could still fulfill the "convert to unit of variable" use case, if someone really needs that. We already have that mechanism to extract the unit of a quantity. And it would even look much clearer, because we would call expr -> unit_of(variable).

And the new expr -> variable use case is even more important, I think. Imagine someone wants to see an electric charge in units of the electron_charge. They would simply do my_charge -> electron_charge. Or convert a speed to fractions of the speed of light with my_velocity -> speed_of_light.

So let's implement this!

On the other hand, rendering of the scalar might be confusing without a proper rational type in the units. For example, it wouldn’t be ideal to display the above example in 0.01 L/km. Is there any support for rationals at the moment?

There is no support for rationals, but I'm thinking about it (#4 (comment))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants