Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disappearing variables #391

Open
aquohn opened this issue Jul 28, 2021 · 6 comments
Open

Disappearing variables #391

aquohn opened this issue Jul 28, 2021 · 6 comments

Comments

@aquohn
Copy link

aquohn commented Jul 28, 2021

I'm trying to optimise an expression with 4 variables:

1/2 * sqrt((2/(t^2) + 1/(1-x)) * 2^(l - H))

with precondition

0 < t < 1 and 0 < x < 1 and l > 1 and H > 0

However, the final output is independent of x!
image

This was on my local install, with the following reproduce block:

herbie shell --seed 236171898 
(FPCore (t x l H)
  :name "1/2 * sqrt((2/(t^2) + 1/(1-x)) * 2^(l - H))"
  :precision binary64
  :pre (and (and (and (< 0.0 t 1.0) (< 0.0 x 1.0)) (> l 1.0)) (> H 0.0))
  (* (/ 1.0 2.0) (sqrt (* (+ (/ 2.0 (pow t 2.0)) (/ 1.0 (- 1.0 x))) (pow 2.0 (- l H))))))

x seems to disappear when Herbie Taylor expands around 0.

Running the online version gave me
image

With reproduce block

herbie shell --seed 1 
(FPCore (t x l H)
  :name "1/2 * sqrt((2/(t^2) + 1/(1-x)) * 2^(l - H))"
  :precision binary64
  :pre (and (and (and (< 0.0 t 1.0) (< 0.0 x 1.0)) (> l 1.0)) (> H 0.0))
  (* (/ 1.0 2.0) (sqrt (* (+ (/ 2.0 (pow t 2.0)) (/ 1.0 (- 1.0 x))) (pow 2.0 (- l H))))))

However, trying an equivalent version of the same expression on the online demo gives
image

x also disappears in the Taylor expansion step. The reproduce block is as follows.

herbie shell --seed 1 
(FPCore (t x l H)
  :name "1/2 * sqrt((2/(t^2) + 1/(1-x))) * sqrt(2^(l - H))"
  :precision binary64
  :pre (and (and (and (< 0.0 t 1.0) (< 0.0 x 1.0)) (> l 1.0)) (> H 0.0))
  (* (* (/ 1.0 2.0) (sqrt (+ (/ 2.0 (pow t 2.0)) (/ 1.0 (- 1.0 x))))) (sqrt (pow 2.0 (- l H)))))

I'm quite confused as to what's going on here; does it set x to 0 and then forget about it somehow during the Taylor expansion step?

@pavpanchekha
Copy link
Contributor

Great question. This sort of thing sometimes happens when Herbie tries a Taylor expansion and finds that a zeroth-order expansion is just as accurate. Kind of surprised how it felt that way in this case though! I'll try to dig into this expression at some point just to double check that nothing went wrong.

@fgreen
Copy link

fgreen commented Mar 9, 2023

In case it helps, I think sqrt(x*x + y*y + z*z) hits a similar issue:
https://herbie.uwplse.org/demo/5bc1447606a6726fb2937ef86767db0bf4aab8b1.1.6/graph.html#graphs

@pavpanchekha
Copy link
Contributor

Hi all, I had some spare time today and did a deep-dive into this issue, to get a deeper sense of what's going on than my reply above.

A quick reminder of the problem: When given this benchmark, herbie returns the following program:

[x, y, z] = sort([x, y, z])
hypot(x, z)

This is bizarre because y is unused, and also because a great alternative is available:

hypot(x, hypot(y, z))

Herbie’s result has an accuracy of something like 99.3%, while the better alternative has an accuracy of something like 99.9999%. On specific inputs, Herbie’s result is especially bad. On (1, 1, 1), the worst case, the error is about 22%. The alternative has an ULP of error on some points, but no more than that.

Let's recall my earlier explanation. The one-hypot version works because most points Herbie samples have x, y, and z with dramatically different orders of magnitude. After sorting, either x or z is the biggest, while y is the one in the middle. So Herbie's one-hypot version uses the biggest input—sometimes the biggest two—and that's more or less all that counts for most inputs. Cases where all three are reasonably big are very rare, which is why Herbie thinks its result is OK.

Today I looked into this a bit more, and got a fuller picture. The key is that Herbie ends up considering three similar programs: hypot(x, y), hypot(x, z), and hypot(y, z).

Now, for most points (3/4 of them), the three variables have different signs, and hypot(x, z) is best on those, as discussed in the previous paragraph. However, in a few cases all three variables have the same sign. In this case, either hypot(x, y) or hypot(y, z) is good, because those now consider the two largest inputs.

There is a worst case scenario, (1, 1, 1), where none of the three programs is good, but these are extremely rare and Herbie doesn't sample them. On the points we can actually sample, one of the three one-hypot variations is going to be good.

This causes Herbie to discard the two-hypot version. Internally, Herbie builds something called an alt-table, which (to simplify a bit) stores only the single best program on each sampled point. Here, the one-hypot versions are better than the two-hypot versions, because both have no error and the one-hypot version is cheaper.

Of course, this is a trap. Different inputs need different one-hypot versions, while the two-hypot version is good everywhere; Herbie doesn't take this into account. At the end of a run, when Herbie is actually called to output a single program, it picks the best one-hypot version, which indeed is OK, but the two-hypot version that it discarded earlier would have been better.

So the way I characterize the problem here is to say that our alt-table rule—evaluate all programs point-by-point and keep the best program at each point is what steers us wrong. On any individual point (except extremely rare ones), the two-hypot option loses to one of the one-hypot option. But it is better overall.

The big challenge here is how to fix things. The alt-table rule is important for filtering, because Herbie constructs thousands of options when working on this problem, and we can't store all of them. (Later passes like regimes are O(N^2) in the number of options.) It also isn't a bad thing that Herbie generates the one-hypot variations; Herbie is supposed to be robust to a few bad options sneaking in, and also if you really care about speed the one-hypot version may in fact be preferable.

Anyway, I'll leave this issue open—not because I don't want to fix it but because there's no easy fix. As we keep working on Herbie, perhaps we'll find ways to work around this!

@Carandiru0
Copy link

Carandiru0 commented Jul 4, 2023

@pavpanchekha pavpanchekha
this didn't happen on the previous version of Herbie, is the Taylor expansion a new feature?
I recently came into this problem with:
fma(fma(a, m, a), s, s) * l * 20.0
where Herbie returns a result of:
s * ((20 + 20*a) * l
where m is somehow dropped. The equations are not equal !
Is this simple math error a regression?
I would consider this a serious issue, rendering any output by Herbie into question and useless!
Is it because there are too many independent variables?

I read your explanation, dated Apr 14th. Isn't this issue a showstopper bug?

The main thing is, the earlier versions of Herbie didn't do this.

@pavpanchekha
Copy link
Contributor

Taylor expansion has been in Herbie since version 0.9, and we haven't made major changes in the last version. (But we did make it more obvious that the sort instruction is an important part of the output.) We've talked about replacing Taylor with something more sophisticated like Chebyshev or Remez approximation, but at the moment there are architectural issues that make that very difficult.

You can check if Herbie used Taylor approximations by looking at the "Derivation" below any alternative suggested by Herbie. I take it this was the submission you were talking about?

https://herbie.uwplse.org/demo/ea53ed137edb112a5651954cffeb139864572772.2.0/graph.html

@Carandiru0
Copy link

Yes that is the submission. The only reason I suspected Taylor expansion was due to it being labelled as so in the derivation's that you can open like you mentioned.
Anyways, I've never seen a result where the derivation is mathematically incorrect on Herbie until now. If the submission is helpful, great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants