Generated operations are in a different order on different platforms #343

gareth-cross · 2023-06-07T02:45:13Z

Describe the bug
Operations are placed in a different order on different platforms. This can result in unexpected numerical issues in some cases.

Testing details:

Linux: Ubuntu 22.04, Python 3.8.16, x86
Mac: OSX Ventura 13.3.1, Python 3.8.16, Apple M2/ARM
Symforce v0.8.0, installed from pip

To Reproduce

Run the following script to generate a test function on both OSX and Ubuntu:

from pathlib import Path

import symforce
symforce.set_symbolic_api('symengine')
symforce.set_epsilon_to_number(1.0e-16)

from symforce import geo
from symforce import symbolic as sm
from symforce.codegen import Codegen, CppConfig
from symforce import typing as T


def test_function(time: T.Scalar) -> geo.V4:
    rate = sm.pi / 6
    angle = rate * time
    x_axis = geo.V3([-sm.cos(angle), -sm.sin(angle), 0.0])
    z_axis = geo.V3([0, 0, 1])
    y_axis = z_axis.cross(x_axis)

    world_R_body = geo.Matrix33.column_stack(x_axis, y_axis, z_axis)
    world_R_body_quat = geo.Rot3.from_rotation_matrix(R=world_R_body)
    quat_elements = geo.V4(world_R_body_quat.to_storage())

    world_R_body_subbed = world_R_body.subs(time, 3.01).evalf()
    quat_elements_subbed = quat_elements.subs(time, 3.01).evalf()
    print(world_R_body_subbed)
    print(quat_elements_subbed)
    return quat_elements


if __name__ == '__main__':
    cg = Codegen.function(func=test_function, config=CppConfig())
    path = Path(__file__).parent.absolute()
    cg.generate_function(output_dir=path / 'output')

Observe the output of the python script (the values of world_R_body_subbed and quat_elements_subbed)

[0.00523596383141919, 0.999986292247427, 0.0]
[-0.999986292247427, 0.00523596383141919, 0.0]
[0.0, 0.0, 1.0]

[0.0]
[0.0]
[-0.705253158861618]
[0.708955557080773]

Observe that the order of operations is different on the two platforms. This is the exact diff:

35,42c35,43
<   const Scalar _tmp4 = -_tmp1;
<   const Scalar _tmp5 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp3 + 1));
<   const Scalar _tmp6 = 1 - std::max<Scalar>(0, -(((_tmp4 + _tmp5) > 0) - ((_tmp4 + _tmp5) < 0)));
<   const Scalar _tmp7 = std::min<Scalar>(1, _tmp6);
<   const Scalar _tmp8 = std::min<Scalar>(_tmp6, 1 - std::max<Scalar>(0, _tmp7));
<   const Scalar _tmp9 = _tmp5 + 1;
<   const Scalar _tmp10 = std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp7, _tmp8)),
<                                          1 - std::max<Scalar>(0, -(((_tmp9) > 0) - ((_tmp9) < 0))));
---
>   const Scalar _tmp4 = _tmp3 + 1;
>   const Scalar _tmp5 = -_tmp1;
>   const Scalar _tmp6 = -std::max<Scalar>(1, std::max<Scalar>(_tmp4, _tmp5));
>   const Scalar _tmp7 = 1 - std::max<Scalar>(0, -(((_tmp5 + _tmp6) > 0) - ((_tmp5 + _tmp6) < 0)));
>   const Scalar _tmp8 = std::min<Scalar>(1, _tmp7);
>   const Scalar _tmp9 = std::min<Scalar>(_tmp7, 1 - std::max<Scalar>(0, _tmp8));
>   const Scalar _tmp10 =
>       std::min<Scalar>(1 - std::max<Scalar>(0, std::max<Scalar>(_tmp8, _tmp9)),
>                        1 - std::max<Scalar>(0, -(((_tmp6 + 1) > 0) - ((_tmp6 + 1) < 0))));
46,47c47,48
<       1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp7, _tmp8))),
<       1 - std::max<Scalar>(0, -(((_tmp3 + _tmp9) > 0) - ((_tmp3 + _tmp9) < 0))));
---
>       1 - std::max<Scalar>(0, std::max<Scalar>(_tmp10, std::max<Scalar>(_tmp8, _tmp9))),
>       1 - std::max<Scalar>(0, -(((_tmp4 + _tmp6) > 0) - ((_tmp4 + _tmp6) < 0))));
53,54c54,55
<   _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp7 * (1 - _tmp7);
<   _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
---
>   _res(0, 0) = (Scalar(1) / Scalar(2)) * _tmp8 * (1 - _tmp8);
>   _res(1, 0) = (Scalar(1) / Scalar(2)) * _tmp9 * (1 - _tmp9);

Run the C++ test code and evaluate at time=3.01:

#include <fmt/format.h>

#include "output/cpp/symforce/sym/test_function.h"

int main() {
  const double time = 3.01;
  auto v = sym::TestFunction(time);
  fmt::print("time = {:>6}, v = {:>6}, {:>6}, {:>6}, {:>6}\n", time, v[0], v[1],
             v[2], v[3]);
  return 0;
}

Observe that the numerical output is non-trivially different:

Mac version (resultant quaternion has zero norm):
time =   3.01, v =      0,      0,      0,      0
Linux version (resultant quaternion matches python code):
time =   3.01, v =      0,      0, -0.7052531588616179, 0.7089555570807733

NOTE: In this particular case, there may additionally be a bug in Rot3.from_rotation_matrix - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.

Expected behavior
Ideally order of operations would be identical on all operating systems.

Environment

OS and version: Ubuntu 22.04, OSX Ventura
Python version: 3.8
SymForce Version 0.8

The text was updated successfully, but these errors were encountered:

aaron-skydio · 2023-06-07T17:47:54Z

This is a weird one - we had similar issues in the past when SymEngine was built with different implementations of std::unordered_map. I'm a little surprised we haven't seen this, we should definitely fix

NOTE: In this particular case, there may additionally be a bug in Rot3.from_rotation_matrix - the epsilon value is unused. This example just highlights how unpredictable behavior emerges from this issue.

Yep, good callout - this is not a bug but is definitely surprising - our current implementation of from_rotation_matrix doesn't require an epsilon, but we still have the argument mostly to not break the API. In general we aim for our generated code to be identical everywhere, and treat it as a bug if it's not, regardless of whether evaluated numerical results are meaningfully different

gareth-cross added the bug Something isn't working label Jun 7, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generated operations are in a different order on different platforms #343

Generated operations are in a different order on different platforms #343

gareth-cross commented Jun 7, 2023

aaron-skydio commented Jun 7, 2023 •

edited

Generated operations are in a different order on different platforms #343

Generated operations are in a different order on different platforms #343

Comments

gareth-cross commented Jun 7, 2023

aaron-skydio commented Jun 7, 2023 • edited

aaron-skydio commented Jun 7, 2023 •

edited