Best method to compute gradients, Jacobians and Hessians #25933
I work on solving nonlinear optimization problems. The state of the art method to compute derivatives is automatic differentiation, which is provided by libraries like JAX, Aesara, CasADi, etc. Symbolic expressions can be converted into functions via lambdify. Within lambdify, options can be passed so that the generated function is evaluated using JAX or Aesara. From the point of view of computational efficiency, which of these approaches is better?

A. Differentiate symbolically with SymPy and lambdify the resulting derivative expressions.
   Remark: symbolic differentiation is much slower than automatic differentiation, so in principle I expect this to be the slowest.
B. Lambdify the expression with the JAX (or Aesara) backend and apply that library's automatic differentiation to the generated function.
C. Convert the expression into CasADi's compute graph and use CasADi's automatic differentiation.
Secondly, in which situations can I benefit from using SymPy over the above mentioned approaches?
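For concreteness, here is a minimal sketch of how I understand A and B for a scalar function (the expression and the `modules='jax'` option are just illustrative):

```python
# Minimal sketch contrasting approach A and approach B for f(x) = sin(x**2).
import jax
import sympy as sp

x = sp.symbols('x')
f = sp.sin(x**2)

# A: differentiate symbolically first, then generate a numerical function.
dfdx_sym = sp.lambdify(x, f.diff(x), modules='jax')

# B: generate the numerical function first, then let JAX differentiate it.
dfdx_ad = jax.grad(sp.lambdify(x, f, modules='jax'))

print(dfdx_sym(1.0), dfdx_ad(1.0))  # both give 2*cos(1) ~= 1.0806
```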
---
If you are concerned with the performance of numerical evaluation, then your A will most likely provide the fastest result. If you are concerned with the performance of generating the numerical functions, then A may well be slower than B and C.

Automatic differentiation is only "state of the art" in the sense that it allows you to differentiate more arbitrary numerical code. But there is a performance cost for using these more generalized methods. I show in this recent blog post a solution to a nonlinear optimization problem where a careful implementation that follows your "A" gives very high performance numerical evaluation: https://mechmotum.github.io/blog/czi-sympy-wrapup.html. I doubt any of the alternative methods you mention above can beat it.

opty solves optimal control problems using your method A, and we have introduced a very fast symbolic differentiation algorithm there. pycollo also solves optimal control problems but does what you describe in "C" and relies on CasADi's compute graph; however, CasADi currently can't manage such a large compute graph, so SymPy is the only option.

The bottom line is that you have to define your performance metrics very precisely and then try A, B, and C for specific problems. opty supports JAX and pycollo supports CasADi, so you can do some comparisons easily with those two tools. For small expression trees there may be little difference, but for large ones my bet is on method A for the highest numerical evaluation performance that retains maximum accuracy.
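For reference, method A in its simplest form looks something like the sketch below (the expressions are placeholders, not the opty or pycollo implementations):

```python
# Minimal sketch of method A: symbolic Jacobian/Hessian, then code generation.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
F = sp.Matrix([x1**2 + sp.sin(x2), sp.exp(x1)*x2])  # example vector function

J = F.jacobian([x1, x2])        # symbolic Jacobian, computed once
H = sp.hessian(F[0], [x1, x2])  # symbolic Hessian of the first component

# Generate numerical functions from the symbolic derivatives.
J_num = sp.lambdify((x1, x2), J, modules='numpy')
H_num = sp.lambdify((x1, x2), H, modules='numpy')

print(J_num(1.0, 2.0))
print(H_num(1.0, 2.0))
```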
---
The problem with lambdify is that it "compiles" to Python code, and Python code is not fast for this kind of floating point evaluation. (Actually, with PyPy this sort of thing is a lot faster, so that might be worth trying.) You either want to generate C code as @moorepants says, or you should use something that compiles to machine code in memory like numba or llvmlite. You might want to try SymPy's llvmjitcode module. First:

```python
In [1]: import numpy as np
   ...: from sympy import sin, symbols
   ...: from sympy.printing.llvmjitcode import llvm_callable as lambdify_llvm
   ...: x = symbols('x')
In [2]: N = 200
In [3]: # sympy: build the nested expression sin(sin(...sin(x)...)), N deep
   ...: expr = x
   ...: for i in range(N):
   ...:     expr = sin(expr)
   ...:
In [4]: %time ed = expr.diff(x)
CPU times: user 8.05 s, sys: 2.17 s, total: 10.2 s
Wall time: 10.3 s
In [5]: f = lambdify_llvm([x], ed)
In [6]: %time f(1)
CPU times: user 1.34 ms, sys: 225 µs, total: 1.56 ms
Wall time: 1.58 ms
Out[6]: 0.0013500947626989782
In [7]: %time f(1)
CPU times: user 1.54 ms, sys: 0 ns, total: 1.54 ms
Wall time: 1.56 ms
Out[7]: 0.0013500947626989782
In [9]: %timeit f(1)
425 µs ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
In [10]: vals = list(map(float, np.linspace(1, 10, 100)))  # evaluation points, matching the sessions below
In [11]: %timeit [f(v) for v in vals]
43.1 ms ± 71.2 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

Also here are timings for protosym using llvmlite:

```python
In [1]: import numpy as np
   ...: from protosym.simplecas import sin, x, lambdify
In [2]: N = 200
In [3]: expr = x
   ...: for i in range(N):
   ...:     expr = sin(expr)
   ...:
In [4]: %time ed = expr.diff(x)
CPU times: user 24.1 ms, sys: 15 µs, total: 24.1 ms
Wall time: 22.6 ms
In [6]: %time f = lambdify([x], ed) # uses llvm
CPU times: user 71.9 ms, sys: 7.44 ms, total: 79.3 ms
Wall time: 77.4 ms
In [7]: vals = list(map(float, np.linspace(1, 10, 100)))
In [8]: %time f(1)
CPU times: user 65 µs, sys: 9 µs, total: 74 µs
Wall time: 89.6 µs
Out[8]: 0.0013500947626989782
In [9]: %timeit [f(v) for v in vals]
593 µs ± 753 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```

Also SymEngine plus llvmlite:

```python
In [10]: from symengine import sin, lambdify, symbols
In [11]: x = symbols('x')
In [12]: expr = x
    ...: for i in range(N):
    ...:     expr = sin(expr)
    ...:
In [13]: %time ed = expr.diff(x)
CPU times: user 13.4 ms, sys: 23 µs, total: 13.4 ms
Wall time: 16.2 ms
In [14]: %time f = lambdify([x], [ed]) # uses llvm
CPU times: user 67.7 ms, sys: 38.2 ms, total: 106 ms
Wall time: 337 ms
In [15]: %time f(1)
CPU times: user 203 µs, sys: 16 µs, total: 219 µs
Wall time: 234 µs
Out[15]: array(0.00135009)
In [16]: %timeit [f(v) for v in vals]
1.71 ms ± 17.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```
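For completeness, the numba route mentioned at the start works roughly like this (a minimal sketch, not from the sessions above; it assumes numba is installed and uses a small stand-in expression so JIT compilation stays quick):

```python
# Sketch of the numba route: lambdify to a plain NumPy function, then
# JIT-compile that function to machine code. Assumes numba is installed.
import numba
import sympy as sp

x = sp.symbols('x')
ed = sp.sin(sp.sin(sp.sin(x))).diff(x)       # small stand-in expression

f_py = sp.lambdify(x, ed, modules='numpy')   # "compiles" to Python/NumPy
f = numba.njit(f_py)                         # compiles that to machine code

print(f(1.0))  # first call triggers JIT compilation; later calls are fast
```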