Performance drop in QuTiP 5.0.1 vs QuTiP 4.7.6 #2406

gautierronan · 2024-04-24T12:11:14Z

Bug Description

While benchmarking dynamiqs vs QuTiP, I've noticed a large performance drop of qutip.mesolve when going from v4.7.6 to v5.0.1. On the example below, the benchmarks show:

# QuTiP 4.7.6
# 1.42 s ± 3.17 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# QuTiP 5.0.1
# 14.5 s ± 2.32 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

Found similar behavior on two different CPUs (mac M2, and AMD Ryzen 7).

Code to Reproduce the Bug

import timeit

import numpy as np
import qutip as qt


def init(
    kappa_2: float = 1.0,
    g_cnot: float = 0.3,
    nbar: float = 4.0,
    num_tsave: int = 100,
    N: int = 16,
):
    # time evolution
    alpha = np.sqrt(nbar)
    gate_time = np.pi / (4 * alpha * g_cnot)
    tlist = np.linspace(0.0, gate_time, num_tsave)

    # operators
    ac = qt.tensor(qt.destroy(N), qt.qeye(N))
    nt = qt.tensor(qt.qeye(N), qt.num(N))

    # Hamiltonian
    H = g_cnot * (ac + ac.dag()) * (nt - nbar)

    # collapse operators
    c_ops = [np.sqrt(kappa_2) * (ac**2 - nbar)]

    # initial state
    plus = (qt.coherent(N, alpha) + qt.coherent(N, -alpha)).unit()
    psi0 = qt.tensor(plus, plus)

    kwargs = {'H': H, 'rho0': psi0, 'tlist': tlist, 'c_ops': c_ops}
    return kwargs

kwargs = init()
%timeit qt.mesolve(**kwargs)

Your Environment

Numpy Version:      1.25.2
Scipy Version:      1.11.2
Cython Version:     0.29.37
Matplotlib Version: 3.7.2
Python Version:     3.11.4
Number of CPUs:     8
BLAS Info:          OPENBLAS
INTEL MKL Ext:      False
Platform Info:      Darwin (arm64)

The text was updated successfully, but these errors were encountered:

nwlambert · 2024-04-24T13:34:49Z

Thanks for pointing this out.. I had a quick play around, this seems to be some overhead from the normalization flag in mesolve. In v5 if I use options = {"normalize_output": False} I see basically same runtimes as v4.7. we should probably think about making this default to false if its having such a big cost.

To make it really "like for like" you can also wrap it so that it uses CSR format for everything

with qt.CoreOptions(default_dtype="csr"):
    kwargs = init()
    options = {"normalize_output": False}
    %timeit qt.mesolve(**kwargs, options = options)

Which data-type is best for is a bit problem dependent. let us know if this gives you comparable results!

On the topic of dynamiqs, you might be interested in trying out the jax/jaxdia data layer, there's some examples here
https://github.com/nwlambert/QuTiP-Jax-GPU-example

We also tried this example with dynamiqs (with its double precision setting), I think you guys came out slightly ahead speed-wise because I think you dont vectorize/use superoperators, which seems to suit jax/diffrax! though if i did that manually with qutip we essentially get the same, which I guess makes sense. The jaxdia data layer lets us go to a few more spins (more with sesolve, not so many with mesolve!)

gautierronan · 2024-04-24T13:49:48Z

Thanks, indeed the options = {"normalize_output": False} seems to do the trick, but I don't see why normalizing the output state would take so long? Is it normalized at every time step? The system is not that big, so computing the trace of a bunch of density matrices should be almost instantaneous.

On the topic of dynamiqs, you might be interested in trying out the jax/jaxdia data layer

Yes we're in the process of implementing a sparse DIA format as well. This is something I had suggested to @Ericgig when visiting Sherbrooke in early 2023, but you guys were much faster than us to make it work! I'm quite happy that this is being spread around. Maybe we'll be able to push cuSPARSE for a native DIA format at some point.

Ericgig · 2024-04-24T18:14:05Z

I looked at it too and the issue is with the norm function. The norm trace function is used (tr(sqrt(A @ A.dag()))) and it's quite slow. Using the normal trace does not cause visible slowdown.
We should be able to fix that on our side,

Ericgig mentioned this issue Apr 24, 2024

Faster dm norm in solvers #2408

Merged

Ericgig closed this as completed in #2408 Apr 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance drop in QuTiP 5.0.1 vs QuTiP 4.7.6 #2406

Performance drop in QuTiP 5.0.1 vs QuTiP 4.7.6 #2406

gautierronan commented Apr 24, 2024 •

edited

nwlambert commented Apr 24, 2024 •

edited

gautierronan commented Apr 24, 2024

Ericgig commented Apr 24, 2024

Performance drop in QuTiP 5.0.1 vs QuTiP 4.7.6 #2406

Performance drop in QuTiP 5.0.1 vs QuTiP 4.7.6 #2406

Comments

gautierronan commented Apr 24, 2024 • edited

Bug Description

Code to Reproduce the Bug

Your Environment

nwlambert commented Apr 24, 2024 • edited

gautierronan commented Apr 24, 2024

Ericgig commented Apr 24, 2024

gautierronan commented Apr 24, 2024 •

edited

nwlambert commented Apr 24, 2024 •

edited