atexit.unregister function speeds up computation sometimes #258

Open
agoscinski opened this issue Aug 16, 2018 · 1 comment

Comments

@agoscinski

I have two (almost) identical simulations of a circuit in two different IPython notebooks. In one notebook the simulation is much faster than in the other. The effect does not depend on the notebook itself, since I have seen both the fast and the slow case in each of them. I have appended a snippet of the %prun logs at the end of this post. In both cases the C++ simulator should be in use (I assume this because _simulator.py:55(__init__) appears in the log). I suspect I initialized something differently, but I do not know what. In the fast case atexit.unregister accounts for far more time than in the slow case (1.272 s versus 0.020 s of internal time), but I do not understand what effect this function has on the simulation.
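
For reference, here is a minimal standard-library sketch of what atexit.unregister does on its own. It does not show where or why ProjectQ calls it (the logs above do not reveal that); the `cleanup` function and the `calls` list are hypothetical names used only for illustration.

```python
import atexit

calls = []  # records whether the handler ever ran


def cleanup():
    # Hypothetical exit handler; a real one would release resources.
    calls.append("cleanup")


# Schedule cleanup() to run when the interpreter terminates.
atexit.register(cleanup)

# Remove it again: cleanup() will no longer run at interpreter exit.
atexit.unregister(cleanup)

# Unregistering a function that is not registered is a silent no-op.
atexit.unregister(cleanup)
```

In other words, the call only edits the list of exit handlers and does no simulation work itself, so time attributed to it in a profile is bookkeeping around object cleanup rather than circuit evaluation.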

I initialize the engine and the circuit and perform the measurement inside the objective function passed to scipy.optimize.minimize. The broad code structure is:

import projectq
import scipy.optimize

def experiments(parameter):
    for _ in range(100):
        # A fresh engine and simulator backend are created for every run
        eng = projectq.MainEngine(backend=projectq.backends.Simulator(gate_fusion=True), engine_list=[])
        q = eng.allocate_qubit()
        circuit(parameter, eng, q)  # user-defined circuit, not shown here
        projectq.ops.Measure | q
        eng.flush()
        # ... some non-ProjectQ postprocessing ...

scipy.optimize.minimize(experiments, init_parameter, ...)

I also tried putting all the code in one function and initializing the engine at different points, but that does not seem to change the result.

Here is a snippet of the %prun log from the fast computation:

         1066623 function calls (1066621 primitive calls) in 3.150 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      766    1.272    0.002    1.272    0.002 {built-in method atexit.unregister}
    13000    0.277    0.000    1.098    0.000 _simulator.py:350(_handle)
    14600    0.180    0.000    1.295    0.000 _simulator.py:422(receive)
    23421    0.135    0.000    0.135    0.000 {built-in method numpy.core.multiarray.array}
    23400    0.117    0.000    0.221    0.000 {built-in method __new__ of type object at 0x9e5d60}
     7200    0.115    0.000    0.130    0.000 {built-in method numpy.core.multiarray.dot}
    23401    0.090    0.000    0.090    0.000 {built-in method _warnings.warn}
    23400    0.085    0.000    0.544    0.000 defmatrix.py:112(__new__)
    14600    0.064    0.000    1.419    0.000 _command.py:86(__init__)
   217841    0.053    0.000    0.053    0.000 {built-in method builtins.isinstance}
    13600    0.045    0.000    0.057    0.000 _basics.py:123(make_tuple_of_qureg)
    37800    0.039    0.000    0.120    0.000 defmatrix.py:164(__array_finalize__)
    16800    0.037    0.000    0.038    0.000 {built-in method builtins.sorted}
     9800    0.034    0.000    0.243    0.000 _basics.py:166(generate_command)
      800    0.033    0.000    1.607    0.002 <ipython-input-7-1a377ac0e731>:1(h2_bk_circuit)
     7200    0.030    0.000    0.320    0.000 _gates.py:55(matrix)
     2200    0.026    0.000    0.453    0.000 _metagates.py:190(__or__)
    14600    0.024    0.000    0.059    0.000 _command.py:214(control_qubits)
      800    0.024    0.000    0.030    0.000 _simulator.py:55(__init__)
    14600    0.022    0.000    0.026    0.000 _command.py:173(_order_qubits)
    14600    0.021    0.000    0.026    0.000 _command.py:263(engine)
    14600    0.017    0.000    0.100    0.000 _command.py:109(<listcomp>)
    29200    0.017    0.000    0.772    0.000 _command.py:109(<genexpr>)
     4800    0.016    0.000    0.121    0.000 _gates.py:211(matrix)
    14600    0.014    0.000    0.040    0.000 _command.py:123(qubits)
     9000    0.013    0.000    0.258    0.000 _gates.py:68(matrix)
      800    0.013    0.000    3.056    0.004 <ipython-input-8-990688fbed52>:2(run_h2_bk_circuit)
    37600    0.013    0.000    0.021    0.000 _basics.py:202(__eq__)
     8200    0.012    0.000    1.341    0.000 _basics.py:184(__or__)
     9800    0.012    0.000    1.168    0.000 _command.py:47(apply_command)
     1600    0.012    0.000    0.068    0.000 _basics.py:134(deallocate_qubit)
    14600    0.011    0.000    1.354    0.000 _main.py:268(send)
      800    0.010    0.000    0.015    0.000 _main.py:57(__init__)
    21600    0.010    0.000    0.010    0.000 _qubit.py:44(__init__)

And here is the corresponding snippet from the slow computation:

         1048203 function calls (1048201 primitive calls) in 44.361 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    13000   41.300    0.003   43.490    0.003 _simulator.py:350(_handle)
     7200    1.401    0.000    1.420    0.000 {built-in method numpy.core.multiarray.dot}
    23421    0.164    0.000    0.164    0.000 {built-in method numpy.core.multiarray.array}
    23400    0.147    0.000    0.188    0.000 {built-in method __new__ of type object at 0x9e5d60}
    23401    0.115    0.000    0.115    0.000 {built-in method _warnings.warn}
    23400    0.103    0.000    0.589    0.000 defmatrix.py:112(__new__)
   217041    0.075    0.000    0.075    0.000 {built-in method builtins.isinstance}
    13800    0.074    0.000    0.288    0.000 _command.py:86(__init__)
    13800    0.065    0.000   43.576    0.003 _simulator.py:422(receive)
    13600    0.057    0.000    0.074    0.000 _basics.py:123(make_tuple_of_qureg)
    16000    0.046    0.000    0.047    0.000 {built-in method builtins.sorted}
    37800    0.045    0.000    0.060    0.000 defmatrix.py:164(__array_finalize__)
     9800    0.041    0.000    0.303    0.000 _basics.py:166(generate_command)
      800    0.040    0.000   27.540    0.034 <ipython-input-4-a0c9527638d5>:4(h2_bk_circuit)
     7200    0.036    0.000    1.648    0.000 _gates.py:55(matrix)
     2200    0.032    0.000    7.462    0.003 _metagates.py:190(__or__)
    13800    0.027    0.000    0.071    0.000 _command.py:214(control_qubits)
    13800    0.026    0.000    0.033    0.000 _command.py:173(_order_qubits)
    13800    0.025    0.000    0.031    0.000 _command.py:263(engine)
      779    0.020    0.000    0.020    0.000 {built-in method atexit.unregister}
     4800    0.019    0.000    0.145    0.000 _gates.py:211(matrix)
    13800    0.019    0.000    0.027    0.000 _command.py:109(<listcomp>)
     1600    0.018    0.000    3.011    0.002 _basics.py:85(allocate_qubit)
      800    0.016    0.000    0.024    0.000 _simulator.py:55(__init__)
    27600    0.016    0.000    0.056    0.000 _command.py:109(<genexpr>)
     9000    0.016    0.000    0.244    0.000 _gates.py:68(matrix)
     9800    0.016    0.000   34.245    0.003 _command.py:47(apply_command)
    13800    0.015    0.000    0.049    0.000 _command.py:123(qubits)
    36800    0.015    0.000    0.028    0.000 _basics.py:202(__eq__)
     8200    0.015    0.000   27.376    0.003 _basics.py:184(__or__)
      800    0.015    0.000   37.818    0.047 <ipython-input-5-0573d3543f43>:5(run_h2_bk_circuit)
    13800    0.014    0.000   43.620    0.003 _main.py:268(send)
     1600    0.013    0.000    6.508    0.004 _basics.py:134(deallocate_qubit)
    52633    0.012    0.000    0.012    0.000 {built-in method builtins.len}
    20000    0.012    0.000    0.012    0.000 _qubit.py:44(__init__)
      800    0.011    0.000    0.016    0.000 _main.py:57(__init__)
     2400    0.011    0.000    0.023    0.000 _basics.py:243(__init__)
     9800    0.010    0.000    0.017    0.000 {built-in method builtins.all}
     2400    0.010    0.000    0.010    0.000 {built-in method builtins.round}
     8200    0.010    0.000    0.024    0.000 defmatrix.py:261(tolist)
    17000    0.008    0.000    0.008    0.000 _basics.py:65(__init__)
     2400    0.008    0.000    0.073    0.000 _gates.py:231(matrix)
    10600    0.008    0.000   34.235    0.003 _main.py:258(receive)
     9800    0.008    0.000    0.008    0.000 _basics.py:179(<listcomp>)
     1600    0.007    0.000    6.534    0.004 _qubit.py:121(__del__)
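
To compare the two cases outside of the %prun magic, the same kind of table can be produced with the standard cProfile and pstats modules. The sketch below is only an illustration: `workload` is a hypothetical stand-in for the real run_h2_bk_circuit function, and `profile_by_tottime` is a helper name made up for this example.

```python
import cProfile
import io
import pstats


def workload():
    # Hypothetical stand-in for the real simulation function.
    return sum(i * i for i in range(10000))


def profile_by_tottime(func, limit=10):
    """Profile func() and return a report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" is the same ordering as "Ordered by: internal time" above.
    stats.sort_stats("tottime").print_stats(limit)
    return buffer.getvalue()


report = profile_by_tottime(workload)
print(report)
```

Saving the report for a fast and a slow run of the same cell makes it easier to diff the entries, such as _simulator.py:350(_handle), whose tottime differs most between the two logs.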

@thomashaener
Contributor

Are you able to reproduce this?
Did you export OMP_NUM_THREADS=1 for the fast version? With only a single qubit, you shouldn't be using multiple threads. Also, gate fusion should be turned off for a single qubit: with it on, the simulator performs a series of matrix-matrix multiplications instead of just matrix-vector multiplications.
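
A minimal sketch of both suggestions from inside a notebook, assuming the environment variable is set in a fresh kernel before ProjectQ is first imported (OpenMP runtimes typically read it when they initialize). `make_engine` is a hypothetical helper name; MainEngine, Simulator, and gate_fusion are taken from the snippet in the original post.

```python
import os

# Assumption: this runs before the first `import projectq` in the kernel,
# so the compiled simulator sees the setting when it loads.
os.environ["OMP_NUM_THREADS"] = "1"


def make_engine():
    """Build an engine with gate fusion disabled (hypothetical helper)."""
    # Imported lazily so the environment variable above is already set.
    from projectq import MainEngine
    from projectq.backends import Simulator

    return MainEngine(backend=Simulator(gate_fusion=False), engine_list=[])
```

Calling `eng = make_engine()` in place of the MainEngine construction in the loop would then make the fast and slow notebooks start from the same thread and fusion settings, which makes the comparison easier to reproduce.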
