How to run a random circuit in a batched way? #485

zipeilee · 2023-11-08T04:03:25Z

I want to run a random unitary circuit over many instances (say 1000) and return the average value, like this:

reg = zero_state(1)
mean([expect(Z, reg |> dispatch!(Rx,:random)) for _ in 1:1000])

but I want to run it in a batched way. I know Yao has BatchedArrayReg, but I don't know how to apply a different random circuit to each instance of a batched register in a batched way. I tried

reg = zero_state(1,nbatch=1000)
expect(Z, reg|>dispatch!(Rx,:random))

but it does not seem to work: it just picks one random instance and applies it to all 1000 batch entries. What is the correct way?

GiggleLiu · 2023-11-08T04:47:23Z

Unfortunately, there is no easy way to do that. You need to copy reg, because |> changes the state in place.

julia> sum([expect(Z, copy(reg) |> dispatch!(Rx(0),rand()*2π)) for _ in 1:1000])/1000
0.027340815055604813 + 0.0im
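As a quick sanity check on the number above (an added note, assuming the angle θ is drawn uniformly from [0, 2π), as rand()*2π does): Rx(θ) applied to |0⟩ gives ⟨Z⟩ = cos θ, so the sample mean should sit near zero, within roughly one standard error of the mean.

```latex
\langle Z \rangle_\theta = \cos\theta, \qquad
\mathbb{E}[\cos\theta] = \frac{1}{2\pi}\int_0^{2\pi}\cos\theta\,d\theta = 0, \qquad
\operatorname{Var}[\cos\theta] = \frac{1}{2}, \qquad
\sigma_{\text{mean}} = \sqrt{\frac{1/2}{1000}} \approx 0.022
```

The reported value of about 0.027 is consistent with this estimate.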

zipeilee · 2023-11-08T05:40:18Z

In fact, I hope to use the GPU's parallelism for batch processing. Looping over copies of the register like this does not seem to make good use of it.

GiggleLiu · 2023-11-08T11:34:10Z

I see. In your case, I would suggest writing a new kernel, since this feature is not supported by Yao yet.

1. Define a new gate type with batched parameters.
2. Dispatch the gate to the correct instruct! function. The current single-parameter rotation gate calls into this implementation:
   https://github.com/QuantumBFS/CuYao.jl/blob/05f365f8f8e49fa2787df50a6e2226f508c94d80/src/instructs.jl#L19
3. Implement a new CUDA kernel (see the hints below). It should not be too difficult if you know CUDA programming.

Hints for rewriting this instruct! method

instruct!(::Val{2}, state::DenseCuVecOrMat, U0::AbstractMatrix, locs::NTuple{M, Int}, clocs::NTuple{C, Int}, cvals::NTuple{C, Int})

Val{2} means the method is for qubits rather than qudits.
state is a vector or matrix that serves as the register storage.
U0 is the gate matrix. In your case, you need to pass a rank-3 tensor instead, in which each batch entry stores a 2x2 matrix. Each CUDA thread should use the matrix that belongs to its batch entry.
locs holds the locations that the gate acts on. For a single-qubit gate, it should contain only one element.
clocs and cvals should be empty tuples in the absence of control bits.
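The hints above can be sketched on the CPU before any CUDA code is written. The following is a minimal reference in plain Julia, not Yao's or CuYao's implementation: the names batched_gate! and rx_matrix are hypothetical, the state is assumed to be a 2^n × nbatch matrix with one column per batch entry, and qubit 1 is assumed to be the least significant bit. In a GPU kernel, each (batch, amplitude pair) combination from the two loops would map to one thread.

```julia
# CPU reference for a batched single-qubit gate (plain Julia, no Yao or CUDA calls).
# `batched_gate!` and `rx_matrix` are hypothetical names used only in this sketch.

# 2x2 matrix of Rx(θ) = exp(-iθX/2)
function rx_matrix(θ)
    c, s = cos(θ / 2), sin(θ / 2)
    return ComplexF64[c -im*s; -im*s c]
end

# state: 2^n × nbatch matrix, one column per batch entry (assumed layout)
# U:     2 × 2 × nbatch tensor; U[:, :, b] acts on column b
# loc:   1-based qubit index; qubit 1 is assumed to be the least significant bit
function batched_gate!(state::AbstractMatrix, U::AbstractArray{<:Number,3}, loc::Int)
    dim, nbatch = size(state)
    @assert size(U) == (2, 2, nbatch)
    step = 1 << (loc - 1)
    @assert dim >= 2 * step
    for b in 1:nbatch
        for i in 0:dim-1
            (i & step) == 0 || continue      # visit each amplitude pair once
            i0, i1 = i + 1, i + step + 1     # 1-based indices of the pair
            a0, a1 = state[i0, b], state[i1, b]
            state[i0, b] = U[1, 1, b] * a0 + U[1, 2, b] * a1
            state[i1, b] = U[2, 1, b] * a0 + U[2, 2, b] * a1
        end
    end
    return state
end

# Usage: 1000 single-qubit registers in |0⟩, each rotated by its own random angle.
nbatch = 1000
state = zeros(ComplexF64, 2, nbatch)
state[1, :] .= 1
θ = 2π .* rand(nbatch)
U = Array{ComplexF64}(undef, 2, 2, nbatch)
for b in 1:nbatch
    U[:, :, b] = rx_matrix(θ[b])
end
batched_gate!(state, U, 1)

# ⟨Z⟩ per batch entry is |amplitude of 0|² minus |amplitude of 1|²
expZ = abs2.(state[1, :]) .- abs2.(state[2, :])
println(sum(expZ) / nbatch)
```

Comparing a GPU kernel's output against a reference like this on small inputs is one way to validate the indexing before moving to many qubits.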

Please feel free to ask if you encounter any issue.

zipeilee · 2023-11-09T10:48:44Z

Thanks for your patience and guidance! Actually, I need to compute a chain block made of layers of such single-qubit gates and layers of two-qubit gates acting on many qubits. This is good advice, and I will try it.
