How to run a random circuit in a batched way? #485

zipeilee opened this issue Nov 8, 2023 · 4 comments

zipeilee commented Nov 8, 2023

I want to run a random unitary circuit over many instances (e.g. 1000) and return the average value, like this:

reg = zero_state(1)
mean([expect(Z, reg |> dispatch!(Rx,:random)) for _ in 1:1000])

but I want to run it in a batched way. I know 幺 (Yao) has a BatchedArrayReg, but I don't know how to apply a different random circuit to each instance of a batched register. I tried

reg = zero_state(1,nbatch=1000)
expect(Z, reg|>dispatch!(Rx,:random))

but it does not seem to work: it just picks one random instance and applies it to all 1000 batches. What is the correct way?

@GiggleLiu
Member

Unfortunately, there is no easy way to do that. You need to copy reg, because |> changes the state in place.

julia> sum([expect(Z, copy(reg) |> dispatch!(Rx(0),rand()*2π)) for _ in 1:1000])/1000
0.027340815055604813 + 0.0im

Author

zipeilee commented Nov 8, 2023

In fact, I was hoping to use the GPU's parallelism for batch processing, but this approach does not seem to make good use of it.

Member

GiggleLiu commented Nov 8, 2023

I see. In your case, I would suggest writing a new kernel, since this feature is not supported by Yao yet.

  1. Define a new gate type with batched parameters.
  2. Dispatch the gate to the correct instruct! function. The current single-parameter rotation gate calls into this implementation:
    https://github.com/QuantumBFS/CuYao.jl/blob/05f365f8f8e49fa2787df50a6e2226f508c94d80/src/instructs.jl#L19
    You need to implement a new CUDA kernel (see below). It should not be too difficult if you know CUDA programming.

Hints for rewriting this instruct! method

instruct!(::Val{2}, state::DenseCuVecOrMat, U0::AbstractMatrix, locs::NTuple{M, Int}, clocs::NTuple{C, Int}, cvals::NTuple{C, Int})
  1. Val{2} means the method is for qubits rather than qudits.
  2. state is the vector or matrix that stores the register.
  3. U0 is the gate matrix. In your case, you need to pass a rank-3 tensor in which each batch slice stores a 2x2 matrix. Each CUDA thread should use the matrix that belongs to its batch.
  4. locs holds the locations the gate acts on. For a single-qubit gate, it should contain only one element.
  5. clocs and cvals should be empty tuples in the absence of control bits.
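
To make the rank-3 tensor idea concrete, here is a minimal CPU sketch in plain Julia. It is not Yao or CuYao code: `rx_matrix`, `batched_single_qubit!`, and `expect_z` are hypothetical helper names, and the state layout (a `2^n × nbatch` matrix, one column per batch, qubit 1 as the least significant bit) is an assumption made for illustration. In a CUDA kernel, the two loops would be replaced by one thread per (amplitude pair, batch) combination.

```julia
using Statistics

# 2x2 matrix of an X rotation by angle θ (standard definition).
rx_matrix(θ) = ComplexF64[cos(θ / 2) -im*sin(θ / 2); -im*sin(θ / 2) cos(θ / 2)]

# Apply U[:, :, b] to qubit `loc` of batch column b, in place.
# Assumed layout: state is 2^n × nbatch, qubit 1 is the lowest bit.
function batched_single_qubit!(state::AbstractMatrix, U::AbstractArray{<:Number,3}, loc::Int)
    nstates, nbatch = size(state)
    @assert size(U) == (2, 2, nbatch)
    step = 1 << (loc - 1)
    for b in 1:nbatch, i in 0:nstates-1
        if (i & step) == 0                 # visit each amplitude pair once
            i0, i1 = i + 1, i + step + 1
            a0, a1 = state[i0, b], state[i1, b]
            state[i0, b] = U[1, 1, b] * a0 + U[1, 2, b] * a1
            state[i1, b] = U[2, 1, b] * a0 + U[2, 2, b] * a1
        end
    end
    return state
end

# Expectation value of Z on qubit `loc`, one value per batch.
function expect_z(state::AbstractMatrix, loc::Int)
    nstates, nbatch = size(state)
    step = 1 << (loc - 1)
    out = zeros(Float64, nbatch)
    for b in 1:nbatch, i in 0:nstates-1
        sign = (i & step) == 0 ? 1.0 : -1.0
        out[b] += sign * abs2(state[i+1, b])
    end
    return out
end

nbatch = 1000
θs = 2π .* rand(nbatch)                    # one random angle per batch

# Rank-3 tensor: slice b holds the gate matrix for batch b.
U = Array{ComplexF64}(undef, 2, 2, nbatch)
for b in 1:nbatch
    U[:, :, b] = rx_matrix(θs[b])
end

state = zeros(ComplexF64, 2, nbatch)
state[1, :] .= 1                           # every batch starts in |0>

batched_single_qubit!(state, U, 1)
zs = expect_z(state, 1)
println(mean(zs))                          # average over the random instances
```

Each batch entry should satisfy ⟨Z⟩ = cos(θ) for its own angle, which gives a quick check that every column really received a different gate.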

Please feel free to ask if you encounter any issue.

@zipeilee
Copy link
Author

zipeilee commented Nov 9, 2023

Thanks for your patience and guidance! Actually, I need to compute a chain block made of layers of such single-qubit gates and layers of two-qubit gates acting on many qubits. That is good advice; I will try it.
