Add coo format #2314
base: master
Conversation
Force-pushed from 0b0f39f to 5f39d1f.
Force-pushed from 97b9e16 to a2b9b8d.
@Ericgig would you also be able to approve workflows for this? It should also be ready to review.
Force-pushed from a2b9b8d to 6dfb5a6.
I started looking at it.
A few general comments:

- idxint vs size_t: idxint is used for the shape of the matrix; by default it is 32-bit to play nice with scipy. This means that the max matrix size is (2**32, 2**32), but for a dense array the data pointer can go up to 2**64. Thus it is not safe to change the types. For COO, you do not check whether nnz overflows idxint, which could eventually cause errors.
- For many operations, you allow the creation of repeated elements. These repetitions could easily accumulate, and without any clean-up I don't see how COO would be better than CSR in real use cases.
- Focus on the COO layer only. Please revert the changes in explicit_rk, sode, ssystem, qobjevo... We made it so that adding a new data layer can be done in a separate project without having to touch the existing code base. It's not perfect (no auto-differentiation for brmesolve with qutip-jax as is), but you should not need to update these files in this PR.

Lastly, please add commits to this PR instead of force-pushing a single commit. This allows me to focus on the changes only.
I will run the tests once I have at least skimmed through everything.
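A minimal numpy sketch of the overflow hazard mentioned above (assuming idxint is a signed 32-bit integer; the values here are illustrative, not from the PR): a matrix whose shape fits comfortably in 32 bits can still have an element count that wraps around.

```python
import numpy as np

# A square matrix with 46341 rows has a perfectly valid 32-bit shape,
# but its element count 46341**2 exceeds 2**31 - 1, so a 32-bit signed
# counter silently wraps to a negative value.
n = np.array([46341], dtype=np.int32)
total_32 = (n * n)[0]        # what an idxint-typed count would hold
total_64 = int(n[0]) ** 2    # the true count, which needs 64 bits

print(total_32 < 0)   # True: the 32-bit count overflowed
print(total_64)       # 2147488281
```

The same wrap-around applies to an unchecked nnz counter when duplicate entries are allowed to accumulate.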
for ptr in range(left_nnz):
    out.data[ptr] = left.data[ptr]
    out.col_index[ptr] = left.col_index[ptr]
    out.row_index[ptr] = left.row_index[ptr]

for ptr in range(right_nnz):
    out.data[ptr + left_nnz] = scale * right.data[ptr]
    out.col_index[ptr + left_nnz] = right.col_index[ptr]
    out.row_index[ptr + left_nnz] = right.row_index[ptr]
out._nnz = nnz
There should be a clean-up step: if a + (-a) is done, the output should have 0 elements, not 2*nnz(a).
Also, it would be better to keep the entries sorted. If the indices were sorted, this collision checking would be a lot smoother.
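To illustrate the point (a scipy sketch, not the PR's Cython code): concatenating the triplets of a and -a stores 2*nnz(a) entries, and a sum-duplicates pass followed by a zero filter brings that back to 0.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Toy COO matrix with two stored entries.
a = coo_matrix(([1.0, 2.0], ([0, 1], [0, 2])), shape=(3, 3))

# Naive a + (-a) by concatenating (row, col, data) triplets,
# as the addition loop above does.
row = np.concatenate([a.row, a.row])
col = np.concatenate([a.col, a.col])
dat = np.concatenate([a.data, -a.data])
s = coo_matrix((dat, (row, col)), shape=(3, 3))
nnz_naive = s.nnz
print(nnz_naive)  # 4 stored entries, even though every value cancels

# Clean-up: collapse repeated (row, col) pairs, then drop explicit zeros.
s.sum_duplicates()
mask = s.data != 0
row, col, dat = s.row[mask], s.col[mask], s.data[mask]
print(len(dat))   # 0
```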
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.profile(False)
@cython.linetrace(False)
These are defined at the file level and not needed here.
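For reference, the file-level equivalent is a directive comment at the top of the .pyx module (a sketch; which directives the project actually sets globally should be checked in its own sources or build configuration):

```cython
# cython: boundscheck=False
# cython: wraparound=False
# cython: profile=False
# cython: linetrace=False
```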
cpdef Dense iadd_dense_data(Dense left, Data right, double complex scale=1):
    cdef Data out = add(left, right)
    cdef idxint size = left.shape[0] * left.shape[1]
    cdef Dense out_dense = _to(Dense, out)
    memcpy(left.data, out_dense.data, size * sizeof(double complex))
    return left
The scale is not applied.
Also, what happens if out is Fortran-ordered and left is C-ordered?
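A small numpy sketch of the ordering hazard (hypothetical arrays, not the PR's buffers): a raw byte copy, which is what memcpy does, ignores the logical layout, so copying a Fortran-ordered result into a C-ordered buffer scrambles the elements.

```python
import numpy as np

a = np.arange(6, dtype=np.complex128).reshape(2, 3)  # C-ordered "left"
b = np.asfortranarray(a)                             # same values, Fortran-ordered "out"

# memcpy equivalent: take b's bytes in their in-memory order (order='A'),
# then reinterpret them with C-ordered strides.
raw = np.frombuffer(b.tobytes(order='A'), dtype=np.complex128).reshape(2, 3)

print(np.array_equal(b, a))    # True: logically the same matrix
print(np.array_equal(raw, a))  # False: the byte-for-byte copy scrambled it
```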
    <double complex *> PyDataMem_NEW_ZEROED(out.shape[0] * out.shape[1], sizeof(double complex))
)
if not out.data:
    raise MemoryError()
We added error messages for these in #2304.
cdef idxint row=0, ptr, col
cdef size_t n = <size_t> sqrt(state.shape[0])
Why did you change the type of row but not n?
for ptr_l in range(coo.nnz(left)):
    for ptr_r in range(coo.nnz(right)):
        if left.col_index[ptr_l] == right.row_index[ptr_r]:
            if out._nnz >= out.size:
                out = coo.expand(out, 2 * out.size)
            out.data[out._nnz] = scale * left.data[ptr_l] * right.data[ptr_r]
            out.row_index[out._nnz] = left.row_index[ptr_l]
            out.col_index[out._nnz] = right.col_index[ptr_r]
            out._nnz += 1
While I don't dislike the use of expand here, you can easily compute the output's nnz since you allow duplicate entries.
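A numpy sketch of that count (the index buffers here are hypothetical, not from the PR): every (ptr_l, ptr_r) pair with left.col_index == right.row_index emits exactly one (possibly duplicated) output entry, so the output nnz is a dot product of per-index histograms, known before the loop runs.

```python
import numpy as np

left_col  = np.array([0, 2, 2, 1])   # col_index of each stored entry of `left`
right_row = np.array([2, 2, 0, 1])   # row_index of each stored entry of `right`
K = 3                                # inner dimension of the product

# Matches per inner index k: (#left entries with col==k) * (#right entries with row==k).
out_nnz = int(np.bincount(left_col, minlength=K) @ np.bincount(right_row, minlength=K))
print(out_nnz)  # 6
```

With this, the output buffers can be allocated once at the exact size instead of grown with expand.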
@@ -40,9 +40,9 @@ cdef class Euler:
         cdef Data a = system.drift(t, state)
         b = system.diffusion(t, state)
-        cdef Data new_state = _data.add(state, a, dt)
+        cdef Dense new_state = _data.to(Dense, _data.add(state, a, dt))
This solver should work with any data type, not just dense.
@@ -198,7 +300,7 @@ cpdef Dense add_dense(Dense left, Dense right, double complex scale=1):
     return out

-cpdef Dense iadd_dense(Dense left, Dense right, double complex scale=1):
+cpdef Dense iadd_dense_dense(Dense left, Dense right, double complex scale=1):
When only one data type is used, the function name is {operation}_{dtype}.
    memcpy(left.data, out_dense.data, size * sizeof(double complex))
    return left

cpdef Dense iadd_dense(Dense left, Data right, double complex scale=1):
For functions with multiple types: {operation}_{first_dtype}_{second_dtype}_..._{out_dtype}

Suggested change:
-cpdef Dense iadd_dense(Dense left, Data right, double complex scale=1):
+cpdef Dense iadd_dense_data_dense(Dense left, Data right, double complex scale=1):
    (Dia, Dia, Dia, add_dia),
], _defer=True)

from .convert import to as _to

cpdef Dense iadd_dense_data(Dense left, Data right, double complex scale=1):
This can be inlined in the function below.
Thanks @Matt-Ord, I finished reading through it.
As I said, creating a full data layer is a lot of work. This is a good start, but it feels a little rushed.
While we can add specialisations one at a time, to be in qutip/qutip it needs to be mostly complete. At a minimum it needs everything required to run mesolve.
Please add kron, expect_super and project.
Many of the functions you wrote are not registered (add_coo, mul_coo, ...).
@@ -210,7 +210,7 @@ cpdef bint isherm_dense(Dense matrix, double tol=-1):
     if matrix.shape[0] != matrix.shape[1]:
         return False
     tol = tol if tol >= 0 else settings.core["atol"]
-    cdef size_t row, col, size=matrix.shape[0]
+    cdef idxint row, col, size=matrix.shape[0]
idxint is not safe here.
@@ -147,7 +147,7 @@ cpdef bint isherm_csr(CSR matrix, double tol=-1):

 cpdef bint isherm_dia(Dia matrix, double tol=-1) nogil:
     cdef double complex val, valT
-    cdef size_t diag, other_diag, col, start, end, other_start
+    cdef idxint diag, other_diag, col, start, end, other_start
idxint is not safe here.
@@ -333,9 +331,9 @@ isherm.__doc__ =\
     `qutip.settings.atol` is used instead.
     """
 isherm.add_specialisations([
     (CSR, isherm_csr),
You forgot to create the new functions here.
pytest.param(factory(0.001, True), id="sparse"),
pytest.param(factory(0.8, True), id="filled,sorted"),
pytest.param(factory(0.8, False), id="filled,unsorted"),
There is no sorted/unsorted variant supported yet...
class TestTranspose(UnaryOpMixin):
    def op_numpy(self, matrix):
        return np.transpose(matrix)

    specialisations = [
        pytest.param(data.transpose_coo, COO, COO),
        pytest.param(data.transpose_csr, CSR, CSR),
        pytest.param(data.transpose_dense, Dense, Dense),
        pytest.param(data.transpose_dia, Dia, Dia),
    ]
TestTranspose already exists.
@@ -121,12 +121,11 @@ def test_rand_unitary(dimensions, distribution, density, dtype):
         density=density, dtype=dtype
     )
     I = qeye(dimensions)
-    assert random_qobj * random_qobj.dag() == I
+    np.testing.assert_array_almost_equal((random_qobj * random_qobj.dag()).full(), I.full())
This is unrelated to this PR...
)
one_element.add_specialisations(
    [
        (CSR, one_element_csr),
Forgot to add COO specialisation here.
# Very little should be exported on star-import, because most stuff will have
# naming collisions with other type modules.
__all__ = ['CSR']
Suggested change:
-__all__ = ['CSR']
+__all__ = ['COO']
cdef object _coo_matrix(data, row, col, shape):
    """
    Factory method of scipy csr_matrix: we skip all the index type-checking
Suggested change:
-    Factory method of scipy csr_matrix: we skip all the index type-checking
+    Factory method of scipy coo_matrix: we skip all the index type-checking
cpdef COO sorted(COO matrix):
    cdef COO out = empty_like(matrix)
    return out
This is not sorting...
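A sketch of what the sort needs to do (numpy rather than Cython; the buffer names mirror the PR's fields but the values are illustrative): np.lexsort gives the permutation ordering entries by (row, col), which is then applied to all three buffers.

```python
import numpy as np

# Illustrative unsorted COO buffers.
row_index = np.array([2, 0, 1, 0])
col_index = np.array([1, 2, 0, 0])
data      = np.array([4.0, 3.0, 2.0, 1.0])

# lexsort treats the LAST key as primary: sort by row, break ties by column.
order = np.lexsort((col_index, row_index))
row_index = row_index[order]
col_index = col_index[order]
data = data[order]

print(row_index.tolist())  # [0, 0, 1, 2]
print(col_index.tolist())  # [0, 2, 0, 1]
print(data.tolist())       # [1.0, 3.0, 2.0, 4.0]
```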
Thanks for the review, and sorry for the slow response. I still intend to get this into a more complete state, but it will take some time. I am happy to add kron, expect_super and project if that is all that is required, but I don't want to fall into the trap of doing everything in one massive PR. I don't really see the issue with doing any other specialisations in a follow-up PR (I could even split the current specialisations into separate PRs if that makes it easier to merge incrementally...)
It's fine not to split the PR into too many chunks, and it's fine to take some time. Just ping me when you want a re-review.
@Ericgig regarding size_t vs idxint: doesn't this mean nnz() should return a size_t?
Either
Checklist
Thank you for contributing to QuTiP! Please make sure you have finished the following tasks before opening the PR.

- You can use pycodestyle to check your code automatically.
- Documentation updates go in the doc folder, and the notebook. Feel free to ask if you are not sure.
- Add a changelog entry under doc/changes/<PR number>.<type>; 'type' can be one of the following: feature, bugfix, doc, removal, misc, or deprecation (see here for more information).

Delete this checklist after you have completed all the tasks. If you have not finished them all, you can also open a Draft Pull Request to let the others know this on-going work and keep this checklist in the PR description.
Description
Add the sparse COO format.
Related issues or PRs
Please mention the related issues or PRs here. If the PR fixes an issue, use the keyword fix/fixes/fixed followed by the issue id, e.g. fix #1184