Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add coo format #2314

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Add coo format #2314

wants to merge 1 commit into from

Conversation

Matt-Ord
Copy link
Contributor

@Matt-Ord Matt-Ord commented Feb 2, 2024

Checklist
Thank you for contributing to QuTiP! Please make sure you have finished the following tasks before opening the PR.

  • Please read Contributing to QuTiP Development
  • Contributions to qutip should follow the pep8 style.
    You can use pycodestyle to check your code automatically
  • Please add tests to cover your changes if applicable.
  • If the behavior of the code has changed or new feature has been added, please also update the documentation in the doc folder, and the notebook. Feel free to ask if you are not sure.
  • Include the changelog in a file named: doc/changes/<PR number>.<type> 'type' can be one of the following: feature, bugfix, doc, removal, misc, or deprecation (see here for more information).

Delete this checklist after you have completed all the tasks. If you have not finished them all, you can also open a Draft Pull Request to let the others know this on-going work and keep this checklist in the PR description.

Description
Add the sparse COO format.

Related issues or PRs
Please mention the related issues or PRs here. If the PR fixes an issue, use the keyword fix/fixes/fixed followed by the issue id, e.g. fix #1184

@Matt-Ord Matt-Ord force-pushed the Add-COO-format branch 2 times, most recently from 0b0f39f to 5f39d1f Compare February 2, 2024 15:16
@Matt-Ord Matt-Ord marked this pull request as ready for review February 2, 2024 15:16
@Matt-Ord Matt-Ord force-pushed the Add-COO-format branch 4 times, most recently from 97b9e16 to a2b9b8d Compare February 4, 2024 11:00
@Matt-Ord
Copy link
Contributor Author

Matt-Ord commented Feb 4, 2024

@Ericgig would you also be able to approve workflows for this - it should also be ready to review

Copy link
Member

@Ericgig Ericgig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started looking at it.

As few general comments:

  • idxint vs size_t. idxint is used for the shape of the matrix, per default it is 32bit to play nice with scipy. This mean that the max matrix size is (232, 232), but then for dense array, the data pointer can go up to 2**64. Thus it is not safe to change the types.
    For COO, you do not check if nnz overflow idxint which could eventually cause errors.

  • For many operations, you allow the creation of repeated elements. These repetition could easily be accumulated and without any clean up, I don't see how COO would be better than CSR in real use cases.

  • Focus on the COO layer only. Please revert changes in explicit_rk, sode, ssystem, qobjevo... We made it such as adding a new data layer could be done in a separate project without having to touch the existing code base. It's not perfect, no auto-differentiation for brmesolve with qutip-jax as is, but you should not need to update these files in this PR.

Lastly, please add commit to this PR instead of force-pushing a single commit. This allows me to focus on the changes only.

I will run the tests when I will have at least skimmed through everything.

Comment on lines +167 to +176
for ptr in range(left_nnz):
out.data[ptr] = left.data[ptr]
out.col_index[ptr] = left.col_index[ptr]
out.row_index[ptr] = left.row_index[ptr]

for ptr in range(right_nnz):
out.data[ptr + left_nnz] = scale * right.data[ptr]
out.col_index[ptr + left_nnz] = right.col_index[ptr]
out.row_index[ptr + left_nnz] = right.row_index[ptr]
out._nnz = nnz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a clean-up step: if a + (-a) is done, the output should have 0 elements, not 2*nnz(a).

Also it would be better to keep the entry sorted. If the index were sorted, this collision checking would be a lot smoother.

Comment on lines +120 to +123
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.profile(False)
@cython.linetrace(False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are defined at the file level and not needed here.

Comment on lines +437 to +442
cpdef Dense iadd_dense_data(Dense left, Data right, double complex scale=1):
cdef Data out = add(left, right)
cdef idxint size = left.shape[0]* left.shape[1]
cdef Dense out_dense = _to(Dense, out)
memcpy(left.data, out_dense.data, size*sizeof(double complex))
return left
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scale is not applied.
What if out is fortran ordered and left is C ordered?

<double complex *>
PyDataMem_NEW_ZEROED(out.shape[0] * out.shape[1], sizeof(double complex))
)
if not out.data: raise MemoryError()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added error message for these in #2304.

Comment on lines +149 to 150
cdef idxint row=0, ptr, col
cdef size_t n = <size_t> sqrt(state.shape[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change the row type but not n?

Comment on lines +237 to +245
for ptr_l in range(coo.nnz(left)):
for ptr_r in range(coo.nnz(right)):
if left.col_index[ptr_l] == right.row_index[ptr_r]:
if out._nnz >= out.size:
out = coo.expand(out, 2 * out.size)
out.data[out._nnz] = scale * left.data[ptr_l] * right.data[ptr_r]
out.row_index[out._nnz] = left.row_index[ptr_l]
out.col_index[out._nnz] = right.col_index[ptr_r]
out._nnz += 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't dislike the use of expand here, you can easily compute the output's nnz since you allow duplicate entries.

@@ -40,9 +40,9 @@ cdef class Euler:

cdef Data a = system.drift(t, state)
b = system.diffusion(t, state)
cdef Data new_state = _data.add(state, a, dt)
cdef Dense new_state = _data.to(Dense, _data.add(state, a, dt))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solver should work with any data type, not just dense.

@@ -198,7 +300,7 @@ cpdef Dense add_dense(Dense left, Dense right, double complex scale=1):
return out


cpdef Dense iadd_dense(Dense left, Dense right, double complex scale=1):
cpdef Dense iadd_dense_dense(Dense left, Dense right, double complex scale=1):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When only one data type is used, the function name is {operation}_{dtype}.

memcpy(left.data, out_dense.data, size*sizeof(double complex))
return left

cpdef Dense iadd_dense(Dense left, Data right, double complex scale=1):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For function with multiple type: {operation}_{first_dtype}_{second_dtype}_..._{out_dtype}

Suggested change
cpdef Dense iadd_dense(Dense left, Data right, double complex scale=1):
cpdef Dense iadd_dense_data_dense(Dense left, Data right, double complex scale=1):

(Dia, Dia, Dia, add_dia),
], _defer=True)

from .convert import to as _to

cpdef Dense iadd_dense_data(Dense left, Data right, double complex scale=1):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be inline in the function below.

Copy link
Member

@Ericgig Ericgig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank @Matt-Ord,
Finished reading through it.

As I said, creating a full data layer is lot of work. This is a good start, but feel a little bit rushed.

While we can add specialisation one at a time, to be in qutip/qutip, it need to be mostly complete. It need to at least have everything needed to run mesolve.
Please add kron expect_super and project.

Many of the functions you wrote are not added. (add_coo, mul_coo, ...).

@@ -210,7 +210,7 @@ cpdef bint isherm_dense(Dense matrix, double tol=-1):
if matrix.shape[0] != matrix.shape[1]:
return False
tol = tol if tol >= 0 else settings.core["atol"]
cdef size_t row, col, size=matrix.shape[0]
cdef idxint row, col, size=matrix.shape[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idxint is not safe here.

@@ -147,7 +147,7 @@ cpdef bint isherm_csr(CSR matrix, double tol=-1):

cpdef bint isherm_dia(Dia matrix, double tol=-1) nogil:
cdef double complex val, valT
cdef size_t diag, other_diag, col, start, end, other_start
cdef idxint diag, other_diag, col, start, end, other_start
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idxint is not safe here.

@@ -333,9 +331,9 @@ isherm.__doc__ =\
`qutip.settings.atol` is used instead.
"""
isherm.add_specialisations([
(CSR, isherm_csr),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to create the new functions here.

Comment on lines +141 to +143
pytest.param(factory(0.001, True), id="sparse"),
pytest.param(factory(0.8, True), id="filled,sorted"),
pytest.param(factory(0.8, False), id="filled,unsorted"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no sorted/unsorted variant supported yet...

Comment on lines +627 to +636
class TestTranspose(UnaryOpMixin):
def op_numpy(self, matrix):
return np.transpose(matrix)

specialisations = [
pytest.param(data.transpose_coo, COO, COO),
pytest.param(data.transpose_csr, CSR, CSR),
pytest.param(data.transpose_dense, Dense, Dense),
pytest.param(data.transpose_dia, Dia, Dia),
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestTranspose already exist.

@@ -121,12 +121,11 @@ def test_rand_unitary(dimensions, distribution, density, dtype):
density=density, dtype=dtype
)
I = qeye(dimensions)
assert random_qobj * random_qobj.dag() == I
np.testing.assert_array_almost_equal((random_qobj * random_qobj.dag()).full(), I.full())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated to this PR...

)
one_element.add_specialisations(
[
(CSR, one_element_csr),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to add COO specialisation here.


# Very little should be exported on star-import, because most stuff will have
# naming collisions with other type modules.
__all__ = ['CSR']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
__all__ = ['CSR']
__all__ = ['COO']


cdef object _coo_matrix(data, row, col, shape):
"""
Factory method of scipy csr_matrix: we skip all the index type-checking
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Factory method of scipy csr_matrix: we skip all the index type-checking
Factory method of scipy coo_matrix: we skip all the index type-checking

Comment on lines +271 to +273
cpdef COO sorted(COO matrix):
cdef COO out = empty_like(matrix)
return out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sorting...

@Matt-Ord
Copy link
Contributor Author

Matt-Ord commented Feb 12, 2024

Thanks for the review, and sorry for the slow response - I still intend to get this in a more complete state but it will take some time. I am happy to add kron expect_super and project if that is all that is required but I don't want to get into the trap of doing everything in one massive pr. I don't really see the issue with doing any other specializations in a follow up pr (I could even split up the current specializations into separate prs if that makes it easier to get it merged incrementally...)

@Ericgig
Copy link
Member

Ericgig commented Feb 13, 2024

It's fine to not split the PR in too many chunks.
Yes it's easier to review, but I don't want to have it half implemented/merged at release, which is coming soon.
You could just add kron here. The other can be implemented using matmul and can wait.

It's fine to take some time. Just ping me when you want a re-review.

@Matt-Ord
Copy link
Contributor Author

@Ericgig for size_t vs idxint doesn't this mean nnz() should return a size_t?

@Ericgig
Copy link
Member

Ericgig commented Feb 15, 2024

Either nnz return size_t or raise an error when allocating matrix with too many elements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More detailed information for GSoC 2020 project
2 participants