Overhaul of Qobj dimension handling #1476

jakelishman · 2021-01-20T23:35:22Z

jakelishman
Jan 20, 2021
Collaborator

This document is a design specification for new dimensions handling. It is only a draft right now; please feel free to offer comments and suggestions.

If you only read one section, read "Overview" inside "Proposal" to get the gist of what will happen.

Background

For an extended discussion of some of the problems in 4.x branch dimension handling, see #1320.

With the new system, we aim to solve a few main problems:

1. Qobj type inference should be instantaneous
2. Binary operation dimension compatibility tests should be instantaneous
3. Dimension/shape equality tests should be instantaneous
4. Invalid dimensions should not be representable

Currently, dimension handling is the major overhead in Qobj because these problems are not solved.

The allowed exception to point 1 is if we include a short-hand notation to represent dimension objects; we may allow a "pure-Python" representation (effectively the current dimension specification) to be parsed into new-form dimension objects for user convenience.

Certain objects, like the excitation-number-restricted spaces (enr) may not have "compatible" dimensions and shapes. This may need further discussion elsewhere.

Proposal

Overview

The principal change is to make dimension objects singleton instances of classes. All Qobj of the same dimension will have a reference to the exact same object, which has all the expensive operations already calculated.

Internally, dimensions will be represented in a very pure linear algebra manner. A dimension object is a single Dims object, which is exactly one of the subclasses:

Space representing a vector space
- Space(size: int) is a standard ket
- Space(Map(...)) is an operator-ket
Space(Map(from, to)) representing an operator in column-stacked format.
Map(from, to) representing some mapping Dims -> Dims
Field used only to represent the absence of Dims in Map and Compound. A Qobj may not have this dims; it would simply be a complex number.
Compound(dims1, dims2, ...) for tensor-product spaces.

The current Qobj.type values (with no tensor products) map like so:

ket: Space
bra: Map(Space, Field)
oper: Map(Space, Space)
operator-ket: Space(Map(Space, Space))
operator-bra: Map(Space(Map(Space, Space)), Field)
super: Map(Space(Map(Space, Space)), Space(Map(Space, Space)))

Users will not have to type out such monstrosities as the mapping for super; the current QuTiP dims syntax will be parsed into these types, but internally this form will almost completely remove parsing costs.

Some explicit mappings between the current list syntax and the new parsed syntax:

[[2], [1]] = Space(2): a qubit pure state
[[2, 10], [1, 1]] = Compound(Space(2), Space(10)): a pure state of a qubit space tensor-product a 10-element space
[[2, 2], [2, 2]] = Map(Compound(Space(2), Space(2)), Compound(Space(2), Space(2))): a square operator on a two-qubit state
[[4, 5], [2, 2]] = Map(Compound(Space(2), Space(2)), Compound(Space(4), Space(5))): (note the reversal) a non-square operator that takes a two-qubit state to a tensor-product space between a 4-element space and 5-element space
[[[2], [2]], [[2], [2]]] = Map(Space(Map(Space(2), Space(2))), Space(Map(Space(2), Space(2)))): a superoperator acting on square operators on qubit spaces.
[[1], [5]] = Map(Space(5), Field) is a bra for a 5-element space.
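The mappings above can be sketched as a small recursive parser. This is only an illustration under stated assumptions: the class layouts, the `from_` field name, the helper names and the choice to treat every `1` as `Field` are mine, `Compound` takes a tuple here instead of variadic arguments, and plain frozen dataclasses with structural equality stand in for the proposed singletons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Absence of a space (a 1 in the list syntax)."""


@dataclass(frozen=True)
class Space:
    inner: object  # an int size, or a Map for an operator-ket


@dataclass(frozen=True)
class Compound:
    parts: tuple


@dataclass(frozen=True)
class Map:
    from_: object  # `from` is a Python keyword, hence the underscore
    to: object


def _side(sizes):
    """Convert one half of the list syntax, e.g. [2, 10] or [[2], [2]]."""
    if isinstance(sizes[0], list):
        # A nested half describes a column-stacked operator: [to, from].
        to, from_ = sizes
        return Space(Map(_side(from_), _side(to)))
    parts = tuple(Field() if n == 1 else Space(n) for n in sizes)
    return parts[0] if len(parts) == 1 else Compound(parts)


def _is_trivial(dims):
    """True if dims is Field or a Compound made only of Field."""
    if isinstance(dims, Compound):
        return all(isinstance(p, Field) for p in dims.parts)
    return isinstance(dims, Field)


def parse(dims):
    """Hypothetical parser from the 4.x list form to the proposed objects."""
    to, from_ = (_side(half) for half in dims)
    # A trivial "from" side means the object is a plain vector space (ket).
    return to if _is_trivial(from_) else Map(from_, to)
```

With this sketch, `parse([[4, 5], [2, 2]])` gives a `Map` whose `from_` is the two-qubit compound and whose `to` is the 4-by-5 compound, matching the reversal noted in the list.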

The current Qobj.type attribute will be stored within the dimension object; unlike the list format, each object is unambiguously one single type (1D spaces are a problem in list form). Similarly, the "size" of a given dimension object is stored within it.

How this solves the problems

Type inference is removed as a problem; each Qobj type has only one unambiguous representation when expressed as dimension objects. The actual name of the type could be stored as a string attached to the objects to maintain compatibility with the 4.x branch.

Dimension compatibility test speed is solved by having dimensions represented by singleton class instances, like the Python builtin None. The reason to use a singleton class is to replace == tests with is tests; the former is structural equality and requires walking the tensor structure, whereas the latter is referential equality, and is true if and only if the two operands are the same object in memory. For example, the dimensions test of the add operation is now left.dims is right.dims, which is the same speed as comparing two integers.
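A minimal sketch of the idea, assuming a per-class dictionary as the store; the `_instances` attribute and the `can_add` helper are illustrative names, not part of any existing QuTiP API:

```python
class Space:
    """Sketch of a dimension object with one instance per size."""

    _instances = {}  # store: size -> the unique Space of that size

    def __new__(cls, size):
        try:
            return cls._instances[size]
        except KeyError:
            pass
        self = super().__new__(cls)
        self.size = size
        cls._instances[size] = self
        return self


def can_add(left_dims, right_dims):
    # Referential equality: no walk over the tensor structure is needed.
    return left_dims is right_dims


a = Space(2)
b = Space(2)
c = Space(3)
```

Here `a` and `b` are the same object, so `can_add(a, b)` reduces to a pointer comparison, while `can_add(a, c)` is false.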

Dimension/shape compatibility is solved by attaching size information into the singleton classes. As the dimension objects are singletons, the size of a dimension object is calculated only on creation of the object. All subsequent Qobj that are of the same dimensions as one that came earlier will consequently reuse the same dimensions object, which already calculated its size. This avoids (relatively) expensive calls to np.prod on Python lists.
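To illustrate the "size is calculated only on creation" point, here is a hedged sketch in which a hypothetical `_Interned` base class runs a `_setup` hook once per distinct set of arguments. The base class, the hook and the `size_computations` counter are assumptions made purely for demonstration.

```python
import math


class _Interned:
    """Hypothetical base class: one instance per (class, args) pair."""

    _store = {}

    def __new__(cls, *args):
        key = (cls, args)
        if key not in _Interned._store:
            self = super().__new__(cls)
            self._setup(*args)  # runs once per distinct key
            _Interned._store[key] = self
        return _Interned._store[key]


class Space(_Interned):
    def _setup(self, size):
        self.size = size


class Compound(_Interned):
    size_computations = 0  # counter kept for illustration only

    def _setup(self, *spaces):
        self.spaces = spaces
        self.size = math.prod(s.size for s in spaces)
        Compound.size_computations += 1


first = Compound(Space(2), Space(10))
second = Compound(Space(2), Space(10))
```

Both names refer to the same object, and the product is evaluated a single time no matter how many Qobj later ask for these dimensions.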

The current list syntax allows for invalid dimensions to be represented such as [2, 1] (should be [[2], [1]], probably). These sorts of failures cannot be represented in the new system. Similarly, [[2], [1], [1]] cannot be represented as the Map constructor will take only two arguments.

Problems this does not immediately solve

Since QuTiP uses matrices to represent linear algebra objects, we tie ourselves to working in some particular basis. For example, it is invalid to add a vector in the Pauli-Z basis to one in the Pauli-X basis by element-wise addition, but QuTiP has no way of knowing if this is what the user is doing, and will simply allow it because the dimensions will match. This is still the case if the user used Qobj.transform to get from one to the other; it is one case where we have to trust that the user is doing the right thing, rather than enforcing correctness. In the future, the system proposed here could be extended to enforce this; the dims parameter would be renamed basis, and some unique identifier would be attached to each Space object. This would allow us to safely define basis-transformation "operators"; they would have the dimensions object Map(Space(2, 'paulix'), Space(2, 'pauliz')), or something to that effect.

In #1320, I mentioned the possibility of a new 'scalar' type object. Here, this is effectively the Field subtype. There is a choice to be made whether Compound(Field, Field) should be Field (implicit contraction of 1D spaces), or whether we should keep track of "missing" spaces. The missing spaces are useful in principle in QIP settings for defining local operations on subsets of the whole system, but right now we do not have the mathematics backend to implement this completely. For now, I propose we keep track of all the missing spaces; it allows this extension in the future, with no cost right now.

Implementation details

All objects will be completely immutable, and all their construction arguments will be as well (e.g. Space will take only int, which is immutable). This means that singleton instances can be found by looking them up in a global store, similar to Python's builtin package or its import system.

The singleton nature of the dimension classes is achieved by defining __new__ on the classes, and not __init__. The former is effectively a class method that decides which object gets returned, while the latter is an instance method that only runs once an object already exists; since we want instances representing the same dimensions to be unique, __init__ is of no use to us.

In order to maintain referential equality, tensor-product operations must move into a canonical form. Calling Compound(Compound(x, y), z) must return the same object as Compound(x, y, z). Internally this parsing is easy; if one is using the new object constructors, Python evaluation order guarantees that they will flatten themselves; so long as the Compound constructor unpacks Compound objects at a depth of 1, the whole object will always be as flat as possible.
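The depth-1 unpacking can be sketched as follows. The store layout and the `parts` attribute are assumptions for illustration; the only point is that unpacking one level is enough, because any inner Compound was itself flattened when it was built.

```python
class Space:
    _instances = {}

    def __new__(cls, size):
        if size not in cls._instances:
            self = super().__new__(cls)
            self.size = size
            cls._instances[size] = self
        return cls._instances[size]


class Compound:
    _instances = {}

    def __new__(cls, *parts):
        flat = []
        for part in parts:
            if isinstance(part, Compound):
                # Inner Compounds are already flat, so one level suffices.
                flat.extend(part.parts)
            else:
                flat.append(part)
        key = tuple(flat)
        if key not in cls._instances:
            self = super().__new__(cls)
            self.parts = key
            cls._instances[key] = self
        return cls._instances[key]


x, y, z = Space(2), Space(3), Space(4)
nested = Compound(Compound(x, y), z)
flat = Compound(x, y, z)
```

Both spellings resolve to the same stored object, so referential equality survives arbitrary nesting of tensor products.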

The tensor product will be expanded by having Compound "thread" over Map. This effectively expands the mathematicians' definition of the tensor product to allow us to continue to represent "silly" objects such as

    tensor(qeye(5), basis(2).dag())

which is an odd object that contracts one element of a tensor-product space down to the field and leaves the other. This will report its Qobj.type as 'other', since it is not a standard operation, but that's ok because we no longer need Qobj.type for fast dimension parsing.

The Compound threading over Map follows these rules:

Compound(Map(x1, y1), Map(x2, y2)) is Map(Compound(x1, x2), Compound(y1, y2))
Compound(Map(x1, y1), Space(z)) is Map(Compound(x1, Field), Compound(y1, Space(z)))

In other words, the from and to fields inside maps are Compounded with their counterparts, and Space is "promoted" to Map(Field, Space). This latter object is not actually valid, but Space will behave as if it were within Compound. Related but different, Compound(Field, Field) will exist for the purposes of tensor-product 'bra' types as the to field of Map (to allow us to keep track of empty spaces), but a Qobj whose dimensions would be a Compound made entirely of Field will instead become a Python complex number.
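A sketch of the two threading rules, using frozen dataclasses with structural equality rather than singletons to keep it short. The `compound` helper, the `from_` field name and the tuple-valued `parts` are illustrative assumptions, and flattening of nested Compound objects is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Absence of a space."""


@dataclass(frozen=True)
class Space:
    size: int


@dataclass(frozen=True)
class Compound:
    parts: tuple


@dataclass(frozen=True)
class Map:
    from_: object
    to: object


def compound(*parts):
    """Tensor product of dims, threading over Map when any part is a Map."""
    if not any(isinstance(p, Map) for p in parts):
        return Compound(parts)
    # Inside the product, a bare Space behaves like Map(Field, Space).
    promoted = [p if isinstance(p, Map) else Map(Field(), p) for p in parts]
    return Map(
        Compound(tuple(m.from_ for m in promoted)),
        Compound(tuple(m.to for m in promoted)),
    )


# Dimensions of tensor(qeye(5), basis(2).dag()) under these rules.
odd = compound(Map(Space(5), Space(5)), Map(Space(2), Field()))
```

The `odd` object maps the compound of a 5-element space and a qubit to the compound of a 5-element space and `Field`, which is the "contract one element, leave the other" shape described above.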

The dimensions types should be available for advanced users (to allow them to access the full parsing speed-ups), but should not be presented as the standard choice. I propose we place the types inside a nested namespace, such as qutip.dims (logically - physically it would be qutip/core/dims.py), to allow the form from qutip.dims import * where appropriate without forcing the user to do the modern bad practice left over from our MATLAB past from qutip import *.

User impact

In principle, nothing will change for the normal QuTiP user compared to the 4.x branch. You will still be able to supply the dims argument to the Qobj constructor as lists in the exact same format, and they will be parsed in the same way. Users do not need to type out the new computer-friendly dimensions objects, but they will be available for advanced users who frequently make Qobj using the raw constructor with funny dimensions. We will publicly provide qutip.dims.parse to turn a list into the new form, so even advanced users do not need to type out all the nonsense.

Qobj factory functions that take a dims parameter should now also accept the new form. Since almost all of them just pass this directly to the dims argument in the Qobj constructor, this likely won't involve any developer effort.

Qobj construction overhead should be reduced to near-zero when passed a new dimensions object, which we will always do within the library. Compared to the 4.x branch, the overhead of Qobj will shrink from ~100µs to ~1µs in library code, even for functions where the Qobj type cannot be cleanly inferred from the input types.

Particular points worth commenting on

Are there currently valid Qobj that cannot be represented with this system?
Should we push to implement basis-safety for QuTiP 5.x?
Do you agree we should keep track of "missing" tensor product spaces?
Should we change the pretty-printed format of dims and type in Qobj.__repr__()?

jakelishman · 2021-01-21T00:23:43Z

jakelishman
Jan 21, 2021
Collaborator Author

There are maybe some tricks here to do with multiprocessing and pickle/unpickle, but since the objects are deterministic and completely immutable, I can't see anything inherently wrong with a singleton approach here. These objects are purely data; they must not have behaviour attached to them (methods), only immutable state (properties), so they're inherently thread-safe.
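One possible way to handle the pickle side, sketched under the assumption that each process keeps its own store: define `__reduce__` so that unpickling calls the constructor again and therefore goes back through `__new__`. The `roundtrip` helper is only for demonstration.

```python
import pickle


class Space:
    _instances = {}

    def __new__(cls, size):
        if size not in cls._instances:
            self = super().__new__(cls)
            self.size = size
            cls._instances[size] = self
        return cls._instances[size]

    def __reduce__(self):
        # Rebuild by calling Space(size), so unpickling looks the object up
        # in the local store instead of creating a second instance.
        return (Space, (self.size,))


def roundtrip(obj):
    """Pickle and unpickle an object within the current process."""
    return pickle.loads(pickle.dumps(obj))
```

Within one process, `roundtrip(Space(2))` returns the object already in the store. In a worker process the result would be that process's own unique instance, which is all that `is` comparisons there need.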

0 replies

Ericgig · 2021-01-21T17:50:47Z

Ericgig
Jan 21, 2021
Maintainer

Using singletons will certainly speed things up.

With subspaces to represent enr or other special cases, it should be able to represent everything. But super operators have multiple representations (choi, kraus, super, etc.), so I would suggest adding a super subclass with that information instead of using map.
Basis safety, if required, will make QuTiP usage heavier.
Let's keep it as an internal representation only: having Space(2) for a ket but Map(Space, Field) for a bra will only cause confusion, and super-operators of tensor systems will get very long.

1 reply

jakelishman Mar 29, 2021
Collaborator Author

Point 1

That's a very good point about superoperator representations. We could have a Super class, and I think it would be a very thin subclass of Map - Qobj.superrep would move into Super, but that would be the only difference between them. After all, it is just a linear map, and the parsing of them should be the same. It would certainly be nice to get superrep out of Qobj. Kraus operators would just be regular Map, I assume. In terms of internal representation with a Super class, it would just change

Map(Space(Map(Space, Space)), Space(Map(Space, Space)))

to

Super(Space(Map(Space, Space)), Space(Map(Space, Space)), rep='super')

and I definitely like having the superop rep included in it.
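As a rough sketch of how thin such a subclass could be, assuming an interned Map whose store key includes the class and any extra arguments (the `*extra` parameter and the string placeholder for the operator-ket dims are illustrative only):

```python
class Map:
    _instances = {}

    def __new__(cls, from_, to, *extra):
        # The class is part of the key, so Map and Super never collide.
        key = (cls, from_, to, *extra)
        try:
            return Map._instances[key]
        except KeyError:
            self = super().__new__(cls)
            self.from_, self.to = from_, to
            Map._instances[key] = self
            return self


class Super(Map):
    """Same parsing as Map; the only addition is the representation."""

    def __new__(cls, from_, to, rep="super"):
        self = super().__new__(cls, from_, to, rep)
        self.rep = rep
        return self


# Placeholder standing in for Space(Map(Space, Space)).
opket = "Space(Map(Space, Space))"
as_super = Super(opket, opket, rep="super")
as_choi = Super(opket, opket, rep="choi")
```

Two superoperators with different representations then fail an `is` test in the same way two mismatched dimensions would, so the superrep check comes for free.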

The user is never ever meant to write any of this themselves, so the literal length shouldn't be too much of a problem. You'd still specify dimensions using the exact same list syntax that we currently use, it's just we'd immediately parse it into this internal representation and internally operate on this, because it's much faster. Essentially what I'm describing here is an abstract syntax tree for relevant linear algebra structures. We could even have the tensor index dimensions stored within the Compound objects, to help with ptrace, permute, the future local_multiply algorithms and so on. I wouldn't want to add that immediately, though - no need to complicate things.

Point 2

Basis safety wouldn't have any performance cost here - Space(2, basis='x') and Space(2, basis='y') would referentially be unequal, so the test would be free. It's basically the same thing as checking superoperator representations. I would worry about user ergonomics for creating these though. I'd propose that all QuTiP functions maintain their current behaviour of creating everything in the number basis (sigmaz(), num() and so on all imply a particular basis). Beyond that, the ENR functions would attach some basis information onto their outputs to make them safe, and functions like Qobj.transform could take a required argument to name the new basis.
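A small sketch of why the check would be free, assuming the basis label simply becomes part of the store key. The default label "number" is my assumption, based on the remark that existing functions imply the number basis.

```python
class Space:
    _instances = {}

    def __new__(cls, size, basis="number"):
        key = (size, basis)
        if key not in cls._instances:
            self = super().__new__(cls)
            self.size, self.basis = size, basis
            cls._instances[key] = self
        return cls._instances[key]


def same_dims(left, right):
    # Still a single identity test; the basis adds no extra comparison.
    return left is right


x = Space(2, basis="x")
y = Space(2, basis="y")
default = Space(2)
```

`same_dims(x, y)` is false even though both have size 2, while `Space(2)` and `Space(2, basis="number")` are the same object.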

I'm certainly not considering this a priority, just a possible solution to the ENR problem and a couple of people had expressed interest in basis safety in the google group. We can always tack it on in a later release if it ever seems like a good idea in the future.

Point 3

Yeah, this is absolutely all intended to be internal only. We wouldn't even print out this form in Qobj.__repr__, to my mind.

You'd still type dims=[[2], [1]] to get a qubit ket and dims=[[1], [2]] for a qubit bra, so I don't think there's any confusion there. The reason there's not a special "bra" structure internally is because it's not necessary; a bra really is just a linear mapping from a particular vector space to the field, so having a special case for that makes things more complex - the matmul compatibility test with Map(Space, Field) and Map(Space, Space) is the exact same test as for two operators, which simplifies the logic.

After sleeping on it, I still generally like the singleton pattern for this, but I think completely relying on referential equality is probably a bit short-sighted. We can define, for example, Space.__eq__ as

class Space:
    def __eq__(self, other):
        return (
            self is other
            or (
                isinstance(other, Space)
                and self.size == other.size
                and self.basis == other.basis
            )
        )

so we'll almost invariably get the benefits right now, but we're rather more future-proof in the code. By analogy, it's clearly wrong to do (1, 2) is (1, 2) to compare tuples, even though CPython will typically reuse one object for identical constant tuples (so that code will generally be True). The Python tuple class is basically what inspired me, and I'm 100% certain that the Python devs are smarter than I am, so we should probably stick with them.
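A short sketch of the difference, with one addition that is my own note rather than part of the proposal: a class that defines `__eq__` has its `__hash__` set to None unless it defines one too, and dimension objects would need to stay hashable to be used as dictionary keys. The plain `__init__` constructor below is deliberately not interned, so that equal-but-distinct objects can exist.

```python
class Space:
    def __init__(self, size, basis=None):
        self.size = size
        self.basis = basis

    def __eq__(self, other):
        return self is other or (
            isinstance(other, Space)
            and self.size == other.size
            and self.basis == other.basis
        )

    # Defining __eq__ alone would set __hash__ to None, so keep the two
    # consistent: equal objects must hash equally.
    def __hash__(self):
        return hash((Space, self.size, self.basis))


# Tuples built at runtime are equal, but they are distinct objects.
t1 = tuple([1, 2])
t2 = tuple([1, 2])
tuples_equal = t1 == t2
tuples_identical = t1 is t2

s1, s2 = Space(2), Space(2)  # two separate objects with equal contents
```

With interning in place, the `self is other` branch would answer almost every comparison immediately, and the structural branch only matters if two equal objects ever do get created.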

hodgestar · 2021-09-27T14:05:07Z

hodgestar
Sep 27, 2021
Maintainer

We could introduce DualSpace(x) for Map(Space(x), Field) for clarity, although we could even just make that part of __repr__.

0 replies
