Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Observable container type for Estimator #11594

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

chriseclectic
Copy link
Member

Summary

This aims to add a concrete container class type for the items in an ObservableArray which was an oversite when adding that container.

This is important so that we can use this to abstract away the internal storage format of an observable from how it is constructed for Estimator inputs, and consumed by Estimator primitive developers.

Details and comments

Some open questions / ideas:

  • Should the abc for this class be a Mapping or Sequence to define the developer API for how it is iterated over and accessed.
  • What should the item types be?
    • We should make the internal format of the Observable sparse in terms of qubits rather than full strings
    • Do we need a BaseObservable that doesn't assume the allowed strings (ie "basis") supported by a vendor Estimator implementation, then we should have ObservableArray[BaseObservable] and EstimatorPub[BaseObservable] as templated types?
    • We should not use single-character representations for "basis" terms in strings, but a type of strings to allow custom label for basis observables supported by a vendor Estimator

@chriseclectic chriseclectic added priority: high mod: primitives Related to the Primitives module labels Jan 18, 2024
@chriseclectic chriseclectic added this to the 1.0.0 milestone Jan 18, 2024
@chriseclectic chriseclectic requested review from a team as code owners January 18, 2024 21:04
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core
  • @ajavadia
  • @ikkoham
  • @levbishop
  • @t-imamichi

The intent is that an Estimator can retrieve the basis terms in an observables array from the `terms` attribute and validate them itself, rather than hard code the allowed terms here. If we do still want to hardcode this in the classes we could add it back, but the current version of expecting subclasses to override ALLOWED_BASIS doesn't work with the current implementation of class methods which done return the subclass type from methods like reshape etc
This changes the mapping key type to be `(tuple[int], str)` instead of str, to allow for sparse representations in the future. Eg instead of `"XIIIY" you can have `((0, 5,), "XY")`. Also removes validation of allowed characters in the strings. Still need to update unit tests if we go with this.
@t-imamichi
Copy link
Member

I have a question about the following commit message.

This changes the mapping key type to be (tuple[int], str) instead of str, to allow for sparse representations in the future. Eg instead of "XIIIY" you can have ((0, 5,), "XY")`. Also removes validation of allowed characters in the strings. Still need to update unit tests if we go with this.

X seems to correspond to 0. Do the indices start from left to right of the string?
Minor comment: 5 should be 4 because there 5 qubits.

@t-imamichi
Copy link
Member

Observable currently allows inconsistent ObservableKey such as Observable({((1,), 'XX'): 1}) (1 index but two paulis). Could you add a check?

from qiskit.quantum_info import Pauli, SparsePauliOp


ObservableKey = "tuple[tuple[int, ...], str]" # pylint: disable = invalid-name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to make a dataclass of ObservableKey and add a validation method for #11594 (comment).

Copy link
Contributor

@ihincks ihincks Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the name ObservableTerm here, but at the end of the day, it's just a name. I wonder if you can comment on the feasibility of leaving this as tuple[tuple[int, ...], str] for now and changing it to a class (eg as suggested by @t-imamichi) in the future? In terms of breaking changes and possible subclassing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im mostly worried about this container being slow and memory intensive with all the nested classes. I did have a prototype using a class that function like a tuple (had getitem), but im a bit skeptical of validation other than via type since it might be slow. We can't validate the strings, so validation would be that qubits is a tuple with no duplicates and (if we want to enforce this) that the string is same lenght as qubits.

In the current version the design doesn't actually require the string to represent a bunch of operators for each qubit, its up to an estimator to decide what it does with it, but we are using it to represent a Pauli label

(side note: our reference estimator is going to be rather inefficient since it unpacks SparsePauliOps into this potentially much more memory heavy format, and then has to convert back to SparsePauliOp for its simulation)

Copy link
Contributor

@ihincks ihincks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Chris, this looks like a strict improvement because it allows sparsity. I think we should continue to think about how an Estimator is supposed to express its abilities (eg can it estimate only Paulis, or more stuff?) and whether it would make sense to class-ify ObservableKey. One or both of these questions could be out-of-scope for this PR.

qiskit/primitives/containers/observable.py Outdated Show resolved Hide resolved
qiskit/primitives/containers/observable.py Outdated Show resolved Hide resolved
from qiskit.quantum_info import Pauli, SparsePauliOp


ObservableKey = "tuple[tuple[int, ...], str]" # pylint: disable = invalid-name
Copy link
Contributor

@ihincks ihincks Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the name ObservableTerm here, but at the end of the day, it's just a name. I wonder if you can comment on the feasibility of leaving this as tuple[tuple[int, ...], str] for now and changing it to a class (eg as suggested by @t-imamichi) in the future? In terms of breaking changes and possible subclassing.


def __init__(
self,
data: Mapping[ObservableKey, complex],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can and should brush this under the rug for now, but the dtype of the values is complex here, but the StatevectorEstimator is already setting evs to np.float64. If we could have some mechanism to determine tho common type of an array and use that to set the dtype in the DataBin, it would be nice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also,

Suggested change
data: Mapping[ObservableKey, complex],
data: Mapping[ObservableKeyLike, complex],

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to change to float instead of complex (since any hermitian nqubit matrix can be decompose with real coefficients in Pauli basis). Im not sure different exisiting estimators are consistent with enforcing hermitian or not.

Currently for performance its not the inits job to convert Observable-key like, thats the coerce methods job. If you initialize with not an actual observable key validation should fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reference to preceding comment, Iean to Key instead of Term since this object is a mapping, and this type is literally the key type for the mapping. In operator speak my first impression of a term would probably be the key * value (coeff).

qiskit/primitives/containers/observable.py Outdated Show resolved Hide resolved
Raises:
ValueError: If ``validate=True`` and the input observable-like is not valid.
"""
self._data = data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we care about not sharing a reference to the internal thing:

Suggested change
self._data = data
self._data = dict(data.items())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a trade off with that vs performance since this requires an iteration over the data at init

qiskit/primitives/containers/observable.py Show resolved Hide resolved
Comment on lines +149 to +150
qubits = tuple(range(len(key)))
key = (qubits, key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have the same TODO as Pauli.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes we always mean "I" in a key to mean identity, which if we do we should be more explicit about (and in general with strings being single characters same length as qubits etc)

self._num_qubits = num_qubits
return self._num_qubits

def validate(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation allows no terms, which I guess corresponds to the identity observable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should fail if emtpy so identity has to be represented as ((0,), "I") or similar?

qiskit/primitives/containers/observable.py Outdated Show resolved Hide resolved
@chriseclectic
Copy link
Member Author

chriseclectic commented Jan 24, 2024

I should add that i am not a fan of the internal data structure of this class (or of previous ObservableArray values it is encapsulating). A python dict with tuple keys seems like it will scale poorly. Eg something that seems more scalable would be using 1D BitArray as mask of non-iden qubits, 1D float array as coeffs. The labels are the tricky part since they could change in size depending on the number of masked qubits (unless we encode a small number of allowed label terms). Ideally I would like to not enforce an internal representation with this class, but i am struggling to figure out how to do this in a practical way that can still be used by primitive implementers

@chriseclectic chriseclectic modified the milestones: 1.0.0, 1.1.0 Jan 30, 2024
@chriseclectic
Copy link
Member Author

I'm going to close this now that we aren't rushed to put it into 1.0, it would be good to consider the questions raised by this PR and the review comments to come up with a nice proposal for 1.1

@ihincks
Copy link
Contributor

ihincks commented Apr 19, 2024

Reopening for 1.1

@jakelishman jakelishman modified the milestones: 1.1.0, 1.2.0 May 1, 2024
@jakelishman
Copy link
Member

jakelishman commented May 1, 2024

I'm bumping this out of the 1.1, which is my understanding of the intention of the primitives team from discussion with Ian.

Fwiw, we may well want to work out new timescales together to unify this with #12282.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mod: primitives Related to the Primitives module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants