Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save experiment and analysis config as artifacts #1424

Draft
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

coruscating
Copy link
Collaborator

@coruscating coruscating commented Mar 18, 2024

Summary

Adds BaseExperiment.load and BaseExperiment.save methods and a helper function load_all() that loads both experiment and experiment data, with the option to run analysis:

from qiskit_experiments.framework import load_all
from qiskit_ibm_runtime import QiskitRuntimeService
service = QiskitRuntimeService(channel="ibm_quantum")

exp, exp_data = load_all("9e242995-d793-4940-920f-f3970d7f777b", provider=service, run_analysis=True)

This PR incorporates changes from @ItamarGoldman's in #1423. To enable saving and loading, this PR saves experiment and analysis config as experiment_config and analysis_config in experiment data artifacts. Because analysis_options shouldn't be added as a new artifact every time analysis is rerun, this also adds overwrite_name option to add_artifacts() and renames the current overwrite option to overwrite_id. have been incorporated.

To-do

  • add tests
  • add seeds used in final circuit generation
  • update rerun analysis and cloud database service documentation with usage examples

Notes

load_all() is inefficient because BaseExperiment.load() and ExperimentData.load() make separate and identical calls to the experiment service. Ideally this would be combined in a separate interface class in a future refactoring.

@coruscating coruscating added this to the Release 0.7 milestone Mar 19, 2024
Copy link
Contributor

@yaelbh yaelbh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only a partial review for now. I'm publishing it to emphasize some possible issues that I've spotted in BaseAnalysis.run.

I'm intentionally not checking the Request changes box, because I'm going on vacation in a few days.


Raises:
QiskitError: If the experiment class is not stored,
QiskitError: If the drawer class is not stored,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace "successful" with "successfully" in the subsequent line

type is currently stored, or the :meth:`~.BaseExperiment.from_config` class method of the
appropriate experiment.
This stores the current configuration of a :class:`.BaseDrawer` and can be used to reconstruct
the drawer class using either the :meth:`drawer` property if the drawer class type is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Property of which class? And how is it related to whether the class type is stored?

def config(self) -> AnalysisConfig:
"""Return the config dataclass for this analysis. We replicate the method from `BaseAnalysis`
because we cannot directly modify the returned object. This limitation arises from our use of
it hashable we use `dataclasses.dataclass(frozen=True)` to ensure its hashability.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last sentence is grammatically incorrect (and indeed it's not clear what it's trying to say).

else:
self._artifacts[artifact.artifact_id] = artifact
else:
self._artifacts[artifact.artifact_id] = artifact
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here is somewhat strange and unintuitive, I don't know if it's documented anywhere, and it's not clear where it comes from. From the code I see that multiple, say five calls to add_artifacts with overwrite_name=False will create a collection of five different artifacts having the same name. Until the sixth call sets overweite_name to True, in this case all the five existing artifacts with that name will remain in the collection, but will be overridden by the new artifact. However a sixth artifact with artifact_id will not be added.

expdata.add_artifacts(result)
if (
result.name != "analysis_config"
): # Not saving constituent analysis configs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the flow that will make _run_analysis return an analysis config?

ArtifactData(name="analysis_config", data=self.config()),
overwrite_name=True,
overwrite_id=True,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If replace_results is False then we create a copy of the ExperimentData object. Both the original object and its copy will now contain the analysis config artifact, although the original one will not contain the corresponding analysis result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also concerned about the old analysis config that's stored in the experiment data and is now overridden. The concern applies for the two possible values of replace_results.

args = tuple(getattr(self, "__init_args__", OrderedDict()).values())
kwargs = dict(getattr(self, "__init_kwargs__", OrderedDict()))
# Only store non-default valued options
options = dict((key, getattr(self._options, key)) for key in self._set_options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BaseAnalysis and BaseCurveAnalysis should call a common helper function, instead of duplicating the code. This is possible in spite of everything being hashable.


@dataclasses.dataclass(frozen=True)
class PlotterConfig:
"""Store configuration settings for a Drawer class.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Store configuration settings for a Drawer class.
"""Store configuration settings for a Plotter class.

cls: type = None
options: Dict[str, Any] = dataclasses.field(default_factory=dict)
figure_options: Dict[str, Any] = dataclasses.field(default_factory=dict)
# drawer: Dict[str, Any] = dataclasses.field(default_factory=dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# drawer: Dict[str, Any] = dataclasses.field(default_factory=dict)

"""Store configuration settings for a Drawer class.

This stores the current configuration of a :class:`.BasePlotter` and can be used to reconstruct
the plotter class using either the :meth:`plotter` property if the plotter class type is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before.


Raises:
QiskitError: If the plotter class is not stored,
was not successful deserialized, or reconstruction
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
was not successful deserialized, or reconstruction
was not successfully deserialized, or reconstruction

if not service:
raise QiskitError("A service must be provided to save the experiment.")
if not self.backend and not backend:
raise QiskitError("Backend must be set to save the experiment.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the backend mandatory?

f"but you have versions {cur_versions}."
)
try:
reconstructed_experiment = experiment_config.cls.from_config(experiment_config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of curiosity, does the cls here have any effect? Or is it only for readability?

@@ -2404,7 +2409,7 @@ def copy(self, copy_results: bool = True) -> "ExperimentData":

with self._artifacts.lock:
new_instance._artifacts = ThreadSafeOrderedDict()
new_instance.add_artifacts(self._artifacts.values())
new_instance.add_artifacts(self._artifacts.values(), overwrite_id=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it matter?

@@ -552,27 +554,34 @@ def image(
def figure(self):
"""Return figure object handler to be saved in the database."""

def config(self) -> Dict:
def config(self) -> DrawerConfig:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interface change, do we have to follow some type of deprecation policy?

if "figure_options" in value:
instance.set_figure_options(**value["figure_options"])
return instance
return cls.from_config(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have configs, do we still need __json_encode__ and __json_decode__?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are still required, with __json_encode__ now returning PlotterConfig and not dict, would that work?

Copy link
Contributor

@eliarbel eliarbel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Various questions and comments at this stage. I couldn't fully test it due to issues with BaseExperiment.load; once solved I'll run some manual tests

@@ -553,36 +554,39 @@ def _configure_drawer(self):
# Use drawer.set_figure_options so figure options are serialized.
self.drawer.set_figure_options(**_drawer_figure_options)

def config(self) -> Dict:
def config(self) -> PlotterConfig:
"""Return the config dictionary for this drawing."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Return the config dictionary for this drawing."""
"""Return the config dataclass for this plotter."""


@classmethod
def from_config(cls, config: Union[PlotterConfig, Dict]) -> "BasePlotter":
"""Initialize a plotter class from analysis config"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Initialize a plotter class from analysis config"""
"""Initialize a plotter class from a plotter config"""

@@ -552,27 +554,34 @@ def image(
def figure(self):
"""Return figure object handler to be saved in the database."""

def config(self) -> Dict:
def config(self) -> DrawerConfig:
"""Return the config dictionary for this drawer."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Return the config dictionary for this drawer."""
"""Return the config dataclass for this drawer."""

}
@classmethod
def from_config(cls, config: Union[DrawerConfig, Dict]) -> "BaseDrawer":
"""Initialize a drawer class from analysis config"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Initialize a drawer class from analysis config"""
"""Initialize a drawer class from a drawer config"""

ret.set_options(**config.options)
if config.figure_options:
ret.set_figure_options(**config.figure_options)
return ret

def __json_encode__(self):
return self.config()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this now returning DrawerConfig and not a dict, would this work?

if "figure_options" in value:
instance.set_figure_options(**value["figure_options"])
return instance
return cls.from_config(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these are still required, with __json_encode__ now returning PlotterConfig and not dict, would that work?

from qiskit_experiments.framework import BaseExperiment, ExperimentData


def load_all(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would change name to something a bit more informative, like load_experiment_and_data

@@ -422,3 +424,36 @@ def _initialize(
DeprecationWarning,
)
self.set_options(data_subfit_map=data_subfit_map)

def config(self) -> AnalysisConfig:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have config/from_config with plotter handling specifically here in BaseCurveAnalysis and not in BaseAnalysis? What about analyzes not derived from BaseCurveAnalysis, e.g. TomographyAnalysis or StarkP1SpectAnalysis?

data = service.experiment(experiment_id, json_decoder=ExperimentDecoder)

# Recreate artifacts
experiment_config_filename = "experiment_config.zip"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better store the name in one place to be used here and in `ExperimentData.init" (maybe in helpers.py to avoid cyclic dependency)

@@ -171,13 +179,20 @@ def config(self) -> ExperimentConfig:
(key, getattr(self._transpile_options, key)) for key in self._set_transpile_options
)
run_options = dict((key, getattr(self._run_options, key)) for key in self._set_run_options)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an issue with config handling when it comes to serializing a backend object when one is provided in the __init__ method. In this case, the backend is included in the config e.g. via kwargs and is serialized in way that upon deserialization (through BaseExperiment.load) the following warning is emitted:

/tmp/qiskit-experiments/qiskit_experiments/framework/json.py:581: UserWarning: Could not deserialize instance of class <class 'qiskit_ibm_runtime.ibm_backend.IBMBackend'> from settings {}. Value is {'class': <class 'qiskit_ibm_runtime.ibm_backend.IBMBackend'>, 'settings': {}, 'version': '0.23.0'}
The following exception was raised:
Traceback (most recent call last):
File "/tmp/qiskit-experiments/qiskit_experiments/framework/json.py", line 338, in _deserialize_object
return cls(**settings)
TypeError: IBMBackend.init() missing 3 required positional arguments: 'configuration', 'service', and 'api_client'

return _deserialize_object(obj_val)

The experiment still loads, but this warning is annoying.

cls,
experiment_id: str,
service: Optional[IBMExperimentService] = None,
provider: Optional[Provider] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If service it not None, but provider is None, we get this warning:

/tmp/qiskit-experiments/qiskit_experiments/framework/base_experiment.py:286: UserWarning: Unable to retrieve backend.
warnings.warn("Unable to retrieve backend.")

I think this should be fixed - either don't allow an empty provider or update the doc string to explain the behavior when provider is empty. Currently the doc string explicitly says that IBMProvider is required

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants