Investigate performance of ExperimentData regarding adding more data to it #1394

Open

wshanks opened this issue Feb 6, 2024 · 0 comments

wshanks (Contributor) commented Feb 6, 2024

In #1360, there was some discussion that we give up performance in order to allow experiments to be extended with additional data. We should look at how much ExperimentData could benefit from only allowing a fixed set of data, and whether it would be better to write a function that runs analysis on multiple ExperimentData objects. Alternatively, calling .run() a second time could generate a new ExperimentData object, with a new experiment ID, that combines the old data and the new data; that might require a new method, .extend_run(), rather than changing run().
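A minimal sketch of the second idea, assuming the combine-on-extend semantics described above. The class here is a stand-in, not the real qiskit-experiments ExperimentData, and extend_run is a hypothetical method name:

```python
import uuid

class ExperimentData:
    """Toy stand-in for ExperimentData; names and behavior are hypothetical."""

    def __init__(self, data=None, experiment_id=None):
        # A fixed snapshot of data; never mutated after construction.
        self.data = list(data or [])
        self.experiment_id = experiment_id or str(uuid.uuid4())

    def extend_run(self, new_data):
        """Return a NEW ExperimentData with a fresh experiment ID that
        combines the old data and the new data, leaving self untouched."""
        return ExperimentData(self.data + list(new_data))

first = ExperimentData([{"counts": {"0": 40, "1": 60}}])
second = first.extend_run([{"counts": {"0": 45, "1": 55}}])

assert second.experiment_id != first.experiment_id
assert len(first.data) == 1 and len(second.data) == 2
```

Keeping each ExperimentData immutable after creation is what would let the internal tables be built once, rather than supporting incremental appends.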

          Another option would be to stop over-engineering. We currently allow users to run an experiment on existing experiment data to get a more statistically confident result, but I don't think this is a heavily used feature, and we could remove it. In principle we run a very expensive operation for the sake of something of little benefit: if we have 100 xvals, we need to iterate over them and create a new table of size 1 for each one just to average. Even if we switch to polars, there will still be the overhead of object creation.

Originally posted by @nkanazawa1989 in #1360 (comment)
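The per-point overhead described in the quote can be sketched as follows, using pandas as a stand-in for the internal table. The column names (xval, yval) are illustrative, not the actual ExperimentData schema:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# 100 xvals with 10 shots each, mimicking repeated measurements per x value.
df = pd.DataFrame({
    "xval": np.repeat(np.arange(100), 10),
    "yval": rng.normal(size=1000),
})

# Per-point pattern from the quote: build one size-1 table per xval,
# then concatenate them -- 100 small object creations.
per_point = pd.concat(
    [pd.DataFrame({"xval": [x], "yval": [g["yval"].mean()]})
     for x, g in df.groupby("xval")],
    ignore_index=True,
)

# Vectorized alternative: a single groupby with no per-point tables.
vectorized = df.groupby("xval", as_index=False)["yval"].mean()

assert np.allclose(per_point["yval"], vectorized["yval"])
```

Both produce the same averages; the difference is that the first allocates one DataFrame per x value, which is the object-creation overhead the comment argues would remain even after a switch to polars.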
