Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add cache/memoization to PVcell properties (#121) #123

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

kandersolar
Copy link
Contributor

This PR adds a caching mechanism to the PVcell class such that successive calls to a property reuse the previously calculated value instead of recalculating it and causing a cascade of other property recalculations as well.

There are two things I want to highlight:

  1. The PR in its current state does fail several tests in the test_setsuns suite due to PVmodule's use of copy.copy() on PVcell objects: because it is a shallow copy, the cache dictionary is shared by all instances, which is of course not desirable. By replacing all of the copy calls with deepcopy I am able to get the tests to pass, but that approach seems rather heavy-handed and I wonder if others see a better approach.
  2. As with all caching, invalidation is the hard part. I'm not familiar enough with the core internals of pvmismatch (really I have only read through the PVcell source code in any depth) to be confident that the cache is cleared in all cases that it should be, although the fact that the test suite passes is encouraging.

Here is a timing comparison -- nothing rigorous, just pytest --durations=0 pvmismatch\tests. Note that these timings were performed using the deepcopy modification described above, which is not currently committed in this PR.

test master PR
test_pvmodule.py::test_calc_tct_mod 3.55s 0.44s
test_setsuns.py::test_gh34_35 3.42s 0.46s
test_settemps.py::test_settemp 3.13s 0.49s
test_setsuns.py::test_basic 1.66s 0.77s
test_pvconstants.py::test_minimum_current_close_to_max_voc_gh110 1.48s 0.76s
test_setsuns.py::test_set_str_2 0.98s 0.13s
test_setsuns.py::test_set_str_1 0.84s 0.14s
test_setsuns.py::test_set_mod_1 0.83s 0.08s
test_setsuns.py::test_set_mod_2 0.80s 0.09s
test_setsuns.py::test_dictionary 0.78s 0.07s
test_pvmodule.py::test_calc_pct_bridges 0.67s 0.21s
test_pvmodule.py::test_calc_pct_mod 0.49s 0.15s
test_pvsystem.py::test_pvsystem_with_pvstrs_obj 0.46s 0.04s
test_pvsystem.py::test_pvsystem_with_no_pvstrs 0.44s 0.05s
test_pvsystem.py::test_pvsystem_with_pvstrs_list 0.42s 0.05s
test_pvmodule.py::test_bypass_diode_configurations 0.05s 0.04s
test_pvstring.py::test_pvstring_with_no_pvmods 0.01s 0.01s
test_pvstring.py::test_pvstring_with_pvmods_obj 0.01s 0.01s
test_pvstring.py::test_pvstring_with_pvmods_list 0.01s 0.01s
test_pvcell.py::test_calc_series 0.01s 0.01s
test_pvmodule.py::test_pvmodule_with_pvcells_obj 0.01s 0.01s
test_pvmodule.py::test_pvmodule_with_no_pvcells 0.01s 0.01s
test_pvmodule.py::test_pvmodule_with_pvcells_list 0.01s 0.01s
test_pvmodule.py::test_calc_mod 0.01s 0.01s
test_settemps.py::test_settemp_cell 0.01s 0.01s

@mikofski
Copy link
Contributor

The PR in its current state does fail several tests in the test_setsuns suite due to PVmodule's use of copy.copy() on PVcell objects

I'm totally struggling to understand what the existing code in setSuns() is actually doing with those copies. I remember vaguely this has to do with an analysis @bmeyers did with thousands of cells and modules which mostly had identical conditions, so we just allowed all of the cells to reference the same exact PVcell object to save memory. Then we only added new objects as needed to change the irradiance or temperature. I believe the shallow copy was used since we only needed a new object as a template, because later I think we use assignment to overwrite the existing attributes with new references. Copy was faster than instantiation. Anyway, I remember that this was a harsh lesson in the dangers of OOP and object proliferation (out of control) and the effect on memory. Anyway, I think the code was too clever and now feels opaque to me. I think it could use a quick clean-up with more comments and more SPACESTM 😉 .

SPACES = simple, performance, accurate, collaborate, extensible, & stable

@kandersolar
Copy link
Contributor Author

I admit I was rather put off by that code too, but your explanation makes sense. I suppose if setSuns is changed, its twin setTemps should as well. Do you have something in mind for the "quick clean-up"? Seems like it should be a separate PR, and this is put on hold in the mean time.

Even with the speed-up in this PR, some quick testing shows instantiation as about 50x the execution time of copy.copy(cell). Maybe instead of the "copy template + overwrite" pattern it would be possible to instantiate the PVcell with the desired parameters so that it doesn't incur the cost of initial calculations for nothing?

@mikofski
Copy link
Contributor

mikofski commented May 1, 2020

setSuns and setTemps are nearly identical. #48 was meant to addresses this with a single generic "update" method for all parameters.

I don't have a plan. I like the idea of replacing copy with instantiation. I don't know where I got the idea copy would be faster. 50x is a big difference.

@kandersolar
Copy link
Contributor Author

Even though a more involved fix might be better overall, a very unintrusive option would be a new PVcell method that looks like this:

def clone(self):
    cloned = copy.copy(self)
    cloned._cache = {}
    return cloned

and use pvcell.clone() instead of copy(pvcell) in the PVmodule methods. Trying it out, it appears to satisfy the test suite. Just another option.

@mikofski
Copy link
Contributor

mikofski commented May 2, 2020

Sounds great. I was thinking something similar but just named PVcell.copy(), altho now I like clone better

So wait, I'm confused which is slower, instantiation or copy? At first I thought copy was faster, but then it seemed like you were saying copy was slower, now it seems like copy is faster. Anyway thanks for diving into this. Anyway I think I misunderstood you earlier

Ping @chetan201

@kandersolar
Copy link
Contributor Author

Sorry, should have been more clear. Copy is faster than instantiation, presumably because instantiation incurs the cost of all the numerical IV calculations but copying does not. Here are the timings with the clone method (updated from above to use the parent setattr to avoid triggering a recalculation).

def clone(self):
    cloned = copy.copy(self)
    super(PVcell, cloned).__setattr__('_cache', {})
    return cloned
In [1]: import pvmismatch

In [2]: %timeit _ = pvmismatch.PVcell()
132 µs ± 3.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [3]: cell = pvmismatch.PVcell()

In [4]: %timeit _ = cell.clone()
3.32 µs ± 117 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

@mikofski
Copy link
Contributor

mikofski commented May 2, 2020

This is very cool! Thanks!

I just realized another reason why shallow copy better than deep copy or instantiation: we want all cells (and other components) to refernece the same pvconst, copy does that, but deep copy & instantiation do not.

@chetan201 do you know if SunPower intends to continue maintaining PVMismatch?

@chetan201
Copy link
Contributor

chetan201 commented May 20, 2020

this is really cool indeed! For some reason, I haven't been getting notifications from this repo to my email so missed all these reminders. My apologies.
I will plan to review and merge soon.
Thanks @kanderso-nrel for these excellent contributions.

@kandersolar
Copy link
Contributor Author

@chetan201 FYI I still need to include the above cloning idea to get this PR ready to merge. I'll try to get to that in the next few days.

@kandersolar
Copy link
Contributor Author

I think this is ready for review. Two comments:

  • For reasons I don't fully understand, some of the copies in PVmodule.setSuns() and setTemps() are needed to pass the test suite and cloning doesn't appear to be a suitable replacement. See the Revert "missed some copies" commit.
  • I have it copy the cache dictionary as well in PVcell.clone() instead of giving the new cell a blank cache. I think that makes sense -- all the derived values should be the same for the cloned cell since the inputs aren't changing, right? Though maybe it doesn't really matter since the inputs get changed anyways right after cloning, so the cache gets invalidated and cleared either way.

@chetan201 chetan201 self-requested a review June 17, 2020 18:36
@mikofski
Copy link
Contributor

@ahoffmanSPWR you should check this out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] avoid redundant PVcell property calls
3 participants