Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPOPT duplicated history handling #252

Merged
merged 19 commits into from
May 18, 2021
Merged

Conversation

kanekosh
Copy link
Contributor

@kanekosh kanekosh commented May 7, 2021

Purpose

Closes #182. I added iter entries to the history data at every call counter. Other than that the hist file is unchanged.
Then the duplicated entries are removed in OptView_baseclasses and History.getValues().
When we read the old hist files which don't have the iter entries, OptView and getValues() raise a warning.

Type of change

What types of change is it?
Select the appropriate type(s) that describe this PR

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (non-backwards-compatible fix or feature)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Documentation update
  • Maintenance update
  • Other (please describe)

Testing

The history file and OptView plot for test_hs015.py. IPOPT.out says it had 12 function evaluations (= iter).
Screenshot from 2021-05-07 16-45-49
Screenshot from 2021-05-07 18-00-23

Checklist

Put an x in the boxes that apply.

  • I have run flake8 and black to make sure the code adheres to PEP-8 and is consistently formatted
  • I have run unit and regression tests which pass locally with my changes
  • I have added new tests that prove my fix is effective or that my feature works
  • I have added necessary documentation

@kanekosh kanekosh requested a review from a team as a code owner May 7, 2021 23:21
@kanekosh kanekosh mentioned this pull request May 7, 2021
12 tasks
@codecov
Copy link

codecov bot commented May 7, 2021

Codecov Report

Merging #252 (184d238) into master (bf140e2) will decrease coverage by 10.83%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master     #252       +/-   ##
===========================================
- Coverage   83.44%   72.61%   -10.84%     
===========================================
  Files          22       22               
  Lines        3250     3268       +18     
===========================================
- Hits         2712     2373      -339     
- Misses        538      895      +357     
Impacted Files Coverage Δ
pyoptsparse/pyoptsparse/pySNOPT/pySNOPT.py 13.86% <0.00%> (-76.24%) ⬇️
pyoptsparse/pyoptsparse/pyNLPQLP/pyNLPQLP.py 25.92% <0.00%> (-66.67%) ⬇️
pyoptsparse/pyoptsparse/pyOpt_solution.py 57.77% <0.00%> (-42.23%) ⬇️
pyoptsparse/pyoptsparse/pyOpt_utils.py 61.86% <0.00%> (-7.91%) ⬇️
pyoptsparse/pyoptsparse/pyOpt_history.py 76.44% <0.00%> (-5.26%) ⬇️
pyoptsparse/pyoptsparse/pyOpt_optimizer.py 83.29% <0.00%> (-0.97%) ⬇️
pyoptsparse/pyoptsparse/pyOpt_optimization.py 78.01% <0.00%> (-0.82%) ⬇️
pyoptsparse/pyoptsparse/pyOpt_error.py 92.00% <0.00%> (+36.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bf140e2...184d238. Read the comment docs.

@kanekosh
Copy link
Contributor Author

kanekosh commented May 8, 2021

I don't know why flake8 is failing, looks like F821 is pyNSGA2/setup.py? Also isort error was fine on my local machine...

@ewu63
Copy link
Collaborator

ewu63 commented May 8, 2021

Don't worry about flake8 and isort, I changed the configuration today but only applied the fixes to PR #251 rather than to the master branch. I will try to fix it, but worst case we can merge this regardless.

Copy link
Collaborator

@ewu63 ewu63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some comments, the main thing I would like to fix is how the callCounter loop is done in History. A few other things:

  • We definitely want to add a test for this, but given that I am working on refactoring the tests in parallel maybe that needs to happen in a separate PR later.
  • The docs page history.rst need to be updated, to add the iter key to the ASCII diagram showing history file structure.

pyoptsparse/pyOpt_optimizer.py Show resolved Hide resolved
@@ -630,29 +630,32 @@ def getValues(self, names=None, callCounters=None, major=True, scale=False, stac
callCounters.append(self.read("last"))
callCounters.remove("last")

self._ipoptIterCounter = -1 # track iteration, only relevant for IPOPT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this approach here. Can we go through a pre-processing step to determine the "suitable" callCounters to loop over in the main loop instead? We should be able to come up with an approach that does not depend on checking the optimizer, and is generally applicable to all history files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current method should work for other optimizers as well (will double check), I just put IPOPT in the variable/method names so that what I intend here is clear. But I will rename those.
I don't know if we want to have another pre-processing loop, as we are already doing some callCounter validation in the main loop (func vs funcSense, major, fail)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was thinking of creating a loop before we parse the file, where we generate the list of callCounters given all the various input flags. Then we only loop over those. But maybe that would decrease the efficiency so maybe it's not worth it. I'm okay with keeping it as is, but maybe some minor refactor would be helpful:

  • lump all the checks into a single function that validates the callCounter against things like funcs, major flags etc
  • make it general for all optimizers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored

pyoptsparse/postprocessing/OptView_baseclass.py Outdated Show resolved Hide resolved
pyoptsparse/pyOpt_history.py Show resolved Hide resolved
@kanekosh
Copy link
Contributor Author

kanekosh commented May 8, 2021

I don't have a good idea for the test other than comparing the final iterCounter with the hardcoded number of iterations we get from each optimizers' output file. But it'll be painful to retrain the ref data (because someone has to go through all the output files manually).
Do you have any suggestions?

@ewu63
Copy link
Collaborator

ewu63 commented May 9, 2021

I don't have a good idea for the test other than comparing the final iterCounter with the hardcoded number of iterations we get from each optimizers' output file. But it'll be painful to retrain the ref data (because someone has to go through all the output files manually).
Do you have any suggestions?

I think it's fine to do that for one test and for just IPOPT. We can have another one to test that getValues gets you the correct values. We can work on this in a separate PR or something given the ongoing refactor.

@kanekosh
Copy link
Contributor Author

Ready for another review.
I add a new method _generateValidCallCounters() which does all the call counter checks. This follows the same logical flow (I tested with old and new hist files from IPOPT and SNOPT), but we may want to wait to merge until adding new tests?

Copy link
Collaborator

@ewu63 ewu63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think? Would the code still be readable?

pyoptsparse/postprocessing/OptView_baseclass.py Outdated Show resolved Hide resolved
@@ -630,40 +630,26 @@ def getValues(self, names=None, callCounters=None, major=True, scale=False, stac
callCounters.append(self.read("last"))
callCounters.remove("last")

self._ipoptIterCounter = -1 # track iteration, only relevant for IPOPT
# get a list of valid call counters
validCallCounters = self._generateValidCallCounters(callCounters, user_specified_callCounter, allowSens, major)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I am looking at this again, I think this may have a significant performance penalty since we are looping over the database twice. Could we have a function that checks if the current callCounter is valid? We would call that at every iteration, and only proceed if valid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, yeah I think this is better now

Copy link
Collaborator

@ewu63 ewu63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just some minor comments. I will wait for others to take a look too. Thanks for this work!

pyoptsparse/pyOpt_history.py Show resolved Hide resolved
return data

def _readValidCallCounter(self, i, user_specified_flag, allowSens_flag, major_flag):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add docstrings

@ewu63 ewu63 requested review from marcomangano and sseraj May 11, 2021 20:48
ewu63
ewu63 previously approved these changes May 11, 2021
Copy link
Contributor

@marcomangano marcomangano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time I look into pyOptSparse I learn something new that I completely ignored before - i.e. the duplicated func and sens calls that are managed within the History() class.

So, if I get it right (aside the warnings for "old" hst files) you added the iter key to facilitate the process of picking the "right" stored iter - namely the one with the function evaluation - by checking against the cached value of the DV vector. This cached value is currently used only to ensure we are not calling the function evaluation unnecessarily within the masterFunc() method. A few comments:

  • I am getting a bit confused with the major/minor iteration definition as I am used to SNOPT, here we are just talking about func eval vs fcon/sens eval?
  • for @nwu63: I don't get (even in our current approach) how we squeeze all the info into the major iteration. Are we doing it at all - i.e. saving the sensitivities values within a major iter - or are we just discarding that info? I might be confused again by SNOPT which has the optimality key for every iter
  • looks like _readValidCallCounter mostly overlaps with Pointexist(), maybe we can get rid of that?
  • I have further questions about some low-level machinery in Optimizer() now that we scratched it with this cache thing, but that is a separate discussion

Thanks a lot for putting this together! Questions aside, I would also feel safer if @sseraj takes a look at this too

@kanekosh
Copy link
Contributor Author

you added the iter key to facilitate the process of picking the "right" stored iter - namely the one with the function evaluation - by checking against the cached value of the DV vector.

Yes, and iter can be helpful in case users look into the hist file by themselves?

I am getting a bit confused with the major/minor iteration definition as I am used to SNOPT, here we are just talking about func eval vs fcon/sens eval?

Minor iterations are internal iterations for solving the QP subproblems I believe.

looks like _readValidCallCounter mostly overlaps with Pointexist(), maybe we can get rid of that?

Pointexist() only checks if the call counter is within the range of the hist file, _readValidCallCounter() checks a few other things as well. Also Pointexist() is a public method so it might be being called by some other code, so we don't want to change it.

Copy link
Contributor

@sseraj sseraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good

pyoptsparse/pyOpt_history.py Outdated Show resolved Hide resolved
@marcomangano
Copy link
Contributor

Yes, and iter can be helpful in case users look into the hist file by themselves?

Yeah this seems an improvement beyond the IPOPT issue itself

Minor iterations are internal iterations for solving the QP subproblems I believe.

Exactly, here the difference is between major (i.e. new x) and the other stored calls for fcon anf g, sounds good!

Pointexist() only checks if the call counter is within the range of the hist file, _readValidCallCounter() checks a few other things as well. Also Pointexist() is a public method so it might be being called by some other code, so we don't want to change it.

Good point on the method being public. The snippet itself is pretty small so I agree that touching it would be pretty unnecessary right now

@marcomangano
Copy link
Contributor

I think once Sabet's comments are addressed this is good to go!

sseraj
sseraj previously approved these changes May 13, 2021
Copy link
Contributor

@sseraj sseraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we merge #251 first?

@ewu63
Copy link
Collaborator

ewu63 commented May 13, 2021

Should we merge #251 first?

Yes I think that'd be best, then we can add some new tests without causing merge conflicts.

@marcomangano
Copy link
Contributor

marcomangano commented May 15, 2021

@kanekosh we just merged the test refactoring PR. The only thing missing for this PR to be merged is the additional test for IPOPT stored values, do you think you can address this soon?

@kanekosh
Copy link
Contributor Author

kanekosh commented May 17, 2021

will do this week

  • check that the iteration counters are correct
  • check that getValues() does not return the duplicated entry.

@marcomangano
Copy link
Contributor

The test looks good, it makes sense to split it from the main parameterized now. Any more actionables here, aside #259 and #250 ?


# Check iteration counters
hist = History(self.histFileName, flag="r")
data_init = hist.getValues(names=["iter"], callCounters=[0], allowSens=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a similar test via the read function? I want to make sure that the history file is "correct" and the read function should be used instead of getValues since it does a bunch of other things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced getValues with read for the iterchecks.
I didn't replace it for the second part (where I loop over the iteration and check the consecutive objective values) because we do want to test getValues there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep looks good to me

self.assertEqual(0, data_init["iter"])
data_last = hist.getValues(names=["iter"], callCounters=["last"], allowSens=True)
data_last = hist.read(hist.read("last"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this duplicated? you just get the int counter with the first call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? hist.read("last") returns the last call counter (integer)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense!

Copy link
Contributor

@marcomangano marcomangano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go! Do we want to address the issues with getValues() before a new release?

@marcomangano marcomangano merged commit 9e45e5e into mdolab:master May 18, 2021
@kanekosh kanekosh deleted the itercouter branch June 8, 2023 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IPOPT history recording
4 participants