Simplify Binder build configuration and notebook execution #50

willingc · 2019-07-23T18:47:03Z

@shaunagm This WIP PR updates the Binder build.

The todo.md file has some recommendations after I ran all the notebooks. If you want to walk through this later this week, I would be happy to do so.

willingc · 2019-07-23T18:53:02Z

Looks like the missing file is due to a data directory/submodule not being copied to this repo.

JupyterLab's demo repo for Binder has a nice layout, with separate directories for data files and notebooks. It also uses pyinvoke, which simplifies a local install.

shaunagm · 2019-07-23T19:05:58Z

The missing file issue (the usacturial.txt issue) was due to a problem with how our setup.py was configured. The .txt file should actually be part of our HARK package, which these notebooks use. Unfortunately, fixing this for one notebook involves changing the HARK version for all our notebooks, which we'd strongly prefer not to do.

And yes, let's do a videocall later this week?

willingc · 2019-07-23T19:08:52Z

Sure. I'm not clear on why the notebooks would need to all change version unless there are major breaking changes in EconARK's API.

I could set up another branch on my repo that pulls the latest commit from master for EconARK so at least you could test it out if you want. Should only take 5 minutes.

shaunagm · 2019-07-23T19:11:38Z

We don't want them to change version - we'd like to reach a point with most of these notebooks where they're "done" and we don't ever touch them again. Changing versions of Econ-ARK and then having to update the notebooks so they don't break when Econ-ARK has a breaking change is exactly what we're trying to avoid. But newer notebooks will need newer versions of Econ-ARK. If all notebooks in binder must have the same version of Econ-ARK we are trapped having to occasionally update 30+ notebooks to use new versions.

willingc · 2019-07-23T19:28:33Z

@shaunagm This branch https://github.com/willingc/DemARK/tree/try-master-latest-sha uses the latest master on Binder and it looks like stuff is working.

Look at the binder folder requirements.txt and see how the version for binder is configured to pull from master. The notebooks with the missing file are working now.

As an FYI, you can have binder serve a different version of Econ-ARK than the repo itself. Binder will default to taking the requirements in the binder folder first and if not found use the requirements at the repo's root.

Here's the binder link for a build using the master of econ-ARK: https://mybinder.org/v2/gh/willingc/DemARK/try-master-latest-sha

shaunagm · 2019-07-23T19:59:11Z

Oh, neat. Could we do something like git+git://github.com/econ-ark/HARK.git@latest_stable_release so that, unless overridden in the notebook itself, the notebooks would automatically get whatever branch we'd created under the name latest_stable_release? (We already create release branches, so we could just duplicate the stable one to latest_stable_release.) That's probably safer than just pinning to master, which we introduce bugs into all the time.

willingc · 2019-07-23T20:16:38Z

Yep you can pin it to any SHA1, tag, or release.
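
For illustration, here is a minimal sketch of how such a pinned requirement line could be built. The helper name `vcs_requirement` is hypothetical, `latest_stable_release` is the branch name proposed above (it may not exist), and `https` is assumed as the transport instead of the `git://` form quoted earlier.

```python
HARK_REPO = "https://github.com/econ-ark/HARK.git"


def vcs_requirement(repo_url: str, ref: str) -> str:
    """Build a pip VCS requirement line pinned to a branch, tag, or commit SHA.

    `ref` is whatever comes after the '@' in the requirement: pip hands it to
    git, so any ref that git can check out should work.
    """
    if not ref or any(ch.isspace() for ch in ref):
        raise ValueError("ref must be a non-empty name without whitespace")
    return f"git+{repo_url}@{ref}"


# Hypothetical branch name taken from the discussion above.
print(vcs_requirement(HARK_REPO, "latest_stable_release"))
```

The resulting line is what would go into the requirements.txt that Binder reads, so repointing the branch changes what new Binder builds install without touching the notebooks.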

llorracc · 2019-07-25T00:37:38Z

This is related to a pet idea of mine, discussed and articulated in this PR <econ-ark/HARK#275>. Briefly, we should figure out how to add the *.py files of some subset of the notebooks in the DemARK/REMARK repos to our testing/CI regime. If we had had that infrastructure in place, we would have caught a subtle bug introduced by a PR that nearly derailed the weeklong class I am teaching in Budapest this week. Nobody seems to know an easy way to do this, which is surprising to me.

-- - Chris Carroll

willingc · 2019-07-25T03:32:13Z

@llorracc Shauna and I chatted this morning. There are 3 general ways to test notebooks: nbval, nbconvert, and papermill, typically with pytest as the test runner. A good resource for all things notebooks and reproducible research is https://github.com/jupyter-guide/jupyter-guide, along with the related https://github.com/jupyter-guide/ten-rules-jupyter.

shaunagm · 2019-07-25T14:57:59Z

@llorracc can you get us an initial list of notebooks you want to run, that should pass (or fail!) quickly?

Also: see this open active discussion on testing notebooks.

llorracc · 2019-07-25T21:39:36Z

@willingc Thanks very much for your extensive improvements on a bunch of fronts.

I was about to merge but then realized that this discussion is independently valuable and would be less prominent after a merge.

Having reread the discussion in the related issue I think my preferences have clarified.

Steps:

1. Get to a point where we have verified that ALL DemARKs and REMARKs work with the current stable release.
   - We are pretty close to that point right now; the Alphacruncher guys have tested all DemARKs and REMARKs and I think we have addressed most or all of the problems
2. Set things up so that, when there is a new dev release, that triggers a process to run all of the DemARK *.py files and all the REMARK do_min.py files
   - If there are notebooks that break, we can either:
     1. Do a "quick fix" which is to pin the notebooks to the previous dev release and put the unpinned version on a "to-fix" branch; or
     2. Actually fix the problem(s) and post the corrected versions

For every notebook that breaks, we therefore have a means to unbreak it.

I much prefer this workflow to pinning everything to the current (working) version, precisely because I think that trying to run existing notebooks on a new candidate release is an excellent tool for bug-finding in the new release.
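
The "run all of the *.py files" step could be sketched roughly as below. This uses only the standard library and is not one of the notebook-testing tools mentioned earlier in the thread (nbval, nbconvert, papermill). The function name `run_scripts` and the timeout value are assumptions for illustration, not part of any existing Econ-ARK tooling.

```python
import subprocess
import sys
from pathlib import Path


def run_scripts(script_paths, timeout=600):
    """Run each script in a fresh interpreter and collect the failures.

    Returns a dict mapping the script path to its error text. An empty dict
    means every script exited with status 0 within the timeout.
    """
    failures = {}
    for path in script_paths:
        path = Path(path)
        try:
            result = subprocess.run(
                [sys.executable, path.name],
                cwd=path.parent,      # run next to the script so relative data paths resolve
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            failures[str(path)] = f"timed out after {timeout}s"
            continue
        if result.returncode != 0:
            failures[str(path)] = result.stderr.strip()
    return failures
```

A CI job could call `run_scripts(sorted(Path("notebooks").glob("*.py")))` (the `notebooks` directory name is an assumption) and fail the build whenever the returned dict is non-empty. The scripts that appear in it are the candidates for the quick fix or the real fix described above.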

willingc · 2019-07-26T00:58:42Z

> I was about to merge but then realized that this discussion is independently valuable and would be less prominent after a merge.

No rush to merge. 😄

> Having reread the discussion in the related issue I think my preferences have clarified.
>
> Steps:
>
> Get to a point where we have verified that ALL DemARKs and REMARKs work with the current stable release.

Excellent plan!

> We are pretty close to that point right now; the Alphacruncher guys have tested all DemARKs and REMARKs and I think we have addressed most or all of the problems

Nice!

> Set things up so that, when there is a new dev release, that triggers a process to run all of the DemARK *.py files and all the REMARK do_min.py files

I would recommend testing the notebooks on all pushes to master and running the matrix on all supported Python versions and nightly. This way there should be minimal breakage at release time. Azure Pipelines may be a good CI tool instead of or in addition to Travis or CircleCI, since it supports Windows and is usually much faster.

> If there are notebooks that break, we can either:
>
> 1. Do a "quick fix" which is to pin the notebooks to the previous dev release and put the unpinned version on a "to-fix" branch; or
> 2. Actually fix the problem(s) and post the corrected versions
>
> For every notebook that breaks, we therefore have a means to unbreak it.
>
> I much prefer this workflow to pinning everything to the current (working) version, precisely because I think that trying to run existing notebooks on a new candidate release is an excellent tool for bug-finding in the new release.

Yep, getting tests written would simplify quite a bit, since you should be able to have Binder use the latest stable version, which would be pulled into the container if it is unpinned.

llorracc · 2019-07-27T22:42:14Z

Seems like the right way to do this is to have some way of tagging the notebooks that should be automatically run, maybe by adding something to the metadata, and then the script should just run those. Either that or have in the DemARK (and REMARK) repos a /test-me folder or something like that with a list of the names of the .py files that should be run when there is a pull request.
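
A rough sketch of the metadata-tagging option, using only the standard library. The metadata key `run_in_ci` and the function name `tagged_notebooks` are hypothetical; nothing in Jupyter or in these repos defines them. It relies only on the fact that an .ipynb file is JSON with a top-level `metadata` object.

```python
import json
from pathlib import Path

CI_FLAG = "run_in_ci"  # hypothetical key, chosen here purely for illustration


def tagged_notebooks(root, flag=CI_FLAG):
    """Return the notebooks under `root` whose top-level metadata sets `flag` to true."""
    selected = []
    for nb_path in sorted(Path(root).rglob("*.ipynb")):
        # Skip Jupyter's autosave copies.
        if ".ipynb_checkpoints" in nb_path.parts:
            continue
        with open(nb_path, encoding="utf-8") as fh:
            notebook = json.load(fh)
        if notebook.get("metadata", {}).get(flag) is True:
            selected.append(nb_path)
    return selected
```

The test script would then run only the notebooks (or their paired .py files) that this returns. The other option, a plain list of file names kept in the repo, avoids opening every notebook but has to be kept in sync by hand.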

project-bot bot added this to Needs Triage in Issues & PRs Jul 23, 2019

shaunagm mentioned this pull request Jul 23, 2019

Identify "finished" notebooks and pin them to versions of HARK #51

Closed

willingc added 13 commits July 24, 2019 12:52

Add local requirements and unpin packages

bb875ec

Update readme

837d2a4

rename binder folder for testing

6999deb

rename requirements so binder will find at root

991f95f

add binder link to carol branch

6234064

add extension install/enable

a526ed4

add temporary todo file

0dde7d3

add requirements and post build instructions

0f04d0f

add credential to todo

1d8f0e7

update instructions in README

8f0e60c

add notes after running all notebooks

99432af

remove holding directory of old binder config

9c087c6

clean up README

b4ecf88

willingc force-pushed the troubleshoot branch from 9717278 to b4ecf88 Compare July 24, 2019 19:55

willingc added 6 commits July 24, 2019 13:05

add repo documentation

77ffd6b

move todo to docs directory

9c185cb

pin to latest dev release for binder

c072df3

add conda environment file and install docs

f4e854c

add docker install

f3980b7

fix typo

fdfc496

willingc changed the title ~~[WIP] Troubleshoot Binder build and notebook execution~~ Simplify Binder build configuration and notebook execution Jul 24, 2019

llorracc assigned MridulS Aug 6, 2020
