This repository has been archived by the owner on Mar 1, 2023. It is now read-only.

Tutorials and documentation #2057

Open
valeriabarra opened this issue Feb 24, 2021 · 3 comments

Comments

@valeriabarra
Member

Description

Our tutorials are currently used to generate documentation. This is convenient because it avoids the maintenance burden of keeping two diverging copies: the tutorial source code (intended as interactive scripts) and the documentation. Also, our setup with Literate.jl and Documenter.jl lets us execute the generated notebook, so that any plots in a tutorial show up in the docs. However, the time these tutorials take is becoming prohibitive and is making our CI time out.

My understanding at the moment is that the generated .ipynb notebooks are used more to generate documentation material than as educational/interactive material, which is how I think notebooks are commonly intended. (Notebooks are often hosted on platforms like Binder so that users can interactively experiment with different parameters in the example, without having to clone the repository and rely on their local environment.)

@jakebolewski pointed out on Slack that the newer version of Literate supports including .png / .svg files, so that they are written to the file system and then referenced from the markdown files. Could you please confirm this? If so, the notebook generation step seems redundant and could probably be avoided.

Any other ideas for how we can improve this would be appreciated.

cc: @kpamnany @charleskawczynski @simonbyrne

@jakebolewski
Contributor

jakebolewski commented Mar 8, 2021

The issue is really the cumulative runtime of the tutorials, not small things like generating extra files (such as .ipynb files) that might now be unnecessary.

The amount of parallelism available right now, given the way the docs are structured, is limited by the memory needed by the most expensive tutorial (roughly 6 GB at runtime). For a 64 GB node, that means we can run ~10 tutorials in parallel. One way around this would be to split the tutorial generation steps into separate Slurm tasks and build the documentation once they have finished. But really that would only lower the runtime to the doc deps setup + the longest tutorial + the serial build time, which would probably still be well over an hour.
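For illustration, the arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not part of the actual CI setup: the function names are hypothetical, and the timings passed to `split_build_lower_bound` are placeholders rather than measured values. Only the 64 GB and ~6 GB figures come from the comment.

```python
import math


def max_parallel_tutorials(node_memory_gb: float, peak_tutorial_gb: float) -> int:
    """Number of tutorials that fit in memory at once, assuming each
    may need as much memory as the most expensive one."""
    return math.floor(node_memory_gb / peak_tutorial_gb)


def split_build_lower_bound(setup_min: float,
                            tutorial_minutes: list[float],
                            serial_build_min: float) -> float:
    """Best-case wall time when every tutorial runs as its own task:
    deps setup + longest tutorial + serial docs build."""
    return setup_min + max(tutorial_minutes) + serial_build_min


# Figures from the comment: 64 GB node, ~6 GB peak per tutorial.
print(max_parallel_tutorials(64, 6))

# Placeholder timings in minutes (hypothetical, for illustration only).
print(split_build_lower_bound(15, [20, 45, 30], 10))
```

Note that the lower bound depends only on the single longest tutorial, so adding more nodes does not help once every tutorial has its own task.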

@valeriabarra
Member Author

What I mentioned before in the Slack channel, and what I find counterintuitive, is that these tutorials are executed to generate documentation. In some other projects I worked on (and thus I may be biased by this past experience), tutorials as a means of educational material (often interactive) are separate from documentation generation. I know this adds a maintenance burden, because the two can easily fall out of sync and you have to maintain both, but it would significantly cut the time of our Docs build. My understanding from @charleskawczynski's comment in the Slack discussion was that they are currently executed to generate plots (when applicable).

@jakebolewski
Contributor

jakebolewski commented Mar 8, 2021

I think in practice no one would check that the tutorials still worked if we did not build them on every merge commit, so they would rapidly go out of sync. Also, the tutorial output could materially change if some internals changed, so you can't necessarily rebuild them only when the tutorial files change.
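To make the last point concrete, here is a minimal Python sketch of why a rebuild check keyed only on the tutorial file is not enough. It is not something the project does; the function names and file layout are hypothetical. The idea is that a "has this tutorial changed?" key would have to cover the package sources the tutorial depends on as well as the tutorial script itself.

```python
import hashlib
from pathlib import Path


def digest_files(paths) -> str:
    """SHA-256 over the names and contents of the given files,
    visited in sorted order so the result is deterministic."""
    h = hashlib.sha256()
    for p in sorted(Path(p) for p in paths):
        h.update(str(p).encode())
        h.update(p.read_bytes())
    return h.hexdigest()


def tutorial_cache_key(tutorial, source_files) -> str:
    """Hypothetical rebuild key: changes when the tutorial script OR
    any source file it depends on changes."""
    return digest_files([tutorial, *source_files])
```

Since most merge commits touch the package sources, such a key would change on almost every merge, which is consistent with the argument for rebuilding the tutorials each time.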
