Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linting and formatting notebooks #45

Open
rossbar opened this issue Jul 13, 2021 · 1 comment
Open

Linting and formatting notebooks #45

rossbar opened this issue Jul 13, 2021 · 1 comment

Comments

@rossbar
Copy link
Contributor

rossbar commented Jul 13, 2021

There have been some previous discussions on how best to handle tasks like linting and formatting the notebooks. For example - we'd like to have the code cells formatted (e.g. via black) and have the markdown cells respect line character limits. Ultimately, I think we'd like to be able to incorporate automated linting/formatting to the workflow so that authors don't have to think about it at all. This issue is for starting the discussion on how best to do that.

These topics have been discussed in other communities as well, most notably in mwouts/jupytext#432. There are some really nice ideas there. Fortunately, it looks like jupytext has built-in support for applying black to code cells. In our case (with .md-formatted notebooks) this would look like:

jupytext notebook.md --pipe black

I've tried this out a bit and it seems pretty robust - I think we could incorporate this into the workflow (maybe via pre-commit + a "style" CI job, similar to how NetworkX does it) relatively easily.

Formatting the non-code-cell markdown is a bit trickier. There is a suggestion from that same jupytext issue (mwouts/jupytext#432 (comment)) to try pandoc which has support for softbreaks (i.e. line breaks in source files) and automatically applies this formatting:

jupytext notebook.md --pipe-fmt ipynb --pipe 'pandoc --from ipynb --to ipynb --atx-headers'

This kind of works, but doesn't inherently support/recognize all the features of MyST markdown. For example, when I tried this on one of our notebooks, it dropped the footnotes. This seems like a reasonable approach, but it's not suitable (in the current state) for incorporation into an automated workflow.

@rossbar
Copy link
Contributor Author

rossbar commented Jul 28, 2021

Some followup on how best to incorporate this into the workflow...

As mentioned above, I think the code formatting (i.e. black) step is robust enough add, so I'd focus only on that to start. My initial idea would be to add a "linting" check to the CI similar to what is done in NetworkX itself.

As a first step, a CI job that fails when things aren't formatted properly is fine, as maintainers can always tell/help contributors lint their code (run jupytext *.md --pipe black). It would also be nice to get things working via pre-commit since that will automate everything and eliminate the need to have maintainers explain/run the linting procedure every time a job fails. Getting the pre-commit hooks set up might be a little more involved, so starting with the linting CI job seems totally reasonable and probably what I would do.

@harshal-dupare who has expressed interest in the infrastructure 🎉 . Feel free to comment/ask questions in this thread or open new issues as you see fit!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant