Issue 7 add sphinx doc #23

Merged
merged 5 commits into main on Mar 11, 2024

Conversation

@kan-fu (Collaborator) commented Jan 26, 2024

Fix #7, Fix #8. I will leave #9 open because I haven't finished all the docstring.

Angela has kindly agreed to review the second commit, which contains a lot of documentation. I have added an additional commit message to each commit to explain what I have done. https://oceannetworkscanada.github.io/api-python-client/

I added some line breaks in the markdown files to get rid of long sentences. The rule is not based on a fixed width, but rather on semantic meaning.

CI (doc.yml, which builds and deploys the docs; there is no test step) is only triggered when pushing to main. I added a manual trigger in case a run is desired before the merge. Currently, the gh-pages deployment workflow run has to be approved by me first, as a protection rule.

For Eli: I have noticed some outdated info in the schema of the response JSON (e.g., some values can be null; allowableRange and suboptions are missing from the discover data products, ...). I will compile a list after I finish all the docstrings.

@Jacob-Stevens-Haas (Collaborator) left a comment

Congrats on migrating the documentation! I'm not entirely sure what's new vs what's just migrated, so let me keep my thoughts general to some of the technology in the PR, based on my experience with pysindy.

  • Have you tried adding the -W flag to the CI's sphinx invocation? sphinx is pretty good at building the docs regardless of RST mistakes, but it's useful to treat those mistakes as errors in CI, rather than just a warning in the GH action log.
  • The doc CI calls Makefile/Make.bat instead of directly calling sphinx .... For instance, I'm not sure how to pass sphinx arguments to make html. RTD itself doesn't call make html, we've removed it, and looking at pandas, apparently they have too. Sphinx itself is pretty cross-platform, so I'm not sure why sphinx-quickstart still autogenerates these files.
  • RTD recommends pinning doc dependency versions. I'm a bit torn on the issue myself, but their rationale is that without pinned versions, build errors that appear in CI are difficult to troubleshoot locally without scanning through the CI logs to find the exact versions of everything that might have caused the error.
  • I have struggled in the past to manage jupyter notebooks in a package repository. None of these seem problematically large or long-running, but for the sake of reading diffs it might be worth saving them as python files and only creating the notebooks using jupytext on doc builds.
  • Regardless, there's the question of how to test the jupyter notebooks. E.g. if you change the call signature of a function, how does CI catch the notebook that demonstrates that function? Here's our infrastructure for doing so. We actually test the .py files that shadow notebooks, because we have to substitute certain globals in order to run the notebooks extra fast, but you can do the same with actual ipynb files using nbconvert and nbformat if you're not concerned with test duration.
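For illustration, a minimal pytest sketch of the nbformat/nbconvert approach mentioned in the last bullet might look like the following (the tests/test_notebooks.py file name and the docs/source path are placeholder assumptions, not files in this PR):

```python
# Hypothetical tests/test_notebooks.py: execute each documentation notebook
# and fail the test if any cell raises. Paths are placeholder assumptions.
from pathlib import Path

import nbformat
import pytest
from nbconvert.preprocessors import ExecutePreprocessor

NOTEBOOK_DIR = Path("docs/source")  # assumed location of the .ipynb files
NOTEBOOKS = sorted(NOTEBOOK_DIR.rglob("*.ipynb"))


@pytest.mark.parametrize("path", NOTEBOOKS, ids=lambda p: p.name)
def test_notebook_executes(path):
    nb = nbformat.read(str(path), as_version=4)
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    # Run the notebook in its own directory so relative paths resolve;
    # a CellExecutionError from any cell fails the test.
    ep.preprocess(nb, {"metadata": {"path": str(path.parent)}})
```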

@kan-fu (Collaborator, Author) commented Jan 27, 2024

Thanks for your suggestions!

  1. Getting rid of the makefile makes perfect sense to me. I actually saw a conversation in the Sphinx issue tracker about moving away from the makefile. It was in 2019; I guess back then the makefile was included by default, and now they provide sphinx-quickstart for generating the makefile if anyone still likes the makefile idea. It is still in their official tutorial, so I am not sure what the Sphinx developers think about it. -W is also a good idea. Other flags like -T, -E, and -d are optional, so I am not using them, for simplicity.

  2. About pinning the versions, this is the same question as for the dependencies in pyproject.toml. I can understand their rationale, but I think the onc library is simple enough to leave all the versions unpinned. If CI errors happen in the future, we can pin the versions then.

  3. For the Jupyter notebooks, I admit the git diff is hard to read on GitHub, but I think there are two reasons why that is not a big issue:

    • Most of our internal users are data scientists. Jupyter notebooks are a better communication tool than Python scripts that need conversion back and forth. They use Jupyter notebooks for conference and workshop presentations, and it is very easy for developers to reuse them here without many modifications. Users can also download the Jupyter notebooks to play with locally (I just added the link in the new commit).
    • Once a Jupyter notebook is merged into main, I anticipate there won't be frequent changes to it. Even if a review is needed, VS Code has a better diff view than GitHub, or the PR author can highlight the diff in the message and reviewers can just review the deployed version (I am thinking of using my personal fork as a QA version of the GitHub Pages). I guess it is really a headache in pysindy because there are too many notebooks from users of different levels. For now, I will be the only one putting Jupyter notebooks in this repo, so things are simpler here.
  4. For testing, myst_nb has an execution mode, so each build (push to main or manual trigger, CI only) will run the notebooks, and the docs will show the most up-to-date output. I set nb_execution_raise_on_error to True in conf.py, so a failed notebook run raises an error instead of a warning (somewhat redundant after adding the -W flag). I guess I shouldn't have said "there is no test step" in CI; the running/testing happens in the build process.
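For context, the execution settings described in point 4 correspond roughly to the following conf.py excerpt (a sketch rather than a verbatim copy of the file in this PR; the extension list is illustrative):

```python
# conf.py excerpt (illustrative): myst-nb notebook execution settings
extensions = [
    "myst_nb",               # render and execute the Jupyter notebooks
    "sphinx.ext.autodoc",    # pull the API reference from docstrings
    "sphinx.ext.napoleon",   # parse numpy-style docstrings (assumed here)
]

# Re-execute the notebooks on every doc build so the rendered output stays current.
nb_execution_mode = "force"

# Turn a failed notebook run into a build error rather than a warning.
nb_execution_raise_on_error = True
```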

@Jacob-Stevens-Haas (Collaborator) commented Jan 27, 2024

Oh fully agree that notebooks are a great communication tool! I was suggesting having both the notebook and the script:

  • notebooks go into the documentation build (and users who want them can download them directly)
  • changes to the notebook's script are easy to see in git diff and to run in pytest locally
  • jupytext in CI keeps the notebook and its script in sync

Not for now though! Just letting you know that if you run into the same issues we've run into at pysindy (notebooks that take too long to run even in CI, but which still need to keep up with breaking API changes, and whose edits to do so run large in git diff), there's a solution out there! And for a lighter version that doesn't use jupytext/scripts, but still has an option to do pytest notebooks so that it doesn't take a full doc build, that's also in the pysindy link. Just options for the future if you want!
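For reference, a minimal sketch of that jupytext pairing via its Python API could look like this (file paths are hypothetical and not part of this PR; the jupytext CLI also offers a --sync mode for paired files):

```python
# Illustrative jupytext round trip (hypothetical paths, not part of this PR).
import jupytext

# Read the script form (py:percent format), which diffs cleanly in git...
nb = jupytext.read("docs/source/onc_library_tutorial.py", fmt="py:percent")

# ...and materialise the notebook form for the documentation build.
jupytext.write(nb, "docs/source/onc_library_tutorial.ipynb")
```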

@kan-fu (Collaborator, Author) commented Jan 29, 2024

Sure, thanks for your suggestions! I will definitely keep that in mind. We have planned a new internal scientific project that will involve a lot of scripts and Jupyter notebooks, so this might be useful when things become complicated.

@kan-fu (Collaborator, Author) commented Feb 1, 2024

I will keep this PR open for a while because it will take some time for Angela to go over all the documentation.

sphinx-quickstart was used to generate the base config (conf.py and Makefile). After that, the extension configs were added to conf.py.
Here are the sources:
1. Tutorial: https://github.com/OceanNetworksCanada/api-tutorial/blob/master/Tutorial_Notebook.ipynb. For the onc library tutorial, I added a vanilla requests example as a comparison.
2. Code Examples: https://wiki.oceannetworks.ca/display/O2A/Sample+Code
3. Contributing: None
4. API Guide: https://wiki.oceannetworks.ca/display/O2A/Discovery+methods
5. API Reference: auto-generated from the docstrings
6. README: https://wiki.oceannetworks.ca/display/O2A/Getting+Started
7. License: I added more boilerplate declarations so that GitHub can detect it as the Apache license.
The information is mainly copied from the OpenAPI page, with the numpy docstring style applied.
All the doctests are disabled. I will add a separate file for them so that they do not become outdated.
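As a generic illustration of the numpy docstring style mentioned above (a hypothetical function, not one taken from the onc library):

```python
def discover_locations(filters=None):
    """Return locations matching the given filters.

    A hypothetical discovery helper, shown only to illustrate the
    numpy docstring sections used throughout the API Reference.

    Parameters
    ----------
    filters : dict, optional
        Query parameters accepted by the locations discovery service.

    Returns
    -------
    list of dict
        One dictionary per matching location, as returned by the API.
    """
    ...
```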
@kan-fu kan-fu merged commit 469a5d7 into main Mar 11, 2024
12 checks passed
@kan-fu kan-fu deleted the issue-7-add-sphinx-doc branch March 11, 2024 23:03
Successfully merging this pull request may close these issues:

  • Update README.md to be more user friendly
  • Improve documentation