Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sphinxcontrib-apidoc for automatic API docs #116

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

djhoese
Copy link

@djhoese djhoese commented Oct 23, 2022

Description of proposed changes

This is an initial attempt at adding automated generation of API docs using the sphinxcontrib-apidoc package. This is something I've used in other packages and really takes away the maintenance burden of having to re-do the api.rst autodoc module every time something changes in the code. Every time sphinx docs are generated it runs sphinx-apidoc for you, puts files in an api/ directory, then continues on generating the docs.

Fixes #53

Concerns I've noticed:

  1. The api landing page isn't as "pretty" or simple as it is in main currently:

image

But it also includes everything in the package so in my opinion that is worth it. It is reference documentation, not a how-to.

  1. There seem to be other parts of the source code that are trying to re-define the API sections for some classes:
/home/davidh/repos/git/xbatcher/xbatcher/loaders/torch.py:docstring of xbatcher.loaders.torch.MapDataset:1: WARNING: duplicate object description of xbatcher.loaders.torch.MapDataset, other instance in api/xbatcher.loaders, use :noindex: for one of them
/home/davidh/repos/git/xbatcher/xbatcher/loaders/torch.py:docstring of xbatcher.loaders.torch.MapDataset.__init__:1: WARNING: duplicate object description of xbatcher.loaders.torch.MapDataset.__init__, other instance in api/xbatcher.loaders, use :noindex: for one of them
/home/davidh/repos/git/xbatcher/xbatcher/loaders/torch.py:docstring of xbatcher.loaders.torch.IterableDataset:1: WARNING: duplicate object description of xbatcher.loaders.torch.IterableDataset, other instance in api/xbatcher.loaders, use :noindex: for one of them
/home/davidh/repos/git/xbatcher/xbatcher/loaders/torch.py:docstring of xbatcher.loaders.torch.IterableDataset.__init__:1: WARNING: duplicate object description of xbatcher.loaders.torch.IterableDataset.__init__, other instance in api/xbatcher.loaders, use :noindex: for one of them
/home/davidh/repos/git/xbatcher/xbatcher/loaders/keras.py:docstring of xbatcher.loaders.keras.CustomTFDataset:1: WARNING: duplicate object description of xbatcher.loaders.keras.CustomTFDataset, other instance in api/xbatcher.loaders, use :noindex: for one of them
/home/davidh/repos/git/xbatcher/xbatcher/loaders/keras.py:docstring of xbatcher.loaders.keras.CustomTFDataset.__init__:1: WARNING: duplicate object description of xbatcher.loaders.keras.CustomTFDataset.__init__, other instance in api/xbatcher.loaders, use :noindex: for one of them

@djhoese
Copy link
Author

djhoese commented Oct 24, 2022

Ok the warnings in the original comment were a false alarm. It was complaining because I had the old api.rst file lying around and it was loading that and the new API docs. Let's see how the docs build go.

@maxrjones this is ready for review.

Copy link
Member

@maxrjones maxrjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this @djhoese!

I agree that completeness in the API reference documentation is the most important factor.

A couple points that seem essential to address before merging:

  • Is it possible to configure apidoc to recognize top-level imports rather than sub-module imports? For example, https://xbatcher--116.org.readthedocs.build/en/116/api/xbatcher.generators.html implies that it is necessary to use from xbatcher.generators import BatchGenerator while the recommended import style is from xbatcher import BatchGenerator.
  • How can we properly document the xarray accessors using this extension, such that readers understand to use the accessors through DataArray.batch.generator rather than using the xbatcher.accessors.BatchAccessor class directly?

@djhoese
Copy link
Author

djhoese commented Oct 25, 2022

Wow, what good questions. And ones I've never asked for my other projects. Let's see what we can come up with:

Top-level imports

This is a good point and I've thought about it a lot since you commented. I think my overall feeling (which doesn't matter as I'm not the maintainer of this library) is that the API docs should reflect what the actual layout of the code. So put another way, the API docs should show readers where the code lives and what it has to offer (the hard facts of arguments, types, names, etc). That said, I'm always annoyed when I find out years later that I could have been importing things in a much simpler way...but maybe that isn't the job of the API docs to tell me that?

Here's what I've found so far: if we want to get really custom with stuff we can create templates:

https://www.sphinx-doc.org/en/master/man/sphinx-apidoc.html#cmdoption-sphinx-apidoc-0

and control how each type of object is created and documented. Hopefully we don't need that though.

The other thing to play with is imported-members described here:

https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#module-sphinx.ext.autodoc

where it says:

In an [automodule](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html#directive-automodule) directive with the members option set, only module members whose __module__ attribute is equal to the module name as given to automodule will be documented. This is to prevent documentation of imported classes or functions. Set the imported-members option if you want to prevent this behavior and document all available members. Note that attributes from imported modules will not be documented, because attribute documentation is discovered by parsing the source file of the current module.

This might let these be documented further up, but I haven't tried it yet. But I'm a little worried that doing this for all modules will likely cause more problems than it helps.

If you or anyone knows of a project that has the API documented the way you want let me know and I'll see if I can replicate it.

Accessors

I think the best idea would be to emphasize the usage of the class in the docstring for the class. Like an "Examples" section and a note that this class does not need to be used directly. It doesn't look like xarray's .plot accessor does anything special regarding documentation:

https://docs.xarray.dev/en/stable/generated/xarray.DataArray.plot.imshow.html

but instead sticks with the other non-reference documentation to spell out the usage:

https://docs.xarray.dev/en/stable/user-guide/plotting.html#simple-example

I guess if you mention in the docstrings that they don't need to be used directly and should be used as properties on xarray objects and they are never shown with direct usage then maybe it is OK?

https://docs.xarray.dev/en/stable/user-guide/plotting.html#simple-example

Edit: The other option could be to define the classes with a _ prefix in which case they would not be included in the API docs by default and also gives the reader the understanding that these aren't meant for direct usage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use sphinxcontrib apidoc for autogenerated API docs
2 participants