Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NEW PROJECT]: Standardise docstrings to numpydoc(?) standard #29

Open
6 tasks
chrishalcrow opened this issue May 10, 2024 · 6 comments · May be fixed by #30
Open
6 tasks

[NEW PROJECT]: Standardise docstrings to numpydoc(?) standard #29

chrishalcrow opened this issue May 10, 2024 · 6 comments · May be fixed by #30

Comments

@chrishalcrow
Copy link

chrishalcrow commented May 10, 2024

Project title:

Key Investigators

  • Chris Halcrow

Project Description

Bring our docstrings up to numpydoc standard (for one module)

Background

Docstrings are important. If they are formatted properly, they can be used by other packages to autogenerate stuff e.g. Sphinx uses them to build our docs API. Future GUIs would depend on them, and having (typed?) good ones can make development much easier. Our are pretty good, but it would be nice if they were met a specific standard. I propose to bring the docstrings up to the numpydoc(?) standard: https://numpydoc.readthedocs.io/en/latest/format.html

To see the fun that awaits try running e.g. python -m numpydoc --validate spikeinterface.generation.generate_noise (sorry to pick on one function: this was the first one I saw!). Think it’d be good to try and do a module and see how long it takes, and how annoying it is. Ruff (https://docs.astral.sh/ruff/) can do linting based on this standard, which makes it easier to develop.

Objectives

Standardise one module, find out how long this takes and what are the time-heavy steps (if they exist).

Approach and Plan

  • Make a simple workflow which makes standardising docstrings easy (probably installing the vscode ruff extension is enough)
  • Standarise the docstrings for one module

Progress

  • Decide on which standard to use
  • Standardise one docstring
  • Figure out how long it takes to standardise one docstring => estimate size of task
  • Standardise all docstrings in one module
@JoeZiminski
Copy link

BTW ruff has a docstring linter, a colleague is using it to great effect

@zm711
Copy link

zm711 commented May 14, 2024

@JoeZiminski you're going to get us to ruff eventually :)

@zm711
Copy link

zm711 commented May 15, 2024

Just one small point @chrishalcrow and maybe @JoeZiminski can comment on if ruff would be able to handle this tunability.

We had discussed a while back trying to move everything to a modified numpy docstring. So we use the numpy docstring sections (Parameters, Returns, Notes, etc), but for arguments we don't use the optional tag when a default is supplied (that never made sense to me because it is not optional to the function, just optional to the user) and we do arg: type, default: value.

If you think that is a bad style we can discuss as part of this project to see if fully switching over to pure numpy is better. For further example see the style guide in the docstring conventions section.

@h-mayorquin
Copy link

I think that optional should be used when the value is None (e.g. value=None) That is, the parameter does not have to be specified. Default is used when there is default value that is not None.

This policy seems to be the most consistent with both python.typing Optional and with numpy docstring specification:
https://docs.python.org/3/library/typing.html#typing.Optional
https://numpydoc.readthedocs.io/en/latest/format.html

@chrishalcrow
Copy link
Author

chrishalcrow commented May 22, 2024

Hello,

Just been preparing some stuff for this. A couple of things:

Ruff only applies a subset of numpydoc standards. In particular it doesn't implement ones about parameters, which I'm keen to implement. But it's so good for lots of other things, thanks for mentioning it @JoeZiminski !

Both of your suggestions are valid with numpydoc standards (see Section 4 of https://numpydoc.readthedocs.io/en/latest/format.html#parameters, you can either do the optional thing or supply a default). Here's a real world example:

def generate_noise(noise_level=15.0, spatial_decay=None):
    """
    Return a recording which is just noise.

    Parameters
    ==========
    noise_level : float | np.array | tuple, default: 15.0
        a description.
    spatial_decay : float, optional
        a description.
    """

Opinions? @zm711 do you agree with the optional? @h-mayorquin does it look ok? I'll ask a few more people before the hackathon!

EDIT: also numpydoc prefers "or" over "|". Thoughts?

@h-mayorquin
Copy link

I think both or and | are fine. No opinion on that one. I usually use auto generation so I hope I can set it up to that. I don't think that it matters enough to grant extra work on our side.

As for the real example, yes, that's exactly how I read the documentation that it should be. This is consistent with typing.Optional semantics. Having optional for cases when the default is None (the most common case of typing.Optional) and having default and making the value explicit when the you are choosing.

There is the degenerate case when you are choosing, None is an option but the default is something else that I think we should try to avoid anyway. But we should discuss that case separateley and don't let an edge case dictate common policy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants