Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG,DOC: Allow attach docs twice but error if wrong #16239

Merged
merged 4 commits into from May 20, 2020

Conversation

seberg
Copy link
Member

@seberg seberg commented May 14, 2020

This stops general try/except, and instead skips setting of
docstrings if the docstring is identical. The latter part allows
for the import to be robust when e.g. reloading numpy which may
cause add_docstring to run twice.

This commit also changes the documentation stubs for scalar
attributes and errors to instead refer to the array attribute.
These are typically inherited, and thus it is not useful if
they refer to some "virtual attribute"

Closes gh-16209 and gh-14384


@rgommers to get this fixed for 1.19 quickly, had a look at the C-side fix, which unfportunately flushed out the exact thing I thought we might miss...

refer_to_array_attribute('T', method=False))

add_newdoc('numpy.core.numerictypes', 'generic',
refer_to_array_attribute('base', method=False))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reviewers: If I had a typo in the attribute, this would crash. So the only thing is to not forget/double an attribute by accident.

{"flat",
(getter)gentype_flat_get,
(setter)0,
"a 1-d view of scalar",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these may actually have been better docstrings in a sense. OTOH, refering the ndarray attribute is nice.

Copy link
Member Author

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p = Process(target=try_full_reload)
p.start()
p.join()
assert p.exitcode == 0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this fails on azure/windows with FileNotFoundError: [Errno 2] No such file or directory: 'D:\\a\\1\\s\\build\\test\\runtests.py' Just skip on windows:

https://dev.azure.com/numpy/numpy/_build/results?buildId=9919&view=logs&j=25bb66cf-4a16-533e-490c-8da4b5a3ec04&t=97446ee6-ebce-5662-24e5-f82eed312005&l=217

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other option is maybe try https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods muliprocessing.set_start_method('spawn')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this is already the default on windows for the version we are using

Copy link
Member

@mattip mattip May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

txt = r"""
    import sys
    import numpy as np

    for k in list(sys.modules.keys()):
        if "numpy" in k:
            del sys.modules[k]

    import numpy as np
"""
p = subprocess.run([sys.executable, '-c', textwrap.dedent(txt)])
assert p.returncode == 0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks matti, will just use that, seems simplest.

I changed the docs to link ndarray.attribute, did not realize the ~ would remove the ndarray as well.

@rgommers
Copy link
Member

@rgommers to get this fixed for 1.19 quickly, had a look at the C-side fix, which unfportunately flushed out the exact thing I thought we might miss...

Thanks @seberg, happy for you to take this over - I'm pretty short on time on weekdays.

Copy link
Member

@anirudh2290 anirudh2290 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM ! 👍

@@ -1482,7 +1482,7 @@ arr_add_docstring(PyObject *NPY_UNUSED(dummy), PyObject *args)
if (!(doc)) { \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a nit and unrelated to this change, but the do {...} while(0) can probably be removed.
another nit and again unrelated to this change, but when i read this, i was searching for new in the code, but didnt realize that new was defined inside the macro, which made this a bit confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, lets move the new in front of the macro, also removes the need for the macro cast...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to clean it up, which deletes a bunch of other code since Python seems to define those types in the header now (if not, I assume CI will fail).

@@ -1507,7 +1507,8 @@ arr_add_docstring(PyObject *NPY_UNUSED(dummy), PyObject *args)
PyObject *doc_attr;

doc_attr = PyObject_GetAttrString(obj, "__doc__");
if (doc_attr != NULL && doc_attr != Py_None) {
if (doc_attr != NULL && doc_attr != Py_None &&
(PyUnicode_Compare(doc_attr, obj) != 0)) {
PyErr_Format(PyExc_RuntimeError, "object %s", msg);
Copy link
Member

@eric-wieser eric-wieser May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PyErr_Format(PyExc_RuntimeError, "object %s", msg);
if (PyErr_Occurred()) {
return NULL;
}
PyErr_Format(PyExc_RuntimeError, "object %s", msg);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, seems there is also a %R or so missing to tell you which object we are talking about...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed, the check was actually incorrect, it should have been str not obj, its fixed now (tested manually with np.add_newdoc(np.take_along_axis, "wrong docstring"))

BUT: We currently expose even np.add_newdoc as top-level... So I reverted the try/except change there, since it could be very disruptive in theory (i.e. something not importing). My tendency is to give a warning instead as soon as 1.20 is out.

@mattip
Copy link
Member

mattip commented May 17, 2020

You should skip the failing test on PyPy, you are running into gh-10167 where the call to add_newdocs should be replaced by generated C code that is compiled in and loaded before PyType_Ready

@seberg
Copy link
Member Author

seberg commented May 17, 2020

Thanks, hadn't noticed the tests were failing...

@mattip
Copy link
Member

mattip commented May 18, 2020

Now the PYOPTIMIZE=2 CI run is failing the TestAddDocstring.test_different_docstring_fail tests.

@seberg
Copy link
Member Author

seberg commented May 18, 2020

Dang, should have realized, passing now...

@mattip
Copy link
Member

mattip commented May 18, 2020

The new tp_doc assigments seem to be working. numpy.generice.base now looks something like this:

numpy.generic.base

attribute

generic.base:

Scalar attribute identical to the corresponding array attribute.

Please see ndarray.base.

@mattip
Copy link
Member

mattip commented May 18, 2020

LGTM

@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label May 18, 2020
@seberg
Copy link
Member Author

seberg commented May 19, 2020

Would be nice to finish this. @charris would you prefer if I split this up to make the backport minimal?

@charris
Copy link
Member

charris commented May 19, 2020

Minimal backports are always nice :) I will probably do an rc2 next weekend as this appears a bit risky to me.

@seberg seberg force-pushed the reimport-do-not-set-docs branch 2 times, most recently from 7ff0586 to fc03476 Compare May 19, 2020 16:29
@seberg
Copy link
Member Author

seberg commented May 19, 2020

@charris I think it looks larger then it is and is only doc attaching, but still. The first commit is now the minimal fix. The TST commit is probably just as fine.

@mattip
Copy link
Member

mattip commented May 19, 2020

Windows is complaining "ValueError: path is on mount 'D:', start on mount 'C:'", which was supposed to be avoided. Is this rebased off master?

@seberg
Copy link
Member Author

seberg commented May 19, 2020

I did not rebase again, thought the commits apply easier then maybe, let me rebase...

This is technically not a bug, but some IDEs and IPython have
autoreload magic which can mean that NumPy gets reloaded a second
time. This is not safe, but when it happens ignoring that an
identical docstring is already attached fixes the issue.
Previously some of these were not set, because they were already
set on the C-level. Remove the duplicate message and replace it
instead with a forward to the ``ndarray`` attribute/method.

These docstrings are inherited by the actual scalars and thus the
"virtual class" note was misleading.
@seberg seberg closed this May 19, 2020
@seberg seberg reopened this May 19, 2020
@seberg
Copy link
Member Author

seberg commented May 19, 2020

The windows builds seem super flaky today... but maybe time will make that go away...

@mattip
Copy link
Member

mattip commented May 20, 2020

Close/Reopen to trigger CI merge with master

@mattip mattip closed this May 20, 2020
@mattip mattip reopened this May 20, 2020
@mattip
Copy link
Member

mattip commented May 20, 2020

LGTM. Can merge if CI passes

# Test should possibly be moved, but it also fits to be close to
# the newdoc tests...
@pytest.mark.skipif(sys.flags.optimize == 2, reason="Python running -OO")
def test_add_same_docstring(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails on PyPy since np.ndarray.flat.__doc__ is None. Skipping would be the easiest fix I guess

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks... dang, I had lost the fixups for this file in the rebase and forgot about this one...

Its not quite the right file, but close to newdoc seemed sensible
and we do not have a "right" file right now...
The old pointers are now provided by the C-API and the macros
seemed a bit confusing since ``new`` appears from nowhere being
defined in the macro.
@mattip mattip merged commit 78d7ab3 into numpy:master May 20, 2020
@mattip
Copy link
Member

mattip commented May 20, 2020

Thanks @seberg. ba14823 is the minimal change + test to be backported to fix this in 1.19, correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants