New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Make attaching docstrings to __array_function__ funcs more robust #16209
Conversation
add_docstring( | ||
implement_array_function, | ||
# We can't use add_newdoc (which is more robust than add_docstring) here to | ||
# attach these docstrings (cyclic imports), and add_docstring() gives a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we resolve these cyclic imports instead?
Or alternatively, just move these docstrings to add_newdocs with all the others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried for 15 minutes, it's nontrivial. The other option is to move the docstring into the C code. I don't think add_docstring
should be used ever.
I'm going for a minimal fix for 1.19.x though, this is a painful issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or alternatively, just move these docstrings to add_newdocs with all the others?
Tried that too, doesn't work - add_newdoc
cannot attached things to _multiarray_umath
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried that too, doesn't work -
add_newdoc
cannot attached things to_multiarray_umath
.
I don't know what you mean by that - isn't that exactly what https://github.com/numpy/numpy/blob/master/numpy/core/_add_newdocs.py does already?
I think you'd probably use 'numpy.core.overrides'
in that file, but it still should work just fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: This is about reloading and rerunning the same code twice. How about fixing this in add_docstring
itself...
doc_attr = PyObject_GetAttrString(obj, "__doc__");
if (doc_attr != NULL && doc_attr != Py_None) {
PyErr_Format(PyExc_RuntimeError, "object %s", msg);
return NULL;
}
replace that with a check that the old and new docstring are identical. And if they are simply do nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't notice if we set it twice by accident, but since it will still error if there is a disagreement, that seems not a huge issue and feels more reasonable then a try/except here that may hide errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't really follow why this file does its own docstring attachment - can't we do it just like we do here:
numpy/numpy/core/_add_newdocs.py
Line 5846 in 84a4c4b
add_newdoc('numpy.core.multiarray', 'datetime_data', |
In both cases, it's attaching a docstring to an extension method from multiarray_umath.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: This is about reloading and rerunning the same code twice. How about fixing this in
add_docstring
itself...doc_attr = PyObject_GetAttrString(obj, "__doc__"); if (doc_attr != NULL && doc_attr != Py_None) { PyErr_Format(PyExc_RuntimeError, "object %s", msg); return NULL; }
replace that with a check that the old and new docstring are identical. And if they are simply do nothing.
yes this would be really nice. another quick and hacky option maybe just filter "already has a docstring" exceptions and raise everything else ? this will avoid hiding other exceptions, but still mask errors with same msg caused by something else.
The one CI failure is a Mingw32 install problem. |
@rgommers I assume there is an easy way to try this out locally inside spyder or IPython? |
OK, this works:
but as I suspected, all this does is move the error to the next function which is |
Not super easy, something like this isn't enough to trigger it:
I'll ask someone over in the other issue to test. |
Ah thanks for the reproducer - yes need to delete recursively in |
That makes for a good regression test. |
Well, I don't know... Maybe my snippet is way too aggressive compared to what people actually do... |
Spyder does basically that: https://github.com/spyder-ide/spyder-kernels/blob/master/spyder_kernels/customize/umr.py#L121 |
@rgommers but the fundamental problem remains... For example:
Thus, all C-functions which use |
But I will agree that this, unlike sub-interpreters, is very much fixable and probably only reasonably difficult (although easy to miss a case here and there). |
That makes sense. How/where to raise it though? |
Probably make it as specific as possible, in
Just add somethign along the lines of:
EDIT: This may not work because spyder/pycharm may be smart enough to not reload C-extension modules, but still agressive enough to reload the dependencies of those C-modules, leaving us in the same inconsistent state. One option could be to make a C-function which returns and caches |
close/reopen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I might prefer the C-level or even just checking if implementation.__doc__ == dispatcher.__doc__
here (moving it to C "solves" the issue of dealing with the duplication here – at least leaves this code untouched).
As is now, if there is some other error, for any weird reason we won't notice. And if we do actually add it twice, we could be using a second incorrect doc version and also not notice.
@@ -29,3 +32,13 @@ def test_novalue(): | |||
assert_equal(repr(np._NoValue), '<no value>') | |||
assert_(pickle.loads(pickle.dumps(np._NoValue, | |||
protocol=proto)) is np._NoValue) | |||
|
|||
def test_reimport(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I am a bit worried about this test. Because it definitely messes up internal/cached. So if for example we have one place before and after this relying on that state (testing certain exceptions, using np._NoValue
, or checking np.matrix
on the C-side), this will lead to cryptic test failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about running it in a subprocess similar to some of the tests in test_io.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Somehow I completely forgot about that possibility!
I'll close this, continued in gh-16239 |
Closes gh-14384