Add codespell support (config, workflow to detect/not fix) and make it fix few typos #26315

Open · wants to merge 26 commits into base: main

Commits (26)
78687a1
Add github action to codespell main on push and PRs
yarikoptic Apr 19, 2024
3abfb7c
Add rudimentary codespell config
yarikoptic Apr 19, 2024
8ea3f01
For now exclude lapack_lit from spelling
yarikoptic Apr 19, 2024
05d48a3
Add some common hits to be ignored by codespell
yarikoptic Apr 19, 2024
9c22a5c
Ignore names and acronyms
yarikoptic Apr 19, 2024
64ddc31
Rename "datas" to "data_samples"
yarikoptic Apr 19, 2024
0e62840
more skips
yarikoptic Apr 19, 2024
8afd1bf
tune some variables or annotate to be skipped to avoid triggering cod…
yarikoptic Apr 19, 2024
21d95b2
Adjust release note to become more correct and not flip codespell
yarikoptic Apr 19, 2024
efc775a
Annotate line with secnd - I think on purpose
yarikoptic Apr 19, 2024
9ba7e4d
Replace typo-close doub with 'd' in an example
yarikoptic Apr 19, 2024
639693e
DOC: adjust variables in example to be fp_{kind} instead of {k}fp and…
yarikoptic Apr 19, 2024
67d0055
Unclear typo fix guess
yarikoptic Apr 19, 2024
7c56057
Replace typo-close padd with arr_ (name padd seems to not carry seman…
yarikoptic Apr 19, 2024
8620d3f
Mark intentional typo to be skipped
yarikoptic Apr 19, 2024
4e92235
[DATALAD RUNCMD] Do interactive fixing of leftover ambigous typos
yarikoptic Apr 19, 2024
5788d04
Re-enable codespelling lapack_lite and skip few common short variable…
yarikoptic Apr 19, 2024
7763efe
[DATALAD RUNCMD] run codespell throughout for lapack_lite fixing typo…
yarikoptic Apr 19, 2024
864f866
[DATALAD RUNCMD] Do interactive fixing of lapack_lite leftover ambigo…
yarikoptic Apr 19, 2024
e12441c
ignore changelogs and "recuse"
yarikoptic Apr 19, 2024
ccfba7a
replace typo-close "notin" with "not_in" variable
yarikoptic Apr 19, 2024
6a1a95b
fix ambiguous "fals -> false" typo
yarikoptic Apr 19, 2024
c0db1a5
Whitelist implementor for codespell
yarikoptic Apr 19, 2024
3ba6b83
[DATALAD RUNCMD] run codespell throughout fixing typos automagically
yarikoptic Apr 19, 2024
3c63706
Fixing linter's gotcha on spaces before operator
yarikoptic Apr 19, 2024
bb7d7f3
Use _ for unused return value (makes line shorter to pass lint)
yarikoptic Apr 19, 2024
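
Several of the commits above were produced by running codespell itself; the "[DATALAD RUNCMD]" entries record such runs (their full commit messages, truncated here, contain the exact commands). As a rough sketch of the kind of invocations involved, assuming codespell 2.x reading its configuration from pyproject.toml:

    codespell                                  # report typos only, as the CI workflow below does
    codespell --write-changes                  # apply unambiguous fixes in place ("automagically")
    codespell --write-changes --interactive 3  # step through ambiguous candidates one by one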
Files changed
23 changes: 23 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,23 @@
# Codespell configuration is within pyproject.toml
---
name: Codespell

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
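
The workflow's header comment points to pyproject.toml for the actual codespell configuration, which is not shown in this excerpt. Going only by the commit messages above (ignore changelogs, whitelist "implementor" and "recuse", skip names/acronyms and short variables), a hypothetical sketch of what such a [tool.codespell] section could look like; the real file may differ:

    [tool.codespell]
    # Paths codespell should not scan. "Ignore changelogs" per the commits;
    # the path pattern here is illustrative, not copied from the PR.
    skip = "doc/changelog"
    # Dictionary hits that are intentional in this codebase; "implementor"
    # and "recuse" are both named in the commit messages above.
    ignore-words-list = "implementor,recuse"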
4 changes: 2 additions & 2 deletions .spin/cmds.py
@@ -344,7 +344,7 @@ def lint(ctx, branch, uncommitted):
Examples:

\b
-For lint checks of your development brach with `main` or a custom branch:
+For lint checks of your development branch with `main` or a custom branch:

\b
$ spin lint # defaults to main
@@ -465,7 +465,7 @@ def bench(ctx, tests, compare, verbose, quick, commits):
] + bench_args
_run_asv(cmd)
else:
-# Ensure that we don't have uncommited changes
+# Ensure that we don't have uncommitted changes
commit_a, commit_b = [_commit_to_sha(c) for c in commits]

if commit_b == 'HEAD' and _dirty_git_working_dir():
2 changes: 1 addition & 1 deletion doc/Makefile
@@ -17,7 +17,7 @@ PAPER ?=
DOXYGEN ?= doxygen
# For merging a documentation archive into a git checkout of numpy/doc
# Turn a tag like v1.18.0 into 1.18
-# Use sed -n -e 's/patttern/match/p' to return a blank value if no match
+# Use sed -n -e 's/pattern/match/p' to return a blank value if no match
TAG ?= $(shell git describe --tag | sed -n -e's,v\([1-9]\.[0-9]*\)\.[0-9].*,\1,p')

FILES=
2 changes: 1 addition & 1 deletion doc/neps/nep-0043-extensible-ufuncs.rst
@@ -241,7 +241,7 @@ to define string equality, will be added to a ufunc.
nin = 1
nout = 1
# DTypes are stored on the BoundArrayMethod and not on the internal
-# ArrayMethod, to reference cyles.
+# ArrayMethod, to reference cycles.
DTypes = (String, String, Bool)

def resolve_descriptors(self: ArrayMethod, DTypes, given_descrs):
8 changes: 4 additions & 4 deletions doc/neps/nep-0055-string_dtype.rst
@@ -386,7 +386,7 @@ DTypes, and we think it would harm adoption to require users to refactor their
DType-handling code if they want to use ``StringDType``.

We also hope that in the future we might be able to add a new fixed-width text
-version of ``StringDType`` that can re-use the ``"T"`` character code with
+version of ``StringDType`` that can reuse the ``"T"`` character code with
length or encoding modifiers. This will allow a migration to a more flexible
text dtype for use with structured arrays and other use-cases with a fixed-width
string is a better fit than a variable-width string.
@@ -990,11 +990,11 @@ in the array buffer as a short string.

No matter where it is stored, once a string is initialized it is marked with the
``NPY_STRING_INITIALIZED`` flag. This lets us clearly distinguish between an
-unitialized empty string and a string that has been mutated into the empty
+uninitialized empty string and a string that has been mutated into the empty
string.

The size of the allocation is stored in the arena to allow reuse of the arena
-allocation if a string is mutated. In principle we could disallow re-use of the
+allocation if a string is mutated. In principle we could disallow reuse of the
arena buffer and not store the sizes in the arena. This may or may not save
memory or be more performant depending on the exact usage pattern. For now we
are erring on the side of avoiding unnecessary heap allocations when a string is
@@ -1070,7 +1070,7 @@ by the ``StringDType`` instance attached to the array. In this example, the
string contents are "Numpy is a very cool library", stored at offset ``0x94C``
in the arena allocation. Note that the ``size`` is stored twice, once in the
``size_and_flags`` field, and once in the arena allocation. This facilitates
-re-use of the arena allocation if a string is mutated. Also note that because
+reuse of the arena allocation if a string is mutated. Also note that because
the length of the string is small enough to fit in an ``unsigned char``, this is
a "medium"-length string and the size requires only one byte in the arena
allocation. An arena string larger than 255 bytes would need 8 bytes in the
2 changes: 1 addition & 1 deletion doc/neps/nep-0056-array-api-main-namespace.rst
@@ -302,7 +302,7 @@ three types of behavior rather than two - ``copy=None`` means "copy if needed".
an exception because they use* ``copy=False`` *explicitly in their copy but a
copy was previously made anyway, they have to inspect their code and determine
whether the intent of the code was the old or the new semantics (both seem
-rougly equally likely), and adapt the code as appropriate. We expect most cases
+roughly equally likely), and adapt the code as appropriate. We expect most cases
to be* ``np.array(..., copy=False)``, *because until a few years ago that had
lower overhead than* ``np.asarray(...)``. *This was solved though, and*
``np.asarray(...)`` *is idiomatic NumPy usage.*
6 changes: 3 additions & 3 deletions doc/source/numpy_2_0_migration_guide.rst
@@ -119,7 +119,7 @@ itemsizes not limited by the size of ``int`` as well as allow improving
structured dtypes in the future and not burdon new dtypes with their fields.

Code which only uses the type number and other initial fields is unaffected.
-Most code will hopefull mainly access the ``->elsize`` field, when the
+Most code will hopefully mainly access the ``->elsize`` field, when the
dtype/descriptor itself is attached to an array (e.g. ``arr->descr->elsize``)
this is best replaced with ``PyArray_ITEMSIZE(arr)``.

@@ -146,7 +146,7 @@ or adding ``npy2_compat.h`` into your code base and explicitly include it
when compiling with NumPy 1.x (as they are new API).
Including the file has no effect on NumPy 2.

-Please do not hesitate to open a NumPy issue, if you require assistence or
+Please do not hesitate to open a NumPy issue, if you require assistance or
the provided functions are not sufficient.

**Custom User DTypes:**
@@ -214,7 +214,7 @@ added for setting the real or imaginary part.
The underlying type remains a struct under C++ (all of the above still remains
valid).

-This has implications for Cython. It is recommened to always use the native
+This has implications for Cython. It is recommended to always use the native
typedefs ``cfloat_t``, ``cdouble_t``, ``clongdouble_t`` rather than the NumPy
types ``npy_cfloat``, etc, unless you have to interface with C code written
using the NumPy types. You can still write cython code using the ``c.real`` and
2 changes: 1 addition & 1 deletion doc/source/reference/array_api.rst
@@ -18,7 +18,7 @@ upgraded to given NumPy's
For usage guidelines for downstream libraries and end users who want to write
code that will work with both NumPy and other array libraries, we refer to the
documentation of the array API standard itself and to code and
-developer-focused documention in SciPy and scikit-learn.
+developer-focused documentation in SciPy and scikit-learn.

Note that in order to use standard-complaint code with older NumPy versions
(< 2.0), the `array-api-compat
2 changes: 1 addition & 1 deletion doc/source/reference/c-api/array.rst
@@ -823,7 +823,7 @@ cannot not be accessed directly.

.. c:function:: PyArray_ArrayDescr *PyDataType_SUBARRAY(PyArray_Descr *descr)

-Information about a subarray dtype eqivalent to the Python `np.dtype.base`
+Information about a subarray dtype equivalent to the Python `np.dtype.base`
and `np.dtype.shape`.

If this is non- ``NULL``, then this data-type descriptor is a
2 changes: 1 addition & 1 deletion doc/source/release/1.14.0-notes.rst
@@ -334,7 +334,7 @@ eliminating their use internally and two new C-API functions,

have been added together with a complementary flag,
``NPY_ARRAY_WRITEBACKIFCOPY``. Using the new functionality also requires that
-some flags be changed when new arrays are created, to wit:
+some flags be changed when new arrays are created:
``NPY_ARRAY_INOUT_ARRAY`` should be replaced by ``NPY_ARRAY_INOUT_ARRAY2`` and
``NPY_ARRAY_INOUT_FARRAY`` should be replaced by ``NPY_ARRAY_INOUT_FARRAY2``.
Arrays created with these new flags will then have the ``WRITEBACKIFCOPY``
4 changes: 2 additions & 2 deletions doc/source/release/2.0.0-notes.rst
@@ -88,7 +88,7 @@ Highlights of this release include:

- Documentation:

-- The reference guide navigation was signficantly improved, and there is now
+- The reference guide navigation was significantly improved, and there is now
documentation on NumPy's :ref:`module structure <module-structure>`,
- The :ref:`building from source <building-from-source>` documentation was
completely rewritten,
@@ -646,7 +646,7 @@ The ``metadata`` field is kept, but the macro version should also be preferred.

Descriptor ``elsize`` and ``alignment`` access
----------------------------------------------
-Unless compiling only with NumPy 2 support, the ``elsize`` and ``aligment``
+Unless compiling only with NumPy 2 support, the ``elsize`` and ``alignment``
fields must now be accessed via ``PyDataType_ELSIZE``,
``PyDataType_SET_ELSIZE``, and ``PyDataType_ALIGNMENT``.
In cases where the descriptor is attached to an array, we advise
6 changes: 3 additions & 3 deletions doc/source/user/c-info.beyond-basics.rst
@@ -307,10 +307,10 @@ code:

.. code-block:: c

-doub = PyArray_DescrFromType(NPY_DOUBLE);
-PyArray_RegisterCastFunc(doub, NPY_FLOAT,
+d = PyArray_DescrFromType(NPY_DOUBLE);
+PyArray_RegisterCastFunc(d, NPY_FLOAT,
(PyArray_VectorUnaryFunc *)double_to_float);
-Py_DECREF(doub);
+Py_DECREF(d);


Registering coercion rules
6 changes: 3 additions & 3 deletions meson_cpu/meson.build
@@ -202,13 +202,13 @@ foreach opt_name, conf : parse_options
endif
endforeach
else
-filterd = []
+filtered = []
foreach fet : result
if fet not in accumulate
-filterd += fet
+filtered += fet
endif
endforeach
-result = filterd
+result = filtered
endif # append
endforeach # tok : tokens
if ignored.length() > 0
2 changes: 1 addition & 1 deletion numpy/_core/include/numpy/ndarraytypes.h
@@ -1302,7 +1302,7 @@ typedef struct {
PyArrayIterObject *iters[64];
#elif defined(__cplusplus)
/*
-* C++ doesn't stricly support flexible members and gives compilers
+* C++ doesn't strictly support flexible members and gives compilers
* warnings (pedantic only), so we lie. We can't make it 64 because
* then Cython is unhappy (larger struct at runtime is OK smaller not).
*/
2 changes: 1 addition & 1 deletion numpy/_core/include/numpy/npy_2_compat.h
@@ -53,7 +53,7 @@
#if NPY_ABI_VERSION < 0x02000000
/*
* Define 2.0 feature version as it is needed below to decide whether we
-* compile for both 1.x and 2.x (defining it gaurantees 1.x only).
+* compile for both 1.x and 2.x (defining it guarantees 1.x only).
*/
#define NPY_2_0_API_VERSION 0x00000012
/*
24 changes: 12 additions & 12 deletions numpy/_core/memmap.py
@@ -162,48 +162,48 @@ class memmap(ndarray):

Load the memmap and verify data was stored:

->>> newfp = np.memmap(filename, dtype='float32', mode='r', shape=(3,4))
->>> newfp
+>>> fp_new = np.memmap(filename, dtype='float32', mode='r', shape=(3,4))
+>>> fp_new
memmap([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]], dtype=float32)

Read-only memmap:

->>> fpr = np.memmap(filename, dtype='float32', mode='r', shape=(3,4))
->>> fpr.flags.writeable
+>>> fp_ro = np.memmap(filename, dtype='float32', mode='r', shape=(3,4))
+>>> fp_ro.flags.writeable
False

Copy-on-write memmap:

->>> fpc = np.memmap(filename, dtype='float32', mode='c', shape=(3,4))
->>> fpc.flags.writeable
+>>> fp_cow = np.memmap(filename, dtype='float32', mode='c', shape=(3,4))
+>>> fp_cow.flags.writeable
True

It's possible to assign to copy-on-write array, but values are only
written into the memory copy of the array, and not written to disk:

->>> fpc
+>>> fp_cow
memmap([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]], dtype=float32)
->>> fpc[0,:] = 0
->>> fpc
+>>> fp_cow[0,:] = 0
+>>> fp_cow
memmap([[ 0., 0., 0., 0.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]], dtype=float32)

File on disk is unchanged:

->>> fpr
+>>> fp_ro
memmap([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]], dtype=float32)

Offset into a memmap:

->>> fpo = np.memmap(filename, dtype='float32', mode='r', offset=16)
->>> fpo
+>>> fp_offset = np.memmap(filename, dtype='float32', mode='r', offset=16)
+>>> fp_offset
memmap([ 4., 5., 6., 7., 8., 9., 10., 11.], dtype=float32)

"""
16 changes: 8 additions & 8 deletions numpy/_core/numeric.py
@@ -1147,17 +1147,17 @@ def tensordot(a, b, axes=2):

# Move the axes to sum over to the end of "a"
# and to the front of "b"
-notin = [k for k in range(nda) if k not in axes_a]
-newaxes_a = notin + axes_a
+not_in = [k for k in range(nda) if k not in axes_a]
+newaxes_a = not_in + axes_a
N2 = math.prod(as_[axis] for axis in axes_a)
-newshape_a = (math.prod([as_[ax] for ax in notin]), N2)
-olda = [as_[axis] for axis in notin]
+newshape_a = (math.prod([as_[ax] for ax in not_in]), N2)
+olda = [as_[axis] for axis in not_in]

-notin = [k for k in range(ndb) if k not in axes_b]
-newaxes_b = axes_b + notin
+not_in = [k for k in range(ndb) if k not in axes_b]
+newaxes_b = axes_b + not_in
N2 = math.prod(bs[axis] for axis in axes_b)
-newshape_b = (N2, math.prod([bs[ax] for ax in notin]))
-oldb = [bs[axis] for axis in notin]
+newshape_b = (N2, math.prod([bs[ax] for ax in not_in]))
+oldb = [bs[axis] for axis in not_in]

at = a.transpose(newaxes_a).reshape(newshape_a)
bt = b.transpose(newaxes_b).reshape(newshape_b)
2 changes: 1 addition & 1 deletion numpy/_core/src/_simd/_simd_vector.inc
@@ -92,7 +92,7 @@ static PyTypeObject PySIMDVectorType = {
* miss-align load variable of 256/512-bit vector from non-aligned
* 256/512-bit stack pointer.
*
-* check the following links for more clearification:
+* check the following links for more clarification:
* https://github.com/numpy/numpy/pull/18330#issuecomment-821539919
* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001
*/
2 changes: 1 addition & 1 deletion numpy/_core/src/multiarray/array_coercion.c
@@ -1185,7 +1185,7 @@ PyArray_DiscoverDTypeAndShape_Recursive(
}

/*
-* For a sequence we need to make a copy of the final aggreate anyway.
+* For a sequence we need to make a copy of the final aggregate anyway.
* There's no need to pass explicit `copy=True`, so we switch
* to `copy=None` (copy if needed).
*/
2 changes: 1 addition & 1 deletion numpy/_core/src/multiarray/common_dtype.c
@@ -131,7 +131,7 @@ reduce_dtypes_to_most_knowledgeable(
}

if (res == (PyArray_DTypeMeta *)Py_NotImplemented) {
-/* guess at other being more "knowledgable" */
+/* guess at other being more "knowledgeable" */
PyArray_DTypeMeta *tmp = dtypes[low];
dtypes[low] = dtypes[high];
dtypes[high] = tmp;
2 changes: 1 addition & 1 deletion numpy/_core/src/multiarray/dtypemeta.c
@@ -1401,7 +1401,7 @@ PyArray_DTypeMeta *_Void_dtype = NULL;
* This function is exposed with an underscore "privately" because the
* public version is a static inline function which only calls the function
* on 2.x but directly accesses the `descr` struct on 1.x.
-* Once 1.x backwards compatibility is gone, it shoudl be exported without
+* Once 1.x backwards compatibility is gone, it should be exported without
* the underscore directly.
* Internally, we define a private inline function `PyDataType_GetArrFuncs`
* for convenience as we are allowed to access the `DType` slots directly.
2 changes: 1 addition & 1 deletion numpy/_core/src/umath/special_integer_comparisons.cpp
@@ -293,7 +293,7 @@ get_loop(PyArrayMethod_Context *context,


/*
-* Machinery to add the python integer to NumPy intger comparsisons as well
+* Machinery to add the python integer to NumPy integer comparsisons as well
* as a special promotion to special case Python int with Python int
* comparisons.
*/
2 changes: 1 addition & 1 deletion numpy/_core/strings.py
@@ -63,7 +63,7 @@
# _vec_string - Will probably not become ufuncs
"mod", "decode", "encode", "translate",

-# Removed from namespace until behavior has been crystalized
+# Removed from namespace until behavior has been crystallized
# "join", "split", "rsplit", "splitlines",
]

2 changes: 1 addition & 1 deletion numpy/_core/tests/test_einsum.py
@@ -816,7 +816,7 @@ def test_einsum_fixedstridebug(self):
tp = np.tensordot(A, B, axes=(0, 0))
assert_equal(es, tp)
# The following is the original test case from the bug report,
-# made repeatable by changing random arrays to aranges.
+# made repeatable by changing random arrays to arranges.
A = np.arange(3 * 3).reshape(3, 3).astype(np.float64)
B = np.arange(3 * 3 * 64 * 64).reshape(3, 3, 64, 64).astype(np.float32)
es = np.einsum('cl, cpxy->lpxy', A, B)
2 changes: 1 addition & 1 deletion numpy/_core/tests/test_function_base.py
@@ -223,7 +223,7 @@ def test_complex(self):
assert_allclose(y, [-5, 3j])

def test_complex_shortest_path(self):
-# test the shortest logorithmic spiral is used, see gh-25644
+# test the shortest logarithmic spiral is used, see gh-25644
x = 1.2 + 3.4j
y = np.exp(1j*(np.pi-.1)) * x
z = np.geomspace(x, y, 5)
2 changes: 1 addition & 1 deletion numpy/_core/tests/test_half.py
@@ -49,7 +49,7 @@ def test_half_conversions(self):
# Convert from float32 back to float16
with np.errstate(invalid='ignore'):
b = np.array(self.all_f32, dtype=float16)
-# avoid testing NaNs due to differ bits wither Q/SNaNs
+# avoid testing NaNs due to differ bits with Q/SNaNs
b_nn = b == b
assert_equal(self.all_f16[b_nn].view(dtype=uint16),
b[b_nn].view(dtype=uint16))