ENH: Allow toggling madvise hugepage and fix default #15769
Conversation
By default this disables madvise hugepage on kernels before 4.6, since we expect that these typically see large performance regressions when using hugepages, due to slow defragmentation code presumably fixed by torvalds/linux@7cf91a9. This adds support for setting the behaviour at startup time through the ``NUMPY_MADVISE_HUGEPAGE`` environment variable. Fixes gh-15545
Force-pushed from 529e281 to 2d6edb3
Maybe a page about "Global State", and mention: these are options that are typically controlled through environment variables at startup or compile time.
This seems like a nice feature - I mostly did a once-over on the docs.
Just for kicks, I tried this on my system (kernel v5.5.9) and did indeed see that performance was generally better when NumPy's use of hugepages was enabled. Using the original examples from #15545:
>>> import numpy as np; print(np.use_hugepage)
1
>>> n = int(1e9)
>>> %timeit np.zeros(n)
5.52 µs ± 75.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit np.random.rand(n)
5.87 s ± 76.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit np.linspace(0, 100, n)
2.49 s ± 50.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit np.exp(np.zeros(n))
7.05 s ± 152 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
And with NUMPY_MADVISE_HUGEPAGE=0:
>>> import numpy as np; print(np.use_hugepage)
0
>>> n = int(1e9)
>>> %timeit np.zeros(n)
7.86 µs ± 30.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
>>> %timeit np.random.rand(n)
6.97 s ± 77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit np.linspace(0, 100, n)
3.78 s ± 12.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit np.exp(np.zeros(n))
8.09 s ± 68.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
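For readers without IPython's %timeit, a rough stdlib equivalent can reproduce this kind of comparison (the `bench` helper is illustrative; the NumPy setup string is assumed to match the session above):

```python
import timeit

def bench(stmt, setup="import numpy as np; n = int(1e8)", repeat=5):
    """Run `stmt` repeatedly and return the best single-run time in
    seconds, roughly the fastest loop %timeit reports.  Assumes NumPy
    is installed; shrink n if memory is tight."""
    return min(timeit.repeat(stmt, setup=setup, number=1, repeat=repeat))

# e.g. running bench("np.zeros(n)") with and without
# NUMPY_MADVISE_HUGEPAGE=0 in the environment shows the
# allocation-path difference measured above.
```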
you can experience a significant speedup when transparent
hugepage is enabled.
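The hint the quoted documentation refers to is a single madvise(2) call on the freshly mapped buffer. NumPy does this in C inside its allocator; a rough Python-level sketch of the same idea (the function name is hypothetical; `mmap.MADV_HUGEPAGE` only exists on Linux with Python 3.8+, hence the guards):

```python
import mmap

def alloc_with_hugepage_hint(nbytes):
    """Allocate an anonymous mapping and, where supported, hint the
    kernel to back it with transparent hugepages.  A sketch of the
    idea only -- not NumPy's actual allocator."""
    buf = mmap.mmap(-1, nbytes)  # anonymous, page-aligned mapping
    if hasattr(mmap, "MADV_HUGEPAGE"):  # Linux, Python >= 3.8
        try:
            buf.madvise(mmap.MADV_HUGEPAGE)
        except OSError:
            pass  # kernel built without transparent hugepage support
    return buf
```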
Do you want to include an external link to linux docs on transparent hugepages? If so, you could consider this: https://www.kernel.org/doc/html/latest/vm/transhuge.html
Good idea. EDIT: Although this link is better: https://www.kernel.org/doc/html/latest/admin-guide/mm/transhuge.html
The array function protocol which allows array-like objects to
hook into the NumPy API is currently enabled by default.
It can be disabled using::
Perhaps mention that the feature was introduced in v1.17 and is enabled by default
This flag is checked at import time.
Debugging
Same suggestion as above - maybe "Debugging-related Options" to be consistent
Meh, I confirmed that kernel 4.8 is fixed, then I went down a wrong turn and lost my progress on testing 4.6 as well :(.
Co-Authored-By: Ross Barnowski <rossbar@berkeley.edu>
Force-pushed from 3de26ef to 1400cea
Not sure if it makes sense, but we could consider backporting this, although maybe it is too high risk? It does at least try to mitigate a performance regression.
    }
    _madvise_hugepage = enabled;
    if (was_enabled) {
        Py_RETURN_TRUE;
Probably doesn't matter, since our code doesn't check the return value, but will this always return Py_RETURN_TRUE?
Not sure I follow, this function sets the behaviour and returns the current state, either True or False, in the line below?
But was_enabled was set to the _madvise_hugepage value, which is set to 1 above, and it doesn't seem to change. Isn't was_enabled always 1?
Maybe I have to add a test to prove that it does, I admit. was_enabled is a copy of the global static _madvise_hugepage, which is modified with _madvise_hugepage = enabled. So if enabled is 0, on the next call was_enabled will be 0.
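To make the set-and-return-previous semantics concrete, here is a small Python model of the C function being discussed (a sketch only; the real implementation is a C static plus Py_RETURN_TRUE/Py_RETURN_FALSE):

```python
_madvise_hugepage = True  # module-level state, mirrors the C static

def set_madvise_hugepage(enabled):
    """Set the flag and return the *previous* value, mirroring the
    semantics of the C function (a model, not the real code)."""
    global _madvise_hugepage
    was_enabled = _madvise_hugepage
    _madvise_hugepage = bool(enabled)
    return was_enabled

print(set_madvise_hugepage(False))  # True  (the prior state)
print(set_madvise_hugepage(True))   # False (it was just disabled)
```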
Okay, thanks for explaining. I guess I misunderstood this function completely 😁. I was expecting it to return true for platforms where _set_madvise_hugepage works (i.e. it is able to use this env variable) and false where it doesn't; I didn't realize was_enabled carries over to the next function call.
A comment before the function with an explanation may help prevent future confusion.
kernel_version = os.uname().release.split(".")[:2]
kernel_version = tuple(int(v) for v in kernel_version)
if kernel_version < (4, 6):
    use_hugepage = 0
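As reported further down in this thread, release strings with vendor suffixes such as "4.19-ovh-xxxx-std-ipv6-64" make the int(v) conversion raise. A more tolerant parse, sketching the idea behind the gh-16679 fix (not the actual patch):

```python
import re

def parse_kernel_version(release):
    """Extract (major, minor) from a kernel release string, tolerating
    vendor suffixes such as '4.19-ovh-xxxx-std-ipv6-64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return (0, 0)  # unparseable: assume old, keep the safe default
    return (int(match.group(1)), int(match.group(2)))

print(parse_kernel_version("4.19-ovh-xxxx-std-ipv6-64"))  # (4, 19)
print(parse_kernel_version("5.5.9-arch1-2"))              # (5, 5)
```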
I don't know if this will help here, but is it worth putting out a message saying that if you are using a version less than 4.6 and notice issues with large arrays, try setting NUMPY_MADVISE_HUGEPAGE=1? If we do this we can also backport, and it will be safer since there is a message shown to the user to fix the issue.
Well, there seemed to be some consensus around not trying to guess what is probably right for most people, and instead adding it to an FAQ-style tip page somewhere? I actually like guessing if it helps 90% of people, since I think very few will find the setting or even know they are getting terrible performance...
I think guessing is fine. Maybe the new troubleshooting page would be a good page to reference this setting.
Ahh, I see you have already added a global_state page. That seems fine.
@@ -4159,6 +4161,8 @@ static struct PyMethodDef array_module_methods[] = {
        METH_VARARGS, NULL},
    {"_add_newdoc_ufunc", (PyCFunction)add_newdoc_ufunc,
        METH_VARARGS, NULL},
    {"_set_madvise_hugepage", (PyCFunction)_set_madvise_hugepage,
        METH_O, "Toggle and return madvise hugepage (no OS support check)."},
Maybe extend this docstring, even though it is a private function:
_set_madvise_hugepage(tf: bool) -> bool
Set or unset use of ``madvise(2)`` MADV_HUGEPAGE support when allocating
the array data. Returns the previously set value. See `global_state` for more
information.
Force-pushed from 9ffd839 to 828e761, then from 828e761 to 3395802
The proposed solution looks good to me, thanks for doing the work @seberg !
close/reopen to restart builds
Merging, the failing s390x job is unrelated.
Thanks @seberg
On Linux NumPy has previously added support for madvise
hugepages which can improve performance for very large arrays.
Unfortunately, on older kernel versions this led to peformance
I realise I'm late to the party since this has already been merged, but "peformance" -> "performance".
Hi, I have a problem with the OVH kernel name "4.19-ovh-xxxx-std-ipv6-64": kernel_version = tuple(int(v) for v in kernel_version)
@Megaland90 You are looking for gh-16679 which will be part of the upcoming 1.19.1.
Thanks @mattip :D
@Megaland90 thanks for the report. For future reference, opening a new issue rather than adding comments to a PR helps keep things more organized (although I just added two more comments myself :))
Still needs some documentation somewhere, probably... I am not sure where, though. A new site listing all environment variables used at compile or startup time? Relaxed strides, the experimental array function protocol, this one, ...?