DOC: roadmap update #12008
Conversation
One other topic I thought about putting at the top level is "performance". There's a lot that could be done, and it was the single most frequent thing mentioned in the outreach for the NSF grant last year. Still, it's a bit less concrete than the other topics there, so I'm not sure if it makes sense.

I think it's worth mentioning at least, so that people know it's a target. It's even better if you can mention where particular pain points are, but even without that, a general "things should be faster" is worth having in there.
Also address a couple of review comments, delete things that were done. [ci skip]
Note that this was one of the big ticket items in the 2019 NSF proposal for the SciPy ecosystem: https://figshare.com/articles/Mid-Scale_Research_Infrastructure_-_The_Scientific_Python_Ecosystem/8009441

For that proposal we got a lot of direct feedback from outreach to prominent users in domains like economics, physics, and biology. Example quotes:

"... my top-level concern as a person who supports users is whether Python and the Scientific Python stack as a whole needs a rethink as architectures become more parallel and individual cores become less and less powerful. ... We don't have bright lines for Python users like we have for C, C++, Fortran for performance portability. If your code is in C++ we can talk about directive-based parallelism or specific libraries or compilers that will help you minimize the amount of code you have to rewrite to move across architectures (to be fair: it's rarely magic). But things seem to lag in Python."

"Beyond individual features I think there's another issue worth thinking about. Using things like numba efficiently is becoming more and more useful to researchers, as special model types and solution methods may not be written in vectorized form very easily or 'beautifully'. However, since the whole chain has to be written using numba for the JIT to fully materialize, it's kind of an issue that scipy does not generally support this way of solving problems in Python (by that I mean something that's not just fast because you're using NumPy and vector operations). This leads to researchers writing their own optimizers, which are essentially duplicates of well-known methods such as Brent and golden-section search, multilinear interpolation, and other methods, to speed up their code. SciPy should handle this part of the problem-solving (it does, but just not in a way that's efficient for numba users), but currently we risk a lot of code duplication and bugs.

I'm not involved in scipy so I'm not fully aware of their stance on numba vs some of the other possibilities out there, but the issue of being unable to efficiently use @njit is a real issue in my opinion."

[ci skip]
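The duplication that quote describes can be made concrete with a small sketch: a hand-rolled golden-section minimizer of the kind researchers end up rewriting because existing scipy.optimize routines can't be called from inside numba-jitted code. In a numba workflow the function below would simply be decorated with @njit; it is left undecorated here so the sketch stays dependency-free. The function name and tolerance are illustrative, not part of any library API.

```python
# Hand-rolled golden-section search -- the kind of well-known method the
# feedback says researchers duplicate because scipy.optimize.golden
# cannot be used from @numba.njit-compiled code. In a numba workflow
# this would be decorated with @njit; omitted here to avoid the dependency.

def golden_section_min(f, a, b, tol=1e-8):
    """Minimize a unimodal function f on [a, b] by golden-section search."""
    invphi = (5 ** 0.5 - 1) / 2  # 1/phi, about 0.618
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            # Minimum lies in [a, d]; shrink from the right.
            b, d = d, c
            c = b - invphi * (b - a)
        else:
            # Minimum lies in [c, b]; shrink from the left.
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

# Example: (x - 2)^2 has its minimum at x = 2.
x_min = golden_section_min(lambda x: (x - 2) ** 2, 0.0, 5.0)
```

The point of the quote is that this roughly duplicates functionality SciPy already ships, just in a form numba can compile.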
Done. I remembered there was text in last year's NSF proposal on performance, and that we got quite a bit of specific feedback (see some quotes in the last commit message).

Any more thoughts on this? I'd like to merge it in the next couple of days.
In terms of the roadmap, do we care about the size of the package?
It came up in David's PR to accept None in __doc__, but the package might still be too big for some uses.
For example, the AWS Lambda service has a size limit, and SciPy, if used, takes up a large portion of it.
See aws/sagemaker-python-sdk#1471 for an example.
Do we care about the size of the package and keeping it under a limit?
If not, it might be good to say so in the docs.
Or, if space isn't something we consider (within reason, of course), is it worth adding a sentence to that effect in the roadmap?
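For context on why the Lambda size limit bites: one rough way to see how much disk space an installed package contributes is to walk its install directory and sum file sizes. This is only an illustrative sketch; the helper name is made up, and the stdlib `email` package is used as a stand-in so the snippet runs anywhere (substitute `import scipy` to measure SciPy itself).

```python
# Rough on-disk footprint of an installed package (relevant to size
# limits like AWS Lambda's): walk the package directory, sum file sizes.
import os
import email  # stand-in package; substitute `import scipy` to measure scipy


def package_size_bytes(module):
    """Total size in bytes of all files under a package's directory."""
    root = os.path.dirname(module.__file__)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


size_mb = package_size_bytes(email) / 1e6
```

Note this counts only the package's own tree, not its compiled dependencies (e.g. bundled BLAS/LAPACK libraries in a wheel), which are a large part of SciPy's real footprint.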
We do; we need to pay attention to the size of the binaries and not let it grow too much. This is what this sentence under
Okay, we do still have an open issue for incomplete stripping of binaries that would help (not for Lambda though, because that needs a rebuild I believe; you can't use the PyPI wheels): https://github.com/matthew-brett/multibuild/issues/162
OK, thanks for clarifying. Good to know, I'll keep that in mind for the future.

Let's keep the ball rolling. Thanks Ralf, all
Changes to the main roadmap:
- fft: removed the changes, those were done