DOC: roadmap update #12008

Merged
ev-br merged 5 commits into scipy:master on May 17, 2020

rgommers
Member

@rgommers rgommers commented May 3, 2020

Changes to the main roadmap:

  • removed fft changes, those were done
  • moved "Windows builds" and "evolve LAPACK support" to the detailed roadmap
  • added "support for more hardware platforms"; my impression is that interest keeps growing
  • added "performance" as a main item on the roadmap
  • reordered items by importance; mainly moved sparse ndarrays from the top to the bottom, since that's evolving slowly

@rgommers rgommers added the Documentation label May 3, 2020
@rgommers
Member Author

rgommers commented May 4, 2020

One other topic I thought about putting at the top level is "performance". There's a lot that could be done and it was the single most frequent thing mentioned in the outreach for the NSF grant last year. Still, it's a bit less concrete than the other topics there, so not sure if it makes sense.

@larsoner
Member

larsoner commented May 4, 2020

I think it's worth mentioning at least so that people know it's a target. It's even better if you can mention where particular pain points are, but even without that, a general "things should be faster" is worth having in there.

Note that this was one of the big ticket items in the 2019 NSF proposal
for the SciPy ecosystem:
https://figshare.com/articles/Mid-Scale_Research_Infrastructure_-_The_Scientific_Python_Ecosystem/8009441

For that proposal we got a lot of direct feedback from outreach to
prominent users in domains like economics, physics, and biology.

Example quotes:

"... my top-level concern as a person who supports users is whether
Python and the Scientific Python stack as a whole needs a rethink as
architectures become more parallel and individual cores become less and
less powerful. ... We don't have bright lines for Python users like we
have for C, C++, Fortran for performance portability. If your code is in
C++ we can talk about directive-based parallelism or specific libraries
or compilers that will help you minimize the amount of code you have to
rewrite to move across architectures (to be fair: it's rarely magic).
But things seem to lag in Python."

"Beyond individual features I think there's another issue worth thinking
about. Using things like numba efficiently is becoming more and more
useful to researchers, as special model types and solution methods may
not be written in vectorized form very easily or "beautifully". However,
since the whole chain has to be written using numba for the JIT to fully
materialize itself, it's kind of an issue that scipy does not generally
support this way of solving problems in python (by that I mean something
that's not just fast because you're using Numpy and vector operations).
This leads to researchers writing their own optimizers, that are
essentially duplicates of well-known methods such as brent and golden
section search, multilinear interpolation and other methods, to speed up
their code. SciPy should handle this part of the problem-solving (it
does, but just not in a way that's efficient to the numba-users), but
currently we risk a lot of code duplication and bugs. I'm not involved
in scipy so I'm not fully aware of their stance on numba vs some of the
other possibilities out there, but the issue of being unable to
efficiently use @njit is a real issue in my opinion."

[ci skip]
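
To make the second quote concrete, here is a minimal sketch of the kind of hand-written optimizer it describes: a golden-section search in plain Python with no SciPy calls, which is the style of code users tend to duplicate so that it can be compiled by a JIT. The function names `golden_section_minimize` and `quadratic` are hypothetical, and the claim that this body could be compiled with Numba's `@njit` is an assumption that is not exercised here (the objective would also need to be compiled). SciPy itself already provides `scipy.optimize.golden` and `scipy.optimize.brent` for ordinary Python callers.

```python
import math

# Inverse golden ratio, 1/phi (about 0.618); note that 1 - INV_PHI == INV_PHI**2.
INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0


def golden_section_minimize(f, a, b, tol=1e-8, max_iter=200):
    """Approximate the minimizer of a unimodal scalar function f on [a, b].

    Hypothetical helper for illustration only; not SciPy's implementation.
    """
    # Two interior points with a < c < d < b.
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc = f(c)
    fd = f(d)
    for _ in range(max_iter):
        if abs(b - a) < tol:
            break
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def quadratic(x):
    # Simple unimodal test objective with its minimum at x = 2.
    return (x - 2.0) ** 2


x_min = golden_section_minimize(quadratic, 0.0, 5.0)
```

Each iteration reuses one of the two previous function values, so only one new evaluation of `f` is needed per step. That property is what makes the method attractive to reimplement, and also why duplicated copies of it are easy to get subtly wrong.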
@rgommers
Member Author

rgommers commented May 8, 2020

> I think it's worth mentioning at least so that people know it's a target. It's even better if you can mention where particular pain points are, but even without that, a general "things should be faster" is worth having in there.

Done. I remembered there was text on performance in last year's NSF proposal, and that we got quite a bit of specific feedback (see some quotes in the last commit message).

@rgommers
Member Author

Any more thoughts on this? I'd like to merge it in the next couple of days.

@rlucas7
Member

rlucas7 commented May 16, 2020 via email

@rgommers
Member Author

> In terms of roadmap, do we care about the size of the package?

We do; we need to pay attention to the size of binaries and not let it grow too much. This is what this sentence under Cython is about: "It’s not clear how much functionality can be Cythonized without making the .so files too large. This needs measuring." Admittedly that's a bit cryptic; it would be better to say explicitly that the size of binaries/wheels is important, for example because of AWS Lambda limits.
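
As a rough illustration of the "this needs measuring" point, the following is a minimal sketch of how the on-disk size of compiled extension modules under a directory tree could be tallied. The helper name `extension_sizes` and the demo files are hypothetical; for a real measurement one would presumably point `root` at an installed package directory such as `os.path.dirname(scipy.__file__)`, which is an assumption and not something exercised here.

```python
import os
import tempfile


def extension_sizes(root, suffixes=(".so", ".pyd")):
    """Map each extension-module path (relative to root) to its size in bytes.

    Hypothetical helper for illustration; it only inspects file sizes on disk.
    """
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                sizes[os.path.relpath(path, root)] = os.path.getsize(path)
    return sizes


# Demo on a throwaway directory so the sketch runs without SciPy installed.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "linalg"))
    with open(os.path.join(tmp, "linalg", "_fake.so"), "wb") as fh:
        fh.write(b"\0" * 1024)  # stand-in for a compiled module
    with open(os.path.join(tmp, "notes.txt"), "wb") as fh:
        fh.write(b"ignored")  # not an extension module, so it is skipped
    demo_sizes = extension_sizes(tmp)
    demo_total = sum(demo_sizes.values())
```

Sorting the resulting mapping by value would show which modules dominate the total, which is the information needed to judge how much more code can be Cythonized before wheels grow too large.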

@rgommers
Member Author

Okay, we do still have an open issue about incomplete stripping of binaries; fixing that would help (though not for Lambda, because that needs a rebuild I believe, so PyPI wheels can't be used): https://github.com/matthew-brett/multibuild/issues/162

@rlucas7
Member

rlucas7 commented May 17, 2020

> > In terms of roadmap, do we care about the size of the package?
>
> We do, we need to pay attention to size of binaries and not let it grow too much. This is what this sentence under Cython is about: It’s not clear how much functionality can be Cythonized without making the .so files too large. This needs measuring. Although admittedly that's a bit cryptic; better to say explicitly that size of binaries/wheels is important, for example because of AWS Lambda limits.

OK, thanks for clarifying. Good to know; I'll keep that in mind for the future.

@ev-br ev-br merged commit eddc838 into scipy:master May 17, 2020
@ev-br
Member

ev-br commented May 17, 2020

Let's keep the ball rolling. Thanks Ralf, all

@tylerjereddy tylerjereddy added this to the 1.5.0 milestone May 18, 2020
8 participants