Request: use semantic versioning #10156

Closed
khinsen opened this issue Dec 4, 2017 · 88 comments

@khinsen

khinsen commented Dec 4, 2017

Semantic versioning is a widely used convention in software development, distribution, and deployment. In spite of a long-lasting discussion about its appropriateness (Google knows where to find it), it is today the default. Projects that consciously decide not to use semantic versioning tend to choose release numbering schemes that make this immediately clear, such as using dates instead of versions.

NumPy is one of the rare examples of widely used software that uses a version numbering scheme that looks like semantic versioning but isn't, because breaking changes are regularly introduced with a change only in the minor version number. This practice creates false expectations among software developers, software users, and managers of software distributions.

This is all the more important because NumPy is infrastructure software in the same way as operating systems or compilers. Most people who use NumPy (as developers or software users) get and update NumPy indirectly through software distributions like Anaconda or Debian. Often it is a systems administrator who makes the update decision. Neither the people initiating updates nor the people potentially affected by breaking changes follow the NumPy mailing list, and most of them do not even read the release notes.

I therefore propose that NumPy adopt the semantic versioning conventions for future releases. If there are good reasons for not adopting this convention, NumPy should adopt a release labelling scheme that cannot be mistaken for semantic versioning.

@eric-wieser
Member

Could numpy be considered to be using semantic versioning, but with a leading 1?

@rgommers
Member

rgommers commented Dec 4, 2017

Note that almost every core scientific Python project does what NumPy does: remove deprecated code after a couple of releases unless that's very disruptive, and only bump the major version number for, well, major things.

Not sure if you're proposing a change to the deprecation policy, or if you think we should be at version 14.0.0 instead of 1.14.0 now.

@khinsen
Author

khinsen commented Dec 4, 2017

The latter: NumPy should be roughly at version 14 by now. But I propose to adopt this convention only for future releases.

BTW: NumPy's predecessor, Numeric, did use semantic versioning and got to version 24 over roughly a decade. I don't know why this was changed in the transition to NumPy.

@njsmith
Member

njsmith commented Dec 4, 2017

My impression is that the vast majority of Python projects do not use semantic versioning. For example, Python itself does not use semantic versioning. (I'm also not aware of any mainstream operating systems or compilers that use semver -- do you have some in mind?) I agree that semver proponents have done a great job of marketing it, leading many developers into thinking that it's a good idea, but AFAICT it's essentially unworkable in the real world for any project larger than left-pad, and I strongly dispute the idea that the semver folks now "own" the traditional MAJOR.MINOR.MICRO format and everyone else has to switch to something else.

Can you give an example of what you mean by a "release labelling scheme that cannot be mistaken for semantic versioning"? Using names instead of numbers? You cite date-based versioning, but the most common scheme for this that I've seen is the one used by e.g. Twisted and PyOpenSSL, which are currently at 17.9.0 and 17.5.0, respectively. Those look like totally plausible semver versions to me...

And can you elaborate on what benefit this would have to users? In this hypothetical future, every release would have some breaking changes that are irrelevant to the majority of users, just like now. What useful information would we be conveying by bumping the major number every few months? "This probably breaks someone, but probably doesn't break you"? Should we also bump the major version on bugfix releases, given the historical inevitability that a large proportion of them will break at least 1 person's code? Can you give any examples of "software developers, software users, and managers of software distributions" who have actually been confused?

@njsmith
Member

njsmith commented Dec 4, 2017

Note that the mailing list is a more appropriate venue for this discussion, and probably we would have to have a discussion there before actually making any change, but the comments here should be useful in getting a sense of what kind of issues you'd want to address in that discussion.

@khinsen
Author

khinsen commented Dec 4, 2017

@njsmith It seems that the only factual point we disagree on is whether or not semantic versioning is the default assumption today. This requires a clearer definition of the community in which it is (or isn't) the default. The levels of software management I care about are distribution management and systems administration, which is where people decide which version is most appropriate in their context.

The informal inquiry that led me to the conclusion that semantic versioning is the default consisted of talking to administrators of scientific computing installations. I also envisaged a more empirical approach (listing the packages on a recent Debian installation and picking a few of them randomly to investigate their versioning approach), but this turned out to be very difficult, because few projects clearly state whether they use semantic versioning or not.

A comment from one systems administrator particularly struck me as relevant: he said that for the purposes of deciding which version to install, any convention other than semantic versioning is useless. Systems administrators can neither explore each package in detail (they lack the time and the competence) nor consult all their users (too many of them). They have to adopt a uniform policy, and this tends to be based on the assumption of semantic versioning. For example, an administrator of a computing cluster told me that he checks with a few "power users" he knows personally before applying an update with a change in the major version number.

As for examples of people who have actually been confused, specifically concerning scientific Python users, I have plenty of them: colleagues at work, people I meet at conferences, people who ask for advice by e-mail, students in my classes. This typically starts with "I know you are a Python expert, can you help me with a problem?" That problem turns out to be a script that works on one computer but not on another. Most of these people don't consider dependency issues at all, but a few did actually compare the version numbers of the two installations, finding only "small differences".

@khinsen
Author

khinsen commented Dec 4, 2017

As @eric-wieser and @rgommers noted, my request is almost synonymous with requesting that the initial "1." be dropped from NumPy versions. In other words, NumPy de facto already uses semantic versioning, even though it is not the result of a policy decision and therefore probably not done rigorously. However, it does suggest that NumPy could adopt semantic versioning with almost no change to the current development workflow.

@njsmith
Member

njsmith commented Dec 4, 2017

A comment from one systems administrator particularly struck me as relevant: he said that for the purposes of deciding which version to install, any convention other than semantic versioning is useless.

Unfortunately, semantic versioning is also useless for this :-(. I don't mean to split hairs or exaggerate; I totally get that it's a real problem. But just because a problem is real doesn't mean that it has a solution. You fundamentally cannot boil down the question "should I upgrade this software?" to a simple mechanical check. It's a fantasy. Projects that use semver regularly make major releases that all their users ought to immediately upgrade to, and regularly make breaking changes in minor releases.

Systems administrators can neither explore each package in detail (they lack the time and the competence) nor consult all their users (too many of them). They have to adopt a uniform policy, and this tends to be based on the assumption of semantic versioning. For example, an administrator of a computing cluster told me that he checks with a few "power users" he knows personally before applying an update with a change in the major version number.

I like this part though :-). I doubt we'll agree about the philosophy of semver, but it's much easier to have a discussion about the concrete effects of different versioning schemes, and which outcome we find most desirable.

I don't think the concept of semver has much to do with this policy -- does the system admin you talked to actually check every project to see if they're using semver? Most projects don't; as you said, it's hard to even tell which ones do. And the policy is the same one that sysadmins have been using since long before semver even existed. I think a better characterization of this policy would be: "follow the project's recommendation about how careful to be with an upgrade", along with the ancient tradition that major releases are "big" and minor releases are "little".

The NumPy project's recommendation is that system administrators should upgrade to new feature releases, so what I take from this anecdote is that our current numbering scheme is accurately communicating what we want it to, and that switching to semver would not...

@khinsen
Author

khinsen commented Dec 4, 2017

@njsmith OK, let's turn away from philosophy and towards practicalities: What is the role of software version numbers in the communication between software developers, system maintainers, and software users?

Again it seems that we have a major difference of opinion here. For you, it's the developers who give instructions to system maintainers and users, and use the version numbers to convey their instructions. For me, every player should decide according to his/her criteria, and the version number should act as a means of factual communication at the coarsest level.

Given that NumPy has no security implications, I don't see how and why the NumPy project should give universal recommendations. People and institutions have different needs. That's why we have both ArchLinux and CentOS, with very different updating policies.

@ghost

ghost commented Dec 4, 2017

@khinsen The oldnumeric package still works perfectly, and people can install it with:

pip install oldnumeric

Perhaps this could be your proposed "stable numpy," where the interface to numpy is restricted to Python/Cython and nothing is ever changed. Of course, writing code with oldnumeric is very arcane, but you can't have it both ways.

@khinsen
Author

khinsen commented Dec 5, 2017

@xoviat True, but that's a different issue. My point here is not software preservation, but communication between the different players in software management.

Question: As a systems administrator (even just on your personal machine), would you expect a package to drop a complete API layer from version 1.8 to version 1.9?

For those who replied "yes", second question: can you name any software other than numpy that ever did this?

BTW, I can assure you that many people were bitten by this, because I got a lot of mails asking me why MMTK stopped working from one day to the next. All these people had done routine updates of their software installations, without expecting any serious consequences.

But dropping oldnumeric was not the worst event in recent NumPy history. That honor goes to changing the copy/view semantics of some operations such as diagonal. Code that returns different results depending on the NumPy version (minor version number change!) is a real nightmare.
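
To make the kind of change being described concrete, here is a minimal sketch, assuming the documented transition of diagonal() from returning a writable copy to returning a read-only view (the exact release in which the behaviour flipped is glossed over):

import numpy as np

a = np.arange(9).reshape(3, 3)
d = a.diagonal()

try:
    # On old releases diagonal() returned a fresh, writable copy, so this
    # assignment succeeded silently and `a` stayed unchanged.
    d[0] = 100
    print("writable copy returned; a[0, 0] is still", a[0, 0])
except ValueError:
    # On current releases diagonal() returns a read-only view, so the same
    # line raises instead of silently behaving differently.
    print("read-only view returned")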

@khinsen
Author

khinsen commented Dec 5, 2017

BTW, since hardly anyone knows the story: pip install oldnumeric has only worked since two days ago, because @xoviat prepared this add-on package and put it on PyPI. Thanks a lot!

@eric-wieser
Member

eric-wieser commented Dec 5, 2017

would you expect a package to drop a complete API layer from version 1.8 to version 1.9?

Which layer are you referring to?

@rgommers
Member

rgommers commented Dec 5, 2017

can you name any software other than numpy that ever did this?

SciPy dropped weave and maxentropy packages, pandas breaks major features regularly. I'm sure there are many more prominent examples. EDIT: Python itself for example, see https://docs.python.org/3/whatsnew/3.6.html#removed

BTW, I can assure you that many people were bitten by this, because I got a lot of mails asking me why MMTK stopped working from one day to the next. All these people had done routine updates of their software installations, without expecting any serious consequences.

That change was about 10 years in the making, and there is no way that a different versioning scheme would have made a difference here.

Dropping deprecated features is a tradeoff between breaking a small fraction of (older) code and keeping the codebase easy to maintain. Overall, if we're erring, we're likely erring on the conservative side. As someone who has also had to deal with large, years-old corporate code bases that use numpy, I feel your pain, but you're arguing for something that is absolutely not a solution (and in general there is no full solution; educating users about things like pinning versions and checking for deprecation warnings is the best we can do).
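
For readers who have not seen these practices, a small sketch of what "pinning versions and checking for deprecation warnings" can look like; the pinned range below is an arbitrary example, not a recommendation:

# requirements.txt (or install_requires): pin the series the code was tested with
#     numpy>=1.13,<1.14

# In a test suite: escalate DeprecationWarning to an error so that an API
# scheduled for removal is caught a release cycle before it disappears.
import warnings
import numpy as np

warnings.simplefilter("error", DeprecationWarning)

def test_basic():
    assert np.asarray([1, 2, 3]).sum() == 6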

@rgommers
Member

rgommers commented Dec 5, 2017

Which layer are you referring to?

numeric/numarray support I assume

@khinsen
Author

khinsen commented Dec 5, 2017

@rgommers Sorry, I should have said "another example outside the SciPy ecosystem".

Also, I am not complaining about dropping the support for oldnumeric. I am complaining about doing this without a change in the major version number.

What difference would that have made? It would have made people hesitate to update without reading the release notes. Everyone using (but not developing) Python code would have taken this as a sign to be careful.

Don't forget that the SciPy ecosystem has an enormous number of low-profile users who are not actively following developments. Python and NumPy are infrastructure items of the same nature as ls and gcc for them. And often it's less than that: they use some software that happens to be written in Python and just happens to depend on NumPy, and when it breaks they are completely lost.

@rgommers
Member

rgommers commented Dec 5, 2017

@rgommers Sorry, I should have said "another example outside the SciPy ecosystem".

Just edited my reply with a link to the Python release notes, that's outside the SciPy ecosystem.

What difference would that have made? It would have made people hesistate to update without reading the release notes. Everyone using (but not developing) Python code would have taken this as a sign to be careful.

This will simply not be the case. If instead of 1.12, 1.13, 1.14, etc. we have 12.0, 13.0, 14.0, then users get used to that and will use the same upgrade strategy as before. The vast majority will not all of a sudden become much more conservative.

Don't forget that the SciPy ecosystem has an enormous number of low-profile users who are not actively following developments. Python and NumPy are infrastructure items of the same nature as ls and gcc for them. And often it's less than that: they use some software that happens to be written in Python and just happens to depend on NumPy, and when it breaks they are completely lost.

All true, and all not magically fixable by a version number. If they ran pip install --upgrade numpy, they have to know what they're doing (and that doesn't even show a version number anyway). If it's their packaging system, then the problem they're seeing is that the software that breaks doesn't have a decent test suite (or it wasn't run).

@rgommers
Member

rgommers commented Dec 5, 2017

Other downsides of changing the versioning scheme now:

  • we would be making a change in versioning without a change in maintenance policy, which will be confusing rather than helpful
  • we're now basically following Python's lead and doing the same as the rest of the whole ecosystem, which is a good thing
  • maybe most importantly: we would lose the ability to signal actually major changes, the kind we would go to 2.x for, like a release that breaks the ABI.

@mhvk added the "57 - Close?" label (issues which may be closable unless discussion continued) on Dec 5, 2017
@khinsen
Author

khinsen commented Dec 6, 2017

My baseline reference is not Python, but a typical software installation. As I said, for many (perhaps most) users, NumPy is infrastructure like gnu-coreutils or gcc. They do not interpret version numbers specifically in the context of the SciPy ecosystem.

I did a quick check on a Debian 9 system with about 300 installed packages. 85% of them have a version number starting with an integer followed by a dot. The most common integer prefixes are 1 (30%), 2 (26%), 0 (14%) and 3 (13%). If NumPy adopted a version numbering scheme conforming to common expectations (i.e. semantic versioning or a close approximation), it definitely would stand out and be treated with caution.

Note also that the only updates in Debian-installed software that ever broke things for me were in the SciPy ecosystem, with the sole exception of an Emacs update that brought changes in org-mode which broke a home-made org-mode extension. The overall low version number prefixes thus do seem to indicate that most widely used software is much more stable than NumPy and friends.

Uniformity across the SciPy ecosystem is indeed important, but I would prefer that the whole ecosystem adopt a versioning scheme conforming to the outside world's expectations. I am merely starting with NumPy because I see it as the most basic part. It's even more infrastructure than anything else.

Finally, I consider a change in a function's semantics a much more important change than a change in the ABI. The former can cause debugging nightmares for hundreds of users, and make programs produce undetected wrong results for years. The latter leads to error messages that clearly indicate the need to fix something.

According to those standards, NumPy is not even following Python's lead, because the only changes in semantics I am aware of in the Python language happened from 2 to 3.

@rgommers
Member

rgommers commented Dec 6, 2017

Finally, I consider a change in a function's semantics a much more important change than a change in the ABI. The former can cause debugging nightmares for hundreds of users, and make programs produce undetected wrong results for years. The latter leads to error messages that clearly indicate the need to fix something.

This we try really hard not to do. Clear breakage when some feature is removed can happen, silently changing numerical results should not. That's one thing we learned from the diagonal view change - that was a mistake in hindsight.

@rgommers
Member

rgommers commented Dec 6, 2017

it definitely would stand out and be treated with caution.

I still disagree. Even on Debian, which is definitely not "a typical software installation" for our user base (that'd be something like Anaconda on Windows). You also seem to ignore my argument above that a user doesn't even get to see a version number normally (neither with pip install --upgrade nor with a package manager).

@rgommers
Member

rgommers commented Dec 6, 2017

Also, your experience that everything else never breaks is likely because you're using things like OS utilities and GUI programs, not other large dependency chains. E.g. the whole JavaScript/NodeJS ecosystem is probably more fragile than the Python one.

@njsmith
Member

njsmith commented Dec 6, 2017

BTW, I can assure you that many people were bitten by this, because I got a lot of mails asking me why MMTK stopped working from one day to the next

This is a good example of the subtleties here. As far as I know, MMTK and your other projects are the only ones still extant that were affected by the removal of the numeric/numarray compatibility code. How many users would you estimate you have? 100? 1000? NumPy has millions, so maybe 0.1% of our users were affected by this removal? This is definitely not zero, and the fact that it's small doesn't mean that it doesn't matter – I wish we could support 100% of users forever in all ways. And I understand that it's particularly painful for you, receiving 100% of the complaints from your users.

But if we bump our major version number for this, it means to 99.9% of our users, we've just cried wolf. It's a false positive. OTOH for that 0.1% of users, it was really important. Yet it's not uncommon that we break more than 0.1% of users in micro releases, despite our best efforts. So what do we do?

It's simply not possible to communicate these nuances through the blunt instrument of a version number. Everyone wants a quick way to tell whether an upgrade will break their code, for good reasons. Semver is popular because it promises to do that. It's popular for the same reason that it's popular to think that fad diets can cure cancer. I wish semver lived up to its promises too. But it doesn't, and if we want to be good engineers we need to deal with the complexities of that reality.

I don't see how and why the NumPy project should give universal recommendations. People and institutions have different needs.

We give universal recommendations because we only have 1 version number, so by definition whatever we do with it is a universal recommendation. That's not something we have any control over.

That honor goes to changing the copy/view semantics of some operations such as diagonal.

IIRC we have literally not received a single complaint about this from someone saying that it broke their code. (Maybe one person?) I'm not saying that means no-one was affected, obviously the people who complain about a change are in general only a small fraction of those affected, but if you use complaints as a rough proxy for real-world impact then I don't think this makes the top 50.

And BTW I'm pretty sure if you go searching through deep history you can find far more egregious changes than that :-).

Note also that the only updates in Debian-installed software that ever broke things for me were in the SciPy ecosystem, with the sole exception of an Emacs update that brought changes in org-mode which broke a home-made org-mode extension.

Respectfully, I think this says more about how you use NumPy vs Debian than it does about NumPy versus Debian. I love Debian, I've used it for almost 20 years now, and I can't count how many times it's broken things. Just in the last week, some bizarre issue with the new gnome broke my login scripts and some other upgrade broke my trackpoint. (Both are fixed now, but still.) I'll also note that Debian's emacs was set up to download and run code over unencrypted/insecure channels for years, because of backwards compatibility concerns about enabling security checks. I don't think there's such a thing as a gcc release that doesn't break a few people, if only because people do things like use -Werror and then minor changes in the warning behavior (which can rely on subtle interactions between optimization passes etc.) become breaking changes.

The overall low version number prefixes thus do seem to indicate that most widely used software is much more stable than NumPy and friends.

The overall low version number prefixes are because most widely used software does not use semver.

Finally, I consider a change in a function's semantics a much more important change than a change in the ABI. The former can cause debugging nightmares for hundreds of users, and make programs produce undetected wrong results for years. The latter leads to error messages that clearly indicate the need to fix something.

Yes, that's why we're extremely wary of such changes.

There is some disconnect in perspectives here: you seem to think that we change things willy-nilly all the time, don't care about backwards compatibility, etc. I can respect that; I understand it reflects your experience. But our experience is that we put extreme care into such changes, and I would say that when I talk to users, it's ~5% who have your perspective, and ~95% who feel that numpy is either doing a good job at stability, or that it's doing too good a job and should be more willing to break things. Perhaps you can take comfort in knowing that even if we disappoint you, we are also disappointing that last group :-).

@charris
Member

charris commented Dec 7, 2017

with the sole exception of an Emacs update

Well, to go off topic, that does serve as an example of the other side of stability. Emacs was static for years due to Stallman's resistance to change, and that resulted in the xEmacs fork. My own path went Emacs -> xEmacs, to heck with it, -> Vim ;) Premature fossilization is also why I stopped using Debian back in the day. For some things, change simply isn't needed or even wanted, and I expect there are people running ancient versions of BSD on old hardware hidden away in a closet. But I don't expect there are many such places.

Apropos the current problem, I don't think a change in the versioning scheme would really make any difference. A more productive path might be to address the modernization problem. @khinsen Do you see your way to accepting updates to your main projects? If so, I think we should explore ways in which we can help you do it.

@ghost

ghost commented Dec 7, 2017 via email

@khinsen
Author

khinsen commented Dec 7, 2017

@charris I fully agree with you that "never change anything" is not a productive attitude in computing.

My point is that the SciPy ecosystem has become so immensely popular that no single approach to managing change can suit everyone. It depends on how quickly methods and their implementations evolve in a given field, on the technical competences of practitioners, on other software they depend on, on the resources they can invest into code, etc.

The current NumPy core team cares more about progress (in a direction that matters for some fields but is largely irrelevant to others) than about stability. That is fine - in the Open Source world, the people who do the work decide what they want to work on. However, my impression is that they do not realize that lots of people whose work depends on NumPy have different needs, feel abandoned by the development team, and are starting to move away from SciPy towards more traditional and stable technology such as C and Fortran (and, in one case I know, even to Matlab).

I have no idea what percentage of NumPy users are sufficiently unhappy with the current state of affairs, and I don't think anyone else has. Once a software package becomes infrastructure, you cannot easily estimate who depends on it. Many who do are not even aware of it, and much code that depends on NumPy (directly or indirectly) is not public and/or not easily discoverable.

If we want to keep everyone happy in the SciPy community, we need to find a way to deal with diverse needs. The very first step, in my opinion, is to shift the control over the rate of change in a specific installation from the developers to someone who is closer to the end user. That could be the end users themselves, or systems administrators, or packagers, or whoever else - again I don't think there is a universal answer to this question. What this requires from the developers is information at the right level, and that is why I started this thread. Of course version numbers cannot save the world, but I see them as a first step to establishing a distributed responsibility for change management.

Finally, some of you seem to believe that I am fighting a personal battle about my own code. It may surprise you that my personal attitude is not the one I am defending here. My own sweet spot for rate of change is somewhere in between what is common in my field and what seems to be prevalent in the NumPy team. Most of my work today uses Python 3 and NumPy > 1.10. MMTK is 20 years old and I do many things differently today. Quite often I take pieces of code from MMTK that I need for a specific project and adapt them to "modern SciPy", but that's something I can do with confidence only because I wrote the original code.

I have been maintaining a stable MMTK as a service to the community, not for my own use, which explains why I have been doing maintenance in a minimalistic way, avoiding large-scale changes in the codebase. Both funding for software and domain-competent developers are very hard to find, so MMTK has always remained a one-maintainer-plus-occasional-contributors project. I am not even sure that porting all of MMTK to "modern SciPy" would do anyone any good, because much of the code that depends on MMTK is completely unmaintained. But then, that's true for most of the Python code I see around me, even code completely unrelated to MMTK. It's the reality of a domain of research where experiments, rather than computation and coding, are the focus of attention.

@khinsen
Author

khinsen commented Dec 7, 2017

@xoviat The number of tests in oldnumeric is ridiculously small. I wouldn't conclude much from the fact that they pass with NumPy 1.13.

The C extension modules that you have been looking at are literally 20 years old and were written for Python 1.4. Back then, they were among the most sophisticated examples of Python-C combos and in fact shaped the early development of Numeric (pre-NumPy) and even CPython itself: CObjects (pre-Capsules) were introduced based on the needs of ScientificPython and MMTK.

I am the first to say that today's APIs and support tools are much better, and I expect they will still improve in the future. But some people simply want to use software for doing research, no matter how old-fashioned it is, and I think they have a right to exist as well.

@khinsen
Author

khinsen commented Dec 7, 2017

@rgommers I am not ignoring your argument that a user doesn't even get to see a version number. It's simply not true for the environments I see people use all around me. The people who decide about updates (which are not always end users) do see it. They don't just do "pip install --upgrade" once a week. They would even consider this a careless attitude.

If people around you mainly use Anaconda under Windows, that just confirms that we work in very different environments. In the age of diversity, I hope we can agree that each community may adopt the tools and conventions that work well for it.

And yes, NodeJS is worse, I agree. Fortunately, I can easily ignore it.

@khinsen
Author

khinsen commented Dec 7, 2017

Just got an e-mail from a colleague who follows this thread but wouldn't dare to chime in. With an excellent analogy:

"I love it when I get the chance to buy a new microscope and do better science with it. But I would hate to see someone replacing my microscope overnight without consulting with me."

It's all about having control over one's tools.

@shoyer
Member

shoyer commented Dec 11, 2017

RE: fancy indexing. Indeed, this could use a dedicated function. This is what was done in TensorFlow, for example, with tf.gather, tf.gather_nd, tf.scatter_nd, tf.boolean_mask, etc. The result is a little more verbose than overloading [], but certainly more transparent.
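
As a rough sketch of the explicit-function idea applied to NumPy (the name gather and its signature are purely illustrative, not an existing or proposed NumPy API):

import numpy as np

def gather(a, indices, axis=0):
    # Explicit counterpart to a[indices]: select entries along `axis`
    # using an integer index array, without overloading [].
    return np.take(a, indices, axis=axis)

a = np.arange(12).reshape(3, 4)
print(gather(a, [2, 0]))           # same result as a[[2, 0]]
print(gather(a, [3, 1], axis=1))   # same result as a[:, [3, 1]]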

Another feature that can help is type annotations, which were partially motivated by the difficulty of the Python 2 to 3 transition.
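
As a minimal illustration of that point (plain typing annotations, nothing NumPy-specific assumed):

from typing import Sequence
import numpy as np

def center(values: Sequence[float]) -> np.ndarray:
    # The annotated signature documents the contract; a static checker such
    # as mypy can then flag callers that would break if the API changed.
    arr = np.asarray(values, dtype=float)
    return arr - arr.mean()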

I'm not saying this would be easy. In my mind, the community consequences are a bigger deal. This would indeed take a lot of energy to implement and then push downstream into projects like SciPy.

@ilayn
Contributor

ilayn commented Dec 12, 2017

@khinsen I've been following the discussion all week and I think I have a practical test problem to test your take on it. This might be a good item to see how your perspective would handle such conflicts instead of the slightly-abstract discussion so far.

Currently, thanks to the Apple Accelerate framework, the minimum required LAPACK version is 3.1-ish, which is from more than a decade ago. LAPACK is now at 3.8.0. In the meantime they have discarded quite a number of routines (deprecated and/or removed), fixed a lot of bugs, and most importantly introduced new routines that are needed to fill the gap between commercial software and Python scientific software. The end result is summarized here. I have been constantly annoying mainly @rgommers and others about this for the last 6 months 😃 and I can assure you that if they were the kind of people that you, maybe unwillingly, portrayed here, this would have happened by now and broken the code of many people. Instead they have been patiently explaining why it is not that easy to drop the support for Accelerate.

Now there is an undisputed need for newer versions. That is not the discussion and we can safely skip that part. There is a significant portion of users of NumPy and SciPy that would benefit from this. But we can't just simply drop it because of arguments that you have already presented. How would you resolve this?

I'm not asking this in a snarky fashion, but since all the devs seemingly think alike (and I have to say I agree with them), maybe your perspective can give a fresh idea. Should we keep Accelerate and create a new NumPy/SciPy package every time such a thing happens? If we drop the support in order to innovate, what do you think is the best way to go here?

@eric-wieser
Member

eric-wieser commented Dec 12, 2017

Currently, thanks to Apple Accelerate framework the minimum required LAPACK version is 3.1.ish

@mhvk, this might be a problem for #9976 in 1.14, which I think needs 3.2.2 (edit: let's move discussion there)

@eric-wieser
Member

@xoviat: Let's have this discussion on that issue

@khinsen
Author

khinsen commented Dec 13, 2017

@ilayn Thanks for nudging this discussion towards the concrete and constructive! There are in fact many similarities between that situation and the ones that motivated me to start this thread.

The main common point: there are different users/communities that have different needs. Some want Accelerate, others want the new LAPACK features. Both have good reasons for their specific priorities. There may even be people who want both Accelerate and the new LAPACK features, though this isn't clear to me.

In the Fortran/C world, there is no such problem because the software stacks are shallower. There's Fortran, LAPACK, and the application code, without additional intermediates. What happens is that each application code chooses a particular version of LAPACK depending on its priorities. Computing centres typically keep several LAPACK versions in parallel, each in its own directory, the choice being made by modifying the application code's Makefile.

The lesson that we can and should take over into the SciPy ecosystem is that choosing software versions is not the task of software developers, but of the people who assemble application-specific software bundles. In our world, that's the people who work on Anaconda, Debian, and other software distributions, but also systems managers at various levels and end users with the right competence and motivation.

So my proposal for the SciPy/LAPACK dilemma is to keep today's SciPy using Accelerate, but put it into minimal maintenance mode (possibly taken over by different people). People who want Accelerate can then choose "SciPy 2017" and be happy. They won't get the new LAPACK features, but presumably that's fine with most of them. Development continues in a new namespace (scipy2, scipy2018 or whatever else), which switches to modern LAPACK. If technically possible, allow parallel installation of these two (and future) variants (which I think should be possible for SciPy). Otherwise, people needing both will have to use multiple environments (conda, venv, or system-wide environments via Nix or Guix). Note that even in this second scenario, I strongly recommend changing the namespace with each incompatible change, to make sure that readers of Python code at any level understand for which SciPy version the code was written.

The overall idea is that developers propose new stuff (and concentrate on its development), but don't advertise it as "better" in a general sense, nor as a universal replacement. Choosing the right combination of software versions for a particular task is not their job, it's somebody else's.

The general idea that development and assembly are done independently and by different people also suggests that today's mega-packages should be broken up into smaller units that can progress at different rates. There is no reason today for NumPy to contain a small LAPACK interface and tools like f2py. For SciPy, it may make sense to have a common namespace indicating coherence and a common development policy, but the sub-packages could well be distributed independently. The mega-package approach goes back to Python's "batteries included" motto, which was great 20 years ago. Today's user base is too diverse for that, and software packaging has generally been recognized as a distinct activity. Including the batteries should now be Anaconda's job.

The main obstacle to adopting such an approach is traditional Linux distributions such as Debian or Fedora with their "one Python installation per machine" approach. I think they could switch to multiple system-wide virtual environments with reasonable effort, but I haven't thought much about this. For me, the future of software packaging is environment-based systems such as conda or Guix.
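
For concreteness, the environment-based setup sketched above might look roughly like this with conda (environment names and pinned versions are arbitrary examples; older conda releases spell the last command "source activate"):

conda create -n scipy-2017 python=3.6 numpy=1.13 scipy=0.19
conda create -n scipy-current python=3.6 numpy scipy
conda activate scipy-2017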

@ilayn
Contributor

ilayn commented Dec 13, 2017

I don't see how the propositions you have put forth so far are compatible with any of these steps:

  • You have just recreated the madness of the following picture:
    [image]
    I just counted and I have 27 copies now on my Windows machine. Now multiply that by 10 (since releases are more frequent here) and by 2 (since NumPy and SciPy release cycles are independent). In year 2025 I'll easily have 15 copies of each library and 10 LAPACKs and 5 f2pys as dependencies. Quite apart from the maintenance burden on only a couple dozen people across both packages, this simply won't work. (C++ is not relevant here; insert any standard lib of anything.) Ask any commercial code developer for Windows and tell them this is such a good idea; I'm not responsible for what follows in that exchange.
  • Then you increased the granularity of the packages, and now they are all doing their own thing with different package versions; f2py breaks something in one version, so SciPy stops building in the next but still depends on the earlier version of NumPy. So some holistic entity should bring them together for free.
  • Then you also made Anaconda (or some other entity) a major dependency, just like Accelerate was. Or simply there will be an abundance of "somebody else"s.
  • Then you mobilized most of the user base into a workflow that they really don't want (myself included), involving virtual envs.
  • Then you even modified Linux operating systems in passing (which is ... I mean, just read some of their mailing lists, it's fun).

Maybe you digressed a bit.

@ghost

ghost commented Dec 13, 2017

(This has become a free-for-all discussion, so I'll go ahead and jump in).

The problem with keeping support for accelerate is not that it lacks newer LAPACK APIs. If that were the problem, we could ship newer LAPACK shims and be done. The problem is that there are basic functions that return incorrect results in certain scenarios. There is no way to work around that other than to write our own BLAS functions. And if we're doing that, we might as well require OpenBLAS or MKL.

@ilayn
Contributor

ilayn commented Dec 13, 2017

@xoviat These have all been discussed in scipy/scipy#6051. As usual, it's never that simple. But the point is not to discuss the Accelerate drop, but to use it as a use case for the actual dev cycle for new versions.

@ghost

ghost commented Dec 13, 2017

@ilayn Yes, I'm sure you already know about the points that I'm making. But the comment was for @khinsen; I think he's under the impression that we can actually keep Accelerate support.

@bashtage
Contributor

One could argue that a feature (or limitation) of the Python ecosystem is that you get one version of a library without the horrible hack of name mangling. This happens in core Python. This is why there are libraries named lib and lib2 which have the same purpose but API differences. Even core Python works this way. It isn't possible to mix standard libraries across versions, even if both are technically usable on modern Python, without someone ripping one out and putting it on PyPI. There are plenty of StackOverflow questions on this, all with the same conclusion.

@khinsen
Author

khinsen commented Dec 14, 2017

@ilayn If for some reason you want to have all possible combinations of all versions of everything on your machine, yes, that's a mess. But why would you want that? If you limit yourself to the combinations you actually need for your application scenarios, I bet it's going to be less. As an example, I keep exactly two Python environments on my machine: one with Python 2 + NumPy 1.8.2 for running my 20-year-old code, and one representing the state of the art of about two years ago for everything else (two years ago because I set it up two years ago, and never saw a reason to upgrade after that).

As for granularity, I was perhaps not quite clear in my proposition. What I advocate is more granularity in packaging, not in development. I would expect development of, say, f2py and SciPy to continue in close coordination. f2py-2018 and SciPy-2018 should work together. That doesn't mean they have to be packed as a single entity. The goal is to provide more freedom for software distribution managers to do their work.

I definitely don't want to make Anaconda or any other distribution a dependency. It's more like the "abundance of somebody else's", although I don't expect the number of distributions to grow to "abundance", given that assembling them is a lot of work.

I have no idea what workflow "the user base" wants. I see lots of different user bases with different requirements. Personally I'd go for multiple environments, but if there is a significant user base that wants a single environment per machine, some distribution will take care of that. But virtual environments were invented for a reason, they solve a real problem. System-level distributions like Nix or Guix take them to another level. I don't expect them to go away.

BTW, I am actually following the mailing list of one Linux distribution (Guix). Not much fun, but a lot of down-to-earth grunt work. I am happy there are people doing this.

@khinsen
Author

khinsen commented Dec 14, 2017

@xoviat I didn't suggest to "keep Accelerate support". I merely suggest to keep a SciPy variant (pretty much the current one) around not as an outdated release for the museum, but as a variant of interest for a particular user group: those for whom using Accelerate is more important than solving the problems that Accelerate creates for others. The "Accelerate first" people will have to live with the consequences of their choice. Some problems will never be fixed for them. That's probably fine with them ("known bugs are better than unknown bugs"), so why force them into something different?

It's really all about labelling and communication. I want to get away from the idealized image of software following a linear path of progress, with newer versions being "better" as indicated by "higher" version numbers. I want to replace this image with one that I consider more realistic: there is no obvious order relation between software releases. Those produced by a long-lived coherent developer community have a temporal order, but that doesn't imply anything about quality or suitability for any given application.

If the idealized image were right, we wouldn't see forks, and we wouldn't have virtual environments. Nor projects such as VersionClimber.

What I am proposing is that software developers should embrace this reality rather than denying it. They should develop (and, most of all, package and label) their products for a world of diversity.

@ghost

ghost commented Dec 14, 2017

@khinsen If you're okay with incorrect results from linear algebra functions, then we can keep accelerate support (note to others: I know how to do this). However, the main problem is that you might be the only person who wants this. And even if you are not, what happens when someone down the road blames SciPy for a problem with accelerate? What happens when someone wants to have their cake and eat it too? I can just see that happening.

@khinsen
Author

khinsen commented Dec 15, 2017

@xoviat No, I am not OK with incorrect results from linear algebra functions. But I am sure that there are plenty of SciPy users who don't need the affected functions at all. In the thread you referred to, someone suggested removing/deactivating the affected functions when Accelerate is detected, which I think is a good solution (note: I cannot judge the effort required to implement this).

In a way this is part of the mega-package problem. With a more granular distribution, it would be easier to pick the stuff that works, both at the development and the distribution assembly level. One could even imagine a distribution assembler composing a domain- and platform-specific SciPy distribution in which different subpackages use different LAPACK versions, e.g. for use in HPC contexts.

@ghost

ghost commented Dec 16, 2017

But I am sure that there are plenty of SciPy users who don't need the affected functions at all.

There's minimal evidence for this statement and I would in fact bet on the opposite. The functions are widely used but only fail in certain scenarios; in other words, your results are probably correct but may not be. Yes, this probably applies to the SciPy that you currently have installed if you are using OSX. Yes, this needs to be fixed.

As far as maintaining a separate branch, I don't think that anyone would be opposed to giving you write access to a particular branch for you to maintain. But this is open source software and people work on what they want to; I am skeptical that many people would be interested in maintaining that branch.

@ghost

ghost commented Dec 16, 2017

Actually, I think the anaconda SciPy is compiled with MKL, so you wouldn't be affected in that case. But then why would you care about accelerate support?

@khinsen
Author

khinsen commented Dec 17, 2017

@xoviat It seems there's a big misunderstanding here. I have no personal stakes at all in this specific issue. I don't use any linear algebra routines from SciPy.

You pointed to a thread on a SciPy issue and asked how I would handle that kind of situation. The thread clearly shows reluctance to simply drop Accelerate support, from which I deduced that there is a significant user group that would be affected by such a change. If that user group doesn't exist, then where is the problem? Why hasn't SciPy already dropped Accelerate support?

@khinsen
Author

khinsen commented Dec 17, 2017

@xoviat Maintaining a separate branch is easy for anyone. There is no need for it to be hosted in the same GitHub repository. In other words, branches are not the issue. The issue is namespacing, in order to make the parallel existence of separate SciPy versions transparent to users (and distribution assemblers).

Today, when you see code saying "import scipy", you have no idea for which range of SciPy versions it is supposed to work (i.e. has been tested to some degree). In the best case, there is a README saying "SciPy >= 0.8" or something like that. This habit is based on the assumption that "higher" versions are always "better" and never degrade (break, slow down, ...) anything. And that assumption is quite simply wrong.

If, on the other hand, the code says "import scipy2017 as scipy", then it is clear to every reader that using it with earlier or later versions might lead to bad surprises. And if old SciPy versions disappear (effectively, for lack of maintenance), then such a code will fail with an error message, rather than continuing to work unreliably.

This is the one point I am trying to make in this thread. The coexistence of different versions is a reality. The idea that higher is better is a dream. Let's be realistic and organize ourselves for the real world, by acknowledging a multiple-version universe and adjusting everybody's communication to prevent misunderstandings.

@seberg
Member

seberg commented Dec 17, 2017

Well, dunno… in my opinion, when it comes to warnings, a specific version import is not a warning, it prohibits using a different version, since the users having the problems you describe will not dare change your code. A warning would be printing a warning at install/run time that it is untested for all but specific numpy versions?

I suppose creating that type of extra package is possible. I also expect it will just create a different type of hell. Much might survive, but type checking, for example, will not and cannot work when you mix two versions, so basically you won't know whether it can work until you try (and no one will test this!).
And unless you are suggesting allowing to mix two versions, I think your scipy2017 solution will just make things worse. Seems more like we would need something like dynamic/runtime virtual env choosing (like pin_import_version("1.6<numpy<1.10", level="raise") before any import on the python level).
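
pin_import_version does not exist anywhere; a minimal sketch of what such a runtime pin could look like, with deliberately simplistic parsing of the range string:

import importlib
import re

def pin_import_version(spec, level="raise"):
    # Hypothetical helper: check that the installed version of a package
    # lies inside a range such as "1.6<numpy<1.10" before it gets used.
    low, name, high = re.match(r"([\d.]+)<(\w+)<([\d.]+)", spec).groups()
    module = importlib.import_module(name)
    parse = lambda v: tuple(int(x) for x in v.split(".")[:2])
    if not parse(low) < parse(module.__version__) < parse(high):
        message = "%s %s is outside the pinned range %s" % (name, module.__version__, spec)
        if level == "raise":
            raise ImportError(message)
        print("warning:", message)
    return module

# numpy = pin_import_version("1.6<numpy<1.10")  # would raise on, say, numpy 1.13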

The specific import makes sense if you have major prohibitive changes (a bit like py2/py3), and we already saw that we have different opinions on where or on what time scale that "major" line seems to be.

@mattip
Member

mattip commented Aug 8, 2018

The backward compatibility NEP #11596 has been submitted, can we close this?

@rgommers
Member

rgommers commented Aug 8, 2018

The backward compatibility NEP #11596 has been submitted, can we close this?

Yes we can close this. Independent of that NEP (which explicitly mentions semver as a rejected alternative), the consensus of the core devs here is that we don't want to change to semver. Hence closing as wontfix.

Thanks for the discussion everyone.
