Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plan for Python 3.6 EOL (23 Dec 2021) #3227

Closed
joshuacwnewton opened this issue Feb 18, 2021 · 18 comments · Fixed by #3361
Closed

Plan for Python 3.6 EOL (23 Dec 2021) #3227

joshuacwnewton opened this issue Feb 18, 2021 · 18 comments · Fixed by #3361
Labels
priority:HIGH upstream Issue caused by software dependencies
Milestone

Comments

@joshuacwnewton
Copy link
Member

joshuacwnewton commented Feb 18, 2021

Python 3.6 will no longer receive security fixes after 23 Dec 2021. SCT relies exclusively on Python 3.6 because we use tensorflow==1.5.0. We use this version for users whose CPUs don't support AVX (#2562/#2618) and also our CentOS 7 users (#2270/#2276).

So, to move to Python 3.7, we would need to do one of two things:

1. Drop Tensorflow immediately (#3046)

We could drop the Tensorflow-based models (sct_deepseg_<x>), breaking backwards compatibility, then encourage users to update their scripts to use sct_deepseg moving forward. (Requires finishing sct_deepseg in a shorter timespan.)

EDIT: sct_deepseg is not yet able to be a direct replacement for sct_deepseg_sc, sct_deepseg_gm, and sct_deepseg_lesion. So, this option is off the table.

2. Upgrade Tensorflow for now (#3361), then drop Tensorflow once sct_deepseg is complete (#3046)

Python 3.7 support would require at least tensorflow==1.13.1 (but most likely the most recent 1.x version). Upgrading to this version would be a less drastic step that would buy us more time to prepare for #3046.

If we go this route, though, then we would need a way to address the portion of our userbase whose CPUs don't support AVX (#2562/#2618)

The CentOS 7 issues (#2270/#2276) seem to not be a problem, however. See: #3227 (comment).

@joshuacwnewton
Copy link
Member Author

Aside about the impact of this decision on users

Because I'm relatively new to SCT, I still don't have a great sense what our userbase is like. Beyond individual interactions on our forum, I can't really comment on overall demographics and usage habits of our userbase. So, it's hard to estimate the impact any of these decisions will have.

What I'd really love is if we had an open dialogue with the authors of each paper/study that cites SCT. (For example, what if we had someone who sent off an email each time SCT is cited to thank them for using SCT, and to ask them a few questions?) This might help us get a better sense of how our users are applying SCT, which would really help to inform decisions like this.

(By the by, reaching out to users also reminds me of the Testimonials pages from #3346.)

@jcohenadad
Copy link
Member

Drop sct_deepseg_ entirely (breaking backwards compatibility). Then, encourage users to update their scripts to use sct_deepseg moving forward (i.e. #3046)

I would be in favour of that. It is brutal, though, as sct_deepseg_sc is used a lot. But we also don't want to end up in a situation like in this paper, where people are not aware of the most recent tool (ie: sct_deepseg_gm) and instead keep relying on the old ones which are less efficient. So keeping the CLI and sending a message that this function is now deprecated and people should use sct_deepseg.

For example, what if we had someone who sent off an email each time SCT is cited to thank them for using SCT, and to ask them a few questions?

This is a brilliant idea. I do not have the time to do it, but please feel free to take the initiative @joshuacwnewton 🙏

@joshuacwnewton
Copy link
Member Author

Drop sct_deepseg_ entirely (breaking backwards compatibility). Then, encourage users to update their scripts to use sct_deepseg moving forward (i.e. #3046)

I would be in favour of that. It is brutal, though, as sct_deepseg_sc is used a lot. But we also don't want to end up in a situation like in this paper, where people are not aware of the most recent tool (ie: sct_deepseg_gm) and instead keep relying on the old ones which are less efficient.

Good point. 👍

I'm also hopeful that the switch to sct_deepseg + ivadomed will help us to avoid that problem in the future, too. The task-based interface seems like it will help us better communicate when improved methods are released (e.g. by changing whichever tasks are installed by default, while still leaving previous tasks available if needed).

So keeping the CLI and sending a message that this function is now deprecated and people should use sct_deepseg.

I like your idea about a deprecation warning. If we end up making this decision in, say, May-June, we'll ideally give users 6+ months of warning before we finally remove the functions entirely. So, anyone who begins using e.g. 5.3.x/5.4.x/etc. will know in advance that 6.0.0 will make some bigger changes. An advance warning could also be included in the release notes for 5.3.x/5.4.x, the mailing list, etc.

The question I have then, is: does 6 months provide enough warning for our users?

@jcohenadad
Copy link
Member

The question I have then, is: does 6 months provide enough warning for our users?

yes I think so, but more importantly we need to have the models ready by June then 😊

@joshuacwnewton
Copy link
Member Author

yes I think so, but more importantly we need to have the models ready by June then blush

Ah, my apologies, I may have mistakenly thought that sct_deepseg is more ready than it actually is. 😅

Who would be the person to talk to to find out what work is left? Asking because I might need to update the issue description in #3046, in case that it's understating the work remaining for sct_deepseg.

@jcohenadad
Copy link
Member

jcohenadad commented Apr 17, 2021

Who would be the person to talk to to find out what work is left? Asking because I might need to update the issue description in #3046, in case that it's understating the work remaining for sct_deepseg.

It would be me and @charleygros. Nobody else really has the responsibility for this. The bigger work is really to tie in the new model training with the git-annex datasets that @kousu is deploying. We also need to make some decisions on how we want to expose the models, what default models to install, how many flavors of these models (eg: SoftSeg vs. HardSeg), etc. These are extremely important discussions/decisions that we need to have sooner rather than later.

@joshuacwnewton
Copy link
Member Author

These are extremely important discussions/decisions that we need to have sooner rather than later.

Sounds good to me. 👍

Just as a backup plan, I can also quickly test the "upgrading tensorflow" approach to see if we can still use Python 3.7 with SCT's current state (without deprecating anything). Even if we don't end up going that route, it might be useful preparation for the inevitable change later this year.

@kousu
Copy link
Contributor

kousu commented Apr 27, 2021

I didn't read this until today but Tensorflow is pinned because (#2618) it was incompatible with users with older, pre-2008, processors, and more specifically because we're relying on binaries.

That means there's another solution: these users can compile tensorflow themselves, or more reasonably we can compile it and provide it to them; this is the solution proposed here: tensorflow/tensorflow#24548 (comment). It will be annoying for them, but less annoying than holding back everyone else.

Even if we solve this in the meantime with pytorch, this issue is going to come up again. Have the users who had problems with tensorflow tried out sct_deepseg? Does it work for them? Even if it does, there will be some future upgrade to pytorch that is incompatible with processors >10 years old and we'll be stuck again.

Depending on binaries, and not source code, means our vendors' decisions limit who we can support; and PyTorch's target audience is people who can afford data centres, not clinicians.

Also, can we aim for python3.9? The small differences between 3.7/3.8/3.9 can usually be handled by installing polyfills (e.g. https://pypi.org/project/importlib-resources/) and we can get Github Actions to test all three versions. I would like to see

https://github.com/neuropoly/spinalcordtoolbox/blob/c2d2206e21485ab7585af73f4d5b6e895f5fb6e7/setup.py#L35-L36

become

 'Programming Language :: Python :: 3', 
 'Programming Language :: Python :: 3.7', 
 'Programming Language :: Python :: 3.8', 
 'Programming Language :: Python :: 3.9', 

and add this to setup.py:

python_requires='>=3.7',

@charleygros
Copy link
Member

charleygros commented Apr 28, 2021

We also need to make some decisions on how we want to expose the models, what default models to install, how many flavors of these models (eg: SoftSeg vs. HardSeg), etc. These are extremely important discussions/decisions that we need to have sooner rather than later.

👍 !!
@jcohenadad: Should we add this to the next SCT meeting agenda? (If so, I can do this) or have a separate meeting dedicated to these questions?

EDIT: I've just seen @joshuacwnewton message on Slack, all good!

I've added some discussion points to the meeting agenda.

@joshuacwnewton
Copy link
Member Author

joshuacwnewton commented Apr 28, 2021

That means there's another solution: these users can compile tensorflow themselves, or more reasonably we can compile it and provide it to them; this is the solution proposed here: tensorflow/tensorflow#24548 (comment). It will be annoying for them, but less annoying than holding back everyone else.

Would this still fall under "upgrade Tensorflow" part of #3227 (comment)? (In other words, a solution specifically for the subproblem of "we would need a way to address the portion of our userbase whose CPUs don't support AVX")

Have the users who had problems with tensorflow tried out sct_deepseg? Does it work for them?

These are good questions, and ones I don't have the answer to. This is part of the reason why I wanted to bring this up in next week's meeting so that we can work towards getting sct_deepseg ready. (Soliciting feedback is also mentioned in the "Context" part of the description for #3046.)

Also, can we aim for python3.9? The small differences between 3.7/3.8/3.9 can usually be handled by installing polyfills (e.g. https://pypi.org/project/importlib-resources/) and we can get Github Actions to test all three versions. I would like to see

I would like this too, and I'm glad you brought it up. I had only mentioned 3.7 because it was the bare minimum needed to avoid the most urgent issue at hand. (In other words, it would buy us another few years time to plan around 3.8/3.9.)

But, I agree that if we can, it would be better to target 3.8/3.9 upfront, rather than kicking the can down the road another few years.

@jcohenadad
Copy link
Member

Have the users who had problems with tensorflow tried out sct_deepseg? Does it work for them?

no, sct_deepseg currently does not have an equivalent model for SC segmentation-- that's what we need to discuss (and do)-- it is not a trivial task

@joshuacwnewton
Copy link
Member Author

joshuacwnewton commented Apr 28, 2021

2. Upgrade Tensorflow (#3361)

I have some good news about the viability of this option: Tensorflow 1.15 + Python 3.7 only required a single bugfix -- our test suite is passing on every platform for Python 3.7. (I'm especially glad to see that CentOS 7 is passing. I was expecting CentOS 7 to fail, given #2270/#2276.)

If we can plan for the non-AVX users, then this would be a quicker solution for the Python 3.6 EOL, giving us more breathing room to prepare sct_deepseg.

I'm going to explore 3.8/3.9 next. 😄

@joshuacwnewton
Copy link
Member Author

joshuacwnewton commented Apr 28, 2021

It looks like Python 3.8/3.9 is going to be a little more complicated, since pip is having trouble finding matching wheels for tensorflow/onnxruntime. See: Ubuntu, Python 3.8 and Ubuntu, Python 3.9.

This is a bit of a tangential problem. So, I'm going to make a separate issue for 3.8/3.9 so that discussion here can focus on 3.7. (cc: @kousu)

E: #3367

@kousu
Copy link
Contributor

kousu commented Apr 28, 2021

Nice!

Compilers are still a scary thing to me but I did some quick googling about how to go about making compatible builds to not leave users out in the cold.

Once we figure this out maybe we can release them as +noavx the way we're using pytorch's +cpu build and provide them with something like this:

# requirements-noavx.txt
-f https://github.com/neuropoly/tensorflow-noavx/releases/
tensorflow==2.5+noavx
. # <-- install sct

Can we survey the users that had problems and get them to provide us their cat /proc/cpuinfos? Some of them might have avx but not avx2 or avx512 (actually, it looks like I'm one of those people). I don't know if -mno-avx disables all three kinds. And there's a bunch of other CPU flags we might need to worry about for Clinicians running on older hardware.

--

This is for that separate issue once you make it:

The 3.8/3.9 problems are from tensorflow being pinned. I looked through the history and the first available for 3.8 is https://pypi.org/project/tensorflow/2.2.0/#files, and the first 3.9 is a not technically released yet, it's just an rc: https://pypi.org/project/tensorflow/2.5.0rc0/#files. But I think probably by the time we're done they'll have it ready, so we'd be looking at jumping to tensorflow>=2.5.


@joshuacwnewton
Copy link
Member Author

joshuacwnewton commented Apr 28, 2021

Small note: Tensorflow 2.X is a very different beast than Tensorflow 1.X. You can practically consider it a different library, and migrations are nontrivial.

So, I imagine it's more likely we finish the Tensorflow 1.X -> PyTorch transition (i.e. abandoning Tensorflow entirely) before we ever get a chance to touch Tensorflow 1.X -> Tensorflow 2.X.

EDIT: I was off-base here, since SCT has already used Tensorflow 2.X at one point: #2592

@kousu
Copy link
Contributor

kousu commented Apr 28, 2021

Okay, well then forget that then. We'd be looking at recompiling the older tensorflow with all those bugs in it for the no-avx users, and then in a year when 3.7 is EOL we'd be looking at recompiling it for everyone just to keep it alive.

@joshuacwnewton
Copy link
Member Author

joshuacwnewton commented Apr 28, 2021

We'd be looking at recompiling the older tensorflow with all those bugs in it for the no-avx users, and then in a year when 3.7 is EOL we'd be looking at recompiling it for everyone just to keep it alive.

Just to clarify, my understanding was that our ultimate goal was to deprecate our Tensorflow-based methods, not keep them alive. (I've updated the issue description to clarify this, sorry if that was unclear.)

My proposal for upgrading Tensorflow to 1.15 was just as a stopgap, a temporary measure to avoid the Python 3.6 EOL. My understanding was that we would use the extra year to focus on preparing sct_deepseg to be an equivalent replacement for the Tensorflow-based methods, so that SCT would become PyTorch-only. (See #3046.)

@kousu
Copy link
Contributor

kousu commented Apr 28, 2021

Don't worry, it was very clear that exploring tensorflow was just a stop-gap.

@joshuacwnewton joshuacwnewton added user requested Raised by user on the SCT forum/email/GitHub. Be sure to notify them when fixed in a release. and removed user requested Raised by user on the SCT forum/email/GitHub. Be sure to notify them when fixed in a release. labels Jun 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:HIGH upstream Issue caused by software dependencies
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants