Build openfold with newer pytorch + cuda #403
Comments
Indeed, preparing the environment on a newer platform is far from easy. I just did it this way on my Debian testing rolling setup:
While iterating towards these lines, I encountered a few pitfalls:
And the preceding CUDA setup has pitfalls as well:
I find CUDA setup via the Debian repos easier than via Nvidia (in fact, at this moment the critical bug 4336331 in the Nvidia driver is only fixed in Debian). I add these sources:
and install:
I got these package versions:
and outside the mamba environment, I have:
It is possible that OpenFold will need some little tweaks here and there with this setup. But I hope this helps a little bit...
Meanwhile, PR #407 just landed in the codebase (thanks @jnwei and everybody involved!) and it is supposed to tackle these issues. While it certainly moves the code forward (and may well contain some of the needed "tweaks here and there" I mentioned above), it still does not allow me to install an environment by just following the install instructions in the README. When I do:
it tries to install older packages than the PR #407 description suggests:
and then fails with:
In the current Also, the C++14/17 patch (which I did above via
@abeebyekeen I see that you also devoted considerable effort to setting up environment.yml. It would be nice to hear how your current setup works for you. (I am particularly interested in the effects of the 'cuda' conda package. Maybe it allows an even more minimalist CUDA setup in the operating system? Just the kernel driver?) Or how it compares with my setup (2nd post in this thread), if you have any incentive to try.
Hi @vaclavhanzl. Yes, I spent a good part of last weekend trying to set up a tool that requires openfold as a dependency. I was initially unable to build openfold due to a number of problems including Here is what I've got, and the selections that eventually worked for me in solving all the problems:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
$ gcc --version
gcc (GCC) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
For the environment I needed, here is how I set it up:
To get openfold to build, I was able to use the
git clone https://github.com/aqlaboratory/openfold.git
cd openfold
sed -i 's/std=c++14/std=c++17/g' setup.py
python -m pip install .
And that works perfectly.
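For anyone who prefers not to rely on sed, the same C++14 to C++17 substitution can be sketched in Python. This is only an illustration of the text replacement the sed line performs: the helper names `patch_cxx_standard` and `patch_file` are hypothetical, and the sample flag line is made up for the demo rather than copied from OpenFold's setup.py.

```python
from pathlib import Path


def patch_cxx_standard(setup_py_text: str, old: str = "c++14", new: str = "c++17") -> str:
    """Return the text with every 'std=<old>' occurrence replaced by 'std=<new>'.

    Mirrors: sed -i 's/std=c++14/std=c++17/g' setup.py
    """
    return setup_py_text.replace(f"std={old}", f"std={new}")


def patch_file(path: str = "setup.py") -> None:
    """Apply the replacement in place (hypothetical helper; run it from the openfold checkout)."""
    target = Path(path)
    target.write_text(patch_cxx_standard(target.read_text()))


# Demo on an in-memory string. This line is an invented example,
# not the actual contents of OpenFold's setup.py.
sample = "extra_compile_args = ['-O3', '-std=c++14']"
print(patch_cxx_standard(sample))
```

Calling `patch_file()` before `python -m pip install .` would then take the place of the sed step.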
Thanks a lot @abeebyekeen for sharing this with us in such a clear way! I'd very much like to see OpenFold working out of the box for most new users, especially now that there are great new and publicized features. I still do not know what a PR going that way should look like; addressing the different setups people might have is hard. I was thinking about having ranges of versions in environment.yml, but that is not an easy way either.
Hi all, Thanks for all the interest and for sharing notes! The environments I wanted to support at this time were:
I just checked the pl_upgrades branch on two systems I have access to with pre-installed CUDA 12, and found that it was working for me. Let me know if folks have issues with these environments. My understanding was that having an environment which is CUDA 11.x + Pytorch 2.x is complicated, as the default pytorch 2 packages are built on CUDA 12 (leading to the CUDA mismatch error @vaclavhanzl saw). It looks like @abeebyekeen was able to find a workaround with a lot of elbow grease, thanks for sharing your fix! I plan on cleaning up the documentation for this project, and when I do, I'll add a page regarding the supported environments.
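A quick way to spot this kind of mismatch before building is to compare the CUDA release reported by `nvcc --version` with the CUDA version the installed PyTorch was built against (available as `torch.version.cuda`). The sketch below works on plain strings so it runs without a GPU or PyTorch. The helper names are hypothetical, the `"12.1"` value is an assumed example, and comparing only the major version is a simplifying assumption about what counts as a mismatch.

```python
import re
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract 'MAJOR.MINOR' from the text printed by `nvcc --version`, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def same_cuda_major(toolkit_version: str, torch_cuda_version: str) -> bool:
    """Return True when both version strings share the same major number."""
    return toolkit_version.split(".")[0] == torch_cuda_version.split(".")[0]


# The nvcc line is taken from the output quoted earlier in this thread.
nvcc_output = "Cuda compilation tools, release 11.8, V11.8.89"
toolkit = parse_nvcc_release(nvcc_output)

# Assumed example value; in a real environment read it from torch.version.cuda.
torch_cuda = "12.1"

print(toolkit, torch_cuda, same_cuda_major(toolkit, torch_cuda))
```

If the majors differ, building the CUDA extensions is likely to fail, so it is worth checking before running the install.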
Thanks @jnwei! I am happy to report that the (And please @jnwei excuse my rather misguided comments on PR #407 - I totally overlooked that you merged to
quick question about future plan:
Hi there, just wanted to follow up on this and ask if there are any plans/timelines to merge
Hi thanks for the interest. We're actively working on finalizing the changes in
Thank you @jnwei, appreciate the update!
A quick note on the pytorch 2 / CUDA 12 upgrade: We've run into some technical issues with the pytorch 2 upgrade. Briefly, we observe large instabilities in our training losses in the pytorch 2 version relative to our pytorch 1 version. For inference, we're also observing a slight difference between model outputs in pytorch 1 and pytorch 2. The difference in final output coordinates is an RMSD of about 0.05 Å for the proteins I've looked at.
While these differences might seem small, they may point to a larger issue that is also occurring in training; we're currently looking into it. Until we find the root cause of the discrepancy, or a way around the training instability, we're not ready to update the main branch to pytorch 2.
Meanwhile, we will upgrade the main branch to use pytorch lightning 2, which has a few features that the team has found useful. I'll also push some changes to pl_upgrades that integrate some of the changes from the main branch and clean up the conda environment / docker setup for CUDA 12 / pytorch 2.
We are actively working on debugging the instability, and we'll keep you posted as soon as we are ready to upgrade. Thank you all for your interest and your patience.
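For readers who want to compare outputs from two installs themselves, a coordinate RMSD can be computed along these lines. This is a minimal sketch and not necessarily how the number above was obtained: it assumes both predictions list the same atoms in the same order and are already in the same frame (no superposition is performed), and the function name `rmsd` is just an illustrative choice.

```python
import math
from typing import Sequence

Coord = Sequence[float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes matching atom order and a shared reference frame; no alignment is done.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total_sq = 0.0
    for a, b in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total_sq += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total_sq / len(coords_a))


# Toy coordinates in Å, invented for illustration only.
run_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
run_b = [(0.0, 0.0, 0.05), (1.5, 0.0, 0.05)]
print(rmsd(run_a, run_b))
```

If the two structures are not in the same frame, they would need to be superimposed first (for example with a Kabsch alignment) before this number is meaningful.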
Right now openfold requires an old pytorch + cuda (11.2), so it cannot be built on the latest Linux distributions.
I would like the supported pytorch + cuda and other python packages to be upgraded accordingly, so people can use newer platforms (OS, etc.).