
[Knowledge] ROCm (AMD GPU) Support on Linux Guide #868

Open
LuisArtDavila opened this issue Sep 19, 2023 · 18 comments
Labels
knowledge

Comments

@LuisArtDavila

Description

Hello,

I managed to get my GPU to show up in the web interface by executing the following commands after setting up my Conda environment. Your mileage may vary, as this was only tested on an RDNA 2/Navi 2 AMD GPU, specifically my 6700 XT. I would love to know if this works for anyone else, so please let me know so that I can open a pull request.

This was tested on Arch Linux but might work on other distributions. If not, you can always try with distrobox.

Before running any commands, make sure that you have cd'd into the cloned repository and have the environment activated, e.g. via conda activate vcclient-dev:

$ pip install fairseq pyworld
$ export HSA_OVERRIDE_GFX_VERSION=10.3.0
$ pip install torch==2.0.1+rocm5.4.2 torchvision==0.15.2+rocm5.4.2 --index-url https://download.pytorch.org/whl/rocm5.4.2
$ cd server

If you are running a 7000 series GPU, the last pip install command will look like this instead:

pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm5.6

and if you are on the older Navi (5000 series) cards, it will be this:

pip install torch==1.13.1+rocm5.2 torchvision==0.14.1+rocm5.2 --index-url https://download.pytorch.org/whl/rocm5.2

Make sure to run only one of the pip install commands - the one for your particular GPU. Running it will uninstall the previous version of torch and some other modules and replace them with the ROCm builds.
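After the install, you can confirm the ROCm build of torch actually landed by checking the local build tag in torch.__version__ (e.g. "2.0.1+rocm5.4.2"). A small sketch of just the tag parsing - the helper name is mine, not part of any API:

```python
# Hypothetical helper: extract the ROCm version suffix from a torch version
# string such as "2.0.1+rocm5.4.2". Returns None for non-ROCm builds.
def rocm_tag(version: str) -> "str | None":
    _, _, local = version.partition("+")   # local build tag after the "+"
    return local.removeprefix("rocm") if local.startswith("rocm") else None

# Inside the activated environment you would pass it torch.__version__:
#   import torch
#   print(rocm_tag(torch.__version__))
print(rocm_tag("2.0.1+rocm5.4.2"))  # → 5.4.2
print(rocm_tag("2.0.1+cu118"))      # → None (a CUDA build slipped through)
```

If this prints None, one of the pip installs above was skipped or overwritten by a later `pip install -r requirements.txt`.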

Then finally, run as normal:

python3 MMVCServerSIO.py -p 18888 --https true \
    --content_vec_500 pretrain/checkpoint_best_legacy_500.pt  \
    --content_vec_500_onnx pretrain/content_vec_500.onnx \
    --content_vec_500_onnx_on true \
    --hubert_base pretrain/hubert_base.pt \
    --hubert_base_jp pretrain/rinna_hubert_base_jp.pt \
    --hubert_soft pretrain/hubert/hubert-soft-0d54a1f4.pt \
    --nsf_hifigan pretrain/nsf_hifigan/model \
    --crepe_onnx_full pretrain/crepe_onnx_full.onnx \
    --crepe_onnx_tiny pretrain/crepe_onnx_tiny.onnx \
    --rmvpe pretrain/rmvpe.pt \
    --model_dir model_dir \
    --samples samples.json

And if you want to pipe your audio through a virtual audio device with PipeWire, you can create a "virtual audio cable" with the following command:

pw-loopback \
  --capture-props='media.class=Audio/Sink node.name=al_speaker node.description="Audiolink Speaker"' \
  --playback-props='media.class=Audio/Source node.name=al_mic node.description="Audiolink Mic"' \
  &

This will create "Audiolink Speaker" and "Audiolink Mic" devices. Pipe the voice-changer audio through the speaker, and set the microphone as the input device in your application, e.g. Discord.

Credits

stable-diffusion-webui for pip install ROCm commands
audiolink

@LuisArtDavila
Author

I forgot to mention, but make sure that you run:

$ export HSA_OVERRIDE_GFX_VERSION=10.3.0

before starting the server each time, or it will crash and become unresponsive.
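If you'd rather not remember the export every time, one option is to drop it into the conda environment's activation hook so it is applied on every `conda activate`. A sketch, assuming an env named vcclient-dev in the default per-user location (when the env is already active, CONDA_PREFIX points at the right place on its own):

```shell
# Persist HSA_OVERRIDE_GFX_VERSION in the env's activate.d hook.
# The fallback path is an assumption for illustration; adjust to your setup.
PREFIX="${CONDA_PREFIX:-$HOME/.conda/envs/vcclient-dev}"
mkdir -p "$PREFIX/etc/conda/activate.d"
echo 'export HSA_OVERRIDE_GFX_VERSION=10.3.0' > "$PREFIX/etc/conda/activate.d/rocm_override.sh"
```

Conda sources every script in activate.d on activation, so the override will be set before the server starts.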

@w-okada
Owner

w-okada commented Sep 20, 2023

Great work!
There were others who also wanted to use ROCm, so I believe this information is very beneficial. I would appreciate it if you could make a pull request.

@TheTrustedComputer

TheTrustedComputer commented Sep 22, 2023

I can confirm this setup works with my 5500 XT on Linux; however, this card uses an older GFX version (10.1.x), which this build of PyTorch apparently doesn't like.

"hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"

@LuisArtDavila
Author

> I can confirm this setup works with my 5500 XT on Linux; however, this card uses an older GFX version (10.1.x), which this build of PyTorch apparently doesn't like.
>
> "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"

Hmm, I'm a little confused about what you mean by this. It works if you run all the commands (e.g. the export and the pip install for your 5000 series card, meaning you install torch 1.13.1), but normally it wouldn't?

@TheTrustedComputer

TheTrustedComputer commented Sep 24, 2023

> > I can confirm this setup works with my 5500 XT on Linux; however, this card uses an older GFX version (10.1.x), which this build of PyTorch apparently doesn't like.
> >
> > "hipErrorNoBinaryForGpu: Unable to find code object for all current devices!"
>
> Hmm, I'm a little confused about what you mean by this. It works if you run all the commands (e.g. the export and pip install for your 5000 series card, meaning you install torch 1.13.1) but normally it wouldn't?

What I mean is that when I set the environment variable HSA_OVERRIDE_GFX_VERSION to 10.1.0 or 10.1.2, I get this error message when starting the voice changer. Changing it to 10.2.0 removes the error, but I don't see any of my cards listed. 10.3.0 works, and the output from radeontop shows my card is being utilized (I have a dual-GPU setup, by the way). I hope this clears up any confusion.

Furthermore, when launching the voice changer from the terminal and starting it for the first time from the web browser (not just stopping and restarting it), I get this warning:

MIOpen(HIP): Warning [SQLiteBase] Missing system database file: gfx1030_11.kdb Performance may degrade. Please follow instructions to install: https://github.com/ROCmSoftwarePlatform/MIOpen#installing-miopen-kernels-package

However, my package manager (pacman) says it is installed system-wide. I don't know how to remove the warning, as the instructions target Ubuntu and Ubuntu-based distributions. Arch does have this package, but installing it doesn't remove the warning.

extra/miopen-hip 5.6.1-1 [installed]
    AMD's Machine Intelligence Library (HIP backend)

@GatienDoesStuff

GatienDoesStuff commented Sep 24, 2023

I've been in touch with TheTrustedComputer, and I might have been misleading due to my lack of prior research. I did some more digging, though, and here's the deal:

The issue with AMD's compute stack (which might just be a packaging issue) is that they tend to build binaries that only target some cards, and you can only use a given GPU if both PyTorch and the local ROCm libraries were built for it.

As an example, whatever Arch ships for rocBLAS (one of the ROCm libraries) doesn't have that many targets, meaning that for most cards you have to override to the closest target the library was built for. I suspect this is the case for other packages too.

On my setup, with a gfx1035 card, I can't run PyTorch, because neither my local ROCm libraries nor PyTorch were built for it. They were built for gfx1030, though, and since my card is close enough to it, HSA_OVERRIDE_GFX_VERSION=10.3.0 just works.

This is why the override is required: most installations don't ship "fat" binaries that support all targets, unlike CUDA, which has a different mechanism that makes supporting all of its targets easier.

Finding the right override can be a pain, though; I'm not sure how to document it well.
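One heuristic that matches the cards in this thread: take your card's gfx target (e.g. from `rocminfo | grep gfx`), read gfx&lt;major&gt;&lt;minor&gt;&lt;stepping&gt; as "&lt;major&gt;.&lt;minor&gt;.&lt;stepping&gt;", and zero the stepping to land on a target the libraries actually ship (gfx1035 → 10.3.0 here). A sketch of just that string translation - the helper name and the zeroing rule are my assumptions, not an official mapping:

```python
# Hypothetical helper: turn a ROCm gfx target name into an
# HSA_OVERRIDE_GFX_VERSION string. Zeroing the stepping (gfx1035 -> 10.3.0)
# is a heuristic that worked in this thread, not an official AMD mapping.
def gfx_to_override(gfx: str, zero_stepping: bool = True) -> str:
    digits = gfx.removeprefix("gfx")                  # e.g. "1035"
    major, minor, stepping = digits[:-2], digits[-2], digits[-1]
    if zero_stepping:
        stepping = "0"
    # stepping can be a hex digit on some targets (e.g. gfx90a)
    return f"{int(major)}.{int(minor)}.{int(stepping, 16)}"

print(gfx_to_override("gfx1030"))                       # → 10.3.0
print(gfx_to_override("gfx1035"))                       # → 10.3.0
print(gfx_to_override("gfx1035", zero_stepping=False))  # → 10.3.5
```

If the zeroed value doesn't work either, the targets printed by AMD_LOG_LEVEL=1 (see the edit below) tell you which gfx versions your libraries were actually built for.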

Edit: It's weird that the "Unable to find code object" errors only show up in some cases and it just segfaults in others, but when AMD_LOG_LEVEL=1 is set, the logging does tell you what the issue is and also gives you the targets the software you are running was built for.

EDIT: It seems like the torch ROCm package is self-sufficient, and the host libraries don't have much to do with it.

@Xanderman27

Is this process similar on Windows? I would love to get ROCm working on my 7800 XT. Would a translation layer be necessary?

@GatienDoesStuff

> Is this process similar to windows? I would love to get ROCm working on my 7800xt. Would a translation layer be necessary?

You need to wait for a PyTorch ROCm backend to land on Windows.

Most of the ROCm libraries are already ported, but a few are still missing (MIOpen, I think) before PyTorch can work there.

@ProFFs

ProFFs commented Sep 26, 2023

Where can I find MMVCServerSIO.py?

@LuisArtDavila
Author

> Is this process similar to windows? I would love to get ROCm working on my 7800xt. Would a translation layer be necessary?

You might be able to use WSL to get this to work, but I have not tried it myself. Let me know how it goes if you give it a shot - I might be able to help if you run into any issues.

> Where me find Mmvcserversio.py?

It is located within the "server" folder.

@ProFFs

ProFFs commented Sep 28, 2023

> > Is this process similar to windows? I would love to get ROCm working on my 7800xt. Would a translation layer be necessary?
>
> You might be able to use WSL to get this to work but I have not tried it myself. Let me know how it goes if you give it a shot - I might be able to help if you run into any issues.
>
> > Where me find Mmvcserversio.py?
>
> It is located within the "server" folder.

Is this folder present in the Windows version?

@w-okada w-okada changed the title ROCm (AMD GPU) Support on Linux Guide [Knowledge] ROCm (AMD GPU) Support on Linux Guide Oct 6, 2023
@w-okada w-okada added the knowledge knowledge label Oct 6, 2023
@ALEX5402

Can you tell me what version I should use for a Vega 8 GPU?


@EmiliaTheGoddess

EmiliaTheGoddess commented Oct 25, 2023

For Polaris users (RX 580, RX 590, etc.) on Arch Linux, the HIP binaries provided by the official repositories don't work. Here's a small guide for users of older cards:

  • Uninstall any ROCm and HIP binary you have.
  • Install opencl-amd and opencl-amd-dev from the AUR (CAUTION: this takes a lot of disk space; make sure you have 20-30 GiB free before attempting to install.)
  • Install python-pytorch-rocm from the official repositories.
  • Go to the voice changer server directory and create a virtual environment with python -m venv venv --system-site-packages. Make sure to delete the existing venv if you have one.
  • Activate the virtual environment with source venv/bin/activate
  • Verify PyTorch is working by opening a Python shell and typing these two lines:
>>> import torch
>>> torch.cuda.is_available()

If you see True there, you're good to go.

  • Install the requirements with pip install -r requirements.txt
  • Install pyworld; it's not in the requirements file.
  • As of writing this, fairseq has a bug with Python 3.11. There's a fix made by @One-sixth at Dataclass error while importing Fairseq in Python 3.11 facebookresearch/fairseq#5012 (comment). Install it with pip install git+https://github.com/One-sixth/fairseq.git. This will probably be fixed in the future, so try the next step before this one; if it fails, install the patched version of fairseq.
  • Try launching the server with python MMVCServerSIO.py

Note: onnxruntime may print errors like Failed to load library libonnxruntime_providers_cuda.so. Don't worry about that; it still works.
Please correct me if I made any mistakes.

@YourSandwich

Thank you, this worked, although I had to install some Python modules manually, since the current requirements.txt file will download NVIDIA packages.

Also, on Arch Linux I compiled Python 3.10, since 3.11.5 is incompatible with onnx.
I am using an RX 7900 XT.

@YourSandwich

I can't record from the mic successfully because of this issue: NotSupportedError: AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported.

@YourSandwich

I have an input sample rate of 48k and the models are 40k; am I missing some modules?

@YourSandwich

It was a Firefox issue.
