Quicksync HEVC bugs and problems #1558

Open
SceneCityDev opened this issue Mar 18, 2024 · 8 comments
@SceneCityDev

Hi,

I am trying to transcode an HEVC stream that arrives at the origin via SRT, and re-publish it encoded as both HEVC and AVC.

In theory this works fine.

However, first of all there is this bug in decoder_hevc_qsv.cpp:

const AVCodec *_codec = ::avcodec_find_decoder_by_name("h265_qsv");
The codec is called "hevc_qsv", not "h265_qsv".

After fixing this, HEVC decoding and everything works.

However, there is one thing that is really strange: HEVC decoding uses about ten times as much CPU as encoding. To decode the single incoming HEVC stream, four threads each use roughly 33-45% CPU:

1091919 root        20   0 2018M  751M 48848 R  44.8  2.4  0:29.02 DechevcQsv
1091910 root        20   0 2018M  751M 48848 R  44.1  2.4  0:23.88 DechevcQsv
1091902 root        20   0 2018M  751M 48848 R  42.8  2.4  0:22.94 DechevcQsv
1091897 root        20   0 2018M  751M 48848 S  32.8  2.4  0:17.79 DechevcQsv

However, encoding only takes a single thread with 14% CPU, as you would expect with qsv:

1091914 root        20   0 2018M  751M 48848 S  14.7  2.4  0:08.19 Ench264Qsv

I looked at the code, but cannot find any reason why this happens.

If I disable qsv for decode and enable it only for encode, it looks like this:

1092039 root        20   0 1745M  487M 47808 R 100.0  1.5  0:28.97 Dechevc
1092044 root        20   0 1745M  487M 47808 S  20.7  1.5  0:06.12 Dechevc
1092053 root        20   0 1745M  487M 47808 S  20.0  1.5  0:05.84 Dechevc
1092048 root        20   0 1745M  487M 47808 S   6.7  1.5  0:01.87 Ench264Qsv
1092057 root        20   0 1745M  487M 47808 S   5.3  1.5  0:01.60 EnchevcQsv

So, as you can see, HEVC decode now runs in software and only uses 3 threads, but it costs about as much CPU as DechevcQsv did with four threads.

It looks like qsv decode is not used no matter what, even when specifying the correct codec name hevc_qsv.

Please note: I am using the Arch Linux ovenmediaengine 0.16.5-1 package, which uses its own ffmpeg build. But the build flags look identical to the ones you are using.

Sadly, none of my machines running RHEL/Rocky has an iGPU. Have you patched ffmpeg anywhere so that a "h265_qsv" decoder exists and works properly, or is this untested?

And a related question: I have read that the qsv codecs accessed through va-api might be faster and better than when using qsv directly, as it may use Intel-provided non-free codec blobs (libmfx and libvpl), but I am not sure if that is true. Have you considered supporting va-api? From what I understand, all you would need to do is enable this in libva, similar to
-hwaccel vaapi -hwaccel_device /dev/dri/renderD128 in ffmpeg.

Thank you very much for your great work!

@SceneCityDev
Author

I have just noticed that decoder_avc_qsv.cpp is setting

::av_opt_set(_context->priv_data, "gpu_copy", "on", 0);

but decoder_hevc_qsv.cpp isn't.

I'll now patch this and try to build from source on Arch Linux instead of using the package, and check if that changes anything....

@SceneCityDev
Author

I have now built OME from source including its ffmpeg. I added ::av_opt_set(_context->priv_data, "gpu_copy", "on", 0); and of course fixed the codec name. It works, but sadly is not faster (or not much).

I also built your ffmpeg version including va-api. Now I would like to test if that is faster. However, for this, I would need to pass libavcodec the equivalent of the ffmpeg parameter -hwaccel auto or -hwaccel vaapi -hwaccel_device /dev/dri/renderD128. This is necessary because with va-api the encoder has a dedicated name, "hevc_vaapi", but the decoder does not: the hardware device must be initialized first, and the codec is then simply called "hevc", because it is an internal optimization of ffmpeg, not a plugin codec.

I think that in general it might be a good idea for OME to always set -hwaccel auto, no matter what. Can't hurt I guess.

Can you give me any pointers on how/where in the OME code I could pass such an initialization parameter?

@getroot
Sponsor Member

getroot commented Mar 19, 2024

Recently we improved performance by enabling zero copy on xilinx and nvidia hardware accelerators. But we haven't done that for qsv yet, because we thought no one was using it.
We are planning to update qsv to enable zero copy, but we cannot yet estimate when that will be completed as there are too many higher priority tasks.
Regarding the HEVC decoder, the issue below may also be relevant to you.
#1548

@SceneCityDev
Author

Maybe I should not have put all of these subjects into one issue, but you definitely need to correct the codec name from "h265_qsv" to "hevc_qsv" for it to work at all.

What a lot of people don't know: A $150 Mini PC with an Intel N5095 has the same Quicksync encoder as expensive Intel Core CPUs. Such a Mini PC is able to transcode 8 FHD streams in parallel. For a non-commercial project we are running an array of 8 of those. This is fast and MUCH more power-efficient than any other transcoding solution there is. If you are interested, I might do a write-up on this some time in the future. And as you know, qsv H.264 encoding has a MUCH higher quality than OpenH264.

This has worked fine for H.264 sources for a while now, but I want to support HEVC, too. This is where the problem started...

When it comes to the HEVC qsv decoder, something must be wrong. It's not normal that decoding takes 10x the CPU load of encoding. This might also be a driver bug, or a bug in libmfx, or somewhere in between. It might be related to some pixfmt issue, for example.

I'll try to patch things to use the va-api path for HEVC qsv decoding and see if that makes a difference. Not an easy task for someone who doesn't know much about the internals of libavcodec, but I'll try...

As said, at this point this is a non-commercial project. However, I would be interested in acquiring an "enterprise license" in the sense that I would like to get my hands on the missing source files to support x264 to be more flexible. Should I go through the general ovenmedia contact address or do you have a more direct point of contact?

@SceneCityDev
Author

And I am aware that qsv does have a direct transcoding path, where both decoding and encoding are done in one step on the GPU using an internal pixfmt, which is blazingly fast. But of course that is completely incompatible with the flexible pipeline of OME, so that's a no-no sadly.

@SceneCityDev
Author

I am not experienced enough with github, else I would create a pull request.

So, again, this is the important bug that should be fixed:

In decoder_hevc_qsv.cpp:

const AVCodec *_codec = ::avcodec_find_decoder_by_name("h265_qsv");

The codec is called "hevc_qsv", not "h265_qsv". So this line must be changed to:

const AVCodec *_codec = ::avcodec_find_decoder_by_name("hevc_qsv");

After fixing this, HEVC decoding with Quicksync works.
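A lookup like this fails quietly when the name is wrong, so one defensive option is to check the result and try alternatives. Below is a minimal sketch of that idea. `FirstAvailableDecoder` and `StubLookup` are hypothetical names and are not part of OME or FFmpeg; in real code the lookup argument would wrap `::avcodec_find_decoder_by_name()`, and the stub registry here only stands in for it so the sketch compiles without FFmpeg.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical lookup type. In real code this would wrap
// ::avcodec_find_decoder_by_name(), which returns nullptr for unknown names.
using DecoderLookup = std::function<const void *(const char *)>;

// Hypothetical helper: returns the first candidate name that the lookup
// resolves to a non-null decoder, and throws if none of them resolves.
std::string FirstAvailableDecoder(const std::vector<std::string> &candidates,
                                  const DecoderLookup &lookup)
{
    for (const auto &name : candidates)
    {
        if (lookup(name.c_str()) != nullptr)
        {
            return name;
        }
    }
    throw std::runtime_error("no candidate decoder is registered");
}

// Stand-in registry for illustration only (an assumption, not FFmpeg's real
// list): it knows "hevc_qsv" and the software "hevc", but not "h265_qsv".
const void *StubLookup(const char *name)
{
    static const int kDummy = 0;
    const std::string n(name);
    return (n == "hevc_qsv" || n == "hevc") ? &kDummy : nullptr;
}
```

With the stub, asking for "h265_qsv" alone throws, while the candidate list "hevc_qsv", "hevc" resolves to "hevc_qsv". Throwing (or at least logging) on a failed lookup would have made the wrong name visible immediately.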

@getroot
Sponsor Member

getroot commented Apr 4, 2024

I would appreciate it if you could post a PR about this.

@getroot getroot added the long-term fix Requires a long-term fix label Apr 4, 2024
irlkitcom added a commit to irlkitcom/OvenMediaEngine that referenced this issue Apr 5, 2024
@SceneCityDev
Author

irlkitcom: Thank you for doing this, I still have to learn how to do PRs...
