[regression][ffmpeg-vaapi][vp8d] decode output corrupt #1028

Closed
uartie opened this issue Aug 11, 2020 · 27 comments
Comments

@uartie
Contributor

uartie commented Aug 11, 2020

We are observing this issue only on gen 9.5 (KBL/CFL/WHL/CML ...). However, some of our gen9.5 platforms don't exhibit this problem, so maybe a SKU or kernel difference is triggering the bad driver behavior?

$ ffmpeg -hwaccel vaapi -init_hw_device vaapi=hw:/dev/dri/renderD128 \
  -hwaccel_flags allow_profile_mismatch -filter_hw_device hw -v verbose \
  -i ./vp8_rev025_CoeffSkip_10_CapitolRecordsNight0.webm \
  -pix_fmt yuv420p -f rawvideo -vsync passthrough -autoscale 0 \
  -vframes 2 -y threads-default.yuv

$ md5sum threads-default.yuv
318bbd5defb50471c1fe5665f2a70c5a  threads-default.yuv

The second frame in threads-default.yuv output is corrupt:

[Image: threads-default-frame-2]

However, if we add -threads 1 to the ffmpeg pipeline, then we get the expected output:

$ ffmpeg -hwaccel vaapi -init_hw_device vaapi=hw:/dev/dri/renderD128 \
  -hwaccel_flags allow_profile_mismatch -filter_hw_device hw -v verbose \
  -threads 1 -i ./vp8_rev025_CoeffSkip_10_CapitolRecordsNight0.webm \
  -pix_fmt yuv420p -f rawvideo -vsync passthrough -autoscale 0 \
  -vframes 2 -y threads-1.yuv

$ md5sum threads-1.yuv
0da047495680366fc428d120368f35e4  threads-1.yuv

[Image: threads-1-frame-2]

We have many other cases that produce this kind of corrupt behavior, too.
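
To pin down which frame diverges between the two outputs, the raw files can be hashed one frame at a time. The following is a minimal sketch, not part of the original report: the helper names frame_md5 and first_diff_frame are hypothetical, and the caller must supply the frame size in bytes (for yuv420p with even dimensions that is width * height * 3 / 2; the clip's resolution is not stated in this thread).

```shell
#!/bin/sh
# Hypothetical helpers (not from the original report).

# frame_md5 FILE FRAME_BYTES INDEX
# Print the md5 of the INDEX-th (0-based) frame of a raw video file.
frame_md5() {
    dd if="$1" bs="$2" skip="$3" count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

# first_diff_frame FILE_A FILE_B FRAME_BYTES NFRAMES
# Print the 0-based index of the first frame whose md5 differs, or "none".
first_diff_frame() {
    i=0
    while [ "$i" -lt "$4" ]; do
        a=$(frame_md5 "$1" "$3" "$i")
        b=$(frame_md5 "$2" "$3" "$i")
        if [ "$a" != "$b" ]; then
            echo "$i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "none"
}
```

For example, with W and H set to the clip's width and height, first_diff_frame threads-default.yuv threads-1.yuv "$((W * H * 3 / 2))" 2 would be expected to print 1 if only the second frame is corrupt.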

@uartie
Contributor Author

uartie commented Aug 11, 2020

This is a regression that started with c383f7e:

commit c383f7ef2d65f523b8b8c70a71c94a296541d5be
Author: XiaogangLi <xiaogang.li@intel.com>
Date:   Fri Jul 17 22:46:42 2020 +0800

    [Media Common] Change SW swizzling to HW copy in lock surface
    
    Use HW copy to improve performance for decode only/encode only and VPP cases.
    
    Change-Id: Ibf7e68c2224598f742e6c869204cf0f95dd3e852

cc: @Xiaogangli-intel

@uartie uartie changed the title [ffmpeg-vaapi][vp8d] decode output corrupt [regression][ffmpeg-vaapi][vp8d] decode output corrupt Aug 11, 2020
@Xiaogangli-intel
Contributor

Xiaogangli-intel commented Aug 12, 2020

@uartie This change only impacts platforms that use SW swizzling (should be >Gen11). So I don't think this issue is caused by this change, because gen 9.5 (KBL/CFL/WHL/CML ...) platforms do not use SW swizzling.

@uartie
Contributor Author

uartie commented Aug 12, 2020

> @uartie This change only impacts the platform which uses SW swizzling(should be >Gen11). So I don't think this issue caused by this change because gen 9.5 (KBL/CFL/WHL/CML ...) are not using SW swizzling.

I checked several different KBL, CFL and WHL package SKUs. Some SKUs did not reproduce the issue, but the others reproduce it reliably. I also double-checked before and after the patch, and the regression always appears after the patch.

@uartie
Contributor Author

uartie commented Aug 12, 2020

For example, an i7-7700 (KBL) I tested reproduces the regression 100% of the time, but it is fine before the patch.

@uartie
Contributor Author

uartie commented Aug 12, 2020

Interestingly, on my local KBL Fedora OS development machine, it does not regress before or after this commit. But, if I compile and test in an Ubuntu Bionic Docker container on the same machine, I see the regression after this commit, but not before. Hope this gives some clues.

@uartie
Contributor Author

uartie commented Aug 12, 2020

> Interestingly, on my local KBL Fedora OS development machine, it does not regress before or after this commit. But, if I compile and test in an Ubuntu Bionic Docker container on the same machine, I see the regression after this commit, but not before. Hope this gives some clues.

Another observation: with a Debug build, the issue goes away in the container, too.

@uartie
Contributor Author

uartie commented Aug 12, 2020

> @uartie This change only impacts the platform which uses SW swizzling(should be >Gen11). So I don't think this issue caused by this change because gen 9.5 (KBL/CFL/WHL/CML ...) are not using SW swizzling.

But this change does modify some of the code paths taken by gen9.5, right? Perhaps the change exposed a previously dormant multi-thread race condition? Or maybe introduced one by accident?

@uartie
Contributor Author

uartie commented Aug 31, 2020

@Xiaogang-Li @feiwan1 any update?

@Xiaogangli-intel
Contributor

@uartie, could you try this patch #1048?

@uartie
Contributor Author

uartie commented Sep 16, 2020

> @uartie, could you try this patch #1048?

@Xiaogang-Li yes #1048 appears to resolve this issue. Thanks!

@dvrogozh
Contributor

@uartie : is this observation specific to vp8d, or did you see the same on vp9/av1? And did you ever see it on avc/hevc?

@uartie
Contributor Author

uartie commented Sep 17, 2020

> @uartie : is this observation specific to vp8d only or you saw the same on vp9/av1? and did you ever see that on avc/hevc?

Only vp8d using ffmpeg-vaapi + iHD as far as I've observed over ~2 months since this regression started.

@uartie
Contributor Author

uartie commented Sep 17, 2020

> @uartie : is this observation specific to vp8d only or you saw the same on vp9/av1? and did you ever see that on avc/hevc?
>
> Only vp8d using ffmpeg-vaapi + iHD as far as I've observed over ~2 months since this regression started.

Also, note that adding -threads 1 to ffmpeg-vaapi does not exhibit this issue.

wangyan42164 added a commit to wangyan42164/media-driver that referenced this issue Sep 18, 2020
DdiMediaUtil_CreateBuffer()->DdiMediaUtil_AllocateBuffer() will
use it.

Fixes intel#1028.

Signed-off-by: Yan Wang <yan.wang@linux.intel.com>
@wangyan42164
Contributor

@uartie I don't have your VP8 clip, so I cannot reproduce this issue on my local platform. But I found that Xiaogang's patch may affect the creation of one VP8-specific buffer (VAProbabilityBufferType): bUseSysGfxMem is not initialized but is used during creation. As you said previously, the debug build does not have this issue. In a debug build the value should be zero/false, but in a release build it is undefined.
If this issue still exists after applying #1053, I will do more debugging next week.
Thanks.

@wangyan42164
Contributor

I checked it again. MOS_AllocAndZeroMemory should zero-fill the memory. Please ignore it. Sorry about that.

@wangyan42164
Contributor

I will continue to investigate it in the next week. Please provide the clip.

@uartie
Contributor Author

uartie commented Sep 18, 2020

@wangyan42164 you can easily reproduce this issue with the webm vp8 decode test vectors here: https://www.webmproject.org/code/

Here is a basic script that decodes and compares frame md5 differences on those cases. Just clone the test vectors repository (https://chromium.googlesource.com/webm/vp8-test-vectors), rename this script to test-vp8-vectors.sh, and run the script relative to the directory where you cloned the repo.

test-vp8-vectors.txt

On my machine, with -threads 1 added to the script all md5 results match. But without -threads 1, I get:

Files vp80-00-comprehensive-001.ivf.actual and vp80-00-comprehensive-001.ivf.expect differ
Files vp80-00-comprehensive-002.ivf.actual and vp80-00-comprehensive-002.ivf.expect differ
Files vp80-00-comprehensive-003.ivf.actual and vp80-00-comprehensive-003.ivf.expect are identical
Files vp80-00-comprehensive-004.ivf.actual and vp80-00-comprehensive-004.ivf.expect differ
Files vp80-00-comprehensive-005.ivf.actual and vp80-00-comprehensive-005.ivf.expect differ
Files vp80-00-comprehensive-006.ivf.actual and vp80-00-comprehensive-006.ivf.expect differ
Files vp80-00-comprehensive-007.ivf.actual and vp80-00-comprehensive-007.ivf.expect are identical
Files vp80-00-comprehensive-008.ivf.actual and vp80-00-comprehensive-008.ivf.expect differ
Files vp80-00-comprehensive-009.ivf.actual and vp80-00-comprehensive-009.ivf.expect differ
Files vp80-00-comprehensive-010.ivf.actual and vp80-00-comprehensive-010.ivf.expect differ
Files vp80-00-comprehensive-011.ivf.actual and vp80-00-comprehensive-011.ivf.expect differ
Files vp80-00-comprehensive-012.ivf.actual and vp80-00-comprehensive-012.ivf.expect differ
Files vp80-00-comprehensive-013.ivf.actual and vp80-00-comprehensive-013.ivf.expect differ
Files vp80-00-comprehensive-014.ivf.actual and vp80-00-comprehensive-014.ivf.expect differ
Files vp80-00-comprehensive-015.ivf.actual and vp80-00-comprehensive-015.ivf.expect differ
Files vp80-00-comprehensive-016.ivf.actual and vp80-00-comprehensive-016.ivf.expect differ
Files vp80-00-comprehensive-017.ivf.actual and vp80-00-comprehensive-017.ivf.expect are identical
Files vp80-00-comprehensive-018.ivf.actual and vp80-00-comprehensive-018.ivf.expect differ
Files vp80-01-intra-1400.ivf.actual and vp80-01-intra-1400.ivf.expect are identical
Files vp80-01-intra-1411.ivf.actual and vp80-01-intra-1411.ivf.expect are identical
Files vp80-01-intra-1416.ivf.actual and vp80-01-intra-1416.ivf.expect are identical
Files vp80-01-intra-1417.ivf.actual and vp80-01-intra-1417.ivf.expect are identical
Files vp80-02-inter-1402.ivf.actual and vp80-02-inter-1402.ivf.expect differ
Files vp80-02-inter-1412.ivf.actual and vp80-02-inter-1412.ivf.expect differ
Files vp80-02-inter-1418.ivf.actual and vp80-02-inter-1418.ivf.expect differ
Files vp80-02-inter-1424.ivf.actual and vp80-02-inter-1424.ivf.expect differ
Files vp80-03-segmentation-01.ivf.actual and vp80-03-segmentation-01.ivf.expect are identical
Files vp80-03-segmentation-02.ivf.actual and vp80-03-segmentation-02.ivf.expect are identical
Files vp80-03-segmentation-03.ivf.actual and vp80-03-segmentation-03.ivf.expect are identical
Files vp80-03-segmentation-04.ivf.actual and vp80-03-segmentation-04.ivf.expect are identical
Files vp80-03-segmentation-1401.ivf.actual and vp80-03-segmentation-1401.ivf.expect are identical
Files vp80-03-segmentation-1403.ivf.actual and vp80-03-segmentation-1403.ivf.expect differ
Files vp80-03-segmentation-1407.ivf.actual and vp80-03-segmentation-1407.ivf.expect differ
Files vp80-03-segmentation-1408.ivf.actual and vp80-03-segmentation-1408.ivf.expect differ
Files vp80-03-segmentation-1409.ivf.actual and vp80-03-segmentation-1409.ivf.expect differ
Files vp80-03-segmentation-1410.ivf.actual and vp80-03-segmentation-1410.ivf.expect differ
Files vp80-03-segmentation-1413.ivf.actual and vp80-03-segmentation-1413.ivf.expect differ
Files vp80-03-segmentation-1414.ivf.actual and vp80-03-segmentation-1414.ivf.expect are identical
Files vp80-03-segmentation-1415.ivf.actual and vp80-03-segmentation-1415.ivf.expect are identical
Files vp80-03-segmentation-1425.ivf.actual and vp80-03-segmentation-1425.ivf.expect differ
Files vp80-03-segmentation-1426.ivf.actual and vp80-03-segmentation-1426.ivf.expect differ
Files vp80-03-segmentation-1427.ivf.actual and vp80-03-segmentation-1427.ivf.expect differ
Files vp80-03-segmentation-1432.ivf.actual and vp80-03-segmentation-1432.ivf.expect differ
Files vp80-03-segmentation-1435.ivf.actual and vp80-03-segmentation-1435.ivf.expect differ
Files vp80-03-segmentation-1436.ivf.actual and vp80-03-segmentation-1436.ivf.expect are identical
Files vp80-03-segmentation-1437.ivf.actual and vp80-03-segmentation-1437.ivf.expect differ
Files vp80-03-segmentation-1441.ivf.actual and vp80-03-segmentation-1441.ivf.expect differ
Files vp80-03-segmentation-1442.ivf.actual and vp80-03-segmentation-1442.ivf.expect differ
Files vp80-04-partitions-1404.ivf.actual and vp80-04-partitions-1404.ivf.expect differ
Files vp80-04-partitions-1405.ivf.actual and vp80-04-partitions-1405.ivf.expect differ
Files vp80-04-partitions-1406.ivf.actual and vp80-04-partitions-1406.ivf.expect differ
Files vp80-05-sharpness-1428.ivf.actual and vp80-05-sharpness-1428.ivf.expect differ
Files vp80-05-sharpness-1429.ivf.actual and vp80-05-sharpness-1429.ivf.expect differ
Files vp80-05-sharpness-1430.ivf.actual and vp80-05-sharpness-1430.ivf.expect differ
Files vp80-05-sharpness-1431.ivf.actual and vp80-05-sharpness-1431.ivf.expect differ
Files vp80-05-sharpness-1433.ivf.actual and vp80-05-sharpness-1433.ivf.expect differ
Files vp80-05-sharpness-1434.ivf.actual and vp80-05-sharpness-1434.ivf.expect differ
Files vp80-05-sharpness-1438.ivf.actual and vp80-05-sharpness-1438.ivf.expect differ
Files vp80-05-sharpness-1439.ivf.actual and vp80-05-sharpness-1439.ivf.expect differ
Files vp80-05-sharpness-1440.ivf.actual and vp80-05-sharpness-1440.ivf.expect differ
Files vp80-05-sharpness-1443.ivf.actual and vp80-05-sharpness-1443.ivf.expect differ
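
A long listing like the one above is easier to track across driver builds as a tally. The following is a minimal sketch, assuming each line has the "Files A and B differ" or "Files A and B are identical" form shown above (the messages GNU diff prints with -q and -s); the function name summarize_diff is hypothetical and is not taken from the attached script.

```shell
#!/bin/sh
# summarize_diff: read comparison lines on stdin and print a tally.
# Hypothetical helper; lines that match neither pattern are ignored.
summarize_diff() {
    awk '
        / differ$/        { differ++ }
        / are identical$/ { same++ }
        END { printf "differ=%d identical=%d\n", differ, same }
    '
}
```

Piping the script's output through summarize_diff once with the default thread count and once with -threads 1 gives two one-line results that can be compared directly.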

@uartie
Contributor Author

uartie commented Sep 18, 2020

> @wangyan42164 you can easily reproduce this issue with the webm vp8 decode test vectors here: https://www.webmproject.org/code/
>
> Here is a basic script that decodes and compares frame md5 differences on those cases. Just clone the test vectors repository (https://chromium.googlesource.com/webm/vp8-test-vectors), rename this script to test-vp8-vectors.sh, and run the script relative to the directory where you cloned the repo.
>
> test-vp8-vectors.txt
>
> On my machine, with -threads 1 added to the script all md5 results match. But without -threads 1, I get:

Oddly, though, I see the same behavior with this test set even before commit c383f7e or after applying #1048... so perhaps this has been a latent issue all along.

@dvrogozh
Contributor

Right. Most likely c383f7e just better exposed an existing issue. But considering that both c383f7e and #1048 patch common code unrelated to vp8d, we can't actually say that only vp8d is affected.

@dvrogozh
Contributor

That's interesting. Actually I see that ffmpeg-vaapi creates 2 VAAPI contexts on vp8 decoding with -threads 2 (I don't see such behavior on AVC...):

  1. Create first context
  2. Decode 1st frame
  3. Destroy first context
  4. Create second context
  5. Decode 2nd+ frames
  6. Destroy second context

This does not seem to be right. @xhaihao : can you, please, comment how that can be? Do we face some ffmpeg bug?
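
One way to check the context lifecycle listed above is to count create and destroy events in a libva trace. This is a sketch under stated assumptions: libva can write a trace when the LIBVA_TRACE environment variable names an output file prefix, but the exact text of the trace lines depends on the libva version, so the patterns CreateContext and DestroyContext below are assumptions to verify against your own trace. The function name count_context_events is hypothetical.

```shell
#!/bin/sh
# count_context_events TRACE_FILE
# Count lines mentioning context creation and destruction in a trace file.
# The two patterns are assumptions about the trace text, not a documented format.
count_context_events() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    created=$(grep -c 'CreateContext' "$1" || true)
    destroyed=$(grep -c 'DestroyContext' "$1" || true)
    echo "created=$created destroyed=$destroyed"
}
```

If each event is logged on exactly one matching line, the sequence described above would show up as two contexts created and two destroyed for a single decode run.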

@XinfengZhang
Contributor

@dvrogozh it is an old ffmpeg-vaapi vp8 decoder issue. We have already discussed it a lot; ffmpeg should fix it inside ffmpeg.
There was some discussion of a fix at https://patchwork.ffmpeg.org/project/ffmpeg/patch/1533798542-2494-1-git-send-email-mypopydev@gmail.com/ but the patch was not accepted, so we have a WA inside the driver to handle this case.

@dvrogozh
Contributor

dvrogozh commented Sep 21, 2020

That's nasty:(. Considering that ffmpeg-vp8 creates 2 configs and driver tries to handle this inside, this means that driver should kind of restore first context inside a second one. Did we miss some key settings from first context which lead to artifacts?

Can you point me to the WA code inside the driver?

That's 6aed7c3.

@uartie
Contributor Author

uartie commented Sep 21, 2020

I tried the ffmpeg patch and it fixes this issue on my side. I see the ffmpeg patch was superseded by https://patchwork.ffmpeg.org/project/ffmpeg/patch/1560235949-29164-1-git-send-email-shaofei.wang@intel.com/ and its review has stalled.

@xhaihao @sfeiwong @mypopydev @feiwan1 please help to get this ffmpeg patch reviewed so it can be merged.

@uartie
Contributor Author

uartie commented Sep 23, 2020

ffmpeg patch has been merged (https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/eb6bb8f32fdc9be89cce65869dce9dd950e91be2)

I will close after I test it again (assuming it works 😉).

@uartie
Contributor Author

uartie commented Sep 23, 2020

Also related #190

@uartie
Contributor Author

uartie commented Sep 23, 2020

@uartie uartie closed this as completed Sep 23, 2020
@dalcombright

I am running into this issue now while using an i7-11700k.

[hevc_vaapi @ 0x55cc823a8480] Failed to end picture encode issue: 24 (internal encoding error).
[hevc_vaapi @ 0x55cc823a8480] Encode failed: -5.

or I get this in most of my conversions:

[h264 @ 0x559bb0a4a740] Failed to end picture decode issue: 23 (internal decoding error).
[h264 @ 0x559bb0a4a740] hardware accelerator failed to decode picture

ffmpeg -hide_banner -loglevel info -init_hw_device vaapi=vaapi0:/dev/dri/renderD128 -hwaccel vaapi -hwaccel_output_format vaapi -hwaccel_device vaapi0 -i /library/tv/[FILE].mkv -strict -2 -max_muxing_queue_size 2054 -filter_hw_device vaapi0 -vf format=nv12|vaapi,hwupload -map 0:v:0 -map 0:a:0 -map 0:s:0 -map 0:s:1 -c:v:0 hevc_vaapi -c:a:0 copy -c:s:0 copy -c:s:1 copy -y /tmp/unmanic/unmanic_file_conversion-1640504463.512407/[FILE]-WORKING-2.mkv

I wasn't able to find what release this was rolled into, but since it's over 1 year since the fix, maybe it's something else?
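
To find which releases include a given fix, git can list the tags whose history contains the commit. The following is a minimal sketch, assuming a local clone of the relevant repository (the ffmpeg fix referenced in this thread is commit eb6bb8f32fdc9be89cce65869dce9dd950e91be2); the function names tags_containing and commit_in_head are hypothetical.

```shell
#!/bin/sh
# tags_containing REPO_DIR COMMIT
# List the tags in REPO_DIR whose history includes COMMIT.
tags_containing() {
    git -C "$1" tag --contains "$2"
}

# commit_in_head REPO_DIR COMMIT
# Succeed if COMMIT is an ancestor of the currently checked-out HEAD.
commit_in_head() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD
}
```

An empty result from tags_containing means no tagged release in that clone includes the commit yet; commit_in_head answers the same question for the tree a local build was made from.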
