"semiplanar" YUV images support #518

Open
chaplin89 opened this issue Feb 27, 2023 · 14 comments
chaplin89 commented Feb 27, 2023

Is your feature request related to a problem? Please describe.
Disclaimer: I barely know what I'm saying.
I played around a lot with the configuration of YUView, and it seems I'm not able to find a configuration that correctly renders "semiplanar" YUV images.
I have a set of YUV frames with 4:2:0 chroma subsampling. In these frames, Y is a plane of its own, but U and V share a single plane.

The frames are 1984x1080 (that's 1080p with a stride), and the memory layout is as follows (a small sketch of how to slice it is shown after the list):

  • Y, which is a 1984x1080-byte matrix, followed by
  • U/V together in a single 1984x540-byte matrix. On each row, the first 992 bytes are U and the rest are V.
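For illustration, a minimal numpy sketch of that layout (assuming 8-bit samples and the geometry above; the file name is just a placeholder):

import numpy

STRIDE, HEIGHT = 1984, 1080            # full row length in bytes, number of luma rows
Y_SIZE = STRIDE * HEIGHT               # the Y plane occupies the first stride*height bytes
UV_ROWS = HEIGHT // 2                  # 4:2:0 -> the shared chroma plane has half the rows

raw = numpy.fromfile('single_frame.yuv', dtype=numpy.uint8, count=Y_SIZE + STRIDE * UV_ROWS)

y = raw[:Y_SIZE].reshape(HEIGHT, STRIDE)
uv = raw[Y_SIZE:].reshape(UV_ROWS, STRIDE)
u = uv[:, :STRIDE // 2]                # first half of every chroma row is U (Cb)
v = uv[:, STRIDE // 2:]                # second half of every chroma row is V (Cr)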

Is there a way to render these frames in the current implementation of YUView or is it something that can be added?

Thanks!

Describe the solution you'd like
Render a frame laid out as described above.

Describe alternatives you've considered
n.a.

@ChristianFeldmann
Member

Hi!
That sounds like a YUV format that I have not encountered yet, but there is a wild variety of specialized YUV formats out there. Do you have any specification for this YUV format? If not, can you share where it came from?
It would be great if you could provide a file in this format for me to test against for the implementation, or a way for me to create one.

@ChristianFeldmann ChristianFeldmann self-assigned this Feb 28, 2023
@chaplin89
Author

chaplin89 commented Feb 28, 2023

Hi Christian,
I didn't find any spec for this; I just found some reference here, but I'm not even sure it's the same thing.

This format is used internally by Chromium. Sample.
The format that chromium assigns is this:
PIXEL_FORMAT_I420, 12bpp YUV planar 1x1 Y, 2x2 UV samples, a.k.a. YU12.

Not sure this is correct or makes sense, though.
The following Python script is able to display the image correctly. Sorry it's not polished; I bet there are tons of better ways to do what I'm doing with numpy, but it's the first time I'm using it:

from PIL import Image
import numpy

def getyuv():
    y = []
    uv = []

    with open('single_frame.yuv', 'rb') as f:
        for i in range(0,1080):
            row = list(f.read(1984))
            y.append(row)
        for i in range(0,540):
            row = list(f.read(1984))
            uv.append(row)
    return y,uv

def convert_input(y,uv):
    output = numpy.full((1080, 1984,3), (0,0,0), dtype=numpy.uint8)
    for row in range(0,len(uv)):
        # each chroma row holds U in its first half and V in its second half
        for column in range(0, len(uv[row]) // 2):
            output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+992])
            output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+992])
    return output

y, uv = getyuv()
out = convert_input(y,uv)

# Trimming the last 64 px on each rows (garbage)
out_2 = numpy.full((1080, 1920,3), (0,0,0), dtype=numpy.uint8)
for row in range(0,len(out)):
    out_2[row] = out[row][:-64]

img = Image.fromarray(out_2, mode='YCbCr')
img.show()

@chaplin89
Author

Oddly enough, the memory dump of this image contains 541 rows in the UV matrix, which means it can't be opened directly in YUView in any case. The frame I shared in the previous comment does not contain these extra bytes.
A feature to discard a given number of bytes from the beginning or the end of each frame (and maybe also from the beginning of a file) would surely be useful when dealing with this kind of raw data.

@chaplin89
Author

UPDATE: I tried many different pix_fmt options with ffmpeg and none of them is able to decode the image either. It seems this is definitely not a common format; maybe it's just something Chromium uses internally that is not meant to be shared/stored on disk. After all, I was just trying to fix an issue in Chromium, so that may well be the case.

In any case, I built my own tooling for this. We can probably close the issue, as I don't think implementing this would bring much value to the project.

Here's a better script, in the unlikely case someone else runs into the same issue:

from PIL import Image
import numpy
import os

class Convert:
    def __init__(self, column, row, stride, filename) -> None:
        self.column = column
        self.row = row
        self.stride = stride
        self.filename = filename
        self.fpos = 0
        if os.path.exists(self.get_destination()):
            os.unlink(self.get_destination())


    def get_single_frame(self):
        y = []
        uv = []

        with open(self.filename, 'rb') as f:
            f.seek(self.fpos)  # resume from where the previous frame ended
            y = list(f.read(self.stride*self.row))
            if len(y)==0:
                return None, None
            y = numpy.array(y)
            y = y.reshape(self.row, self.stride)

            half_row = int(self.row/2)
            uv = list(f.read(self.stride*half_row))
            if len(uv)==0:
                return None, None
            uv = numpy.array(uv)
            uv = uv.reshape(half_row, self.stride)
            # Skip the extra chroma line at the end of the frame (it contains garbage)
            self.fpos = f.tell() + self.stride

        return y,uv

    def trim(self, frame):
        output = numpy.full((self.row,self.column,3), (0,0,0), dtype=numpy.uint8)
        for row in range(0,self.row):
            trimmed_row = frame[row][:self.column]  # keep only the visible columns, drop the stride padding
            output[row] = trimmed_row
        return output

    def show(self, frame):
        image = Image.fromarray(frame, mode='YCbCr')
        image.show()

    def merge_yuv(self,y,uv):
        output = numpy.full((self.row, self.stride, 3), (0,0,0), dtype=numpy.uint8)
        half_stride = int(self.stride/2)
        uv_rows = len(uv)
        for row in range(0,uv_rows):
            uv_columns = len(uv[row])
            half_uv_columns = uv_columns // 2   # U occupies the first half of each chroma row, V the second half
            for column in range(0,half_uv_columns):
                output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+half_stride])
        return output

    def get_destination(self):
        return self.filename[:self.filename.rfind('.')] + ".converted.yuv"
    
    def save(self, y,uv):
        with open(self.get_destination(), 'ab') as f:
            f.write(bytes(y.reshape(y.shape[0]*y.shape[1]).tolist()))
            u = uv[:,0:int(uv.shape[1]/2)]
            v = uv[:,int(uv.shape[1]/2):]
            f.write(bytes(u.reshape(u.shape[0]*u.shape[1]).tolist()))
            f.write(bytes(v.reshape(v.shape[0]*v.shape[1]).tolist()))


target1 = 'your_file.yuv'

# change me according to target1 file spec
a = Convert(848,480,960, target1)

i=0
y, uv = a.get_single_frame()
while y is not None:
    a.save(y,uv)
    # Show an image every 20 frame for debug purposes
    if (i+1)%20 == 0:
        frame = a.merge_yuv(y,uv)
        frame = a.trim(frame)
        a.show(frame)
    y,uv = a.get_single_frame()
    i=i+1

This will take your_file.yuv in the format described above and generate `your_file.converted.yuv`, which can be encoded to H.264 with a command like this:

ffmpeg -f rawvideo -pixel_format yuv420p -video_size 960x480 -framerate 24 -i ./your_file.converted.yuv ./your_file.mp4

The only things that need to be changed according to the video spec are the parameters of the Convert ctor (example below):

  • Cols, Rows -> the size of the visible portion of the frame
  • Stride -> the real number of columns in the frame (can be greater than or equal to Cols)
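For instance, for the 1984-byte-stride 1080p frames from the original post (visible size 1920x1080), the call would presumably be:

a = Convert(1920, 1080, 1984, 'single_frame.yuv')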

Cheers!

@ChristianFeldmann
Member

Ah, wait a second! I think we do have support for these semi-planar files, at least for some of them. In the link you provided I saw the name NV12, and that rang a bell: we do support that. So you can open the YUV file and go to YUV Format ... Custom. In the dialog you have to select the "UV(A) interleaved" checkbox. Can you try that? It may be what you are looking for.
[screenshot of the custom YUV format dialog with the "UV(A) interleaved" checkbox]

Alternatively you can put nv12 into the name of the file and YUView should apply the format based on that.
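For reference, a rough sketch of the difference between the two chroma layouts (just an illustration of a single chroma row, not YUView code):

import numpy

chroma_row = numpy.arange(8, dtype=numpy.uint8)   # one dummy chroma row of 8 bytes

# NV12 / "UV(A) interleaved": Cb and Cr alternate per sample (UVUVUV...)
nv12_u, nv12_v = chroma_row[0::2], chroma_row[1::2]

# format from this issue: the first half of the row is Cb, the second half is Cr
half = len(chroma_row) // 2
this_u, this_v = chroma_row[:half], chroma_row[half:]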

@chaplin89
Author

I think at this point I have tried every combination of custom and non-custom decoding options, but none of them displays the image correctly.
I think the option you're suggesting expects a sequence of alternating Cb and Cr values on each row of the UV matrix.
The format I'm talking about has the 1st half of each row holding Cb values and the 2nd half holding Cr values.

@ChristianFeldmann
Member

Ah, sorry, then that is not exactly the format you are looking for. That would have been too easy anyway.
All of the info you provided is already super helpful. But could you please somehow share a file in that format with me? It only has to be a few frames; that will already do.

@chaplin89
Author

Yup, I already shared it here: #518 (comment)

Pasting the link again here: https://github.com/IENT/YUView/files/10848045/single_frame.zip

@ChristianFeldmann
Member

ChristianFeldmann commented Mar 2, 2023

Ah sorry my bad I was blind. Got it!

@chaplin89
Author

No worries, YW!

@ChristianFeldmann
Member

Ok, so I looked through all the data and files, and it's still a bit strange:

  • Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.
  • The file looks like it's as you described, with interleaved Cb Cr lines. I have never seen a format like that. But there are some extra bytes that I can not account for: if the resolution is 1984x1080, then there is one line of 1984 bytes that is not accounted for (see the arithmetic below).
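Rough arithmetic for the sizes involved (assuming 8-bit samples):

stride, height = 1984, 1080

expected = stride * height + stride * (height // 2)   # Y plane + shared chroma plane
# expected = 2_142_720 + 1_071_360 = 3_214_080 bytes per frame
# with one extra chroma line on top: 3_214_080 + 1_984 = 3_216_064 bytes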

I have still not found any documentation of a format like this anywhere. I mean, there are all sorts of strange YUV formats out there.

@chaplin89
Author

Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.

Yes, I'm sure. And yes, the Chromium code "apparently" supports normal planar YUV files. However, if you debug the media part, you'll find out that this is just illusory. A YUV frame in Chrome is represented as a contiguous memory area in which:

  • the first plane starts at offset 0,
  • the second plane starts at offset stride*height,
  • the third plane starts at offset stride*height + half a stride.

This means that, in order to render the image correctly, you still have to take into account that the UV plane is interleaved in this way (half a row of U, then half a row of V).
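A rough sketch of those offsets in Python (the names are mine, not Chromium's), using the 1984x1080 geometry from above:

stride, height = 1984, 1080

y_offset = 0                                 # first plane
u_offset = stride * height                   # second plane, right after Y
v_offset = stride * height + stride // 2     # third plane, half a stride further in

# Since U and V share the same stride, every chroma row effectively holds
# half a row of U followed by half a row of V.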

But there are some extra bytes that I can not account for.

Yup, this is what I was talking about here: #518 (comment)

I thought I had removed this extra line, but perhaps I'm wrong. And yeah, I agree it's strange. ffmpeg doesn't even support it, which is why I was saying it probably doesn't add much value to the project.

@ChristianFeldmann
Member

ChristianFeldmann commented Mar 3, 2023

Can you point me to the code where this happens in the Chromium media part, please? I checked out the code, and to me it looks like the I420 format has 3 separate planes. E.g. here is the code from video_frame.cc:

    case PIXEL_FORMAT_I420: {
      int uv_width = (coded_size.width() + 1) / 2;
      int uv_height = (coded_size.height() + 1) / 2;
      int uv_stride = uv_width;
      int uv_size = uv_stride * uv_height;
      planes = std::vector<ColorPlaneLayout>{
          ColorPlaneLayout(coded_size.width(), 0, coded_size.GetArea()),
          ColorPlaneLayout(uv_stride, coded_size.GetArea(), uv_size),
          ColorPlaneLayout(uv_stride, coded_size.GetArea() + uv_size, uv_size),
      };
      break;
    }

The V plane here is offset by uv_size relative to the U plane, which indicates that the 3 planes are completely separate.
The PIXEL_FORMAT_NV12 format seems to have 2 planes where U and V are packed. But there, the UV values are packed per value (UVUVUV) and not per line.
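Plugging the 1984x1080 geometry into that I420 case (my arithmetic, not Chromium code):

coded_width, coded_height = 1984, 1080

area      = coded_width * coded_height   # 2_142_720 -> Y plane size and U plane offset
uv_stride = (coded_width + 1) // 2       # 992
uv_height = (coded_height + 1) // 2      # 540
uv_size   = uv_stride * uv_height        # 535_680

u_offset = area             # 2_142_720
v_offset = area + uv_size   # 2_678_400

So with I420 the first V byte would only appear uv_size bytes after the U plane starts, whereas in the shared file V values already show up half a stride into the first chroma row.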

I am pushing on this so much because, if we find out the name of this format, then we can also use it. I don't want to invent a new name for it, as there must already be one if it is used in Chromium.

I think I found one reference to a format like this in the Microsoft docs: https://learn.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering#imc2 . They call it IMC2.

@FaiScofield

From the point of view of data arrangement, we can divide the YUV formats into 3 basic types: planar (3, or 4 with alpha, planes), semi-planar (2 planes), and interleaved (or so-called packed, only 1 plane). Then we can divide the semi-planar type into 2 subclasses: uv_interleaved (UVUV...UVUV) or uv_followed (UU...UUVV...VV). I believe this will help in distinguishing between YUV formats.
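A sketch of that taxonomy as it might look in code (type and member names made up for illustration, not YUView's actual types):

from enum import Enum, auto

class ChromaArrangement(Enum):
    PLANAR = auto()                       # 3 (or 4, with alpha) separate planes: YY... UU... VV...
    SEMIPLANAR_UV_INTERLEAVED = auto()    # 2 planes, chroma packed per value: UVUVUV...
    SEMIPLANAR_UV_FOLLOWED = auto()       # 2 planes, chroma packed per line: UU...VV... within each row
    PACKED = auto()                       # 1 plane, luma and chroma interleaved (e.g. YUYV)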
