Playback quality selection #1

Open
heff opened this issue Oct 14, 2021 · 14 comments · May be fixed by #11

Comments

@heff
Member

heff commented Oct 14, 2021

Allow a user to select from a set of video quality levels/resolutions/renditions/bitrates/variants/representations.

Was hoping to start this with a PR, but some research and discussion will be helpful first.


Related conversation: whatwg/html#562 from @dmlap

The proposed extension to VideoTrack seems promising.

partial interface VideoTrack {
  sequence<string> getAvailableRenditions();
  // promise resolves when change has taken effect
  Promise<void> setPreferredRendition(string rendition);
};

Something to solve for is "auto".
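One way "auto" could fit the string-based proposal is as a reserved rendition name. The sketch below is only an illustration under that assumption: `SketchVideoTrack`, the `AUTO` constant, and the `preferredRendition`/`isAuto` getters are hypothetical names and not part of any browser API or of the linked proposal.

```typescript
// Hypothetical stand-in for the proposed VideoTrack extension; not a browser API.
const AUTO = "auto";

class SketchVideoTrack {
  private readonly renditions: string[];
  private preferred: string = AUTO;

  constructor(renditions: string[]) {
    this.renditions = renditions;
  }

  // Mirrors getAvailableRenditions(); prepending "auto" is an assumption.
  getAvailableRenditions(): string[] {
    return [AUTO, ...this.renditions];
  }

  // Mirrors setPreferredRendition(); a real player would resolve the promise
  // only once the switch has actually taken effect.
  async setPreferredRendition(rendition: string): Promise<void> {
    if (rendition !== AUTO && !this.renditions.includes(rendition)) {
      throw new RangeError(`Unknown rendition: ${rendition}`);
    }
    this.preferred = rendition;
  }

  get preferredRendition(): string {
    return this.preferred;
  }

  // True while the player is free to pick renditions itself (ABR).
  get isAuto(): boolean {
    return this.preferred === AUTO;
  }
}
```

A reserved string keeps the API small, but it also means a manifest could never expose a rendition literally named "auto", which is one reason a separate flag might be preferable.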

Ping @gkatsev @littlespex

@gkatsev
Member

gkatsev commented Oct 15, 2021

Also wanted to mention https://github.com/videojs/videojs-contrib-quality-levels, which we wrote with the intention that it could later be updated to match a spec.

@littlespex

Amending the VideoTrack API seems promising as it allows for each track to have multiple renditions. The two new functions would allow for a basic selection menu.

With the wide variety of bitrate ladders out in the wild, having getAvailableRenditions return a list of strings may be limiting. It may be necessary to return a Rendition object similar to videojs-contrib-quality-levels so that dimensions, bitrates and codecs can be used to generate the list of menu items.

For more advanced use cases, it may also be necessary to dispatch events so that changes made to the renditions list can be reflected in the UI:

  • A change event to catch changes made by things like the streaming library's ABR algorithm. For example, some menus have Auto checked, and a separate indicator showing which rendition is currently being rendered.
  • add/remove events. Multi-period DASH manifests are allowed to have different numbers of Representations per Period.
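To illustrate why rendition objects are more useful than plain strings for a menu, here is a rough sketch that derives labels from dimensions and bitrates. The `RenditionInfo` shape and the `menuLabels` helper are hypothetical and only loosely follow the fields discussed in this thread; the labeling rule (show the bitrate only when two renditions share a height) is an arbitrary choice for the example.

```typescript
// Hypothetical shape, loosely following the fields discussed in this thread.
interface RenditionInfo {
  id: string;
  width: number;
  height: number;
  bitrate: number; // bits per second
  codec: string;
}

// Build menu labels, highest quality first. The bitrate is appended only
// when two or more renditions share the same height, so labels stay unique.
function menuLabels(renditions: RenditionInfo[]): string[] {
  const heightCounts = new Map<number, number>();
  for (const r of renditions) {
    heightCounts.set(r.height, (heightCounts.get(r.height) ?? 0) + 1);
  }
  return [...renditions]
    .sort((a, b) => b.height - a.height || b.bitrate - a.bitrate)
    .map((r) => {
      const base = `${r.height}p`;
      const ambiguous = (heightCounts.get(r.height) ?? 0) > 1;
      return ambiguous ? `${base} (${(r.bitrate / 1e6).toFixed(1)} Mbps)` : base;
    });
}
```

With a list of strings, the player would have to guess at this kind of disambiguation, or the streaming library would have to bake presentation decisions into the rendition names.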

One possible alternative, though certainly not as simple, would be something similar to the existing text/audio/video track APIs:

partial interface VideoTrack {
  readonly attribute VideoRenditionList renditions;
};

interface VideoRenditionList : EventTarget {
  readonly attribute unsigned long length;
  getter VideoRendition (unsigned long index);
  VideoRendition? getRenditionById(DOMString id);
  readonly attribute long selectedIndex;

  attribute EventHandler onchange;
  attribute EventHandler onaddrendition;
  attribute EventHandler onremoverendition;
};

interface VideoRendition {
  readonly attribute DOMString id;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute unsigned long bitrate;
  readonly attribute DOMString codec;
  attribute boolean selected;
};
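A minimal sketch of how the list above might behave, assuming callback properties in place of real DOM events and a `select()` method in place of the writable `selected` attribute. All class and method names here are hypothetical and exist only for illustration; `add()` and `remove()` stand in for whatever the streaming library would do internally.

```typescript
interface VideoRenditionSketch {
  id: string;
  width: number;
  height: number;
  bitrate: number;
  codec: string;
}

type Handler = (() => void) | null;

class VideoRenditionListSketch {
  private items: VideoRenditionSketch[] = [];
  private selected = -1; // -1 means nothing is selected

  onchange: Handler = null;
  onaddrendition: Handler = null;
  onremoverendition: Handler = null;

  get length(): number {
    return this.items.length;
  }

  get selectedIndex(): number {
    return this.selected;
  }

  item(index: number): VideoRenditionSketch | undefined {
    return this.items[index];
  }

  getRenditionById(id: string): VideoRenditionSketch | null {
    return this.items.find((r) => r.id === id) ?? null;
  }

  // Called by the (hypothetical) streaming library, e.g. on a new DASH period.
  add(rendition: VideoRenditionSketch): void {
    this.items.push(rendition);
    this.onaddrendition?.();
  }

  remove(id: string): void {
    const index = this.items.findIndex((r) => r.id === id);
    if (index === -1) return;
    this.items.splice(index, 1);
    // Keep selectedIndex pointing at the same rendition, or clear it.
    if (this.selected === index) this.selected = -1;
    else if (this.selected > index) this.selected -= 1;
    this.onremoverendition?.();
  }

  // Fires onchange only when the selection actually changes.
  select(id: string): void {
    const index = this.items.findIndex((r) => r.id === id);
    if (index === -1 || index === this.selected) return;
    this.selected = index;
    this.onchange?.();
  }
}
```

A menu component would subscribe to all three callbacks and re-render from `length`, `item()` and `selectedIndex`, which covers both the ABR indicator and the multi-period DASH case mentioned above.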

@heff
Member Author

heff commented Aug 23, 2022

We're stepping into this with media-chrome, and it looks like @luwes has already done work on a version.

How should mixed audio/video renditions (.ts HLS) be handled in an API like this? Should the assumption be that if there are no audio renditions, then there are only mixed media renditions? Or should the rendition list not be media type specific, with a rendition type field that can be video/audio/mixed? I think I remember a proposal from @wilaw somewhere with those options.

@luwes

luwes commented Aug 24, 2022

Yes, good food for thought. @cjpillsbury also brought this up when we discussed my draft implementation.

Maybe it'd be easier not to have to patch the Video/AudioTrack APIs for browsers other than Safari.

It would be more like:

partial interface HTMLMediaElement {
  readonly attribute RenditionList renditions;
};

interface RenditionList : EventTarget {
  readonly attribute unsigned long length;
  getter Rendition (unsigned long index);
  Rendition? getRenditionById(DOMString id);
  readonly attribute long selectedIndex;

  attribute EventHandler onchange;
  attribute EventHandler onaddrendition;
  attribute EventHandler onremoverendition;
};

enum RenditionType { "video", "audio", "mixed" };

interface Rendition {
  readonly attribute DOMString trackId;
  readonly attribute RenditionType type;

  readonly attribute DOMString id;
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;
  readonly attribute unsigned long bitrate;
  readonly attribute DOMString codec;
  attribute boolean selected;
};
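With a single combined list, consumers would need to filter by type and group by track themselves. A rough sketch of what that could look like, assuming plain objects with the fields above; `CombinedRendition`, `videoMenuRenditions` and `groupByTrack` are hypothetical names, and treating "mixed" renditions as video for the quality menu is an assumption.

```typescript
type RenditionKind = "video" | "audio" | "mixed";

interface CombinedRendition {
  id: string;
  trackId: string;
  type: RenditionKind;
  bitrate: number;
  height?: number; // absent for audio-only renditions
}

// Renditions that belong in a video quality menu. Muxed ("mixed") renditions
// are included on the assumption that they carry the video the user sees.
function videoMenuRenditions(list: CombinedRendition[]): CombinedRendition[] {
  return list.filter((r) => r.type === "video" || r.type === "mixed");
}

// Group renditions by the track they belong to, preserving list order.
function groupByTrack(list: CombinedRendition[]): Map<string, CombinedRendition[]> {
  const byTrack = new Map<string, CombinedRendition[]>();
  for (const r of list) {
    const group = byTrack.get(r.trackId) ?? [];
    group.push(r);
    byTrack.set(r.trackId, group);
  }
  return byTrack;
}
```

The grouping step is essentially what a per-track rendition list would provide for free, which is the trade-off between the two shapes.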

@heff
Member Author

heff commented Aug 24, 2022

The multiple video tracks use case is one to consider here. On one hand, it means you'll still end up identifying which video track a rendition belongs to. On the other hand, I question how much we can rely on the native VideoTracks list to actually represent the multiple video tracks in an adaptive manifest. If it doesn't, then that makes it more complicated to extend the native VideoTracks in the (maybe rare) use case of multiple video tracks with multiple renditions each. Anybody have experience with that or want to test it?

@cjpillsbury
Collaborator

cjpillsbury commented Aug 25, 2022

There is poor (none that I know of) support for "alternate video" in native playback for browser/browser-like envs (and players generally). However, there is decent support for "alternate audio".

@gkatsev
Member

gkatsev commented Aug 25, 2022

I think there are two things to consider here:

  1. what's the easiest and best API to have for something like media-chrome
  2. what's the best API that we can propose to the W3C/WHATWG to get it into the standards.

For an API we can use today, not having to extend Audio/Video Tracks is definitely nice, but I think such an API is less likely to get accepted into the relevant specs.
In addition, I don't think it really matters if a rendition is muxed content. From a user's perspective, it doesn't matter if the audio is available in the same segment as the video or if it was downloaded from a separate segment.

I think that adding a RenditionList to Audio and Video Tracks, similar to what @littlespex proposed above, is better than a combined RenditionList. In the majority case, since alternative video tracks aren't very common, you'd end up with a single Video Track, which has the specified renditions on it. Additionally, you'd have one or more AudioTracks, potentially with their own renditions. In the case of muxed content, you'd have a track show up under both AudioTrack and VideoTrack.
This is how Safari currently implements things, where for media, including mp4, you get video.videoTracks[0] pointing at the video portion and video.audioTracks[0] pointing at the audio portion.
You can then separately turn off audio and video with video.audioTracks[0].enabled = false and video.videoTracks[0].selected = false.
The way AudioTracks and VideoTracks are defined is that you could theoretically enable multiple audio tracks at the same time, but not multiple video tracks. This is why videojs-contrib-quality-levels uses enabled on the renditions list, so that you could have multiple enabled, rather than only selecting one.
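To make the `enabled` idea concrete: with several renditions enabled, the player can still adapt, but only within the enabled subset. The sketch below is not the plugin's actual code; `QualityLevelSketch` and `pickLevel` are hypothetical, and the selection rule (highest enabled bitrate that fits the measured bandwidth, otherwise the lowest enabled one) is a deliberately simple stand-in for a real ABR algorithm.

```typescript
interface QualityLevelSketch {
  id: string;
  bitrate: number; // bits per second
  enabled: boolean;
}

// Pick the highest-bitrate enabled level that fits within the bandwidth
// estimate. Falls back to the lowest enabled level when none fits, and
// returns null when every level is disabled.
function pickLevel(
  levels: QualityLevelSketch[],
  bandwidth: number
): QualityLevelSketch | null {
  const enabled = levels
    .filter((level) => level.enabled)
    .sort((a, b) => a.bitrate - b.bitrate);
  if (enabled.length === 0) return null;
  let choice = enabled[0];
  for (const level of enabled) {
    if (level.bitrate <= bandwidth) choice = level;
  }
  return choice;
}
```

Under this model, "auto" is simply every level enabled, and a manual selection is exactly one level enabled, so a single boolean per rendition covers both cases plus things like a bitrate cap.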

The tricky part of a renditions API is likely supporting everything that DASH allows. DASH is tricky here because you can have different renditions per period and potentially different audio tracks per video track. Maybe supporting every permutation that DASH allows should be a non-goal. HLS is simpler because it doesn't allow you multiple renditions per audio track.

@cjpillsbury
Collaborator

cjpillsbury commented Sep 1, 2022

@gkatsev

> HLS is simpler because it doesn't allow you multiple renditions per audio track.

I'd be careful here. Folks definitely use EXT-X-MEDIA:TYPE=AUDIO to provide multiple encodings/"renditions" of "the same" audio content, and Safari will represent them as a single AudioTrack. For example, Apple's official test stream https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_adv_example_hevc/master.m3u8 includes:

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="a1",NAME="English",LANGUAGE="en-US",AUTOSELECT=YES,DEFAULT=YES,CHANNELS="2",URI="a1/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="a2",NAME="English",LANGUAGE="en-US",AUTOSELECT=YES,DEFAULT=YES,CHANNELS="6",URI="a2/prog_index.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="a3",NAME="English",LANGUAGE="en-US",AUTOSELECT=YES,DEFAULT=YES,CHANNELS="6",URI="a3/prog_index.m3u8"

(note the shared NAME and LANGUAGE but the differences in e.g. CHANNELS, and also in the encoded content itself)

and when playing in Safari, you'll get:
[Image: Screen Shot 2022-09-01 at 8 20 50 AM]

(aka a single AudioTrack).

Additionally, no "Languages" control menu is added to the controls, since there is only one "track".

Compare to this example https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/hls.m3u8 which includes:

#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-eng-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="stream_5",DEFAULT=YES,AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-deu-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="de",NAME="stream_4",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-ita-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="it",NAME="stream_8",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-fra-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="fr",NAME="stream_7",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-spa-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="es",NAME="stream_9",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="playlist_a-eng-0384k-aac-6c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="stream_6",CHANNELS="6"

Note that they all share the same GROUP-ID but vary with e.g. LANGUAGE and NAME (though there are two en playlists, which still have different NAMEs). Here's what you get when playing in Safari:
[Image: Screen Shot 2022-09-01 at 8 29 29 AM]

And here's what shows up in the automatically added "Language" control menu:
[Image: Screen Shot 2022-09-01 at 8 49 48 AM]

Finally, here's what happens when I create a local version of the multivariant playlist where the two English EXT-X-MEDIA playlists share the same NAME. Playlist tags:

#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-eng-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="English",DEFAULT=YES,AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-deu-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="de",NAME="German",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-ita-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="it",NAME="Italian",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-fra-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="fr",NAME="French",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-spa-0128k-aac-2c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="es",NAME="Spanish",AUTOSELECT=YES,CHANNELS="2"
#EXT-X-MEDIA:TYPE=AUDIO,URI="https://storage.googleapis.com/shaka-demo-assets/angel-one-hls/playlist_a-eng-0384k-aac-6c.mp4.m3u8",GROUP-ID="default-audio-group",LANGUAGE="en",NAME="English",CHANNELS="6"

Safari's audioTracks:
[Image: Screen Shot 2022-09-01 at 8 53 11 AM]

(note there's only one en AudioTrack now)

"Languages" control menu:
[Image: Screen Shot 2022-09-01 at 8 54 45 AM]

All this is to say that Safari will treat multiple audio playlists as different tracks or as the same track depending on details in their attributes (e.g. NAME and LANGUAGE).
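The observations above are consistent with grouping audio playlists by LANGUAGE plus NAME. The sketch below only approximates that observed behavior; Safari's actual logic isn't documented here, the grouping key is an assumption, and `parseAttributes` and `groupAudioPlaylists` are hypothetical helpers rather than part of any library.

```typescript
// Parse an HLS attribute list, e.g. TYPE=AUDIO,NAME="English",CHANNELS="2".
// Quoted values may contain commas and colons (such as URLs).
function parseAttributes(tagLine: string): Record<string, string> {
  const attrs: Record<string, string> = {};
  const list = tagLine.slice(tagLine.indexOf(":") + 1);
  const pattern = /([A-Z0-9-]+)=("[^"]*"|[^,]*)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(list)) !== null) {
    attrs[match[1]] = match[2].replace(/^"|"$/g, "");
  }
  return attrs;
}

// ASSUMPTION: audio playlists sharing LANGUAGE and NAME collapse into a
// single track. Returns a map from "LANGUAGE|NAME" to the playlist URIs.
function groupAudioPlaylists(playlist: string): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const line of playlist.split("\n")) {
    if (!line.startsWith("#EXT-X-MEDIA:")) continue;
    const attrs = parseAttributes(line);
    if (attrs["TYPE"] !== "AUDIO") continue;
    const key = `${attrs["LANGUAGE"] ?? ""}|${attrs["NAME"] ?? ""}`;
    const uris = groups.get(key) ?? [];
    uris.push(attrs["URI"] ?? "");
    groups.set(key, uris);
  }
  return groups;
}
```

Run against the three playlists above, this would yield one group for the Apple stream, six for the original Shaka stream, and five for the locally renamed version, matching the track counts described. Any entry with more than one URI is effectively a set of audio "renditions" for one track.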

@gkatsev
Member

gkatsev commented Sep 1, 2022

> I'd be careful here. Folks definitely use EXT-X-MEDIA:TYPE=AUDIO to provide multiple encodings/"renditions" of "the same" audio content, and Safari will represent them as a single AudioTrack. For example, Apple's official test stream https://devstreaming-cdn.apple.com/videos/streaming/examples/bipbop_adv_example_hevc/master.m3u8 includes:

However, this will still only match a specific audio track to a specific set of video renditions. The audio renditions won't be switching independently of the video renditions here, which is specifically what I was calling out; maybe it wasn't clear enough.

I tested locally and as far as I can tell, Safari is ignoring the second English track (I just named all the tracks English).

@cjpillsbury
Collaborator

"ignoring" may be wrong here. iirc AVFoundation/AVPlayer (which Safari HLS playback is built on top of) will do some filtering based on support (6 channels being relevant here) but will also use ABR switching, similar to video playlists, for multiple audio playlists with "similar relevant features". It just isn't exposed in the browser.

@gkatsev
Member

gkatsev commented Sep 1, 2022

Yeah, maybe it selects one from the available options and sticks with it. Either way, it seems simplified compared to what you can do in DASH.

@gkatsev
Member

gkatsev commented Sep 1, 2022

If it does ABR the audio renditions, I couldn't get it to happen. But maybe my test wasn't great.

@cjpillsbury
Collaborator

We don't have a good test stream. We'd want all the same container format & codec & channels with matching names & languages but notably different bitrates (including a stupidly large bitrate). We'd also likely want only one EXT-X-STREAM-INF to avoid the dance of video ABR switching vs. (potential) audio ABR switching.

@cjpillsbury
Collaborator

Or maybe someone with more knowledge of how this works under the hood will chime in 🤞

gkatsev added a commit to gkatsev/media-ui-extensions that referenced this issue Apr 14, 2023
@gkatsev gkatsev linked a pull request Apr 14, 2023 that will close this issue