Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add format selection criteria longside, shortside #30742

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

dirkf
Copy link
Contributor

@dirkf dirkf commented Mar 14, 2022

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

Explanation of your pull request in arbitrary form goes here. Please make sure the description explains the purpose and effect of your pull request and is worded well enough to be understood. Provide as much context and examples as possible.

Copy link
Contributor Author

@dirkf dirkf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mk-pmb, I've created the PR for you. Do check these review items for which GH won't let me Request Changes from myself. Or you might rather create your own PR.

With my suggested changes the existing test/test_YoutubeDL.py test succeeds. You should either add a new test case for long/short/side, or add something to the test_youtube_format_selection() test case.

youtube_dl/YoutubeDL.py Show resolved Hide resolved
@@ -1484,6 +1491,17 @@ def sanitize_numeric_fields(info):
sanitize_string_field(info_dict, 'id')
sanitize_numeric_fields(info_dict)

def add_calculated_video_proprties(fmt):
dimensions = [fmt.get(side) for side in ('width', 'height',)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or in 1 line:

            dimensions = filter(None, [fmt.get(side) for side in ('width', 'height',)])

def add_calculated_video_proprties(fmt):
dimensions = [fmt.get(side) for side in ('width', 'height',)]
dimensions = [n for n in dimensions if n is not None]
if len(dimensions):
Copy link
Contributor Author

@dirkf dirkf Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need both width and height to be set, otherwise long and short are indeterminate, as shown by the test error on l.1499 when trying to evaluate min(1000) ('int' object is not iterable). So:

            if len(dimensions) == 2:

The doc for min():

 min(iterable[, key])
 min(arg1, arg2, *args[, key])
...

Because of the optional key, parameter, the reasonable result min(x) -> x isn't possible. If you don't know the length of the list whose min (resp. max) is being found, you have to say min(iter(dimensions)) rather than *dimensions.

Copy link
Contributor

@mk-pmb mk-pmb Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late. It seems I missed a notification.
Thanks for teaching me! I hope I'll remember this trap.
The iter() approach seemed best, but min() and max() throw on an empty, list, and I'd prefer to be tolerant for bad input, as in just giving 0 as the length for non-dimensional video. Thus, in my next version I'll propose min(0, 0, *dimensions) unconditionally.

Edit: Yeah I see how this fails for min(). :D

'shortside': min(*dimensions),
'longside': max(*dimensions),
})
for fmt in info_dict['formats']:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all info_dicts have a formats key (see extractor/common.py). In case the dimensions have been set in the top level, you could process the info_dict as a format:

        for fmt in info_dict.get('formats') or [info_dict]:

@@ -334,6 +334,11 @@ class YoutubeDL(object):
'playlist_index',
))

_NUMERIC_FIELDS |= set(('longside', 'shortside',))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're doing this, I'd just have these on a separate line in the original list:

        'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
        'longside', 'shortside',
        'timestamp', 'upload_year', 'upload_month', 'upload_day',

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested in my refactor idea, IMO this list should not exist in the first place; rather, its values should be collected from the sections in code that implement them. I will not oppose anyone who decides to enact your solution, but I won't do anything that may look like approval.

@mk-pmb
Copy link
Contributor

mk-pmb commented Mar 14, 2022

Thank you for the kind gesture. I purposefully did not file a PR because my patch is not worthy of being merged into the main codebase, and never was intended to.

My patch was an attempt to provide a stopgap at a time when the idea of my feature request was met with strong resistance. Someone who seemed to be a team member argued that this feature should not exist in yt-dl, so my patch was aimed at helping users merge it locally into their systems.

Now if maintainers decide that the feature is worth adding, we should try and find someone who knows how to implement it cleanly and elegantly.

I recommend to add the feature, but to not use my patch for it.

@dirkf
Copy link
Contributor Author

dirkf commented Mar 14, 2022

The status of a commenter should be shown on the right-hand side of the message header.

Your patch is in line with the existing code structure. The fields in the NUMERIC_FIELDS list are separately documented. I don't think it would help to generate the list dynamically. That's why the code is as it is (as far as I know and without a lot of blame-seeking).

If this feature is to be implemented, the approach used in your patch seems fine. However it does need test(s) as I mentioned, and also updating README.md.

A proposal for refactoring the format selection code ought to be a separate PR, also possibly acquiring some other recent and proposed changes from yt-dlp.

@mk-pmb mk-pmb deleted the fmtsel-longside-shortside branch January 10, 2024 23:41
@mk-pmb mk-pmb restored the fmtsel-longside-shortside branch January 10, 2024 23:41
@mk-pmb
Copy link
Contributor

mk-pmb commented Jan 11, 2024

A proposal for refactoring the format selection code ought to be a separate PR,

Nowadays I accept that argument and have updated my code to your suggestions.

You should either add a new test case for long/short/side, or add something to the test_youtube_format_selection() test case.

Might take a while, so anyone else reading this, feel free to help out!

Edit: I just discovered that I already had some tests in #30744, but accidentially closed that thread. I rebased that branch. Should I just merge them here, or make a new PR for them so we can first check if the tests would work?

@dirkf
Copy link
Contributor Author

dirkf commented Jan 23, 2024

Please update this PR if you can, or start a replacement if not. Thanks.

@mk-pmb mk-pmb force-pushed the fmtsel-longside-shortside branch 3 times, most recently from e685543 to 69872de Compare January 23, 2024 03:58
@mk-pmb
Copy link
Contributor

mk-pmb commented Jan 23, 2024

Done. I'm optimistic. Someone should look at the :TODO: marks in the tests though.

@mk-pmb mk-pmb force-pushed the fmtsel-longside-shortside branch 2 times, most recently from fe34885 to bfc5707 Compare May 16, 2024 00:46
@mk-pmb
Copy link
Contributor

mk-pmb commented May 16, 2024

I reviewed this again. The first :TODO: is just an API question: Can we expect process_ie_result to not mangle its arguments?

    ydl.process_ie_result(info_dict.copy())
    # ^-- :TODO: Do we need this .copy()?

The other :TODO: is a bug outside the scope of this PR and will be solved by #32786.
I amended this PR to include that fix and omit both :TODO: lines so you can easily merge if you don't care about potential performance savings on the dict copy.

@mk-pmb mk-pmb force-pushed the fmtsel-longside-shortside branch from bfc5707 to 8acf85a Compare May 16, 2024 00:48
@mk-pmb mk-pmb force-pushed the fmtsel-longside-shortside branch from 8acf85a to d4f8a34 Compare May 16, 2024 00:54
@mk-pmb mk-pmb force-pushed the fmtsel-longside-shortside branch from d4f8a34 to 0c1db86 Compare May 16, 2024 01:00
@mk-pmb
Copy link
Contributor

mk-pmb commented May 16, 2024

I don't understand the CI errors.

AssertionError: Regexp didn't match: u'(?:(?:#.*?|\\s*)\\n)*from __future__ import
  (?:[a-z_]+,\\s*)*unicode_literals' not found, unicode_literals import missing in
  /…/test/formatselection/criteria_longside_shortside.py

but from __future__ import unicode_literals is obviosuly there, in exactly that file, line 5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants