Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

microdvd parser should read fps only from first line with {1}{1} prefix #71

Closed
milahu opened this issue Apr 28, 2023 · 11 comments
Closed
Assignees

Comments

@milahu
Copy link

milahu commented Apr 28, 2023

frame rate should be parsed from the first line
only when the first line has the format

\{1\}\{1\}(\d+(?:\.\d*)?)

examples

{1}{1}25.000
{1}{1}30
{1}{1}60.
{1}{1}123.456
{1}{1}5
{1}{1}1

failing tests

import pysubs2

text = """\
{10}{20}666
"""
subs = pysubs2.SSAFile.from_string(text, format_="microdvd")
print("fps:", subs.fps)
# this works, frame rate is now 666

text = """\
{10}{20}asdf
"""
subs = pysubs2.SSAFile.from_string(text, format_="microdvd")
# ValueError: could not convert string to float: 'asdf'
# pysubs2.exceptions.UnknownFPSError: Framerate was not specified and cannot be read from the MicroDVD file.

downstream issue smacke/ffsubsync#176

@tkarabela tkarabela self-assigned this Apr 29, 2023
@tkarabela
Copy link
Owner

Hi @milahu, can you tell me what you wish the library did instead of raising UnknownFPSError in this case?

If you want to assume a "default fps", you can do this already:

try:
    subs = pysubs2.SSAFile.from_string(text, format_="microdvd")
except pysubs2.exceptions.UnknownFPSError:
    subs = pysubs2.SSAFile.from_string(text, format_="microdvd", fps=25)

However, unless both the input and output format is MicroDVD, this will mess up timing for any subtitles that are not the assumed default (eg. 25 fps) and do not have the declaration in the first subtitle.

@milahu
Copy link
Author

milahu commented Apr 29, 2023

frame rate should be parsed from the first line only when the first line has the format

\{1\}\{1\}(\d+(?:\.\d*)?)

examples

{1}{1}25.000
{1}{1}30
{1}{1}60.
{1}{1}123.456
{1}{1}5
{1}{1}1

@tkarabela
Copy link
Owner

The library internally works with milliseconds, not frames - specifying framerate is not optional when reading MicroDVD.

  • One can specify the framerate directly as the fps parameter
    • (in this case, the library does not attempt to parse anything from the subtitle text)
  • If the fps parameter is not given, the library tries to read the framerate value from text of the first subtitle
  • If reading framerate from the first subtitle fails (eg. {10}{20}asdf), then the library bails out with UnknownFPSError

I'm not sure what more the library could be doing to figure out what framerate should be used. The only other option I see is to supply a default framerate in case the autodetection fails, which is already possible via the code I suggested earlier:

try:
    subs = pysubs2.SSAFile.from_string(text, format_="microdvd")
except pysubs2.exceptions.UnknownFPSError:
    subs = pysubs2.SSAFile.from_string(text, format_="microdvd", fps=25)

Maybe I'm misunderstanding the issue, feel free to tell me more about the use-case where this is a problem :)

@milahu
Copy link
Author

milahu commented Apr 30, 2023

the parser should be more strict

it should parse the FPS only from first lines like

{1}{1}25.000
{1}{1}30
{1}{1}60.
{1}{1}123.456
{1}{1}5
{1}{1}1

but NOT from first lines like

{10}{20}666

sorry, my issue title was misleading

@milahu milahu changed the title Framerate was not specified and cannot be read from the MicroDVD file. microdvd parser should read fps only from first line with {1}{1} prefix Apr 30, 2023
@tkarabela
Copy link
Owner

Ah, maybe I'm beginning to understand, let's go back to your initial examples.

Example 1
Would you like to see this throw UnknownFPSError as well?

import pysubs2

text = """\
{10}{20}666
"""
subs = pysubs2.SSAFile.from_string(text, format_="microdvd")
# raises pysubs2.exceptions.UnknownFPSError: Framerate was not specified and cannot be read from the MicroDVD file.

Example 2
What should happen in this case? I believe this is already correct, it raises UnknownFPSError (ValueError is caught internally and not propagated to the caller)

text = """\
{10}{20}asdf
"""
subs = pysubs2.SSAFile.from_string(text, format_="microdvd")
# raises pysubs2.exceptions.UnknownFPSError: Framerate was not specified and cannot be read from the MicroDVD file.

@tkarabela
Copy link
Owner

BTW, this is the relevant code:

@classmethod
def from_file(cls, subs, fp, format_, fps=None, **kwargs):
"""See :meth:`pysubs2.formats.FormatBase.from_file()`"""
for line in fp:
match = MICRODVD_LINE.match(line)
if not match:
continue
fstart, fend, text = match.groups()
fstart, fend = map(int, (fstart, fend))
if fps is None:
# We don't know the framerate, but it is customary to include
# it as text of the first subtitle. In that case, we skip
# this auxiliary subtitle and proceed with reading.
try:
fps = float(text)
subs.fps = fps
continue
except ValueError:
raise UnknownFPSError("Framerate was not specified and "
"cannot be read from "
"the MicroDVD file.")
start, end = map(partial(frames_to_ms, fps=fps), (fstart, fend))

@milahu
Copy link
Author

milahu commented Apr 30, 2023

Example 1
Would you like to see this throw UnknownFPSError as well?

yes

@moi15moi
Copy link
Contributor

VLC also only parse the fps when it has this format: "{1}{1}THE_FPS"

Reference: https://code.videolan.org/videolan/vlc/-/blob/dccda0e133ff0a2e85de727cf19ddbc634f06b67/modules/demux/subtitle.c#L1014

@tkarabela
Copy link
Owner

Thanks, I see! I will implement this.

@milahu
Copy link
Author

milahu commented Dec 17, 2023

trivial...

--- a/pysubs2/microdvd.py
+++ b/pysubs2/microdvd.py
@@ -34,6 +34,12 @@ class MicroDVDFormat(FormatBase):
                 # We don't know the framerate, but it is customary to include
                 # it as text of the first subtitle. In that case, we skip
                 # this auxiliary subtitle and proceed with reading.
+                if fstart != 1 or fend != 1:
+                    # the first subtitle must have this format for fps=25:
+                    # {1}{1}25
+                    raise UnknownFPSError("Framerate was not specified and "
+                                          "cannot be read from "
+                                          "the MicroDVD file.")
                 try:
                     fps = float(text)
                     subs.fps = fps

tkarabela added a commit that referenced this issue May 4, 2024
- Saving MicroDVD with `write_fps_declaration=True` (default) creates subtitle
  {1}{1}<fps>, not {0}{0}<fps>.
- Reading MicroDVD with `strict_fps_inference=True` (default) will raise
  UnknownFPSError if fps is not given and the first subtitle does not have
  the format {1}{1}<fps>.

Fixes issue #71.
@tkarabela
Copy link
Owner

Fixed in version 1.7.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants