producer_avformat performs useless work when source is intra-only and consumer framerate is lower than the source's framerate #986

Open
Tjoppen opened this issue May 15, 2024 · 3 comments

Comments

@Tjoppen
Contributor

Tjoppen commented May 15, 2024

This may or may not be considered a bug. With ffmpeg one can set -thread_type frame -threads 16 to speed up intra-only decoding, especially for slow codecs like JPEG2000. producer_avformat.c happily passes thread_type along, but this is purely detrimental: libavcodec decodes multiple frames at once as instructed, but producer_avformat only ever uses the first one. Playing more nicely with libavcodec's frame threading would enable better handling of heavy-to-decode formats like JPEG2000. I see that some work has recently been done to make melt work better with intra-only sources, and supporting frame threading would be one way to make it work even better.

@Tjoppen
Contributor Author

Tjoppen commented May 31, 2024

Update: this only happens when the output framerate is lower than the input framerate, for example when putting a 50 fps source into a 25 fps project, and it gets worse the lower the output framerate is. I added a bit of code to track how many packets are sent to the decoder. For a 50 Hz intra-only input with an output rate of 10 Hz and in=0 out=199, a grand total of 2990 (!) packets are sent to the decoder to produce 200 frames, almost 15x overhead.

Initially I had accidentally run @sirf's fork when reporting this, but a very similar issue exists here upstream. producer_avformat actually does the right thing when the output framerate is greater than or equal to the input framerate.

Example command lines:

melt V75UPPSNACK_LORDAG_P01_195335.mov in=0 out=199 -consumer avformat:/tmp/foo.avi -> 209 packets sent to decoder. Nine extra packets decoded = no biggie

melt V75UPPSNACK_LORDAG_P01_195335.mov in=0 out=199 -consumer avformat:/tmp/foo.avi frame_rate_num=10 frame_rate_den=1 -> 2990 packets sent to decoder. Definitely an issue.
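For reference, the overhead implied by those two packet counts can be worked out as below. This is only arithmetic on the numbers reported above; the helper name `decode_overhead` is made up for illustration, and it assumes in=0 out=199 is an inclusive range of 200 frames.

```python
from fractions import Fraction

def decode_overhead(packets_sent: int, frames_out: int) -> Fraction:
    """Packets sent to the decoder per output frame actually produced."""
    return Fraction(packets_sent, frames_out)

# in=0 out=199 is treated as inclusive, so 200 output frames (assumption).
frames_out = 199 - 0 + 1

same_rate = decode_overhead(209, frames_out)      # project rate equals source rate
reduced_rate = decode_overhead(2990, frames_out)  # 10 fps output from a 50 Hz source

print(float(same_rate))     # 1.045
print(float(reduced_rate))  # 14.95
```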

Versions: current melt master (49fcfd3) and current ffmpeg master (249c66bb225b0671434b3ce9cc3f7935a229f428).

Tjoppen changed the title from "thread_type=frame causes increased CPU usage with no increase in throughput" to "producer_avformat performs useless work when source is intra-only and consumer framerate is lower than the source's framerate" on May 31, 2024
@Tjoppen
Contributor Author

Tjoppen commented May 31, 2024

Oh, and I have an idea for fixing this: don't bother sending packets to the decoder when we know their result won't be used. This can be done by inspecting each packet's pts. That obviously only works for intra-only sources, but it would be a huge gain in many cases.
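A rough sketch of what that pts check could look like, written in Python rather than against the real producer_avformat.c code. Everything here is an assumption for illustration: the function names are hypothetical, the source is taken to be constant frame rate and intra-only, and output frame n is assumed to show the source frame at time n / out_fps rounded down. The mapping producer_avformat actually uses may differ.

```python
from fractions import Fraction

def needed_source_frames(src_fps, out_fps, n_out):
    """Source frame indices needed to render output frames 0..n_out-1.

    Assumption: output frame n shows the source frame at time n / out_fps,
    rounded down to a whole source frame.
    """
    ratio = Fraction(src_fps) / Fraction(out_fps)
    return {int(n * ratio) for n in range(n_out)}

def packet_is_needed(pts, time_base, src_fps, needed):
    """Convert a packet pts (in time_base units) to a frame index and test it."""
    frame_index = round(Fraction(pts) * time_base * Fraction(src_fps))
    return frame_index in needed

# 50 fps intra-only source, 10 fps output, 200 output frames.
needed = needed_source_frames(50, 10, 200)
time_base = Fraction(1, 50)  # hypothetical stream time base: one tick per frame

# Walk the 1000 source packets that span the 200 output frames.
sent = [pts for pts in range(1000)
        if packet_is_needed(pts, time_base, 50, needed)]
print(len(sent))  # 200
```

With these assumptions only one packet per output frame reaches the decoder, instead of every packet in between.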

@Tjoppen
Contributor Author

Tjoppen commented May 31, 2024

I wrote an extremely ugly hack that just throws away 4 out of every 5 packets and gave it a go on a lossless 4k JPEG2000 sample. 200 frames -> 205 packets sent, and the output looks as expected. So the idea of throwing away intra-only packets between the desired output frames has some legs. It just needs proper timestamp logic. It'll still read the packets, which isn't great on high-bitrate files like some of the JPEG2000 samples we have, but it's better than performing useless decode work.

edit: Oh and it manages to make full use of the CPU while doing this.
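To illustrate why a fixed "keep 1 in 5" counter is not enough and timestamp logic is needed, here is a small Python comparison. It is a toy model, not the hack described above: both function names are hypothetical, and the timestamp variant assumes each output frame takes the source frame at or just before its presentation time.

```python
from fractions import Fraction

def keep_every_nth(num_packets, n):
    """Counter-based hack: keep packets 0, n, 2n, ... and drop the rest."""
    return [i for i in range(num_packets) if i % n == 0]

def keep_by_timestamp(src_fps, out_fps, n_out):
    """Timestamp-based choice: source frame at or just before each output time."""
    ratio = Fraction(src_fps, out_fps)
    return [int(n * ratio) for n in range(n_out)]

# 50 fps -> 10 fps: both approaches pick the same packets.
print(keep_every_nth(25, 5))         # [0, 5, 10, 15, 20]
print(keep_by_timestamp(50, 10, 5))  # [0, 5, 10, 15, 20]

# 60 fps -> 25 fps: the stride is 2.4 source frames, so no fixed n works.
print(keep_by_timestamp(60, 25, 6))  # [0, 2, 4, 7, 9, 12]
```

The counter only matches when the source rate is an integer multiple of the output rate; for other ratios the gap between kept packets varies, which is what deriving the decision from the pts handles.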
