Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only finalize an encoding when all chunks have finished #938

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

KyleMaas
Copy link
Contributor

Description

Make sure that all chunks are done before kicking off HLS creation. Fixes #929

Steps

Pre-deploy

Post-deploy

@KyleMaas
Copy link
Contributor Author

Hmm...actually, I've only tested this on successes. It probably doesn't handle cleanup correctly for failures. Converting this to a draft until I can get that figured out.

@KyleMaas KyleMaas marked this pull request as draft December 15, 2023 18:40
@KyleMaas KyleMaas marked this pull request as ready for review December 21, 2023 15:30
@KyleMaas
Copy link
Contributor Author

Okay, I think this should be good now.

@KyleMaas
Copy link
Contributor Author

So, after running this in production for a while, this is still a problem. This does seem to fix the case where failures happen, but it only runs after the chunk has already been marked as a success, which means if the concatenate and post encode processes take a long time, other chunks may also go into "success" state, see that this one is in "success", and proceed to also concatenate and post encode. So this still needs some work. This check really should probably happen on the transition from "running" to "success", not after it has already been marked "success" on the chunk.

@mgogoulos
Copy link
Contributor

@KyleMaas thanks for submiting this PR along with the other, I'm struggling to find some time to handle all of them...

as far as this one is related, I'm wondering what are the cases you encounter the problem. Like what types of volumes of video are necessary for it to happen. I haven't seen it myself, but it could be the case that I've missed it, specially if there's not much logging. Perhaps add more log messages to help debug the case?

@KyleMaas
Copy link
Contributor Author

@mgogoulos The main issues come in with long, very large videos. For example, I've got a server processing one video right now that's about 3 hours long. Fairly high resolution. Don't remember the file size, but it's in the several gigabytes range. That server's been processing that one video for about 22 hours. And currently it has one Bento operation running and another 30 concurrent instances of cp -rT copying to the same output directory. Usually there are more Bento instances running concurrently as well, but I think this one's just about done processing so now it's mostly just a bunch of copy operations stepping on each other.

@KyleMaas
Copy link
Contributor Author

@mgogoulos Had some more time to track this down further, and I think this does actually fix part of the problem. But there is also a separate problem I just filed as #962 which is at least as bad if not worse. So I think this one's still valid and is actually probably correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chunks show encoding a Fail status, but not any apparent problems
2 participants