Apparent locking issues when running across multiple GPUs #283

Open

gtebbutt opened this issue Dec 6, 2023 · 0 comments

I've noticed an interesting issue when running on multi-GPU machines: selecting gpu(N) as the decoding context initially works as expected, but with multiple processes running, overall throughput drops off very rapidly until only one process shows activity on a single GPU, with the others at best managing occasional very short bursts of processing.

This happens even when the processes are totally independent (for example, started from separate screen sessions, operating on entirely different files, and using separate GPUs). Since it occurs even between separate Python instances, it points to a hardware- or system-level locking mechanism being taken globally rather than per-process.
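For concreteness, each process is doing something along these lines (a minimal sketch: VideoReader and the file names are illustrative, only the gpu(N) context comes from the setup described above):

```python
# Launch one copy per GPU, each from its own shell/screen session, e.g.:
#   python decode.py 0 clip_a.mp4
#   python decode.py 1 clip_b.mp4
import sys
from decord import VideoReader, gpu

gpu_id, path = int(sys.argv[1]), sys.argv[2]
vr = VideoReader(path, ctx=gpu(gpu_id))  # per-process GPU decoding context

# Sequential decode; after a short while only one of these otherwise
# independent processes keeps showing sustained activity in nvidia-smi.
for i in range(len(vr)):
    frame = vr[i]
```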

My working theory is that it could be falling through to a global lock of some kind because the decoder is created with decoder_info_.vidLock = nullptr;, but so far that hasn't brought us any closer to a fix. It would be very helpful to hear whether anyone else has (or hasn't!) run into similar issues.
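For anyone wanting to check whether they're affected, per-GPU decoder activity can be sampled with something like the sketch below (this assumes the pynvml bindings, which aren't part of this project):

```python
# Samples NVDEC utilisation on every GPU once a second; when the issue
# kicks in, all but one device should drop to ~0% even though all of the
# decoding processes are still alive.
import time
import pynvml

pynvml.nvmlInit()
handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
           for i in range(pynvml.nvmlDeviceGetCount())]
try:
    while True:
        readings = [pynvml.nvmlDeviceGetDecoderUtilization(h)[0] for h in handles]
        print(" ".join(f"gpu{i}: {u:3d}%" for i, u in enumerate(readings)), flush=True)
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```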

Possibly related to #187 and/or #159?
