javascript-video-processing-analysis

JavaScript real-time video processing using Canvas (2D, WebGL).

Video Processing: Extracting pixels from video each frame

Because the web is not primarily designed for video processing, not many APIs exist that can support us here. But why would we even want to do this? Use cases include:

Video effects (e.g. filters, real-time greenscreen, face detection / blur via AI)
WebRTC: Synchronizing metadata with video frames requires frame information to be encoded into and extracted from the video itself, as there is no other technical way to connect frames and metadata (see this and that)

Knowing about video frame changes

Sadly, there is no way for us to know when the browser renders the next video frame. The timeupdate event on the video element might seem like a solution at first sight. However (presumably for performance reasons), that event fires at unpredictable intervals and therefore does not guarantee that every frame will be captured (see video.js#4322, this Bugzilla bug).

Instead, we have to rely on the browser render cycle. We can use a requestAnimationFrame() loop to constantly process the current video frame.

In the test case implementations, we just go ahead and process every frame, no matter whether it actually changed. Further optimizations might help reduce the load on the main thread.
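
A minimal sketch of such a render-cycle loop, assuming a hypothetical processFrame callback that does the actual work. The scheduler functions are passed in so the loop stays independent of the browser globals; startFrameLoop is an illustrative name and not taken from the test case implementations.

```typescript
// Signatures mirror window.requestAnimationFrame / window.cancelAnimationFrame.
type ScheduleFrame = (callback: (timestamp: number) => void) => number;
type CancelFrame = (handle: number) => void;

/**
 * Calls `processFrame` once per render cycle until the returned
 * stop function is invoked. Every cycle is processed, whether or
 * not the video frame actually changed.
 */
function startFrameLoop(
  processFrame: (timestamp: number) => void,
  schedule: ScheduleFrame,
  cancel: CancelFrame,
): () => void {
  let handle = 0;
  let running = true;

  const tick = (timestamp: number): void => {
    if (!running) {
      return;
    }
    processFrame(timestamp);
    handle = schedule(tick); // re-arm for the next render cycle
  };

  handle = schedule(tick);

  return (): void => {
    running = false;
    cancel(handle);
  };
}

// Browser usage (not executed here):
// const stop = startFrameLoop(() => { /* render + extract */ },
//   requestAnimationFrame, cancelAnimationFrame);
```

Calling the returned function stops the loop, for example once the video has ended.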

Reading raw pixels from video frame

There is no direct way to read raw pixels from a video, or a video frame. Instead:

We create an in-memory canvas element (2D or WebGL)
Render step: We render the current video frame into the canvas element by defining the video element as a source (rather implicit, but it works as intended)
Extraction step: We read all raw pixels from the canvas element
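
The render and extraction steps above can be sketched for the 2D case as follows. This is a minimal sketch rather than the code of the test case: extractPixels2d and the two interfaces are hypothetical names that list only the members used here, which a browser's CanvasRenderingContext2D and HTMLVideoElement provide.

```typescript
// Only the members this sketch touches.
interface FrameSource {
  readonly videoWidth: number;
  readonly videoHeight: number;
}

interface Context2dLike<S> {
  drawImage(source: S, dx: number, dy: number, dWidth: number, dHeight: number): void;
  getImageData(sx: number, sy: number, sWidth: number, sHeight: number): {
    readonly data: Uint8ClampedArray;
  };
}

function extractPixels2d<S extends FrameSource>(
  context: Context2dLike<S>,
  video: S,
): Uint8ClampedArray {
  const width = video.videoWidth;
  const height = video.videoHeight;

  // Render step: draw the current video frame into the canvas
  context.drawImage(video, 0, 0, width, height);

  // Extraction step: read the raw pixels back (RGBA, 4 bytes per pixel)
  return context.getImageData(0, 0, width, height).data;
}

// Browser usage (not executed here), assuming the video metadata has loaded:
// const canvas = document.createElement("canvas");
// canvas.width = video.videoWidth;
// canvas.height = video.videoHeight;
// const pixels = extractPixels2d(canvas.getContext("2d")!, video);
```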

Side note: There is an API, createImageBitmap(), that allows us to get the current video frame as an image. Sadly, we cannot process this image any further but would have to go through the canvas again. On top of that, createImageBitmap() seems to be very slow.

The test case implementations do not yet consider multi-threaded solutions like OffscreenCanvas, primarily because browser support is still very limited.

The magic (or horror) that is WebGL

Time for honesty: I have no clue about WebGL. It took me a few days of copy-pasting and trial and error (mostly error) to put together the two test cases using WebGL. Resources I used include:

Video element

When using video elements, there is a bunch of browser behaviour that we need to keep in mind. Of particular interest here:

In order for the canvas element to use the video (the frame, in particular) as a source, the video needs to be rendered somewhere in the DOM; it cannot just exist in-memory.
To ensure that the video auto-plays (and continues to do so), it must reside in the visible screen area. Therefore, the video element must not be hidden via the hidden attribute or a display: none rule. Using position: fixed and an accessible hide solution based on visibility: hidden works, though.
The browser might decide not to render the video if the tab is inactive and not visible, in order to save energy. The requestAnimationFrame() API, however, continues to get called as usual.

Performance Analysis Setup & Implementation

Video file

As the test video file, we use the "Big Buck Bunny" short animated movie, in particular the 2D Full HD (1920x1080) 30fps version.

Source:

Website for downloads: http://bbb3d.renderfarming.net/download.html

Specific URL for video: http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_30fps_normal.mp4

The video file was prepared by:

cutting out a 30 second long clip from the full video that contains different types of animations and various cuts, using lossless-cut
converting the .mp4 file into a .webm file so that it works with the Chromium instance that puppeteer installs (see puppeteer#291), using the Adobe Media Encoder with the WebM Plugin

Test setup

Of course, we want to get meaningful and consistent results when analyzing the performance of each use case implementation. The following has been done to ensure this:

All use case implementations are completely separated from each other.
Sure, it's a lot of duplicate code and tons of storage used by node_modules folders, but it keeps things clean. In particular:
- Each use case implementation exists within a separate folder, and no code gets shared between use cases. This way, we can ensure that the implementation is kept to the absolute minimum, e.g. only the relevant video processor, no unnecessary logic or "pages".
- Each use case defines and installs its own dependencies, and thus has its own node_modules folder. This way, we can easily run our performance analysis tests using different dependencies per use case.
Performance analysis happens on a production build of the application.
That's the version our users will see, so that's the version we should test. In particular:
- Production builds might perform better (or at least differently) than development builds due to things like tree shaking, dead code elimination and minification.
Performance analysis happens with the exact same clean browser.
Let's keep variations to a minimum. In particular:
- We use the exact same version of Chrome for all tests, ensuring consistent results.
- We use a clean version of Chrome so that things like user profiles, settings or extensions / plugins don't affect the results.

While all this certainly helps to get solid test results, there will always be things out of our control, such as:

Browser stuff (e.g. garbage collection, any internal delays)
Software stuff (Windows, software running in the background)
Hardware stuff (CPU, GPU, RAM, storage)

All the performance profiling results documented below ran on the following system:

Area	Details
CPU	Intel Core i7 8700K 6x 3.70GHz
RAM	32GB DDR4-3200 DIMM CL16
GPU	NVIDIA GeForce GTX 1070 8GB
Storage	System: 512GB NVMe M.2 SSD, Project: 2TB 7,200rpm HDD
Operating System	Windows 10 Pro, Version 1909, Build 18363.778

Test implementation

Within each test case implementation, the start-analysis.bin.ts script is responsible for executing the performance analysis and writing the results onto the disk.

In particular, it follows these steps:

Step	Description
1	Start the server that serves the frontend application build locally
2	Start the browser, and navigate to the URL serving the frontend application
3	Start the browser performance profiler recording
-	Wait for the test to finish
4	Stop the browser performance profiler recording
5	Write results to disk
6	Close browser
7	Close server

Internally, we use Puppeteer to control a browser, and use the native Node.js server API to serve the frontend to that browser.
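
The order of the steps in the table, including the cleanup when something fails, could be sketched roughly as follows. This is not the actual start-analysis.bin.ts script: runAnalysis, AnalysisDeps, startServer and waitForTestToFinish are hypothetical names, and PageLike / BrowserLike are structural stand-ins for the few Puppeteer Page and Browser members used here (newPage, goto, tracing.start, tracing.stop, close). The sketch only covers the tracing profile, not the profiler logs.

```typescript
interface PageLike {
  goto(url: string): Promise<unknown>;
  tracing: {
    start(options: { path: string }): Promise<void>;
    stop(): Promise<unknown>;
  };
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

interface AnalysisDeps {
  // Hypothetical helper that serves the production build locally
  startServer(): Promise<{ url: string; close(): Promise<void> }>;
  // With Puppeteer, e.g. () => puppeteer.launch({ headless: false })
  launchBrowser(): Promise<BrowserLike>;
  // Hypothetical helper that resolves once the test has finished
  waitForTestToFinish(page: PageLike): Promise<void>;
  tracePath: string;
}

async function runAnalysis(deps: AnalysisDeps): Promise<void> {
  const server = await deps.startServer(); // step 1
  try {
    const browser = await deps.launchBrowser(); // step 2
    try {
      const page = await browser.newPage();
      await page.goto(server.url);
      await page.tracing.start({ path: deps.tracePath }); // step 3
      await deps.waitForTestToFinish(page);
      await page.tracing.stop(); // steps 4 and 5: the trace is written to tracePath
    } finally {
      await browser.close(); // step 6, also runs if a step above fails
    }
  } finally {
    await server.close(); // step 7
  }
}
```

Nesting the try / finally blocks means the browser and the server are closed in reverse order of creation even when navigation or tracing throws.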

Heads up!
For some reason, Chrome produces extremely high profiling values when running in headless mode. Thus, all tests are being executed with headless mode disabled.

How to run a test

To run a performance analysis on a use case, follow these steps:

Install dependencies by running npm run install
Create a production build by running npm run build
Run the performance analysis by running npm run start:analysis

The script will create the following two files within the results folder:

profiler-logs.json contains the React profiler results
The root project is a React app that offers a visualization of this file in the form of charts. Simply run npm start and select a profiler-logs.json file.
tracing-profile.json contains the browser performance tracing timeline
This file can be loaded into the "Performance" tab of the Chrome Dev Tools, or can be uploaded to and viewed online using the DevTools Timeline Viewer

Performance Analysis

Summary

Test parameters

We are running the performance analysis with the following parameters:

We play a 30 second 1080p video and extract raw pixels on every browser render cycle

Test results (summary)

The following table shows a short test summary. See further chapters for more details.

Test case	Duration	Render duration	Extract duration	Comparison (duration)
2D Canvas	~6.64ms	~0.49ms	~6.15ms	100% (baseline)
WebGL Canvas (Variant 1)	~4.06ms	~0.50ms	~3.57ms	61.14%
WebGL Canvas (Variant 2)	~4.05ms	~0.49ms	~3.56ms	61.00%
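
How the render and extract durations were measured is not shown here. One plausible way, sketched as an assumption, is to take a timestamp before and after each of the two steps; timeFrame and FrameTiming are hypothetical names, and in a browser the clock argument would be () => performance.now().

```typescript
interface FrameTiming {
  renderMs: number;
  extractMs: number;
  totalMs: number;
}

/** Times the render and extract steps of a single frame separately. */
function timeFrame(
  render: () => void,
  extract: () => void,
  now: () => number,
): FrameTiming {
  const start = now();
  render();
  const afterRender = now();
  extract();
  const end = now();

  return {
    renderMs: afterRender - start,
    extractMs: end - afterRender,
    totalMs: end - start,
  };
}
```

Averaging these values over all processed frames would then yield numbers comparable to the table columns.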

Interpretation of results

Using a WebGL canvas shortens the overall duration to about 61% of the 2D canvas duration, a speedup of roughly 1.64x (6.64ms / 4.06ms).
There is no visible performance difference between the two WebGL implementations.
The tracing profiles suggest that WebGL-based solutions perform more consistently than a 2D canvas (fewer duration spikes).
Overall, render duration stays consistent across all test cases; only pixel extraction is faster when using WebGL.

Recommendations

While the relative performance improvement is moderate, the absolute saving, here around 2.6ms per frame (6.64ms vs. 4.06ms), is a good reason to switch to a WebGL-based solution, especially when keeping the usual frame budget at 60fps (1000ms / 60, about 16.67ms) in mind.
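
The comparison figures can be re-derived from the rounded averages in the summary table. The following sketch does that arithmetic; compareToBaseline is a hypothetical helper, and the 60fps frame budget is the only value not taken from the table.

```typescript
interface Comparison {
  relativePercent: number; // candidate duration as a percentage of the baseline
  speedup: number; // baseline duration divided by candidate duration
  savedMs: number; // absolute time saved per frame
  budgetSharePercent: number; // share of a 60fps frame budget the candidate uses
}

const FRAME_BUDGET_MS = 1000 / 60;

function compareToBaseline(baselineMs: number, candidateMs: number): Comparison {
  return {
    relativePercent: (candidateMs / baselineMs) * 100,
    speedup: baselineMs / candidateMs,
    savedMs: baselineMs - candidateMs,
    budgetSharePercent: (candidateMs / FRAME_BUDGET_MS) * 100,
  };
}

// 2D Canvas (baseline) vs. WebGL Canvas (Variant 1)
const webgl = compareToBaseline(6.64, 4.06);
console.log(webgl.relativePercent.toFixed(2)); // "61.14"
console.log(webgl.speedup.toFixed(2)); // "1.64"
console.log(webgl.savedMs.toFixed(2)); // "2.58"
console.log(webgl.budgetSharePercent.toFixed(1)); // "24.4"

// The 2D canvas compared with itself, to get its share of the frame budget
const canvas2d = compareToBaseline(6.64, 6.64);
console.log(canvas2d.budgetSharePercent.toFixed(1)); // "39.8"
```

In other words, pixel extraction alone takes roughly 40% of a 60fps frame budget with the 2D canvas and roughly 24% with WebGL, before any actual processing of the pixels happens.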

Test case: 2D Canvas

In this test case, we use a simple 2D canvas to render the video frame into and extract raw pixels from.

Implementation pointer: Video Processor

Timeline

The following chart shows that durations generally stay close to the average, although quite a few spikes in both directions occur.

Durations

The average duration is around 6.6ms to 6.7ms, with a few durations being slightly faster and some durations being significantly slower.

Render durations

Rendering a video frame into a 2D canvas is generally very fast, taking about 0.5ms. A few times, rendering happens faster, and at times very slowly. Compared to the whole duration, the rendering step only accounts for a small amount of the overall time.

Extract durations

Reading raw pixels from an image rendered into a 2D canvas is generally very slow, taking between 6.1ms and 6.2ms. A few times, the pixel extraction happens a bit faster; other times it takes considerably more time. Overall, this step is the main reason for the overall slow process.

Tracing

This tracing profile looks fairly clean; the GPU access time can be clearly seen here.

Test case: WebGL Canvas (Variant 1)

In this test case, we use a WebGL canvas to render the video frame into and extract raw pixels from.

Implementation pointer: Video Processor
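
How the two WebGL variants are implemented is not described in this text. As a rough sketch of one common WebGL approach, and not necessarily what either variant does, the video frame can be uploaded into a texture that is attached to a framebuffer, and the pixels can then be read back with readPixels. createWebglExtractor and GlLike are hypothetical names; GlLike lists only the WebGLRenderingContext members the sketch calls, and in a browser gl would be the context returned by canvas.getContext("webgl").

```typescript
interface VideoFrameSource {
  readonly videoWidth: number;
  readonly videoHeight: number;
}

interface GlLike<S> {
  readonly TEXTURE_2D: number;
  readonly RGBA: number;
  readonly UNSIGNED_BYTE: number;
  readonly FRAMEBUFFER: number;
  readonly COLOR_ATTACHMENT0: number;
  createTexture(): unknown;
  createFramebuffer(): unknown;
  bindTexture(target: number, texture: unknown): void;
  bindFramebuffer(target: number, framebuffer: unknown): void;
  framebufferTexture2D(target: number, attachment: number, textarget: number, texture: unknown, level: number): void;
  texImage2D(target: number, level: number, internalformat: number, format: number, type: number, source: S): void;
  readPixels(x: number, y: number, width: number, height: number, format: number, type: number, pixels: Uint8Array): void;
}

function createWebglExtractor<S extends VideoFrameSource>(gl: GlLike<S>): (video: S) => Uint8Array {
  // One-time setup: a texture that receives the frame, attached to a
  // framebuffer so that readPixels reads from the texture.
  const texture = gl.createTexture();
  const framebuffer = gl.createFramebuffer();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, texture, 0);

  let pixels = new Uint8Array(0);

  return (video: S): Uint8Array => {
    const size = video.videoWidth * video.videoHeight * 4; // RGBA
    if (pixels.length !== size) {
      pixels = new Uint8Array(size); // (re)allocate only when the frame size changes
    }

    // Render step: upload the current video frame into the texture
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, video);

    // Extraction step: read the raw pixels back into the reused buffer
    gl.readPixels(0, 0, video.videoWidth, video.videoHeight, gl.RGBA, gl.UNSIGNED_BYTE, pixels);
    return pixels;
  };
}
```

The returned buffer is reused between frames, so a caller that wants to keep a frame around has to copy it before the next call.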

Timeline

The following chart shows that durations generally stay close to the average, although very few spikes in both directions occur.

Durations

The average duration is around 4.0ms to 4.1ms, with a few durations being slightly faster and some durations being significantly slower.

Render durations

Rendering a video frame into a WebGL canvas is generally very fast, taking about 0.5ms. A few times, rendering happens faster, and at times very slowly. Compared to the whole duration, the rendering step only accounts for a small amount of the overall time.

Extract durations

Reading raw pixels from an image rendered into a WebGL canvas is generally slow, taking between 3.5ms and 3.6ms. A few times, the pixel extraction happens a tiny bit faster; other times it takes considerably more time. Overall, this step accounts for most of the overall duration.

Tracing

This tracing profile looks fairly clean; the GPU access time can be clearly seen here.

Test case: WebGL Canvas (Variant 2)

In this test case, we use a WebGL canvas to render the video frame into and extract raw pixels from.

Implementation pointer: Video Processor

Timeline

The following chart shows that durations generally stay close to the average, although very few spikes in both directions occur.

Durations

The average duration is around 4.0ms to 4.1ms, with a few durations being slightly faster and some durations being significantly slower.

Render durations

Rendering a video frame into a WebGL canvas is generally very fast, taking about 0.5ms. A few times, rendering happens faster, and at times very slowly. Compared to the whole duration, the rendering step only accounts for a small amount of the overall time.

Extract durations

Reading raw pixels from an image rendered into a WebGL canvas is generally slow, taking between 3.5ms and 3.6ms. A few times, the pixel extraction happens a tiny bit faster; other times it takes considerably more time. Overall, this step accounts for most of the overall duration.

Tracing

This tracing profile looks fairly clean; the GPU access time can be clearly seen here.

dominique-mueller/javascript-video-processing-analysis
