Non-blocking assets loaders #11746

Closed
fernandojsg opened this issue Jul 11, 2017 · 58 comments

Comments

@fernandojsg
Collaborator

As discussed in #11301, one of the main problems we have in WebVR, although it is annoying in non-VR experiences too, is blocking the main thread while loading assets.

With the recent implementation of link traversal in the browser, non-blocking loading is a must to ensure a satisfying user experience. If you jump from one page to another and the target page starts to load assets that block the main thread, the render function will be blocked too, so no frames will be submitted to the headset. After a short grace period the browser will kick us out of VR, and the user will have to take off the headset, click "Enter VR" again (a user gesture is required to do so) and go back to the experience.

Currently we can see two implementations of non-blocking loading of OBJ files:

Both have their pros and cons, and I'm honestly not enough of an expert on web workers to evaluate those implementations, but it's an interesting discussion that ideally would lead to a generic module that could be used to port the loaders to a non-blocking version.

Any suggestions?

/cc @mikearmstrong001 @kaisalmen @delapuente @spite

@spite
Contributor

spite commented Jul 11, 2017

You can have a Promise + worker + incremental based loader (a bit like a mix of both approaches).

Pass the source URL to the worker script, fetch the resources, and return a struct of transferable objects with the required buffers, structs, even ImageBitmaps; it should be straightforward enough not to need a lot of three.js processing overhead.

Uploading data to the GPU will be blocking regardless, but you can build a queue to distribute the commands across different frames via display.rAF. The commands can be executed one at a time per frame, or you can measure the average time of the operation and run as many as are "safe" within the current frame budget (something similar to requestIdleCallback would be nice, but it's not widely supported, and it's problematic in WebVR sessions). It can also be improved by using bufferSubData, texSubImage2D, etc.
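A minimal sketch of such a per-frame queue (the names GpuUploadQueue, maxJobsPerFrame and the display argument are illustrative, not an existing three.js API); each queued job would perform one piece of GPU work, e.g. a single texSubImage2D or bufferSubData call:

function GpuUploadQueue( display, maxJobsPerFrame ) {

	var jobs = [];

	function tick() {

		// Run at most maxJobsPerFrame pending GPU commands this frame.
		for ( var i = 0; i < maxJobsPerFrame && jobs.length > 0; i ++ ) {

			jobs.shift()();

		}

		// Keep pumping on the active display so WebVR keeps receiving frames.
		( display || window ).requestAnimationFrame( tick );

	}

	( display || window ).requestAnimationFrame( tick );

	return {

		add: function ( job ) { jobs.push( job ); }

	};

}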

Support for workers and transferable objects is pretty solid right now, especially in WebVR-capable browsers.

@kaisalmen
Contributor

Hi all, I have a prototype available that may be of interest to you in this context. See the following branch:
https://github.com/kaisalmen/WWOBJLoader/tree/Commons
Here the mesh provisioning part has been completely separated from WWOBJLoader2:
https://github.com/kaisalmen/WWOBJLoader/blob/Commons/src/loaders/WWLoaderCommons.js

WWLoaderCommons makes it easy to implement other mesh providers (file format loaders). Basically, it defines how a web worker implementation has to provide mesh data back to the main thread, and it processes/integrates that data into the scene. See the random triangle junk provider 😉 which serves as a tech demonstrator:
https://github.com/kaisalmen/WWOBJLoader/tree/Commons/test/meshspray
https://kaisalmen.de/proto/test/meshspray/main.src.html

Even in the current implementation WWOBJLoader2 relies on transferable objects (ArrayBuffers/ByteBuffers) to provide the raw BufferGeometry data for the Mesh from the worker to the main thread. Time-wise, the creation of the Mesh from the provided ByteBuffers is negligible. Whenever a bigger mesh is integrated into the scene, however, the rendering stalls (data copies, scene graph adjustments ... !?). This is always the case independent of the source (correct me if I am wrong).
The "stream" mode of WWOBJLoader2 smooths out these stalls, but if a single mesh piece from your OBJ model weighs in at 0.5 million vertices, then rendering will pause for a longer period of time.

I have opened a new issue to detail exactly what I have done on that branch and why:
kaisalmen/WWOBJLoader#11
The issue is still a stub and details will follow soon.

@donmccurdy
Collaborator

donmccurdy commented Jul 12, 2017

To offer some numbers, here's a performance profile of https://threejs.org/examples/webgl_loader_gltf2.html, loading a 13MB model with 2048x2048 textures.

(screenshot: performance profile of the load, 2017-07-11)

In this case the primary thing blocking the main thread is uploading textures to the GPU, and as far as I know that can't be done from a Web Worker... either the loader should add textures gradually, or three.js should handle it internally.

For the curious, the final chunk blocking the main thread is addition of an environment cubemap.

@mikearmstrong001

The main aim for react-vr is not necessarily to have the most optimal loader in terms of wall-clock time, but to not cause sudden and unexpected dropped frames as new content loads. Anything we can do to minimize this is beneficial to all, but especially to VR.

Textures are definitely an issue, and an obvious first step would be to optionally load them incrementally - a set of lines at a time for a big texture. As the upload is hidden from client programs it is going to be difficult for them to manage, but I'd be all for this being exposed more openly by the WebGL renderer to take the pressure off three.js.

For glTF parsing I commonly see blocking of around 500ms in my tests; this is significant, and I'd much prefer an incremental approach for all the loaders (which should also be clonable).

The premise of React VR is to encourage easy dynamic content driven by a web style so as to encourage more developers, and this will push more emphasis on improving dynamic handling. Most of the time we don't know which assets will be required at the beginning of our user created applications.

@kaisalmen Thanks for the link

@jbaicoianu
Contributor

In Elation Engine / JanusWeb, we actually do all our model parsing using a pool of worker threads, which works out pretty well. Once the workers have finished loading each model, we serialize it using object.toJSON(), send it to the main thread with postMessage(), and then load it using ObjectLoader.parse(). This removes most of the blocking portions of the loader code - there's still some time spent in ObjectLoader.parse() which could probably be optimized out, but overall interactivity and load speed is drastically improved. Since we're using a pool of workers, we can also parse multiple models in parallel, which is a huge win in complex scenes.
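A minimal sketch of that pattern (not the actual Elation Engine code; the file names, the loader choice and the message shape are illustrative, and the worker pool is omitted for brevity):

// worker.js - parse inside the worker and ship a JSON description back
importScripts( 'three.min.js', 'OBJLoader.js' ); // whatever the loader needs

onmessage = function ( e ) {

	var loader = new THREE.OBJLoader();
	loader.load( e.data.url, function ( object ) {

		postMessage( { id: e.data.id, json: object.toJSON() } );

	} );

};

// main.js - rebuild THREE objects on the main thread
var worker = new Worker( 'worker.js' );
var objectLoader = new THREE.ObjectLoader();

worker.onmessage = function ( e ) {

	scene.add( objectLoader.parse( e.data.json ) );

};

worker.postMessage( { id: 1, url: 'models/model.obj' } );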

On the texture side of things, yeah, I think some changes are needed to three.js's texture uploading functionality. A chunked uploader using texSubImage2D would be ideal, then we could do partial updates of large textures over multiple frames, as mentioned above.

I would be more than happy to collaborate on this change, as it would benefit many projects which use Three.js as a base

@takahirox
Collaborator

takahirox commented Jul 12, 2017

I think using texSubImage2D is a good idea.
But I also wonder why WebGL doesn't upload textures asynchronously.
Do OpenGL and other libraries have the same limitation?

And another thing I'm thinking about is GLSL compilation.
Will it drop frames? Or is it fast enough that we don't need to care?

@jbaicoianu
Contributor

jbaicoianu commented Jul 12, 2017

Yes, this is a problem in native OpenGL as well - compiling shaders and uploading image data are synchronous / blocking operations. This is why most game engines recommend or even force you to preload all content before you start the level - it's generally considered too much of a performance hit to load new resources even off of a hard drive, and here we're trying to do it asynchronously over the internet... we actually have a more difficult problem than most game devs, and we'll have to resort to more advanced techniques if we want to be able to stream new content in on the fly.

@Mugen87
Collaborator

Mugen87 commented Jul 13, 2017

Uploading textures will be less problematic if we use the new ImageBitmap API in the future. See https://youtu.be/wkDd-x0EkFU?t=82 .

BTW: Thanks to @spite, we already have an experimental ImageBitmapLoader in the project.

@jbaicoianu
Contributor

@Mugen87 actually I'm already doing all my texture loads with ImageBitmap in Elation Engine / JanusWeb - it definitely helps and is worth integrating into the Three.js core, but there are two main expenses involved with using textures in WebGL - image decode time, and image upload time - ImageBitmap only helps with the first.

This does cut the time blocking the CPU by about 50% in my tests, but uploading large textures to the GPU, especially 2048x2048 and up, can easily take a second or more.

@delapuente

It would be convenient to try what @jbaicoianu is suggesting. Anyway, if opting for the main-thread alternative, this seems a perfect match for requestIdleCallback instead of setTimeout.

@fernandojsg
Collaborator Author

fernandojsg commented Jul 14, 2017

I agree with you all. I believe the approach should be to load and parse everything in the worker, create the needed objects back on the main thread (if that's very expensive it could be done in several steps), and then add incremental loading to the renderer.
For an MVP we could define a maxTexturesUploadPerFrame (infinite by default), and the renderer would take care of uploading from the pool according to that number.
In the following iterations we could add logic, as @spite commented, to measure the average upload time and automatically upload based on a safe time range before blocking. This could be done initially with each texture as a unit, but it could then be improved to incrementally upload chunks of bigger textures.
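A rough sketch of that MVP idea (maxTexturesUploadPerFrame and pendingTextures are made-up names, not an existing three.js API; the throttling here works by only handing textures to their materials a few per frame, since the renderer uploads a texture the first time it renders a material that uses it):

var maxTexturesUploadPerFrame = 2; // hypothetical setting; infinite by default in the proposal
var pendingTextures = [];

// Instead of assigning a freshly loaded texture to its material immediately, queue it.
function queueTexture( material, texture ) {

	pendingTextures.push( { material: material, texture: texture } );

}

function animate() {

	requestAnimationFrame( animate );

	// Release at most N textures per frame.
	for ( var i = 0; i < maxTexturesUploadPerFrame && pendingTextures.length > 0; i ++ ) {

		var entry = pendingTextures.shift();
		entry.material.map = entry.texture;
		entry.material.needsUpdate = true;

	}

	renderer.render( scene, camera );

}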

requestIdleCallback would be nice, but it's not widely supported, and it's problematic in WebVR sessions

@spite I'm curious about your sentence - what do you mean by problematic?

@spite
Contributor

spite commented Jul 14, 2017

I have a THREE.UpdatableTexture to incrementally update textures using texSubImage2D, but it needs a bit of tweaking of three.js. The idea is to prepare a PR to add support.

Regarding requestIdleCallback (rIC):

  • first, it's supported on Chrome and Firefox, and although it can be polyfilled easily, the polyfilled version might defeat the purpose slightly.

  • second: the same way vrDisplay.requestAnimationFrame (rAF) needs to be called instead of window.rAF when presenting, the same applies to rIC, as discussed in this crbug. That means the loader needs to be aware of the currently active display at all times, or it will stop firing depending on what's presenting. It's not terribly complicated, it just adds more complexity to the wiring of the loaders (which should ideally just do their job, independently of the presentation state). Another option is to have the part of three.js that runs incremental jobs on the main thread share the current display; I think that's much easier to do now with the latest changes to VR in three.js.

Another consideration: in order to be able to upload one large texture in several steps using texSubImage2D (256x256 or 512x512), we need a WebGL2 context to have offset and clipping features. Otherwise the images have to be pre-clipped via canvas, basically tiled client-side before uploading.
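For reference, the client-side pre-clipping might look roughly like this (a sketch; extractTile is a made-up helper). In WebGL1 the x/y offsets then go into texSubImage2D itself rather than into the read of the source image:

function extractTile( image, x, y, tileSize ) {

	var canvas = document.createElement( 'canvas' );
	canvas.width = tileSize;
	canvas.height = tileSize;

	var ctx = canvas.getContext( '2d' );

	// Copy one tileSize x tileSize region of the big image into a small canvas.
	ctx.drawImage( image, x, y, tileSize, tileSize, 0, 0, tileSize, tileSize );

	return canvas;

}

// Later, on the GL side:
// gl.texSubImage2D( gl.TEXTURE_2D, 0, x, y, gl.RGBA, gl.UNSIGNED_BYTE, extractTile( image, x, y, 512 ) );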

@fernandojsg
Collaborator Author

fernandojsg commented Jul 14, 2017

@spite Good point, I hadn't thought about rIC not being called when presenting. At first I thought we would need a display.rIC, but I believe .rIC should stay attached to window and be called when both window and display are idle.
I don't think I've heard anything related to this in the WebVR spec discussions; @kearwood may have more information, but it definitely is an issue we should address.

Looking forward to seeing your UpdatableTexture PR! :) Even if it's just a WIP we could move some of the discussion there.

@mrdoob
Owner

mrdoob commented Jul 21, 2017

Maybe loaders could become something like this...

THREE.MyLoader = ( function () {

	// parse file and output a plain js object
	function parser( text ) {
		return { 'vertices': new Float32Array() };
	}

	// convert the plain js object to THREE objects.
	function builder( data ) {
		var geometry = new THREE.BufferGeometry();
		geometry.addAttribute( 'position', new THREE.BufferAttribute( data.vertices, 3 ) );
		return geometry;
	}

	function MyLoader( manager ) {}
	MyLoader.prototype = {
		constructor: MyLoader,
		load: function ( url, onLoad, onProgress, onError ) {},
		parse: function ( text ) {
			return builder( parser( text ) );
		},
		parseAsync: function ( text, onParse ) {
			// build a worker from the parser's own source code
			var code = parser.toString() + '\nonmessage = function ( e ) { postMessage( parser( e.data ) ); };';
			var blob = new Blob( [ code ], { type: 'text/plain' } );
			var worker = new Worker( window.URL.createObjectURL( blob ) );
			worker.addEventListener( 'message', function ( e ) {
				onParse( builder( e.data ) );
			} );
			worker.postMessage( text );
		}
	};

	return MyLoader;

} )();
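Hypothetical usage of that proposed API (MyLoader is the sketch above, not an existing loader; material and scene are assumed to exist):

var loader = new THREE.MyLoader();

// Synchronous - blocks the main thread while parsing:
var geometry = loader.parse( text );

// Asynchronous - parsing happens in a worker built from the same parser code:
loader.parseAsync( text, function ( geometry ) {

	scene.add( new THREE.Mesh( geometry, material ) );

} );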

@spite
Contributor

spite commented Jul 25, 2017

First proposal release of THREE.UpdatableTexture

Ideally it should be part of any THREE.Texture, but I would explore this approach first.

@spite
Contributor

spite commented Jul 25, 2017

@mrdoob I see the merit in having the exact same code piped to the worker, it just feels soooo wrong 😄. I wonder what the impact of serialising, blobbing and re-evaluating the script would be; nothing too terrible, but I don't think the browser is optimised for these quirks 🙃

Also, ideally the fetch of the resource itself would happen in the worker. And I think the parser() method in the worker would need an importScripts of three.js itself.

But a single point for defining sync/async loaders would be kick-ass!

@kaisalmen
Contributor

kaisalmen commented Jul 26, 2017

@mrdoob the builder function could be completely generic and common to all loaders (WIP: https://github.com/kaisalmen/WWOBJLoader/blob/Commons/src/loaders/support/WWMeshProvider.js#LL215-LL367; Update: not yet isolated in a function). If the input data is constrained to pure js objects without references to any THREE objects (that's what you have in mind, right?) we could build serializable worker code without the need for imports in the worker (which is what WWOBJLoader does). This is easy for Geometry, but Materials/Shaders (if defined in the file) could then only be created in the builder and only be described as JSON beforehand by the parser.
A worker should signal every new Mesh and its completion, I think. It could be altered like this:

// parse file and output js object
function parser( text, onMeshLoaded, onComplete ) {
    // ...
}
parse: function ( text ) {
    var node = new THREE.Object3D();
    var onMeshLoaded = function ( data ) {
        node.add( builder( data ) );
    };

    // onComplete as second callback is only provided in the async case
    parser( text, onMeshLoaded );
    return node;
},

A worker builder util is helpful, plus some generic communication protocol, which does not contradict your idea of using the parser as is, but it needs some wrapping, I think. Current state of the WWOBJLoader evolution: https://github.com/kaisalmen/WWOBJLoader/blob/Commons/src/loaders/support/WWMeshProvider.js#LL40-LL133, where the front-end calls are report_progress, meshData and complete.

Update2:

  • Do you think the Parser should be stateless? It's ok for the builder, but it could make sense to be able to set some parameters to adjust the behavior of the parser. This also implies configuration parameters should be transferable to the worker independent of parsing
  • It would be cool to have something like a run function that eats a generic configuration object. A generic director could then feed any loader with instructions (this is now working in the bespoke Commons branch of WWOBJLoader, btw)
  • ongoing dev: WWOBJLoader2 now extends OBJLoader and overrides parse. So we have both parsing capabilities, but in different classes. It comes close to the proposal, but it is not fully in line yet. Some parser code needs to be unified and eventually both classes need to be fused

That's it for now. Feedback welcome 😄

@jbaicoianu
Contributor

@mrdoob I like the idea of composing the worker out of the loader's code on the fly. My current approach just loads the entire combined application js and uses a different entry point than the main thread - definitely not as efficient as having workers composed of just the code they need.

I like the approach of using a trimmed-down transmission format for passing between workers, because it's easy to mark those TypedArrays as transferrable when passing back to the main thread. In my current approach I'm using the .toJSON() method in the worker, but then I go through and replace the JS arrays for vertices, UVs, etc. with the appropriate TypedArray type, and mark them as transferrable when calling postMessage. This makes the parsing/memory usage a bit lighter in the main thread, at the cost of a bit more processing/memory usage in the worker - it's a fine trade-off to make, but it could be made more efficient by either introducing a new transmission format as you propose, or by modifying .toJSON() to optionally give us TypedArrays instead of JS arrays.
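A rough sketch of that worker-side step (not the actual Elation Engine code). It assumes BufferGeometry-based content and the standard ObjectLoader JSON layout, where each geometry's attributes live under geometry.data.attributes with a type string and a plain array:

// In the worker: serialize, then swap plain arrays back to typed arrays and transfer their buffers.
function toTransferableJSON( object ) {

	var json = object.toJSON();
	var transferables = [];

	json.geometries.forEach( function ( geometry ) {

		var attributes = geometry.data.attributes;

		for ( var name in attributes ) {

			var attribute = attributes[ name ];

			// Rebuild the typed array of the declared type (e.g. 'Float32Array').
			var typed = new self[ attribute.type ]( attribute.array );
			attribute.array = typed;
			transferables.push( typed.buffer );

		}

	} );

	return { json: json, transferables: transferables };

}

// loadedObject: the THREE.Object3D produced by the loader inside the worker.
var result = toTransferableJSON( loadedObject );
postMessage( result.json, result.transferables );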

The two downsides I see to this simplified approach are:

  • Needs a rewrite of the existing model loaders. Many loaders use built-in THREE namespaced classes and functions to perform their task, so we'd probably have to pull in some minimal set of three.js code - and with workers this gets tricky
  • The transmission format should properly capture an object hierarchy, as well as different object types (Mesh, SkinnedMesh, Light, Camera, Object3D, Line, etc.)

@jbaicoianu
Contributor

jbaicoianu commented Jul 26, 2017

@spite Regarding "Also, ideally the fetch of the resource itself would happen in the worker." - this was my thinking when I first implemented the worker-based asset loader for Elation Engine - I had a pool of 4 or 8 workers, and I would pass them jobs as they became available, and then the workers would fetch the files, parse them, and return them to the main thread. However, in practice what this meant was that the downloads would block parsing, and you'd lose the benefits you'd get from pipelining, etc. if you requested them all at once.

Once we realized this, we added another layer to manage all our asset downloads, and then the asset downloader fires events to let us know when assets become available. We then pass these off to the worker pool, using transferrables on the binary file data to get it into the worker efficiently. With this change, the downloads all happen faster even though they're on the main thread, and the parsers get to run full-bore on processing, rather than twiddling their thumbs waiting for data. Overall this turned out to be one of the best optimizations we made in terms of asset load speed.

@jbaicoianu
Contributor

jbaicoianu commented Jul 26, 2017

On the topic of texture loading, I've built a proof of concept of a new FramebufferTexture class, which comes with a companion FramebufferTextureLoader. This texture type extends WebGLRenderTarget, and its loader can be configured to load textures in chunked tiles of a given size, and compose them into the framebuffer using requestIdleCallback().

https://baicoianu.com/~bai/three.js/examples/webgl_texture_framebuffer.html

In this example, just select an image size and a tile size and it'll start the loading process. First we initialize the texture to pure red. We start the download of the images (they're about 10mb, so give it a bit), and when they complete we change the background to blue. At this point we start parsing the image with createImageBitmap(), and when that's done we set up a number of idle callbacks which contain further calls to createImageBitmap() to efficiently split the image into tiles. These tiles are rendered into the framebuffer over a number of frames, and have a significantly lower impact on frame times than doing it all at once.
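A sketch of that tiling step (illustrative only; it assumes an already-decoded ImageBitmap bitmap and a drawTile callback that renders one tile into the framebuffer at the given offset):

function uploadInTiles( bitmap, tileSize, drawTile ) {

	var tiles = [];

	for ( var y = 0; y < bitmap.height; y += tileSize ) {

		for ( var x = 0; x < bitmap.width; x += tileSize ) {

			tiles.push( { x: x, y: y, w: Math.min( tileSize, bitmap.width - x ), h: Math.min( tileSize, bitmap.height - y ) } );

		}

	}

	function processTile( tile ) {

		// Cut one region out of the decoded bitmap without touching the main image again.
		createImageBitmap( bitmap, tile.x, tile.y, tile.w, tile.h ).then( function ( tileBitmap ) {

			drawTile( tileBitmap, tile.x, tile.y );

		} );

	}

	function step( deadline ) {

		while ( tiles.length > 0 && deadline.timeRemaining() > 0 ) {

			processTile( tiles.shift() );

		}

		if ( tiles.length > 0 ) requestIdleCallback( step );

	}

	requestIdleCallback( step );

}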

NOTE - FireFox currently doesn't seem to implement all versions of createImageBitmap, and is currently throwing an error for me when it tries to split into tiles. As a result, this demo currently only works in Chrome. Does anyone have a reference for a createImageBitmap support roadmap in FireFox?

There's some clean-up I need to do, this prototype is a bit messy, but I'm very happy with the results and once I can figure out a way around the cross-browser problems (canvas fallback, etc), I'm considering using this as the default for all textures in JanusWeb. The fade-in effect is kind of neat too, and we could even get fancy and blit a downsized version first, then progressively load the higher-detail tiles.

Are there any performance or feature-related reasons anyone can think of why it might be a bad idea to have a framebuffer for every texture in the scene, as opposed to a standard texture reference? I couldn't find anything about max. framebuffers per scene, as far as I can tell once a framebuffer has been set up, if you're not rendering to it then it's the same as any other texture reference, but I have this feeling like I'm missing something obvious as to why this would be a really bad idea :)

@spite
Contributor

spite commented Jul 26, 2017

@jbaicoianu Re: Firefox's createImageBitmap - the reason is they don't support the dictionary parameter, so it doesn't support image orientation or color space conversion, which makes most applications of the API pretty useless. I filed two bugs related to the issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1367251 and https://bugzilla.mozilla.org/show_bug.cgi?id=1335594

@jbaicoianu
Contributor

jbaicoianu commented Jul 26, 2017

@spite that's what I thought too, I'd seen this bug about not supporting the options dictionary - but in this case I'm not even using that, I'm just trying to use the x, y, w, h options. The specific error I'm getting is:

Argument 4 of Window.createImageBitmap '1024' is not a valid value for enumeration ImageBitmapFormat.

Which is confusing, because I don't see any version of createImageBitmap in the spec which takes an ImageBitmapFormat as an argument.

@wrr
Contributor

wrr commented Jul 26, 2017

Are there any performance or feature-related reasons anyone can think of why it might be a bad idea to have a framebuffer for every texture in the scene, as opposed to a standard texture reference? I couldn't find anything about max. framebuffers per scene, as far as I can tell once a framebuffer has been set up, if you're not rendering to it then it's the same as any other texture reference, but I have this feeling like I'm missing something obvious as to why this would be a really bad idea :)

@jbaicoianu THREE.WebGLRenderTarget keeps a framebuffer, a texture and a render buffer. When you have the texture assembled, you can delete the framebuffer and the render buffer and only keep the texture. Something like this should do it (not tested):

texture = target.texture;
target.texture = null; // so the webgl texture is not deleted by dispose()
target.dispose();

@jbaicoianu
Contributor

@wrr that's good to know, thanks. I definitely have to do a pass on memory efficiency on this too - it inevitably crashes at some point if you change the parameters enough, so I know there's some clean-up I'm not doing yet. Any other hints like this would be much appreciated.

@kaisalmen
Contributor

kaisalmen commented Jul 29, 2017

@mrdoob and @jbaicoianu I forgot to mention that I like the idea, too. 😄
I have uncluttered the code (reworked init, worker instructions object, replaced the rubbish multi-callback handling, common resource description, etc.) of OBJLoader and WWOBJLoader and all examples (code). Both loaders are now ready to be combined. They will be, according to your blueprint, hopefully some time next week, depending on my spare time:
Directed WWOBJLoader2 test:
https://kaisalmen.de/proto/test/wwparallels/main.src.html
Directed user of generic WorkerSupport:
https://kaisalmen.de/proto/test/meshspray/main.src.html
The big zipped OBJ file test:
https://kaisalmen.de/proto/test/wwobjloader2stage/main.src.html

I will update the above examples with newer code when available and let you know.
Update 2017-07-30: OBJLoader2 and WWOBJLoader2 now use identical Parsers. They pass data to common builder function directly or from worker.
Update 2017-07-31: WWOBJLoader2 is gone. OBJLoader2 provides parse and parseAsync, load and run (fed by LoaderDirector or manually)

Update 2017-08-09:
Moved update to new post.

@kaisalmen
Contributor

kaisalmen commented Aug 10, 2017

OBJLoader2 is signature- and behaviour-compatible with OBJLoader again (I broke this during its evolution); in addition, OBJLoader2 provides parseAsync and load with a useAsync flag. I think it is ready to be called V2.0.0-Beta now. Here you find the current dev status:
https://github.com/kaisalmen/WWOBJLoader/tree/V2.0.0-Beta/src/loaders

I have extracted LoaderSupport classes (independent of OBJ) that serve as utilities and required support tools. They could be re-used by other potential worker-based loaders. I put all the code below under the namespace THREE.LoaderSupport to highlight its independence from OBJLoader2:

  • Builder: For general mesh building
  • WorkerDirector: Creates loaders via reflection, processes PrepData in queue with configured amount of workers. Used to fully automate loaders (MeshSpray and Parallels demo)
  • WorkerSupport: Utility class to create workers from existing code and establish a simple communication protocol
  • PrepData + ResourceDescriptor: Description used for automation or simply for unified description among examples
  • Commons: Possible base class for loaders (bundles common parameters)
  • Callbacks: (onProgress, onMeshAlter, onLoad) used for automation and direction and LoadedMeshUserOverride is used to provide info back from onMeshAlter (normals addition in objloader2 test below)
  • Validator: null/undefined variable checks

@mrdoob @jbaicoianu OBJLoader2 now wraps a parser as suggested (it is configured with parameters globally set or received by PrepData for run). The Builder receives every single raw mesh and the parser returns the base node, but apart from that it matches the blueprint.
There is still some helper code in OBJLoader2 for serialization of the Parser that is likely not needed.
The Builder needs clean-up as the contract/parameter object for buildMeshes function is still heavily influenced by OBJ loading and is therefore still considered under construction.

The code needs some polishing, but then it is ready for feedback, discussion, criticism, etc... 😄

Examples and Tests

OBJ Loader using run and load:
https://kaisalmen.de/proto/test/objloader2/main.src.html
OBJ Loader using run async and parseAsync:
https://kaisalmen.de/proto/test/wwobjloader2/main.src.html
Directed use of run async OBJLoader2:
https://kaisalmen.de/proto/test/wwparallels/main.src.html
Directed use of generic WorkerSupport:
https://kaisalmen.de/proto/test/meshspray/main.src.html
The big zipped OBJ file test:
https://kaisalmen.de/proto/test/wwobjloader2stage/main.src.html

@mrdoob
Owner

mrdoob commented Aug 10, 2017

Looking good! Are you aware of these changes in OBJLoader? #11871 565c6fd

@takahirox
Collaborator

Ah, OK. I've missed that spec.

@Mugen87
Collaborator

Mugen87 commented Aug 21, 2018

The importance of the options dictionary is also discussed here:

https://bugzilla.mozilla.org/show_bug.cgi?id=1335594

@spite
Contributor

spite commented Aug 21, 2018

The bugs on bugzilla (https://bugzilla.mozilla.org/show_bug.cgi?id=1367251, https://bugzilla.mozilla.org/show_bug.cgi?id=1335594) have been there untouched for ... two years now? I didn't think it would take them this bloody long to fix it.

So the problem is that "technically" the feature is supported on FF, but in practice it is useless. In order to use it, we could have a path for Chrome that uses it, and another for the other browsers that doesn't. Problem is, since Firefox does have the feature, we'd have to do UA sniffing, which sucks.

The practical solution is performing feature detection: build a 2x2 image using cIB with the flip flag, and then read back and make sure the values are correct.
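Something along those lines (a rough sketch of the feature test, not exhaustively tested across browsers):

function detectImageBitmapFlip() {

	// 2x2 RGBA image: top row red, bottom row blue.
	var data = new Uint8ClampedArray( [
		255, 0, 0, 255,   255, 0, 0, 255,
		0, 0, 255, 255,   0, 0, 255, 255
	] );
	var imageData = new ImageData( data, 2, 2 );

	return createImageBitmap( imageData, { imageOrientation: 'flipY' } ).then( function ( bitmap ) {

		var canvas = document.createElement( 'canvas' );
		canvas.width = 2;
		canvas.height = 2;
		var ctx = canvas.getContext( '2d' );
		ctx.drawImage( bitmap, 0, 0 );

		// If the flip was honoured, the top-left pixel should now be blue.
		var pixel = ctx.getImageData( 0, 0, 1, 1 ).data;
		return pixel[ 0 ] === 0 && pixel[ 2 ] === 255;

	} ).catch( function () {

		// Browsers that reject the options dictionary end up here.
		return false;

	} );

}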

@takahirox
Collaborator

takahirox commented Aug 21, 2018

About the FireFox bugs, I'm also going to contact them internally. Let's see if we need a workaround after we hear their plan.

@fernandojsg
Collaborator Author

The bugs on bugzilla (https://bugzilla.mozilla.org/show_bug.cgi?id=1367251, https://bugzilla.mozilla.org/show_bug.cgi?id=1335594) have been there untouched for ... two years now? I didn't think it would take them this bloody long to fix it.

Yep, sorry for that, I really didn't follow up on it for a while -_-

So the problem is that "technically" the feature is supported on FF, but in practice is useless. In order to use it, we could have a path for Chrome that uses it, and another for the other browsers that doesn't. Problem is, since Firefox does have the feature, we'd have to do UA sniffing, which sucks.

The practical solution is performing feature detection: build a 2x2 image using cIB with the flip flag, and then read back and make sure the values are correct.

Yep, I agree that both solutions really suck and we should try to avoid them, so before digging into any of these let's see if we can unblock it on our side.

@takahirox
Collaborator

takahirox commented Aug 23, 2018

I made an ImageBitmap uploading performance test. It uploads a texture every 5 seconds.

You can compare Regular Image vs ImageBitmap.

https://rawgit.com/takahirox/three.js/ImageBitmapTest/examples/webgl_texture_upload.html (Regular Image)
https://rawgit.com/takahirox/three.js/ImageBitmapTest/examples/webgl_texture_upload.html?imagebitmap (ImageBitmap)

On my windows I see

Browser / method       8192x4096 JPG 4.4MB   2048x2048 PNG 4.5MB
Chrome, Image          500ms                 140ms
Chrome, ImageBitmap    165ms                 35ms
FireFox, Image         500ms                 40ms
FireFox, ImageBitmap   500ms                 60ms

(texture.generateMipmaps is true)

My thoughts

  1. 3x better performance with ImageBitmap on Chrome. Very nice improvement.
  2. Does FireFox have an ImageBitmap performance issue right now? Can you try on your computers, too? Trying on mobile is also welcome.
  3. Even with ImageBitmap, uploading still seems to block for large textures. Maybe we need an incremental partial-upload technique or something for non-blocking.

@Mugen87
Collaborator

Mugen87 commented Aug 23, 2018

Even with ImageBitmap, uploading texture seems to still block for large texture. Maybe we need partial uploading technique or something for non-blocking.

I guess one solution for this problem might be the usage of a texture compression format and the avoidance of JPG or PNG (and thus ImageBitmap). It would be interesting to see some performance data in this context.

@takahirox
Collaborator

takahirox commented Aug 23, 2018

Yes, agreed. But I guess we'd probably still see blocking for large textures, especially on low-power devices like mobile. Anyway, let's evaluate the performance first.

@spite
Contributor

spite commented Aug 23, 2018

Or use scheduled/requestIdleCallback texSubImage2D

@takahirox
Collaborator

rIC = requestIdleCallback?

@spite
Contributor

spite commented Aug 23, 2018

yes, i've made a ninja edit

@takahirox
Collaborator

OK. Yes agreed.

@takahirox
Collaborator

takahirox commented Aug 23, 2018

BTW, I'm not familiar with compressed textures yet. Let me confirm my understanding: we can't use compressed textures with ImageBitmap because compressedTexImage2D doesn't accept an ImageBitmap, correct?

https://developer.mozilla.org/en-US/docs/Web/API/WebGLRenderingContext/compressedTexImage2D

@jbaicoianu
Contributor

jbaicoianu commented Aug 23, 2018

I went back to revisit my old TiledTextureLoader experiments - seems like they're now causing my video driver to crash and restart :(

(edit: actually, it looks like even loading the largest texture (16k x 16k - https://baicoianu.com/~bai/three.js/examples/textures/dotamap1_25.jpg) directly in chrome is what's causing the crash. This used to work just fine, so seems to be some regression in chrome's image handling)

I'd done some experiments using requestIdleCallback, ImageBitmap, and ES6 generators to split a large texture into multiple chunks for uploading to the GPU. I used a framebuffer rather than a regular Texture, because even if you're using texSubImage2D to populate the image data, you still need to preallocate the memory, which requires uploading a bunch of empty data to the GPU, whereas a framebuffer can be created and initialized with a single GL call.
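For comparison, the texSubImage2D path looks roughly like this (a raw-WebGL sketch; extractRows is a made-up helper that would return a canvas or ImageBitmap holding one horizontal band of the source image):

var texture = gl.createTexture();
gl.bindTexture( gl.TEXTURE_2D, texture );
gl.texParameteri( gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR ); // no mipmaps while partially filled

// Allocate the full-size texture storage up front (null means no initial data is provided).
gl.texImage2D( gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0, gl.RGBA, gl.UNSIGNED_BYTE, null );

function uploadBand( image, y, bandHeight ) {

	var band = extractRows( image, y, bandHeight ); // hypothetical helper

	gl.bindTexture( gl.TEXTURE_2D, texture );
	gl.texSubImage2D( gl.TEXTURE_2D, 0, 0, y, gl.RGBA, gl.UNSIGNED_BYTE, band );

}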

The repository for those changes is still available here https://github.com/jbaicoianu/THREE.TiledTexture/

Some notes from what I remember of the experiments:

  • requestIdleCallback definitely helped reduce the jank while loading textures, at the expense of greatly increasing the total load time
  • With a bit of extra work, this could be mitigated by first uploading a downscaled version of the texture, then filling in the full-res data at a more leisurely pace
  • ES6 generators helped to make the code easier to understand, and easier to write without wasting memory, but probably aren't really necessary for this

@spite
Contributor

spite commented Aug 23, 2018

My results were similar: there was a trade off between upload speed and jankiness. (BTW I created this https://github.com/spite/THREE.UpdatableTexture).

I think that for the second option to work in WebGL 1, you would actually need two textures, or at least modifiers to the UV coordinates. In WebGL 2 I think it's easier to copy sources that are a different size from the target texture.

@jbaicoianu
Contributor

Yeah, with texSubImage2D I think that sort of resize wouldn't be possible, but when using a framebuffer, I'm using an OrthographicCamera to render a plane with the texture fragment, so it's just a matter of changing the scale of the plane for that draw call.

@takahirox
Collaborator

About the performance issue of ImageBitmap on FireFox, I opened a bug on bugzilla

https://bugzilla.mozilla.org/show_bug.cgi?id=1486454

@twilson7755

I have been looking to try and better understand when the data associated with a texture is actually loaded onto the GPU, and came across this thread. In my particular use case, I am NOT concerned about the loading and decoding of local jpeg/gif files into textures; I am only concerned about trying to preload texture data onto the GPU. After reading this thread I must confess I am not entirely sure if it is addressing both issues or only the former. Given that I only care about the latter, do I need to look for a different solution, or is there something in here that will help force the texture data to be loaded onto the GPU?

@gkjohnson
Collaborator

gkjohnson commented Jan 2, 2021

I've been looking into this as well, and I think using THREE.ObjectLoader as mentioned in #11746 (comment) to create a transferrable representation after parse is the most general and extensible approach. It lets users create and manage their own workers and do any extra transformation needed in the workers, as well. Using this pattern with a couple of changes to core and loading the cerberus.obj model, I was able to get ~200ms of total frame stall time down to ~5ms, so in my opinion this is almost already there. And the total time from load to model ready is just about the same, as well, once the worker is instantiated. I chose OBJLoader because it's particularly slow, but this should be beneficial for most loaders that have to do a lot of parsing. Here's roughly what the code looks like:

Before

// index.js
const loader = new OBJLoader();
loader.load( 'models/obj/cerberus/cerberus.obj', result => {

	// loaded!

} );

After

// index.js
const worker = new Worker( './objLoaderWorker.js' );
worker.onmessage = e => {

	// loaded!

};
worker.postMessage( 'models/obj/cerberus/cerberus.obj' );

// objLoaderWorker.js
globalThis.onmessage = e => {

	const loader = new OBJLoader();
	loader.load( e.data, result => {

		const json = result.toJSON();
		postMessage( json );

	} );

};

Of course there may be some limitations with ObjectLoader I'm unaware of, but in the basic case this seems to work. Now regarding some of the changes I had to make to get the performance improvements -- both Object3D.toJSON and ObjectLoader.parse can be a bit slow, specifically due to the conversion of TypedArrays to and from basic Arrays for the JSON representation. The two changes I made were to set the BufferAttribute typed arrays directly onto the toJSON result and to use those TypedArrays directly in ObjectLoader.parse.

With the above changes I saw improvements from ~180ms to less than 1ms for toJSON and ~180ms to ~2ms for ObjectLoader.parse. This is also without transferring those typed array buffers with Worker.postMessage, which could possibly further improve the 5ms stall time I noted above. All in all this is what I would like to change for this to be viable:

  • Provide an option for the toJSON function somehow that indicates that the BufferAttributes (and data textures, etc) just set the TypedArray directly onto the result rather than converting it to an Array.
  • Adjust getTypedArray function to return the same input array if it's already of the correct type.
  • Bonus would be returning a list of transferrable types from the toJSON function next to the fully serialized result.

I can make a sample PR with the changes for discussion if this sounds interesting. I'm also happy to make an example page to show how users can do this if the above changes can move forward. @mrdoob @donmccurdy

EDIT: I see there may also be some issues loading textures in web workers depending on the technique used, so this is only a piece of the puzzle

@mrdoob
Owner

mrdoob commented Jan 8, 2021

I can make a sample PR with the changes for discussion if this sounds interesting.

Yes it does! Let's start with these two:

  • Provide an option for the toJSON function somehow that indicates that the BufferAttributes (and data textures, etc) just set the TypedArray directly onto the result rather than converting it to an Array.
  • Adjust getTypedArray function to return the same input array if it's already of the correct type.

@jbaicoianu
Contributor

I can echo @gkjohnson's findings. I did a write-up about those changes; I think it got shared on some other tickets, but it should have been shared here too since it's relevant to this thread as well: https://github.com/jbaicoianu/elation-engine/wiki/Optimizations#models

We saw similar speed-ups using a monkey-patched version of BufferGeometry.toJSON() which preserves the attribute arrays as TypedArrays instead of converting them to JS arrays. I think @takahirox was also doing some experiments with a separate transferrable-optimized serialization representation, but I'm not sure if any of our experiments ever landed back in three.js core.
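A rough sketch of the shape of such a patch (not the exact code from the write-up above, and assuming a three.js version where BufferGeometry.toJSON() delegates to BufferAttribute.toJSON()):

THREE.BufferAttribute.prototype.toJSON = function () {

	return {
		itemSize: this.itemSize,
		type: this.array.constructor.name,
		// keep the live TypedArray instead of Array.prototype.slice.call( this.array )
		array: this.array,
		normalized: this.normalized
	};

};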

@gkjohnson
Collaborator

I've added a quick PR here to show what kinds of changes would be needed: #21035

@Mugen87
Collaborator

Mugen87 commented Mar 16, 2021

Merging into #18234.
