Proposal: DOM APIs in web workers? #1217

Open
BenjaminAster opened this issue Jul 30, 2023 · 42 comments

Comments

@BenjaminAster

I think there are valid use cases for DOM APIs like DOMParser, XMLSerializer, document.implementation.createDocument() etc. to be available in web workers. I don't mean having direct access to the current document (that wouldn't make sense, of course), but being able to parse, create, modify and serialize "offscreen" documents. Use cases for this include:

  • Parsing & serializing XML files off the main thread: For example, I'm currently working on a web-based rich text editor and Microsoft Word alternative, and I'm planning to add DOCX (Microsoft Word document file) support to it in the future. A DOCX file basically consists of a bunch of XML files zipped into a compressed archive. I can then (un)compress the zip file with the help of (De)CompressionStream and parse the XML files with DOMParser or create them with XMLSerializer. Currently, this has to be done on the main thread which will lead to the page being unresponsive while reading/writing DOCX files.
    Some projects like @jakearchibald's SVGOMG, an SVG optimizer & minifier based on SVGO, are currently even using XML parsing libraries like Sax instead of the browser's DOMParser – amongst other reasons, to make them work in web workers.
  • Generating HTML files off the main thread: Applications that generate HTML files – be it website builders, math document editors, Markdown to HTML transpilers, etc. – could profit immensely from being able to convert their internal representations to HTML off the main thread.

Only in the last few months have all three major browser engines come to support worker modules and OffscreenCanvas, so I think websites are starting to do more and more expensive stuff off the main thread, with people like @surma having advocated for that for years.

From a technical perspective, my proposal is that e.g. a global self.document property is exposed in workers, which is a stripped down version of Document containing only the following properties and functions:

  • self.document.implementation
  • self.document.createAttribute()
  • self.document.createAttributeNS()
  • self.document.createCDATASection()
  • self.document.createComment()
  • self.document.createDocumentFragment() (?)
  • self.document.createElement()
  • self.document.createElementNS()
  • self.document.createEvent()
  • self.document.createExpression()
  • self.document.createProcessingInstruction()
  • self.document.createRange() (?)
  • self.document.createTextNode()

Additionally, the following interfaces should be exposed in workers:

  • Document & XMLDocument
  • DocumentType
  • DOMImplementation
  • DocumentFragment
  • DOMParser
  • XMLSerializer
  • XSLTProcessor
  • Sanitizer
  • Node
  • ParentNode
  • Attr
  • CharacterData
  • Text
  • CDATASection
  • Element
  • Comment
  • HTMLElement and all HTML element interfaces
  • SVGElement and all SVG element interfaces
  • MathMLElement
  • NodeList
  • HTMLCollection
  • AbstractRange, StaticRange & Range
  • MutationObserver & MutationRecord (?)
  • NamedNodeMap
  • ProcessingInstruction
  • XPathResult, XPathExpression & XPathEvaluator

One could then use new DOMParser().parseFromString() or self.document.implementation.{createDocument(), createHTMLDocument()} to create a new document, modify it with all the usual and beloved DOM methods, and stringify it with new XMLSerializer().serializeToString() or myOffscreenDocument.documentElement.outerHTML.
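As a rough sketch of what that flow could look like from inside a worker: the helper name `roundTripXml` and its `scope` parameter are made up for illustration, and since workers did not expose `DOMParser` or `XMLSerializer` at the time of this discussion, the function simply returns `null` when they are missing.

```javascript
// Sketch only: `scope` stands in for the worker global (`self`).
// `roundTripXml` is a hypothetical helper, not part of any spec or library.
function roundTripXml(scope, xmlSource, mutate) {
  const { DOMParser, XMLSerializer } = scope;
  if (typeof DOMParser !== 'function' || typeof XMLSerializer !== 'function') {
    return null; // DOM APIs are not exposed in this environment
  }
  // Parse into an offscreen document, let the caller modify it, serialize it back.
  const doc = new DOMParser().parseFromString(xmlSource, 'application/xml');
  mutate(doc);
  return new XMLSerializer().serializeToString(doc);
}
```

In a worker this would be called as `roundTripXml(self, source, (doc) => { /* edit doc */ })`; passing the scope in explicitly also keeps the fallback path easy to exercise.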

Things like Element.prototype.getClientRects() or Element.prototype.computedStyleMap() don't make sense with offscreen documents of course, but that is already the case with documents created on the main thread with DOMParser or document.implementation.createHTMLDocument().

@WebReflection

WebReflection commented Jul 30, 2023

While I'd be +1 on this, this part is misleading:

I don't mean having direct access to the current document (that wouldn't make sense, of course)

that's already possible with coincident/window and it does make sense ... we use that to drive WASM targeting programming languages from a worker, without ever blocking via Atomics, giving them the ability to interact 1:1 with the DOM API (or anything else only available on main) so it's a solved problem to us, but surely having it native would be awesome, yet we're good, and we have demanded, working, and usable use cases, even my own DOM libraries work in there out of the box, so please let's not spread FUD around what's desirable or possible, as that's not necessary, thanks.

edit P.S. you'd probably be good with that module too, just use those API as they are from a worker and give it a shot, you might be surprised by everything just working out of the box. If not, please file an issue to the project, thanks again.

@jakearchibald
Collaborator

What are the advantages of this proposal, vs being able to create an iframe that runs in a different thread?

@BenjaminAster
Author

BenjaminAster commented Jul 31, 2023

@jakearchibald That seems a bit... clunky? Coming from a worker, you'd have to pass a message to the main thread, which sends it to the sandboxed iframe, which sends the result back to the main thread, which sends it back to the worker. Am I missing something here? At the end of the day, using an iframe for that is a hack, and not what iframes were designed to do. DOMParser & friends are not something that are architecturally coupled to the main thread, so they simply should just be available in workers as well.


@WebReflection Hmmm... A thing that makes web workers so awesome is that they are completely isolated from the main thread – on modern systems, they even run in separate CPU cores – and therefore are not constrained by having to finish any synchronous work before the browser renders the next frame. Stuff like DOM operations with the current document are fundamentally synchronous operations and have to operate on the main thread which manages it. Of course, you could give workers access to the current document, but the way this would work internally in the browser is that the worker would somehow notify the main thread to make a DOM operation, the main thread then does this synchronously, and sends a "done" message back to the worker. And this is exactly what libraries like your coincident, via.js or comlink are already doing, just by implementing it themselves with Proxies, Atomics, postMessage, etc. And don't get me wrong: I think it absolutely is an awesome developer experience to be able to modify the current DOM directly from a worker, but building this natively into web browsers simply improves DX because you don't have to use a library for that anymore (or implement all the Proxy/Atomics/postMessage horror yourself), but I don't think you would get any performance benefits from it, as the DOM operations would still have to be executed on the main thread at the end, just that the browser would do it for you and you (or your library) don't have to worry about it anymore.
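To make that round trip concrete, here is a heavily simplified sketch of the main-thread half. The message shape (`id`, `path`, `args`) and the name `handleDomRequest` are assumptions for illustration; this is not how coincident, via.js or comlink are actually implemented, and it leaves out the Proxy and Atomics layers entirely.

```javascript
// Hypothetical message shape: { id, path: ['document', 'title'], args?: [...] }.
// Runs on the main thread; `root` would be `window` in a real page.
function handleDomRequest(root, { id, path, args }) {
  try {
    // Walk the property path, keeping the parent so methods get the right `this`.
    let parent;
    let value = root;
    for (const key of path) {
      parent = value;
      value = value[key];
    }
    const result =
      typeof value === 'function' && args ? value.apply(parent, args) : value;
    return { id, ok: true, result };
  } catch (error) {
    // Report failures back instead of throwing on the main thread.
    return { id, ok: false, error: String(error) };
  }
}

// Wiring (not executed here): the main thread answers each worker message.
// worker.onmessage = ({ data }) => worker.postMessage(handleDomRequest(window, data));
```

The point of the sketch is only that the DOM work itself still happens synchronously inside `handleDomRequest`, that is, on the main thread.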

The proposal I'm talking about wouldn't involve the current document – and therefore the main thread – at all, and would work truly independently of anything outside the worker itself, which is not at all possible today (except if you use an iframe as Jake mentioned, or if you build your own HTML/XML parser, custom "virtual" DOM implementation, and HTML/XML serializer – which will never be as performant as the browser's native methods). This would give actual performance benefits as you have your own separate thread and can do a long, synchronous operation like parsing a giant HTML/XML file that may last dozens of milliseconds, while the document and the main thread simultaneously do their own independent thing.

you'd probably be good with that module too, just use those API as they are from a worker and give it a shot, you might be surprised by everything just working out of the box.

Going back to my use case of parsing a large number of XML files extracted from a zipped DOCX file, existing libraries like your coincident or other ones mentioned above do provide awesome developer experiences, but they do not solve my use case, as even though you can then create a DOMParser in a worker, everything is still just a proxy to the main thread (correct me if I'm wrong here) and the actual XML parsing would be executed on the main thread – which is exactly what I'm trying to avoid.

@WebReflection

@BenjaminAster you are right, proxied stuff will operate from the main when it comes to main-only utilities, but if iframe already uses a separated thread (or ... does it?) you can use coincident or other projects from that iframe and delegate the iframe to communicate eventually stuff to its parent? if the iframe doesn't create its own thread though I agree having DOMParser in workers is desirable and surely less hacky.

@surma

surma commented Jul 31, 2023

What are the advantages of this proposal, vs being able to create an iframe that runs in a different thread?

FWIW, I think being able to create document fragments in a Worker that can be manipulated without paying the cost of layout or rendering, but that can be sent to a renderer thread, seems valuable to me and sufficiently different from an iframe.

(I suppose a case could be made to introduce something like <iframe no-render> that can skip layout and rendering and effectively becomes a Worker with a DOM ™️ . Not sure if that has second-order implications tho).

@jakearchibald
Collaborator

@BenjaminAster

That seems a bit... clunky? Coming from a worker…

Yeah, that's fair. If your starting point is a worker, the iframe solution isn't great. But, maybe being able to create one of these iframes from a worker is a solution.

At the end of the day, using an iframe for that is a hack, and not what iframes were designed to do.

I don't find this very compelling. You could equally, and truthfully say that DOM APIs weren't designed to be in workers. Whatever solution is employed here will involve changing the intentional design of something.

DOMParser & friends are not something that are architecturally coupled to the main thread

Yes they are. They're absolutely coupled to documents. That's why they aren't available in workers.

Maybe their design could be changed so they don't need to be coupled to documents, but that isn't where we're at right now.

@jakearchibald
Collaborator

It feels like folks think there's a single line in browsers like:

if (isWorkerEnvironment) return;
exposeDOMAPIs();

But that isn't the case. It isn't that DOM APIs are simply not exposed in workers, it's that DOM APIs are not designed to work in non-document environments. Allowing DOM APIs to exist in workers will be a massive undertaking in terms of spec and implementation.

I'm not saying it's impossible, but it's not just flipping a flag.

DOM APIs are massively interlinked with style and rendering. It might be easier to create a new set of interfaces that don't have that issue, and can be cloned/transferred, and upgraded to HTMLElement & co within a document context.

@WebReflection

WebReflection commented Jul 31, 2023

DOM APIs are massively interlinked with style and rendering

but (new DOMParser).parseFromString(...) works already, right? I am not sure, if stuff is never live, how this API could be problematic once exposed via Worker 🤔


I went ahead and did a test ... the iframe hack is awkward (it needs a sandbox that apparently allows a different thread and at the same time is discouraged and it warns but it's needed for worker to execute).

index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <script src="../../mini-coi.js"></script>
    <script>
      addEventListener('message', ({data}) => {
        document.body.append(data);
      });
    </script>
  </head>
  <body>
    <iframe src="iframe.html"
      sandbox="allow-scripts allow-same-origin"
      frameborder="0" width="0" height="0"
      style="position:absolute;top:-1px;left:-1px"
    ></iframe>
  </body>
</html>

iframe.html

<!DOCTYPE html>
<script type="module">
import coincident from '../../window.js';
coincident(new Worker('./worker.js', {type: 'module'}));
</script>

worker.js

import coincident from '../../window.js';

const {window} = coincident(self);
const parser = new window.DOMParser;

const document = parser.parseFromString(
  '<!doctype html>',
  'text/html'
);

document.body.textContent = 'Hello World';

// send a message to the parent
window.parent.postMessage(document.documentElement.outerHTML);
// <html><head></head><body>Hello World</body></html>

Live test here

I believe this would cover @BenjaminAster's non-blocking use case via a whole DOM API that should not execute on the main thread, but I couldn't find any encouraging discussion around this assumption, yet it seems to be a de facto standard.

@WebReflection

btw ... I've just realized that if the iframe is already on a different thread, coincident is kinda useless ... I just used it to be sure I could at least have it running from an iframe but if it uses the iframe thread and that's sync, there's no advantage in doing that at all ... so iframe doesn't look like an answer if we can't guarantee it runs on a separate, non-blocking, thread.

@jakearchibald
Collaborator

@WebReflection

if stuff is never live, how this API could be problematic once exposed via Worker 🤔

What do you mean by 'live'? Remember that some elements have actions when they're constructed, not just when they're connected. Eg creating an image.

it needs a sandbox that apparently allows a different thread

I don't believe browsers run iframes in a different thread, even if they have the sandbox attribute.

iframe doesn't look like an answer if we can't guarantee it runs on a separate, non-blocking, thread

Right, that's why I was proposing a feature that did that.

@WebReflection

Remember that some elements have actions when they're constructed, not just when they're connected. Eg creating an image.

of course I did not think about that, fair enough then.

I don't believe browsers run iframes in a different thread, even if they have the sandbox attribute.

from live tests via SO iframes run in a different thread if:

  • the src points to a different domain
  • the sandbox attribute is used ... at least that's what devs observed and tested live

Right, that's why I was proposing a feature that did that.

it'd be awesome, and if not too problematic and it can speed up things more, @surma hint around no-render would be a strawberry on the cake.

@WebReflection

Remember that some elements have actions when they're constructed ... Eg creating an image.

wait a minute though ... I don't see any network activity in here ... that's what I meant by live ... if we parse to retrieve a document I don't think the parser constructs out of the box those elements until these are live/adopted ... what am I missing?

(new DOMParser).parseFromString('<img src="shenanigans.png">', 'text/html')

@jakearchibald
Collaborator

Maybe images were a bad example then - my point is that someone is going to have to go through all the elements and check that their constructor behaviours are worker compatible.

@WebReflection

I'd be curious to know which element might have issues though, as I think most of them need to be adopted and pass through the adopt algorithm before having any meaning for the current environment ... I've tested <base>, custom elements, others, I can't find anything working at all unless adopted by the "live document". MDN also doesn't specify anything around this behavior and standards mention that scripts will be flagged as not-executable https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#dom-domparser-parsefromstring-dev

that's still something to consider while adopting those nodes ... moreover:

The document's encoding will be left as its default, of UTF-8. In particular, any XML declarations or meta elements found while parsing string will have no effect.

In the parsing model it's also not clear why this would be unsafe if the document is created via the API ... looking forward to some enlightenment around this.

@BenjaminAster
Author

BenjaminAster commented Jul 31, 2023

Yes they are. They're absolutely coupled to documents. That's why they aren't available in workers.

Of course it will be some work to implement, but all of the things like computed CSS styles, layout, scripts, resource loading, ... aren't a thing in documents created by DOMParser or DOMImplementation::createHTMLDocument. That's what I meant by "not architecturally coupled to the main thread". I remember that when implementing OffscreenCanvas, that was a lot of work because suddenly stuff like font rendering and CSS parsing (via context2d.{font, fillStyle, etc.}) needed to work in workers. The only thing related to this that comes to my mind now is Document::styleSheets, which gives access to parsed CSS stylesheets and I think currently works also with "fake" documents. For this to work in workers, yes, there would have to be a basic CSS parser available in workers, but if that's too difficult to implement, I guess for the "minimum viable product" of worker DOM APIs, browsers could just leave this empty and not parse the CSS at all? I think the use cases for parsing CSS in a worker are minimal anyways.

Edit: Ok, it turns out Document::styleSheets does not work, but HTMLStyleElement::sheet does work, i.e.

new DOMParser().parseFromString("<!DOCTYPE html><style> body { color: red } </style>", "text/html").querySelector("style").sheet.cssRules

returns the correctly parsed CSS with one rule containing one declaration.

I don't believe browsers run iframes in a different thread, even if they have the sandbox attribute.

I know @WebReflection already mentioned that now, but at least in Chrome where I tested it, it seems that iframes with a sandbox attribute do run in their separate thread. You can try it out with e.g. this setup:

index.html:

<!DOCTYPE html>
<html lang="en">
<head>
	<script type="module">
		const frame = () => {
			millis.textContent = performance.now()
			requestAnimationFrame(frame)
		}
		requestAnimationFrame(frame)
	</script>
</head>
<body>
	<div id="millis"></div>
	<iframe src="iframe.html" sandbox="allow-scripts"></iframe>
</body>
</html>

iframe.html:

<!DOCTYPE html>
<html lang="en">
<head>
	<script type="module">
		const frame = () => {
			millis.textContent = performance.now()
			requestAnimationFrame(frame)
		}
		requestAnimationFrame(frame)
		block.onclick = () => {
			while(true);
		}
	</script>
</head>
<body>
	<div id="millis"></div>
	<button id="block">block</button>
</body>
</html>

If you click the "block" button in the iframe, the iframe is totally blocked but the parent frame continues to run.

Live demo now published at benjaminaster.com/playground/async-iframe

@jakearchibald
Collaborator

at least in Chrome where I tested it, it seems that iframes with a sandbox attribute do run in their separate thread

Interesting. That wasn't the case a couple of months ago when I last tested it. Is that the case on mobile too?

@WebReflection

WebReflection commented Jul 31, 2023

Is that the case on mobile too?

in the SO thread somebody mentioned on Android heuristics can be different (no guarantees, depends on ... things ...) but on Desktop it seems to be consistent.

The thread mentions also that multiple iframes, even with sandbox attribute, will share the same thread so if you add 2 iframes in the above example a click in one will (should) block the other iframe too (still not the main thread).

@jakearchibald
Collaborator

If you click the "block" button in the iframe, the iframe is totally blocked but the parent frame continues to run.

Live demo now published at benjaminaster.com/playground/async-iframe

It blocks the whole tab for me. Desktop Chrome 115.0.5790.114 on mac.

@BenjaminAster
Author

It blocks the whole tab for me. Desktop Chrome 115.0.5790.114 on mac.

Ha, I had tested it in Chromium 113 on my Raspberry Pi (separate threads), and now in Chrome 115 on Windows and Android, where it indeed blocks the main thread... Interesting. So it either changed in a very recent Chrome version, or my Raspberry Pi somehow handles that differently. Anyways, yep, you're right, iframes do generally block the whole tab, so they're not an option!

@WebReflection

... so they're not an option!

and imho they shouldn't be in general, now that I think about it, because an iframe with a guaranteed thread (like a worker) would compete with workers at that point, making workers kinda redundant as inferior to iframes ability (no DOM parsing ability), beside the security concerns when foreign scripts might try to access their content.

@developit

Might be worth splitting the discussion here up into two topics:

  1. Ergonomic differences between iframe-as-thread and worker-with-dom
  2. Spec + technical feasibility (of exposing a DOM to Workers, and of allowing <iframe sandbox> to strictly imply OMT)

For #1:

It seems like any ergonomic warts in the process of constructing an iframe are solvable in userland (essentially, add an optimized mechanism for using <iframe sandbox> in JS-loading-JS scenarios rather than just HTML-loading-HTML).

The ergonomics of the DOM-in-Worker are clearer to me, the issues there seem to be more on the spec and implementation side.

For #2:
I think there are some fringe cases for DOM-in-Worker that make this particularly tricky. Some potential cases off the top of my head:

  • what happens to inline scripts when parsing?
  • what happens to iframe or other nested documents (<svg foreignObject>, <embed> et al) in parsed documents within the Worker (do they get forced into the same thread?)
  • how would things like the media attribute work given that a document and its nodes have no direct relationship with display?

I can think of possible answers to these things, but they would all seem to require substantial revisions to DOM specs. Seems like it would be easier to spec out a "lite" DOM interface that avoids all of these issues by omitting presentation-related APIs.

@BenjaminAster
Author

I think there are some fringe cases for DOM-in-Worker that make this particularly tricky.

All of the problems you mentioned have already been solved when browsers implemented DOMParser and DOMImplementation::createHTMLDocument(). If DOM APIs in workers were spec'd, we could simply use the behavior that currently exists with them, only now in workers.

what happens to inline scripts when parsing?

Nothing. JS doesn't get executed, as is currently the case on the main thread with DOMParser and createHTMLDocument()

what happens to iframe or other nested documents (<svg foreignObject>, <embed> et al) in parsed documents within the Worker (do they get forced into the same thread?)

External content (iframe, embed, img, ...) doesn't get loaded at all. <foreignObject> gets parsed normally & on the same thread.

how would things like the media attribute work given that a document and its nodes have no direct relationship with display?

It doesn't. The media attribute in e.g.

<link rel="stylesheet" href="dark.css" media="(prefers-color-scheme: dark)" />

would do absolutely nothing, and it doesn't matter since the CSS file doesn't get loaded anyway. Again, all of this is already the case today with "fake" documents created on the main thread via DOMParser or DOMImplementation::createHTMLDocument().

@WebReflection

WebReflection commented Jul 31, 2023

@developit I agree with @BenjaminAster there: nothing you mentioned is an issue with current living standard because DOMParser and parseFromString do nothing until created nodes from that document get adopted.

In Workers, there's no way to adopt these in any meaningful way ("live content") because nothing is ever live ... no src, no source, no CSS, nothing ... the parseFromString rightly does parsing only, the rest is performed only when stuff gets adopted on the main, live, thread (which can't be the case within workers as we can't postMessage DOM nodes, as per structured clone algorithm specs).

@rniwa
Collaborator

rniwa commented Aug 1, 2023

The way browser engines such as Blink, Gecko, & WebKit are written right now, the vast majority of DOM code assumes that it's running in the main thread. Making it possible to run that code in a worker is a massive undertaking. Is it theoretically possible? Yes, but it's by no means simple or easy. It could easily be a multi-year/multi-engineer effort.

@WebReflection

Is it theoretically possible? Yes, but it's by no means simple or easy.

I don't think anyone in here believes it's a flag switch, like Jake suggested, but it would be interesting to understand why the main is so special in "just parsing" regards (which of course needs many other classes exposed to work properly).

It could easily be a multi-year/multi-engineer effort.

LinkeDOM (or other projects that already run in workers) could be a great polyfill in the meantime, but if there's no vendor interest in moving forward with this proposal there won't be interest in making these projects closer to standards than they are now.

@jakearchibald
Collaborator

I don't think anyone in here believes it's a flag switch, like Jake suggested

They really do. See the thread that started this one w3c/ServiceWorker#846 - the feeling there is very much that service workers chose to block DOM APIs from that context. Even down to the latest comment w3c/ServiceWorker#846 (comment).

@WebReflection

They really do.

sad thread ... and I should've specified in here 😅

@rniwa on a second thought about this:

It could easily be a multi-year/multi-engineer effort.

I think that if we had a way to ensure a separate, non-blocking thread for an iframe we could cut some corners and have what we want, in terms of functionality, even if that's not exactly where we want it (workers) ... as apparently in some circumstances iframes already get that thread, would @surma's suggestion of a no-render attribute (or any other name) be a fast way forward, hopefully relatively easier than bringing the DOM to Workers?

@jakearchibald
Collaborator

Fwiw, in my linear() app, I wanted to be able to analyse SVG paths off the main thread. To do this, I needed to bring another implementation of SVG paths into a worker. I couldn't use the built-in APIs because there's no easy standard way to run them in a different thread. A different-thread iframe (rendered or not) would have solved this.

That might be a different use-case though.

In terms of DOM-in-workers, any thoughts on mine and @developit's suggestion to have a different, minimal API for this? As in, it doesn't create HTMLImageElements, where you have things like naturalWidth and decode(), but a simpler tree model that can be later upgraded to real elements, and that upgrade can only happen in a document.

@keithamus

keithamus commented Aug 1, 2023

It might be good to contextualise what people want. The ability to de-serialize an HTML string into some kind of object model - and back again - is a hugely different problem than reifying HTML into a DOM; as others have alluded to.

If the ask is "I don't want to bring my own HTML parser when the browser has a perfectly good one outside of Workers" then that closes the scope to a large degree compared to "I want to have the full suite of DOM APIs and shuttle tree fragments between threads".

What gives me pause about this discussion is this: while I don't think people are naive enough to believe the DOM is intentionally blocked from workers, I do think that even people in this thread are failing to correctly grasp (or articulate) exactly what they want and the ramifications of that. I think the reason DOMParser() exists, and not HTMLParser(), is that it answers a question and gives developers a fully reified DOM, which sits at the very end of a set of steps of taking HTML and turning it into UI. Everything in between is full of so much nuance that it's hard to find one place to settle on.

An HTML parser would relieve you of some code within workers, and maybe give you a nice performance boost, but I think if people asked for it, they'd end up disappointed with what they get for it (not the DOM). Having a tree of objects that don't ascribe any semantic meaning to each node gives you very little, and once all that data gets sent to the main thread it still needs to be reified into the DOM, and all the things that your application wants like event listeners. The OP gives some good use cases for having general purpose serialisation but those cases aren't UI, they're data transformation. The rest of the thread talks of UI.

On the other hand having an object model that represents HTML requires full reification, which includes all the aforementioned steps and all the decisions about that must come from somewhere - so you're either introducing a fake environment which means whatever DOM you pass back to the main thread needs to effectively go through the same reification all over again (which means reification gets done twice and possibly diverges in each, making a worker DOM not WYSIWYG) or you need to introduce shenanigans tying a worker to a main thread's DOM so you can marshal data back and forth in order to make decisions, at which point you're back to blocking and may as well have done it in the main thread.

In addition, to talk of some of the use cases of the OP; I don't think the use cases are quite as compelling on the second glance. Let's take for example markdown to HTML. The final artefact is indeed DOM but it's much simpler to write a markdown to HTML converter (that is, converting one string to another string), then hand that to a browser to convert into DOM, than it is to write a markdown to DOM converter. While it would be useful to have an HTML parser to sanitize input, that is the last step in a chain of operations that has to happen before DOM, and pretty much where the contract ends. Up until sanitization the fastest and easiest way to generate HTML from markdown is string to string. DOM APIs would give us nothing in converting markdown.
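To illustrate the string-to-string point, here is a deliberately tiny sketch covering only headings and paragraphs. The function names are hypothetical and this is nowhere near a complete Markdown implementation; it only shows that no DOM API is needed at this stage, and that it therefore already runs in a worker today.

```javascript
// Escape characters that are significant in HTML text content.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Tiny, incomplete Markdown subset: ATX headings and plain paragraphs only.
function markdownToHtml(markdown) {
  return markdown
    .split(/\n{2,}/) // blocks are separated by blank lines
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      const heading = /^(#{1,6})\s+(.*)$/.exec(block);
      if (heading) {
        const level = heading[1].length;
        return `<h${level}>${escapeHtml(heading[2])}</h${level}>`;
      }
      return `<p>${escapeHtml(block)}</p>`;
    })
    .join('\n');
}
```

The resulting string can be posted to the main thread as-is, and parsing or sanitizing it into DOM remains the final step there.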

@WebReflection

WebReflection commented Aug 1, 2023

The rest of the thread talks of UI.

I never mentioned UI as a desired feature, and others mentioned no-render too, as UI is not interesting or requested (also nonsense from a Worker?) ... the OP, with which I agree, is about having the parser exposed ... true, this requires a broader discussion around what we then want to happen with the resulting document when listeners are added or other special things occur (see Jake's mention of naturalWidth), but it looks like we all agree (Surma's desire to post fragments aside) that a parser that produces a lightweight tree but still validates inputs would already be a huge step forward for this feature request.

@jakearchibald
Collaborator

and all the things that your application wants like event listeners

Yeah, this is where things get messy. Let's say you did this in a worker:

const div = workerDOM.createElement('div');
div.addEventListener('click', () => console.log('click'));

self.postMessage(div);

…would that event listener 'work'? Would preventDefault in that listener work?

You end up with the same question for every bit of state an element can have that sits outside of the serialisable tree. Pixels on a canvas, styles in a sheet etc etc.

@WebReflection

WebReflection commented Aug 1, 2023

that fails at the structured clone level:

  • no callbacks
  • no DOM nodes

IMHO, if DOMParser could validate and produce dummy nodes that implement just the Node and ParentNode interfaces, plus a Document with just querySelector or other features such as XPath for parsing, validation, and lightweight search and crawling, it'd be pretty awesome. The EventListener or EventTarget interfaces feel unnecessary to me, or they could be added via utilities when and/or if needed. Maybe I am reducing the scope of the proposal too much, yet I don't see postMessaging DOM nodes making much sense: it'd be great for DX, but there is too much magic involved and tons of surprises.

Posting outerHTML would already go a long way to me, when or if that's needed.
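
As a rough illustration of the structured clone point: `structuredClone()` runs the same algorithm that `postMessage()` uses, so it can stand in for posting here. The helper name `canBeCloned` and the sample values are made up for this sketch:

```javascript
// Returns true if a value would survive postMessage(), false otherwise.
function canBeCloned(value) {
  try {
    structuredClone(value);
    return true;
  } catch (error) {
    // Functions (and, in a document, DOM nodes) raise a DataCloneError.
    return false;
  }
}

// An object carrying a callback, standing in for a node with a listener.
const withListener = { tag: 'div', onClick: () => console.log('click') };

// A serialized tree, e.g. what an element's outerHTML would give you.
const serialized = '<div class="note">hello</div>';

const listenerClones = canBeCloned(withListener); // false
const stringClones = canBeCloned(serialized); // true
```

A plain string crosses the boundary without any special handling, which is why posting outerHTML is the low-magic option.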

@jakearchibald
Collaborator

that fails at the structured clone level:

My assumption from the OP was that folks wanted some way to send this stuff from the worker to the document.

@BenjaminAster
Author

BenjaminAster commented Aug 1, 2023

Basically, all I really need is three things:

  • convert an HTML/XML string to some magical tree of node objects (aka. Document) or create a new one
  • mess around with that tree by adding, modifying and removing nodes
  • convert the tree back to a valid HTML/XML string

I don't really need anything like event listeners or even being able to postMessage the tree to the main thread. I'm ok with the idea of a "lite" "DOM alternative"; I think that would reduce down to something like the following API shape (names TBD of course):

  • LiteNode:
    • .childNodes
    • .firstChild (?)
    • .lastChild (?)
    • .nextSibling
    • .nodeName
    • .parentElement
    • .parentNode
    • .previousSibling
    • .textContent
    • .cloneNode()
    • .compareDocumentPosition() (?)
    • .contains()
    • .getRootNode() (?)
    • .isEqualNode() (?)
    • .normalize()
  • LiteElement (extends LiteNode):
    • .classList (?) (for convenience)
    • .dataset (?) (for convenience)
    • .innerHTML (?)
    • .innerText (?) (always treats element like white-space: normal)
    • .localName
    • .namespaceURI
    • .nextElementSibling
    • .outerHTML (?)
    • .prefix
    • .previousElementSibling
    • .tagName
    • .after() (?)
    • .append() (?)
    • .before() (?)
    • .getAttribute()
    • .getAttributeNS()
    • .getAttributeNames()
    • .hasAttribute()
    • .hasAttributeNS()
    • .insertAdjacentElement() (?)
    • .insertAdjacentHTML() (?)
    • .insertAdjacentText() (?)
    • .prepend() (?)
    • .remove()
    • .removeAttribute()
    • .removeAttributeNS()
    • .replaceChildren()
    • .replaceWith()
    • .toggleAttribute()
  • LiteElement & LiteDocument (both extend LiteNode):
    • .childElementCount
    • .children
    • .firstElementChild (?)
    • .lastElementChild (?)
    • .getElementsByClassName()
    • .getElementsByTagName()
    • .getElementsByTagNameNS()
  • LiteDocument (extends LiteNode):
    • .body (?)
    • .documentElement
    • .head (?)
    • .getElementById()
  • LiteDocumentFragment (extends LiteNode)
  • equivalent of DOMParser
  • equivalent of XMLSerializer
  • equivalent of window.document.implementation.createDocument()
  • equivalent of window.document.implementation.createDocumentType()
  • equivalent of window.document.implementation.createHTMLDocument()
  • equivalent of window.document.createAttribute()
  • equivalent of window.document.createAttributeNS()
  • equivalent of window.document.createCDATASection()
  • equivalent of window.document.createComment()
  • equivalent of window.document.createElement()
  • equivalent of window.document.createElementNS() (?)
  • equivalent of window.document.createProcessingInstruction()
  • equivalent of window.document.createTextNode()

Some things to consider:

  • Shadow roots? (probably out of scope)
  • Sanitizer API?
  • CSS selector parser?
    • .closest() (?)
    • .matches() (?)
    • .querySelector(All)() (?) (would be very helpful!)
  • .before()/.after()/.prepend()/.append() vs .insertAdjacentElement()/.insertAdjacentText()? (only one of them is needed)
  • Non-element nodes (text nodes, comments, CDATA sections and XML processing instructions): Should they just use the LiteNode interface directly or get their own respective interfaces like in "main" DOM?
  • XPath?
  • XSLT?
  • MutationObserver?
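
For a feel of how small the "mess around with the tree, then serialize it" part could be, here is a hypothetical sketch in plain JavaScript. None of these classes exist in any browser. `LiteText`, the `attributes` map and `setAttribute()` are assumptions added for illustration (the list above only names the getters), parsing is left out entirely, and void elements such as `<br>` are not handled:

```javascript
// Hypothetical sketch only: no browser ships LiteNode, LiteElement or LiteText.
const escapeText = (s) =>
  s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
const escapeAttr = (s) => escapeText(s).replace(/"/g, '&quot;');

class LiteNode {
  constructor() {
    this.parentNode = null;
    this.childNodes = [];
  }
  // Detach this node from its parent, if it has one.
  remove() {
    if (!this.parentNode) return;
    const siblings = this.parentNode.childNodes;
    siblings.splice(siblings.indexOf(this), 1);
    this.parentNode = null;
  }
}

class LiteText extends LiteNode {
  constructor(data) {
    super();
    this.data = data;
  }
  get textContent() { return this.data; }
  get outerHTML() { return escapeText(this.data); }
}

class LiteElement extends LiteNode {
  constructor(localName) {
    super();
    this.localName = localName;
    this.attributes = new Map(); // assumption: simple name -> value map
  }
  getAttribute(name) {
    return this.attributes.has(name) ? this.attributes.get(name) : null;
  }
  setAttribute(name, value) { // assumption: not in the list above
    this.attributes.set(name, String(value));
  }
  // Like ParentNode.append(): strings become text nodes.
  append(...nodes) {
    for (const node of nodes) {
      const child = typeof node === 'string' ? new LiteText(node) : node;
      child.remove(); // a node can only have one parent
      child.parentNode = this;
      this.childNodes.push(child);
    }
  }
  get textContent() {
    return this.childNodes.map((child) => child.textContent).join('');
  }
  get outerHTML() {
    const attrs = [...this.attributes]
      .map(([name, value]) => ` ${name}="${escapeAttr(value)}"`)
      .join('');
    const inner = this.childNodes.map((child) => child.outerHTML).join('');
    return `<${this.localName}${attrs}>${inner}</${this.localName}>`;
  }
}

// Build, modify and serialize a tiny tree without any document.
const p = new LiteElement('p');
p.setAttribute('class', 'intro');
const em = new LiteElement('em');
em.append('fast');
p.append('Parsing is ', em);
const before = p.outerHTML; // <p class="intro">Parsing is <em>fast</em></p>
em.remove();
const after = p.outerHTML; // <p class="intro">Parsing is </p>
```

The hard part that a built-in would actually provide is the spec-compliant HTML/XML parser and serializer, not the tree bookkeeping shown here.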

@developit

@WebReflection it seems like you're more arguing for DOMParser as a pure standalone implementation using the DOM's structure with the parsing and attribute semantics of HTML, but not including any of the base element prototypes. That seems a lot more feasible, and also seems reasonably in line with where folks have found value in things like LinkeDOM/WorkerDOM/etc.

Devs would be able to build sync mechanisms atop this just as they can with userland DOM implementations, they just wouldn't have to implement the DOM tree, parsing and events from scratch. I do think it might be the case that many of the most compelling use-cases for a dynamic DOM in Worker require property-level MutationObserver (I know my current project does).

@WebReflection

@developit yup, we're aligned, and so seems to be @BenjaminAster 👍

@bahrus

bahrus commented Aug 26, 2023

I raised a related issue here, but that web site seems to be down (not sure if that's permanent), so I would like to make the suggestion here, if I may:

Cloudflare, which models itself after service workers, but on the server side, introduced something quite innovative: The HTML rewriter.

I think this would be a great first step in achieving more ambitious goals mentioned above. It would provide the ability to inject dynamic data (say from IndexedDB) into the HTML stream, as the content streams into the browser. From my experiments, having this api would allow developers to build a DOM Parser. Perhaps such a DOM parser, built in userland, could then become a candidate for inclusion, once it proves useful and mature.

Also, creating link preview functionality would be doable with this, and would avoid an extra hop through a Cloudflare Worker.

There have been implementations with web assembly, which would seem to suggest that we would have a running start getting this implemented in the browser, and the ability to polyfill would be quite feasible.
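
For readers who haven't used it, the Cloudflare Workers pattern looks roughly like the sketch below. `HTMLRewriter`, `.on()`, `.transform()` and `element.setInnerContent()` are Cloudflare's API; the function name `injectGreeting`, the `#greeting` selector and the injectable `Rewriter` parameter are made up for illustration. Browsers do not expose `HTMLRewriter` today, so in that case the sketch just returns the response untouched:

```javascript
// Sketch: stream a response through HTMLRewriter and fill in one element.
// `Rewriter` defaults to the global HTMLRewriter where the runtime has one
// (e.g. Cloudflare Workers); the parameter exists only to keep this testable.
function injectGreeting(response, greeting, Rewriter = globalThis.HTMLRewriter) {
  if (typeof Rewriter !== 'function') {
    // No rewriter available (as in browsers today): pass the response through.
    return response;
  }
  return new Rewriter()
    .on('#greeting', {
      // Called for each element matching the selector as the HTML streams by.
      element(element) {
        // Content is inserted as text (escaped) unless { html: true } is passed.
        element.setInnerContent(greeting);
      },
    })
    .transform(response);
}
```

In a service worker, the `greeting` value could come from IndexedDB and the `response` from `fetch()`, which is the "inject dynamic data into the HTML stream" case described above.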

@annevk
Member

annevk commented Aug 28, 2023

This is starting to sound like a duplicate of issue #270.

@bahrus

bahrus commented Aug 28, 2023

There are some similarities between the HTML Rewriter and the DOMTreeConstruction class. But I think the HTML Rewriter API provides an extra ability to filter nodes based on a subset of css matching, which seems quite useful. Not sure if I should open a separate issue to propose the HTML Rewriter API? (I don't want to be accused of spamming by opening duplicates).

@annevk
Member

annevk commented Aug 28, 2023

A lot more is needed with regard to steps 1-7 of https://whatwg.org/faq#adding-new-features. If you think an issue helps with that I won't oppose it, but it strikes me as a rather specific suggestion which seems too early in the general conversation of "what problem are we trying to solve?".

@bakkot

bakkot commented Aug 31, 2023

What are the advantages of this proposal, vs being able to create an iframe that runs in a different thread?

Ability to create an iframe is gated on CSP; frame-src 'none' will prevent that from working. worker-src similarly gates workers, of course, but it's a lot more justifiable (and also objectively less dangerous) to relax your worker-src to do off-thread computation than to relax your frame-src.

@patricknelson

patricknelson commented Feb 28, 2024

Enjoyed reading this thread. This is a great question:

what problem are we trying to solve

For me, it simply boils down to having a standard way to parse HTML without "the DOM ™️". I think it gets tricky when communicating this because, traditionally, what we're parsing from (static HTML) is of course tightly coupled with the representation that we're parsing into (the DOM) and all the baggage that comes with it.

That's why I like @BenjaminAster's hint at representing the DOM without it actually being @surma's "DOM ™️" 😄. That's where @bahrus's suggestion of looking at Cloudflare's contribution with HTMLRewriter comes into play. From what I can tell, it also represents its own DOM, so to speak, without of course being a full-on representation (with event listeners etc., as noted above).

I think of it as an intermediate still-mostly-serialized state. It's not HTML anymore but it's also not "the DOM ™️" per se, either.
