@derek-zhou It's not that it is too fragile, but I think the HTML parsing state machine is too damn complicated to fix when the parser never followed the spec 😅
I plan to finish the built-in parser one day. In the meantime, I suggest you give the html5ever parser a try: https://github.com/philss/floki#using-html5ever-as-the-html-parser. It now comes with precompiled NIFs, so you don't need Rust to use it anymore.
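For reference, switching parsers is roughly a two-step change, following the README section linked above. This is a sketch, not a drop-in: the open-ended version requirements are placeholders I am assuming for illustration, so pin whatever releases suit your project.

```elixir
# mix.exs: add html5ever next to floki.
# The ">= 0.0.0" requirements are placeholders (an assumption); pin real versions.
defp deps do
  [
    {:floki, ">= 0.0.0"},
    {:html5ever, ">= 0.0.0"}
  ]
end

# config/config.exs: tell Floki to use html5ever instead of the built-in parser.
import Config
config :floki, :html_parser, Floki.HTMLParser.Html5ever
```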
I am not afraid of a little Rust toolchain. However, I need to do some ad-hoc XML parsing in the same application, and I am afraid the html5ever parser could be too strict about things.
Description
According to the HTML5 spec, the closing </p> tag is optional, i.e. markup that omits it is equivalent to markup that closes each paragraph explicitly.
However, Floki with the builtin parser does not handle this correctly.
To Reproduce
It looks like Floki fills in the missing </p> at the end of the document.
Expected behavior
A <p> tag shall not contain another <p>.
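One way to check this expectation is to parse a document with omitted </p> tags and look for any <p> nested inside another <p>. The script below is a minimal sketch: the input string is an illustrative assumption, not necessarily the markup originally reported, and it relies on `Mix.install/1` fetching Floki from Hex.

```elixir
# Standalone script; save under any name and run it with `elixir <file>.exs`.
Mix.install([:floki])

# Assumed input: two paragraphs whose </p> end tags are omitted.
html = "<p>first<p>second"

# Parse with Floki's default (built-in) parser.
{:ok, tree} = Floki.parse_document(html)

# The selector "p p" matches any <p> that is a descendant of another <p>.
nested = Floki.find(tree, "p p")

IO.inspect(tree, label: "tree")
IO.inspect(nested, label: "nested <p> elements")
```

An empty list of nested elements would match the expected behavior. A non-empty list would indicate the nesting described in this issue.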