Hickory does not correctly parse <noscript> tags in <head> #52

Open
eigenhombre opened this issue Sep 22, 2017 · 1 comment
Labels
category: parsing (Correctness and Edge Cases. Hail the DOM)
type: bug (Something isn't working as intended)

Comments

@eigenhombre

Hickory doesn't parse noscript tags correctly when they are in the head, but it does when they are in the body. In the head case, the text is moved out of the noscript element and into the body. noscript should be supported in both.

(require '[hickory.core :as hick]
         '[hickory.render :as hr])

(-> "
<html>
 <body>
  <noscript>Ceçi n'est pas de JavaScript</noscript>
 </body>
</html>"
    hick/parse
    hick/as-hickory
    hr/hickory-to-html)

;;=>
'"<html><head></head><body>\n  <noscript>Ceçi n'est pas de JavaScript</noscript>\n \n</body></html>"


(-> "
<html>
 <head>
  <noscript>Ceçi n'est pas de JavaScript</noscript>
 </head>
</html>"
    hick/parse
    hick/as-hickory
    hr/hickory-to-html)

;;=>
'"<html><head>\n  <noscript></noscript></head><body>Ceçi n'est pas de JavaScript\n \n</body></html>"
@davidsantiago
Collaborator

Hm, OK. We actually outsource our HTML parsing completely to JSoup, so I'm afraid I don't have much control over it (it's been a super solid library so far). Can you check whether the latest version of JSoup still has the issue by depending on it manually? I notice that JSoup appears to have some open issues around the noscript tag: https://github.com/jhy/jsoup/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aopen%20noscript
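
One way to run that check is to call JSoup directly and leave Hickory out of the picture. The following is only a minimal sketch: it assumes the JSoup version under test is already on the classpath (for example, declared explicitly in the project's dependencies), and the helper name jsoup-head-and-body is made up for illustration.

```clojure
;; Sketch only: assumes org.jsoup/jsoup is on the classpath at the version
;; you want to test. jsoup-head-and-body is a hypothetical helper name.
(import '[org.jsoup Jsoup])

(defn jsoup-head-and-body
  "Parse html with JSoup alone and return the inner HTML of <head> and <body>."
  [^String html]
  (let [doc (Jsoup/parse html)]
    {:head (.html (.head doc))
     :body (.html (.body doc))}))

;; Same markup as the failing case in the report above.
(jsoup-head-and-body
 "<html><head><noscript>Ceçi n'est pas de JavaScript</noscript></head></html>")
```

If the text shows up under :body rather than inside the noscript element under :head, the behavior comes from JSoup itself and not from Hickory's conversion or rendering.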

port19x added the type: bug, priority 2: medium, and category: parsing labels and removed the priority 2: medium label on Apr 11, 2023