Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize HTML strings has flaws but implies that this is an easy problem to solve #281

Open
simonw opened this issue Nov 6, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@simonw
Copy link
Contributor

simonw commented Nov 6, 2023

As discussed on Hacker News: https://news.ycombinator.com/item?id=38162435

This article: https://phuoc.ng/collection/html-dom/sanitize-html-strings/ has some problems.

First, it uses value.startsWith('javascript:') to remove javascript: attributes, which misses the case <a href=" javascript:alert('hi')"> - the leading space.

But even if you apply .trim() there are still all sorts of ways this could go wrong - JavaScript: and many others too, probably. A better approach would be to allow-list http:// and https:// URLs.

But... the core problem with that article is that it implies to people that they should feel confident writing their own HTML sanitization mechanisms.

I'm a big supporter of VanillaJS for almost everything else... but HTML sanitization is a wildly difficult problem. Saying things like "It's that easy!" gives a very false sense of how hard it is.

Personally I think that particular article should be removed entirely. If not removed, then it should start (not end) with a very clear warning not to attempt to roll your own solution for this, and promote DOMPurify instead.

The current conclusion reads:

Ensuring the security of your web application is crucial, and sanitizing HTML strings is an essential step in achieving this. The DOMParser API is one tool you can use to get the job done right. By sanitizing your HTML strings, you can prevent malicious code from being executed and protect your users from harm.

It's worth noting that while removing script tags and inline event handlers can help prevent malicious code execution, it's not a bulletproof solution. There are still ways attackers can inject harmful code into your app, such as through CSS styles or exploiting vulnerabilities in your server-side code. That's why it's important to use multiple layers of defense when securing your web application.

This conclusion implies that the DOMParser solution provided in this article "can prevent malicious code from being executed" - that's a strong promise, and one that I don't think you can deliver on in an article like this one.

The second paragraph warns that "it's not a bulletproof solution" and then mixes the issues of CSS styles and vulnerabilities in server-side code - very different problems. This is not a strong enough warning!

(I love everything else about this project aside from this one article.)

@phuocng
Copy link
Owner

phuocng commented Dec 9, 2023

Thanks for the feedback, @simonw 👍
The article's content is available in this repository, so I would greatly appreciate it if you could submit a pull request to update the wording. Thank you!

@phuocng phuocng added the enhancement New feature or request label Dec 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants