Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Post title renders escaped/encoded characters parsed by wp_kses filter #20490

Open
andrewserong opened this issue Feb 27, 2020 · 13 comments
Open
Labels
[Package] Editor /packages/editor [Type] Bug An existing feature does not function as intended

Comments

@andrewserong
Copy link
Contributor

Describe the bug
In a WordPress multisite install, most users do not have the unfiltered_html capability, so saved post titles will be filtered by wp_kses as the title_save_pre filtered is applied along with other filters in kses_init_filters().

This means that saving an unencoded post title like "This & that" will become "This & that" in the database. In Gutenberg, immediately upon saving, the post title currently gets rendered in its unescaped form after being parsed by wp_kses on the backend.

To reproduce
Steps to reproduce the behavior:

  1. Install a WordPress multisite by following the instructions here: https://wordpress.org/support/article/create-a-network/ (I had some difficulty getting this working with wp-env so tested using VVV)
  2. Once you've gone to /wp-admin/network.php, set up the multisite and update your wp-config.php, and you have a multisite instance running, go to create a new post on one of your sites.
  3. Enter a title like "This & that", click publish and the title will now be rendered as "This & that"

Expected behavior
The post title to be decoded for display and editing.

Screenshots

post-title-this-and-that-small

Desktop (please complete the following information):

  • OS: macOS Mojave 10.14.6
  • Browser: Chrome
  • Version: 80.0.3987.122

Additional context

I'm happy to keep investigating this one and (try to) come up with a fix. Initially I was thinking of just wrapping the value={ title } in the PostTitle component with decodeEntities like it was previously, however this results in strange behaviour if you start writing something like & where the title gets decoded as you type. Ideally we want the decoding of the title to happen only when fetching the data from the server. I'm not that familiar with this area, so any pointers (or if anyone else want to have a go) will be much appreciated!

@aduth
Copy link
Member

aduth commented Feb 27, 2020

I'm happy to keep investigating this one and (try to) come up with a fix. Initially I was thinking of just wrapping the value={ title } in the PostTitle component with decodeEntities like it was previously, however this results in strange behaviour if you start writing something like & where the title gets decoded as you type. Ideally we want the decoding of the title to happen only when fetching the data from the server. I'm not that familiar with this area, so any pointers (or if anyone else want to have a go) will be much appreciated!

It's tricky, and not immediately clear to me what the best approach would be. As you note, there are challenges with simply decoding entities of the title. And there may be cases that a user with the correct permissions makes a conscious choice to include entities in the title. Should we support that?

On glance, possible directions may include one of:

  • Decoding the title, perhaps at predefined intervals (only on post load) or based on user permissions
  • Considering if there is any difference/benefit in using post.title.rendered from the API instead of post.title.raw

Other items to investigate:

  • How was this handled in the classic editor? Is there prior art we can lean on for reference?
  • What is the full extent of character escaping that we can anticipate?

For what it's worth #19955 was simply a revert of #18616. It should be expected that the behavior as it exists today is the same as has always been for all stable WordPress releases. Those changes to escaping affected versions of the plugin between Gutenberg 7.0 and 7.4.

@aduth aduth added [Package] Editor /packages/editor [Type] Bug An existing feature does not function as intended labels Feb 27, 2020
@andrewserong
Copy link
Contributor Author

It should be expected that the behavior as it exists today is the same as has always been for all stable WordPress releases. Those changes to escaping affected versions of the plugin between Gutenberg 7.0 and 7.4.

That's great to know, this makes coming up with a fix feel a bit less urgent :)

And there may be cases that a user with the correct permissions makes a conscious choice to include entities in the title. Should we support that?

That's a great question! A possibly awkward thing to do would be to check the capability of the user that last edited the post before decoding, but that feels like a level of complexity that could be particularly fraught.

As for the number of characters affected, I'll investigate further next week and compare against the classic editor, but so far from taking a look through wp-includes/kses.php it looks like the issue is almost exclusively with the ampersand character being encoded to & and decoding that would (potentially) resolve any other characters.

Next week I'll have a play with decoding the title on post load. Dealing with user permissions feels a tricky one because the permissions of a user editing a post might be different from the permissions of someone who created the post. I'll give this some more thought, too.

Thanks for the extra insight @aduth!

@aduth
Copy link
Member

aduth commented Feb 28, 2020

As for the number of characters affected, I'll investigate further next week and compare against the classic editor, but so far from taking a look through wp-includes/kses.php it looks like the issue is almost exclusively with the ampersand character being encoded to & and decoding that would (potentially) resolve any other characters.

Next week I'll have a play with decoding the title on post load.

Part of the reason I was wondering about which characters are affected is in how that might impact how we choose to implement any decoding. For example, if it only impacts the ampersand, then maybe we don't need to use the full decodeEntities implementation, but instead a simpler replace( /&/g, '&' );.

@aduth
Copy link
Member

aduth commented Mar 13, 2020

  • How was this handled in the classic editor? Is there prior art we can lean on for reference?

Short answer: It wasn't 😅

Upstream (from 2009): https://core.trac.wordpress.org/ticket/11311

@aduth
Copy link
Member

aduth commented Mar 13, 2020

I've begun to explore a solution at #20887

@eduardozulian
Copy link

Related: #14178 and #19533

@xavivars
Copy link

xavivars commented May 20, 2021

This is creating quite a lot of confusion for less-technical people. They don't know what escaped html entities are, and seeing that the title changes with weird text suddenly is quite confusing for them.

@chrisvanpatten
Copy link
Member

Noting that this is also an issue for all block attributes, which are passed through kses and have ampersands replaced with & inside wp_kses_normalize_entities. This means that any non-editor user who inputs an ampersand into a block attribute via, say, a TextControl inside the block's inspector, is going to have that ampersand converted (and visible) as &:

image

(Image is of a Yoast SEO block, for illustrative purposes. I edited an article as an "author" role, which does not have the unfiltered_html capability.)

@Mamaduka
Copy link
Member

@andrewserong, I can no longer reproduce the original issue. So let's close this issue.

@chrisvanpatten
Copy link
Member

@Mamaduka I can still reproduce the issue.

@Mamaduka
Copy link
Member

@chrisvanpatten, for the Post Title or in components like TextControl?

@sabernhardt
Copy link
Contributor

Trac has a similar report today about the Site Editor (which probably should have its own issue).

@Mamaduka
Copy link
Member

@sabernhardt, that issue is fixed in the Gutenberg trunk, and I just marked PR for backporting into a minor release. Thanks for bringing this up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Package] Editor /packages/editor [Type] Bug An existing feature does not function as intended
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants