Check every is_robots_noindex usage to ensure no bugs #21341

Open
leonidasmi opened this issue Apr 26, 2024 · 0 comments

Comments

@leonidasmi
Contributor

To understand the investigation that needs to happen, this is the current situation:

  • In the builder of the homepage indexable, we check the value of the WP-native blog_public setting to set the is_robots_noindex value of the homepage indexable.
  • But that blog_public setting is an int, so whether or not "Discourage search engines from indexing this site" is checked, the is_robots_noindex of the homepage indexable will always be FALSE!
  • That is_robots_noindex value is the source of truth for showing index or noindex in the robots meta tag.
  • Luckily, we don't show index for sites that have the "Discourage search engines from indexing this site" setting checked, but only because at the last moment we also filter the is_robots_noindex value in the indexable presentation. That filter sets the robots to noindex if the blog_public setting is falsy. It works there because we cast the setting to a string before checking.
    • The really weird thing is that the first check, in the builder of the homepage indexable, used to work seamlessly (!) until we explicitly changed it way back.
  • Now, even though the above case still works as expected, we use is_robots_noindex elsewhere in our codebase. Considering that
    • the source of truth is invalid when "Discourage search engines from indexing this site" is checked
    • it's a pretty important part of our product

a deep dive is recommended, to ensure we're covered in all cases where is_robots_noindex is used.
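To make the mismatch concrete, here is a minimal sketch in TypeScript, whose `===` is strict in the same way as PHP's. It is a model of the situation described above, not the plugin's code: the function names are hypothetical, and the builder-side check is assumed to be a strict comparison against the string '0' with no cast, which would explain why an int value never matches.

```typescript
// Hypothetical model of the two checks; names are illustrative only.
type OptionValue = string | number;

// Assumed builder-side check: strict comparison against the string "0"
// with no cast. An int 0 never equals "0" strictly, so this stays false.
function builderIsNoindex(blogPublic: OptionValue): boolean {
  return blogPublic === "0";
}

// Presentation-side check as described in the issue: cast to string first,
// so both the int 0 and the string "0" are treated as "discouraged".
function presentationIsNoindex(blogPublic: OptionValue): boolean {
  return String(blogPublic) === "0";
}

// "Discourage search engines" checked, value held as an int.
const discouraged: OptionValue = 0;

console.log(builderIsNoindex(discouraged));      // false
console.log(presentationIsNoindex(discouraged)); // true
```

Under these assumptions the stored value and the rendered robots tag disagree whenever the setting is checked, which is why any other consumer of the stored value deserves a look.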

If nothing else, we probably need to change the homepage indexable handling so that it starts storing the right is_robots_noindex value (and also consider an upgrade routine to remedy past mishaps).
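One possible shape for that change, sketched in TypeScript under stated assumptions: the `HomepageIndexable` interface, `noindexFromBlogPublic`, and `repairHomepageIndexable` are hypothetical names invented for illustration, and the real builder and upgrade routine may look quite different. The idea is to normalise the option before comparing, then recompute the stored value once for existing sites.

```typescript
// Hypothetical shape of the stored homepage indexable; illustrative only.
interface HomepageIndexable {
  objectType: "home-page";
  isRobotsNoindex: boolean;
}

// Normalise the option to a string before comparing, so that an int 0
// and a string "0" both count as "discourage search engines".
function noindexFromBlogPublic(blogPublic: string | number): boolean {
  return String(blogPublic) === "0";
}

// One-off repair step: recompute the value and overwrite it if it is stale.
// Returns true when the stored value had to be corrected.
function repairHomepageIndexable(
  indexable: HomepageIndexable,
  blogPublic: string | number,
): boolean {
  const expected = noindexFromBlogPublic(blogPublic);
  if (indexable.isRobotsNoindex === expected) {
    return false; // already correct, nothing to do
  }
  indexable.isRobotsNoindex = expected;
  return true;
}
```

Returning whether a correction happened keeps the step idempotent and makes it easy to log how many sites were affected.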
