Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty target column in database #266

Open
Zodiac1978 opened this issue Jul 16, 2023 · 9 comments
Open

Empty target column in database #266

Zodiac1978 opened this issue Jul 16, 2023 · 9 comments

Comments

@Zodiac1978
Copy link
Member

Describe the bug
A user in the support forums reported that his website has empty target columns in the database:
https://wordpress.org/support/topic/beste-inhalte-2/ (German)

What can cause this issue? Any idea?

Do I need to ask more about the environment, then please add a comment?

@Zodiac1978
Copy link
Member Author

The website has no redirect from http to https. Can this lead to this anomaly?

@stklcode
Copy link
Contributor

stklcode commented Jul 16, 2023

I don't see any reason why this should not The target might be empty, if there is no path in the URI, e.g. https://www.example.com was requested and not https://www.example.com/

While almost all browsers do, not all webservers enforce a / for empty paths. So the request might not be issued by humans, but we can't tell this for sure.

The target website does not seem to use JS tracking, so we use $_SERVER['REQUEST_URI']:

if ( isset( $_SERVER['REQUEST_URI'] ) ) {
$target = filter_var( wp_unslash( $_SERVER['REQUEST_URI'] ), FILTER_SANITIZE_URL );
}

And we just have a fallback for failed sanitization, not for empty content:

if ( is_null( $target ) || false === $target ) {
$target = '/';
}

(which I don't thinks is really elegant, because the JS-content might contain any kind of bogus data which fails sanitization, so I'd personally prefer null here and just don't count those requests for top lists - but that's not the point here)

But a few lines below we should opt-out on empty or invalid targets:

if ( empty( $target ) || ! wp_validate_redirect( $target, false ) ) {
return self::_jump_out( $is_snippet );
}

@Zodiac1978
Copy link
Member Author

Zodiac1978 commented Jul 16, 2023

As there is a "/" also in the list, I think it is a request from http which is empyting the referrer because of the default Referrer-Policy set to strict-origin-when-cross-origin which is not sending the referrer in this case:

Don't send the Referer header to less secure destinations (HTTPS→HTTP).

@stklcode
Copy link
Contributor

We are not talking about an empty referrer, but an empty target... 🤔

@Zodiac1978
Copy link
Member Author

Oh, I mixed this up ... sorry, you are right.

@Zodiac1978
Copy link
Member Author

The target might be empty, if there is no path in the URI, e.g. https://www.example.com/ was requested and not https://www.example.com/

But the mentioned website redirects from without to the one with trailing slash ...

@stklcode
Copy link
Contributor

Only path that comes to my mind would be a valid URI like https://example.com/ with Home URL / and $wp_rewrite->use_trailing_slashes === false, i.e. this call

$data['target'] = user_trailingslashit( str_replace( home_url( '/', 'relative' ), '/', $target ) );

with user_trailingslashit/)

	if ( $wp_rewrite->use_trailing_slashes ) {
		$url = trailingslashit( $url );
	} else {
		$url = untrailingslashit( $url );
	}

calling untrailingslashit().

No idea how to achieve this 🤷

@stklcode
Copy link
Contributor

stklcode commented Jul 16, 2023

But the mentioned website redirects from without to the one with trailing slash ...

No. Likely your browser does, but a call without slash with a slightly more primitive client gives a 200 response immediately:

$ curl -i https://**********.de
HTTP/2 200 
x-ua-compatible: IE=edge
link: <https://**********.de/wp-json/>; rel="https://api.w.org/", <https://**********.de/wp-json/wp/v2/pages/150>; rel="alternate"; type="application/json", <https://**********.de/>; rel=shortlink
cache-control: max-age=0
expires: Sun, 16 Jul 2023 17:04:45 GMT
vary: Accept-Encoding
content-type: text/html; charset=UTF-8
date: Sun, 16 Jul 2023 17:04:45 GMT
server: Apache

<!DOCTYPE html>
[...]

So here is no client-side redirection, but in general you are right, the path should always be root, i.e. /. Highly likely something else messed up the path in the process. Probably temporarily due to some reconfiguration.

I would recommend observing the behavior and if it does not grow, just ignore or remove the 31 entries.


And there seems to be a 301 redirection from HTTPS to HTTP now, maybe added after your advice:

$ curl -i http://**********.de 
HTTP/1.1 301 Moved Permanently
Date: Sun, 16 Jul 2023 17:08:04 GMT
Server: Apache
X-UA-Compatible: IE=edge
Expires: Sun, 16 Jul 2023 18:08:04 GMT
Cache-Control: max-age=3600
X-Redirect-By: WordPress
Upgrade: h2,h2c
Connection: Upgrade
Location: https://**********.de/
Content-Length: 0
Content-Type: text/html; charset=UTF-8

@Zodiac1978
Copy link
Member Author

Zodiac1978 commented Jul 16, 2023

And there seems to be a 301 redirection from HTTPS to HTTP now, maybe added after your advice:

This does WordPress if the Site/Home is set to https, for pages WP doesn't do this by default:
http://example.de/impressum/ still does not redirect to https.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants