1. Summary
It would be nice if linkchecker users could specify allowed redirects, for which linkchecker would not return a warning.
2. Example of desired behavior
For example, Reddit appends a query string matching \?rdt=\d{5} to each link whenever I make a linkchecker request to the Reddit site, so I get a warning like this:
linkchecker --recursion-level 0 https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/
URL `https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/'
Real URL https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/?rdt=37228
Check time 1.003 seconds
Warning [http-redirected] Redirected to
`https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/?rdt=37228'
status: 302 Found.
Result Valid: 200 OK
The Reddit URL can be represented as a regular expression:
^(https:\/\/www\.reddit\.com.+)$
The Reddit real URL can be represented as a regular expression:
^$1\?rdt=\d{5}$
where $1 is the text captured by the group in the Reddit URL expression.
We could add an option such as allowed-redirects to linkcheckerrc.
If the redirected real URL matches the expression ^$1\?rdt=\d{5}$, linkchecker will not return a warning, as in this case:
URL `https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/'
Real URL https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/?rdt=12345
But if the redirected REAL URL doesn’t match the regular expression ^$1\?rdt=\d{5}$, as in the next case, Linkchecker will return a warning:
URL `https://www.reddit.com/r/webdev/comments/48z7jz/do_you_take_into_account_those_who_disable/d0nxftd/'
Real URL https://www.spam.site/redirect-to-spam-site
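The matching rule described above could be sketched in Python roughly as follows. This is only an illustration of the idea, not linkchecker code: the rule format (a pair of patterns), the name ALLOWED_REDIRECTS, and the function is_allowed_redirect are all hypothetical, and the sketch assumes that $1 stands for the literal text captured by the first group of the URL pattern.

```python
import re

# Hypothetical rule format: (pattern for the original URL, pattern for the real URL).
# "$1" in the second pattern stands for the text captured by group 1 of the first.
ALLOWED_REDIRECTS = [
    (r"^(https://www\.reddit\.com.+)$", r"^$1\?rdt=\d{5}$"),
]


def is_allowed_redirect(url, real_url, rules=ALLOWED_REDIRECTS):
    """Return True if the redirect url -> real_url matches any allowed rule."""
    for source_pattern, target_pattern in rules:
        match = re.match(source_pattern, url)
        if match is None:
            continue
        # Insert each captured group into the target pattern as escaped literal text.
        expanded = target_pattern
        for index, group in enumerate(match.groups(), start=1):
            expanded = expanded.replace(f"${index}", re.escape(group or ""))
        if re.match(expanded, real_url):
            return True
    return False


reddit_url = (
    "https://www.reddit.com/r/webdev/comments/48z7jz/"
    "do_you_take_into_account_those_who_disable/d0nxftd/"
)
# Allowed: the real URL is the original URL plus ?rdt= and five digits.
print(is_allowed_redirect(reddit_url, reddit_url + "?rdt=37228"))
# Not allowed: the redirect leads somewhere else, so a warning should remain.
print(is_allowed_redirect(reddit_url, "https://www.spam.site/redirect-to-spam-site"))
```

With a check like this, a redirect would be silenced only when both the original URL and the real URL match the same rule; any other redirect would still produce the http-redirected warning.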
3. More examples of desired behavior
3.2. Stack Overflow
No warning:
URL `https://ru.stackoverflow.com/a/544861/199934'
Real URL https://ru.stackoverflow.com/questions/537362/%d0%98%d0%ba%d0%be%d0%bd%d0%ba%d0%b8-%d1%84%d0%b0%d0%b9%d0%bb%d0%be%d0%b2-%d0%b2-%d1%81%d0%b0%d0%b9%d0%b4%d0%b1%d0%b0%d1%80%d0%b5-sublime-text-3/544861#544861
Warning:
URL `https://ru.stackoverflow.com/a/544861/199934'
Real URL https://ru.stackoverflow.com/this-answer-is-deleted
3.3. TOML
No warning:
URL `https://toml.io/en/latest'
Real URL https://toml.io/en/v1.0.0
Warning:
URL `https://toml.io/en/latest'
Real URL https://tomlnewsite.com
4. Reasons why the feature is needed
In some cases, linkchecker users may prefer not to use the redirected links. In my cases:
Reddit case: linkchecker users would have to spend time appending endings like ?rdt=37228 to Reddit links. I just want to add Reddit links to my site without worrying about appending something to them every time.
VK case: VK redirects the linkchecker request to the mobile version of the site, https://m.vk.com. I think it’s better to link to the desktop version of the site by default. VK automatically redirects mobile users from the desktop version to the mobile version, but does not automatically redirect desktop users from the mobile version to the desktop version.
Stack Overflow case: I prefer links like https://ru.stackoverflow.com/a/544861/199934 because in this format they remain unchanged if the title of a Stack Overflow question changes. Stack Overflow editors may change question titles quite often, and the redirected links may then become invalid.
TOML case: I wanted to link specifically to the latest version of the TOML specification, https://toml.io/en/latest. When the next version of TOML (2.0.0, 1.2.0 or 1.0.1) is released, a link to version 1.0.0 will no longer point to the latest version.
5. Less suitable alternatives
5.1. Just use “ignorewarnings=http-redirected”
I think this is a very bad idea. If an external site no longer works and its links lead to a spam site, or to content different from what was there when the linkchecker user inserted the link, http-redirected carries important information for linkchecker users. In my case, http-redirected helped me fix about a hundred outdated links on my site.
5.2. Just add your URL to “ignore=”
I don’t think ignoring is the best idea. If I add the URL to the ignore list, I will no longer receive any errors or warnings about that URL, so I won’t be aware of any problems with it.
Thanks.