feature request(plugin): requests from another location and using other tools #790

Open
Kristinita opened this issue Jan 21, 2024 · 0 comments

1. Summary

If a linkchecker user gets an error when requesting an external site, it would be nice to be able to make a new request from another geolocation and using another tool.

2. Example of desired behavior

Example using https://httpstatus.io/:

  1. The linkchecker user adds the following to linkcheckerrc:

    # [INFO] Adding the HttpStatus plugin
    [HttpStatus]
  2. The linkchecker user makes a request:

    1. If the request is for an internal link, nothing additional needs to be done.

    2. Else, if the request for an external link returns status code 200 or a warning, nothing additional needs to be done either.

    3. Otherwise, the request for an external link returned an error, and linkchecker makes a request to the same URL via the HttpStatus API:

      1. If the HttpStatus request also returns an error, linkchecker reports the error.
      2. Otherwise, the HttpStatus request returned status code 200, and linkchecker also reports status code 200.
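
The decision flow above could be sketched roughly as follows. This is only an illustration of the logic, not linkchecker's plugin API and not the HttpStatus API: `check_with_fallback`, `CheckResult`, and the `Checker` callables are hypothetical names, and the two checkers are passed in as plain functions so that no assumption is made about how either service is actually called.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical checker signature: takes a URL, returns (ok, detail).
# "ok" is True for status code 200 or a warning, False for an error.
Checker = Callable[[str], Tuple[bool, str]]


@dataclass
class CheckResult:
    """Final verdict for one URL (hypothetical type, for illustration only)."""
    url: str
    ok: bool
    source: str  # "primary" (linkchecker itself) or "fallback" (second service)
    detail: str


def check_with_fallback(url: str, is_internal: bool,
                        primary: Checker, fallback: Checker) -> CheckResult:
    """Check a URL, retrying through a second service only when needed."""
    ok, detail = primary(url)
    # Internal links, and external links that return 200 or a warning,
    # need nothing additional.
    if is_internal or ok:
        return CheckResult(url, ok, "primary", detail)
    # The external link returned an error: ask the second service
    # about the same URL and report whatever it says.
    fb_ok, fb_detail = fallback(url)
    return CheckResult(url, fb_ok, "fallback", fb_detail)


if __name__ == "__main__":
    # Stub checkers standing in for the real requests (assumptions).
    def local_request(url: str) -> Tuple[bool, str]:
        return (False, "ConnectionError")

    def third_party_request(url: str) -> Tuple[bool, str]:
        return (True, "200")

    print(check_with_fallback("https://www.olx.ua/uk/", False,
                              local_request, third_party_request))
```

Passing the checkers in as functions keeps the sketch independent of any particular service, which matches the point made later in this issue that HttpStatus is only one possible fallback.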

3. Reasons why the feature is needed

  1. linkchecker requests have limitations. For example, linkchecker cannot get past Cloudflare-protected URLs. linkchecker depends on the requests library, which doesn’t support HTTP/2 (perhaps it’s time for linkchecker to migrate from requests to HTTPX). Requests made through third-party services may not share these limitations.

  2. Why it would be nice to have requests from other locations:

    1. Local requests:

      1. For example, my country has strong internet censorship. Users are forced to use a VPN, a proxy, or other means of bypassing site blocking. In turn, some external sites block requests made through VPNs or proxies.
      2. External sites may block requests originating from a specific geolocation. For example, most Ukrainian websites probably block requests from my country.
    2. CI requests: users can run linkchecker against external links from continuous integration services (for example, I make requests from Travis and AppVeyor). External sites may block requests coming from Travis or AppVeyor IP addresses but not requests coming from HttpStatus IP addresses.

4. Examples of successful use of HttpStatus

4.1. KiraLinksList.txt

# LinkChecker URL list
https://www.alib.ru/
https://gramota.ru/biblioteka/spravochniki/slovar-trudnostey
https://www.imdb.com/name/nm6626184/
https://musicseasons.org/author/vladimir-kotihov/
https://www.nasonline.org/publications/biographical-memoirs/memoir-pdfs/nagel-ernest.pdf
https://www.olx.ua/uk/
https://sustainablefinancelab.nl/en/persoon/arjo-klamer/
https://wombat.org.ua/AByteOfPython/AByteofPythonRussian-2.02.pdf
https://www.wzb.eu/en/persons/hans-dieter-klingemann

4.2. Behavior

4.2.1. Desired — HttpStatus

Google spreadsheet link:

HttpStatus requests for KiraLinksList.txt — status 200

4.2.2. Non-desired — linkchecker

Requests were made from a Russian IP address, using the antizapret-tcp.ovpn configuration file for OpenVPN.

linkchecker --check-extern --output html --no-status KiraLinksList.txt

Linkchecker requests for KiraLinksList.txt — errors 1

Linkchecker requests for KiraLinksList.txt — errors 2

5. Other solutions

I’m not asking for support specifically for the HttpStatus service. If you know of other methods that would let linkchecker requests from any geolocation return status code 200 for the links in KiraLinksList.txt, that would also be nice.

Thanks.
