Improve reporting of redirects and excluded links #1312

Open
nathany opened this issue Dec 3, 2023 · 7 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@nathany

nathany commented Dec 3, 2023

Hi Matthias. I'm running lychee on a book with 574 total links reported, which makes it a little difficult to know which links I need to correct. The book is written with AsciiDoctor, so I first generated an HTML file. I don't mind that lychee doesn't support *.adoc directly (#291).

Redirects

Running lychee nightly (or rather, compiled from source), the main issue I'm seeing is that there are a number of redirects, but it's difficult to tell which redirect belongs to which link.

~/src/github.com/lycheeverse/lychee/target/debug/lychee output/get-programming-with-go-manuscript.html -vv

[Screenshot: redirect-nightly]

The redirect to https://www.red-gate.com... is from the https://www.simple-talk.com link at the bottom of the screenshot shown, but they aren't tied together. I assume this is related to concurrency and the redirect being a debug log (#1117) rather than designed into the user experience.

Ideally I'd like a report of which links required a redirect, along with the final URL each one redirected to. That way I could easily search for the original links and update them. (The intermediate redirects staying in the debug log is fine, as they don't concern me unless I'm troubleshooting.)
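
To make the request concrete, here is a minimal sketch in Python (not lychee's code) of the kind of report meant here: given the links found in the document and the individual redirect hops observed while checking them, it maps each original link to the final URL it ended up at. The function name, the (source, destination) hop format, and the example.org URLs are all hypothetical.

```python
def final_destinations(original_links, hops):
    """Map each redirected original link to the last URL in its redirect chain.

    original_links: URLs as they appear in the checked document.
    hops: (source, destination) pairs, one per observed redirect, in any order.
    Links that were never redirected are left out of the report.
    """
    next_hop = dict(hops)
    report = {}
    for link in original_links:
        if link not in next_hop:
            continue  # no redirect observed for this link
        seen = {link}
        current = link
        while current in next_hop:
            current = next_hop[current]
            if current in seen:
                break  # redirect loop: stop instead of spinning forever
            seen.add(current)
        report[link] = current
    return report


# Hypothetical data: one link redirects twice, the other not at all.
hops = [
    ("https://mid.example.org/a", "https://new.example.org/a"),
    ("https://old.example.org/a", "https://mid.example.org/a"),
]
links = ["https://old.example.org/a", "https://ok.example.org/"]

for original, final in final_destinations(links, hops).items():
    print(f"{original} -> {final}")
```

Only the first and last URL of each chain appear in the report; the intermediate hop stays out of it, which matches leaving those to the debug log.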

Excluded

A related concern is the "6 Excluded" when looking at a list this large. If I scroll through the verbose version I can track them down, but there are hundreds of ✔ [200] lines that I want to ignore, even with a single -v. Even piping to grep doesn't help.

Most of the excluded links in my example are email addresses, which is fine (2 repeated a few times). But one excluded link is:

? [EXCLUDED] https://gobyexample.com/ | Excluded

It comes from the HTML <a href="https://gobyexample.com" class="bare">gobyexample.com</a>. I'm not entirely sure why it is excluded. So that's the sort of thing I'd like listed out.

Suggestion

I think that the good links should only be shown with -vv and the corrections should be clearly shown with a single -v or even without.

It may also be worth double-checking whether the default example.com exclusion is a little too aggressive.

@mre
Member

mre commented Dec 3, 2023

All good suggestions. Changing the behavior of -v and -vv is a breaking change, but it should be fine as we haven't released 1.0 yet.
The redirect steps should be more easily traceable. Let's see if we can add the redirect source and destination to the output.
For excluded links, I think you could try different output formats (see --format) to choose a preferred style. We could add more styles.

example.com is excluded by default as it's usually used as a placeholder. The gobyexample.com exclusion is a bug. For now, you can explicitly include it (--include), which should take precedence, or you can include all example URLs.
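
For illustration only, here is one plausible way a placeholder-domain check can produce this kind of false positive, sketched in Python. This is an assumption about the general class of bug, not a description of lychee's actual implementation or of the eventual fix; the function names and the domain list are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical list of placeholder domains to skip by default.
PLACEHOLDER_DOMAINS = ("example.com", "example.org", "example.net")


def naive_is_placeholder(url):
    """Too aggressive: a plain suffix test also matches gobyexample.com."""
    host = (urlsplit(url).hostname or "").lower()
    return host.endswith(PLACEHOLDER_DOMAINS)


def is_placeholder(url):
    """Stricter: the host must equal a placeholder domain or be a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PLACEHOLDER_DOMAINS)


for url in ("https://example.com/", "https://www.example.com/page", "https://gobyexample.com/"):
    print(url, naive_is_placeholder(url), is_placeholder(url))
```

The only difference between the two checks is the dot boundary: requiring "." before the placeholder domain keeps real sites whose names merely end in "example.com" out of the exclusion.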

@nathany
Author

nathany commented Dec 3, 2023

Thanks.

FYI, I've tried a few variations of --include ^https://gobyexample.com and the link is still excluded. I copied the link directly from the HTML to ensure there are no typos.

@mre
Member

mre commented Dec 4, 2023

FYI, I fixed that issue in #1316. With the next release, it should just work as expected.

@mre
Member

mre commented Jan 5, 2024

The release with the fix is out. So if you update to lychee 0.14.0, the false-positive example domains should be gone.

@mre
Member

mre commented Jan 5, 2024

That leaves us with the following action items:

  • Implement a feature to link original URLs with their final redirect destinations in the output.
  • Possible change: Only show 200 OK links in -vv output. (TBD)

@nathany
Author

nathany commented Jan 5, 2024

By "Only show 200 OK links in -vv output.", do you mean to not include 200 OK links in the -v output?

That would be great for tracking down the redirects without all the noise of good links.

@mre
Member

mre commented Jan 5, 2024

Exactly.
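
A rough sketch of the proposed (still TBD) verbosity rule, written in Python rather than lychee's own code. The status strings, the function name, and the numeric verbosity levels (1 for -v, 2 for -vv) are assumptions for illustration; what should be printed without any -v flag is not settled in this thread, so the sketch does not model it.

```python
OK_STATUS = "200"  # assumed label for a successful check


def should_print(status, verbosity):
    """Decide whether a per-link result line is shown at a given verbosity.

    verbosity: 1 for -v, 2 for -vv (assumed mapping).
    Successful links are noise when hunting for redirects and exclusions,
    so they only appear from -vv upward; everything else appears at -v.
    """
    if status == OK_STATUS:
        return verbosity >= 2
    return verbosity >= 1


# Hypothetical results as (status, url) pairs.
results = [
    ("200", "https://ok.example.org/"),
    ("EXCLUDED", "https://gobyexample.com/"),
    ("REDIRECT", "https://old.example.org/a"),
]

for level in (1, 2):
    shown = [url for status, url in results if should_print(status, level)]
    print(level, shown)
```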

mre added the enhancement, good first issue, and help wanted labels on Jan 29, 2024