New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a mode to handle "pretty URLs", i.e. URIs without .html
extension
#1422
base: master
Are you sure you want to change the base?
Add a mode to handle "pretty URLs", i.e. URIs without .html
extension
#1422
Conversation
c2b561e
to
7fe842d
Compare
Hmm. Most of the CI failures (all except one) appear outside the code touched by this PR. Help? |
db77418
to
b61ba55
Compare
b61ba55
to
64acc73
Compare
In general, I like the idea. However, I'd love to make it more generic, i.e. cover more use-cases. --check-extensions <EXTS>
Test the specified file extensions for URIs when checking files locally.
Multiple extensions can be separated by commas. Extensions will be checked in order of appearance.
Example: --check-extensions html,htm,php,asp,aspx,jsp,cgi Any thoughts? |
1c40f0a
to
3b3caa1
Compare
@mre done! |
Sounds like a useful addition. I agree that we can add this more generic approach to lychee |
lychee-lib/src/client.rs
Outdated
// if the path has no file extension, try to append some | ||
let mut path_buf = path.clone(); | ||
let mut found = false; | ||
for ext in &self.check_extensions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the found
flag and manually looping and breaking is not too nice. Instead try to use .iter
and find_map
.
Maybe similar to the following (didn't test it):
let check_extension_exists = |ext: &str| { // could also be a function
let mut path_buf = path.clone();
path_buf.set_extension(ext);
if path_buf.exists() {
Some(path_buf)
} else {
None
}
};
self.check_extensions.iter()
.find_map(check_extension_exists)
.map(|path_buf| {
if self.include_fragments {
self.check_fragment(&path_buf, uri).await
} else {
Status::Ok(StatusCode::OK)
}
})
.next()
.unwrap_or_else(|| ErrorKind::InvalidFilePath(uri.clone()).into());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thomas-zahner correct me if I am wrong, but don't we want to stop at the first found extension? The suggested map()
would not stop there, and in the best case do unnecessary work, in the worst case do unwanted work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is my proposed code to address your concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually iterators are lazy and run only when it's really necessary. This is described in the rust book.
For example we can convert an infinite range of numbers into an iterator, map the numbers and take just a few of the numbers. Then the map only executes as many times as we take a result. In my code snippet above the next()
consumes just one element so that the map only runs once.
let iter = (0..).into_iter().map(|x| x * 2).take(5);
println!("{:?}", iter.collect::<Vec<_>>());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually iterators are lazy and run only when it's really necessary. This is described in the rust book.
For example we can convert an infinite range of numbers into an iterator, map the numbers and take just a few of the numbers. Then the map only executes as many times as we take a result. In my code snippet above the
next()
consumes just one element so that the map only runs once.let iter = (0..).into_iter().map(|x| x * 2).take(5); println!("{:?}", iter.collect::<Vec<_>>());
Okay, I didn't know that :-)
So now I'm torn: the code you showed that uses a map()
looks less obvious than I'd like it to be, certainly more complicated than equivalent constructs in languages that I'm more familiar with than Rust. It may very well be more idiomatic Rust code than the for
loop code that is in the current version of this PR. However, the latter version is much easier to parse and grok for me. So I am torn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, maybe my proposed iter solution is not perfect but normally iter is to be preferred over for loops. Maybe the iter solution could be simplified further. @mre what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I also prefer iterators, but in this case in particular, I'd lean towards the for-loop because of the early return. Now that the found
flag got removed, it's gotten much more readable already. So I'm fine with the current version.
a17cb4a
to
b3742fb
Compare
In many circumstances (GitHub Pages, Apache configured with MultiViews, etc), web servers process URIs by appending the `.html` file extension when no file is found at the path specified by the URI but a `.html` file corresponding to that path _is_ found. To allow Lychee to use the fast, offline method of checking such files locally via the `file://` scheme, let's handle this scenario gracefully by adding the `--check-extensions=html` option. Note: This new option can take a list of file extensions to use; The first one for which a corresponding file is found is then used. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
b3742fb
to
976ba4b
Compare
In many circumstances (GitHub Pages, Apache configured with MultiViews, etc), web servers process URIs by appending the
.html
file extension when no file is found at the path specified by the URI but a.html
file corresponding to that path is found.To allow Lychee to use the fast, offline method of checking such files locally via the
file://
scheme, let's handle this scenario gracefully by adding the--auto-append-html-fileext
option.This is especially nice to have when migrating sites that have been run on, say, a web app, to a static site generated by Hugo, as I am trying to do with Git's home page.