Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: move from notify v4 to notify-debouncer-full v0.3 #2503

Merged
merged 1 commit into from
May 22, 2024

Conversation

iamorphen
Copy link
Contributor

@iamorphen iamorphen commented May 20, 2024

Introduction

Closes #2077 , unblocks #2498 .

This updates our notify dependency from v4 to notify-debouncer-full v0.3. The newer notify libraries behave significantly differently from the version zola currently uses, so we need to introduce more logic to generally preserve the current end-user experience behind the scenes. The most significant difference is that serve now needs to react to a Vec<DebouncedEvent> instead of a single DebouncedEvent.

Code Changes

  1. Use notify-debouncer-full, which adds debouncing functionality peripheral to the newer notify v6. This functionality was built directly into v4.
  2. Map fine-grained notify file system event kinds into our own local abstraction for ease of use.
  3. Introduce get_relevant_event_kind to map the vector of notify file system events to our abstraction and filter out events we aren't interested in. This moves previously inlined filtering logic into its own function. The new function lives in a new library, components/fs_event_utils.
  4. Sort the incoming events by timestamp, descending. Add the events into a hash map where the key is the event path. This way we keep only the latest event for each unique path. This helps in cases where editors like Vim raise multiple events (which don't seem to be debounced) on a single path.
  5. Munge the collection in (4) into a new map that bins changes by change type.
  6. Iterate over the change map from (5) to process the group of events. This means we can make multiple changes to the site in rapid succession if the user changes a lot of files within the debounce period.
  7. reload_sass and reload_templates now pass an absolute path to the sass and templates directories, respectively, to rebuild_done_handling to participate in livereload. To support this, the Site struct now stores the sass and template directories as fields.

Testing

Unit Tests

This MR introduces a new unit test to test the new internal support function listed above. Run with cargo test --package fs_event_utils.

Integration Testing

I created the following script to rapidly modify test_site.

Expand for script.
# Config
echo " " >> test_site/config.toml
echo " " >> test_site/config.toml

# Content
echo " " >> test_site/content/root-page-1.md
echo " " >> test_site/content/posts/fixed-slug.md

# Templates
echo " " >> test_site/templates/index.html
echo " " >> test_site/templates/page.html

# Static files
echo " " >> test_site/static/site.css
echo " " >> test_site/static/scripts/hello.js

# SASS
echo " " >> test_site/sass/blog.scss
echo " " >> test_site/sass/scss.scss

# Themes
echo " " >> test_site/themes/sample/theme.toml
echo " " >> test_site/themes/sample/sass/sample.scss

Here is the result of running the test on zola built on next.

Expand for results on next.
Change detected @ 2024-05-19 20:15:53
-> Config changed. The browser needs to be refreshed to make the changes visible.
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 87ms.

Change detected @ 2024-05-19 20:15:54
-> Content changed /Users/orphen/code/zola/test_site/content/root-page-1.md
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 46ms.

Change detected @ 2024-05-19 20:15:54
-> Content changed /Users/orphen/code/zola/test_site/content/posts/fixed-slug.md
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 45ms.

Change detected @ 2024-05-19 20:15:54
-> Template changed /Users/orphen/code/zola/test_site/templates/index.html
Reloading only template
Done in 20ms.

Change detected @ 2024-05-19 20:15:54
-> Template changed /Users/orphen/code/zola/test_site/templates/page.html
Reloading only template
Done in 20ms.

Change detected @ 2024-05-19 20:15:54
-> Static file changed /Users/orphen/code/zola/test_site/static/site.css
Done in 0ms.

Change detected @ 2024-05-19 20:15:54
-> Static file changed /Users/orphen/code/zola/test_site/static/scripts/hello.js
Done in 0ms.

Change detected @ 2024-05-19 20:15:54
-> Sass file changed /Users/orphen/code/zola/test_site/sass/blog.scss
Done in 0ms.

Change detected @ 2024-05-19 20:15:54
-> Sass file changed /Users/orphen/code/zola/test_site/sass/scss.scss
Done in 0ms.

Change detected @ 2024-05-19 20:15:54
-> Themes changed.
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 44ms.

Change detected @ 2024-05-19 20:15:54
-> Themes changed.
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 45ms.

The config changes are rolled into a single update, and the content, template, static, sass, and theme changes are all processed individually (2 changes each).

Cumulative time to update is ~310 ms.

Here is the result of running the test on this feature branch.

Expand for test result on this branch.
Change detected @ 2024-05-20 09:10:48
-> Template file(s) changed /Users/orphen/code/zola/test_site/templates/index.html, /Users/orphen/code/zola/test_site/templates/page.html
Reloading only template
Done in 49ms.

Change detected @ 2024-05-20 09:10:48
-> Static file changed /Users/orphen/code/zola/test_site/static/scripts/hello.js
-> Static file changed /Users/orphen/code/zola/test_site/static/site.css
Done in 0ms.

Change detected @ 2024-05-20 09:10:48
-> Themes changed.
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 54ms.

Change detected @ 2024-05-20 09:10:48
-> Sass file(s) changed /Users/orphen/code/zola/test_site/sass/blog.scss, /Users/orphen/code/zola/test_site/sass/scss.scss
Done in 1ms.

Change detected @ 2024-05-20 09:10:48
-> Config changed. The browser needs to be refreshed to make the changes visible.
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 43ms.

Change detected @ 2024-05-20 09:10:48
-> Content changed /Users/orphen/code/zola/test_site/content/posts/fixed-slug.md
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
-> Content changed /Users/orphen/code/zola/test_site/content/root-page-1.md
Checking all internal links with anchors.
> Successfully checked 0 internal link(s) with anchors.
-> Creating 35 pages (1 orphan) and 12 sections
Done in 87ms.

(The order of events is different, subject mostly to OS event scheduling and whatever the backing notify library is at the mercy of.) Template changes are batched into a single update, sass changes are batched into a single update, themes changes are batched into a single update, the config changes are still rolled into 1, and the static files and content are processed individually (looped over).

Cumulative time to update is ~235 ms. This single test is by no means a reliable profiling, but the results are not surprising, since we perform fewer operations overall relative to current behavior on next.

Manual Tests

I modified my own personal site using zola built from this branch; in particular, I edited files with Vim. This is what revealed the need for debounce_events.

Alternatives

It is possible we perform multiple site rebuilds in one debounce period, for example by modifying the config and modifying the theme. I investigated adding an optimization that would skip all remaining event processing in a debounce period if some change requires a site rebuild. While doable, this handling greatly complicates the logic in my opinion.

First, we can no longer simply match on ChangeKind because we would need to prioritize handling of particular events. This is mostly done by a branch-order-matters if that first checks for config changes, then template changes, then etc. In the case that we don't do a site rebuild in a debounce period, we have to make sure we correctly apply all required updates to account for each change.

If we do a site rebuild, we also need a clean way to end event processing early. We can do this with labeled match cases and break to those labels, but we'd also need to account for printing the time taken to apply site changes. Time-tracking and breaking at various points during the event handling makes understanding control flow more difficult.

To keep the user-facing experience similar with what's on master, we would also need to let the end-user know which files we observed changing, and the reporting of such becomes non-trivial (ex: if config changed, print usual config-changed message and then list all other files; but if templates changed, print usual templates-changed message and then list all other files; but if you don't rebuild the site and only do partial updates , then...).

I don't know whether end users are often applying many site changes in a single debounce window. I personally do not. I don't think pursuing this optimization is worth it from an ROI-to-end-user and maintainability perspective.

@iamorphen iamorphen marked this pull request as ready for review May 20, 2024 01:38
console::info(&msg);
rebuild_done_handling(
&broadcaster,
compile_sass(&site.base_path, &site.output_path),
&partial_path.to_string_lossy(),
&site.sass_path.to_string_lossy(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Keats , a question on this change and a similar one in reload_templates. Previously we passed partial paths here, but now I'm passing an absolute path to the root of the sass directory. I couldn't find (or resolve, for some reason) protocol docs for livereload v7. Is this change okay? Functionally it seems to work.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea. If it works it's probably fine, i'll try it locally.

src/cmd/serve.rs Outdated
assert!(
event.event.paths.len() == 1,
"Didn't expect notify event with multiple paths."
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either get_relevant_event_kind should make sure of that or we should return an error rather than a panic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. It now prints an error and skips that event.

src/cmd/serve.rs Outdated

// Map of change_kind -> (partial_path, full_path, event_kind).
let changes: HashMap<ChangeKind, Vec<(PathBuf, &PathBuf, &SimpleFSEventKind)>> =
map_events_to_change_map(&filtered_events, &root_dir, &config_path);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we extract the whole logic of finding and sorting/filtering events to a separate function so it's easy to test and doesn't take space in the actual loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your time. Before I start, to confirm, are you asking about moving only the new code in this merge request or also existing code like is_ignored_file, is_temp_file, etc. participating in the filtering?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might make sense to move everything to a new file? But no strong feeling about it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. In that case, I will focus on just my changes for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

ChangeKind::Templates => {
let partial_paths: Vec<&PathBuf> =
change_group.iter().map(|(p, _, _)| p).collect();
let full_paths: Vec<&PathBuf> =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we do get partial and full paths?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your question correctly, we get full paths from the incoming events from notify, and we get partial paths from detect_change_kind. In the implementation on next, check the partial paths to determine whether shortcodes were modified. We pass the full paths to reload_templates. I'm not sure if the latter is necessary, so just following what I see on next.

src/cmd/serve.rs Outdated

/// Return a map storing all events of a like `ChangeKind` behind one key.
/// Each element maps `change_kind -> (partial_path, full_path, event_kind)`.
fn map_events_to_change_map<'a>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move all that code to another file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've introduced components/fs_event_utils. I took a guess at where to put code. If there's a better place, please advise.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put it in the root component, next to cli.rs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

src/cmd/serve.rs Outdated
/// with the latest `Instant` value.
fn debounce_events(
events: &Vec<(PathBuf, SimpleFSEventKind, Instant)>,
) -> Vec<(PathBuf, SimpleFSEventKind)> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the events are ordered (or that can will order them), we could just use a hashmap with the path as the key and insert them in order

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks. I took the liberty to use a hash map earlier on, and that allowed me to remove two of the functions I introduced. Upside is that the diff is smaller, downside is that we lost some test guarantees because some logic is inlined into serve.

@iamorphen iamorphen force-pushed the feat/orphen/integrate-debouncer-full branch from 4e7a725 to f26fd10 Compare May 21, 2024 12:00
Copy link
Contributor Author

@iamorphen iamorphen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the description to match the implementation of the latest update. I re-ran my manual test and confirmed the same behavior.

ChangeKind::Templates => {
let partial_paths: Vec<&PathBuf> =
change_group.iter().map(|(p, _, _)| p).collect();
let full_paths: Vec<&PathBuf> =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your question correctly, we get full paths from the incoming events from notify, and we get partial paths from detect_change_kind. In the implementation on next, check the partial paths to determine whether shortcodes were modified. We pass the full paths to reload_templates. I'm not sure if the latter is necessary, so just following what I see on next.

src/cmd/serve.rs Outdated
assert!(
event.event.paths.len() == 1,
"Didn't expect notify event with multiple paths."
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. It now prints an error and skips that event.

src/cmd/serve.rs Outdated

// Map of change_kind -> (partial_path, full_path, event_kind).
let changes: HashMap<ChangeKind, Vec<(PathBuf, &PathBuf, &SimpleFSEventKind)>> =
map_events_to_change_map(&filtered_events, &root_dir, &config_path);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

src/cmd/serve.rs Outdated
/// with the latest `Instant` value.
fn debounce_events(
events: &Vec<(PathBuf, SimpleFSEventKind, Instant)>,
) -> Vec<(PathBuf, SimpleFSEventKind)> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks. I took the liberty to use a hash map earlier on, and that allowed me to remove two of the functions I introduced. Upside is that the diff is smaller, downside is that we lost some test guarantees because some logic is inlined into serve.

src/cmd/serve.rs Outdated

/// Return a map storing all events of a like `ChangeKind` behind one key.
/// Each element maps `change_kind -> (partial_path, full_path, event_kind)`.
fn map_events_to_change_map<'a>(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've introduced components/fs_event_utils. I took a guess at where to put code. If there's a better place, please advise.

@iamorphen iamorphen force-pushed the feat/orphen/integrate-debouncer-full branch from f26fd10 to cf386fe Compare May 22, 2024 06:52
@Keats
Copy link
Collaborator

Keats commented May 22, 2024

LGTM. I'll make some changes to the code org later but it works fine locally. Thanks a lot!

@Keats Keats merged commit e7bbd3f into getzola:next May 22, 2024
5 checks passed
@iamorphen
Copy link
Contributor Author

Thanks Keats!

@iamorphen iamorphen deleted the feat/orphen/integrate-debouncer-full branch May 22, 2024 22:00
@Keats
Copy link
Collaborator

Keats commented May 23, 2024

Here's the commit i've added: b693e62
Just moving more things to that new file to try to keep the serve.rs reasonable

@iamorphen
Copy link
Contributor Author

Nice, thank you for the context and refactor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants