Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add back/forward cache NotRestoredReasons #9360

Merged
merged 42 commits into from Feb 21, 2024

Conversation

rubberyuzu
Copy link
Contributor

@rubberyuzu rubberyuzu commented May 30, 2023

This PR adds Back/forward cache NotRestoredReasons interface, adds it to document state, and sets the appropriate value. Explainer is here.

(Sorry for lots of unnecessary commits - I used VSCode for editing the source file, and it automatically added and removed parts of the file due to the large size and messed up rebase. Later I switched to vim and didn't have the problem.)

Please take a look @rakina @domenic @fergald @smaug----


/acknowledgements.html ( diff )
/browsing-the-web.html ( diff )
/document-lifecycle.html ( diff )
/dom.html ( diff )
/index.html ( diff )
/infrastructure.html ( diff )
/nav-history-apis.html ( diff )

@domenic
Copy link
Member

domenic commented May 30, 2023

Hi @rubberyuzu,

You seem to have opened 4 separate draft pull requests. Usually, we have people wait to open a pull request until the feature is ready, and once it is ready, they only open one.

Would you be willing to close some of these? Giving all the 672 watchers of these repositories updates on every PR can be a bit confusing.

@rubberyuzu
Copy link
Contributor Author

Hi Domenic, thanks for pointing this out! I created the draft PRs just to generate HTML diff files so I didn't need most of them open, so I closed three of them and left this one open.
(If there's any good way to produce HTML diff when creating PRs in private repo, please let me know.)

@domenic
Copy link
Member

domenic commented May 30, 2023

You can create diffs using either the Git command line tools, or the GitHub web UI. For an example of the latter, see main...rubberyuzu:html:build-nrr-document (click on the "Files changed" tab).

@rakina
Copy link
Member

rakina commented May 30, 2023

Sorry @domenic, I was the one that suggested her to open a draft PR, to get the auto-generated HTML diff preview (not the source diff) to make it easier for reviewers of @rubberyuzu's PR in her own repos. Didn't realize that it's visible in the PR lists still and also updates all watchers :(

Is it ok to keep just 1 draft PR here for this purpose (if there's a way to make it not noisy to watchers)? If not probably we'll just go with uploading the generated HTML in @rubberyuzu's PRs manually instead. Sorry for the noise!

@domenic
Copy link
Member

domenic commented May 30, 2023

No worries! One draft PR seems fine.

@rubberyuzu rubberyuzu changed the title Build NotRestoredReasons for page Add back/forward cache NotRestoredReasons Jun 1, 2023
@rubberyuzu rubberyuzu marked this pull request as ready for review June 1, 2023 08:13
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall looks very solid. I have a lot of comments about minor improvements; the bigger ones are around ObservableArray manipulation, and fixing the builder algorithms to actually do something with the created object.

I would encourage you to spend a bit more time on source formatting if possible, matching the conventions of the rest of the file. Perhaps a good place to start would be turning off whatever wrapping your editor is doing, and instead using https://domenic.github.io/rewrapper/ or https://github.com/domfarolino/specfmt . But we can straighten that out eventually.

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@rubberyuzu
Copy link
Contributor Author

Thanks for the review Domenic! I think I addressed all of your comments. PTAL again :)

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just editorial things left!

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@rubberyuzu
Copy link
Contributor Author

Thanks! I think I addressed all the comments. PTAL again

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit (6925f59) with some final editorial and source formatting fixes. Take a look, but at this point, no editorial issues remain.

However, I added some comments on the reason strings themselves. And, I spotted one potentially big issue. Which is, it is weird to store a NotRestoredReasons object in document state.

The point of document state is that it is something used to re-create new Documents when old ones get evicted from bfcache. But when you evict a Document, you need to also delete all the JavaScript/IDL objects associated with that Document. Keeping a NotRestoredReasons around breaks this.

In other words, as you've currently specified it, the NotRestoredReasons object is from the old Document (and thus old Window). This means you'll get funny results like the following:

const nrr = navigationTimingEntry.notRestoredReasons;

nrr instanceof NotRestoredReasons; // false, because it's an instance of the *old* Window's NotRestoredReasons

const w = nrr.__proto__.__proto__.constructor.constructor('return this')();
// w is now *the old Window*
// w.document is *the old Document*

This kind of sucks. You need to store the equivalent information in the document state, but then actually create the NotRestoredReasons object from it later. (I guess in some navigation timing spec.)

I have some ideas on how to do this. But first I need to understand your intentions for what realms/globals/Windows the created NotRestoredReasons objects should belong to.

In particular, should the following be true?

navigationTimingEntryFromTop.notRestoredReasons.children[0] === navigationTimingEntryFromChild.notRestoredReasons

// and in this case we'd create each notRestoredReasons in its corresponding Window:

navigationTimingEntryFromTop.notRestoredReasons   instanceof top.NotRestoredReasons &&
navigationTimingEntryFromChild.notRestoredReasons instanceof frames[0].NotRestoredReasons

This obviously cannot be true for the un-masked cross-origin iframe case. (Because navigationTimingEntryFromChild is in a different process vs. navigationTimingEntryFromTop) So maybe it should never be true? Or maybe it's better to reuse the objects when you can, only creating a new object graph when you go to another process.

The alternative is that you use different objects containing the same information. Then

navigationTimingEntryFromTop.notRestoredReasons.children[0] !== navigationTimingEntryFromChild.notRestoredReasons

// and in this case each tree of NotRestoredReasons would be separately created in
// the window of that tree's root:

navigationTimingEntryFromTop.notRestoredReasons             instanceof top.NotRestoredReasons &&
navigationTimingEntryFromTop.children[0].notRestoredReasons instanceof top.NotRestoredReasons &&
navigationTimingEntryFromChild.notRestoredReasons           instanceof frames[0].NotRestoredReasons

Let me know which of these is preferred (and ideally web platform tested), and I can give suggestions on how to spec that.

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@rubberyuzu
Copy link
Contributor Author

rubberyuzu commented Jun 22, 2023

Thanks for the comment!
Right, it makes sense that notRestoredReasons should not belong to DocumentState given its lifecycle. We can maybe keep the reasons in Document or DocumentState initially, and then move it to SessionHistory (or something else that remains after document destruction) when the document gets destroyed or when document gets restored. I think notRestoredReasons should belong to whatever realm/globals/Window NavigationTimingEntry belongs to.

As for your code example, navigationTimingEntryFromChild.notRestoredReasons will always be null because notRestoredReasons only gets attached to top-level main frame.

Please let me know what you think.

@domenic
Copy link
Member

domenic commented Jun 23, 2023

As for your code example, navigationTimingEntryFromChild.notRestoredReasons will always be null because notRestoredReasons only gets attached to top-level main frame.

Ah, right, that makes this easier! OK then. Here is my suggested strategy.

Define a struct called "not restored reasons". It has a src, id, name, URL, reasons, and children. Besides being a struct instead of an interface, its fields have the following differences:

  • Its URL field should be a URL
  • Its reasons is a list of strings, instead of an observable array.
  • Its children is a list of not restored reasons structs, instead of an observable array of NotRestoredReasons objects.

(You'll need to go back to defining the reasons and children items explicitly.) Some example struct definitions in the spec are https://html.spec.whatwg.org/#coop-enforcement-result and https://html.spec.whatwg.org/#source-snapshot-params .

Now basically convert all the work you've done to operate on this not restored reasons struct. It's fine to store that in document state (since it's realm-independent). You can build it in basically the same way.

Now we need to redefine the NotRestoredReasons interface. Update it to only have one private member: a not restored reasons. (I'm unsure on a good name for this member. Maybe "backing struct", or maybe just "not restored reasons". Up to you.) And then update all its non-ObservableArray getters to return the corresponding item from the backing struct. For the url getter, you'll need to have it serialize the URL.

To handle the observable array items, add a small wrapper algorithm, "create a NotRestoredReasons object", given a not restored reasons struct and a realm. This algorithm creates a new NotRestoredReasons, sets its backing struct, and then copies the reasons and children across into the reasons and children backing lists. For reasons, you can just extend the backing list. For children, you'll need to loop and append, at each step recursively calling "create a NotRestoredReasons object" to convert from the not restored reasons struct to a NotRestoredReasons object.

OK, we're almost done. The final step is to be clear on how we'll specify navigationTimingEntry.notRestoredReasons. You'd define that NavigationTimingEntry objects have a notRestoredReasons private value, which starts null. Then update create a navigation timing entry to fill it in using "create a NotRestoredReasons object" given the document's node navigable's active session history entry's document state's not restored reasons. Then, the getter steps for notRestoredReasons can return the notRestoredReasons private value.

All this ensures that we create the NotRestoredReasons object once per document, while the persistent not restored reasons structs can stick around in session history as necessary.

I hope this makes sense! I'm sorry we spotted this so late. Hopefully it's not too much of a pain to change.

@rubberyuzu
Copy link
Contributor Author

Thanks Domenic for the suggestion! I have a question:

then update all its non-ObservableArray getters to return the corresponding item from the backing struct

The new NotRestoredReasons interface only has a not restored reasons struct as its member right? Even so does it still define getters for not restored reasons' items (such as url and src)?
Please let me know if there is any existing example like this.

Thanks in advance!

@domenic
Copy link
Member

domenic commented Jun 26, 2023

Yes, that's right. You would define them roughly like the following:

The src getter steps are to return this's backing struct's src.

@rubberyuzu
Copy link
Contributor Author

Thanks! Another small question: Can the dom-intro (note to developers) section stay there under the struct, or should it be removed?

@domenic
Copy link
Member

domenic commented Jun 26, 2023

That part should stay with the definition of the NotRestoredReasons interface.

@rubberyuzu
Copy link
Contributor Author

I see. Does NotRestoredReasons interface have getters for src, id and name etc. even though it does not have them as a member? Also, backingStruct is a private member so we should not define getters for it?

@domenic
Copy link
Member

domenic commented Jun 27, 2023

Does NotRestoredReasons interface have getters for src, id and name etc. even though it does not have them as a member?

Yes; because those are part of the IDL, you need to define getters for them.

Also, backingStruct is a private member so we should not define getters for it?

Yes. It does not appear in the IDL, so it has no getter.

@rubberyuzu
Copy link
Contributor Author

rubberyuzu commented Jan 19, 2024

@smaug----
NotRestoredReasons is only available for the outermost main frame, and for iframes it's always null.

@smaug----
Copy link
Collaborator

Ah, ok. That isn't super clear from w3c/navigation-timing#195

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit on this side of the new addition

source Outdated Show resolved Hide resolved
@rubberyuzu
Copy link
Contributor Author

@smaug----
I updated the PerformanceNavigationTiming spec, as well as this spec.
NRR is only set when document's node navigable is a top-level traversable, and otherwise it's null.

Please let me know if there's anything else I should add in this spec for your approval, thanks!

source Show resolved Hide resolved
source Show resolved Hide resolved
@smaug----
Copy link
Collaborator

I think this is starting to look quite good. Could you still explain or fix the backing struct vs interface usage?

@rubberyuzu
Copy link
Contributor Author

@smaug----
Updated and added a comment. Can you please take a look again? Thanks!

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a final pass and found a bit more backing struct / Web IDL object confusion. But I think it's all easy to fix; the architecture is sound.

source Show resolved Hide resolved
source Show resolved Hide resolved
source Show resolved Hide resolved
source Show resolved Hide resolved
source Show resolved Hide resolved
source Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@rubberyuzu
Copy link
Contributor Author

Thanks for pointing that out Domenic! Addressed your comments.

source Outdated Show resolved Hide resolved
<dd>This <code>Document</code> has children that are in a cross-origin <code>iframe</code>,
and they prevented <a href="#note-bfcache">back/forward cache</a>; or this <code>Document</code>
could not be <a href="#note-bfcache">back/forward cached</a> for user agent-specific reasons.
This reason only appears for <span>node navigable</span>'s <span>top-level traversable</span>'s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This reason only appears for <span>node navigable</span>'s <span>top-level traversable</span>'s
This reason only appears for a <span>node navigable</span>'s <span>top-level traversable</span>'s

But, is this true still? If an iframe does something disallowed, I think it will get "masked".

We could also consider using two separate reasons, "masked" and "unknown"/"user-agent-specific"/something like that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We explicitly don't want the page to know whether it was cross-origin iframe or something in the UA which blocked bfcaching. The point is that it wasn't your own page, so you shouldn't know about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, is this true still? If an iframe does something disallowed, I think it will get "masked".

Thanks for pointing it out, this is not true any more. I removed this sentence.

Copy link
Collaborator

@smaug---- smaug---- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks reasonable now. The algorithm to build not restored reasons is a bit complicated since there we iterate through the descendants twice, but it should still work.

@domenic domenic removed the needs tests Moving the issue forward requires someone to write tests label Feb 21, 2024
@domenic domenic merged commit b979bc3 into whatwg:main Feb 21, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
addition/proposal New features or enhancements topic: navigation
Development

Successfully merging this pull request may close these issues.

None yet

10 participants