assets + stylesheet assets #1475

Draft
wants to merge 162 commits into
base: master
from

eoghanmurray
Contributor

@eoghanmurray eoghanmurray commented May 14, 2024

Medium to large enhancement to rrweb, building upon #1239

Please review #1437 first as this also builds upon that PR (which can be merged before #1239)


Asset Events

Assets are a new type of event that embodies a serialized version of an HTTP resource captured during snapshotting. Examples include images, media files and stylesheets. Resources can be fetched externally (from cache) in the case of an href, or internally for blob: URLs and same-origin stylesheets. Asset events are emitted after either a FullSnapshot or an IncrementalSnapshot (mutation). Although they may have a later timestamp, during replay they are rebuilt as part of the snapshot they are associated with. If, for example, a stylesheet is referenced at the time of a FullSnapshot but hasn't been downloaded yet, a subsequent mutation event with a later timestamp can, together with the asset event, recreate the experience of a network-delayed load of the stylesheet.
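To illustrate the "later timestamp, same snapshot" idea, here is a minimal sketch. The event shapes, the `assetUrls` field, the URL-based association and the `resolveAssets` helper are all assumptions made for illustration; they are not rrweb's actual event types or replayer code.

```typescript
// Hypothetical, simplified event shapes (NOT rrweb's real types).
type SketchEvent =
  | { kind: 'snapshot'; timestamp: number; assetUrls: string[] }
  | { kind: 'asset'; timestamp: number; url: string; payload: string };

// For each snapshot, resolve the assets it references, regardless of
// whether the asset event was emitted (timestamped) after the snapshot.
function resolveAssets(
  events: SketchEvent[],
): Map<number, Map<string, string>> {
  // Index every asset event by URL first (assumption: URL is the key;
  // a later asset for the same URL overwrites an earlier one here).
  const byUrl = new Map<string, string>();
  for (const e of events) {
    if (e.kind === 'asset') byUrl.set(e.url, e.payload);
  }

  // Then attach the indexed payloads to the snapshots that reference them.
  const result = new Map<number, Map<string, string>>();
  for (const e of events) {
    if (e.kind !== 'snapshot') continue;
    const resolved = new Map<string, string>();
    for (const url of e.assetUrls) {
      const payload = byUrl.get(url);
      if (payload !== undefined) resolved.set(url, payload);
    }
    result.set(e.timestamp, resolved);
  }
  return result;
}
```

In this sketch a snapshot at timestamp 100 can be rebuilt with a stylesheet whose asset event only arrived at timestamp 250; URLs with no asset event are simply left unresolved.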

Assets to mitigate stylesheet processing cost

In the case of stylesheets, rrweb does some record-time processing in order to serialize the CSS rules. This had a negative effect on initial page load times and on how quickly the FullSnapshot was taken (see https://pagespeed.web.dev/). These stylesheets are now taken out of the main thread and processed asynchronously, to be emitted later (within processStylesheetsWithin ms). There is no corresponding delay on the replay side, so long as the stylesheet has been successfully emitted.
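A rough sketch of what "process later, but within a deadline" could look like. `scheduleWithin` is a hypothetical helper written for this illustration, not the code in this PR; using `requestIdleCallback` with a `timeout` and falling back to `setTimeout` is an assumption about one reasonable way to do it.

```typescript
// Hypothetical helper (not part of rrweb): run `task` no later than
// roughly `withinMs` from now, or synchronously for zero/negative values.
type IdleScheduler = (cb: () => void, opts: { timeout: number }) => unknown;

function scheduleWithin(
  task: () => void,
  withinMs: number,
): 'sync' | 'idle' | 'timeout' {
  if (withinMs <= 0) {
    // Zero or negative: process immediately, blocking the caller.
    task();
    return 'sync';
  }
  const ric = (
    globalThis as unknown as { requestIdleCallback?: IdleScheduler }
  ).requestIdleCallback;
  if (typeof ric === 'function') {
    // Prefer idle time; `timeout` asks the browser to run the callback
    // once the deadline passes even if it never becomes idle.
    ric.call(globalThis, () => task(), { timeout: withinMs });
    return 'idle';
  }
  // Fallback for environments without requestIdleCallback.
  setTimeout(task, withinMs);
  return 'timeout';
}
```

The return value is only there to make the chosen path visible when experimenting; a real implementation would not need it.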

Asset Capture Configuration

The captureAssets configuration option allows you to customize the asset capture process. It is an object with the following properties:

  • objectURLs (default: true): Whether to capture same-origin blob: assets that are referenced via object URLs. Object URLs are created using the URL.createObjectURL() method.

  • origins (default: false): This property determines which origins to capture assets from. It can have the following values:

    • false or []: Disables capturing any assets apart from object URLs, stylesheets (unless the stylesheets option is set to false) and images (if the images option is turned on).
    • true: Captures assets from all origins.
    • [origin1, origin2, ...]: Captures assets only from the specified origins. For example, origins: ['https://s3.example.com/'] captures all assets from the origin https://s3.example.com/.
  • images (default: false, or true if inlineImages is true in the rrweb.record config): When set, this option turns on asset capturing for all images irrespective of their origin. Unless this option is explicitly set to false, images may still be captured if their src URL matches the origins setting above.

  • stylesheets (default: 'without-fetch'): When set to true, this turns on capturing of all stylesheets and style elements via the asset system, irrespective of origin. The default of 'without-fetch' is designed to match the previous inlineStylesheet behaviour, whereas true additionally allows stylesheets that are otherwise inaccessible due to CORS restrictions to be captured via a fetch call, which will normally use the browser cache. Unless this is explicitly set to false, a stylesheet will be captured if it matches the origins config above.

  • stylesheetsRuleThreshold (default: 0): Only invoke the asset system for stylesheets with more than this number of rules. The default is zero (rather than, say, 100) because only the 'outer' rules are counted (e.g. a stylesheet could have a single media rule which nests thousands of sub-rules). This default may be increased based on feedback.

  • processStylesheetsWithin (default: 2000): The maximum time in milliseconds that the browser may delay before processing stylesheets. Inline <style> elements will be processed within half this value. Lower this value if you wish to improve the odds that short 'bounce' visits will emit the asset before the visitor unloads the page. Set to zero or a negative number to process stylesheets synchronously, which can cause poor scores on e.g. https://pagespeed.web.dev/ ("Third-party code blocked the main thread").
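As a reading aid, the option semantics above can be sketched as a small decision function. The `CaptureAssetsSketch` interface, `AssetKind` and `shouldCapture` are hypothetical names introduced here; this is one interpretation of the description, not the implementation in this PR, and stylesheet handling ('without-fetch', rule thresholds) is deliberately left out.

```typescript
// Hypothetical types mirroring part of the options described above.
type AssetKind = 'image' | 'other';

interface CaptureAssetsSketch {
  objectURLs: boolean;
  origins: boolean | string[];
  images?: boolean;
}

// Normalise a URL or configured origin (e.g. with a trailing slash)
// to its origin, returning null if it cannot be parsed.
function toOrigin(value: string): string | null {
  try {
    return new URL(value).origin;
  } catch {
    return null;
  }
}

function matchesOrigins(url: string, origins: boolean | string[]): boolean {
  if (origins === true) return true; // all origins
  if (origins === false || origins.length === 0) return false;
  const origin = toOrigin(url);
  if (origin === null) return false;
  return origins.some((o) => toOrigin(o) === origin);
}

function shouldCapture(
  url: string,
  kind: AssetKind,
  cfg: CaptureAssetsSketch,
): boolean {
  // blob: URLs are governed by objectURLs only.
  if (url.startsWith('blob:')) return cfg.objectURLs;
  if (kind === 'image') {
    if (cfg.images === false) return false; // explicit opt-out wins
    if (cfg.images === true) return true; // all images, any origin
  }
  // Otherwise fall back to the origins setting.
  return matchesOrigins(url, cfg.origins);
}

// Example configuration using the values from the description.
const exampleConfig: CaptureAssetsSketch = {
  objectURLs: true,
  origins: ['https://s3.example.com/'],
};
```

With `exampleConfig`, a media file under https://s3.example.com/ would be captured, while an image from another origin would be skipped unless `images` is set to true.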

changeset-bot bot commented May 14, 2024

🦋 Changeset detected

Latest commit: e6e5de9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 8 packages
Name Type
rrweb-snapshot Major
rrweb Major
rrdom Major
@rrweb/types Major
rrdom-nodejs Major
rrweb-player Major
@rrweb/web-extension Major
rrvideo Major

eoghanmurray and others added 26 commits May 23, 2024 11:41
… checking the replay took me ages to debug as I thought it was something I introduced
… mutations with `absoluteToStylesheet` - I had accidentally removed that with initial work
…yStylesheet to happen in a single place during initial snapshot.
…n't being assigned to correct text node, and provide remedy whereby css text is properly re-split upon rebuild
…hat the css hasn't been altered server-side/in-transit
…ion, even if there is only one text element on the <style>.

Recent conversation with Justin has firmed up my thinking on why we look at .sheet in the first place (it's not because we want to get rid of vendor prefixes, which may actually be undesirable where they are useful for accurate replay in the replayer - but rather it's to ensure we don't miss any programmatic style mutations that have happened - have added comments to reflect this)
rrweb:test:     test/utils.ts:181:43 - error TS2345: Argument of type 'elementNode & { rootId?: number | undefined; isShadowHost?: boolean | undefined; isShadow?: boolean | undefined; } & { id: number; }' is not assignable to parameter of type '{ attributes: { src?: string | undefined; }; }'.
rrweb:test:       Types of property 'attributes' are incompatible.
rrweb:test:         Type 'attributes' has no properties in common with type '{ src?: string | undefined; }'.
rrweb:test:
rrweb:test:     181               stripBlobURLsFromAttributes(add.node);
eoghanmurray and others added 27 commits May 23, 2024 11:43
…the `origins` mechanism to attempt to inline all images. The previous method will still be used in standalone snapshots
…rt - they are basically testing the same thing as the iframe versions still have <img> elements. These tests were also previously marked DEPRECATED due to association with older inlineImages setting
…et - the asset event will include the final content anyway
…a `splits` parameter when it is an asset so that there's a uniform way of processing it
…ting a zero or negative value for `config.processStylesheetsWithin`
…rt of the captureAssets config where possible
… config.stylesheets, as I want that to match inlineImages in that `true` will attempt to fetch all stylesheets
…tion targets and will be important for switch to vite
Thanks for adding this document. I found that even the ChatGPT-4 translation is not good enough,
so I changed some wording to make the doc more readable for Chinese users.
3 participants