Background
Currently there is a feature to crawl through rendered HTML for additional links.
It is enabled by passing in a boolean.
Issue #127 was opened asking for the ability to exclude some domains from being rendered.
Change
We could change the crawl behaviour from a boolean to an optional function the consumer can pass in to decide whether a link should be rendered.
Option 1: On all HTML
The crawl function could be called once a render is complete.
It would be responsible for finding all links on the page and returning an array of new pages to render.
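To make Option 1 concrete, here is a rough sketch of what a consumer-supplied crawl function might look like. The type name `CrawlAllHtml`, the `(html, pageUrl)` parameters, and the blocked host are assumptions for illustration only; the proposal does not fix a signature.

```typescript
// Hypothetical shape for Option 1: called once per rendered page,
// returns the list of further pages to render.
type CrawlAllHtml = (html: string, pageUrl: string) => string[];

// Hosts the consumer does not want rendered (illustrative value).
const blockedHosts = new Set<string>(["ads.example.com"]);

const crawl: CrawlAllHtml = (html, pageUrl) => {
  const pages: string[] = [];
  // Naive href extraction for illustration; a real HTML parser is more robust.
  const hrefPattern = /<a\s[^>]*href="([^"]*)"/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    let url: URL;
    try {
      url = new URL(match[1], pageUrl); // resolve relative links against the page
    } catch (_err) {
      continue; // skip hrefs that cannot be parsed as URLs
    }
    if (!blockedHosts.has(url.hostname)) {
      pages.push(url.href);
    }
  }
  return pages;
};
```

With this shape the consumer owns both link discovery and filtering, which is flexible but means every consumer has to write the parsing step.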
Optionally we could add a getHrefsFromHtml convenience function to save each consumer writing this parser.
Option 2: On each link
The crawl function could be called for each link after we've parsed the rendered HTML for links.
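As a rough sketch of Option 2, the crawl function could act as a per-link predicate. The type name `CrawlEachLink`, the `(href, pageUrl)` parameters, and the boolean return value are assumptions for illustration; the proposal only says the function decides whether a link should be rendered.

```typescript
// Hypothetical shape for Option 2: called once per discovered link,
// returns true if that link should be rendered.
type CrawlEachLink = (href: string, pageUrl: string) => boolean;

// Hosts the consumer does not want rendered (illustrative value).
const skippedHosts = new Set<string>(["ads.example.com"]);

const shouldRender: CrawlEachLink = (href, pageUrl) => {
  try {
    // Resolve relative links against the page they were found on.
    const url = new URL(href, pageUrl);
    return !skippedHosts.has(url.hostname);
  } catch (_err) {
    return false; // do not render links that cannot be parsed as URLs
  }
};
```

Here link discovery stays inside the library, so the consumer only writes the filtering rule.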
Feedback
Feedback is welcome. Please comment below with your thoughts.