[Bug] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot) #129

Open
optroodt opened this issue Apr 29, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@optroodt

Since version 0.13.1, crawling often fails at the screenshot step. The crawler makes 5 attempts in total and then gives up. It doesn't happen every time, but it does happen most of the time and across different URLs; I haven't found a pattern yet.

Here is some output from the hoarder-workers container:

2024-04-29T05:23:00.415Z info: [Crawler][73] Will crawl "https://ntfy.sh/" for link with id "v1x4322u8dsfcqxu4lphrb11"
2024-04-29T05:23:02.098Z info: [Crawler][73] Successfully navigated to "https://ntfy.sh/". Waiting for the page to load ...
2024-04-29T05:23:06.107Z info: [search][167] Attempting to index bookmark with id tlxoo973txr3tdw5mj5m48n9 ...
2024-04-29T05:23:06.277Z info: [search][167] Completed successfully
2024-04-29T05:23:07.101Z info: [Crawler][73] Finished waiting for the page to load.
2024-04-29T05:23:07.113Z error: [Crawler][73] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Internal error

Let me know if you want me to gather more details.

@MohamedBassem
Collaborator

Yeah, I've seen this happen in a deployment for a friend of mine as well. I'll send a patch to make taking the screenshot optional if it fails. Thanks for the report!
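For illustration, the "optional screenshot" idea mentioned above could look roughly like the sketch below. This is not the actual patch: `ScreenshotSource` and `tryScreenshot` are hypothetical names, and the interface only assumes an object with a `screenshot()` method that resolves to bytes (Puppeteer's `Page` is expected to fit that shape). On failure the helper logs the error and returns `undefined`, so the rest of the crawl can carry on.

```typescript
// Minimal shape this sketch needs. Hypothetical interface, not taken from
// the project; a Puppeteer Page is assumed to be compatible with it.
interface ScreenshotSource {
  screenshot(): Promise<Uint8Array>;
}

// Hypothetical helper: returns the screenshot bytes, or undefined when
// capturing fails, instead of letting the error fail the whole crawl job.
async function tryScreenshot(
  page: ScreenshotSource,
  log: (msg: string) => void = console.warn,
): Promise<Uint8Array | undefined> {
  try {
    return await page.screenshot();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`Screenshot failed, continuing without it: ${reason}`);
    return undefined;
  }
}
```

A caller would then store the screenshot only when the result is defined, and continue with the page content either way.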

MohamedBassem added the bug label on Apr 29, 2024
@tehsnarf

tehsnarf commented May 3, 2024

Had the same issue when bookmarking some GitHub links. Adding '--enable-features=ConversionMeasurement,AttributionReportingCrossAppWeb' to the command section of the Chrome service resolved it for me. There seem to be some cross-site asset sharing shenanigans involved.

  chrome:
    image: gcr.io/zenika-hub/alpine-chrome:123
    restart: unless-stopped
    command:
      - --no-sandbox
      - --disable-gpu
      - --remote-debugging-address=0.0.0.0
      - --remote-debugging-port=9222
      - --hide-scrollbars
      - --enable-features=ConversionMeasurement,AttributionReportingCrossAppWeb

Also, I did have to delete the bookmark and recreate it, but the crawl seems to have completed successfully afterwards.

@MohamedBassem
Collaborator

@tehsnarf oh wow. I have no idea what those "features" are :D How did you end up finding out about them? Can you also share some of the GitHub links that were failing before this change?

@tehsnarf

tehsnarf commented May 4, 2024

@MohamedBassem The first GitHub link I used: https://github.com/Donkie/Spoolman

I saw them in the chrome log file:

hoarder-workers      | 2024-05-04T00:58:22.823Z info: [Crawler][1] Successfully navigated to "https://github.com/Donkie/Spoolman". Waiting for the page to load ...
hoarder-workers      | 2024-05-04T00:58:27.827Z info: [Crawler][1] Finished waiting for the page to load.
hoarder-workers      | 2024-05-04T00:58:27.857Z error: [Crawler][1] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Internal error
hoarder-workers      | 2024-05-04T00:58:28.873Z info: [Crawler][1] Will crawl "https://github.com/Donkie/Spoolman" for link with id "j22uq0e1qi5vqtiaim81fsoy"
hoarder-chrome       | [0504/005828.892692:WARNING:runtime_features.cc(728)] AttributionReportingCrossAppWeb cannot be enabled in this configuration. Use --enable-features=ConversionMeasurement,AttributionReportingCrossAppWeb in addition.
hoarder-workers      | 2024-05-04T00:58:30.187Z info: [Crawler][1] Successfully navigated to "https://github.com/Donkie/Spoolman". Waiting for the page to load ...
hoarder-workers      | 2024-05-04T00:58:35.188Z info: [Crawler][1] Finished waiting for the page to load.
hoarder-workers      | 2024-05-04T00:58:35.201Z error: [Crawler][1] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Internal error
hoarder-workers      | 2024-05-04T00:58:37.295Z info: [Crawler][1] Will crawl "https://github.com/Donkie/Spoolman" for link with id "j22uq0e1qi5vqtiaim81fsoy"
hoarder-chrome       | [0504/005837.307231:WARNING:runtime_features.cc(728)] AttributionReportingCrossAppWeb cannot be enabled in this configuration. Use --enable-features=ConversionMeasurement,AttributionReportingCrossAppWeb in addition.

Added the flag, restarted, deleted and recreated the bookmark, and it's golden.
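As a rough sketch of how the hint in the Chrome log above could be picked out automatically: the function below scans a single log line for the "Use --enable-features=... in addition" wording shown in the warning and returns the suggested flag. `suggestedChromeFlag` is a hypothetical helper, and the regular expression assumes the warning keeps exactly the wording seen in this thread.

```typescript
// Hypothetical helper: extract the flag that a Chrome runtime_features
// warning suggests. Assumes the "Use <flag> in addition" wording seen in
// the log lines quoted in this thread.
function suggestedChromeFlag(logLine: string): string | undefined {
  const match = logLine.match(/Use (--enable-features=\S+) in addition/);
  return match ? match[1] : undefined;
}

// Example with one of the warning lines from the log output.
const warning =
  "[0504/005828.892692:WARNING:runtime_features.cc(728)] " +
  "AttributionReportingCrossAppWeb cannot be enabled in this configuration. " +
  "Use --enable-features=ConversionMeasurement,AttributionReportingCrossAppWeb in addition.";

console.log(suggestedChromeFlag(warning));
```

Lines without that wording, such as the crawler's own error lines, yield `undefined`.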
