Error capturing some urls #88

Open
andreasmcdermott opened this issue Aug 2, 2017 · 0 comments
andreasmcdermott commented Aug 2, 2017

Trying to grab a screenshot from this url:
https://www.nextgenscience.org/resources/equip-professional-learning-facilitator%E2%80%99s-guide-v20
(unescaped: https://www.nextgenscience.org/resources/equip-professional-learning-facilitator’s-guide-v20)

And it results in the following error:

2017-08-02T21:41:37.851Z - debug: Request query parameters: {"url":"https://www.nextgenscience.org/resources/equip-professional-learning-facilitator%E2%80%99s-guide-v20"}
2017-08-02T21:41:37.851Z - debug: Request body parameters: {}
2017-08-02T21:41:37.861Z - debug: Sending file ("https://www.nextgenscience.org/resources/equip-professional-learning-facilitator’s-guide-v20") in response
2017-08-02T21:41:37.864Z - info: Capture site screenshot: "https://www.nextgenscience.org/resources/equip-professional-learning-facilitator’s-guide-v20"
2017-08-02T21:41:37.864Z - debug: Options for script: {"url":"https://www.nextgenscience.org/resources/equip-professional-learning-facilitator’s-guide-v20"}, base64: eyJ1cmwiOiJodHRwczovL3d3dy5uZXh0Z2Vuc2NpZW5jZS5vcmcvcmVzb3VyY2VzL2VxdWlwLXByb2Zlc3Npb25hbC1sZWFybmluZy1mYWNpbGl0YXRvchlzLWd1aWRlLXYyMCJ9, command: ["phantomjs","--ignore-ssl-errors=true","--web-security=false","/usr/local/lib/node_modules/manet/src/scripts/screenshot.js","eyJ1cmwiOiJodHRwczovL3d3dy5uZXh0Z2Vuc2NpZW5jZS5vcmcvcmVzb3VyY2VzL2VxdWlwLXByb2Zlc3Npb25hbC1sZWFybmluZy1mYWNpbGl0YXRvchlzLWd1aWRlLXYyMCJ9","/var/folders/2m/3h5k0q2j40s2gk2tljb373rr_hf0p8/T/385fe470089a2bc87bfd2a1eb43caf15499d31e8.png"]
2017-08-02T21:41:41.977Z - debug: Process output: Script options: {"url":"https://www.nextgenscience.org/resources/equip-professional-learning-facilitator�s-guide-v20"}
Error: SyntaxError: JSON Parse error: Unterminated string
Error: TypeError: undefined is not an object (evaluating 'options.clipRect')
2017-08-02T21:41:41.977Z - debug: Execution time: 4.11 sec
2017-08-02T21:41:41.977Z - debug: Process finished work: eyJ1cmwiOiJodHRwczovL3d3dy5uZXh0Z2Vuc2NpZW5jZS5vcmcvcmVzb3VyY2VzL2VxdWlwLXByb2Zlc3Npb25hbC1sZWFybmluZy1mYWNpbGl0YXRvchlzLWd1aWRlLXYyMCJ9
2017-08-02T21:41:41.978Z - error: Error while sending data file: ENOENT: no such file or directory, stat '/var/folders/2m/3h5k0q2j40s2gk2tljb373rr_hf0p8/T/385fe470089a2bc87bfd2a1eb43caf15499d31e8.png'

Running version: 0.4.19.
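
Decoding the base64 payload from the "Options for script" line above hints at where the URL breaks. The sketch below is only a diagnostic built from the logged values; I have not checked manet's source, so the single-byte (`latin1`) encoding step is an assumption that happens to be consistent with the log, not a confirmed cause. The `options` object and all variable names are made up for illustration.

```javascript
// Base64 payload copied verbatim from the "Options for script" log line.
const payload =
  "eyJ1cmwiOiJodHRwczovL3d3dy5uZXh0Z2Vuc2NpZW5jZS5vcmcvcmVzb3VyY2VzL2VxdWlwLXByb2Zlc3Npb25hbC1sZWFybmluZy1mYWNpbGl0YXRvchlzLWd1aWRlLXYyMCJ9";

const bytes = Buffer.from(payload, "base64");

// Find the byte that sits where the "’" should be, right after "facilitator".
const marker = Buffer.from("facilitator", "ascii");
const suspect = bytes[bytes.indexOf(marker) + marker.length];
console.log("byte after 'facilitator': 0x" + suspect.toString(16));

// A raw control character inside a JSON string is invalid, so parsing fails.
let parseError = null;
try {
  JSON.parse(bytes.toString("latin1"));
} catch (e) {
  parseError = e;
}
console.log("parse error:", parseError && parseError.message);

// Hypothetical options object matching the logged request.
const options = {
  url: "https://www.nextgenscience.org/resources/equip-professional-learning-facilitator’s-guide-v20",
};
const json = JSON.stringify(options);

// Assumption: a one-byte-per-character encoding keeps only the low byte of U+2019.
const truncated = Buffer.from(json, "latin1").toString("base64");
console.log("latin1 encoding matches logged payload:", truncated === payload);

// Encoding and decoding as UTF-8 keeps the character intact.
const utf8 = Buffer.from(json, "utf8").toString("base64");
const roundTrip = JSON.parse(Buffer.from(utf8, "base64").toString("utf8"));
console.log("utf8 round trip keeps the URL:", roundTrip.url === options.url);
```

If the payload really is built this way, encoding the options as UTF-8 on the Node side and decoding as UTF-8 in the PhantomJS script would be one possible fix for the first URL. It would not explain the Guardian failure below.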

edit

This URL fails as well: https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban

That suggests the "’" might not be the problem.

edit 2

So the Guardian URL first returns an error in the page ({"error":{"error":"Can not capture: https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"}}), but a few seconds later the capture seems to succeed, and if I request the same URL again, it loads the image from storage.

2017-08-02T22:17:49.993Z - debug: Request query parameters: {"url":"https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"}
2017-08-02T22:17:49.993Z - debug: Request body parameters: {}
2017-08-02T22:17:50.009Z - debug: Sending file ("https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban") in response
2017-08-02T22:17:50.011Z - info: Capture site screenshot: "https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"
2017-08-02T22:17:50.012Z - debug: Options for script: {"url":"https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"}, base64: eyJ1cmwiOiJodHRwczovL3d3dy50aGVndWFyZGlhbi5jb20vd29ybGQvMjAxNi9vY3QvMjAvc3BhbmlzaC1jb3VydC1vdmVydHVybnMtY2F0YWxvbmlhLWJ1bGxmaWdodGluZy1iYW4ifQ==, command: ["phantomjs","--ignore-ssl-errors=true","--web-security=false","/usr/local/lib/node_modules/manet/src/scripts/screenshot.js","eyJ1cmwiOiJodHRwczovL3d3dy50aGVndWFyZGlhbi5jb20vd29ybGQvMjAxNi9vY3QvMjAvc3BhbmlzaC1jb3VydC1vdmVydHVybnMtY2F0YWxvbmlhLWJ1bGxmaWdodGluZy1iYW4ifQ==","/var/folders/2m/3h5k0q2j40s2gk2tljb373rr_hf0p8/T/c721ae458be487ced560f9ef619d45245b425560.png"]
2017-08-02T22:17:50.811Z - debug: Process output: Script options: {"url":"https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"}
Resource was downloaded: data:image/png;base64,iVBORw0KGgoAAAA[...]
Resource was downloaded: data:application/x-font-woff;base64,d09GRgABAAA[...]
Resource was downloaded: https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban
Resource was downloaded: data:application/x-font-woff;base64,d09GRg[...]
Resource was downloaded: data:application/x-font-woff;base64,d09G[...]
Resource was downloaded: data:application/x-font-woff;base64,d09G[...]
Resource was downloaded: data:application/x-font-woff;base64,d09GR[...]
2017-08-02T22:17:50.831Z - error: Process error: 
2017-08-02T22:17:50.831Z - debug: Execution time: 0.8 sec
2017-08-02T22:17:50.831Z - debug: Process finished work: eyJ1cmwiOiJodHRwczovL3d3dy50aGVndWFyZGlhbi5jb20vd29ybGQvMjAxNi9vY3QvMjAvc3BhbmlzaC1jb3VydC1vdmVydHVybnMtY2F0YWxvbmlhLWJ1bGxmaWdodGluZy1iYW4ifQ==
2017-08-02T22:17:50.832Z - error:  error=Can not capture: https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban

When I request the same URL again, I get the screenshot. Logs:

2017-08-02T22:20:33.081Z - debug: Request query parameters: {"url":"https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"}
2017-08-02T22:20:33.081Z - debug: Request body parameters: {}
2017-08-02T22:20:33.085Z - debug: Sending file ("https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban") in response
2017-08-02T22:20:33.085Z - info: Capture site screenshot: "https://www.theguardian.com/world/2016/oct/20/spanish-court-overturns-catalonia-bullfighting-ban"
2017-08-02T22:20:33.086Z - debug: Take screenshot from file storage: eyJ1cmwiOiJodHRwczovL3d3dy50aGVndWFyZGlhbi5jb20vd29ybGQvMjAxNi9vY3QvMjAvc3BhbmlzaC1jb3VydC1vdmVydHVybnMtY2F0YWxvbmlhLWJ1bGxmaWdodGluZy1iYW4ifQ==

Seems like this is actually two different problems.

andreasmcdermott changed the title from "Error capturing urls that includes escaped characters" to "Error capturing some urls" on Aug 2, 2017