This repository has been archived by the owner on Jan 2, 2023. It is now read-only.

Exit with code 1 due to network error: UnknownContentError while converting html to pdf #2187

Closed
no1stunna opened this issue Feb 2, 2015 · 28 comments

@no1stunna

Here is an HTML file:

https://gist.github.com/no1stunna/36637bac02569ad6744e

When I run

    wkhtmltopdf issue.html test.pdf

I get this error:

    Loading pages (1/6)
    Counting pages (2/6)
    Resolving links (4/6)
    Loading headers and footers (5/6)
    Printing pages (6/6)
    Done
    Exit with code 1 due to network error: UnknownContentError

Most of the time it works well, but for some generated Google URLs like this one it throws this error.

Thanks a lot.

@ctrlaltdylan

Bumping; I'm having this same issue.

@kenorb

kenorb commented May 13, 2015

The same here.

    $ wkhtmltopdf page.html page.pdf
    Loading pages (1/6)
    Warning: Failed to load file://connect.facebook.net/en_US/sdk.js (ignore)
    Warning: SSL error ignored
    Counting pages (2/6)
    Resolving links (4/6)
    Loading headers and footers (5/6)
    Printing pages (6/6)
    Done
    Exit with code 1 due to network error: UnknownContentError

@kenorb

kenorb commented May 13, 2015

See: QNetworkReply Class

QNetworkReply::UnknownContentError: 299: an unknown error related to the remote content was detected

In #1948 we can read:

It looks like you are trying to load a dynamic resource which is not found (see explanation) -- you might want to see what it is.

However, that does not solve the problem, because there is no way to ignore such an error and continue. Currently it prevents generation of the output file.

@team4music

Same problem here =(

@brafdlog

Same here :(

@kai789

kai789 commented Feb 3, 2016

I hit UnknownContentError as well. It seems like when I reach the 16-18 page mark, with header+footer, it causes the error.

I moved my problem into a new issue: #2778

@cm8

cm8 commented Mar 6, 2016

I hit this error for a single page without any header or footer and with no JavaScript running.

To debug this, I first ran the conversion with --no-images, which ran fine. I then saved the file locally to find out which image was the culprit. To test with a local copy, do not forget to set e.g. <base href=".."/> appropriately in the <head> section of the HTML.

If the document is large, use grep -A2 "<img" to find URLs to check. Use wget on these URLs and inspect their content. In my case this turned up an img resource on archive.org that returned an HTML response body: an image URL that was already broken at archival time.

  • archive.org replays HTTP status codes for archived resources; in this case it is/was 400: Bad Request
  • Qt WebKit transforms this into UnknownContentError, presumably because it tries to interpret the HTML response body (the error message) as an image, despite the 400 status code
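
For larger documents, the grep and wget steps above can be scripted. The following is a minimal sketch, assuming Python 3 and only the standard library; the function names (`image_urls`, `probe`, `suspicious_images`) are made up for illustration and are not part of wkhtmltopdf. It lists every `<img>` source, resolves it against a base URL, and reports those that do not answer with status 200 and an `image/*` content type.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import urllib.error
import urllib.request


class ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(html, base_url):
    """Return absolute URLs for all <img> sources found in html."""
    collector = ImgCollector()
    collector.feed(html)
    return [urljoin(base_url, src) for src in collector.sources]


def probe(url, timeout=10):
    """Fetch url; return (status, content_type), with status None if the fetch failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", "")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry headers worth inspecting.
        return err.code, err.headers.get("Content-Type", "")
    except OSError:
        # Covers URLError (DNS, refused connection) and timeouts.
        return None, ""


def suspicious_images(html, base_url):
    """Yield (url, status, content_type) for images that are not a 200 image/* response."""
    for url in image_urls(html, base_url):
        status, ctype = probe(url)
        if status != 200 or not ctype.startswith("image/"):
            yield url, status, ctype
```

Calling `list(suspicious_images(open("page.html").read(), "https://example.com/"))` (with the page's real base URL in place of the placeholder) would surface a case like the archive.org one: an image URL answering 400 with a `text/html` body.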

After a review I can "ignore" this problem, since the pdf in question renders fine without the img resource. However I do not want to ignore non-zero return status in a batch invocation of wkhtmltopdf (to find and break on more serious errors).

The specific problem described above can be coupled to the load-media-error class.

  • wkhtmltopdf defaults to --load-media-error-handling ignore
  • the 'ignore' handlers do not trigger non-zero exit codes
  • UnknownContentError triggered from loading <img> sources (or, more generally any page requisite with an image/* mime type) should be handled like any other load-media-error

(using statically compiled wkhtmltopdf 0.12.3-dev (with patched qt))

@bronson

bronson commented Mar 14, 2016

Yep, I'm getting this error consistently with 0.12.1 and 0.12.3:

(EDIT: I mean, it consistently happens hundreds of times when generating thousands of PDFs. It doesn't happen on every PDF)

    $ wkhtmltopdf http://prop.gisweb.com/print/geosearch/m/tCentralFlorida out.pdf
    Loading pages (1/6)
    QFont::setPixelSize: Pixel size <= 0 (0)                     ] 48%
    Counting pages (2/6)
    Resolving links (4/6)
    Loading headers and footers (5/6)
    Printing pages (6/6)
    Done
    Exit with code 1 due to network error: UnknownContentError

So I upgraded to 0.13.0-alpha-7b36694 and I'm getting:

    $ wkhtmltopdf http://prop.gisweb.com/print/geosearch/m/tCentralFlorida out.pdf
    Loading page (1/2)
    Printing pages (2/2)
    Done
    Exit with code 1 due to network error: InternalServerError

Sigh. In both cases the PDF appears to be correctly created. I guess I'll hack my app to ignore error 1 for now... Hoping someone finds a better solution.
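
One way to "ignore error 1" without swallowing every failure is to downgrade it only when the network-error message is present and a PDF was actually written. This is a rough sketch under several assumptions: Python 3.7+, `wkhtmltopdf` on the PATH, and the message text exactly as it appears in the logs in this thread. The helper names `classify` and `convert` are hypothetical. Since it is not confirmed here which stream the message goes to, both stdout and stderr are searched.

```python
import os
import subprocess

# Message prefix as seen in the logs in this thread.
NETWORK_ERROR_MARKER = "Exit with code 1 due to network error:"


def classify(returncode, output_text, pdf_path):
    """Return 'ok', 'warning', or 'error' for one wkhtmltopdf run."""
    if returncode == 0:
        return "ok"
    pdf_written = os.path.isfile(pdf_path) and os.path.getsize(pdf_path) > 0
    if returncode == 1 and NETWORK_ERROR_MARKER in output_text and pdf_written:
        # The PDF exists despite the network error: report, but do not abort.
        return "warning"
    return "error"


def convert(source, pdf_path):
    """Run wkhtmltopdf on source and classify the outcome."""
    # Remove any stale output so an old file cannot mask a failed run.
    if os.path.exists(pdf_path):
        os.remove(pdf_path)
    proc = subprocess.run(
        ["wkhtmltopdf", source, pdf_path],
        capture_output=True,
        text=True,
    )
    return classify(proc.returncode, proc.stdout + proc.stderr, pdf_path)
```

A batch job could then stop on `"error"` and merely log `"warning"`. The resulting PDF may still be missing the resource that failed to load, so warnings are worth reviewing.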

@rk

rk commented Mar 30, 2016

I've had this error happen when using the --post key value param on a command that normally works (v0.12.3).

    Loading pages (1/6)
    content-type missing in HTTP POST, defaulting to application/x-www-form-urlencoded. Use QNetworkRequest::setHeader() to fix this problem.
    content-type missing in HTTP POST, defaulting to application/x-www-form-urlencoded. Use QNetworkRequest::setHeader() to fix this problem.
    Counting pages (2/6)
    Resolving links (4/6)
    Loading headers and footers (5/6)
    Printing pages (6/6)
    Done
    Exit with code 1 due to network error: UnknownContentError

I was able to confirm the exact same request worked with jQuery $.post(), for both the cover page and the content. I have 1 external linked resource (a logo on the cover page) referenced by HTTP, so there shouldn't be a problem there.

@hbarrington

We just started receiving this error as well. We're still trying to debug the root cause in our case but passing the --no-images parameter seems to resolve it for us as well. We're not sure why.

@brafdlog

brafdlog commented May 5, 2016

This happened to me when there was a missing image in the page. Once the image was removed from the page, it was fixed.

@omrqs

omrqs commented Mar 2, 2017

This error occurs when URLs are written as //url... instead of https://url....
Keep the protocol.
wkhtmltopdf tries to resolve the double slash as a local path...

Solved for me.
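
When the HTML is generated and the templates cannot easily be changed, the scheme can be added in a preprocessing step before the file is handed to wkhtmltopdf. A minimal sketch, assuming Python 3; `add_scheme` is a hypothetical helper, and the regular expression only covers quoted `src` and `href` attributes, not CSS `url(//...)` references or unquoted attributes.

```python
import re

# Matches src="// or href='// ; the attribute name and opening quote are captured.
_PROTOCOL_RELATIVE = re.compile(r'''(\b(?:src|href)\s*=\s*["'])//''', re.IGNORECASE)


def add_scheme(html, scheme="https"):
    """Rewrite protocol-relative src/href values (//host/path) to scheme://host/path."""
    return _PROTOCOL_RELATIVE.sub(lambda m: f"{m.group(1)}{scheme}://", html)
```

For example, `add_scheme('<script src="//connect.facebook.net/en_US/sdk.js"></script>')` yields the same tag with `https://connect.facebook.net/en_US/sdk.js`, which avoids the `file://connect.facebook.net/...` lookup visible in one of the logs earlier in this thread.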

@seanthebean

Anyone know of any way to diagnose which resource is causing the problem? (Using the process of elimination can be very tedious/difficult if the PDF includes a lot of resources, or they're being programmatically included.)
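
A partial answer, sketched under assumptions: some runs print `Warning: Failed to load <url> (ignore)` lines (one appears in a log earlier in this thread), and those can be pulled out of the captured output automatically. The line format is taken from that single log and may differ between versions; `failed_resources` is a hypothetical helper name. Not every failing run prints such a warning, so an empty result does not prove that all resources loaded.

```python
import re

# Format inferred from one log line in this thread:
#   Warning: Failed to load file://connect.facebook.net/en_US/sdk.js (ignore)
_FAILED = re.compile(
    r"^Warning: Failed to load (?P<url>\S+)(?: \((?P<handling>[^)]*)\))?$"
)


def failed_resources(log_text):
    """Return a list of (url, handling) tuples for every 'Failed to load' warning."""
    hits = []
    # splitlines() also splits on bare carriage returns from the progress bar.
    for line in log_text.splitlines():
        match = _FAILED.match(line.strip())
        if match:
            hits.append((match.group("url"), match.group("handling")))
    return hits
```

Feeding it the combined stdout and stderr of a run gives a short list of URLs to check by hand, which is less tedious than removing resources one at a time.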

@Glepooek

Was this problem solved?

@DionataNunesGarcia

I have the same error. When the document reaches 18 or more pages and the header and footer are HTML loaded from a URL, it returns this error; with fewer pages it works normally. So far I have not been able to solve it: whenever I give a URL for the footer and the header, it returns the error "Exit with code 1 due to network error: UnknownContentError".
Does anyone already know how to fix it? I have tried everything I could find.

@sbont

sbont commented Feb 16, 2018

@DionataNunesGarcia me too, large documents seem to cause this error too.
Have you found a way to solve this issue?

@gokulk16

Same error. There should be an option for handling this ContentNotFoundError.
Also, in most cases the failing resources are images.

@alloylab

What flavor of Linux are y'all running? What version of wkhtmltopdf? Is it a patched Qt version?
Is the certificate chain valid for the https URL?

Do you have the openssl and ca-certificates packages installed?

@alloylab

alloylab commented Jul 5, 2018

Is this still an issue on 0.12.5?

@BenjaminRbt

@alloylab I was going to ask. Glad to see that some people want to see this issue out.

@Tomsgu
Contributor

Tomsgu commented Jul 19, 2018

SSL issues were fixed in 0.12.5. It should also be easier to debug the problem when a ContentNotFoundError appears.

@Tomsgu
Contributor

Tomsgu commented Jul 25, 2018

Closing as there is no response. Please feel free to open a new issue with a complete example if you have some of the problems from this issue.

Tomsgu closed this as completed Jul 25, 2018

@rainabba

I'm seeing this with 12.3 on WSL/Ubuntu 18.04, Windows 1803. When I pull these pages in browser, everything is 200 or 304 as expected.

@singhravi1

I got this error when I tried to create a 70-page PDF. Any possible solution?

@rainabba

rainabba commented Dec 1, 2018

For me, the issue is resolved with 12.4

@singhravi1

singhravi1 commented Dec 1, 2018

I'm using version 0.12.5, but I'm still having this issue. Any idea?

Edit: Ubuntu 16.04

@theredled

Same

@rafaeljusto

This PR will probably solve this issue:
#4461
