[IMP] web: replace wkhtmltopdf with chromeheadless. #32624

JKE-be · 2019-04-11T10:56:35Z

Description of the issue/feature this PR addresses:

Current behavior before PR:

Desired behavior after PR is merged:

--
I confirm I have signed the CLA and read the PR guidelines at www.odoo.com/submit-pr

yajo · 2019-04-12T07:04:10Z

odoo/addons/base/models/ir_actions_report.py

-def _get_wkhtmltopdf_bin():
-    return find_in_path('wkhtmltopdf')
+def _get_chrome_bin():
+    return find_in_path('google-chrome')


Why not reuse odoo.tests.common.ChromeBrowser?

At least please do not forget to add support for chromium. 🙏
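A minimal sketch of what Chromium support could look like, assuming the lookup simply tries several executable names in order. The candidate names and the `find_browser_bin` helper are hypothetical and are not part of this PR; `shutil.which` stands in for Odoo's `find_in_path`.

```python
import shutil
from typing import Callable, Optional, Sequence

# Assumption: these executable names vary by distribution and packaging.
CANDIDATE_BINARIES = ("google-chrome", "chromium", "chromium-browser")


def find_browser_bin(
    candidates: Sequence[str] = CANDIDATE_BINARIES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None
```

Passing `which` as a parameter keeps the lookup easy to exercise without a browser installed.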

open-net-sarl · 2019-04-16T06:50:41Z

odoo/addons/base/models/ir_actions_report.py

+                    })
+                    chrome.navigate_to("file://" + path, True)
+                    res_id = chrome._websocket_send('Page.printToPDF', params=params)
+                    res = chrome._websocket_wait_id(res_id)


After analyzing the new code, I see that you still don't run the PDF creation asynchronously.
We have done some testing with wkhtmltopdf, and launching the PDF creation asynchronously greatly reduces the time needed to generate a PDF covering multiple Odoo records.

If the _websocket_wait_id call were made outside the loop, over the list of pending requests, PDF generation would be a lot faster when printing multiple records.
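The suggestion above can be sketched as follows. `FakeBrowser` is a hypothetical stand-in that only records call order; its `send` and `wait` methods play the role of `_websocket_send` and `_websocket_wait_id` from the diff, whose real behavior is not shown here. A real implementation would presumably also need one tab per document, since `Page.printToPDF` prints whatever page is currently loaded.

```python
class FakeBrowser:
    """Hypothetical stand-in for the browser wrapper; it only logs calls."""

    def __init__(self):
        self.log = []
        self._next_id = 0

    def send(self, method, params):
        # Queue a request and hand back its id without waiting for a reply.
        self._next_id += 1
        self.log.append(("send", self._next_id))
        return self._next_id

    def wait(self, request_id):
        # Block until the reply for request_id arrives (simulated here).
        self.log.append(("wait", request_id))
        return "pdf-for-request-%d" % request_id


def print_sequential(browser, all_params):
    """Current shape: each request is awaited before the next is sent."""
    results = []
    for params in all_params:
        request_id = browser.send("Page.printToPDF", params)
        results.append(browser.wait(request_id))
    return results


def print_pipelined(browser, all_params):
    """Suggested shape: send everything first, then collect the replies."""
    request_ids = [browser.send("Page.printToPDF", p) for p in all_params]
    return [browser.wait(request_id) for request_id in request_ids]
```

Both functions return the results in the same order; only the interleaving of sends and waits differs.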

legalsylvain · 2019-04-23T10:26:18Z

Hi @JKE-be.

Could you explain the reason for this move? Why leave wkhtmltopdf? Why choose google-chrome?

I'm quite worried to see Google technologies landing in all projects.

Ref: https://redalemeden.com/blog/2019/we-need-chrome-no-more

Kind regards.

mart-e · 2019-04-25T13:51:22Z

@legalsylvain this branch is a test to see the limitations; nothing has really been decided yet.
The main issue is that wkhtmltopdf is based on an outdated WebKit engine; we can't use ES6 in it, for instance.
My personal opinion is to stay away from anything Google touched as much as possible, so if you have another alternative, please share. The trick is that very few engines allow JavaScript evaluation (in pure Python, only wkhtmltopdf actually).

yajo · 2019-04-29T09:46:05Z

Alternatives:

Firefox headless is missing this option; you can add your votes here: https://bugzilla.mozilla.org/show_bug.cgi?id=1407238
Also, it seems PyQt would work fine: https://stackoverflow.com/a/28054707/1468388
WeasyPrint looks good but indeed has no JS support: see "Javascript generated content" (Kozea/WeasyPrint#454)

OTOH:

Chrome/Chromium is already used to run tours... It's not like you're adding a dependency here, just using it for more things.
wkhtmltopdf has been a PITA for years (although less so since wkhtmltopdf/wkhtmltopdf#2711, "Wkhtmltopdf failed (error code: -11). Message: - when only empty content is to be drawn on the last page", was fixed in 0.12.5) and is very outdated.
For a dev, having the ability to open the HTML version with the same engine that will print it is an important +1 for debugging.

So I think that this is the least bad solution currently. Possibly using Firefox would be cooler, but they don't support it, so... 🤷‍♂️

liZe · 2019-05-14T15:46:37Z

Hello.

(Disclaimer: I'm WeasyPrint's main developer 😉)

There is a somewhat related thread (OCA/reporting-engine#254), with comments that can be useful about WeasyPrint's pros and cons. Short summary: there are many really useful features for print in WeasyPrint, it's written in Python, but it's pretty slow and doesn't support JS.

If the goal is to render the pages the same way they are displayed on screen, with no extra work from the web developer if possible, using a browser engine is a much better idea. WeasyPrint is able to generate high quality documents with the possibility to handle pagination, but it will probably need extra CSS.

> Chrome/Chromium is already used to run tours... It's not like you're adding a dependency here, just using it for more things.

> For a dev, having the ability to open the HTML version with the same engine that will print it is an important +1 for debugging.

These two points are good, but they are also exactly why Chrome/Chromium is becoming a monopoly: we use it more because we already use it, and devs will use it more because we assume that they already use it. Is it a good idea to fight for end users' freedom when they love their almighty master? 🤔

> So I think that this is the least bad solution currently. Possibly using Firefox would be cooler, but they don't support it, so...

I agree, it's the least bad solution on the technical side. PyQt (based on WebKit) is probably a good solution too, and it would avoid depending more on Google products.

mart-e · 2019-05-14T16:21:53Z

@liZe thank you for your comment. I like your software and would prefer to use it instead of chrome. We could work to remove the JS need (which is probably for wrong reasons) but we do have a need to print really big files sometimes (end of year general ledger can be more than 1000 pages).

To update on this task: it is currently blocked, because headless Chrome not allowing external resources in the header/footer is a serious limitation.

liZe · 2019-05-14T22:34:16Z

> To update on this task: it is currently blocked, because headless Chrome not allowing external resources in the header/footer is a serious limitation.

Oh, that's an interesting article! Here are some random thoughts about these limitations:

About headers and footers, it's a real pain to get complex layouts, and following or not following the CSS specs leads to different problems. With Chrome (and other browsers), headers are special cases and have serious limitations that need disturbing workarounds. With WeasyPrint (which blindly follows the CSS specs), you can't add HTML in headers and need to rely on backgrounds to have images, or on advanced features such as the string() function to have context-related content. You can find some dirty workarounds in "How to repeat on each page of complex headers (eg, tables)?" (Kozea/WeasyPrint#92); most of them should work with browsers too.
Page breaks are really hard to understand well, to use correctly and, of course, to implement. They used to be terribly handled or even totally ignored by browsers, but things got better when column breaks were introduced (and IE / Edge was really good at it, believe it or not). In normal flow, it's easy to get something pretty good, but things are bad when multiple flows run in parallel (as in tables, flex, floating elements, etc.). I tried to play with Gecko's and WebKit's code a long time ago, but browsers are deeply designed to render one surface, not multiple pages, and it would be really difficult to get something interesting. (Here's a special gift 🎁: a 17-year-old bug in Firefox about page-break-* properties.) It's technically easier to get better results in WeasyPrint but, as often, it would require a lot of manpower. Prince probably has the best support of these properties by far.
Special headers and footers depending on page numbers are now possible, using the nth() page selector. I don't know if it's implemented in browsers, but it is in WeasyPrint.
Authentication may be a problem when you want to generate printable documents from websites. Some plugins exist for Python web frameworks (at least Flask and Django) that automatically transform URLs pointing to static routes into filenames when they're rendered with WeasyPrint. This lets the library avoid authentication (and useless extra requests) to get these resources. The idea can probably be adapted and implemented in JS for headless browsers.

I'd be happy to help, even to use other headless browsers 😉. The best I can do is probably to give dirty/smart CSS tricks if needed.
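The static-route idea in the authentication point above can be sketched roughly like this. The `static_url_to_path` helper, the `/static/` prefix and the paths are hypothetical illustrations, not code from the Flask or Django plugins mentioned; in WeasyPrint, a function like this would presumably be called from a custom `url_fetcher`.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote, urlsplit


def static_url_to_path(url: str, static_prefix: str, static_root: str) -> Optional[Path]:
    """Map a URL under static_prefix to a local file path below static_root.

    Returns None when the URL is not a static route or when it would
    escape static_root. static_prefix is assumed to end with a slash.
    """
    url_path = unquote(urlsplit(url).path)
    if not url_path.startswith(static_prefix):
        return None
    root = Path(static_root).resolve()
    candidate = (root / url_path[len(static_prefix):]).resolve()
    # Refuse anything that resolves outside the static directory.
    if candidate != root and root not in candidate.parents:
        return None
    return candidate
```

Reading the file straight from disk skips both the HTTP round trip and the session handling that an authenticated request would need.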

pedrobaeza · 2020-02-17T18:15:06Z

Is this test going to be continued?

mart-e · 2020-02-18T10:55:12Z

Not at the moment at least

Yenthe666 · 2022-01-20T16:47:26Z

@JKE-be @mart-e see wkhtmltopdf/wkhtmltopdf#5160 :-)

em230418 · 2022-01-21T05:15:48Z

Just noting that the maintainer of wkhtmltopdf suggested using https://github.com/Kozea/WeasyPrint
Proof link: https://wkhtmltopdf.org/status.html

liZe · 2022-01-21T12:48:20Z

If you’re interested in using WeasyPrint, you can ping us, we’ll be happy to help.

Since my previous comment, we even have good news. WeasyPrint is now developed by CourtBouillon with official professional support, and backers who financially help us to fix bugs and develop features faster than ever.

Headers and footers can now include complex layouts thanks to running elements, so it should be possible to get the features Odoo needs in this area. The latest version also supports parallel flows.
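As a rough illustration of running elements, the sketch below builds an HTML document whose header block is pulled into the top margin of every page. The element id, the `pageHeader` name and both helper functions are hypothetical; the CSS uses the `running()` and `element()` constructs from CSS Generated Content for Paged Media, and the rendering step assumes WeasyPrint is installed.

```python
# Assumption: ids, names and margins are illustrative, not taken from Odoo.
REPORT_CSS = """
@page {
    margin-top: 3cm;
    @top-center { content: element(pageHeader); }
}
#page-header { position: running(pageHeader); }
"""


def build_report_html(header_html: str, body_html: str) -> str:
    """Wrap header and body markup in a document using a running header."""
    return (
        "<html><head><style>%s</style></head><body>"
        '<div id="page-header">%s</div>%s'
        "</body></html>" % (REPORT_CSS, header_html, body_html)
    )


def render_pdf(html: str, target: str) -> None:
    """Write the PDF to target; imported lazily so the rest runs without it."""
    from weasyprint import HTML

    HTML(string=html).write_pdf(target)
```

Because the header is ordinary HTML taken out of the flow, it can contain tables or images rather than only text strings.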

robodoo added the CI 🤖 Robodoo has seen passing statuses label Apr 11, 2019

C3POdoo added the RD research & development, internal work label Apr 11, 2019

yajo reviewed Apr 12, 2019


open-net-sarl reviewed Apr 16, 2019


sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from 5b62f91 to 99fb080 Compare April 16, 2019 13:25

robodoo added CI 🤖 Robodoo has seen passing statuses and removed CI 🤖 Robodoo has seen passing statuses labels Apr 16, 2019

sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from 99fb080 to b9988a0 Compare April 18, 2019 05:21

robodoo added CI 🤖 Robodoo has seen passing statuses and removed CI 🤖 Robodoo has seen passing statuses labels Apr 18, 2019

sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from b9988a0 to a07b7bf Compare April 18, 2019 12:56

robodoo removed the CI 🤖 Robodoo has seen passing statuses label Apr 18, 2019

sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from a07b7bf to adffbd1 Compare April 19, 2019 10:40

robodoo added the CI 🤖 Robodoo has seen passing statuses label Apr 19, 2019

[IMP] web: replace wkhtmltopdf with chromeheadless.

8a49abe

sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from adffbd1 to 8a49abe Compare April 25, 2019 10:10

robodoo removed the CI 🤖 Robodoo has seen passing statuses label Apr 25, 2019

KangOl force-pushed the master branch from 86c80d3 to ab6d0c3 Compare November 6, 2019 12:27

mart-e closed this Feb 18, 2020

robodoo added closed 💔 labels Feb 18, 2020

antonylesuisse deleted the master-replace-wkhtmltopdf-headless-sga branch March 26, 2021 15:41
