Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IMP] web: replace wkhtmltopdf with chromeheadless. #32624

Closed

Conversation

JKE-be
Copy link
Contributor

@JKE-be JKE-be commented Apr 11, 2019

Description of the issue/feature this PR addresses:

Current behavior before PR:

Desired behavior after PR is merged:

--
I confirm I have signed the CLA and read the PR guidelines at www.odoo.com/submit-pr

@robodoo robodoo added the CI 🤖 Robodoo has seen passing statuses label Apr 11, 2019
@C3POdoo C3POdoo added the RD research & development, internal work label Apr 11, 2019
def _get_wkhtmltopdf_bin():
return find_in_path('wkhtmltopdf')
def _get_chrome_bin():
return find_in_path('google-chrome')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse odoo.tests.common.ChromeBrowser?

At least please do not forget to add support for chromium. 🙏

})
chrome.navigate_to("file://" + path, True)
res_id = chrome._websocket_send('Page.printToPDF', params=params)
res = chrome._websocket_wait_id(res_id)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After analyzing the new code, I see that you still don't run the PDF creations asynchronous.
We have done some testing with wkhtml2pdf and lunching the PDF creation asynchronous improves a lot the time to generate a PDF with multiple Odoo records.

If the _websocket_wait_id was done outside the loop, with the list of processes, the PDF generation would be a lot faster when printing multiple records.

@sga-odoo sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from 5b62f91 to 99fb080 Compare April 16, 2019 13:25
@robodoo robodoo added CI 🤖 Robodoo has seen passing statuses and removed CI 🤖 Robodoo has seen passing statuses labels Apr 16, 2019
@sga-odoo sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from 99fb080 to b9988a0 Compare April 18, 2019 05:21
@robodoo robodoo added CI 🤖 Robodoo has seen passing statuses and removed CI 🤖 Robodoo has seen passing statuses labels Apr 18, 2019
@sga-odoo sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from b9988a0 to a07b7bf Compare April 18, 2019 12:56
@robodoo robodoo removed the CI 🤖 Robodoo has seen passing statuses label Apr 18, 2019
@sga-odoo sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from a07b7bf to adffbd1 Compare April 19, 2019 10:40
@robodoo robodoo added the CI 🤖 Robodoo has seen passing statuses label Apr 19, 2019
@legalsylvain
Copy link
Contributor

hi @JKE-be.

Could you explain the reason of this move ? why leaving wkhtmltopdf ? why choosing google-chrome ?

I'm Quite worried to see google technologies landing in all projects.

Ref : https://redalemeden.com/blog/2019/we-need-chrome-no-more

kind regards.

@sga-odoo sga-odoo force-pushed the master-replace-wkhtmltopdf-headless-sga branch from adffbd1 to 8a49abe Compare April 25, 2019 10:10
@robodoo robodoo removed the CI 🤖 Robodoo has seen passing statuses label Apr 25, 2019
@mart-e
Copy link
Contributor

mart-e commented Apr 25, 2019

@legalsylvain this branch is a test, to see the limitations, nothing has really been decided yet.
The main issue is that wkhtmltopdf is based on an outdated webkit engine, we can't use es6 in it for instance.
My personal opinion is to stay away from anything google touched as much as possible so, if you have another alternative, please share. The trick is that very few engines allowed javascript evaluation (in pure python, only wkhtmltopdf actually).

@yajo
Copy link
Contributor

yajo commented Apr 29, 2019

Alternatives:

OTOH:

So I think that this is the less worse solution currently. Possibly using firefox would be cooler, but they don't support it, so... 🤷‍♂️

@liZe
Copy link

liZe commented May 14, 2019

Hello.

(Disclaimer: I'm WeasyPrint's main developer 😉)

There is a somehow related thread (OCA/reporting-engine#254), with comments that can be useful about WeasyPrint's pros and cons. Short summary: there are many really useful features for print in WeasyPrint, it's written in Python, but it's pretty slow and doesn't support JS.

If the goal is to render the pages the same way they are displayed on screen, with no extra work from the web developer if possible, using a browser engine is a much better idea. WeasyPrint is able to generate high quality documents with the possibility to handle pagination, but it will probably need extra CSS.

  • Chrome/Chromium is already used to run tours... It's not like you're adding a dependency here, just using it for more things.
  • For a dev, having the ability to open the HTML version with the same engine that will print it is an important +1 for debugging.

These two points are good, but they also are exactly why Chrome/Chromium is becoming a monopoly: we use it more because we already use it, and devs will it more because we assume that they already use it. Is it a good idea to fight for end users' freedom when they love their almighty master? 🤔

So I think that this is the less worse solution currently. Possibly using firefox would be cooler, but they don't support it, so...

I agree, it's the less worse solution on the technical side. PyQt (based on WebKit) is probably a good solution too, and it would avoid depending more on Google products.

@mart-e
Copy link
Contributor

mart-e commented May 14, 2019

@liZe thank you for your comment. I like your software and would prefer to use it instead of chrome. We could work to remove the JS need (which is probably for wrong reasons) but we do have a need to print really big files sometimes (end of year general ledger can be more than 1000 pages).

To update on this task, this is currently blocked as the no external resources in header/footer in headless chrome is quite blocking.

@liZe
Copy link

liZe commented May 14, 2019

To update on this task, this is currently blocked as the no external resources in header/footer in headless chrome is quite blocking.

Oh, that's an interesting article! Here are some random thoughts about these limitations:

  • About headers and footers, it's a real pain to get complex layouts, and following or not the CSS specs leads to different problems. With Chrome (and other browsers), headers are special cases and have serious limitations that need disturbing workarounds. With WeasyPrint (blindly following to the CSS specs), you can't add HTML in headers and need to rely on backgrounds to have images, or on advanced features such as the string() function to have context-related content. You can find some dirty workarounds in How to repeat on each page of complex headers (eg, tables)? Kozea/WeasyPrint#92 (most of them should work with browsers too).

  • Page breaks are really hard to understand well, use correctly and of course to implement. They used to be terribly handled or even totally ignored by browsers, but things got better when columns breaks were introduced (and IE / Edge was really got at it, believe it or not). In normal flow, it's easy to get something pretty good, but things are bad when multiple flows are parallel (like in tables, flex, floating elements, etc.) I've tried to play with Gecko's and WebKit's code a long time ago, but browsers are really deeply designed to render one surface, not multiple pages, and it would be really difficult to get something interesting. (Here's a special gift 🎁: a 17-year-old bug in Firefox about page-break-* properties.) It's technically easier to get better results in WeasyPrint, but as often it would require a lot of manpower. Prince has probably the best support of these properties by far.

  • Special headers and footers depending on page numbers are now possible, using the nth() page selector. I don't know if it's implemented in browsers, but it is in WeasyPrint.

  • Authentication may be a problem when you want to generate printable documents from websites. Some plugins exist for Python web frameworks (at least Flask and Django) to automatically transform URLs to static routes into filenames when they're rendered with WeasyPrint. It allows the library to avoid authentication (and useless extra requests) to get these resources. The idea can probably be adapted and implemented in JS for headless browsers.

I'd be happy to help, even to use other headless browsers 😉. The best I can do is probably to give dirty/smart CSS tricks if needed.

@pedrobaeza
Copy link
Collaborator

Is this test going to be continued?

@mart-e
Copy link
Contributor

mart-e commented Feb 18, 2020

Not at the moment at least

@mart-e mart-e closed this Feb 18, 2020
@antonylesuisse antonylesuisse deleted the master-replace-wkhtmltopdf-headless-sga branch March 26, 2021 15:41
@Yenthe666
Copy link
Collaborator

@JKE-be @mart-e see wkhtmltopdf/wkhtmltopdf#5160 :-)

@em230418
Copy link
Contributor

em230418 commented Jan 21, 2022

Just noting, that maintainer of wkhtmltopdf suggested to use https://github.com/Kozea/WeasyPrint
Prooflink: https://wkhtmltopdf.org/status.html

@liZe
Copy link

liZe commented Jan 21, 2022

If you’re interested in using WeasyPrint, you can ping us, we’ll be happy to help.

Since my previous comment, we even have good news. WeasyPrint is now developped by CourtBouillon with official professional support, and backers who financially help us to fix bugs and develop features faster than ever.

Headers and footers gained the possibility to include complex layouts thanks to running elements, it should thus be possible to have the features needed for Odoo regarding this topic. We now also support parallel flows in the latest version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RD research & development, internal work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet