[Html2PdfConverter] Html2PdfConverter FAQ

Published on 20 Mar by Guilherme Pereira

Hi Guys,


While helping to maintain this component I’ve received several questions regarding its configuration and usage.


The Html2PdfConverter component is a wrapper around the wkhtmltopdf command line tool that generates pdf documents or images based on web pages.

The tool itself comes in two versions: one that is MinGW based (bundles all DLLs) and one that is MSVC based (and requires installation of MSVC 2013 on the server).

This post is written as a FAQ with the most common questions collected over time, separated by category.


Installation/Configuration/Troubleshooting


Q) Does this component work on the free Personal Environment?

A) Yes. It works on both Personal and Enterprise cloud environments, and also on on-premises installations.


Q) What version of wkhtmltopdf should I install?

A) I recommend using the MinGW one. In fact, if you’re using a cloud environment (Personal or Enterprise) you have no other choice, because you cannot install the dependencies required by the other version on the machine. The MSVC based one is supposed to be faster, but in practice I never saw the difference.

Note: The latest stable version of wkhtmltopdf does not include the MinGW build, so if you are trying to configure it you can use the latest "Bleeding Edge" version.


Q) What binaries should I upload?

A) The binaries for Windows include two executables and a few DLLs. You should upload each exe (wkhtmltopdf.exe and wkhtmltoimage.exe) to the correct place, as well as the accompanying DLLs. It’s not described anywhere, but for the Windows version the wkhtmltox.dll is not necessary. I found this by trial and error, and it’s a good way to save 50MB on your disk and database.

The Linux versions do not have DLLs, only the binaries (wkhtmltopdf and wkhtmltoimage).


Q) I’m having trouble uploading the binaries. What can I do?

A) This issue is usually related to network configuration. I suggest using Google Chrome, as it has a nice progress indicator and sometimes you can see the upload reach 100% and drop back to 0%.

In those cases the best recommendation I can give is to use a different network or a wired connection (it has worked in most cases).

Some community members have worked around this by extending the max request length and timeout, using Factory Configuration to change the web.config. I never needed to try this approach, but you can find more details here


Q) I cannot export. I’m getting a blank page. What can I do?

A) There can be several reasons for this:

  • First and foremost, if you’re using an on-premises installation, log onto the machine, navigate to the HtmltoPdfConverter folder and execute the component from the command line. The output is usually a good indicator of potential problems.

  • If you cannot log into the machine, activate the LogDebugMessages option on the Administration page of the component and check the Service Center logs.

  • Make sure the page you’re trying to print is accessible from the server.

  • On some occasions the machine cannot reach the page when using the machine domain or localhost, because of bad DNS resolution or configuration. Try using the IP or 127.0.0.1 instead, or log into the machine and try to access the URL from the browser.

  • There’s a known incompatibility between the component and a Silk UI based widget that detects mobile devices, because it blocks the execution of JavaScript on the component. In such scenarios, create an empty theme that is not Silk UI based and use it as the theme for the web flow of the page you’re trying to export.
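The first check above (running the tool by hand and reading its output) can also be scripted. The following is only a minimal sketch in Python, not part of the component: it assumes the executable is called wkhtmltopdf and is on the PATH, and the helper names build_command and run_diagnostic are made up for illustration.

```python
import shutil
import subprocess


def build_command(url, output_path, executable="wkhtmltopdf", extra_args=()):
    """Build the argument list: executable, options, input URL, output file."""
    return [executable, *extra_args, url, output_path]


def run_diagnostic(url, output_path, executable="wkhtmltopdf"):
    """Run the tool once and return (exit_code, stderr_text).

    Returns (None, message) when the executable cannot be found, which is
    itself a useful hint about the installation.
    """
    if shutil.which(executable) is None:
        return None, f"{executable} not found on PATH"
    result = subprocess.run(
        build_command(url, output_path, executable),
        capture_output=True,  # keep the tool's messages for inspection
        text=True,
        timeout=120,
    )
    return result.returncode, result.stderr
```

Calling run_diagnostic with the URL written once with the host name and once with 127.0.0.1 is a quick way to tell a DNS problem apart from a rendering problem.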


Q) My content is broken when exporting to pdf. Why?

A) If you’re using a responsive theme (London, SILK or other), keep in mind that the default page size is much closer to a tablet resolution than to a desktop one, so the elements will break accordingly. Make your page look good at a tablet resolution and it should work when exporting to A4 out of the box.

If you're using a Table Records widget, you can add the NoResponsive class to it so it won't break at that resolution.


Q) I cannot export. I’m getting the invalid permissions screen. What can I do?

A) The screen to export needs to be accessible anonymously. This is because the component runs in a different session, with no possibility of triggering the login.


Q) Sometimes the export works, sometimes it doesn’t. What’s wrong?

A) Check whether you have more than one front end in your Service Center and, if so, activate the farm mode on the administration page. Also check for potential network issues on each of the front ends.


Q) I'm getting some performance degradation because of the component.

A) wkhtmltopdf is known to be memory and CPU intensive when the pages are large. The situation gets worse if you try to export several PDFs at the same time.

To minimize the impact, try to reduce your PDF size so the generation is as light as possible.

Starting with version 1.1.13, the processes can run with low priority. This is activated by default and configurable in the administration page of the component. Keeping it enabled is recommended.

If you cannot control whether more than one request is made at the same time (this is hard to control when the export is on a screen of an application with several users), it is recommended to implement a queuing system: store the PDF generation requests and process them one by one, so that no more than one instance of the executable runs at a time.
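As a rough illustration of the queuing idea, here is a minimal Python sketch with a single worker that handles requests strictly one at a time. It is not how the component works internally; the class name PdfJobQueue and the generate callback are hypothetical, and in an OutSystems application the same idea would more likely be built with an entity holding the requests and a timer processing them.

```python
import queue
import threading


class PdfJobQueue:
    """Serialize PDF generation: one background worker, one job at a time."""

    def __init__(self, generate):
        self._generate = generate      # callable that produces one PDF
        self._jobs = queue.Queue()     # FIFO list of pending requests
        self.failures = []             # (request, exception) pairs
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, request):
        """Store a request; it is processed when its turn comes."""
        self._jobs.put(request)

    def wait_until_idle(self):
        """Block until every submitted request has been processed."""
        self._jobs.join()

    def _run(self):
        while True:
            request = self._jobs.get()
            try:
                self._generate(request)
            except Exception as exc:
                # Record the failure and keep the worker alive for later jobs.
                self.failures.append((request, exc))
            finally:
                self._jobs.task_done()
```

Because there is only one worker thread, a second export requested while the first is still running simply waits in the queue instead of starting another instance of the executable.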


Usage


Q) If the screens to export have to be anonymous how can I secure them?

A) The best way is by generating a token and passing it to the page to generate and on the preparation of that page validate the token. See this post for guidelines for a possible implementation.


Q) How can I keep my content from splitting from one page to the other?

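A) Use the CSS page-break properties to control the behavior, for example page-break-inside: avoid on an element that must not be split across pages.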


Q) How can I control the header on my pages?

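A) By using CSS.

  • If you want to prevent your table header from being repeated on other pages, apply CSS on the page you are printing that changes how the header is displayed, for example thead { display: table-row-group; }.

  • If you want to keep your table header repeated on all pages but still ensure that the content doesn't overlap, you can try keeping rows from splitting across pages, for example thead { display: table-header-group; } together with tr { page-break-inside: avoid; }.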




Q) Does the component allow me to do X?

A) Check the component command line parameters and try to use them accordingly.


Support


Q) Is this component the official way of exporting PDF with the OutSystems Platform? Can I email OutSystems support with questions about it?

A) Html2Pdf is a community driven initiative and is not supported by OutSystems. The team is composed of OutSystems and non-OutSystems personnel who work on a best-effort basis to keep it running and assist the community with questions.


Q) What alternatives do I have to this component?

A) There are several alternatives on the Forge. Just search for "pdf".


Hi

Regarding:


Q) If the screens to export have to be anonymous how can I secure them?

A) The best way is by generating a token and passing it to the page to generate and on the preparation of that page validate the token.

Can you point me to any instructions on how to do that?

Thanks

Trish

Hi Trish,

This should be quite easy to accomplish by:

1) Create a table to store the authentication tokens and an expiration date

2) Before generating the PDF, generate the token and store it in the table with an expiration date and time

3) Pass the token to the anonymous page via a URL parameter

4) On the preparation of the page, validate the token against the DB. If it is valid, first invalidate the token and then proceed. If the validation fails you can throw an exception

With this approach you can keep your page anonymous but secure and use it to generate the PDF.
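The four steps can be sketched like this. It is only an illustration in Python under simplifying assumptions: an in-memory dictionary stands in for the database table, and the names TokenStore, issue and consume are made up for the example. In OutSystems the same logic would live in an entity plus server actions.

```python
import secrets
from datetime import datetime, timedelta, timezone


class TokenStore:
    """One-time tokens with an expiration time (stand-in for the DB table)."""

    def __init__(self):
        self._tokens = {}  # token -> expiration datetime (step 1)

    def issue(self, ttl_seconds=60, now=None):
        """Step 2: create a random token and store it with its expiration."""
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)
        self._tokens[token] = now + timedelta(seconds=ttl_seconds)
        return token  # step 3: pass this value in the page URL

    def consume(self, token, now=None):
        """Step 4: invalidate the token first, then check that it was valid."""
        now = now or datetime.now(timezone.utc)
        expires_at = self._tokens.pop(token, None)  # removal = invalidation
        if expires_at is None or now > expires_at:
            raise PermissionError("invalid or expired token")
```

Removing the token before checking it means that a second request with the same URL always fails, even if it arrives before the expiration time.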

Hope this helps,

Guilherme

Hi,

We have used HTMLTOPDFConverter, which internally calls the wkhtmltopdf exe, to convert OutSystems content to PDF. It works fine with English characters, but when the page contains Chinese, the Chinese characters are not rendered correctly in the PDF. The page is displayed correctly in the browser, but in the PDF file that wkhtmltopdf generates from the page the Chinese characters are rendered as squares.

If we build a simple HTML page in Notepad (clean HTML) with Chinese characters, the PDF file prints perfectly.

When a simple page is developed with Service Studio, it works properly when accessed in the browser, but the HtmlToPdf converter component does not render the Chinese characters correctly. We have noticed that any page developed in Service Studio gets extra HTML added (CSS and JS files), which causes the problem. We would appreciate any suggestion for resolving the problem.


Hi Jyotiranjan,


Have you tried to install wkhtmltopdf locally and run it from the cmdline? Does it work?

Have you done it from the server? What were the results?


In any case, after googling a bit for Chinese character support in wkhtmltopdf, it seems that for it to work the necessary fonts need to be installed on the server (where the component actually runs).

I'd recommend taking a look at this post, trying to set up the font and, if necessary, installing it on the server.


Let us know if you are able to solve your issue.

Cheers,

Guilherme

Hi Guilherme,

Thanks for the reply. We have downloaded the latest version of wkhtmltopdf from "https://wkhtmltopdf.org/downloads.html" (Bleeding Edge). Now the Chinese characters are printed, but we have another problem: we are not able to pass a footer. In the previous version the footer could be passed with --footer-html. Please let us know the correct way to pass the footer.

These are the arguments we expect to pass:


--encoding ISO10646 -O Landscape -B 60 --header-font-size 10 --header-right Page[page]/[toPage] --footer-html http://<server>/FooterPage.htm


Regards,

Dash 


Hi Dash,

According to the wkhtmltopdf documentation, --footer-html should be the parameter.

Have you tried locally from the command line? Does it work?

If not, I recommend reaching out directly on their forums to see if they can help, because it is not something we can control in the wrapper.

Let us know if you were able to solve the issue.

Cheers,

Guilherme

Hi Guilherme,

The wkhtmltopdf documentation link you have shared is for version 0.12.4 (with patched qt). We cannot use that version since it does not support Chinese character encoding. To solve that we installed the alpha version of wkhtmltopdf. The Chinese character encoding issue is resolved with the alpha version, but now we are having a problem setting the page footer. The --footer-html parameter is not available in the alpha version.

We are looking for a solution where we can convert an HTML page having Chinese chars and a footer to PDF. Any help on this topic will be appreciated.


Regards,

Ravi


Hi Ravi,


Unfortunately I do not have a solution or workaround for the library behavior. You may check with the wkhtmltopdf team to see if they can produce a version that works well with both, or take their source, make the changes and compile it yourself.


But I’ve never done it before and I think it may not be that easy.


You could try to add your footer via an iframe and embed it on the page with an expression.

Can you try that?


Cheers,

Guilherme

Brilliant component, however it is degrading our environment performance severely.  Has anybody else experienced this and is there a workaround?

Hi Esther,


Without more details I'm unsure if this is your case, but wkhtmltopdf tends to use a lot of memory if the PDF to be generated is very large.

What's the size of the final PDFs? And how frequently are they generated?

Neither the component nor the underlying library (wkhtmltopdf) has any feature to limit memory consumption, so in terms of workarounds I can only recommend that you try to rearchitect your application to generate the PDF asynchronously, or perhaps take advantage of zones (isolating the component on a separate server) if supported in your configuration.

Let us know if you are able to find a solution.

Cheers,

Guilherme

Thank you for your reply.  The documents are not very large, 5 - 6 pages, average size 200kb - 1mb.  These are generated approximately 10 times a day, so not heavy usage.  

And when you say the component is degrading the environment, can you be more specific about what is causing the degradation (memory, CPU usage) and how you related it to the component?



Good day Guilherme

We logged a support ticket with OutSystems last week and this occurred again yesterday. This was their reply:

I noticed that the FE server of your Production environment was having a high CPU usage since last Friday around 10:30 AM UTC. Accessing the FE server, I could see that the process responsible for the high CPU usage was related to the HTML2PDFConverter component [1].

The HTML2PDFConverter component has known issues with high CPU usage; we have already had some support cases reporting slowness that ended up being this component excessively consuming CPU. Considering that the component is not supported by OutSystems, we recommend opening a thread in the component discussion page regarding this issue.

We'll be waiting for your feedback regarding the current status of the environment.

Here's the link for the reference in this communication:
[1] - https://www.outsystems.com/forge/209/ 

Hi Esther,

Thank you for the clarification.

Unfortunately, with the information provided by our technical support, I cannot give you better advice than to look for another component that does the same but doesn't have to run on the machine.

What I can tell you is that this component is used extensively in several customer projects, both in the cloud and on premises, without issues that I'm aware of.

The component does something as simple as running an executable, generating a temporary PDF and triggering its download. Without more detail about what could be causing the load (e.g. antivirus running, server errors in the event viewer, etc.) it is impossible for me to understand what is behind it.

So for now the only recommendation I can give you is to look for an alternative that doesn't impact your infrastructure.

Cheers

Guilherme


Hi

Thank you for trying to assist.  We will keep you posted if we get any additional information.



Hi All,


After talking with some people who had similar performance issues, and after an indication that on some occasions an application pool recycle while the component was executing would leave zombie processes behind, I've released a new version, 1.1.13, with a couple of experimental features, but only for the .NET version.


The first is the ability to execute the processes with less priority (default is true).

The second is activating the EnableRaisingEvents property in order to force the connection between processes, so that if the pool is recycled the child processes are killed (details here)
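For readers who want to see what running a child process with less priority looks like outside the component, here is a minimal Python sketch of the general technique. It is not the component's .NET code, and the helper names low_priority_kwargs and run_low_priority are made up for the example.

```python
import os
import subprocess
import sys


def low_priority_kwargs():
    """Extra subprocess arguments that start the child with reduced priority."""
    if sys.platform == "win32":
        # Windows: request a below-normal priority class at process creation.
        return {"creationflags": subprocess.BELOW_NORMAL_PRIORITY_CLASS}
    # POSIX: raise the niceness in the child just before the program starts.
    return {"preexec_fn": lambda: os.nice(10)}


def run_low_priority(command, timeout=120):
    """Run a command to completion with reduced priority and capture its output."""
    return subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout,
        **low_priority_kwargs(),
    )
```

A lower priority does not reduce the total CPU work needed to render the PDF; it only lets other processes on the server go first when they compete for the CPU.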


This version is still experimental, so let me know if you find any issues.


Thanks,

Guilherme

Guilherme -


Suggest putting the "NoResponsive" trick in this FAQ. :)


J.Ja

Thank you very much.  We will look at your suggestion and provide feedback.

Thanks Justin. Added.