wkhtmltopdf and UTF-8

Posted Sat Oct 20 @ 04:57:26 PM PDT 2012

wkhtmltopdf is a great (maybe the best?) way to convert HTML documents to PDF. It can easily handle most CSS, and it will even execute JavaScript to produce the final HTML output. I use it on two of my websites, and the only downside I have experienced is that it is kind of slow (it takes a few seconds to make a PDF). Not a deal breaker though.

Recently, I had a user complain that my PDFs were not displaying Chinese characters. The HTML output would display them, but when the PDF was generated, little boxes, question marks, or slashes would appear in their place.

The interesting thing was that my beta server would display the characters in the PDF, but my production server wouldn't. My beta server runs the desktop version of Ubuntu, but the production server runs the server edition. I figured the server edition was missing some package that would display the characters. The hard part was figuring out which package it was!

After Googling around for an hour, I stumbled upon an Ask Ubuntu question that was only loosely related to my problem. Fortunately, one of the packages mentioned in an answer solved it for me. The magical package is ttf-wqy-microhei, which installs the WenQuanYi Micro Hei font and gives the server glyphs for Chinese characters.
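If you want to check a server before (or after) installing the font, here is a rough sketch using fc-list from the fontconfig package. The function name has_font_for_lang is just something I made up for illustration, and I am assuming wkhtmltopdf finds its fonts through fontconfig. As far as I know, newer Ubuntu releases call the package fonts-wqy-microhei, so adjust the name if apt cannot find it.

```shell
#!/bin/sh
# Report whether fontconfig knows of any font that covers a given language.
# has_font_for_lang is a hypothetical helper name, not part of any tool.
has_font_for_lang() {
    lang="$1"
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not found (install the fontconfig package)" >&2
        return 2
    fi
    # fc-list prints one line per matching font; no output means no coverage.
    [ -n "$(fc-list ":lang=$lang" family)" ]
}

if has_font_for_lang zh; then
    echo "Chinese-capable font found"
else
    echo "no Chinese-capable font found; try: apt-get install ttf-wqy-microhei"
fi
```

Running this on both the beta and production machines should make the difference between them obvious.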

On a related note, if you want to use wkhtmltopdf on your Ubuntu server, and you get error messages like:

./wkhtmltopdf: error while loading shared libraries: libfontconfig.so.1: cannot open shared object file: No such file or directory


./wkhtmltopdf: error while loading shared libraries: libXrender.so.1: cannot open shared object file: No such file or directory

then this should solve the problem:

apt-get install libxrender-dev ttf-wqy-microhei
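If you still get a "cannot open shared object file" error, ldd can list every library the binary is missing in one go, instead of discovering them one error at a time. This is only a sketch: missing_libs is a name I made up, and ./wkhtmltopdf is an assumed location for the binary. On Ubuntu, the runtime packages libxrender1 and libfontconfig1 are the ones that provide libXrender.so.1 and libfontconfig.so.1.

```shell
#!/bin/sh
# List the shared libraries a binary needs that the dynamic loader cannot find.
# ldd prints lines such as "libXrender.so.1 => not found" for each missing one.
# missing_libs is a hypothetical helper name.
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

BIN="${1:-./wkhtmltopdf}"   # assumed location of the binary
if [ -x "$BIN" ]; then
    missing="$(missing_libs "$BIN")"
    if [ -n "$missing" ]; then
        echo "missing libraries:"
        echo "$missing"
        # Runtime packages that provide the two libraries from the errors above.
        echo "try: apt-get install libxrender1 libfontconfig1"
    else
        echo "all shared libraries resolved"
    fi
fi
```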

