HTMLDoc: PDF From HTML Markup (UNIX FreeBSD)

Updated on January 4, 2019

Have you ever wanted to generate PDF files on the fly without spending hours setting up your server environment? HTMLDoc converts properly formed HTML 3.2 markup to PostScript or PDF (PDF 1.6), dynamically.

For this example, we will be using Vultr’s FreeBSD 11.2 (x64) with IPv4, although everything works the same on IPv6-only servers. Keep in mind that we're working with a brand-new FreeBSD install, so we will first go through the steps of preparing the machine to correctly and safely take on new applications such as HTMLDoc.

Update FreeBSD 11.2 (x64)

First things first: on FreeBSD, we need to update the system if you haven’t done so already. Log in as root and run the following two commands. The first command looks for and retrieves any available updates. The second installs them, and only has something to do if the first command actually fetched an update.

freebsd-update fetch
freebsd-update install

Note: When presented with installation or configuration options, simply use the default choices. When asked Y/N questions, answer Y on all prompts.

Install and initialize the Ports Collection

First, fetch a snapshot of the Ports Collection and extract it into /usr/ports. This step will take several minutes.

portsnap fetch extract

Once this process is done, we will see the following output.

Building new INDEX files... done.

Now, we fetch any newer updates and apply them to the tree we just extracted.

portsnap fetch update
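The two portsnap invocations above differ only in their final subcommand: extract populates a fresh tree, while update refreshes a tree that portsnap has already extracted. As a minimal sketch, a script could pick the right one automatically. The helper name portsnap_action is hypothetical, and the check assumes that portsnap marks an extracted tree with a .portsnap.INDEX file in the ports directory.

```shell
#!/bin/sh
# Hypothetical helper: print "update" when the tree was already extracted
# by portsnap (assumption: it then contains .portsnap.INDEX), and
# "extract" otherwise.
portsnap_action() {
    tree="${1:-/usr/ports}"
    if [ -r "$tree/.portsnap.INDEX" ]; then
        echo update
    else
        echo extract
    fi
}

# Only call portsnap where it is actually installed (i.e. on FreeBSD).
if command -v portsnap >/dev/null 2>&1; then
    portsnap fetch "$(portsnap_action /usr/ports)"
fi
```

On a brand-new install this should behave like the first command above, and on later runs like the second.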

Next, we install portmaster.

cd /usr/ports/ports-mgmt/portmaster
make install clean

Now that we’ve installed portmaster, an application that helps us install applications from the Ports Collection, we can update any outdated ports in our system.

portmaster -a

This is a long process, but it is the best way to get your machine up to date, secured and ready to install HTMLDoc and, in turn, churn out PDFs on the fly. Expect it to take anywhere from several minutes up to 30 minutes.

If any errors are encountered during this process, just add the -f switch to the portmaster command, which will upgrade and rebuild all the ports:

portmaster -af

The update is done when you see the following output.

===>>> Done displaying pkg-message files
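The fallback described above (try a normal upgrade first, and force a full rebuild only if that fails) can be sketched as a small wrapper. The function name upgrade_ports and the PORTMASTER variable are illustrative and not part of portmaster itself; the variable is there only so the command can be swapped out when testing.

```shell
#!/bin/sh
# Illustrative wrapper; PORTMASTER defaults to the real tool.
PORTMASTER="${PORTMASTER:-portmaster}"

upgrade_ports() {
    # First attempt: upgrade only the ports that are out of date.
    if "$PORTMASTER" -a; then
        return 0
    fi
    echo "portmaster -a failed; retrying with -af" >&2
    # Second attempt: -f forces a rebuild of the installed ports.
    "$PORTMASTER" -af
}
```

Running upgrade_ports as root should then behave like the two manual commands, with the forced rebuild used only when the first pass reports an error.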

Installing HTMLDoc

Now, we can install HTMLDoc from the Ports Collection. The port used here, textproc/p5-HTML-HTMLDoc, is the Perl interface to HTMLDoc; HTMLDoc itself is pulled in as one of its dependencies. You’ll be asked if you would like to add the GUI front-end to the application. This is entirely optional. Leave all other options at their defaults and simply go through the motions of installing the dependencies. You’ll notice plenty of them, such as animated PNG support, jpeg-turbo, Babel, NASM, CMake, py27 and a whole lot more, including curl. This is why we update the system before installing HTMLDoc: with this many dependencies, an out-of-date system may cause installation issues. This step will take the longest.

cd /usr/ports/textproc/p5-HTML-HTMLDoc/ && make install clean

Finally, when you see the following lines displayed, we’re done installing HTMLDoc:

===>  Cleaning for p5-HTML-HTMLDoc-0.10_2
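Before moving on, it can help to confirm that the htmldoc binary actually ended up on the PATH. A minimal check might look like the following; the helper name have_tool is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool htmldoc; then
    echo "htmldoc found at $(command -v htmldoc)"
else
    echo "htmldoc not found; re-check the port installation" >&2
fi
```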

Install Nano

Since the next example uses Nano, we will install and link it now, like so.

cd /usr/ports/editors/nano && make install clean
ln -s /usr/local/bin/nano /usr/bin/nano
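Running ln -s a second time fails because the link already exists, so a script that repeats this step may want to guard it. A minimal sketch, using a hypothetical helper named link_if_missing:

```shell
#!/bin/sh
# Hypothetical helper: create a symlink at $2 pointing to $1, but only when
# $1 exists and nothing is already present at $2.
link_if_missing() {
    src="$1"
    dst="$2"
    [ -e "$src" ] || { echo "missing: $src" >&2; return 1; }
    [ -e "$dst" ] || [ -L "$dst" ] || ln -s "$src" "$dst"
}
```

For this tutorial the call would be link_if_missing /usr/local/bin/nano /usr/bin/nano, which is safe to repeat.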

Generating your first PDF document from HTML markup

Let’s move on over to /tmp/ to play around and test out HTMLDoc.

cd /tmp/

Now, let’s create a simple HTML document, which we will use to generate a PDF document. Call it markup-source.html.

nano markup-source.html

Add the following HTML markup.

<title>My first PDF from HTML</title>
This is the body of my first PDF document made from HTML.

Press Ctrl + X to exit Nano, then press Y followed by Enter to save your changes. Now, you can instruct HTMLDoc, via the command line, to generate a PDF document from your markup-source.html file.

htmldoc --webpage -f postscript-output.pdf markup-source.html

You will now have a new file named postscript-output.pdf in the /tmp/ directory, with a title of "My first PDF from HTML" and a body of "This is the body of my first PDF document made from HTML".
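The whole test can also be scripted end to end. The sketch below writes the same markup with a here-document instead of Nano, runs the same htmldoc command if the tool is installed, and then checks that the output begins with the "%PDF-" signature that PDF files start with. The helper name is_pdf and the variable names are hypothetical, and the script assumes a head that supports -c, as FreeBSD's does.

```shell
#!/bin/sh
# Sketch: build the sample page without an editor, convert it if htmldoc is
# installed, and check that the result starts with the PDF signature.
workdir="${TMPDIR:-/tmp}"
src="$workdir/markup-source.html"
out="$workdir/postscript-output.pdf"

# Same markup as in the Nano step, written with a here-document.
cat > "$src" <<'EOF'
<title>My first PDF from HTML</title>
This is the body of my first PDF document made from HTML.
EOF

# Hypothetical helper: succeed if the file begins with the bytes "%PDF-".
is_pdf() {
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}

if command -v htmldoc >/dev/null 2>&1; then
    htmldoc --webpage -f "$out" "$src"
    if is_pdf "$out"; then
        echo "OK: $out looks like a PDF"
    else
        echo "Unexpected output in $out" >&2
    fi
else
    echo "htmldoc is not installed; skipping conversion" >&2
fi
```

Checking the signature is only a quick sanity test; opening the file in a PDF viewer remains the real confirmation.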