---
title: More on Hakyll
---

On Personal Websites
--------------------

I have had a website running, off and on, since the mid-90s. At first,
it was a simple collection of HTML files with a few images. Web
technologies developed rapidly, though, and it became time-consuming
to do anything interesting with my sites.

I've used a number of "simplifying" tools over the years, but most of
them relied on dynamically generating pages from a SQL database of
some sort, and I just don't want to deal with that right now. I want
simple markup again, and I want it to be served directly and
efficiently by any old plain web server.

My Basic Requirements
---------------------

HTML is no longer really a markup language; it's more of a structured
document modeling language. I don't want to compose text documents in
*code*; I want to write them naturally in my native language. A few
stylistic cues in my text formatting should suffice for specifying
the structure of these simple text documents.

So, [Markdown][Md] provides exactly the sort of markup language I
want, but a web site requires more structure. Exactly the structure
provided by HTML, in fact. So, [Pandoc][Pan] converts Markdown to HTML
(among *many* other formats), which provides the core functionality I
need.

But there's more-- a bunch of HTML files don't quite make a cohesive
site. They need some common additional structure, such as headers and
footers. They need a navigation structure to link them together. And
it would be nice to be able to generate content syndication feeds as
well.

What I need is a system that allows me to compose simple text files
written in a text-based markup language together with some HTML
templates into a *static* set of interlinked pure-HTML pages, along
with some other resources, whenever I write a new document.

Enter Hakyll
------------

The [Hakyll][Hak] system provides a flexible language for describing
just that sort of system. It is essentially a static site *compiler*
built from a set of *rules* for transforming source documents to
output documents.

This code snippet shows the rule for compiling top-level pages of my
site from Markdown to HTML:

~~~~ { .haskell }
match "*.markdown" $ do
    route   $ setExtension "html"
    compile $ pandocCompiler
        >>= loadAndApplyTemplate "templates/default.html" defaultContext
        >>= relativizeUrls
~~~~

The code is a form of Embedded Domain Specific Language (EDSL) that is
both a legal [Haskell][Has] expression and a succinct description of
what I want Hakyll to do for this particular case.

The first line describes which files this rule should apply to. It's
expressed in a format similar to UNIX filename globs. This one
says that it applies to all files at the top level of the site source
directory that have names ending in `".markdown"`.
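
For context, a rule like this sits inside the block of rules handed to
Hakyll's `hakyll` function. The following is a minimal sketch of a
complete site program built around the rule above. The `templates/*`
rule is an assumption about the surrounding setup (the template has to
be compiled somewhere before it can be applied), not a copy of my
actual configuration.

~~~~ { .haskell }
{-# LANGUAGE OverloadedStrings #-}
-- OverloadedStrings lets plain string literals stand in for
-- Hakyll's Pattern and Identifier types.
import Hakyll

main :: IO ()
main = hakyll $ do
    -- Assumed companion rule: compile the templates so that
    -- loadAndApplyTemplate can find them.
    match "templates/*" $ compile templateCompiler

    -- The rule from the text: top-level Markdown pages become HTML.
    match "*.markdown" $ do
        route   $ setExtension "html"
        compile $ pandocCompiler
            >>= loadAndApplyTemplate "templates/default.html" defaultContext
            >>= relativizeUrls
~~~~
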

Digressions on Haskell
----------------------

For those not familiar with Haskell, the `$` character is a
right-associative operator that denotes function application-- the
left operand is the function, the right operand is the argument. This
is in addition to the normal way of expressing function application,
which is a left-associative operation denoted by juxtaposing the
operands.

Normal function application has a very high precedence in the Haskell
grammar, so the `$` form is often used in place of parentheses to
allow a secondary function application to provide the argument to a
primary function application.
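
A small sketch may make the equivalence concrete. The functions
`double` and `increment` are made up for illustration; the point is
only that the parenthesized form and the `$` form mean the same thing.

~~~~ { .haskell }
-- Two small functions to apply (hypothetical, for illustration).
double :: Int -> Int
double x = x * 2

increment :: Int -> Int
increment x = x + 1

-- Juxtaposition is left-associative, so the inner application
-- needs parentheses.
withParens :: Int
withParens = double (increment 3)

-- ($) has very low precedence and associates to the right, so
-- everything after it becomes the argument.
withDollar :: Int
withDollar = double $ increment 3

main :: IO ()
main = print (withParens == withDollar)  -- prints True
~~~~

Without either the parentheses or the `$`, `double increment 3` would
be read as `(double increment) 3` and rejected by the type checker.
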

With that digression out of the way, the second line can be seen as a
nested function application-- the `route` function is passed the
result of `setExtension "html"`. As another digression, there are two
interesting things to say about this nested application:

1. The function application `setExtension "html"` evaluates to a value
   that is, in essence, a function-- the function that takes a
   filename, possibly with some extension already, and produces a new
   filename with the extension `"html"` instead. Because it returns a
   function as its result, `setExtension` is a *higher-order*
   function, and so is `route`, because it accepts that function as
   its argument.

2. The arguments to Haskell functions are not necessarily evaluated
   before the functions themselves. So if the pattern on the first
   line never matched any files, the `setExtension "html"` expression
   would never need to be evaluated. If the rule found multiple
   matches, however, the expression would evaluate *only once* to the
   function that sets filename extensions to `"html"`.
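
Both points can be tried out without Hakyll at all. In this sketch,
`setExt` is a hypothetical stand-in for `setExtension` that works on
plain file paths (using `replaceExtension` from the `filepath` package
that ships with GHC), and `ignoreSecond` shows an argument that is
never evaluated because it is never needed.

~~~~ { .haskell }
import System.FilePath (replaceExtension)

-- Hypothetical stand-in for Hakyll's setExtension: given an
-- extension, return a function from file names to file names.
setExt :: String -> (FilePath -> FilePath)
setExt ext = \path -> replaceExtension path ext

-- Applying setExt to one argument yields a reusable function.
toHtml :: FilePath -> FilePath
toHtml = setExt "html"

-- Laziness: const ignores its second argument, so the error
-- inside it is never raised.
ignoreSecond :: Int
ignoreSecond = const 1 (error "never evaluated" :: Int)

main :: IO ()
main = do
    mapM_ (putStrLn . toHtml) ["index.markdown", "about.markdown"]
    print ignoreSecond
~~~~
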

Regardless of the language mechanics behind the scenes, the effect of
the second line is to ensure that when the rule completes, the
resulting file will have the `"html"` extension rather than the
`"markdown"` extension it started with.

Back to the Example
-------------------

The third line starts the expression that does the bulk of the
work. It calls the `compile` function, and specifies the compiler as
`pandocCompiler` with its output piped through a couple of
post-processors, `loadAndApplyTemplate` and `relativizeUrls`.

The `pandocCompiler` is built in to Hakyll, and it links to the Pandoc
markup processor mentioned earlier. In default form, as it's used
here, it translates Markdown documents to HTML documents.

As the name implies, `loadAndApplyTemplate` applies the template we
give it along with a `Context`, which is a data structure that
describes the mappings between template variables and the values they
should expand to. We use the default one, which provides template
variables such as `"title"`, `"url"`, and `"body"` to the template
based on the values from the item that's being processed.
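
Contexts can also be combined, which is how extra template variables
get added on top of the defaults. This is a sketch rather than
something taken from my site: the name `postContext`, the `author`
field, and its value are all hypothetical, while `constField`,
`dateField`, and `defaultContext` are part of Hakyll's API as I
understand it.

~~~~ { .haskell }
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- A hypothetical context for blog posts: two extra fields layered
-- on top of the defaults.
postContext :: Context String
postContext =
    -- $author$: a fixed string (made-up value).
    constField "author" "A. N. Author" `mappend`
    -- $date$: formatted from the item's date metadata or file name.
    dateField "date" "%B %e, %Y" `mappend`
    -- $title$, $url$, $body$ and the other default fields.
    defaultContext
~~~~

Such a context would be passed to `loadAndApplyTemplate` in place of
`defaultContext`.
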

Finally, `relativizeUrls` will find all the links in the document and
change them from absolute URL form, e.g. `"/foo.html"`, to a form
relative to the page being compiled, e.g. `"./foo.html"` for a
top-level page. This allows us to have absolute URLs for syndication
feeds, but relative URLs for the site itself.
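
To illustrate the idea, here is a simplified, Hakyll-free sketch of
that rewriting. The helpers `siteRoot` and `relativize` are
hypothetical names, and the sketch works on bare URL strings, whereas
the real `relativizeUrls` operates on the links inside a compiled HTML
document.

~~~~ { .haskell }
import Data.List (isPrefixOf)

-- Hypothetical helper: the relative path from a page back to the
-- site root, e.g. "." for "foo.html" and ".." for "posts/foo.html".
siteRoot :: FilePath -> String
siteRoot page =
    case length (filter (== '/') page) of
        0     -> "."
        depth -> concat (replicate (depth - 1) "../") ++ ".."

-- Rewrite a root-relative URL ("/foo.html") so that it is relative
-- to the page containing it; leave every other URL untouched.
relativize :: FilePath -> String -> String
relativize page url
    | "/" `isPrefixOf` url && not ("//" `isPrefixOf` url) = siteRoot page ++ url
    | otherwise                                           = url

main :: IO ()
main = do
    putStrLn (relativize "index.html" "/foo.html")
    putStrLn (relativize "posts/hello.html" "/foo.html")
    putStrLn (relativize "index.html" "http://example.com/")
~~~~
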

This example covered only one of eight rules I'm currently using, but
hopefully it gives an idea of how simple and flexible the rule-based
system is.

Hakyll's Rule Processing
------------------------

Like the `make` language, Hakyll's rule processor keeps track of the
relationships between source files and build products, and it only
runs rules for which the input files are newer than the existing build
products. If you just add a new bit of content, such as a blog entry,
only a few rules may need to run again. On the other hand, changing a
core template may require rebuilding most of the site!
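
The `make`-style decision described above boils down to a small
comparison. This sketch shows only that general idea, with a
hypothetical `isStale` function and integers standing in for
timestamps; it is not Hakyll's actual dependency tracking, which is
more involved.

~~~~ { .haskell }
-- A build product is stale when it does not exist yet, or when any
-- of its inputs carries a newer timestamp than the product itself.
isStale :: Ord time => [time] -> Maybe time -> Bool
isStale _      Nothing      = True
isStale inputs (Just built) = any (> built) inputs

main :: IO ()
main = do
    -- Timestamps are plain integers here to keep the sketch small.
    print (isStale [10, 20] (Just (15 :: Int)))      -- an input is newer
    print (isStale [10, 20] (Just (30 :: Int)))      -- product is up to date
    print (isStale [10, 20] (Nothing :: Maybe Int))  -- never built
~~~~

A changed template behaves like a newer input to every page that uses
it, which is why touching one can trigger a near-complete rebuild.
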

Full rebuilds are one of the areas in which Hakyll really shines,
though. Since its rules are a language embedded within Haskell, a
Hakyll site builder is a compiled and optimized binary program whose
single purpose is to rebuild your site as quickly as possible.

By default, it stores all its intermediate work on disk, but if you
have the memory to work with, it can also keep all its intermediate
work in memory, which makes it even faster. For my own site, which
only has a few files so far, rebuilding the entire thing is nearly
instant even on an old 32-bit PC, so I haven't bothered with any
optimization.
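
For the record, the in-memory option appears to be a field of Hakyll's
`Configuration` record. This is a sketch I haven't needed to use
myself: it assumes the field is called `inMemoryCache`, and its
default value may differ between Hakyll versions. The single
`templates/*` rule is only a placeholder for a real set of rules.

~~~~ { .haskell }
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- Start from the defaults and ask Hakyll to keep intermediate
-- items in memory (assumed field name: inMemoryCache).
config :: Configuration
config = defaultConfiguration { inMemoryCache = True }

main :: IO ()
main = hakyllWith config $ do
    -- Placeholder rule; a real site would list all of its rules here.
    match "templates/*" $ compile templateCompiler
~~~~
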

[Md]: http://daringfireball.net/projects/markdown/ "Markdown's Original Home"
[Pan]: http://johnmacfarlane.net/pandoc/ "Pandoc's Home"
[Hak]: http://jaspervdj.be/hakyll/ "Hakyll's Home"
[Has]: http://www.haskell.org/ "Haskell's Home"