21 February 2022

Using Template Haskell to bundle data into an executable at compile time

Often, a program relies on static data: lookup tables, files to serve, initial values for internal data structures, and so on. Template Haskell can load files and execute code at compile time, so this data can be bundled into the executable itself.

This is a very gentle introduction to the otherwise often intimidating world of Template Haskell. For the sake of simplicity, I am keeping a narrow focus: loading a file, and then manipulating the data from that file, both at compile time.

Prerequisites

I'm going to assume that you already have GHC and Cabal installed, and understand the very basics of programming in Haskell.

How-to

First, create a fresh cabal project.

Then add a file in the project's root directory (alongside the .cabal file) containing the data you want to load at compile time: echo 'hello template haskell!' > hello.txt.

Once you have added bytestring and file-embed to build-depends in your cabal file, and the TemplateHaskell language pragma to your Haskell source file, you can use $(embedFile "hello.txt") to embed your data file at compile time.

Full example:

{-# LANGUAGE TemplateHaskell #-}
module Main where
import Data.FileEmbed (embedFile)
import qualified Data.ByteString.Char8 as B

main :: IO ()
main = B.putStrLn embeddedData

embeddedData :: B.ByteString
embeddedData = $(embedFile "hello.txt")

Cabal file:

cabal-version:      2.4
name:               template-haskell-data-loading
version:            0.1.0.0

executable template-haskell-data-loading
    main-is:          Main.hs

    build-depends:    base ^>=4.15.0.0,
                      bytestring,
                      file-embed
    hs-source-dirs:   app
    default-language: Haskell2010

When we run this program, we should see the text from the file displayed. If we examine the compiled executable file (I used xxd for this), we can see that the text is embedded within it.
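If xxd is not to hand, the same check can be sketched in Haskell. This is a small, hypothetical helper program and not part of the project above: it reads the file named on the command line and reports whether the bytes of the test string appear anywhere in it. It assumes the embedded text sits in the binary as one contiguous run of bytes, which matches what xxd shows here but is not something I would rely on in general.

```haskell
module Main where

import qualified Data.ByteString.Char8 as B
import System.Environment (getArgs)
import System.Exit (die)

-- The text written to hello.txt earlier (without the trailing newline).
needle :: B.ByteString
needle = B.pack "hello template haskell!"

main :: IO ()
main = do
  args <- getArgs
  case args of
    -- Exactly one argument: the path to the compiled executable.
    [exePath] -> do
      contents <- B.readFile exePath
      putStrLn $
        if needle `B.isInfixOf` contents
          then "found embedded text"
          else "embedded text not found"
    _ -> die "usage: pass the path to the compiled executable"
```

Run it with the path to the built executable, which cabal places somewhere under dist-newstyle.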

More Complex Compile-Time Operations

To compose our own compile-time operations, we need to create another module. GHC's stage restriction means that a function run inside a splice must be imported from a different module, so that it has already been compiled by the time our main file is.

We also need to add template-haskell to our build-depends list, and the name of our other module to other-modules (I have called mine CompileTime).

cabal-version:      2.4
name:               template-haskell-data-loading
version:            0.1.0.0

executable template-haskell-data-loading
    main-is:          Main.hs

    other-modules:    CompileTime

    build-depends:    base ^>=4.15.0.0,
                      bytestring,
                      template-haskell,
                      file-embed
    hs-source-dirs:   app
    default-language: Haskell2010

If we want to manipulate the data from the file at compile time, we can do this by creating a new function (embedReversedFile) which reads the file, but reverses the ByteString before turning it into a Haskell expression (Exp) to be spliced into our code.

CompileTime.hs

module CompileTime where

import Language.Haskell.TH.Syntax (Q, runIO, Exp)
import Data.FileEmbed (bsToExp)
import qualified Data.ByteString.Char8 as B

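-- Read the file at compile time, reverse its bytes, and build a
-- ByteString expression to splice in.
embedReversedFile :: FilePath -> Q Exp
embedReversedFile fp = runIO (B.readFile fp) >>= bsToExp . B.reverse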

Above, the reverse function (B.reverse :: B.ByteString -> B.ByteString) manipulates the ByteString. We could substitute in its place any other function which takes and returns a B.ByteString, and that function would run at compile time.
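To make that substitution concrete, here is a sketch of a variant of CompileTime.hs with a more general helper. The names embedFileWith and embedUpperFile are my own inventions, not part of file-embed. The helper takes the transformation as an argument, and it also calls addDependentFile (from Language.Haskell.TH.Syntax) so that GHC is told the module depends on the data file; a plain runIO read does not register that dependency. The template-haskell documentation says addDependentFile expects an absolute path, so with a relative path like "hello.txt" I would treat the recompilation tracking as best-effort.

```haskell
module CompileTime where

import Data.Char (toUpper)
import Data.FileEmbed (bsToExp)
import Language.Haskell.TH.Syntax (Exp, Q, addDependentFile, runIO)
import qualified Data.ByteString.Char8 as B

-- | Read a file at compile time, apply a transformation to its contents,
-- and turn the result into an expression that can be spliced in.
-- (Hypothetical helper; the name is an assumption, not a file-embed API.)
embedFileWith :: (B.ByteString -> B.ByteString) -> FilePath -> Q Exp
embedFileWith transform fp = do
  addDependentFile fp -- ask GHC to recompile if the file changes
  contents <- runIO (B.readFile fp)
  bsToExp (transform contents)

-- | The reversing example, expressed with the general helper.
embedReversedFile :: FilePath -> Q Exp
embedReversedFile = embedFileWith B.reverse

-- | Another instance: upper-case the file's contents at compile time.
embedUpperFile :: FilePath -> Q Exp
embedUpperFile = embedFileWith (B.map toUpper)
```

With this in place, a splice such as $(embedUpperFile "hello.txt") in Main.hs would embed the upper-cased text instead of the reversed one.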

And now, we can use this new compile-time function in our code.

Main.hs

{-# LANGUAGE TemplateHaskell #-}

module Main where
import CompileTime (embedReversedFile)
import qualified Data.ByteString.Char8 as B

embeddedData :: B.ByteString
embeddedData = $(embedReversedFile "hello.txt")

main :: IO ()
main = B.putStrLn embeddedData

Examining the executable binary now will reveal the reversed string embedded within it.

Tags: Haskell