Getting Twiggy With It: Node Visitors

I am going to write about node visitors, one of the more obscure but powerful concepts in Twig. To help make sense of them, I will use a simple, real-world example.

At OpenSky we recently added a basic CMS to our site that allows us to make edits to text without going through the hassle of editing a template and redeploying the entire codebase. We added a module to our admin that manages these CMS “blocks” as documents in MongoDB. Each document represents a block and includes a unique, descriptive key and the text content. In our templates we render these blocks using a simple Twig function. For example:

{{ cms_block('welcome') }}

When rendering a template, Twig would come across this function and issue a query to MongoDB for the welcome document in the collection of CMS blocks. Easy enough, right?

Not quite. We would like to speckle these blocks all over pages across the site: a paragraph here, a header there, an image over there, meta tags, Facebook tags… We could be looking at adding a dozen or more queries to a page for our little CMS, which is unacceptable.

Twig’s own Flux Capacitor

Imagine being able to look into the future to see what CMS blocks a template was going to use and issue a single query to prefetch them all from the database. That is exactly the sort of thing you can do by taking advantage of the static compilation phase, when Twig converts your templates into optimized PHP classes.
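
To make the payoff concrete, here is a minimal sketch of the prefetch idea. It is not OpenSky's code: the BlockCache class, its method names, and the in-memory array standing in for the MongoDB collection are all assumptions made for illustration (PHP 7.1 or later is assumed). The point is only that once the keys are known ahead of time, every later lookup can be served from memory after a single batched fetch.

```php
<?php

// Hypothetical cache: the class and method names are illustrative only.
final class BlockCache
{
    /** @var callable takes a list of keys, returns a key => content map */
    private $fetchMany;
    /** @var array<string, string> blocks loaded so far */
    private $blocks = [];
    /** @var int number of round trips made to the backing store */
    public $queries = 0;

    public function __construct(callable $fetchMany)
    {
        $this->fetchMany = $fetchMany;
    }

    // Load every key that is not already cached, in one round trip.
    public function prefetch(array $keys): void
    {
        $missing = array_values(array_diff(array_unique($keys), array_keys($this->blocks)));
        if ($missing === []) {
            return;
        }
        $this->queries++;
        $this->blocks += ($this->fetchMany)($missing);
    }

    // Return one block, falling back to a single-key fetch on a cache miss.
    public function get(string $key): string
    {
        if (!array_key_exists($key, $this->blocks)) {
            $this->prefetch([$key]);
        }
        return $this->blocks[$key] ?? '';
    }
}

// Stand-in for the MongoDB collection of CMS blocks.
$store = ['welcome' => 'Hello!', 'footer' => 'Bye.', 'title' => 'OpenSky'];
$fetchMany = function (array $keys) use ($store): array {
    return array_intersect_key($store, array_flip($keys));
};

$cache = new BlockCache($fetchMany);
$cache->prefetch(['welcome', 'footer', 'title']); // one batched fetch up front
$html = $cache->get('title') . ': ' . $cache->get('welcome') . ' ' . $cache->get('footer');
```

Here three blocks are rendered but the store is hit only once. Against a real MongoDB collection, the batched fetch would typically be a single find using the $in operator on whatever field holds the block key.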

A Quick Primer on Twig Internals

Let’s take a step back and review some of the guts of Twig. I promise I’ll return to Back to the Future references later.

Compilation of a template into a PHP class is a four-step process:

  1. Load
  2. Tokenize
  3. Parse
  4. Compile

The first step involves an implementation of Twig_LoaderInterface, of which only one method is pertinent to us: getSource(). This method accepts a template name and returns the raw content of that template. In the case of the default Twig_Loader_Filesystem implementation, this boils down to a simple call to file_get_contents().

The second step, tokenizing the loaded source, is handled by an implementation of Twig_LexerInterface, which consists of one method, tokenize(). If you are having a hard time sleeping at night, you can read up on lexical analysis on Wikipedia. For the purpose of this article, you only need to understand that the lexer converts what you’ve written in your template using Twig’s grammar into a stream of simple PHP objects called tokens.

In the third step, the stream of tokens created by the lexer is parsed into a tree of nodes. This work is done by the parser in cooperation with a collection of token parsers. Token parsers are the extension point you would hook into if you wanted to create a new {% foo %} tag, a topic outside the scope of this article.

In the final step the node tree created by the parser is compiled into runtime PHP code. Each node in the tree implements Twig_NodeInterface, which includes a compile() method that allows it to write arbitrary code to the resulting template class.
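
The four steps can be sketched end to end with a toy engine. Nothing here is Twig's real code: the function names, the token and node array shapes, and the tiny grammar (plain text plus {{ name }} variables) are all invented for illustration, and PHP 7.1 or later is assumed.

```php
<?php

// 1. Load: return the raw source for a template name (an array stands in for the filesystem).
function loadSource(array $templates, string $name): string
{
    if (!isset($templates[$name])) {
        throw new InvalidArgumentException("Unknown template: $name");
    }
    return $templates[$name];
}

// 2. Tokenize: split the source into text tokens and variable tokens.
function tokenize(string $source): array
{
    $tokens = [];
    $parts = preg_split('/\{\{\s*(\w+)\s*\}\}/', $source, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // With one capture group, odd indexes hold the captured variable names.
        if ($i % 2 === 1) {
            $tokens[] = ['type' => 'var', 'value' => $part];
        } elseif ($part !== '') {
            $tokens[] = ['type' => 'text', 'value' => $part];
        }
    }
    return $tokens;
}

// 3. Parse: turn the flat token stream into a node tree (a root node with children).
function parse(array $tokens): array
{
    $children = [];
    foreach ($tokens as $token) {
        $children[] = $token['type'] === 'var'
            ? ['node' => 'print', 'name' => $token['value']]
            : ['node' => 'text', 'data' => $token['value']];
    }
    return ['node' => 'module', 'children' => $children];
}

// 4. Compile: each node contributes a line of PHP to the generated function body.
function compile(array $tree): string
{
    $body = '';
    foreach ($tree['children'] as $child) {
        if ($child['node'] === 'text') {
            $body .= '$out .= ' . var_export($child['data'], true) . ";\n";
        } else {
            $body .= '$out .= htmlspecialchars((string) ($context['
                . var_export($child['name'], true) . "] ?? ''));\n";
        }
    }
    return "return function (array \$context): string {\n\$out = '';\n{$body}return \$out;\n};";
}

$templates = ['hello' => 'Hello {{ name }}!'];
$php = compile(parse(tokenize(loadSource($templates, 'hello'))));
$render = eval($php); // evaluate the generated code to get a callable; acceptable for a toy only
$output = $render(['name' => 'Marty']);
```

The generated string in $php is ordinary PHP source, which is the property that matters for the rest of this article: anything decided between parsing and compiling ends up baked into the code that runs at render time.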

Great Scott!

That description of the Twig engine was criminally brief, but it should give you enough knowledge to understand where node visitors come in. After the parser creates the node tree, but before the tree is compiled into PHP code, the parser recursively walks the tree and passes each node through its registered node visitors. Each visitor has a chance to inspect every node in the tree, change it, replace it with another, or remove it altogether. Using this tool, we can inspect each template at compile time and anticipate which CMS blocks it will need before it is ever rendered.
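
A minimal sketch of that traversal follows, assuming a toy node tree rather than Twig's real classes. Node, BlockCollector, and traverse() are hypothetical names. The enterNode()/leaveNode() pair is modeled on the shape of Twig's node visitor interface, but the signatures here are simplified and are not Twig's. PHP 7.1 or later is assumed.

```php
<?php

// Toy node: a type, some attributes, and child nodes.
final class Node
{
    public $type;
    public $attributes;
    public $children;

    public function __construct(string $type, array $attributes = [], array $children = [])
    {
        $this->type = $type;
        $this->attributes = $attributes;
        $this->children = $children;
    }
}

// Hypothetical visitor that records the key of every cms_block('...') call.
final class BlockCollector
{
    public $keys = [];

    public function enterNode(Node $node): Node
    {
        if ($node->type === 'function' && $node->attributes['name'] === 'cms_block') {
            $arg = $node->children[0] ?? null;
            // Only literal keys can be known at compile time.
            if ($arg !== null && $arg->type === 'constant') {
                $this->keys[$arg->attributes['value']] = true;
            }
        }
        return $node;
    }

    // In this toy, returning null from leaveNode() drops the node from the tree.
    public function leaveNode(Node $node): ?Node
    {
        return $node;
    }
}

// Depth-first walk: enter the node, visit its children, then leave it.
function traverse(Node $node, BlockCollector $visitor): ?Node
{
    $node = $visitor->enterNode($node);
    $kept = [];
    foreach ($node->children as $child) {
        $result = traverse($child, $visitor);
        if ($result !== null) {
            $kept[] = $result;
        }
    }
    $node->children = $kept;
    return $visitor->leaveNode($node);
}

// Tree for: <h1>{{ cms_block('title') }}</h1>{{ cms_block('welcome') }}{{ cms_block('title') }}
$block = function (string $key): Node {
    return new Node('function', ['name' => 'cms_block'], [new Node('constant', ['value' => $key])]);
};
$tree = new Node('module', [], [
    new Node('text', ['data' => '<h1>']),
    new Node('print', [], [$block('title')]),
    new Node('text', ['data' => '</h1>']),
    new Node('print', [], [$block('welcome')]),
    new Node('print', [], [$block('title')]),
]);

$visitor = new BlockCollector();
traverse($tree, $visitor);
$keys = array_keys($visitor->keys); // duplicates collapse because array keys are unique
```

After the walk, $keys holds each distinct block key once, in first-seen order. A block whose key is computed at runtime (a variable rather than a literal) is skipped by this sketch and would still need its own query.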

Coming Soon…

Dive into code with a working example of a Twig node visitor.