Kris Wallsmith

Symfony Guru at opensky.com.
Discussing web development, Symfony and fatherhood.

Posts tagged twig

Jan 6

Twig Node Visitors (Part 2)

This is the second in a series of articles on Twig node visitors. Please read part one first.

Node visitors can be used for any number of things. The Twig_NodeVisitorInterface interface itself is just three methods:

interface Twig_NodeVisitorInterface
{
    /**
     * @return Twig_NodeInterface The modified node
     */
    function enterNode(Twig_NodeInterface $node, Twig_Environment $env);

    /**
     * @return Twig_NodeInterface The modified node
     */
    function leaveNode(Twig_NodeInterface $node, Twig_Environment $env);

    /**
     * @return integer The priority level
     */
    function getPriority();
}

The interface is simple but powerful: it provides a mechanism for manipulating nodes before a template is compiled down to a PHP class, and it places no constraints on what a visitor can be used for. The Twig core ships node visitors for escaping and optimization, neither of which bears any resemblance to what we are going to do here.

Our use case at OpenSky has to do with querying blocks of CMS content from the database eagerly, based on what blocks have been included in the template using the following function:

{{ cms_block('header') }}

We can accomplish this eager loading by using a node visitor to statically analyze each template and stash the CMS blocks it calls for. The first method of the interface, enterNode(), can be used to look at every node in each template to see if it represents a call to this function.

public function enterNode(Twig_NodeInterface $node, Twig_Environment $env)
{
    if ($cmsBlock = $this->getCmsBlockKey($node)) {
        $this->cmsBlocks[] = $cmsBlock;
    }

    return $node;
}

// ...

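private function getCmsBlockKey(Twig_NodeInterface $node)
{
    if ($node instanceof Twig_Node_Expression_Function
        && 'cms_block' == $node->getAttribute('name')) {
        $arguments = $node->getNode('arguments');
        // only a literal first argument can be read at compile time
        if ($arguments->hasNode(0)
            && $arguments->getNode(0) instanceof Twig_Node_Expression_Constant) {
            return $arguments->getNode(0)->getAttribute('value');
        }
    }
}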

This code looks at each node to see if it represents a call to the cms_block function and pushes the first argument (the name of the CMS block) to an internal array for use later.

After running this and inspecting what was being pushed onto that array, we found two issues. First, the visitor was not smart enough to crawl included or imported templates and look for CMS blocks there. Second, the node visitor was not being reset for each template, so by the end of cache warmup every CMS block called anywhere in the application had accumulated in that internal array.

Solving the first issue meant adding a way to recursively crawl the graph of each template’s children — a child being a call to either {% include %} or {% import %} in our case. We did this by adding another internal stack:

public function enterNode(Twig_NodeInterface $node, Twig_Environment $env)
{
    if ($cmsBlock = $this->getCmsBlockKey($node)) {
        $this->cmsBlocks[] = $cmsBlock;
    } elseif ($templateName = $this->getIncludedTemplateName($node)) {
        $this->includes[] = $templateName;
    }

    return $node;
}

// ...

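private function getIncludedTemplateName(Twig_NodeInterface $node)
{
    if ($node instanceof Twig_Node_Include || $node instanceof Twig_Node_Import) {
        $expr = $node->getNode('expr');
        // a dynamic name, e.g. {% include some_variable %}, cannot be resolved statically
        if ($expr instanceof Twig_Node_Expression_Constant) {
            return $expr->getAttribute('value');
        }
    }
}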
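The code above only records which templates are included; the recursive crawl itself is not shown. As a rough illustration of what it could look like, here is a minimal, self-contained sketch. It is not the OpenSky implementation: the resolveCmsBlocks() function and the $templates map are hypothetical, and the map simply stands in for whatever the visitor recorded for each template.

```php
<?php

/**
 * Hypothetical helper: collects the CMS block keys a template needs,
 * following its includes and imports recursively.
 *
 * @param array  $templates Map of template name => array('blocks' => ..., 'includes' => ...)
 * @param string $name      Template to resolve
 * @param array  $seen      Names already visited (guards against include cycles)
 *
 * @return array Unique block keys, in discovery order
 */
function resolveCmsBlocks(array $templates, $name, array &$seen = array())
{
    // Skip templates we already crawled, and templates we know nothing about.
    if (isset($seen[$name]) || !isset($templates[$name])) {
        return array();
    }
    $seen[$name] = true;

    $blocks = $templates[$name]['blocks'];
    foreach ($templates[$name]['includes'] as $child) {
        $blocks = array_merge($blocks, resolveCmsBlocks($templates, $child, $seen));
    }

    return array_values(array_unique($blocks));
}

// Made-up data standing in for what the visitor recorded per template.
// The two templates include each other on purpose, to exercise the cycle guard.
$templates = array(
    'layout.html.twig'  => array('blocks' => array('header'), 'includes' => array('sidebar.html.twig')),
    'sidebar.html.twig' => array('blocks' => array('promo', 'header'), 'includes' => array('layout.html.twig')),
);

$blocks = resolveCmsBlocks($templates, 'layout.html.twig');
```

The $seen array is passed by reference so that a template reached through several paths is crawled only once and a circular include cannot recurse forever.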

We solved the second issue by listening for the root node, an instance of Twig_Node_Module, and clearing the internal stacks when we leave that node:

public function leaveNode(Twig_NodeInterface $node, Twig_Environment $env)
{
    if ($node instanceof Twig_Node_Module) {
        // todo: make these stacks available at runtime

        // reset
        $this->cmsBlocks = array();
        $this->includes  = array();
    }

    return $node;
}

We’re in pretty good shape at this point. We have built a node visitor that collects the information necessary for eagerly querying our CMS for the blocks that each template calls for. Now we just need to make this information available at runtime, when the eager query needs to be executed.

We’ll do this in the next post by wrapping the Twig_Node_Module in our own module node that compiles down to a PHP class with the necessary public methods. Stay tuned!


Nov 30

Getting Twiggy With It: Node Visitors

I am going to write about node visitors, one of the more obscure but powerful concepts in Twig. To help make sense of them, I will use a simple, real-world example.

At OpenSky we recently added a basic CMS to our site that allows us to make edits to text without going through the hassle of editing a template and redeploying the entire codebase. We added a module to our admin that manages these CMS “blocks” as documents in MongoDB. Each document represents a block and includes a unique, descriptive key and the text content. In our templates we render these blocks using a simple Twig function. For example:

{{ cms_block('welcome') }}

When rendering a template, Twig would come across this function and issue a query to MongoDB for the welcome document in the collection of CMS blocks. Easy enough, right?

Not quite. We would like to speckle these blocks all over pages across the site: a paragraph here, a header there, an image over there, meta tags, Facebook tags… We could be looking at adding a dozen or more queries to a page for our little CMS, which is unacceptable.

Twig’s own Flux Capacitor

Imagine being able to look into the future to see what CMS blocks a template was going to use and issue a single query to prefetch them all from the database. That is exactly the sort of thing you can do by taking advantage of the static compilation phase, when Twig converts your templates into optimized PHP classes.
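To make the idea concrete, here is a rough sketch of what a single prefetch could look like at runtime. Everything in it is an assumption made for illustration: the CmsBlockRegistry class, its prefetch() and get() methods, and the $finder callable, which stands in for one MongoDB query that returns the documents for a list of keys.

```php
<?php

/**
 * Hypothetical runtime registry: loads many CMS blocks with one lookup
 * instead of one query per cms_block() call.
 */
class CmsBlockRegistry
{
    private $finder;
    private $blocks = array();

    /**
     * @param callable $finder Takes an array of keys and returns a key => content map.
     *                         Stands in for a single "find where key in (...)" query.
     */
    public function __construct($finder)
    {
        $this->finder = $finder;
    }

    public function prefetch(array $keys)
    {
        // Only ask for keys that have not been loaded yet.
        $missing = array_values(array_diff(array_unique($keys), array_keys($this->blocks)));
        if ($missing) {
            $this->blocks += call_user_func($this->finder, $missing);
        }
    }

    public function get($key)
    {
        if (!array_key_exists($key, $this->blocks)) {
            // Fallback for a block nobody predicted: one extra lookup.
            $this->prefetch(array($key));
        }

        return isset($this->blocks[$key]) ? $this->blocks[$key] : '';
    }
}

// In-memory stand-in for the collection, plus a counter to show the saving.
$documents = array('welcome' => 'Hello!', 'footer' => 'Bye!', 'promo' => '10% off');
$queries = 0;
$finder = function (array $keys) use ($documents, &$queries) {
    $queries++;

    return array_intersect_key($documents, array_flip($keys));
};

$registry = new CmsBlockRegistry($finder);
$registry->prefetch(array('welcome', 'footer', 'promo'));

$html = $registry->get('welcome') . ' ' . $registry->get('footer');
```

After the prefetch, both get() calls are served from memory, so $queries stays at one no matter how many blocks the page renders. The missing piece is knowing which keys to pass to prefetch() before rendering starts.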

A Quick Primer on Twig Internals

Let’s take a step back and review some of the guts of Twig. I promise I’ll return to Back to the Future references later.

Compilation of a template into a PHP class is a four-step process:

  1. Load
  2. Tokenize
  3. Parse
  4. Compile

The first step involves an implementation of Twig_LoaderInterface, of which only one method is pertinent to us: getSource(). This method accepts a template name and returns the raw content of that template. In the case of the default Twig_Loader_Filesystem implementation, this boils down to a simple call to file_get_contents().

The second step, tokenizing the loaded source, is handled by an implementation of Twig_LexerInterface which consists of one method, tokenize(). If you are having a hard time sleeping at night you can read up on lexical analysis on Wikipedia. For the purpose of this article, you only need to understand that the lexer converts what you’ve written in your template using Twig’s grammar into a stream of simple PHP objects called tokens.

In the third step the stream of tokens created by the lexer is parsed into a multi-dimensional tree of nodes. This work is done by the parser in cooperation with a collection of token parsers. This is the extension point you would hook into if you wanted to create a new {% foo %} tag, a topic outside the scope of this article.

In the final step the node tree created by the parser is compiled into runtime PHP code. Each node in the tree implements Twig_NodeInterface, which includes a compile() method that allows it to write arbitrary code to the resulting template class.
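The four steps can be pictured with a deliberately tiny toy pipeline. None of this is Twig code: the toyLoad(), toyTokenize(), toyParse() and toyCompile() functions are made up for illustration, they only understand plain text and {{ name }} expressions, and the real lexer, parser and compiler do far more.

```php
<?php

// 1. Load: a real loader reads from disk; here an array stands in.
function toyLoad(array $templates, $name)
{
    return $templates[$name];
}

// 2. Tokenize: split the source into text tokens and {{ name }} print tokens.
function toyTokenize($source)
{
    $tokens = array();
    $parts = preg_split('/\{\{\s*(\w+)\s*\}\}/', $source, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // With PREG_SPLIT_DELIM_CAPTURE, odd offsets hold the captured names.
        if ($i % 2 === 1) {
            $tokens[] = array('type' => 'print', 'value' => $part);
        } elseif ($part !== '') {
            $tokens[] = array('type' => 'text', 'value' => $part);
        }
    }

    return $tokens;
}

// 3. Parse: turn the flat token stream into a (very shallow) node tree.
function toyParse(array $tokens)
{
    $children = array();
    foreach ($tokens as $token) {
        $children[] = array('node' => $token['type'], 'value' => $token['value']);
    }

    return array('node' => 'module', 'children' => $children);
}

// 4. Compile: let each node write its share of the resulting PHP code.
function toyCompile(array $tree)
{
    $php = '';
    foreach ($tree['children'] as $child) {
        if ('print' === $child['node']) {
            $php .= sprintf("echo \$context['%s'];\n", $child['value']);
        } else {
            $php .= sprintf("echo %s;\n", var_export($child['value'], true));
        }
    }

    return $php;
}

$templates = array('hello' => 'Hello {{ name }}!');
$php = toyCompile(toyParse(toyTokenize(toyLoad($templates, 'hello'))));
```

For the template above, $php ends up holding three echo statements: one for the leading text, one that reads name from $context, and one for the trailing exclamation mark. The tree handed from step 3 to step 4 is the thing a node visitor gets to inspect.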

Great Scott!

That description of the Twig engine was criminally brief, but it should give you enough knowledge to understand where node visitors come in. After the parser creates the node tree but before the tree is compiled into PHP code, the parser recursively iterates over the tree and filters each node through its registered node visitors. Each visitor has a chance to inspect every single node in the tree, make changes, replace it with another, or even remove it altogether. With this tool we can inspect each template and anticipate, before it is ever rendered, which CMS blocks it will need.
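The order in which a visitor sees nodes is easier to grasp with a small simulation. The sketch below uses stand-in classes, not Twig's own: ToyNode, LoggingVisitor and traverse() are hypothetical names, and the walk is a simplified approximation of a depth-first traversal in which a node is entered, its children are walked, and then the node is left.

```php
<?php

// Stand-in node: NOT a Twig class, just a name plus child nodes.
class ToyNode
{
    public $name;
    public $children;

    public function __construct($name, array $children = array())
    {
        $this->name = $name;
        $this->children = $children;
    }
}

// Stand-in visitor that records the order in which it sees nodes.
class LoggingVisitor
{
    public $log = array();

    public function enterNode(ToyNode $node)
    {
        $this->log[] = 'enter ' . $node->name;

        return $node;
    }

    public function leaveNode(ToyNode $node)
    {
        $this->log[] = 'leave ' . $node->name;

        return $node;
    }
}

// Depth-first walk: enter a node, walk its children, then leave it.
// Whatever the visitor returns takes the node's place in the tree.
function traverse(ToyNode $node, LoggingVisitor $visitor)
{
    $node = $visitor->enterNode($node);
    foreach ($node->children as $i => $child) {
        $node->children[$i] = traverse($child, $visitor);
    }

    return $visitor->leaveNode($node);
}

$tree = new ToyNode('module', array(
    new ToyNode('text'),
    new ToyNode('print', array(new ToyNode('function cms_block'))),
));

$visitor = new LoggingVisitor();
traverse($tree, $visitor);
```

In this sketch the root node is the first one entered and the last one left, so by the time the visitor leaves it, every other node in the template has already been seen.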

Coming Soon…

Dive into code with a working example of a Twig node visitor.