libTemplateEngine is a simple unicode awawre templating engine.
In the simplest form, the engine will use a dictionary to expand named values contained in a template. E.g.
<form>
{{FIRSTNAME}}:<br>
<input type="text" name="firstname">
<br>
{{LASTNAME}}:<br>
<input type="text" name="lastname">
</form>
can be converted into:
<form>
First name:<br>
<input type="text" name="firstname">
<br>
Last name:<br>
<input type="text" name="lastname">
</form>
I started writing C code back in the Kernigan and Richie days. I continued to write C and C++ code, although professionally more focus was given to C# with the advent of the .Net framework. Now, I find myself with a bit of extra spare time, and have decided to brush off my C++ skills.
In order to do that I decided on a small cross platform server project. One of the first tasks was cross platform capability of reporting performance counters. Remember Knuth's premature optimization is the root of all evil. Surprisingly I didn't find any readily available open source solutions, and decided to roll my own.
That library uses a simple DSL (domain specific language) to hide away the platform specific code. It became obvious to me that a the code generation could be simplified tremendously by using a simple template engine.
There are a lot of templating engines available. But none really matched my requirements, either they were huge, and required a lot of effort compared to what I needed done or they had severe issues working across platforms.
So, I decided to roll my own (is there a pattern here?). In retrospect it might have been quicker to go with a different code generation scheme, but as I started out say, I find myself with a bit of spare time on my hands.
I wanted a lightweight, simple, cross platform templating engine, with full unicode support.
Those requirements are quite fluffy, e.g. what does lightweight mean?
For my purposes lightweight means, no external dependencies! I've had to compromise a bit on this goal. The unit tests depends on boost, the build environment depends on cmake and this documentation depends on doxygen. The core library can be build without boost and doxygen.
Another aspect of lightweight is that is builds as a static link library, a .a or .lib file, rather than a .so or .dll file. I doubt that this library will gain such popularity that the hazzles of yet another install time component is worth the effort.
Cross platform in this context means that it can be built and run under both most Linuxes and Windows. It might build on other platforms, BSD and OSX springs to mind, but I have not verified this.
Full unicode support means just that. The actual text being expanded by the engine should support all 17 planes, and the engine should correctly expand all unicode based templates regardless of the glyphs being displayed. If I want hieroglyphics or klingon text, I should be able to get that.
The code is available under the GNU LGPL license. Which basically means you are free to link to the binary code, but any changes to the code should be made publicly available. If you do modify the code, fix a bug or add an interesting feature I'd appreciate a pull request.
Licenses of dependencies
Application | License | Comment |
---|---|---|
CMake | BSD | Required |
Boost | BSL | Optional |
Doxygen | GPL | Optional |
All libraries - including libTemplateEngine - have no ownership of the output of the application. Who ever has the copyright of the input, also holds the copyright of the output.
If you're an old hand at this, you probably only need to know that libTemplateEngine uses cmake and that it doesn't accept in-source builds.
The root CMakeLists.txt is located in the src
folder.
During configuration it will try to locate boost and doxygen. If it finds boost it will enable building of the unit test framework, if it finds doxygen it will enable building of the documentation.
CMake is a great tool, but unless you perform an install, it leaves the build results in some - in my opinion - odd places. I doubt anyone would want to make a system wide install of libTemplateEngine, so I've created a small build script which performs the build and then installs things in a safe local location.
./lib
- will contain the static libraries (.a/.lib).
./libTemplateEngine
- will contain the header files.
./bin
- will contain the unit test executable, if it was built.
./doc
- will contain the documentation, if it was built.
I know libTemplateEngine
is a little unconventional, but this way - with proper specification of include folders - it is possibly to write something like this in your code:
#include <libTemplateEngine/TemplateEngine.hpp>
you have a namespace for your include files! This way it is easy to see which library provides the header file.
The easiest way will be to use the wibuild.cmd
file.
- Open a Visual Studio 2015 command line prompt
- CD to the root of libTemplateEngine
- Type
wibuild
and press enter.
The the script will use Visual Studio solutions and projects to build the binaries. The build will performed four times, once for: debug x86, release x86, debug x64 and finally release x64. When the build has completed the solution and project files can be found in the build
directory.
The easiest way will be to use the libuild
script.
- Open a shell/terminal window
- CD to the root of libTemplateEngine
- Type
libuild
and press enter.
libTemplateEngine operates in a two step process.
- Compile the input stream into a list of template elements
- Pull in a dictionary and render the compiled text.
Any text enclosed between {{
and }}
, start-tag and end-tag respectively, is considered to be a processing instruction. Any text found outside these tags is considered to be plain text, and is copied verbatim to the output stream. Any text inside tags are parsed separately and is considered to be some kind of instruction.
Most of the following will focus on processing instructions since there is very little processing involved in handling the plain text components. Any plain text, except for escape sequences, is copied from the input to the output verbatim.
This concept of tagging - of course - begs the question of escaping, or how to have strings like "{{" passed through to the output stream without any processing. libTemplateEngine handles this in a slightly unusual way. As usual the backslash '' character can be used to escape the {{ character.
The goal was to have the least possible impact on the source text which should be handled by libTemplateEngine. When libTemplateEngine encounters two left braces in a row ({{) it looks at the third character immediately following the two braces, if that third character isn't one of the start characters of an instruction the sequence will be passed unmmodified on to the output stream.
To be specific, only if the stream contains a string matching this regular expression will it need to be escaped: {{[:_a-zA-Z0-9-]
. Piping the source file through this ugly looking sed command will add an escape marker to any location which needs to be escaped: sed s/\({{[:_a-zA-Z0-9-]\)/\\\1/
The format of a template, including instructions, can be seen in this syntax diagram:
Where PlainText is defined as:
Note! I know that from a formalistic viewpoint the above diagram doesn't make sense. What I wanted to show was that any unicode character can be used as plain text, except two left braces in a row, and how to escape those. I couldn't find a simple way of showing this without breaking a few formalistic rules. If you have a better solution, I'd love to hear it.
Only creativity sets a limit to what kinds of processing instructions could be implemented. libTemplateEngine only has three kinds of instructions (remember one of the goals was simplicity).
The first is a simple dictionary lookup expansion.
The second repeats the embedded template a number of times.
The third is a comment which can be used to clarify the tempate instructions.
Syntax:
{{\ ':'* <name>}}
Purpose:
Replace the given name with a value from the associated dictionary.
Method:
An expansion performs a lookup in the context using the embedded name and outputs the contents of the entry found.
The dictionaries in the context stack are searched for a matching name, if the name cannot be found an error is reported.
Scoping of the search can be controlled using prefixed colons (:
). Sometimes the dictionaries will contain identical names, either on purpose or by accident, by default, the value will be taken from the nearest scope where the name can be found. If the name should be taken from some other scope, colons can be used.
Let's look at an example, given the following dictionary layout.
Root | ||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Name | Value | |||||||||||||||||||||||||||||||||||||||||
Label | Root | |||||||||||||||||||||||||||||||||||||||||
Type | Integer | |||||||||||||||||||||||||||||||||||||||||
Class |
|
and the following template
class {{LABEL}
{
{{#repeat Class}}
class {{Label}}
{
{{Type}} {{:Label}} = Lookup({{::Label}})
}
{{/repeat}
}
the following text will be generated:
class Root
{
class MyObjectEn
{
Integer MyObject = Lookup(Root)
}
class MyObjectGe
{
Integer MyObject = Lookup(Root)
}
class MyObjectFr
{
Integer MyObject = Lookup(Root)
}
}
Name | Explanation |
---|---|
{{Type}} |
The name only exists in the root dictinary, and the scoped search will find that entry. |
class {{label}} |
The value is taken from the active sub-dictionary, while rendering the repeat block |
{{:Label}} |
The colon breaks out of the immediate scope, yielding the dictionary associated with the list. |
{{::Label}} |
The search for a name begins two steps up in the context stack, yielding the root dictionary. |
Syntax:
{{#repeat <label>}}<template>{{/repeat}}
Purpose:
A repeat duplicates/copies the embedded template zero or more times, as determined by the context.
Method:
A repeat instruction carries a label, assigned to it in the <label> part of the instruction. When the repeat is processed the context is queried for a value matching the label of the instruction. That value is expected to be a list of dictionaries, if not an error is reported.
The details and intricacies of dictionary lists and the context is documented else where, but basic principle is that a dictionary entry may either be a simple text value or a list of dictionaries.
The embedded template is copied once for each sub-dictionary in the dictionary list. If the dictionary list is empty, no copying is done.
Syntax:
{{-<text>}}
Purpose:
Help document and clarify the template usage.
Method: The text within the instruction is discarded.
A dictionary is a data-structure which associates a given name with a given value.
Name | Value |
---|---|
A | <value for A> |
B | <value for B> |
A value can either be a simple UTF-16 string or a list of dictionaries.
Name | Value | ||||
---|---|---|---|---|---|
Name | Rudolph Abel | ||||
Block |
|
A context can conceptually be thought of as a stack of dictionaries (Note! the stack is only implicitly expressed in the code). When the context is created it creates an internal dictionary containing some default values. The globally available values are defined here.
Prior to calling render on the template a dictionary should be attached to the context using the setDictionary method.
When the rendering is started the context pushes first the it's global dictionary onto the stack, then by the dictionary supplied via setDictionary is pushed onto the stack.
Stack | Comment |
---|---|
Main | Assigned via setDictionary |
Global | System supplied values |
Dictionary lookup is performed by walking the stack from top to bottom. First the topmost dictionary is queried, if the entry isn't found in that dictionary, then the next dictionary on the stack is queried, until either the value is found or the stack has been exhausted.
In the simple case, a template without any repeats, this is all that needs to be done.
Repeat instructions use additional dictionaries. When a repeat instruction is rendered, the context if first queried for a value with the name of the repeat instruction. The value must exist and the value must be a dictionary list, otherwise an error is reported, and the rendering is aborted.
A dictionary list is a dictionary in it's own right, and may contain any number of values scoped to be shared with each sub dictionary in the list.
Once the dictionary list has been found, the list's internal dictionary is pushed onto the stack, then the list of sub dictionaries is iterated. for each iteration, the sub-dictionary is pushed onto the stack, and the embedded template is rendered using that context.
Stack | Comment |
---|---|
Dict-i | The i'th sub-dictionary of the list |
List | The internal dictionary of the list |
Main | Assigned via setDictionary |
Global | System supplied values |
All errors are reported by throwing an exception. In order to keep things simple only one kind of exception is defined. The exception inherits from the STL exception, and the actual error message can be queried via what()
.
The reasoning behind using exceptions rather than returning booleans from various method calls is that - once the application has been debugged - errors should be a rare occurrence.
As stated previously, unicode correctness is one of the design goals of this project. for this reason some deliberation has gone into the decision about the format of the internal string buffers.
Internally libTemplateEngine uses UTF-16 strings and characters. This was choice was made in order to strike a balance between memory footprint and the risk of polluting the internal structures with non-unicode strings.
Note! A dictionary has a method for adding a std::string to the dictionary. It is important that the string is guaranteed to contain a valid UTF-8 string, otherwise the dictionary will be polluted, and the resulting output may not be unicode correct.
libTemplateEngine is doing the easy part, the real work in being unicode correct lies in the input and output mechanisms.
The task of managing the input is delegated to the scanner. The purpose of the scanner is to take the input template, and convert it to an UTF-16 stream of UTF-16 code units.
On windows libTemplateEngine has a small utility, called te_converter, which can convert between UTF-8 and UTF-16. To the best of my knowledge that conversion utility is correct. Although the c++11 standard has methods for converting between UTF-32 and its smaller counterparts I have been unable to link that code under windows. There appears to be a bug reported to MS about this, and although it is more than six months old it hasn't been patched yet. Consequently the library has no means of converting between UTF-32 and UTF-16.
I have included two example applications, which might help provide some insights into how everything fits together
The first is very simple and builds upon the example shown in the introduction.
The second shows how to use the repeat instruction and dictionary lists.