Skip to content

Latest commit

 

History

History
151 lines (111 loc) · 7.6 KB

CompilerOverview.md

File metadata and controls

151 lines (111 loc) · 7.6 KB

Overview of the Rembulan compiler

Rembulan compiles Lua sources directly to Java bytecode. In contrast to PUC-Lua, which first compiles Lua sources to the Lua bytecode (that is then interpreted), the intermediate representation used in the Rembulan compiler is not exposed to the users, and has no serialisable format.

This means that on the binary level, Rembulan is not compatible with PUC-Lua. On the other hand, however, it allows the compiler to perform some optimisations static analysis more easily.

The main class of the compiler is net.sandius.rembulan.compiler.LuaCompiler in the rembulan-compiler module.

Basics

Every Lua chunk is compiled to one or more classes that extend LuaFunction, i.e., every resulting class corresponds to a Lua function in the source. Nested functions are treated analogously to static nested classes in Java.

Function upvalues are represented as fields of the type Variable; the number of upvalues a function has determines the form of its constructor.

Optimisations

Rembulan performs basic static analysis of the Lua programs in order to simplify the generated code, and eliminate unnecessary guards against possible metamethod calls. This is important since the coroutine implementation used in Rembulan requires Lua functions that perform calls to be ready to be paused, and generating such code makes the bytecode longer and makes its control flow graph more complex. This in turn makes the JVM less likely to optimise the JIT-ed bytecode.

What follows is a description of optimisation steps performed by the compiler, done in a loop until a fixed point is reached.

Note: CPU accounting is not influenced by the optimisation steps. An optimised version of a compiled function will still clock in the same number as a plain, unoptimised one.

Type analysis

In the majority of cases, whether a Lua operation op may involve a metamethod call depends on the types of the operands of op. (See §2.4 of the Lua Reference Manual.) For instance, the expression

x + y

may trigger a metamethod call, unless x and y are numbers or strings that can be converted to numbers. It then follows that if we know that both x and y are numbers, we can generate bytecode for numerical addition immediately, and we also know that the type of the expression (x + y) will be number.

Type analysis of course also works in conjunction with control flow analysis. Suppose we have the following snippet:

local x
if condition then
  x = a + b
else
  x = 0
end  

Here, the type of the value of the local variable x after the if statement will depend on the types of a and b (and the value of condition, see below). Let us look at the possible cases:

  • if a and b are integers (a subtype of number), then x will be an integer;
  • otherwise, if a and b are numbers, then x will be a number;
  • otherwise, x will be any (a supertype of all types).

In the first two cases, we do not have to emit the code that may need to handle metamethods (and, by extension, coroutine switches), so this code does not introduce an additional entry point. Additionally, in the first case, we even have enough information to keep a, b and x unboxed. (Note: this step is not done yet.)

Relevant class: net.sandius.rembulan.compiler.analysis.Typer

Branch inlining

Now, suppose that condition is always false or nil. In that case, we can skip the then-branch entirely, and know that the type of x after the if statement is integer (with the value 0).

Relevant class: net.sandius.rembulan.compiler.tf.BranchInliner

Const folding

Constant values are treated as literal types. I.e., constant propagation is treated as part of the type inference step.

In the example above, if we know statically the values of a and b, we can compute the result of (a + b) at compile-time, and use it as the type of the expression. The actual "addition" can then be safely removed from the IR representation of the function. (This is only true for operations that do not have any side-effects.)

Relevant class: net.sandius.rembulan.compiler.tf.ConstFolder

Dead code elimination

Suppose we execute the snippet above, but never use the value of x. In that case, we do not need to keep x around, and if the expression (a + b) cannot trigger a metamethod call, we do not even need to evaluate it. This is done by computing the liveness of all variables (and temporary values), and pruning out dead code based on this information.

Relevant classes:

Code generation

Once a function has been optimised, it is compiled into Java bytecode. This involves several steps, and takes into account the type and liveness information computed during optimisation passes.

Slot allocation

Slot allocation uses the liveness information to come up with a mapping from variables and temporary values to local variables in the resulting bytecode, similar to registers in PUC-Lua. The slots are untyped (i.e, their type is java.lang.Object) in order to enable slot sharing. As a consequence, values that have equivalent primitive JVM types (i.e., Lua integers are longs, Lua floats are doubles, and Lua booleans are booleans) need to be boxed.

Note: this is likely to change in future versions, since boxing is a significant performance drag, and can in fact be removed rather easily at this point.

Relevant class: net.sandius.rembulan.compiler.analysis.SlotAllocator

Code generation

Code generation takes the slot information along with type information, and walks the IR graph, generating Java bytecode in the process. Long methods may need to be segmented into multiple nested methods (to avoid the 64kB limit on the size of a method in Java bytecode), wires up the methods so as to implement LuaFunction, and generates the resulting Java classfile as a byte array.

This step uses the ASM framework.

Relevant classes: