Skip to content

Tutorial: Introduction

Adriaan Moors edited this page Oct 10, 2011 · 2 revisions

The embedding of domain specific languages in Scala is based on a simple principle: the domain program looks like it's written in its own language with its own syntax, but the domain program is actually just a plain Scala program, where method calls are given a special meaning. Applying this principle naively, you end up with a "shallow embedding": DSL programs are just thinly veiled Scala programs. We refine this so that DSL programs can be analyzed (and thus optimized) by the DSL implementation. The latter feature is usually only offered by a "deep embedding", which typically has a higher implementation cost. We aim to shallowly embed our domain programs and optimize them too.

The shallow embedding is enabled by Scala's flexible syntax. The deep embedding relies on lightweight modular staging, which uses type information to drive staging. For the purpose of the tutorial, it suffices to think of staging as "delaying execution" of programs by turning them into a representation that can first be analyzed and then executed. This representation is usually structured as an abstract syntax tree. Overall, for reasons explained later, we'll distinguish two kinds of AST's: expressions and statements.

Core Infrastructure: Expressions

The CoreExps trait provides the core functionality for representing expressions. It defines the super trait Exp[T] for expressions that evaluate to a value of type T, and it provides a way of automatically lifting (staging-time) constants into their (trivial) representation, Const[T].

trait CoreExps {
  trait Exp[T]

  case class Const[T](x: T) extends Exp[T]
  implicit def liftString(x: String): Exp[String] = Const(x)
}

It's really as simple as that: when a DSL program contains the string "hello", all we need to know in order to represent it, is its literal value, "hello", which is captured quite readily by the expression tree Const("hello").

The tutorials will extend CoreExps with more interesting types.

Example: Parser Combinators

(If you're not familiar with how parser combinators work, you may skip this section. It does not introduce anything new. It simply aims to ground the abstract concepts explained above in a concrete example that exists outside virtualized Scala.)

A similar principle is used in the parser combinator library: you write a | b to express the alternative of the production a or b, which looks like BNF, but it's also a valid Scala expression. More precisely, it's syntactic sugar for the method call a.|(b), where | is a method defined in the Parser class. Thus, it's BNF that's shallowly embedded into Scala.

It's important to note that the | method call does not do any parsing. It simply builds the description of a parser: this achieves the same thing as a deep embedding. This representation of a parser can later be applied to input.

We say that the domain program (a | b), which looks like it's written in a domain-specific language (BNF) with its own syntax, is represented by method calls in the host language, and these method calls are "intercepted" by the DSL to construct its internal representation of the domain program, or possibly to execute it directly, depending on the chosen implementation strategy.