Skip to content
phaller edited this page Oct 10, 2011 · 8 revisions

Generating SQL

Using the embedding of basic expressions (explained in the introduction) we are now going to show how to generate SQL from a small query DSL embedded in Scala. For this, we create a SQLExps trait that extends the CoreExps trait that you have seen before. This trait adds a bunch of case classes representing elements in the AST of an SQL expression:

trait SQLExps extends CoreExps {
  case class ResultRow[T](fields: Map[String, Exp[_]]) extends Exp[T] 

  case class Select[T, U](tgt: Exp[U], field: String) extends Exp[T]

  case class Table[Tuple <: Result](name: String) extends Exp[List[Tuple]]

  case class ListSelect[Tuple <: Result, T](t: Exp[List[Tuple]], f: Exp[Tuple] => Exp[T]) extends Exp[List[Tuple]] {
    def projectedNames: Iterable[String] = f(null) match {
      case ResultRow(fields) => fields.keys
    }
  }

  ...
}

An expression object of type ResultRow describes the shape of results returned by an SQL Select clause. It contains a mapping from column names to expression trees which describe how each column value of the result is computed. The Select case class represents expressions that select a field on a target tuple, e.g., item.customerName. To represent data base tables we use the Table case class. It takes a type parameter Tuple which is the type of tuples stored in the table. Tuple extends the Result class which we introduce below. Basically, Result is used to create result tuples in queries. The Table class extends Exp[List[Tuple]], i.e., it represents an expression returning a list of Tuple elements. This allows us to treat a table literal (which only contains the name of the corresponding data base table) as a list of tuples, for which the standard Select clauses are defined.

To represent Select clauses we use the ListSelect case class. Like Table, it extends Exp[List[Tuple]] where Tuple is a type parameter for the type of the returned tuples. A ListSelect expression node points to a list-valued expression of type Exp[List[Tuple]] which is the list that we are selecting elements from, and an expression of type Exp[Tuple] => Exp[T], which is the selector function used for determining (a) how to select elements from the list, and (b) how to transform a selected element to a result tuple. For example, the following DSL expression is represented as a ListSelect node:

items Select (e => new Result { val customerName = e.customerName })

In this case, items is lifted to an expression tree of type Exp[List[Tuple]], and the function literal is lifted to an expression tree of type Exp[Tuple] => Exp[T]. This means the selector function can use all of the fields defined in the Tuple type to select columns to be included in the result.

Finally, we add an implicit conversion which lifts generic lists into expression trees:

implicit def liftList[T](x: List[T]): Exp[List[T]] = Const(x)

Let's define the methods that are concerned with the actual embedding of query expressions in Scala. For this, we'll create a sub trait of SQLExps called EmbedSQL:

trait EmbedSQL extends SQLExps {
  ...
}

The first method we'll add provides Select clauses on lists of tuples:

implicit def listSelectOps[Tuple <: Result](l: Exp[List[Tuple]]) = new {
  def Select[T](f: Exp[Tuple] => Exp[T]): Exp[List[Tuple]] =
    ListSelect(l, f)
}

It simply creates a ListSelect node which points to the expression tree for the list that we are selecting from, as well as the selector function used for projecting out the fields that we are interested in. The result is again an expression node of type Exp[List[Tuple]].

As we have seen in the previous example, result tuples are created using new Result { val column_1 = ... ; ... ; val column_n = ... }. To represent such an expression in the AST of our DSL, we therefore need to lift instance creation using new. In Virtualized Scala this is done in two steps. First, we need to introduce the Result type as a marker class in our intermediate representation. For this, we declare the following class inside our SQLExps trait:

// marker to trigger __new reification
class Result extends Row[Exp]

The Row type is built-in to the virtualization part of the compiler; the class declaration basically says that creating "row type" instances of Result (using new Result) should be lifted. A "row type" is a type that basically only contains a list of fields, which is exactly what we need to represent result tuples.

The second step required for lifting instance creation is to define the __new method which should create an expression tree. Since this method is concerned with the embedding we define it in the EmbedSQL trait:

def __new[T](args: (String, Exp[T] => Exp[_])*): Exp[T] =
  new ResultRow(args map {case (n, rhs) => (n, rhs(null))} toMap)

The __new method takes a variable number of pairs as arguments. Each pair contains the name of a field in our row type, as well as a function which creates an expression tree for the field initializer when applied to an expression representing the "self" instance. In our case, we simply create an instance of ResultRow, using the arguments to fill its map. Note that to create the expression trees which are the field initializers, we pass null to each function in an argument pair, since we are not supporting the use of "self" inside result tuples anyway.

To support projections in queries, we need to be able to select fields of tuples. Therefore, we have to have a way to create expression trees for field selections. In Virtualized Scala we can lift such selections by defining the special selectDynamic method for the types of objects on which we would like to select fields. We can provide this method through the following implicit conversion, (which we add to our EmbedSQL trait):

implicit def selectOps(self: Exp[_ <: Result]) = new {
  def selectDynamic[T](n: String): Exp[T] = Select(self, n)
}

The above implicit conversion adds our own selectDynamic method to all instances of type Exp[_ <: Result], i.e., expression trees whose result type is a sub type of Result. Recall that Result is the marker class used for lifting instance creation. Also, since Tuple is a sub type of Result, this implicit will also provide the selectDynamic method to all expression trees with result type Tuple. This is precisely the type of objects for which we would like to lift field selections. The Virtualized Scala compiler will then invoke the selectDynamic method as defined above whenever a field is selected on a tuple. Our embedding then creates a Select node which contains the expression tree of the target of the selection (self) and the name of the selected field (n).

Let's create an example query that only uses the DSL elements introduced so far:

object Example extends EmbedSQL with SQLCodeGen {
  type Tuple = Result {
    val itemName: String
    val customerName: String
  }

  def prog = {
    val items = Table[Tuple]("items")
    items Select (e => new Result { val customerName = e.customerName })
 }
}

Compiling the Example object and running its prog method should create an expression tree corresponding to the following SQL query:

SELECT customerName FROM items

Let's turn the expression trees into SQL. This is straight-forward using methods that traverse and print our expression trees (PrintWriter is imported from the java.io package):

trait SQLCodeGen extends SQLExps {
  def emitPlain(out: PrintWriter, s: String, more: Boolean = false) = {
    if (more) out.print(s) else out.println(s)
  }

  def emitExpr[T](out: PrintWriter, expr: Exp[T]): Unit = expr match {
    case Select(_, field) => emitPlain(out, field, true)
  }
  
  def emitSelector[T, S](out: PrintWriter, f: Exp[T] => Exp[S]): Unit = f(null) match {
    case ResultRow(fields) =>
      var first = true
      for ((name, value) <- fields) {
        if (first) { first = false } else emitPlain(out, ", ", true)
        emitExpr(out, value)
      }
  }
  
  def emitQuery[T](out: PrintWriter, expr: Exp[T]): Unit = expr match {
    case Table(name) =>
      emitPlain(out, name, true)
    case ListSelect(table, selector) =>
      emitPlain(out, "SELECT ", true)
      emitSelector(out, selector)
      emitPlain(out, " FROM ", true)
      emitQuery(out, table)
      emitPlain(out, "")
  }

  ...
}

Let's run our small example query against a real data base! For this, we'll add a runQuery method to our SQLCodeGen trait. The runQuery method takes a JDBC connection object as an argument:

def runQuery[T](con: Connection, expr: Exp[T]) {
  val writer = new StringWriter
  val printer = new PrintWriter(writer)
  emitQuery(printer, expr)
  val query = writer.toString()
  val sta = con.createStatement()
  try {
    val res = sta.executeQuery(query)
    res.next()
    val s = res.getString(1)
    println("Result: " + s)
  } catch {
    case e: Exception =>
      ...
  }
}

We can set up an H2 in-memory data base for testing our example:

  Class.forName("org.h2.Driver")
  val con = DriverManager.getConnection("jdbc:h2:mem:testdb", "sa", "")
  val createTableStmt = """
CREATE TABLE Items
(
CustomerName varchar(255),
ItemName varchar(255)
)
"""
  val insertStmt = """
INSERT INTO Items
VALUES ('Typesafe', 'Chair')
"""
  val res = con.createStatement().execute(createTableStmt)
  val res2 = con.createStatement().execute(insertStmt)
  Example.runQuery(con, Example.prog)
  con.close()

Running our query against this data base should result in the following output:

Result: Typesafe