Constant folding in Calcite

In this article we describe in detail how the Calcite SQL compiler framework optimizes constant expressions during compilation.

We assume that the reader has seen our prior blog post with the description of Calcite program representations.

In the previous blog post we saw that the Calcite compilation process is organized in several stages, each of which uses a different program representation. In this article we will add several more layers to this diagram.

We start with a very simple SQL program and show how it is transformed as it traverses the various layers of the software stack.

Writing a test case

Our input query is very simple:

SELECT 1 + 2 * 3

We add the following test case in the RelOptRulesTest.java file, which is a file containing test cases for the Calcite program rewriting optimization rules:

@Test void testConstantFolding() {
   final String sql = "SELECT 1 + 2 * 3";
   sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
     .check();
}

In order to execute this test successfully, the expected result of the test needs to be added to an xml resource file. If you try to execute the test without adding this result, Calcite will give you an error:

Actual and reference files differ. If you are adding new tests, replace the reference file with the current actual file, after checking its content. diff calcite/core/build/diffrepo/test/org/apache/calcite/test/RelOptRulesTest_actual.xml calcite/core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml

The following xml fragment needs to be added in the right place in the RelOptRulesTest.xml file (tests are sorted alphabetically):

<TestCase name="testCF">
  <Resource name="sql">
    <![CDATA[SELECT 1 + 2 * 3]]>
  </Resource>
  <Resource name="planBefore">
  <![CDATA[
LogicalProject(EXPR$0=[+(1, *(2, 3))])
  LogicalValues(tuples=[[{ 0 }]])
]]>
  </Resource>
  <Resource name="planAfter">
    <![CDATA[
LogicalProject(EXPR$0=[7])
  LogicalValues(tuples=[[{ 0 }]])
]]>
  </Resource>
</TestCase>

You can guess from this xml snippet that the program optimizer will convert the original program to the following (shown in planAfter):

SELECT 7

The operation of evaluating the expression 1 + 2 * 3, where all the values are constant, to the result 7 is an optimization called Constant Folding in the compiler literature.

Before optimizations

You will notice that this is a SQL program that contains a SELECT statement without a corresponding FROM. What is the meaning of such programs? The SQL standard explains that this program implicitly assumes that the FROM clause contains a table with a single row. So another way to write this program would be:

SELECT 1 + 2 * 3
FROM (VALUES (0)) AS T

As described in our previous blog post, Calcite starts by parsing the program and generating a representation using SqlNode data structures. This representation is subjected to validation and type inference, and then converted to a relational representation, using RelNode objects to describe relations and RexNode to describe expressions. And indeed, because there is no FROM, Calcite will insert a VALUES statement in the FROM clause. The result of this transformation is a tree of objects with the following structure:

LogicalProject(expr0=[+(1, *(2, 3)])
  LogicalValues(tuples =[[{ 0 }]]

We can inspect this object using the Java debugger by inserting a breakpoint in HepPlanner.setRoot and inspecting the rel argument.

LogicalProject is the internal representation of a SELECT statement. It contains an array of arguments, one for each column of the result. In this case there is a single column (named expr0 in the previous example). The actual value for each column is an expression, represented using a tree of RexNode objects. In our case the expression will have the following shape (where we omit some of the irrelevant fields):

RexCall +(1, *(2, 3))
|_op: SqlMonotonicBinaryOperator "+"
| |_kind: PLUS
| |_type: INTEGER
|_operands: List
  |_0: RexLiteral "1"
  | |_value: 1
  | |_type: INTEGER
  |_1: RexCall *(2, 3)
    |_op: SqlMonotonicBinaryOperator "*"
    | |_kind: TIMES
    | |_type: INTEGER
    |_operands: List
      |_0: RexLiteral: "2"
      | |_type: INTEGER
      | |_value: 2
      |_1: RexLiteral: "3"
        |_type: INTEGER
        |_value: 3

Notice how the compiler has correctly synthesized a tree which reflects the higher priority of multiplication compared with addition, where the operation 2*3 is computed first. Also, the type inference engine has assigned a SQL INTEGER type to every one of these expressions.

Finding constant expressions

There are several optimization rules for folding constant expressions in Calcite; they are named in CoreRules:

PROJECT_REDUCE_EXPRESSIONS FILTER_REDUCE_EXPRESSIONS CALC_REDUCE_EXPRESSIONS WINDOW_REDUCE_EXPRESSIONS JOIN_REDUCE_EXPRESSIONS

All these rules essentially do the same kind of transformation: they look for RexCall nodes where all input arguments are constant, and replace them with a single RexLiteral value if possible; each of the 5 rules looks for these nodes in different kinds of relational operators. Since our query has only a Project node, we will use the PROJECT_REDUCE_EXPRESSIONS rule.

In textbooks constant folding is an iterative optimization: it finds a deterministic operation where all inputs are constants, then it evaluates it, replaces the expression with the result, and then it repeats the process until no changes occur. (The operators involved have to be deterministic; expressions such as the NOW() or RAND() cannot be evaluated at compilation time, despite having no arguments.)

In other compilers the expression in our example would be optimized in 2 stages:

1 + (2 * 3) -> 1 + 6 -> 7

Calcite does things differently: it actually tries to locate a maximal expression tree where all leaves are constants, and all operators applied in tree nodes are deterministic. This is done by a visitor, called ReducibleExprLocator

In general, a single expression tree may contain multiple constant sub-expressions: e.g., in the expression a + 1 * 2 + b + 3 * 4 there are two constant sub-expressions: (1 * 2) and (3 * 4). So the visitor which looks for constant expressions will actually produce a list of expressions that have to be reduced.

If you add a breakpoint on the method ReduceExpressionsRule.ReducibleExprLocator.addResult you will notice that it will be called with the entire expression tree 1 + (2*3) as an argument.

Once the constant expressions have been collected, an attempt is made to evaluate them. That is done in the method ReduceExpressionsRule.reduceExpressionsInternal:

final List<RexNode> reducedValues = new ArrayList<>();
executor.reduce(simplify.rexBuilder, constExps2, reducedValues);

This code produces for each input expression in the list constExps2 one corresponding result in the reducedValues list. Another visitor called RexReplacer is afterwards used to substitute the results in the original tree.

Dealing with failures

The actual implementation of the reduce method is part of the RexExecutorImpl class. This method performs the work in several stages:

it compiles the constant expressions into a... Java program. We will describe this step in detail below
it compiles the java program into a class file
it loads the class file and executes a method from the class file
the result of this method is a list of Java values, one for each constant tree evaluated; each result is then wrapped into a RexLiteral and inserted back replacing the original tree

The list of literals is the actual result of the constant folding process.

This process can fail in multiple ways:

the compilation to Java can fail, e.g., because someone has added some new data types to the SQL front-end but forgot to add them to the part of the compiler which generates Java code
compilation of the Java code can fail because the Java generated code is invalid, due to bugs in the RexNode -> Java compiler
finally, the execution of the Java code itself may throw an exception

Consider this SQL program:

SELECT CASE
WHEN x IS NULL 1/0
ELSE x
END

This program contains a constant expression, 1/0. However, this expression cannot be evaluated at compilation time, since the result is... an exception (in most SQL dialects; in SQLite the result of division by zero is actually NULL). However, it does not mean that this program should be rejected by the compiler: if x is never NULL, this division may never be executed.

When attempting to optimize such an expression the Calcite optimizer will actually produce an exception when evaluating the Java code which performs division by 0. Calcite will catch the exception, and will return the original expression unmodified. This will ensure that the division by zero is only executed at runtime.

Compiling and executing Java code

How does Calcite execute Java code that is generated while it runs?

First we discuss steps 2, 3, and 4 above. You can actually inspect the Java code produced by setting a breakpoint after this line from RexExecutorImpl.reduce:

String code = compile(rexBuilder, constExps, (list, index, storageType) -> {
   throw new UnsupportedOperationException();
});

You can see that the value of the code variable for our example is:

public Object[] apply(Object root0) {
  return new Object[] {
    1 + 2 * 3};
}

You can see that this code indeed contains a method that returns an array with 1 object. There is 1 object because we are folding 1 expression tree.

Calcite uses the Janino embedded Java compiler. Janino is a very fast and compact Java compiler, written in Java, which performs several operations:

it parses and validates Java source code (it only accepts a subset of Java, but this subset is rich enough for Calcite's needs)
it compiles code into Java bytecode
it loads the generated code dynamically in the current running process

These features enable Calcite to generate code for a new function in Java, and then to execute the function using the existing Java virtual machine to evaluate constant expressions.

In step 4, after evaluating the generated code, Calcite obtains the result, the value 7, stored in a Java Integer. However, Calcite needs to convert this value into a RexLiteral, which is the internal compiler representation for a constant value. Fortunately, this is easy to do, since Calcite already knows the type (including nullability) of the result in SQL (INTEGER), and this is all that is needed to create a RexLiteral.

Compiling `RexNode` trees to Java source code

We now return to step 1: translating the RexNode tree into a Java source code string. Based on the Java code example produced above, you may think that this is a simple process, but it is perhaps the most involved part of the entire optimization process.

For our running example, the input is the expression tree we have shown above:

RexCall +(1, *(2, 3))
|_op: SqlMonotonicBinaryOperator "+"
| |_kind: PLUS
|_operands: List
  |_0: RexLiteral "1"
  | |_ value: 1
  |_1: RelCall *(2, 3)
       |_op: SqlMonotonicBinaryOperator "*"
       | |_kind: TIMES
       |_operands: List
          |_0: RexLiteral: "2"
          | |_value: 2
          |_1: RexLiteral: "3"
          |_value: 3

The translation is actually done in three stages:

Stage 1: each RexNode expression tree is translated to a simpler form by a visitor, which is similar to three-address code used in traditional compilers; this representation is essentially a list of statements of the form "variable = expression"; expressions are still represented as RexNode objects.

The result of this translation looks like this:

$t0 = RexLiteral "1"
$t1 = RexLiteral "2"
$t2 = RexLiteral "3"
$t3 = RexCall "*"
      |_operands
        |_0: RexLocalRef: "$t1"
        |_1: RexLocalRef: "$t2"
$t4 = RexCall "+"
      |_operands
        |_0: RexLocalRef: "$t0"
        |_1: RexLocalRef: "$t3"

This data structure is stored in a RexProgram; you can inspect this program in one of the RexExecutorImpl.compile methods.

Stage 2: The RexProgram is then translated into yet another intermediate representation, called linq4j, which we will describe in detail below. This translation is performed by the RexToLixTranslator visitor, using the RexToLixTranslator.translateProjects method.

Stage 3: finally, the linq4j representation is translated to Java source code

LINQ: A short (personal) history

In 2008 Microsoft introduced a set of libraries for the C# programming language, under the name LINQ, from Language Integrated Query. The design of LINQ is based on an academic paper: The essence of Data Access in Comega, Gavin Bierman, Erik Meijer, and Wolfram Schulte, ECOOP 2005.

LINQ is essentially a language for computing on collections (like SQL), but embedded in C#. C# LINQ has a very elegant design. I personally had the fortune of been involved the DryadLINQ, project, which built a large-scale computation engine programmed using LINQ, which predates Apache Spark by several years, but has similar capabilities. The system is described in DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language, Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey, OSDI 2008. Unfortunately, the DryadLINQ product (called HPC LINQ) was canceled by Microsoft in 2011, after a successful beta release.

Inspired by the design of DryadLINQ, our next-door colleagues built another system programmed in LINQ: Naiad, described in the paper Naiad: a timely dataflow system, Derek Gordon Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, Martín Abadi, SOSP 2013. This paper introduces a form of streaming incremental computation based on Z-sets, which inspired heavily Feldera's dataflow model called DBSP, which we have discussed in many other blog posts.

Finally, linq4j is a Calcite sub-project which reimplements the C# LINQ APIs in Java -- hence the 4j name. It has been a part of Calcite since at least 2010.

`linq4j`

The linq4j project is an important part of Calcite, since it is essentially the core implementation of the Calcite runtime. This may sound strange, because Calcite is, after all, a compiler. What runtime are we talking about? Turns out that Calcite also contains a runtime, which has multiple uses:

It is used for testing, because it can actually execute programs computing on collections end-to-end on real data
It is used for constant-folding, as described in this article
It is used as a production runtime in several query engines based on Calcite

linq4j has 3 layers:

An Enumerable layer, which implements all core operators on collections (selection, projection, join, union, etc.). Collections are accessed through an iterator API, which lazily "pulls" data from inputs, without necessarily materializing them. All classes in this layer derive from the Enumerable interface.
A Queryable layer, which contains compiler representations of execution plans that comprise Enumerable operators; all objects of this layer derive from the Queryable interface. This layer is similar to the RelNode layer, but the representation is targeted towards Java programs (not SQL) that rely on Enumerable APIs to compute on relational data.
An expression representation layer, which can be used to represent
computations on primitive values. There are two kinds of classes in
this layer: expressions, which derive from the Expression
interface, and statements, which derive from the Statement interface. This layer is similar to the RexNode layer, but the representation is targeted towards Java programs (not SQL).

For the problem that we set to solve in this article, constant folding of scalar expressions, only the Expression and Statement layers from linq4j are used.

Converting a linq4j expression into a Java program is actually very easy: the toString() method of these objects produces Java directly, so no visitors are required!

We hope to be able to write in the future another article about how the linq4j layer is used for program testing in Calcite. But this article is long enough as it is.

To summarize, here is a diagram which shows the translation layers used by the constant folding optimizer of Calcite:

Closing observations

Using the runtime for constant folding is an unusual compiler design. Many other compilers use set of rules separate from the runtime for constant folding. An important benefit of the Calcite design is that there is only ONE implementation of the basic computation rules -- in the runtime. Otherwise one would have to make sure that the optimization rules in the compiler produce the same results as the runtime itself; having TWO sets of rules doing the same thing makes code maintenance more difficult.

A second observation concerns the semantics of arithmetic in SQL and Java. Notice that in our example the operation 2 * 3 in SQL was converted into the operation 2 * 3 in Java. However, in Java arithmetic on integers will wrap-around on overflow, whereas in many SQL dialects arithmetic overflow will produce an exception. Amazingly enough, the SQL standard does not say what should happen on overflow... In the Feldera engine arithmetic always produces an exception on overflow; so during compilation the Feldera compiler replaces the SQL + with another operator (CHECKED_PLUS), which will cause the generated code to contain the equivalent of a Java Math.addExact operation. This illustrates how flexible the Calcite toolchain is.

I hope you now find Calcite a bit less intimidating!

Constant folding in Calcite

Writing a test case

Before optimizations

Finding constant expressions

Dealing with failures

Compiling and executing Java code

Compiling `RexNode` trees to Java source code

LINQ: A short (personal) history

`linq4j`

Closing observations

Other articles you may like

Database computations on Z-sets

Implementing Batch Processes with Feldera

Feldera: three tools for the price of one

Writing a test case

Before optimizations

Finding constant expressions

Dealing with failures

Compiling and executing Java code

Compiling RexNode trees to Java source code

LINQ: A short (personal) history

linq4j

Closing observations

Other articles you may like

Database computations on Z-sets

Implementing Batch Processes with Feldera

Feldera: three tools for the price of one

Compiling `RexNode` trees to Java source code

`linq4j`