In this article we describe in detail how the Calcite SQL compiler framework optimizes constant expressions during compilation.
We assume that the reader has seen our prior blog post with the description of Calcite program representations.
In the previous blog post we saw that the Calcite compilation process is organized in several stages, each of which uses a different program representation. In this article we will add several more layers to this diagram.
We start with a very simple SQL program and show how it is transformed as it traverses the various layers of the software stack.
Writing a test case
Our input query is very simple:
SELECT 1 + 2 * 3
We add the following test case in the RelOptRulesTest.java
file, which is a file containing test cases for the Calcite program rewriting optimization rules:
@Test void testConstantFolding() {
final String sql = "SELECT 1 + 2 * 3";
sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
.check();
}
In order to execute this test successfully, the expected result of the test needs to be added to an xml resource file. If you try to execute the test without adding this result, Calcite will give you an error:
Actual and reference files differ. If you are adding new tests, replace the reference file with the current actual file, after checking its content.
diff calcite/core/build/diffrepo/test/org/apache/calcite/test/RelOptRulesTest_actual.xml calcite/core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
The following xml fragment needs to be added in the right place in the RelOptRulesTest.xml
file (tests are sorted alphabetically):
<TestCase name="testCF">
<Resource name="sql">
<![CDATA[SELECT 1 + 2 * 3]]>
</Resource>
<Resource name="planBefore">
<![CDATA[
LogicalProject(EXPR$0=[+(1, *(2, 3))])
LogicalValues(tuples=[[{ 0 }]])
]]>
</Resource>
<Resource name="planAfter">
<![CDATA[
LogicalProject(EXPR$0=[7])
LogicalValues(tuples=[[{ 0 }]])
]]>
</Resource>
</TestCase>
You can guess from this xml snippet that the program optimizer will convert the original program to the following (shown in planAfter
):
SELECT 7
The operation of evaluating the expression 1 + 2 * 3
, where all the values are constant, to the result 7
is an optimization called Constant Folding in the compiler literature.
Before optimizations
You will notice that this is a SQL program that contains a SELECT
statement without a corresponding FROM
. What is the meaning of such programs? The SQL standard explains that this program implicitly assumes that the FROM
clause contains a table with a single row. So another way to write this program would be:
SELECT 1 + 2 * 3
FROM (VALUES (0)) AS T
As described in our previous blog post, Calcite starts by parsing the program and generating a representation using SqlNode
data structures. This representation is subjected to validation and type inference, and then converted to a relational representation, using RelNode
objects to describe relations and RexNode
to describe expressions. And indeed, because there is no FROM
, Calcite will insert a VALUES
statement in the FROM
clause. The result of this transformation is a tree of objects with the following structure:
LogicalProject(expr0=[+(1, *(2, 3)])
LogicalValues(tuples =[[{ 0 }]]
We can inspect this object using the Java debugger by inserting a breakpoint in HepPlanner.setRoot
and inspecting the rel
argument.
LogicalProject
is the internal representation of a SELECT
statement. It contains an array of arguments, one for each column of the result. In this case there is a single column (named expr0
in the previous example). The actual value for each column is an expression, represented using a tree of RexNode
objects. In our case the expression will have the following shape (where we omit some of the irrelevant fields):
RexCall +(1, *(2, 3))
|_op: SqlMonotonicBinaryOperator "+"
| |_kind: PLUS
| |_type: INTEGER
|_operands: List
|_0: RexLiteral "1"
| |_value: 1
| |_type: INTEGER
|_1: RexCall *(2, 3)
|_op: SqlMonotonicBinaryOperator "*"
| |_kind: TIMES
| |_type: INTEGER
|_operands: List
|_0: RexLiteral: "2"
| |_type: INTEGER
| |_value: 2
|_1: RexLiteral: "3"
|_type: INTEGER
|_value: 3
Notice how the compiler has correctly synthesized a tree which reflects the higher priority of multiplication compared with addition, where the operation 2*3 is computed first. Also, the type inference engine has assigned a SQL INTEGER
type to every one of these expressions.
Finding constant expressions
There are several optimization rules for folding constant expressions in Calcite; they are named in CoreRules
:
PROJECT_REDUCE_EXPRESSIONS
FILTER_REDUCE_EXPRESSIONS
CALC_REDUCE_EXPRESSIONS
WINDOW_REDUCE_EXPRESSIONS
JOIN_REDUCE_EXPRESSIONS
All these rules essentially do the same kind of transformation: they look for RexCall
nodes where all input arguments are constant, and replace them with a single RexLiteral
value if possible; each of the 5 rules looks for these nodes in different kinds of relational operators. Since our query has only a Project
node, we will use the PROJECT_REDUCE_EXPRESSIONS
rule.
In textbooks constant folding is an iterative optimization: it finds a deterministic operation where all inputs are constants, then it evaluates it, replaces the expression with the result, and then it repeats the process until no changes occur. (The operators involved have to be deterministic; expressions such as the NOW()
or RAND()
cannot be evaluated at compilation time, despite having no arguments.)
In other compilers the expression in our example would be optimized in 2 stages:
1 + (2 * 3) -> 1 + 6 -> 7
Calcite does things differently: it actually tries to locate a maximal expression tree where all leaves are constants, and all operators applied in tree nodes are deterministic. This is done by a visitor, called ReducibleExprLocator
In general, a single expression tree may contain multiple constant sub-expressions: e.g., in the expression a + 1 * 2 + b + 3 * 4 there are two constant sub-expressions: (1 * 2) and (3 * 4). So the visitor which looks for constant expressions will actually produce a list of expressions that have to be reduced.
If you add a breakpoint on the method ReduceExpressionsRule.ReducibleExprLocator.addResult
you will notice that it will be called with the entire expression tree 1 + (2*3) as an argument.
Once the constant expressions have been collected, an attempt is made to evaluate them. That is done in the method ReduceExpressionsRule.reduceExpressionsInternal
:
final List<RexNode> reducedValues = new ArrayList<>();
executor.reduce(simplify.rexBuilder, constExps2, reducedValues);
This code produces for each input expression in the list constExps2
one corresponding result in the reducedValues
list. Another visitor called RexReplacer
is afterwards used to substitute the results in the original tree.
Dealing with failures
The actual implementation of the reduce
method is part of the RexExecutorImpl
class. This method performs the work in several stages:
- it compiles the constant expressions into a... Java program. We will describe this step in detail below
- it compiles the java program into a class file
- it loads the class file and executes a method from the class file
- the result of this method is a list of Java values, one for each constant tree evaluated; each result is then wrapped into a
RexLiteral
and inserted back replacing the original tree
The list of literals is the actual result of the constant folding process.
This process can fail in multiple ways:
- the compilation to Java can fail, e.g., because someone has added some new data types to the SQL front-end but forgot to add them to the part of the compiler which generates Java code
- compilation of the Java code can fail because the Java generated code is invalid, due to bugs in the
RexNode
-> Java compiler - finally, the execution of the Java code itself may throw an exception
Consider this SQL program:
SELECT CASE
WHEN x IS NULL 1/0
ELSE x
END
This program contains a constant expression, 1/0
. However, this expression cannot be evaluated at compilation time, since the result is... an exception (in most SQL dialects; in SQLite the result of division by zero is actually NULL
). However, it does not mean that this program should be rejected by the compiler: if x
is never NULL
, this division may never be executed.
When attempting to optimize such an expression the Calcite optimizer will actually produce an exception when evaluating the Java code which performs division by 0. Calcite will catch the exception, and will return the original expression unmodified. This will ensure that the division by zero is only executed at runtime.
Compiling and executing Java code
How does Calcite execute Java code that is generated while it runs?
First we discuss steps 2, 3, and 4 above. You can actually inspect the Java code produced by setting a breakpoint after this line from RexExecutorImpl.reduce
:
String code = compile(rexBuilder, constExps, (list, index, storageType) -> {
throw new UnsupportedOperationException();
});
You can see that the value of the code
variable for our example is:
public Object[] apply(Object root0) {
return new Object[] {
1 + 2 * 3};
}
You can see that this code indeed contains a method that returns an array with 1 object. There is 1 object because we are folding 1 expression tree.
Calcite uses the Janino embedded Java compiler. Janino is a very fast and compact Java compiler, written in Java, which performs several operations:
- it parses and validates Java source code (it only accepts a subset of Java, but this subset is rich enough for Calcite's needs)
- it compiles code into Java bytecode
- it loads the generated code dynamically in the current running process
These features enable Calcite to generate code for a new function in Java, and then to execute the function using the existing Java virtual machine to evaluate constant expressions.
In step 4, after evaluating the generated code, Calcite obtains the result, the value 7, stored in a Java Integer
. However, Calcite needs to convert this value into a RexLiteral
, which is the internal compiler representation for a constant value. Fortunately, this is easy to do, since Calcite already knows the type (including nullability) of the result in SQL (INTEGER
), and this is all that is needed to create a RexLiteral
.
Compiling RexNode
trees to Java source code
We now return to step 1: translating the RexNode
tree into a Java source code string. Based on the Java code example produced above, you may think that this is a simple process, but it is perhaps the most involved part of the entire optimization process.
For our running example, the input is the expression tree we have shown above:
RexCall +(1, *(2, 3))
|_op: SqlMonotonicBinaryOperator "+"
| |_kind: PLUS
|_operands: List
|_0: RexLiteral "1"
| |_ value: 1
|_1: RelCall *(2, 3)
|_op: SqlMonotonicBinaryOperator "*"
| |_kind: TIMES
|_operands: List
|_0: RexLiteral: "2"
| |_value: 2
|_1: RexLiteral: "3"
|_value: 3
The translation is actually done in three stages:
Stage 1: each RexNode
expression tree is translated to a simpler form by a visitor, which is similar to three-address code used in traditional compilers; this representation is essentially a list of statements of the form "variable = expression"; expressions are still represented as RexNode
objects.
The result of this translation looks like this:
$t0 = RexLiteral "1"
$t1 = RexLiteral "2"
$t2 = RexLiteral "3"
$t3 = RexCall "*"
|_operands
|_0: RexLocalRef: "$t1"
|_1: RexLocalRef: "$t2"
$t4 = RexCall "+"
|_operands
|_0: RexLocalRef: "$t0"
|_1: RexLocalRef: "$t3"
This data structure is stored in a RexProgram
; you can inspect this program in one of the RexExecutorImpl.compile
methods.
Stage 2: The RexProgram
is then translated into yet another intermediate representation, called linq4j
, which we will describe in detail below. This translation is performed by the RexToLixTranslator
visitor, using the RexToLixTranslator.translateProjects
method.
Stage 3: finally, the linq4j
representation is translated to Java source code
LINQ: A short (personal) history
In 2008 Microsoft introduced a set of libraries for the C# programming language, under the name LINQ, from Language Integrated Query. The design of LINQ is based on an academic paper: The essence of Data Access in Comega, Gavin Bierman, Erik Meijer, and Wolfram Schulte, ECOOP 2005.
LINQ is essentially a language for computing on collections (like SQL), but embedded in C#. C# LINQ has a very elegant design. I personally had the fortune of been involved the DryadLINQ, project, which built a large-scale computation engine programmed using LINQ, which predates Apache Spark by several years, but has similar capabilities. The system is described in DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language, Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey, OSDI 2008. Unfortunately, the DryadLINQ product (called HPC LINQ) was canceled by Microsoft in 2011, after a successful beta release.
Inspired by the design of DryadLINQ, our next-door colleagues built another system programmed in LINQ: Naiad, described in the paper Naiad: a timely dataflow system, Derek Gordon Murray, Frank McSherry, Rebecca Isaacs, Michael Isard, Paul Barham, Martín Abadi, SOSP 2013. This paper introduces a form of streaming incremental computation based on Z-sets, which inspired heavily Feldera's dataflow model called DBSP, which we have discussed in many other blog posts.
Finally, linq4j
is a Calcite sub-project which reimplements the C# LINQ APIs in Java -- hence the 4j name. It has been a part of Calcite since at least 2010.
linq4j
The linq4j
project is an important part of Calcite, since it is essentially the core implementation of the Calcite runtime. This may sound strange, because Calcite is, after all, a compiler. What runtime are we talking about? Turns out that Calcite also contains a runtime, which has multiple uses:
- It is used for testing, because it can actually execute programs computing on collections end-to-end on real data
- It is used for constant-folding, as described in this article
- It is used as a production runtime in several query engines based on Calcite
linq4j has 3 layers:
- An
Enumerable
layer, which implements all core operators on collections (selection, projection, join, union, etc.). Collections are accessed through an iterator API, which lazily "pulls" data from inputs, without necessarily materializing them. All classes in this layer derive from theEnumerable
interface. - A
Queryable
layer, which contains compiler representations of execution plans that comprise Enumerable operators; all objects of this layer derive from theQueryable
interface. This layer is similar to theRelNode
layer, but the representation is targeted towards Java programs (not SQL) that rely onEnumerable
APIs to compute on relational data. - An expression representation layer, which can be used to represent
computations on primitive values. There are two kinds of classes in
this layer: expressions, which derive from theExpression
interface, and statements, which derive from theStatement
interface. This layer is similar to theRexNode
layer, but the representation is targeted towards Java programs (not SQL).
For the problem that we set to solve in this article, constant folding of scalar expressions, only the Expression
and Statement
layers from linq4j
are used.
Converting a linq4j
expression into a Java program is actually very easy: the toString()
method of these objects produces Java directly, so no visitors are required!
We hope to be able to write in the future another article about how the linq4j
layer is used for program testing in Calcite. But this article is long enough as it is.
To summarize, here is a diagram which shows the translation layers used by the constant folding optimizer of Calcite:

Closing observations
Using the runtime for constant folding is an unusual compiler design. Many other compilers use set of rules separate from the runtime for constant folding. An important benefit of the Calcite design is that there is only ONE implementation of the basic computation rules -- in the runtime. Otherwise one would have to make sure that the optimization rules in the compiler produce the same results as the runtime itself; having TWO sets of rules doing the same thing makes code maintenance more difficult.
A second observation concerns the semantics of arithmetic in SQL and Java. Notice that in our example the operation 2 * 3 in SQL was converted into the operation 2 * 3 in Java. However, in Java arithmetic on integers will wrap-around on overflow, whereas in many SQL dialects arithmetic overflow will produce an exception. Amazingly enough, the SQL standard does not say what should happen on overflow... In the Feldera engine arithmetic always produces an exception on overflow; so during compilation the Feldera compiler replaces the SQL +
with another operator (CHECKED_PLUS
), which will cause the generated code to contain the equivalent of a Java Math.addExact
operation. This illustrates how flexible the Calcite toolchain is.
I hope you now find Calcite a bit less intimidating!