Part 1: Writing and testing your first SQL program
In this section of the tutorial we will:
- Write and test our first SQL program using Feldera
- Introduce continuous analytics -- one of the key concepts behind Feldera
The use case
We will use Feldera to implement a real-time analytics pipeline for a supply chain management system. The pipeline ingests data about suppliers, customers, and orders, and maintains an up-to-date summary of this data in an OLAP database. The data can arrive from a variety of sources, such as databases and event streams. In this tutorial we will ingest data from Amazon S3 and Redpanda, a Kafka-compatible message queue.
Step 0. Launch Feldera
If you started Feldera using the demo profile as described in the Getting Started guide, it has created a couple of SQL programs and pipelines. One of these programs, "Feldera Basics Tutorial", and the associated pipeline, "Feldera Basics Tutorial Pipeline", are identical to the ones we will manually create in this three-part tutorial.
We recommend that you follow the tutorial and build your own Feldera pipeline, but if you'd like to accelerate the journey, feel free to skim the tutorial and explore the resulting pipeline without going through all the steps.
Step 1. Declare input tables
We start by modeling input data as SQL tables. In the Feldera Web Console,
navigate to the
SQL Programs section and click on
ADD SQL PROGRAM. Give
the program a name, e.g., "Supply Chain Analytics", and paste the following code
in the SQL editor:
```sql
create table VENDOR (
    id bigint not null primary key,
    name varchar,
    address varchar
);

create table PART (
    id bigint not null primary key,
    name varchar
);

create table PRICE (
    part bigint not null,
    vendor bigint not null,
    price decimal
);
```
This looks familiar: these are just plain old SQL
CREATE TABLE statements.
Indeed, SQL's data modeling language works for streaming
data just as well as for tables stored on the disk. No need to learn a new
language: if you know SQL, you already know streaming SQL!
It is worth pointing out that these declarations do not say anything
about the sources of data. Records for these tables
could arrive from a Kafka stream, a database, or an HTTP request. Below we will
see how our SQL program can be instantiated with any of these data sources, or
even multiple data sources connected to the same table.
Step 2. Write queries
We would like to compute the lowest price for each part across all vendors. Add the following statements to your SQL program:
```sql
-- Lowest available price for each part across all vendors.
create view LOW_PRICE (part, price) as
    select part, MIN(price) as price from PRICE group by part;

-- Lowest available price for each part along with part and vendor details.
create view PREFERRED_VENDOR (part_id, part_name, vendor_id, vendor_name, price) as
    select
        PART.id as part_id,
        PART.name as part_name,
        VENDOR.id as vendor_id,
        VENDOR.name as vendor_name,
        PRICE.price
    from
        PRICE,
        PART,
        VENDOR,
        LOW_PRICE
    where
        PRICE.price = LOW_PRICE.price AND
        PRICE.part = LOW_PRICE.part AND
        PART.id = PRICE.part AND
        VENDOR.id = PRICE.vendor;
```
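To make the semantics of these two views concrete, here is a small Python sketch that computes the same results as LOW_PRICE and PREFERRED_VENDOR over in-memory rows. The sample part, vendor, and price values are hypothetical, chosen only for illustration:

```python
# Rows mirroring the PART, VENDOR, and PRICE tables (hypothetical sample data).
parts = {1: "Flux Capacitor", 2: "Warp Core"}
vendors = {10: "Gravitech Dynamics", 20: "DarkMatter Devices"}
prices = [
    {"part": 1, "vendor": 10, "price": 10000},
    {"part": 1, "vendor": 20, "price": 11000},
    {"part": 2, "vendor": 10, "price": 30000},
    {"part": 2, "vendor": 20, "price": 12000},
]

# LOW_PRICE: select part, MIN(price) from PRICE group by part.
low_price = {}
for row in prices:
    p = row["part"]
    low_price[p] = min(low_price.get(p, row["price"]), row["price"])

# PREFERRED_VENDOR: join PRICE back against LOW_PRICE (and PART, VENDOR)
# to recover the part name and the details of the cheapest vendor.
preferred_vendor = [
    (row["part"], parts[row["part"]],
     row["vendor"], vendors[row["vendor"]], row["price"])
    for row in prices
    if row["price"] == low_price[row["part"]]
]
```

Note how PREFERRED_VENDOR needs the self-join against LOW_PRICE: the MIN aggregate alone loses track of *which* vendor offered the minimum, so we match prices back against the per-part minimum to recover the vendor.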
In Feldera we write queries as SQL views. Views can be defined in terms of
tables and other views, making it possible to express deeply nested queries. In
this example, the
PREFERRED_VENDOR view is expressed in terms of the LOW_PRICE view.
Step 3. Run the program
In order to run our SQL program, we must instantiate it as part of a pipeline.
Navigate to the
Pipelines section and click
ADD PIPELINE. Give the new
pipeline the name
supply_chain and select "Supply Chain Analytics" from the
list of SQL programs.
The selected program is visualized as a rectangle with a blue dot for each table and view declared in the program. These can be used to connect data sources and sinks. For the time being, we will run our pipeline without any sources or sinks. We will build more exciting pipelines in the next part of the tutorial.
Go back to the pipelines view (click on
Pipelines in the navigation bar
on the left). Your newly created pipeline should appear in the list. Click the
play icon next to the pipeline.
The pipeline is now running and is ready to process inputs; however, since we have not connected any data sources to it, no data has been received yet. Let us add some manually.
Step 4. Populate tables manually
Expand the runtime state of the pipeline by clicking the chevron icon on the left. You should see the list
of tables and views defined in your program. Click on the icon next to the
PART table. This will open the
BROWSE PART and
INSERT NEW ROWS tabs. The
BROWSE PART tab should be empty because no data has been inserted yet. Click
INSERT NEW ROWS, where you can insert new rows into the table using a
configurable random data generator (feel free to play around with it!)
or by entering the data manually. For example, you might add a few parts,
such as "Flux Capacitor" and "Warp Core". Click
INSERT ROWS to push the new rows to the table. Switch to the
BROWSE PART tab to see the contents of the table, which should now
contain the newly inserted rows.
Follow the same process to populate the VENDOR and PRICE tables. Then select the
PREFERRED_VENDOR view from the dropdown list to see the
output of the query:
Step 5. Make changes
Let us see what happens if we add more input rows to the PRICE table. Click
INSERT ROWS and switch back to the
PREFERRED_VENDOR view. The view has
changed: DarkMatter Devices is now the cheapest supplier of Warp Cores, while
Gravitech Dynamics offers the lowest-priced Flux Capacitors.
As you make more changes to the input tables, Feldera will keep updating the view.
This simple example illustrates the concept of continuous analytics. Feldera queries are always-on: the query engine continuously updates all views in response to input changes, making the most up-to-date results available at any time.
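The intuition behind always-on queries can be sketched in a few lines of Python. This is a toy illustration of incremental view maintenance, not Feldera's actual engine: rather than re-running the MIN aggregate over the whole PRICE table on every change, an incremental view keeps a small amount of state and updates its output as each new row arrives:

```python
# Toy incremental version of the LOW_PRICE view: the current minimum per part
# is kept as state and updated in O(1) per inserted row, instead of
# recomputing MIN over the entire table after every change.
class LowPriceView:
    def __init__(self):
        self.min_price = {}  # part id -> lowest price seen so far

    def insert(self, part, vendor, price):
        old = self.min_price.get(part)
        if old is None or price < old:
            self.min_price[part] = price  # the view's output changes
        # otherwise the output is unchanged and nothing is recomputed

view = LowPriceView()
view.insert(part=1, vendor=10, price=10000)
view.insert(part=1, vendor=20, price=9000)   # new minimum for part 1
view.insert(part=1, vendor=30, price=15000)  # no change to the view
```

Handling deletions incrementally requires more state (for example, a per-part multiset of prices, so the next-lowest price can be found when the minimum is removed), which hints at why a dedicated incremental query engine is useful.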
The Web Console does not yet support deleting records. Use the REST API described in the next part of the tutorial instead.
Step 6. Stop the pipeline
Click the stop icon to shut down the pipeline.
All pipeline state will be lost.
Let us recap what we have learned so far:
- Feldera executes programs written in standard SQL.
- CREATE TABLE statements define a schema for input data.
- CREATE VIEW statements define queries over input tables and other views.
- A SQL program is instantiated as part of a pipeline.
- Feldera evaluates queries continuously, updating their results as input data changes.