Chapter 6. EQL Reference

6.1. EQL Introduction

EQL statements are used to derive and aggregate information from one or more streams of events, and to join or merge event streams. This section outlines EQL syntax. It also outlines the built-in views, which are the building blocks for deriving and aggregating information from event streams.

EQL is similar to SQL in its use of the select clause and the where clause. Where EQL differs most from SQL is in the use of tables. EQL replaces tables with the concept of event streams.

EQL statements contain definitions of one or more views. Similar to tables in an SQL statement, views define the data available for querying and filtering. Some views represent windows over a stream of events. Other views derive statistics from event properties, group events or handle unique event property values. Views can be staggered onto each other to build a chain of views. The Esper engine makes sure that views are reused among EQL statements for efficiency.

The built-in set of views is:

  1. Views that represent moving event windows: win:length, win:length_batch, win:time, win:time_batch, win:ext_time, ext:sort_window

  2. Views for aggregation: std:unique, std:groupby, std:lastevent (note: the group-by clause and the std:groupby view are very similar in function, see view description for differences)

  3. Views that derive statistics: std:size, stat:uni, stat:linest, stat:correl, stat:weighted_avg, stat:multidim_stat

Esper can be extended by plugging-in custom developed views.

6.2. EQL Syntax

EQL queries are created and stored in the engine, and publish results as events are received by the engine or timer events occur that match the criteria specified in the query. Events can also be pulled from running EQL queries.

The select clause in an EQL query specifies the event properties or events to retrieve. The from clause in an EQL query specifies the event stream definitions and stream names to use. The where clause in an EQL query specifies search conditions that specify which event or event combination to search for. For example, the following statement returns the average price for IBM stock ticks in the last 30 seconds.

select avg(price) from StockTick.win:time(30 sec) where symbol='IBM'

EQL queries follow the below syntax. EQL queries can be simple queries or more complex queries. A simple select contains only a select clause and a single stream definition. Complex EQL queries can be build that feature a more elaborate select list utilizing expressions, may join multiple streams, may contain a where clause with search conditions and so on.

[insert into insert_into_def]
select select_list
from stream_def [as name] [, stream_def [as name]] [,...]
[where search_conditions]
[group by grouping_expression_list]
[having grouping_search_conditions]
[output output_specification]
[order by order_by_expression_list]

6.2.1. Specifying Time Periods

Time-based windows as well as pattern observers and guards take a time period as a parameter. Time periods follow the syntax below.

time-period : [day-part] [hour-part] [minute-part] [seconds-part] [milliseconds-part]

day-part : number ("days" | "day")
hour-part : number ("hours" | "hour")
minute-part : number ("minutes" | "minute" | "min")
seconds-part : number ("seconds" | "second" | "sec")
milliseconds-part : number ("milliseconds" | "millisecond" | "msec")

Some examples of time periods are:

10 seconds
10 minutes 30 seconds
20 sec 100 msec
1 day 2 hours 20 minutes 15 seconds 110 milliseconds
0.5 minutes

6.3. Choosing Event Properties And Events: the Select Clause

The select clause is required in all EQL statements. The select clause can be used to select all properties via the wildcard *, or to specify a list of event properties and expressions. The select clause defines the event type (event property names and types) of the resulting events published by the statement, or pulled from the statement.

The select clause also offers optional istream and rstream keywords to control how events are posted to UpdateListener instances listening to the statement.

The syntax for the select clause is summarized below.

select [rstream | istream] * | expression_list ... 

6.3.1. Choosing all event properties: select *

The syntax for selecting all event properties in a stream is:

select * from stream_def

The following statement selects univariate statistics for the last 30 seconds of IBM stock ticks for price.

select * from StockTick(symbol='IBM').win:time(30 sec).stat:uni('price')

In a join statement, using the select * syntax selects event properties that contain the events representing the joined streams themselves.

The * wildcard and expressions can also be combined in a select clause. The combination selects all event properties and in addition the computed values as specified by any additional expressions that are part of the select clause. Here is an example that selects all properties of stock tick events plus a computed product of price and volume that the statement names 'pricevolume':

select *, price * volume as pricevolume from StockTick(symbol='IBM')

6.3.2. Choosing specific event properties

To chose the particular event properties to return:

select event_property [, event_property] [, ...] from stream_def

The following statement selects the count and standard deviation properties for the last 100 events of IBM stock ticks for volume.

select count, stdev from StockTick(symbol='IBM').win:length(100).stat:uni('volume')

6.3.3. Expressions

The select clause can contain one or more expressions.

select expression [, expression] [, ...] from stream_def

The following statement selects the volume multiplied by price for a time batch of the last 30 seconds of stock tick events.

select volume * price from StockTick.win:time_batch(30 sec)

6.3.4. Renaming event properties

Event properties and expressions can be renamed using below syntax.

select [event property | expression] as identifier [, ...]

The following statement selects volume multiplied by price and specifies the name volPrice for the event property.

select volume * price as volPrice from StockTick.win:length(100)

6.3.5. Selecting istream and rstream events

The optional istream and rstream keywords in the select clause define the event stream posted to listeners to the statement.

If neither keyword is specified, the engine posts insert stream events via the newEvents parameter to the update method of UpdateListener instances listening to the statement. The engine posts remove stream events to the oldEvents parameter of the update method. The insert stream consists of the events entering the respective window(s) or stream(s) or aggregations, while the remove stream consists of the events leaving the respective window(s) or the changed aggregation result. See Chapter 4, Understanding the Output Model for more information on insert and remove streams.

By specifying the istream keyword you can instruct the engine to only post insert stream events via the newEvents parameter to the update method on listeners. The engine will then not post any remove stream events, and the oldEvents parameter is always a null value.

By specifying the rstream keyword you can instruct the engine to only post remove stream events via the newEvents parameter to the update method on listeners. The engine will then not post any insert stream events, and the oldEvents parameter is also always a null value.

The following statement selects only the events that are leaving the 30 second time window.

select rstream * from StockTick.win:time(30 sec)

The istream and rstream keywords in the select clause are matched by same-name keywords available in the insert into clause. While the keywords in the select clause control the event stream posted to listeners to the statement, the same keywords in the insert into clause specify the event stream that the engine makes available to other statements.

6.4. Specifying Event Streams : the From Clause

The from clause is required in all EQL statements. It specifies one or more event streams. Each event stream can optionally be given a name by means of the as syntax.

from stream_def [as name] [, stream_def [as name]] [, ...]

The event stream definition stream_def as shown in the syntax above can consists of either a filter-based event stream definition or a pattern-based event stream definition.

For joins and outer joins, specify two or more event streams. Joins between pattern-based and filter-based event streams are also supported.

Esper supports joins against relational databases for access to historical or reference data as explained in Section 6.13, “Joining Relational Data via SQL”.

6.4.1. Filter-based event streams

For filter-based event streams, the event stream definition stream_def as shown in the from-clause syntax consists of an event type, optional filter expressions and an optional list of views that derive data from a stream. The syntax for a filter-based event stream is as below:

event_type ( [filter_criteria] ) [.view_spec] [.view_spec] [...]

The following EQL statement shows event type, filter criteria and views combined in one statement. It selects all event properties for the last 100 events of IBM stock ticks for volume. In the example, the event type is the fully qualified Java class name org.esper.example.StockTick. The expression filters for events where the property symbol has a value of "IBM". The optional view specifications for deriving data from the StockTick events are a length window and a view for computing statistics on volume. The name for the event stream is "volumeStats".

select * from 
  org.esper.example.StockTick(symbol='IBM').win:length(100).stat:uni('volume') as volumeStats

Esper filters out events in an event stream as defined by filter criteria before it sends events to subsequent views. Thus, compared to search conditions in a where-clause, filter criteria remove unneeded events early. In the above example, events with a symbol other then IBM do not enter the time window.

6.4.1.1. Specifying an event type

The simplest form of filter is a filter for events of a given type without any conditions on the event property values. This filter matches any event of that type regardless of the event's properties. The example below is such a filter.

select * from com.mypackage.myevents.RfidEvent

Instead of the fully-qualified Java class name any other event name can be mapped via Configuration to a Java class, making the resulting statement more readable:

select * from RfidEvent

Interfaces and superclasses are also supported as event types. In the below example IRfidReadable is an interface class.

select * from org.myorg.rfid.IRfidReadable

6.4.1.2. Specifying filter criteria

The filtering criteria to filter for events with certain event property values are placed within parenthesis after the event type name:

select * from RfidEvent(category="Perishable")

All expressions can be used in filters, including static methods that return a boolean value:

select * from RfidEvent(MyRFIDLib.isInRange(x, y) or (x < 0 and y < 0))

Filter expressions can be separated via a single comma ','. The comma represents a logical AND between filter expressions:

select * from RfidEvent(zone=1, category=10)
...is equivalent to...
select * from RfidEvent(zone=1 and category=10)

The following set of operators are highly optimized through indexing and are the preferred means of filtering in high-volumne event streams:

  • equals =

  • not equals !=

  • comparison operators < , > , >=, <=

  • ranges

    • use the between keyword for a closed range where both endpoints are included

    • use the in keyword and round () or square brackets [] to control how endpoints are included

    • for inverted ranges use the not keyword and the between or in keywords

  • list-of-values checks using the in keyword or the not in keywords followed by a comma-separated list of values

At compile time as well as at run time, the engine scans new filter expressions for sub-expressions that can be indexed. Indexing filter values to match event properties of incoming events enables the engine to match incoming events faster. The above list of operators represents the set of operators that the engine can best convert into indexes. The use of comma or logical and in filter expressions does not impact optimizations by the engine.

6.4.1.3. Filtering Ranges

Ranges come in the following 4 varieties. The use of round () or square [] bracket dictates whether an endpoint is included or excluded. The low point and the high-point of the range are separated by the colon : character.

  • Open ranges that contain neither endpoint (low:high)

  • Closed ranges that contain both endpoints [low:high]. The equivalent 'between' keyword also defines a closed range.

  • Half-open ranges that contain the low endpoint but not the high endpoint [low:high)

  • Half-closed ranges that contain the high endpoint but not the low endpoint (low:high]

The next statement shows a filter specifying a range for x and y values of RFID events. The range includes both endpoints therefore uses [] hard brackets.

mypackage.RfidEvent(x in [100:200], y in [0:100])

The between keyword is equivalent for closed ranges. The same filter using the between keyword is:

mypackage.RfidEvent(x between 100 and 200, y between 0 and 50)

The not keyword can be used to determine if a value falls outside a given range:

mypackage.RfidEvent(x not in [0:100])

The equivalent statement using the between keyword is:

mypackage.RfidEvent(x not between 0 and 100)

6.4.1.4. Filtering Sets of Values

The in keyword for filter criteria determines if a given value matches any value in a list of values.

In this example we are interested in RFID events where the category matches any of the given values:

mypackage.RfidEvent(category in ('Perishable', 'Container'))

By using the not in keywords we can filter events with a property value that does not match any of the values in a list of values:

mypackage.RfidEvent(category not in ('Household', 'Electrical'))

6.4.1.5. Filter Limitations

The following restrictions apply to filter criteria:

  • Range and comparison operators require the event property to be of a numeric type.

  • Aggregation functions are not allowed within filter expressions.

  • The prev previous event function and the prior prior event function cannot be used in filter expressions.

6.4.2. Pattern-based event streams

Event pattern expressions can also be used to specify one or more event streams in an EQL statement. For pattern-based event streams, the event stream definition stream_def consists of the keyword pattern and a pattern expression in brackets []. The syntax for an event stream definition using a pattern expression is below. As in filter-based event streams, an optional list of views that derive data from the stream can be supplied.

pattern [pattern_expression] [.view_spec] [.view_spec] [...]

The next statement specifies an event stream that consists of both stock tick events and trade events. The example tags stock tick events with the name "tick" and trade events with the name "trade".

select * from pattern [every tick=StockTickEvent or every trade=TradeEvent]

This statement generates an event every time the engine receives either one of the event types. The generated events resemble a map with "tick" and "trade" keys. For stock tick events, the "tick" key value is the underlying stock tick event, and the "trade" key value is a null value. For trade events, the "trade" key value is the underlying trade event, and the "tick" key value is a null value.

Lets further refine this statement adding a view the gives us the last 30 seconds of either stock tick or trade events. Lets also select prices and a price total.

select tick.price as tickPrice, trade.price as tradePrice, 
       sum(tick.price) + sum(trade.price) as total
  from pattern [every tick=StockTickEvent or every trade=TradeEvent].win:time(30 sec)

Note that in the statement above tickPrice and tradePrice can each be null values depending on the event processed. Therefore, an aggregation function such as sum(tick.price + trade.price)) would always return null values as either of the two price properties are always a null value for any event matching the pattern. Use the coalesce function to handle null values, for example: sum(coalesce(tick.price, 0) + coalesce(trade.price, 0)).

6.4.3. Specifying views

Views are used to derive or aggregate data. Views can be staggered onto each other. See the section Section 6.16, “Built-in views” on the views available.

Views can optionally take one or more parameters. These parameters can consist of primitive constants such as String, boolean or numeric types. Arrays are also supported as a view parameter types.

The below example serves to show views and staggering of views. It uses a car location event that contains information about the location of a car on a highway.

The first view std:groupby('carId') groups car location events by car id. The second view win:length(4) keeps a length window of the 4 last events, with one length window for each car id. The next view std:groupby({'expressway', 'direction', 'segment'}) groups each event by its expressway, direction and segment property values. Again, the grouping is done for each car id considering the last 4 events only. The last view std:size() is used to report the number of events. Thus the below example reports the number of events per car id and per expressway, direction and segment considering the last 4 events for each car id only.

select * from CarLocEvent.std:groupby('carId').win:length(4).
  std:groupby({'expressway', 'direction', 'segment'}).std:size()

6.5. Specifying Search Conditions: the Where Clause

The where clause is an optional clause in EQL statements. Via the where clause event streams can be joined and events can be filtered.

Comparison operators =, < , > , >=, <=, !=, <>, is null, is not null and logical combinations via and and or are supported in the where clause. The where clause can also introduce join conditions as outlined in Section 6.10, “Joining Event Streams”. Where-clauses can also contain expressions. Some examples are listed below.

...where fraud.severity = 5 and amount > 500
...where (orderItem.orderId is null) or (orderItem.class != 10)		 
...where (orderItem.orderId = null) or (orderItem.class <> 10)		 
...where itemCount / packageCount > 10		 

6.6. Aggregates and grouping: the Group-by Clause and the Having Clause

6.6.1. Using aggregate functions

The aggregate functions are sum, avg, count, max, min, median, stddev, avedev. You can use aggregate functions to calculate and summarize data from event properties. For example, to find out the total price for all stock tick events in the last 30 seconds, type:

select sum(price) from StockTickEvent.win:time(30 sec)

Here is the syntax for aggregate functions:

aggregate_function( [all | distinct] expression)

You can apply aggregate functions to all events in an event stream window or other view, or to one or more groups of events. From each set of events to which an aggregate function is applied, Esper generates a single value.

Expression is usually an event property name. However it can also be a constant, function, or any combination of event property names, constants, and functions connected by arithmetic operators.

For example, to find out the average price for all stock tick events in the last 30 seconds if the price was doubled:

select avg(price * 2) from StockTickEvent.win:time(30 seconds)

You can use the optional keyword distinct with all aggregate functions to eliminate duplicate values before the aggregate function is applied. The optional keyword all which performs the operation on all events is the default.

The syntax of the aggregation functions and the results they produce are shown in below table.

Table 6.1. Syntax and results of aggregate functions

Aggregate FunctionResult
sum([all|distinct] expression)

Totals the (distinct) values in the expression, returning a value of long, double, float or integer type depending on the expression

avg([all|distinct] expression)

Average of the (distinct) values in the expression, returning a value of double type

count([all|distinct] expression)

Number of the (distinct) non-null values in the expression, returning a value of long type

count(*)

Number of events, returning a value of long type

max([all|distinct] expression)

Highest (distinct) value in the expression, returning a value of the same type as the expression itself returns

min([all|distinct] expression)

Lowest (distinct) value in the expression, returning a value of the same type as the expression itself returns

median([all|distinct] expression)

Median (distinct) value in the expression, returning a value of double type

stddev([all|distinct] expression)

Standard deviation of the (distinct) values in the expression, returning a value of double type

avedev([all|distinct] expression)

Mean deviation of the (distinct) values in the expression, returning a value of double type

You can use aggregation functions in a select clause and in a having clause. You cannot use aggregate functions in a where clause, but you can use the where clause to restrict the events to which the aggregate is applied. The next query computes the average and sum of the price of stock tick events for the symbol IBM only, for the last 10 stock tick events regardless of their symbol.

select 'IBM stats' as title, avg(price) as avgPrice, sum(price) as sumPrice
from StockTickEvent.win:length(10)
where symbol='IBM'

In the above example the length window of 10 elements is not affected by the where-clause, i.e. all events enter and leave the length window regardless of their symbol. If we only care about the last 10 IBM events, we need to add filter criteria as below.

select 'IBM stats' as title, avg(price) as avgPrice, sum(price) as sumPrice
from StockTickEvent(symbol='IBM').win:length(10)
where symbol='IBM'

You can use aggregate functions with any type of event property or expression, with the following exceptions:

  1. You can use sum, avg, median, stddev, avedev with numeric event properties only

Esper ignores any null values returned by the event property or expression on which the aggregate function is operating, except for the count(*) function, which counts null values as well. All aggregate functions return null if the data set contains no events, or if all events in the data set contain only null values for the aggregated expression.

6.6.2. Organizing statement results into groups: the Group-by clause

The group by clause is optional in all EQL statements. The group by clause divides the output of an EQL statement into groups. You can group by one or more event property names, or by the result of computed expressions. When used with aggregate functions, group by retrieves the calculations in each subgroup. You can use group by without aggregate functions, but generally that can produce confusing results.

For example, the below statement returns the total price per symbol for all stock tick events in the last 30 seconds:

select symbol, sum(price) from StockTickEvent.win:time(30 sec) group by symbol

The syntax of the group by clause is:

group by arregate_free_expression [, arregate_free_expression] [, ...]

Esper places the following restrictions on expressions in the group by clause:

  1. Expressions in the group by cannot contain aggregate functions

  2. Event properties that are used within aggregate functions in the select clause cannot also be used in a group by expression

You can list more then one expression in the group by clause to nest groups. Once the sets are established with group by the aggregation functions are applied. This statement posts the median volume for all stock tick events in the last 30 seconds per symbol and tick data feed. Esper posts one event for each group to statement listeners:

select symbol, tickDataFeed, median(volume) 
from StockTickEvent.win:time(30 sec) 
group by symbol, tickDataFeed

In the statement above the event properties in the select list (symbol, tickDataFeed) are also listed in the group by clause. The statement thus follows the SQL standard which prescribes that non-aggregated event properties in the select list must match the group by columns.

Esper also supports statements in which one or more event properties in the select list are not listed in the group by clause. The statement below demonstrates this case. It calculates the standard deviation for the last 30 seconds of stock ticks aggregating by symbol and posting for each event the symbol, tickDataFeed and the standard deviation on price.

select symbol, tickDataFeed, stddev(price) from StockTickEvent.win:time(30 sec) group by symbol

The above example still aggregates the price event property based on the symbol, but produces one event per incoming event, not one event per group.

Additionally, Esper supports statements in which one or more event properties in the group by clause are not listed in the select list. This is an example that calculates the mean deviation per symbol and tickDataFeed and posts one event per group with symbol and mean deviation of price in the generated events. Since tickDataFeed is not in the posted results, this can potentially be confusing.

select symbol, avedev(price) 
from StockTickEvent.win:time(30 sec) 
group by symbol, tickDataFeed

Expressions are also allowed in the group by list:

select symbol * price, count(*) from StockTickEvent.win:time(30 sec) group by symbol * price

If the group by expression resulted in a null value, the null value becomes its own group. All null values are aggregated into the same group. If you are using the count(expression) aggregate function which does not count null values, the count returns zero if only null values are encountered.

You can use a where clause in a statement with group by. Events that do not satisfy the conditions in the where clause are eliminated before any grouping is done. For example, the statement below posts the number of stock ticks in the last 30 seconds with a volume larger then 100, posting one event per group (symbol).

select symbol, count(*) from StockTickEvent.win:time(30 sec) where volume > 100 group by symbol

6.6.3. Selecting groups of events: the Having clause

Use the having clause to pass or reject events defined by the group-by clause. The having clause sets conditions for the group by clause in the same way where sets conditions for the select clause, except where cannot include aggregate functions, while having often does.

This statement is an example of a having clause with an aggregate function. It posts the total price per symbol for the last 30 seconds of stock tick events for only those symbols in which the total price exceeds 1000. The having clause eliminates all symbols where the total price is equal or less then 1000.

select symbol, sum(price) 
from StockTickEvent.win:time(30 sec) 
group by symbol 
having sum(price) > 1000

To include more then one condition in the having clause combine the conditions with and, or or not. This is shown in the statement below which selects only groups with a total price greater then 1000 and an average volume less then 500.

select symbol, sum(price), avg(volume)
from StockTickEvent.win:time(30 sec) 
group by symbol 
having sum(price) > 1000 and avg(volume) < 500

Esper places the following restrictions on expressions in the having clause:

  1. Any expressions that contain aggregate functions must also occur in the select clause

A statement with the having clause should also have a group by clause. If you omit group-by, all the events not excluded by the where clause return as a single group. In that case having acts like a where except that having can have aggregate functions.

The having clause can also be used without group by clause as the below example shows. The example below posts events where the price is less then the current running average price of all stock tick events in the last 30 seconds.

select symbol, price, avg(price) 
from StockTickEvent.win:time(30 sec) 
having price < avg(price)

6.6.4. How the stream filter, Where, Group By and Having clauses interact

When you include filters, the where condition, the group by clause and the having condition in an EQL statement the sequence in which each clause affects events determines the final result:

  1. The event stream's filter condition, if present, dictates which events enter a window (if one is used). The filter discards any events not meeting filter criteria.

  2. The where clause excludes events that do not meet its search condition.

  3. Aggregate functions in the select list calculate summary values for each group.

  4. The having clause excludes events from the final results that do not meet its search condition.

The following query illustrates the use of filter, where, group by and having clauses in one statement with a select clause containing an aggregate function.

select tickDataFeed, stddev(price)
from StockTickEvent(symbol='IBM').win:length(10) 
where volume > 1000
group by tickDataFeed 
having stddev(price) > 0.8

Esper filters events using the filter criteria for the event stream StockTickEvent. In the example above only events with symbol IBM enter the length window over the last 10 events, all other events are simply discarded. The where clause removes any events posted by the length window (events entering the window and event leaving the window) that do not match the condition of volume greater then 1000. Remaining events are applied to the stddev standard deviation aggregate function for each tick data feed as specified in the group by clause. Each tickDataFeed value generates one event. Esper applies the having clause and only lets events pass for tickDataFeed groups with a standard deviation of price greater then 0.8.

6.7. Stabilizing and Limiting Output: the Output Clause

6.7.1. Output Clause Options

The output clause is optional in Esper and is used to control or stabilize the rate at which events are output. For example, the following statement batches old and new events and outputs them at the end of every 90 second interval.

select * from StockTickEvent.win:length(5) output every 90 seconds

Here is the syntax for output rate limiting:

output [all | first | last] every number [minutes | seconds | events]

The all keyword is the default and specifies that all events in a batch should be output. The batch size can be specified in terms of time or number of events.

The first keyword specifies that only the first event in an output batch is to be output. Using the first keyword instructs the engine to output the first matching event as soon as it arrives, and then ignore matching events for the time interval or number of events specified. After the time interval elapsed, or the number of matching events has been reached, the next first matching event is output again and the following interval the engine again ignores matching events.

The last keyword specifies to only output the last event at the end of the given time interval or after the given number of matching events have been accumulated.

The time interval can also be specified in terms of minutes; the following statement is identical to the first one.

select * from StockTickEvent.win:length(5) output every 1.5 minutes

A second way that output can be stabilized is by batching events until a certain number of events have been collected. The next statement only outputs when either 5 (or more) new or 5 (or more) old events have been batched.

select * from StockTickEvent.win:time(30 sec) output every 5 events

Additionally, event output can be further modified by the optional last keyword, which causes output of only the last event to arrive into an output batch.

select * from StockTickEvent.win:time(30 sec) output last every 5 events

Using the first keyword you can be notified at the start of the interval. The allows to watch for situations such as a rate falling below a threshold and only be informed every now and again after the specified output interval, but be informed the moment it first happens.

select * from TickRate.win:time(30 seconds) output first every 60 seconds where rate<100

6.7.2. Group By, Having and Output clause interaction

The output clause interacts in two ways with the group by and having clauses. First, in the output every n events case, the number n refers to the number of events arriving into the group by clause. That is, if the group by clause outputs only 1 event per group, or if the arriving events don't satisfy the having clause, then the actual number of events output by the statement could be fewer than n.

Second, the last and all keywords have special meanings when used in a statement with aggregate functions and the group by clause. The last keyword specifies that only groups whose aggregate values have been updated with the most recent batch of events should be output. The all keyword (the default) specifies that the most recent data for all groups seen so far should be output, whether or not these groups' aggregate values have just been updated.

6.8. Sorting Output: the Order By Clause

The order by clause is optional in Esper. It is used for ordering output events by their properties, or by expressions involving those properties. For example, the following statement outputs batches of 5 or more stock tick events that are sorted first by price and then by volume.

select symbol from StockTickEvent.win:time(60 sec) 
output every 5 events 
order by price, volume

Here is the syntax for the order by clause:

order by expression [asc | desc] [, expression [asc | desc]] [, ...]

Esper places the following restrictions on the expressions in the order by clause:

  1. All aggregate functions that appear in the order by clause must also appear in the select expression.

Otherwise, any kind of expression that can appear in the select clause, as well as any alias defined in the select clause, is also valid in the order by clause.

6.9. Merging Streams and Continuous Insertion: the Insert Into Clause

The insert into clause is optional in Esper. This clause can be specified to make the results of a statement available as an event stream for use in further statements. The clause can also be used to merge multiple event streams to form a single stream of events.

insert into CombinedEvent
select A.customerId as custId, A.timestamp - B.timestamp as latency
  from EventA.win:time(30 min) A, EventB.win:time(30 min) B
 where A.txnId = B.txnId

The insert into clause in above statement generates events of type CombinedEvent. Each generated CombinedEvent event has 2 event properties named "custId" and "latency". The events generated by above statement can be used in further statements. The below statement uses the generated events.

select custId, sum(latency)
  from CombinedEvent.win:time(30 min)
 group by custId

The insert into clause can consist of just an event type alias, or of an event type alias and 1 or more event property names. The syntax for the insert into clause is as follows:

insert [istream | rstream] into event_type_alias  [ (property_name [, property_name] ) ]

The istream (default) and rstream keywords are optional. If neither keyword or the istream keyword is specified, the engine supplies the insert stream events generated by the statement. The insert stream consists of the events entering the respective window(s) or stream(s). If the rstream keyword is specified, the engine supplies the remove stream events generated by the statement. The remove stream consists of the events leaving the respective window(s).

The event_type_alias is an identifier that names the events generated by the engine. The identifier can be used in statements to filter and process events of the given name.

The engine also allows listeners to be attached to a statement that contain an insert into clause.

To merge event streams, simply use the same event_type_alias identifier in all EQL statements that merge their result event streams. Make sure to use the same number and names of event properties and event property types match up.

Esper places the following restrictions on the insert into clause:

  1. The number of elements in the select clause must match the number of elements in the insert into clause if the clause specifies a list of event property names

  2. If the event type alias has already been defined by a prior statement or configuration, and the event property names and types do not match, an exception is thrown at statement creation time.

The example statement below shows the alternative form of the insert into clause that explicitly defines the property names to use.

insert into CombinedEvent (custId, latency)
select A.customerId, A.timestamp - B.timestamp 
...

The rstream keyword can be useful to indicate to the engine to generate only remove stream events. This can be useful if we want to trigger actions when events leave a window rather then when events enter a window. The statement below generates CombinedEvent events when EventA and EventB leave the window after 30 minutes (1800 seconds).

insert rstream into CombinedEvent
select A.customerId as custId, A.timestamp - B.timestamp as latency
  from EventA.win:time(30 min) A, EventB.win:time(30 min) B
 where A.txnId = B.txnId

6.10. Joining Event Streams

Two or more event streams can be part of the from clause and thus both streams determine the resulting events. The where-clause lists the join conditions that Esper uses to relate events in the two or more streams. Reference and historical data such as stored in your relational database can also be included in joins. Please see Section 6.13, “Joining Relational Data via SQL” for details.

Each point in time that an event arrives to one of the event streams, the two event streams are joined and output events are produced according to the where-clause.

This example joins 2 event streams. The first event stream consists of fraud warning events for which we keep the last 30 minutes (1800 seconds). The second stream is withdrawal events for which we consider the last 30 seconds. The streams are joined on account number.

select fraud.accountNumber as accntNum, fraud.warning as warn, withdraw.amount as amount,
       max(fraud.timestamp, withdraw.timestamp) as timestamp, 'withdrawlFraud' as desc
  from net.esper.example.atm.FraudWarningEvent.win:time(30 min) as fraud,
       net.esper.example.atm.WithdrawalEvent.win:time(30 sec) as withdraw
 where fraud.accountNumber = withdraw.accountNumber

Joins can also include one or more pattern statements as the next example shows:

select * from FraudWarningEvent.win:time(30 min) as fraud,
       pattern [every w=WithdrawalEvent -> PINChangeEvent(acct=w.acct)] as withdraw
 where fraud.accountNumber = withdraw.w.accountNumber

The statement above joins the last 30 minutes of fraud warnings with a pattern. The pattern consists of every withdrawal event that is followed by a PIN change event for the same account number. It joins the two event streams on account number.

6.11. Outer Joins

Esper supports left outer joins, right outer joins and full outer joins between an unlimited number of event streams. Outer joins can also join reference and historical data as explained in Section 6.13, “Joining Relational Data via SQL”.

If the outer join is a left outer join, there will be an output event for each event of the stream on the left-hand side of the clause. For example, in the left outer join shown below we will get output for each event in the stream RfidEvent, even if the event does not match any event in the event stream OrderList.

select * from net.esper.example.rfid.RfidEvent.win:time(30 sec) as rfid
       left outer join
       net.esper.example.rfid.OrderList.win:length(10000) as orderlist
     on rfid.itemId = orderList.itemId

Similarly, if the join is a Right Outer Join, then there will be an output event for each event of the stream on the right-hand side of the clause. For example, in the right outer join shown below we will get output for each event in the stream OrderList, even if the event does not match any event in the event stream RfidEvent.

select * from net.esper.example.rfid.RfidEvent.win:time(30 sec) as rfid
       right outer join
       net.esper.example.rfid.OrderList.win:length(10000) as orderlist
     on rfid.itemId = orderList.itemId

For all types of outer joins, if the join condition is not met, the select list is computed with the event properties of the arrived event while all other event properties are considered to be null.

select * from net.esper.example.rfid.RfidEvent.win:time(30 sec) as rfid
       full outer join
       net.esper.example.rfid.OrderList.win:length(10000) as orderlist
     on rfid.itemId = orderList.itemId

The last type of outer join is a full outer join. In a full outer join, each point in time that an event arrives to one of the event streams, one or more output events are produced. In the example below, when either an RfidEvent or an OrderList event arrive, one or more output event is produced.

6.12. Subqueries

A subquery is a select within another statement. Esper supports subqueries in the select-clause and in the where-clause of EQL statements. Subqueries provide an alternative way to perform operations that would otherwise require complex joins. Subqueries can also make statements more readable then complex joins.

Esper supports both simple subqueries as well as correlated subqueries. In a simple subquery, the inner query is not correlated to the outer query. Here is an example simple subquery within a select-clause:

select assetId, (select zone from ZoneClosed.std:lastevent) as lastClosed from RFIDEvent

If the inner query is dependent on the outer query, we will have a correlated subquery. An example of a correlated subquery is shown below. Notice the where-clause in the inner query, where the condition involves a stream from the outer query:

select * from RfidEvent as RFID where 'Dock 1' = 
  (select name from Zones.std:unique('zoneId') where zoneId = RFID.zoneId)

The example above shows a subquery in the where-clause. The statement selects RFID events in which the zone name matches a string constant based on zone id. The statement uses the view std:unique to guarantee that only the last event per zone id is held from processing by the subquery.

The next example is a correlated subquery within a select-clause. In this statement the select-clause retrieves the zone name by means of a subquery against the Zones set of events correlated by zone id:

select zoneId, (select name from Zones.std:unique('zoneId') 
  where zoneId = RFID.zoneId) as name from RFIDEvent

Note that when a simple or correlated subquery returns multiple rows, the engine returns a NULL value as the subquery result. To limit the number of events returned by a subquery consider using one of the views std:lastevent, std:unique and std:groupby.

The following restrictions apply to subqueries:

  1. Wildcard selection criteria can only be used in subqueries using the exists keyword

  2. The subquery stream definition must define a data window or other view to limit subquery results, reducing the number of events held for subquery execution

  3. Aggregation functions cannot be used in subqueries. Instead, the insert-into clause can be used to provide aggregation results for use in subqueries

  4. Subqueries can only consist of a select-clause, a from-clause and a where-clause. The group-by and having-clauses, as well as joins, outer-joins and output rate limiting are not permitted within subqueries.

Performance of your statement containing one or more subqueries principally depends on two parameters. First, if your subquery correlates one or more columns in the subquery stream with the enclosing statement's streams via equals '=', the engine automatically builds the appropriate indexes for fast row retrieval based on the key values correlated (joined). The second parameter is the number of rows found in the subquery stream and the complexity of the filter criteria (where-clause), as each row in the subquery stream must evaluate against the where-clause filter.

6.12.1. The 'exists' keyword

The exists condition is considered "to be met" if the subquery returns at least one row. The not exists condition is considered true if the subquery returns no rows.

Let's take a look at a simple example. The following is an EQL statement that uses the exists condition:

select assetId from RFIDEvent as RFID 
  where exists (select * from Asset.std:unique(assetId) where assetId = RFID.assetId)

This select statement will return all RFID events where there is at least one event in Assets unique by asset id with the same asset id.

6.12.2. The 'in' keyword

The in subquery condition is true if the value of an expression matches one or more of the values returned by the subquery. Consequently, the not in condition is true if the value of an expression matches none of the values returned by the subquery.

The next statement demonstrates the use of the in subquery condition:

select assetId from RFIDEvent as RFID 
  where zone in (select zone from ZoneUpdate.win:time(10 min) where status = 'closed' )

The above statement demonstrated the in subquery to select RFID events for which the zone status is in a closed state.

6.13. Joining Relational Data via SQL

This chapter outlines how reference data and historical data that are stored in a relational database can be queried via SQL within EQL statements.

Esper can join and outer join all types of event streams to stored data. In order for such data sources to become accessible to Esper, some configuration is required. The Section 2.4.5, “Relational Database Access” explains the required configuration for database access in greater detail, and includes information of configuring a query result cache.

The following restrictions currently apply:

  • Only one event stream and one SQL query can be joined; Joins of two or more event streams with an SQL query are not yet supported.

  • Sub-views on an SQL query are not allowed; That is, one cannot create a time or length window on an SQL query. However one can use the insert into syntax to make join results available to a further statement.

  • Your database software must support JDBC prepared statements that provide statement meta data at compilation time. Most major databases provide this function.

The next sections assume basic knowledge of SQL (Structured Query Language).

6.13.1. Joining SQL Query Results

To join an event stream against stored data, specify the sql keyword followed by the name of the database and a parameterized SQL query. The syntax to use in the from-clause of an EQL statement is:

sql:database_name [" parameterized_sql_query "]

The engine uses the database_name identifier to obtain configuration information in order to establish a database connection, as well as settings that control connection creation and removal. Please see Section 2.4.5, “Relational Database Access” to configure an engine for database access.

Following the database name is the SQL query to execute. The SQL query can contain one or more substitution parameters. The SQL query string is placed in single brackets [ and ]. The SQL query can be placed in either single quotes (') or double quotes ("). The SQL query grammer is passed to your database software unchanged, allowing you to write any SQL query syntax that your database understands, including stored procedure calls.

Substitution parameters in the SQL query string take the form ${event_property_name}. The engine resolves event_property_name at statement execution time to the actual event property value supplied by the events in the joined event stream.

The engine determines the type of the SQL query output columns by means of the result set metadata that your database software returns for the statement. The actual query results are obtained via the getObject on java.sql.ResultSet.

The sample EQL statement below joins an event stream consisting of CustomerCallEvent events with the results of an SQL query against the database named MyCustomerDB and table Customer:

select custId, cust_name from CustomerCallEvent,
  sql:MyCustomerDB [' select cust_name from Customer where cust_id = ${custId} ']

The example above assumes that CustomerCallEvent supplies an event property named custId. The SQL query selects the customer name from the Customer table. The where-clause in the SQL matches the Customer table column cust_id with the value of custId in each CustomerCallEvent event. The engine executes the SQL query for each new CustomerCallEvent encountered.

If the SQL query returns no rows for a given customer id, the engine generates no output event. Else the engine generates one output event for each row returned by the SQL query. An outer join as described in the next section can be used to control whether the engine should generate output events even when the SQL query returns no rows.

The next example adds a time window of 30 seconds to the event stream CustomerCallEvent. It also renames the selected properties to customerName and customerId to demonstrate how the naming of columns in an SQL query can be used in the select clause in the EQL query. And the example uses explicit stream names via the as keyword.

select customerId, customerName from
  CustomerCallEvent.win:time(30 sec) as cce,
  sql:MyCustomerDB ["select cust_id as customerId, cust_name as customerName from Customer 
                  where cust_id = ${cce.custId}"] as cq

Any window, such as the time window, generates insert stream (istream) events as events enter the window, and remove stream (rstream) events as events leave the window. The engine executes the given SQL query for each CustomerCallEvent in both the insert stream and the remove stream. As a performance optimization, the istream or rstream keywords in the select-clause can be used to instruct the engine to only join insert stream or remove stream events, reducing the number of SQL query executions.

6.13.2. Outer Joins With SQL Queries

You can use outer joins to join data obtained from an SQL query and control when an event is produced. Use a left outer join, such as in the next statement, if you need an output event for each event regardless of whether or not the SQL query returns rows. If the SQL query returns no rows, the join result populates null values into the selected properties.

select custId, custName from
  CustomerCallEvent as cce
  left outer join 
  sql:MyCustomerDB ["select cust_id, cust_name as custName 
                     from Customer where cust_id = ${cce.custId}"] as cq
  on cce.custId = cq.cust_id

The statement above always generates at least one output event for each CustomerCallEvent, containing all columns selected by the SQL query, even if the SQL query does not return any rows. Note the on expression that is required for outer joins. The on acts as an additional filter to rows returned by the SQL query.

6.13.3. Using Patterns to Request (Poll) Data

Pattern statements and SQL queries can also be applied together in useful ways. One such use is to poll or request data from a database at regular intervals. The next statement is an example that shows a pattern that fires every 5 seconds to query the NewOrder table for new orders:

insert into NewOrders
select orderId, orderAmount from
  pattern [every timer:interval(5 sec)],
  sql:MyCustomerDB ['select orderId, orderAmount from NewOrders']

6.13.4. JDBC Implementation Overview

The engine translates SQL queries into JDBC java.sql.PreparedStatement statements by replacing ${name} parameters with '?' placeholders. It obtains name and type of result columns from the compiled PreparedStatement meta data when the EQL statement is created.

The engine supplies parameters to the compiled statement via the setObject method on PreparedStatement. The engine uses the getObject method on the compiled statement PreparedStatement to obtain column values.

6.14. Single-row Function Reference

Single-row functions return a single value for every single result row generated by your statement. These functions can appear anywhere where expressions are allowed.

Esper allows static Java library methods as single-row functions, and also features built-in single-row functions.

Esper auto-imports the following Java library packages:

  • java.lang.*

  • java.math.*

  • java.text.*

  • java.util.*

Thus Java static library methods can be used in all expressions as shown in below example:

select symbol, Math.round(volume/1000)
from StockTickEvent.win:time(30 sec)

In general, arbitrary Java class names have to be fully qualified (e.g. java.lang.Math) but Esper provides a mechanism for user-controlled imports of classes and packages as outlined in Chapter 2, Configuration.

The below table outlines the built-in single-row functions available.

Table 6.2. Syntax and results of single-row functions

Single-row FunctionResult
max(expression, expression [, expression ...])

Returns the highest numeric value among the 2 or more comma-separated expressions.

min(expression, expression [, expression ...])

Returns the lowest numeric value among the 2 or more comma-separated expressions.

coalesce(expression, expression [, expression ...])

Returns the first non-null value in the list, or null if there are no non-null values.

case value 
  when compare_value then result
  [when compare_value then result ...] 
  [else result] 
  end 

Returns result where the first value equals compare_value.

case 
  when condition then result
  [when condition then result ...] 
  [else result] 
  end

Returns the result for the first condition that is true.

prev(expression, event_property)

Returns a property value of a previous event, relative to the event order within a data window

prior(integer, event_property)

Returns a property value of a prior event, relative to the natural order of arrival of events

6.14.1. The Min and Max Functions

The min and max function take two or more parameters that itself can be expressions. The min function returns the lowest numeric value among the 2 or more comma-separated expressions, while the max function returns the highest numeric value. The return type is the compatible aggregated type of all return values.

The next example shows the max function that has a Double return type and returns the value 1.1.

select max(1, 1.1, 2 * 0.5) from ...

The min function returns the lowest value. The statement below uses the function to determine the smaller of two timestamp values.

select symbol, min(ticks.timestamp, news.timestamp) as minT
	from StockTickEvent.win:time(30 sec) as ticks, NewsEvent.win:time(30 sec) as news
	where ticks.symbol = news.symbol

6.14.2. The Coalesce Function

The result of the coalesce function is the first expression in a list of expressions that returns a non-null value. The return type is the compatible aggregated type of all return values.

This example returns a String-typed result of value 'foo'.

select coalesce(null, 'foo') from ...

6.14.3. The Case Control Flow Function

The case control flow function has two versions. The first version takes a value and a list of compare values to compare against, and returns the result where the first value equals the compare value. The second version takes a list of conditions and returns the result for the first condition that is true.

The return type of a case expression is the compatible aggregated type of all return values.

The example below shows the first version of a case statement. It has a String return type and returns the value 'one'.

select case 1 when 1 then 'one' when 2 then 'two' else 'more' end from ...

The second version of the case function takes a list of conditions. The next example has a Boolean return type and returns the boolean value true.

select case when 1>0 then true else false end from ...

6.14.4. The Previous Function

The prev function returns the property value of a previous event. The first parameter denotes the i-th previous event in the order established by the data window. The second parameter is a property name for which the function returns the value for the previous event.

This example selects the value of the price property of the 2nd-previous event from the current Trade event.

select prev(2, price) from Trade.win:length(10)

Since the prev function takes the order established by the data window into account, the function works well with sorted windows. In the following example the statement selects the symbol of the 3 Trade events that had the largest, second-largest and third-largest volume.

select prev(0, symbol), prev(1, symbol), prev(2, symbol)
  from Trade.ext:sort(volume, true, 10)

The i-th previous event parameter can also be an expression returning an Integer-type value. The next statement joins the Trade data window with an RankSelectionEvent event that provides a rank property used to look up a certain position in the sorted Trade data window:

select prev(rank, symbol) from Trade.ext:sort(volume, true, 10), RankSelectionEvent

And the expression count(*) - 1 allows us to select the oldest event in the length window:

select prev(count(*) - 1, price) from Trade.win:length(100)

The prev function returns a null value if the data window does not currently hold the i-th previous event. The example below illustrates this using a time batch window. Here the prev function returns a null value for any events in which the previous event is not in the same batch of events. Note that the prior function as discussed below can be used if a null value is not the desired result.

select prev(1, symbol) from Trade.win:time_batch(1 min)

6.14.4.1. Previous Event per Group

The combination of prev function and group-by view returns the property value for a previous event in the given group.

Let's look at an example. Assume we want to obtain the price of the previous event of the same symbol as the current event.

The statement that follows solves this problem. It declares a group-by view grouping on the symbol property and a time window of 1 minute. As a result, when the engine encounters a new symbol value that it hasn't seen before, it creates a new time window specifically to hold events for that symbol. Consequently, the previous function returns the previous event within the respective time window for that event's symbol value.

select prev(1, price) as prevPrice from Trade.std:groupby('symbol').win:time(1 min)

In a second example, assume we need to return, for each event, the current top price per symbol. We can use the prev to obtain the highest price from a sorted data window, and use the group-by view to group by symbol:

select prev(0, price) as topPricePerSymbol 
from Trade.std:groupby('symbol').ext:sort('price', false, 1)

6.14.4.2. Restrictions

The following restrictions apply to the prev functions and its results:

  • The function always returns a null value for remove stream (old data) events

  • The function requires a data window view, or a group-by and data window view, without any additional sub-views. Data window views are: length window, time and time batch window and sorted window

6.14.4.3. Comparison to the prior Function

The prev function is similar to the prior function. The key differences between the two functions are as follows:

  • The prev function returns previous events in the order provided by the data window, while the prior function returns prior events in the order of arrival as posted by a stream's declared views.

  • The prev function requires a data window view while the prior function does not have any view requirements.

  • The prev function returns the previous event grouped by a criteria by combining the std:groupby view and a data window. The prior function returns prior events posted by the last view regardless of data window grouping.

  • The prev function returns a null value for remove stream events, i.e. for events leaving a data window. The prior function does not have this restriction.

6.14.5. The Prior Function

The prior function returns the property value of a prior event. The first parameter is an integer value that denotes the i-th prior event in the natural order of arrival. The second parameter is a property name for which the function returns the value for the prior event.

This example selects the value of the price property of the 2nd-prior event to the current Trade event.

select prior(2, price) from Trade

The prior function can be used on any event stream or view and does not have any specific view requirements. The function operates on the order of arrival of events by the event stream or view that provides the events.

The next statement uses a time batch window to compute an average volume for 1 minute of Trade events, posting results every minute. The select-clause employs the prior function to select the current average and the average before the current average:

select average, prior(1, average) 
    from TradeAverages.win:time_batch(1 min).stat:uni('volume')

6.15. Operator Reference

Esper arithmatic and logical operator precedence follows Java standard arithmatic and logical operator precedence.

6.15.1. Arithmatic Operators

The below table outlines the arithmatic operators available.

Table 6.3. Syntax and results of arithmatic operators

OperatorDescription
+, -

As unary operators they denote a positive or negative expression. As binary operators they add or subtract.

*, /

Multiplication and division are binary operators.

%

Modulo binary operator.

6.15.2. Logical And Comparsion Operators

The below table outlines the logical and comparison operators available.

Table 6.4. Syntax and results of logical and comparison operators

OperatorDescription
NOT

Returns true if the following condition is false, returns false if it is true.

OR

Returns true if either component condition is true, returns false if both are false.

AND

Returns true if both component conditions are true, returns false if either is false.

=, !=, <, > <=, >=,

Comparison.

6.15.3. Concatenation Operators

The below table outlines the concatenation operators available.

Table 6.5. Syntax and results of concatenation operators

OperatorDescription
||

Concatenates character strings

6.15.4. Binary Operators

The below table outlines the binary operators available.

Table 6.6. Syntax and results of binary operators

OperatorDescription
&

Bitwise AND if both operands are numbers; conditional AND if both operands are boolean

|

Bitwise OR if both operands are numbers; conditional OR if both operands are boolean

^

Bitwise exclusive OR (XOR)

6.15.5. Array Definition Operator

The { and } curly braces are array definition operators following the Java array initialization syntax. Arrays can be useful to pass to user-defined functions or to select array data in a select clause.

Array definitions consist of zero or more expressions within curly braces. Any type of expression is allowed within array definitions including constants, arithmatic expressions or event properties. This is the syntax of an array definition:

{ [expression [,expression...]] }

Consider the next statement that returns an event property named actions. The engine populates the actions property as an array of java.lang.String values with a length of 2 elements. The first element of the array contains the observation property value and the second element the command property value of RFIDEvent events.

select {observation, command} as actions from RFIDEvent

The engine determines the array type based on the types returned by the expressions in the array definiton. For example, if all expressions in the array definition return integer values then the type of the array is java.lang.Integer[]. If the types returned by all expressions are compatible number types, such as integer and double values, the engine coerces the array element values and returns a suitable type, java.lang.Double[] in this example. The type of the array returned is Object[] if the types of expressions cannot be coerced or return object values. Null values can also be used in an array definition.

Arrays can come in handy for use as parameters to user-defined functions:

select * from RFIDEvent where Filter.myFilter(zone, {1,2,3})

6.15.6. The 'in' Keyword

The in keyword determines if a given value matches any value in a list. The syntax of the keyword is:

test_expression [not] in (expression [,expression...] )

The test_expression is any valid expression. The keyword is followed by a list of expressions to test for a match. The optional not keyword specifies that the result of the predicate be negated.

The result of an in expression is of type Boolean. If the value of test_expression is equal to any expression from the comma-separated list, the result value is true. Otherwise, the result value is false. All expressions must be of the same type as or a compatible type to test_expression.

The next example shows how the in keyword can be applied to select certain command types of RFID events:

select * from RFIDEvent where command in ('OBSERVATION', 'SIGNAL')

The statement is equivalent to:

select * from RFIDEvent where command = 'OBSERVATION' or symbol = 'SIGNAL'

6.15.7. The 'between' Keyword

The between keyword specifies a range to test. The syntax of the keyword is:

test_expression [not] between begin_expression and end_expression

The test_expression is any valid expression and is the expression to test for in the range defined by begin_expression and end_expression. The not keyword specifies that the result of the predicate be negated.

The result of a between expression is of type Boolean. If the value of test_expression is greater then or equal to the value of begin_expression and less than or equal to the value of end_expression, the result is true.

The next example shows how the between keyword can be used to select events with a price between 55 and 60 (inclusive).

select * from StockTickEvent where price between 55 and 60

The equivalent expression without between is:

select * from StockTickEvent where price >= 55 and price <= 60

And also equivalent to:

select * from StockTickEvent where price between 60 and 55

6.15.8. The 'like' Keyword

The like keyword provides standard SQL pattern matching. SQL pattern matching allows you to use '_' to match any single character and '%' to match an arbitrary number of characters (including zero characters). In Esper, SQL patterns are case-sensitive by default. The syntax of like is:

test_expression [not] like pattern_expression [escape string_literal]

The test_expression is any valid expression yielding a String-type or a numeric result. The optional not keyword specifies that the result of the predicate be negated. The like keyword is followed by any valid standard SQL pattern_expression yielding a String-typed result. The optional escape keyword signals the escape character to escape '_' and '%' values in the pattern.

The result of a like expression is of type Boolean. If the value of test_expression matches the pattern_expression, the result value is true. Otherwise, the result value is false.

An example for the like keyword is below.

select * from PersonLocationEvent where name like '%Jack%'

The escape character can be defined as follows. In this example the where-clause matches events where the suffix property is a single '_' character.

select * from PersonLocationEvent where suffix like '!_' escape '!'

6.15.9. The 'regexp' Keyword

The regexp keyword is a form of pattern matching based on regular expressions implemented through the Java java.util.regex package. The syntax of regexp is:

test_expression [not] regexp pattern_expression

The test_expression is any valid expression yielding a String-type or a numeric result. The optional not keyword specifies that the result of the predicate be negated. The regexp keyword is followed by any valid regular expression pattern_expression yielding a String-typed result.

The result of a regexp expression is of type Boolean. If the value of test_expression matches the regular expression pattern_expression, the result value is true. Otherwise, the result value is false.

An example for the regexp keyword is below.

select * from PersonLocationEvent where name regexp '*Jack*'

6.16. Built-in views

This chapter outlines the views that are built into Esper. All views can be arbitrarily combined as many of the examples below show. The section on Chapter 4, Understanding the Output Model provides additional information on the relationship of views, filtering and aggregation.

6.16.1. Window views

6.16.1.1. Length window (win:length)

Creates a moving window extending the specified number of elements into the past. The view takes a single numeric parameter that defines the window size:

win:length(size)

The below example calculates univariate statistics on price for the last 5 stock ticks for symbol IBM.

select * from StockTickEvent(symbol='IBM').win:length(5).stat:uni('price')

The next example keeps a length window of 10 events of stock trade events, with a separate window for each symbol. The statistics on price is calculated only for the last 10 events for each symbol.

select * from StockTickEvent.std:groupby('symbol').win:length(10).stat:uni('price')

6.16.1.2. Length window batch (win:length_batch)

This window view buffers events and releases them when a given minimum number of events has been collected. The view takes the number of events to batch as a parameter:

win:length_batch(size)

The next statement buffers events until a minimum of 5 events have collected. Listeners to updates posted by this view receive updated information only when 5 or more events have collected.

select * from StockTickEvent.win:length_batch(5)

6.16.1.3. Time window (win:time)

Creates a moving time window extending from the specified time interval into the past based on the system time. This view takes a time period (see Section 6.2.1, “Specifying Time Periods”) or a number of seconds as a parameter:

win:time(time period)
win:time(number of seconds)

For the IBM stock tick events in the last 1 second, calculate statistics on price.

select * from StockTickEvent(symbol='IBM').win:time(1 sec).stat:uni('price')

The same statement rewritten to use a parameter supplying number-of-seconds is:

select * from StockTickEvent(symbol='IBM').win:time(1).stat:uni('price')

The following time windows are equivalent specifications:

win:time(2 minutes 5 seconds)
win:time(125 sec)
win:time(125)

6.16.1.4. Externally-timed window (win:ext_timed)

Similar to the time window, this view is a moving time window extending from the specified time interval into the past, but based on the millisecond time value supplied by an event property. The view takes two parameters: the name of the event property to return the long-typed timestamp value, and a time period or a number of seconds:

win:time(timestamp_property_name, time_period)
win:time(timestamp_property_name, number_of_seconds)

This view holds stock tick events of the last 10 seconds based on the timestamp property in StockTickEvent.

select * from StockTickEvent.win:ext_timed('timestamp', 10 seconds)

6.16.1.5. Time window batch (win:time_batch)

This window view buffers events and releases them every specified time interval in one update. The view takes a time period or a number of seconds as a parameter.

win:time_batch(time_period)
win:time_batch(number_of_seconds)

The below example batches events into a 5 second window releasing new batches every 5 seconds. Listeners to updates posted by this view receive updated information only every 5 seconds.

select * from StockTickEvent.win:time_batch(5 sec)

6.16.2. Standard view set

6.16.2.1. Unique (std:unique)

The unique view is a view that includes only the most recent among events having the same value for the specified field:

std:unique(event_property_name)

The view acts as a length window of size 1 for each distinct value of the event property. It thus posts as old events the prior event of the same property value, if any.

The below example creates a view that retains only the last event per symbol.

select * from StockTickEvent.std:unique('symbol')

6.16.2.2. Group By (std:groupby)

This view groups events into sub-views by the value of the specified field. The view takes a single property name to supply the group-by values, or a list of property names as the synopsis shows:

std:groupby(property_name)
std:groupby({property_name [, property_name ...] })

This example calculates statistics on price separately for each symbol.

select * from StockTickEvent.std:groupby('symbol').stat:uni('price')

The group-by view can also take multiple fields to group by. This example calculates statistics on price for each symbol and feed.

select * from StockTickEvent.std:groupby({'symbol', 'feed'}).stat:uni('price')

The order in which the group-by view appears within sub-views of a stream controls the data the engine derives from events for each group. The next 2 statements demonstrate this using a length window.

This example keeps a length window of 10 events of stock trade events, with a separate length window for each symbol. The engine calculates statistics on price for the last 10 events for each symbol. During runtime, the engine actually allocates a separate length window for each new symbol arriving.

select * from StockTickEvent.std:groupby('symbol').win:length(10).stat:uni('price')

By putting the group-by view in position after the length window, we can change the semantics of the query. The query now returns the statistics on price per symbol for only the last 10 events across all symbols. Here the engine allocates only one length window for all events.

select * from StockTickEvent.win:length(10).std:groupby('symbol').stat:uni('price')

We have learned that by placing the group-by view before other views, these other views become part of the grouped set of views. The engine dynamically allocates a new view instance for each subview, every time it encounters a new group key such as a new value for symbol. Therefore, in std:groupby('symbol').win:length(10) the engine allocates a new length window for each distinct symbol. However in win:length(10).std:groupby('symbol') the engine maintains a single length window.

Multiple group-by views can also be used in the same statement. The statement below groups by symbol and feed. As the statement declares the time window after the group-by view for symbols, the engine allocates a new time window per symbol however reports statistics on price per symbol and feed. The query results are statistics on price per symbol and feed for the last 1 minute of events per symbol (and not per feed).

select * from StockTickEvent.std:groupby('symbol').win:time(1 minute)
    .std:groupby('feed').stat:uni('price')

Last, we consider the permutation where the time window is declared after the group-by. Here, the query results are statistics on price per symbol and feed for the last 1 minute of events per symbol and feed.

select * from StockTickEvent.std:groupby({'symbol', 'feed'})
    .win:time(1 minute).stat:uni('price')

6.16.2.3. Size (std:size)

This view simply posts the number of events received from a stream or view. The synopsis is simply:

std:size()

The view posts a single long-typed property named size. The view posts the prior size as old data, and the current size as new data to update listeners of the view. Via the iterator method of the statement the size value can also be polled (read).

When combined with a data window view, the size view reports the current and prior number of events in the data window. This example reports the number of tick events within the last 1 minute:

select size from StockTickEvent.win:time(1 min).std:size()

The size view is also useful in conjunction with a group-by view to count the number of events per group. The EQL below returns the number of events per symbol.

select size from StockTickEvent.std:groupby('symbol').std:size()

When used without a data window, the view simply counts the number of events:

select size from StockTickEvent.std:size()

All views can be used with pattern statements as well. The next EQL snippet shows a pattern where we look for tick events followed by trade events for the same symbol. The size view counts the number of occurances of the pattern.

select size from pattern[every s=StockTickEvent -> TradeEvent(symbol=s.symbol)].std:size()

6.16.2.4. Last (std:lastevent)

This view exposes the last element of its parent view:

std:lastevent()

The view acts as a length window of size 1. It thus posts as old events the prior event in the stream, if any.

This example statement retains statistics calculated on stock tick price for the symbol IBM.

select * from StockTickEvent(symbol='IBM').stat:uni('price').std:lastevent()

6.16.3. Statistics views

6.16.3.1. Univariate statistics (stat:uni)

This view calculates univariate statistics on an event property. The view takes a single event property name as a parameter. The event property must be of numeric type:

stat:uni(event_property_name)

Table 6.7. Univariate statistics derived properties

Property NameDescription
countNumber of values
sumSum of values
averageAverage of values
varianceVariance
stdevSample standard deviation (square root of variance)
stdevpaPopulation standard deviation

The below example selects the standard deviation on price for stock tick events for the last 10 events.

select stdev from StockTickEvent.win:length(10).stat:uni('price')

6.16.3.2. Regression (stat:linest)

This view calculates regression on two event properties. The view takes two event property names as parameters. The event properties must be of numeric type:

stat:linest(event_property_name_1, event_property_name_2)

Table 6.8. Regression derived properties

Property NameDescription
slopeSlope
yinterceptY Intercept

Calculate slope and y-intercept on price and offer for all events in the last 10 seconds.

select slope, yintercept from StockTickEvent.win:time(10 seconds).stat:linest('price', 'offer')

6.16.3.3. Correlation (stat:correl)

This view calculates the correlation value on two event properties. The view takes two event property names as parameters. The event properties must be of numeric type:

stat:correl(event_property_name_1, event_property_name_2)

Table 6.9. Correlation derived properties

Property NameDescription
correlCorrelation between two event properties

Calculate correlation on price and offer over all stock tick events for IBM.

select * from StockTickEvent(symbol='IBM').stat:correl('price', 'offer')

6.16.3.4. Weighted average (stat:weighted_avg)

This view returns the weighted average given a weight field and a field to compute the average for. The view takes two event property names as parameters. The event properties must be of numeric type:

stat:weighted_avg(event_property_name_field, event_property_name_weight)

Table 6.10. Weighted average derived properties

Property NameDescription
averageWeighted average

A statement that derives the volume-weighted average price for the last 3 seconds:

select average 
from StockTickEvent(symbol='IBM').win:time(3 seconds).stat:weighted_avg('price', 'volume')

6.16.3.5. Multi-dimensional statistics (stat:multidim_stats)

This view works similar to the std:groupby views in that it groups information by one or more event properties. The view accepts 3 or more parameters: The first parameter to the view defines the univariate statistics values to derive. The second parameter is the property name to derive data from. The remaining parameters supply the event property names to use to derive dimensions.

stat:multidim_stats(values_to_derive, property_name_datapoint, property_name_column)
stat:multidim_stats(values_to_derive, property_name_datapoint, property_name_column, 
    property_name_row)
stat:multidim_stats(values_to_derive, property_name_datapoint, property_name_column, 
    property_name_row, property_name_page)

Table 6.11. Multi-dim derived properties

Property NameDescription
cubeThe cube following the net.esper.view.stat.olap.Cube interface

The example below derives the count, average and standard deviation latency of service measurement events per customer.

select cube from ServiceMeasurement.stat:multidim_stats({‘count’, ‘average’, ‘stdev’}, 
    'latency', 'customer')

This example derives the average latency of service measurement events per customer, service and error status for events in the last 30 seconds.

select * from ServiceMeasurement.win:length(30000).stat:multidim_stats({‘average’}, 
	'latency', 'customer', 'service', 'status')

6.16.4. Extension View Set

6.16.4.1. Sorted Window View (ext:sort)

This view sorts by values of the specified event properties and keeps only the top events up to the given size.

The syntax to sort on a single event property is as follows.

std:sort(property_name, is_descending, size) 

To sort on a multiple event properties the syntax is as follows.

sort( { property_name, is_descending [ , property_name, is_descending ...] }, size) 

The view below sorts on price descending keeping the lowest 10 prices and reporting statistics on price.

select * from StockTickEvent.ext:sort('price', false, 10).stat:uni('price')

The following example sorts events first by price in descending order, and then by symbol name in ascending (alphabetical) order, keeping only the 10 events with the highest price (with ties resolved by alphabetical order of symbol).

select * from StockTickEvent.ext:sort({'price', true, 'symbol', false}, 10)

6.17. User-Defined Functions

A user-defined function can be invoked anywhere as an expression itself or within an expresson. The function must simply be a public static method that the classloader can resolve at statement creation time. The engine resolves the function reference at statement creation time and verifies parameter types.

The example below assumes a class MyClass that exposes a public static method myFunction accepting 2 parameters, and returing a numeric type such as double.

select 3 * MyClass.myFunction(price, volume) as myValue 
from StockTick.win:time(30 sec)

User-defined functions also take array parameters as this example shows. The section on Section 6.15.5, “Array Definition Operator” outlines in more detail the types of arrays produced.

select * from RFIDEvent where com.mycompany.rfid.MyChecker.isInZone(zone, {10, 20, 30})