General

How do I measure the rate of arrival of events in a given time period?

The time batch window of 1 second as shown below produces an event each second with the count of the number of events in the batch.

select count(*) as cnt from MarketDataEvent.win:time_batch(1 second)

How do I measure the rate of arrival of events in a given time period per another category?

We can solve this problem by grouping the events in the window per the category. The below example uses 'feed' as the category.

select feed, count(*) as cnt from MarketDataEvent.win:time_batch(1 second) group by feed

How do I correlate events arriving in 2 or more streams?

The join of event streams looks very similar to joins in SQL. To bind data in the streams together, across streams, we identify keys to join on.

The below example specifies the 'accountNumber' field as the only join key. In this example we hold the last 30 seconds of events for each stream.

select fraud.accountNumber as accntNum, withdraw.amount as amount         
from FraudWarningEvent.win:time(30 sec) as fraud,
     WithdrawalEvent.win:time(30 sec) as withdraw
where fraud.accountNumber = withdraw.accountNumber

How do I find missing events arriving in 2 or more streams that are correlated?

As in SQL we can use outer joins to generate a result even if one or more of the correlated events are not found in a stream. Usually we want to generate the result after a certain time or after a certain number of events have been received, indicating that a correlated event is truely missing.

In this example we are looking for a withdrawal event without a login event for the same account number after 60 seconds.

We join withdrawal events with login events looking for login events that do not exist (account number is null). We want to get notified as these events leave the 60-second time window.

select withdraw.accountNumber as accntNum, withdraw.amount as amount         
from WithdrawalEvent.win:time(60 sec) as withdraw
     left outer join
     LoginEvent.win:time(60 sec) as login
on fraud.accountNumber = withdraw.accountNumber
where login.accountNumber = null

How do I keep a separate window of events per category and compute aggregates for each category's window?

I have one or more categories and for each of these categories I need to keep a separate window of events.

In the statement below we have stock tick events for which we want to compute the average price of the last 10 stock tick events per symbol. Notice we are not using the last 10 events overall, we are looking at the last 10 events per symbol.

select symbol, avg(price) as avgPrice from StockTick.std:groupby('symbol').win:length(10)

We can also specify multiple categories:

select symbol, location, avg(price) as avgPrice from StockTick.std:groupby({'symbol','location'}).win:length(10)

How do I use results of one statement in another statement?

Use the insert into syntax to use the events generated by one statement as input to another statement.

We can first compute the number of events arriving within 1 second, then use that number to perform additional aggregation. Here we compute for the last 30 seconds the maximum and minimum rate per feed.

insert into TicksPerSecond select feed, count(*) as cnt from MarketDataEvent.win:time_batch(1 second) group by feed
	
select feed, max(cnt) as maxCount, min(cnt) as minCount from TicksPerSecond.win:time(30 sec) group by feed

How do I reduce the rate of event output by my statement? How do I get frequent but not continuous results?

Use output rate limiting to stabilize or reduce the rate at which rows are output from a query, by outputting rows at a specified time or row-based interval.

The example below limits the otherwise continuous output to an output row every 5 seconds. The output contains the feed and average volume per feed of the last 60 seconds of market data events.

select feed, avg(volume) as cnt from MarketDataEvent.win:time(60 sec) group by feed output every 5 seconds

How do I detect the absence of an event?

Use a pattern to detect the absence of an event. The below pattern fires if an event A is not followed by an event B within 10 seconds.

select * from pattern [every EventA -> (timer:interval(10 sec) and not EventB)]

Outer joins are also a good way to detect missing events. A solution with an outer join was discussed above.