J2EE Self-Service Terminal - Case Study

A J2EE-based system for managing self-service terminals in an airport receives a large number of events from connected terminals. The event rate is around 500 events per second. Some events indicate abnormal situations such as 'paper low' or 'terminal out of order'. Other events report activity as customers use a terminal to check in and print boarding tickets.

Our goal is to resolve self-service terminal or network problems before our customers report them to us, which means higher availability and greater customer satisfaction.

To accomplish this, we would like to raise an alert when certain conditions occur that warrant human intervention. For example, a customer may be in the middle of a check-in process when the terminal detects a hardware problem or when the network goes down. Under these conditions we would like to dispatch staff to help that customer, and staff to diagnose the hardware or network problem.

We also need to view and summarize activity on an ongoing basis and feed this to a real-time interface. This enables a person to watch the system in action and spot abnormalities. The system can also compare the summarized activity to stored normal usage patterns.

The case study will first define the events published by terminals. Next, it discusses and evolves the EQL event queries to solve a couple of different challenges in managing and reporting on terminal activity. Last, the case study discusses how this example has been implemented using a J2EE application server.

Events

Each self-service terminal can publish any of the 6 events below.

Checkin      Indicates a customer started a check-in dialog
Cancelled    Indicates a customer cancelled a check-in dialog
Completed    Indicates a customer completed a check-in dialog
OutOfOrder   Indicates the terminal detected a hardware problem
LowPaper     Indicates the terminal is low on paper
Status       Indicates terminal status, published every 1 minute regardless of activity as a terminal heartbeat
Event Types

All events provide information about the terminal that published the event, and a timestamp. The terminal information is held in a property named "term" and provides a terminal id.

Since all events carry similar information, we model each event as a subtype of a base class BaseTerminalEvent, which provides the terminal information that all events share. This enables us to treat all terminal events polymorphically, that is, we can treat derived event types just like their parent event type. This helps simplify our queries.
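
The class hierarchy described above could look roughly like the following sketch. Only BaseTerminalEvent, the "term" property with its terminal id, and the six event names come from this case study. The TerminalInfo class, the constructor signatures, and the way the "type" property is derived from the class name are assumptions made for illustration and are not taken from the example code in the distribution.

```java
// Hypothetical sketch of the event model. Only BaseTerminalEvent, "term",
// "type" and the six event names come from the text; the rest is assumed.
class TerminalEvents {

    /** Terminal information shared by all events; exposes the terminal id. */
    static final class TerminalInfo {
        private final String id;
        TerminalInfo(String id) { this.id = id; }
        public String getId() { return id; }
    }

    /** Base class providing the "term" and timestamp properties. */
    static abstract class BaseTerminalEvent {
        private final TerminalInfo term;
        private final long timestamp;

        BaseTerminalEvent(TerminalInfo term, long timestamp) {
            this.term = term;
            this.timestamp = timestamp;
        }

        public TerminalInfo getTerm() { return term; }
        public long getTimestamp() { return timestamp; }

        // Assumption: the "type" property is the simple class name.
        public String getType() { return getClass().getSimpleName(); }
    }

    static final class Checkin extends BaseTerminalEvent {
        Checkin(TerminalInfo term, long ts) { super(term, ts); }
    }
    static final class Cancelled extends BaseTerminalEvent {
        Cancelled(TerminalInfo term, long ts) { super(term, ts); }
    }
    static final class Completed extends BaseTerminalEvent {
        Completed(TerminalInfo term, long ts) { super(term, ts); }
    }
    static final class OutOfOrder extends BaseTerminalEvent {
        OutOfOrder(TerminalInfo term, long ts) { super(term, ts); }
    }
    static final class LowPaper extends BaseTerminalEvent {
        LowPaper(TerminalInfo term, long ts) { super(term, ts); }
    }
    static final class Status extends BaseTerminalEvent {
        Status(TerminalInfo term, long ts) { super(term, ts); }
    }

    public static void main(String[] args) {
        // Any concrete event can be handled through the base type.
        BaseTerminalEvent event = new LowPaper(new TerminalInfo("T1"), System.currentTimeMillis());
        System.out.println(event.getType() + " from terminal " + event.getTerm().getId());
    }
}
```

With JavaBean-style getters such as these, the nested property expression term.id used in the statements of this case study corresponds to getTerm().getId().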

All terminals publish Status events every 1 minute. In normal cases, the Status events indicate that a terminal is alive and online. The absence of status events may indicate that a terminal went offline for some reason and that may need to be investigated.

Introduction to EQL and Patterns

EQL is the object-oriented event stream query language that Esper provides. EQL is very similar to SQL in its syntax and provides additional capabilities. As part of EQL, Esper also offers a pattern language that provides for stateful (state-machine) event pattern matching. EQL and patterns can be used alone or can be combined into useful, easy to read statements.

As a start, let's assume we want to dispatch staff to restock paper supply when a terminal publishes a LowPaper event. We use a simple EQL statement as below.

select * from LowPaper

The next statement is equivalent to the statement before, but uses a pattern syntax to filter for LowPaper events. Pattern statements are identified via the pattern keyword and placed in square brackets.

select a from pattern [ every a=LowPaper ]

Besides looking for LowPaper events, we would also like to be notified when OutOfOrder events arrive. One solution uses the or operator for patterns:

select a,b from pattern [ every a=LowPaper or every b=OutOfOrder]

We could also use two separate statements that each filter for only one type of event:

select * from LowPaper
select * from OutOfOrder

Let's look at another solution that we could implement with the help of BaseTerminalEvent. Remember, all our events are subclasses of BaseTerminalEvent since they share similar information. We can use the where clause to select only the events we are interested in, here by comparing the event's type property.

select * from BaseTerminalEvent
where type = 'LowPaper' or type = 'OutOfOrder'

Detecting customer check-in issues

A customer may be in the middle of a check-in when the terminal detects a hardware problem or when the network goes down. In that situation we want to alert a team member to help the customer.

When the terminal detects a problem, it issues an OutOfOrder event. When the network or network connection goes down, we can detect this fact by the absence of Status events that the terminal sends every 1 minute.

A simple pattern that allows us to detect a Checkin event that is followed by an OutOfOrder event for the same terminal is shown below.

select * from pattern [ every a=Checkin -> OutOfOrder(term.id = a.term.id) ]

The every keyword in this pattern indicates that we want to consider all Checkin events, not just the first Checkin event. The -> symbol is the followed-by operator. In the followed-by we are looking for OutOfOrder events in which the terminal id matches the terminal id of the Checkin event.

If the customer cancels or completes her or his check-in process before the terminal indicates an OutOfOrder event, then we don't want to alert a team member right away. Let's refine the pattern to match on Checkin events followed by an OutOfOrder event without a Cancelled or Completed event in between.

select * from pattern [ every a=Checkin -> 
      ( OutOfOrder(term.id=a.term.id) and not (Cancelled(term.id=a.term.id) or Completed(term.id=a.term.id)) )]

The "and" and "not" pattern operators allow us to specify that we are not interested in OutOfOrder events after a customer cancelled or completed the check-in process. The last statement fires if a Checkin event is followed by an OutOfOrder event without a Cancelled or Completed event occurring after the Checkin event and before the OutOfOrder event.
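
To make the matching rule concrete, here is a small plain-Java sketch of the same logic, written without Esper. It is not how the engine evaluates patterns; it only mirrors the rule that a Checkin followed by an OutOfOrder for the same terminal raises an alert unless a Cancelled or Completed event for that terminal arrives in between. The class name CheckinFailureDetector, the Kind enum, and the alert text are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the pattern's matching rule; not Esper code.
class CheckinFailureDetector {

    enum Kind { CHECKIN, CANCELLED, COMPLETED, OUT_OF_ORDER }

    // Timestamps of check-ins that are still open, keyed by terminal id.
    private final Map<String, List<Long>> pending = new HashMap<>();

    /** Feeds one event and returns one alert per check-in it interrupts. */
    List<String> onEvent(Kind kind, String terminalId, long timestamp) {
        List<String> alerts = new ArrayList<>();
        switch (kind) {
            case CHECKIN:
                // "every a=Checkin": each check-in starts its own watch.
                pending.computeIfAbsent(terminalId, k -> new ArrayList<>()).add(timestamp);
                break;
            case CANCELLED:
            case COMPLETED:
                // "and not (Cancelled or Completed)": stop watching this terminal.
                pending.remove(terminalId);
                break;
            case OUT_OF_ORDER:
                // "-> OutOfOrder": every open check-in on this terminal matches.
                List<Long> open = pending.remove(terminalId);
                if (open != null) {
                    for (long startedAt : open) {
                        alerts.add("terminal " + terminalId
                                + ": check-in started at " + startedAt + " was interrupted");
                    }
                }
                break;
        }
        return alerts;
    }

    public static void main(String[] args) {
        CheckinFailureDetector detector = new CheckinFailureDetector();
        detector.onEvent(Kind.CHECKIN, "T1", 1L);
        System.out.println(detector.onEvent(Kind.OUT_OF_ORDER, "T1", 2L));
    }
}
```

An OutOfOrder event for a terminal without an open check-in produces no alert in this sketch, just as the pattern does not fire without a preceding Checkin.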

Absence of Status events

Each self-service terminal publishes a Status event every 1 minute. In normal cases, the Status event indicates the terminal is alive and online. The absence of Status events may indicate that a terminal went offline for some reason and that needs to be investigated.

Since Status events arrive in regular intervals of 60 seconds, we can make use of temporal pattern matching with timers to find events that didn't arrive. We can use the every operator and timer:interval() to repeat an action every 60 seconds. Then we combine this with a not operator to check for the absence of Status events. A 65 second interval during which we look for Status events allows 5 seconds to account for a possible delay in transmission or processing:

select 'terminal 1 is offline' from pattern [ every timer:interval(60 sec) -> (timer:interval(65 sec) and not Status(term.id = 'T1')) ]

This statement allows us to detect situations in which no Status event arrives for the terminal with id 'T1'. While Status events remain absent, the statement alerts us every 60 seconds.

We may not want to see an alert for the same terminal every 1 minute. Therefore let's add an output first clause to indicate that we only want to be alerted the first time this happens, then not be alerted for 5 minutes, and then be alerted again if it happens again.

select 'terminal 1 is offline' from pattern [ every timer:interval(60 sec) -> (timer:interval(65 sec) and not Status(term.id = 'T1')) ]
output first every 5 minutes
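
The combination of a 65 second grace period and a 5 minute alert suppression can also be pictured in plain Java. The sketch below is an approximation for illustration only: it checks the age of the last Status event when asked, rather than reproducing the engine's timer semantics, and the class name HeartbeatMonitor and its methods are hypothetical.

```java
// Hypothetical approximation of the heartbeat check; not Esper code.
class HeartbeatMonitor {

    static final long GRACE_MILLIS = 65_000L;           // 60 s heartbeat plus 5 s slack
    static final long SUPPRESS_MILLIS = 5L * 60_000L;   // quiet period after an alert

    private long lastStatusMillis;
    private long nextAlertAllowedMillis = Long.MIN_VALUE;

    HeartbeatMonitor(long startMillis) {
        this.lastStatusMillis = startMillis;
    }

    /** Records a Status event received from the terminal. */
    void onStatus(long nowMillis) {
        lastStatusMillis = nowMillis;
    }

    /**
     * Returns true if the terminal has been silent for longer than the grace
     * period and no alert was raised within the last 5 minutes.
     */
    boolean shouldAlert(long nowMillis) {
        boolean offline = nowMillis - lastStatusMillis > GRACE_MILLIS;
        if (!offline || nowMillis < nextAlertAllowedMillis) {
            return false;
        }
        nextAlertAllowedMillis = nowMillis + SUPPRESS_MILLIS;
        return true;
    }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor(0L);
        monitor.onStatus(10_000L);
        System.out.println("alert at 120 s: " + monitor.shouldAlert(120_000L));
        System.out.println("alert at 180 s: " + monitor.shouldAlert(180_000L));
    }
}
```

A caller would invoke shouldAlert periodically, for example once per minute, which plays the role of the repeating timer in the statement above.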

Activity summary data

By presenting statistical information about terminal activity to our staff in real-time we enable them to monitor the system and spot problems. To begin with, the real-time console should show a count of the number of check-in processes started, in progress, cancelled and completed within the last 10 minutes.

Let's model a first query that counts the number of Checkin processes only, considering only the last 10 minutes of event data:

select count(*) from Checkin.win:time(10 minutes)

Note the use of the win:time syntax that gives us a time window consisting of only the last 10 minutes of Checkin events.

We can easily improve on this query and get a count per event type considering all types of events (Checkin, Completed, Cancelled, Status, OutOfOrder, LowPaper) by using BaseTerminalEvent. Again we want to look at only the last 10 minutes of events.

select type, count(*) from BaseTerminalEvent.win:time(10 minutes) group by type
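
For readers new to time windows, the following plain-Java sketch shows one way a per-type count over a sliding window could be maintained by hand. It is meant only to illustrate what the statement above computes. The class name SlidingTypeCounter is hypothetical, the sketch assumes events arrive in timestamp order, and it treats an event as expired once it is at least one window length old.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical hand-written equivalent of a grouped count over a time window.
class SlidingTypeCounter {

    private static final class Entry {
        final String type;
        final long timestamp;
        Entry(String type, long timestamp) { this.type = type; this.timestamp = timestamp; }
    }

    private final long windowMillis;
    private final Deque<Entry> window = new ArrayDeque<>();
    private final Map<String, Integer> counts = new TreeMap<>();

    SlidingTypeCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Adds an event; assumes timestamps never decrease. */
    void add(String type, long timestamp) {
        expire(timestamp);
        window.addLast(new Entry(type, timestamp));
        counts.merge(type, 1, Integer::sum);
    }

    /** Returns a copy of the counts per type for the window ending at nowMillis. */
    Map<String, Integer> countsAt(long nowMillis) {
        expire(nowMillis);
        return new TreeMap<>(counts);
    }

    // Drops events that have left the window and decrements their type's count.
    private void expire(long nowMillis) {
        while (!window.isEmpty() && window.peekFirst().timestamp <= nowMillis - windowMillis) {
            Entry old = window.removeFirst();
            counts.computeIfPresent(old.type, (type, n) -> n == 1 ? null : n - 1);
        }
    }

    public static void main(String[] args) {
        SlidingTypeCounter counter = new SlidingTypeCounter(10L * 60_000L);
        counter.add("Checkin", 0L);
        counter.add("Checkin", 60_000L);
        counter.add("Completed", 120_000L);
        System.out.println(counter.countsAt(120_000L));
    }
}
```

The EQL statement expresses the same bookkeeping in a single line, which is the main point of using a time window.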

Let's also use the number of events per event type to compare against a pre-recorded normal usage pattern. By using the insert into syntax we can make the results of this query available to other statements:

insert into CountPerType
select type, min(timeMinute) as time, count(*) as countPerType 
from BaseTerminalEvent.win:time(10 minutes) 
group by type
output all every 1 minutes

In the above query we have added an output all clause. Since we only need to compare the counts every minute or so, the output all clause makes the query output counts for all event types once every 1 minute rather than continuously. The "timeMinute" field has been added to the selection list to hold the current hour and minute.

In our terminal managing system, when the count for any event type falls below 50% of the normal usage pattern for that time, we would like to have the system alert a staff member.

Assuming that usage pattern data has been loaded into NormalUsagePattern at the beginning of the day, we can join the CountPerType using hour and minute:

select t1.type, t1.countPerType from CountPerType.win:time(10 minutes) as t1, NormalUsagePattern as t2
where t1.time = t2.timeMinute
  and t1.countPerType < 0.5 * t2.countPerType
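
The 50% rule itself is simple arithmetic, shown here as a plain-Java sketch. The class and method names are hypothetical, and the sketch assumes that both maps hold counts per event type for the same hour and minute. Unlike the join, it treats an event type with no observed events as a count of zero.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the "below 50% of normal" comparison.
class UsageComparison {

    /**
     * Returns the event types whose observed count is below half of the
     * normal count. Both maps are keyed by event type and are assumed to
     * describe the same hour and minute.
     */
    static List<String> belowHalfOfNormal(Map<String, Long> observed, Map<String, Long> normal) {
        List<String> lowTypes = new ArrayList<>();
        for (Map.Entry<String, Long> entry : normal.entrySet()) {
            // Assumption: a type that produced no events counts as zero.
            long count = observed.getOrDefault(entry.getKey(), 0L);
            if (count < 0.5 * entry.getValue()) {
                lowTypes.add(entry.getKey());
            }
        }
        return lowTypes;
    }

    public static void main(String[] args) {
        Map<String, Long> normal = new TreeMap<>();
        normal.put("Checkin", 100L);
        normal.put("Completed", 80L);

        Map<String, Long> observed = new TreeMap<>();
        observed.put("Checkin", 40L);    // 40 is below 50, so this type is reported
        observed.put("Completed", 60L);  // 60 is not below 40

        System.out.println(belowHalfOfNormal(observed, normal));
    }
}
```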

Sample application for J2EE application server

The example code in the distribution package implements a message-driven Enterprise JavaBean (MDB). We used an MDB as a convenient place for processing incoming events via a JMS message queue or topic. The example uses 2 JMS queues: one queue to receive events published by the terminals, and a second queue to report situations detected via EQL statements and listeners back to a receiving process.

This example has been packaged for deployment into a JBoss Java application server (see http://www.jboss.org) with default deployment configuration. JBoss is an open-source application server available under the LGPL license. Of course the choice of application server does not indicate a requirement or preference for the use of Esper in a J2EE container. Other quality J2EE application servers are available and perhaps more suitable to run this example or a similar application.

Note that the example code implements only a subset of the statements in this case study, since many of the statements are shown to provide a step-by-step explanation. The complete example code can be found in the "eg/terminalsvc" folder of the distribution. The Java package name is net.esper.example.terminalsvc.

Running the Example

The pre-built EAR file contains the MDB for deployment to a JBoss application server with default deployment options. The JBoss default configuration provides 2 queues that this example utilizes: queue/A and queue/B.

The application can be deployed by copying the EAR file in the "eg/terminalsvc/terminalsvc-ear" folder to your JBoss deployment directory, "server/default/deploy", located under the JBoss home directory.

The example contains an event simulator and an event receiver that can be invoked from the command line. See the readme file and the start scripts for Windows and Unix in the "eg/terminalsvc/etc" folder, and the documentation set for further information on the simulator.

Building the Example

This example requires Maven 2 to build. To build the example, change directory to the folder "eg/terminalsvc" and type "mvn package".

The Maven build packages the EAR file for deployment to a JBoss application server with default deployment options.

The above instructions have been tested with JBoss AS 4.0.4.GA and Maven 2.0.4.