Introduction
Drools 5.0 is now split into 4 modules, each with its own manual: Guvnor, Expert, Fusion and Flow. Guvnor is our web-based governance system, traditionally referred to in the rules world as a BRMS. We decided to move away from the BRMS term to a play on governance, as it's not rules-specific. Expert is the traditional rules engine. Fusion is the event processing side; the name is a play on data/sensor fusion terminology. Finally, there is Flow, which is our workflow module.
Drools documentation can be found on the JBoss site: http://jboss.org/drools/documentation.html
Drools Expert (Rule Engine)
Working Sessions
Stateless
A stateless session, not utilising inference, forms the simplest use case. A stateless session can be called like a function, passing it some data and then receiving some results back. Some common use cases for stateless sessions include, but are not limited to:
- Validation
- Is this person eligible for a mortgage?
- Calculation
- Compute a mortgage premium.
- Routing and Filtering
- Filter incoming messages, such as emails, into folders.
- Send incoming messages to a destination.
Stateful
Stateful sessions are longer lived and allow iterative changes over time. Some common use cases for stateful sessions include, but are not limited to:
- Monitoring
- Stock market monitoring and analysis for semi-automatic buying.
- Diagnostics
- Fault finding, medical diagnostics
- Logistics
- Parcel tracking and delivery provisioning
- Compliance
- Validation of legality for market trades.
Rules vs Methods
Methods:
- Methods are called directly.
- Specific instances are passed.
- One call results in a single execution.
Rules:
- Rules execute by matching against any data as long as it is inserted into the engine.
- Rules can never be called directly.
- Specific instances cannot be passed to a rule.
- Depending on the matches, a rule may fire once or several times, or not at all.
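The contrast above can be sketched in plain Java. This is a toy model only, not the Drools API: `ToyRuleSession`, `fire` and `isAdult` are hypothetical names invented for illustration. The method is called directly on one specific value, while the "rule" is evaluated against whatever facts happen to have been inserted and may fire zero, one or many times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical toy model, not the Drools API: a "rule" is a condition plus an action,
// and it fires once for every inserted fact that satisfies the condition.
final class ToyRuleSession<T> {
    private final List<T> facts = new ArrayList<>();

    void insert(T fact) {
        facts.add(fact);
    }

    // Evaluates the rule against every inserted fact and returns how many times it fired.
    int fire(Predicate<T> condition, Consumer<T> action) {
        int fired = 0;
        for (T fact : facts) {
            if (condition.test(fact)) {
                action.accept(fact);
                fired++;
            }
        }
        return fired;
    }
}

final class RulesVsMethodsDemo {
    // A method: called directly, on one specific instance, executing once per call.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    public static void main(String[] args) {
        ToyRuleSession<Integer> session = new ToyRuleSession<>();
        session.insert(12);
        session.insert(30);
        session.insert(45);

        // The rule is never handed a specific instance; it sees every inserted fact.
        List<Integer> adults = new ArrayList<>();
        int fired = session.fire(age -> age >= 18, adults::add);
        System.out.println("rule fired " + fired + " times for " + adults);
        System.out.println("method result for 30: " + isAdult(30));
    }
}
```

A real engine matches incrementally as facts are inserted rather than looping over them on demand; the loop here only illustrates the calling convention.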
Agenda
What if you don't want the order of Activation execution to be arbitrary? When there is one or more Activations on the Agenda they are said to be in conflict, and a conflict resolver strategy is used to determine the order of execution. At the simplest level the default strategy uses salience to determine rule priority. Each rule has a default salience of 0; the higher the value, the higher the priority.
Rules and SQL
Example: DRL rules:
The two rules above can be represented with two SQL views and a trigger for each view, as below:
Knowledge Base by Configuration Using Changesets
So far, the programmatic API has been used to build a Knowledge Base. Quite often it's more desirable to do this via configuration. To facilitate this, Drools supports the "Changeset" feature. The file changeset.xml contains a list of resources, and it may also point recursively to another changeset XML file. Currently the changeset has only a single "add" element, but support for remove and modify will be added in the future, for more powerful incremental changes over time. Currently there is no XML schema for the changeset XML, but we hope to add one soon. A few examples will be shown to give you the gist of things.
Resources are identified using a prefix that indicates the protocol. All the protocols provided by java.net.URL, such as "file" and "http", are supported, as well as an additional "classpath". Currently the type attribute must always be specified for a resource, as it is not inferred from the file name extension. Here is a simple example that points to an http location for some rules.
<change-set xmlns='http://drools.org/drools-5.0/change-set'
            xmlns:xs='http://www.w3.org/2001/XMLSchema-instance'
            xs:schemaLocation='http://drools.org/drools-5.0/change-set drools-change-set-5.0.xsd' >
  <add>
    <resource source='http:org/domain/myrules.drl' type='DRL' />
  </add>
</change-set>
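To make the shape of a changeset concrete, here is a small sketch that reads such a file with the JDK's own XML parser and lists each resource's type and source. It is only an illustration of the document structure: `ChangeSetReader` and `listResources` are hypothetical names, and this is not how Drools itself loads changesets.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; it inspects a changeset, it does not build a Knowledge Base.
final class ChangeSetReader {

    // Returns one "TYPE source" string per <resource> element found in the changeset XML.
    static List<String> listResources(String changeSetXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(changeSetXml.getBytes(StandardCharsets.UTF_8)));

            // "*" matches any namespace, so the changeset namespace does not need to be repeated here.
            NodeList nodes = doc.getElementsByTagNameNS("*", "resource");
            List<String> resources = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element resource = (Element) nodes.item(i);
                resources.add(resource.getAttribute("type") + " " + resource.getAttribute("source"));
            }
            return resources;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read changeset", e);
        }
    }
}
```

Listing the `type` next to the `source` mirrors the point made above: the type is stated explicitly on every resource rather than being guessed from the file extension.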
Knowledge Agent
The Knowledge Agent provides automatic loading, caching and re-loading of resources and is configured from a properties file. The Knowledge Agent can update or rebuild the Knowledge Base as the resources it uses are changed.
Drools Fusion
Use Cases
Supporting Complex Event Processing, though, is much more than simply understanding what an event is. CEP scenarios share several common and distinguishing characteristics:
- Usually required to process huge volumes of events, but only a small percentage of the events are of real interest.
- Events are usually immutable, since they are a record of state change.
- Usually the rules and queries on events must run in reactive modes, i.e., react to the detection of event patterns.
- Usually there are strong temporal relationships between related events.
- Individual events are usually not important. The system is concerned about patterns of related events and their relationships.
- Usually, the system is required to perform composition and aggregation of events.
Based on these general common characteristics, Drools Fusion defined a set of goals to be achieved in order to support Complex Event Processing appropriately:
- Support Events, with their proper semantics, as first class citizens.
- Allow detection, correlation, aggregation and composition of events.
- Support processing of Streams of events.
- Support temporal constraints in order to model the temporal relationships between events.
- Support sliding windows of interesting events.
- Support a session scoped unified clock.
- Support the required volumes of events for CEP use cases.
- Support for reactive rules.
- Support adapters for event input into the engine (pipeline).
Event
An event is a fact that presents a few distinguishing characteristics:
- Usually immutable.
- Strong temporal constraints: rules involving events usually require the correlation of multiple events, especially temporal correlations where events are said to happen at some point in time relative to other events.
- Managed lifecycle: due to their immutable nature and the temporal constraints, events usually will only match other events and facts during a limited window of time.
- Use of sliding windows: since all events have timestamps associated with them, it is possible to define and use sliding windows over them.
Declaration
import some.package.StockTick

declare StockTick
    @role( event )
end
Event Metadata
All events have a set of metadata associated with them. For the examples, let's assume the user has the following class in the application domain model:
import java.util.Date;

public class VoiceCall {
    private String originNumber;
    private String destinationNumber;
    private Date callDateTime;
    private long callDuration; // in milliseconds
    // constructors, getters and setters
}
The event declaration could be:
declare VoiceCall
    @role( event )
    @timestamp( callDateTime )
    @duration( callDuration )
    @expires( 1h35m )
end
Where:
attribute | syntax | notes |
---|---|---|
@role | @role( <fact \| event> ) | Possible values: fact (the default) declares that the type is to be handled as a regular fact; event declares that the type is to be handled as an event. |
@timestamp | @timestamp( <attributename> ) | Every event has an associated timestamp. By default, the timestamp for a given event is read from the Session Clock and assigned to the event at the time the event is inserted into the working memory. @timestamp tells the engine to read it from the named attribute of the event instead. |
@duration | @duration( <attributename> ) | Drools supports both event semantics: point-in-time events and interval-based events. A point-in-time event is represented as an interval-based event whose duration is zero. By default, all events have duration zero. @duration tells the engine to read the duration from the named attribute instead. |
@expires | @expires( <timeoffset> ) | Events may be automatically expired after some time in the working memory. Typically this happens when, based on the existing rules in the knowledge base, the event can no longer match and activate any rules. It is also possible to define explicitly when an event should expire. |
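Time offsets such as `1h35m` appear throughout these notes. As a rough illustration of how such a value maps to milliseconds, here is a small plain-Java parser. It is a sketch, not Drools code: `TimeOffsets.toMillis` is a hypothetical helper, and it assumes the offset is a sequence of `d`, `h`, `m`, `s` and `ms` parts in that order, each optional.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; Drools parses time offsets internally.
final class TimeOffsets {
    // Assumed grammar: [#d][#h][#m][#s][#ms], e.g. "1h35m" or "3m30s".
    // The (?!s) lookahead stops the minutes part from swallowing the "m" of "ms".
    private static final Pattern OFFSET = Pattern.compile(
            "(?:(\\d+)d)?(?:(\\d+)h)?(?:(\\d+)m(?!s))?(?:(\\d+)s)?(?:(\\d+)ms)?");

    // Milliseconds per unit, in the same order as the capture groups above.
    private static final long[] UNIT_MILLIS = {86_400_000L, 3_600_000L, 60_000L, 1_000L, 1L};

    static long toMillis(String offset) {
        Matcher matcher = OFFSET.matcher(offset);
        if (offset.isEmpty() || !matcher.matches()) {
            throw new IllegalArgumentException("Not a time offset: " + offset);
        }
        long total = 0;
        for (int i = 0; i < UNIT_MILLIS.length; i++) {
            String part = matcher.group(i + 1);
            if (part != null) {
                total += Long.parseLong(part) * UNIT_MILLIS[i];
            }
        }
        return total;
    }
}
```

With this sketch, the `@expires( 1h35m )` value from the declaration above corresponds to `TimeOffsets.toMillis("1h35m")`, which is the same unit (milliseconds) as the `callDuration` attribute.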
Streams Support
Most CEP use cases have to deal with streams of events. The streams can be provided to the application in various forms, from JMS queues to flat text files, from database tables to raw sockets or even through web service calls. In any case, the streams share a common set of characteristics:
- events in the stream are ordered by a timestamp. The timestamp may have different semantics for different streams but they are always ordered internally.
- volumes of events are usually high.
- atomic events are rarely useful by themselves. Usually meaning is extracted from the correlation between multiple events from the stream and also from other sources.
- streams may be homogeneous, i.e. contain a single type of events, or heterogeneous, i.e. contain multiple types of events.
Drools generalized the concept of a stream as an "entry point" into the engine. For Drools, an entry point is a gate through which facts enter the engine. The facts may be regular facts or special facts like events.
Open stream
// create your rulebase and your session as usual
StatefulKnowledgeSession session = ...
// get a reference to the entry point
WorkingMemoryEntryPoint atmStream = session.getWorkingMemoryEntryPoint( "ATM Stream" );
// and start inserting your facts into the entry point
atmStream.insert( aWithdrawRequest );
Temporal Operators
After
Let's look at an example:
$eventA : EventA( this after[ 3m30s, 4m ] $eventB )
The previous pattern will match if and only if the temporal distance between the time when $eventB finished and the time when $eventA started is between ( 3 minutes and 30 seconds ) and ( 4 minutes ). In other words:
3m30s <= $eventA.startTimestamp - $eventB.endTimestamp <= 4m
Before
Let's look at an example:
$eventA : EventA( this before[ 3m30s, 4m ] $eventB )
The previous pattern will match if and only if the temporal distance between the time when $eventA finished and the time when $eventB started is between ( 3 minutes and 30 seconds ) and ( 4 minutes ). In other words:
3m30s <= $eventB.startTimestamp - $eventA.endTimestamp <= 4m
Coincides
Let's look at an example:
$eventA : EventA( this coincides $eventB )
The previous pattern will match if and only if the start timestamps of both $eventA and $eventB are the same AND the end timestamps of both $eventA and $eventB are also the same.
During
Let's look at an example:
$eventA : EventA( this during $eventB )
The previous pattern will match if and only if $eventA starts after $eventB starts and finishes before $eventB finishes.
Finishes
Let's look at an example:
$eventA : EventA( this finishes $eventB )
The previous pattern will match if and only if $eventA starts after $eventB starts and finishes at the same time $eventB finishes.
Finished By
Let's look at an example:
$eventA : EventA( this finishedby $eventB )
The previous pattern will match if and only if $eventA starts before $eventB starts and finishes at the same time $eventB finishes.
Includes
Let's look at an example:
$eventA : EventA( this includes $eventB )
The previous pattern will match if and only if $eventB starts after $eventA starts and finishes before $eventA finishes.
Meets
Let's look at an example:
$eventA : EventA( this meets $eventB )
The previous pattern will match if and only if $eventA finishes at the same time $eventB starts.
Met By
Let's look at an example:
$eventA : EventA( this metby $eventB )
The previous pattern will match if and only if $eventA starts at the same time $eventB finishes.
Overlaps
Let's look at an example:
$eventA : EventA( this overlaps $eventB )
The previous pattern will match if and only if:
$eventA.startTimestamp < $eventB.startTimestamp < $eventA.endTimestamp < $eventB.endTimestamp
Overlapped By
Let's look at an example:
$eventA : EventA( this overlappedby $eventB )
The previous pattern will match if and only if:
$eventB.startTimestamp < $eventA.startTimestamp < $eventB.endTimestamp < $eventA.endTimestamp
Starts
Let's look at an example:
$eventA : EventA( this starts $eventB )
The previous pattern will match if and only if $eventA starts at the same time $eventB starts and finishes before $eventB finishes.
Started By
Let's look at an example:
$eventA : EventA( this startedby $eventB )
The previous pattern will match if and only if $eventA starts at the same time $eventB starts and $eventB finishes before $eventA finishes.
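The thirteen operators above all reduce to comparisons between the start and end timestamps of two events. The following plain-Java sketch restates those definitions as predicates. `Interval` is a hypothetical value type invented for illustration (Drools tracks these timestamps itself), timestamps are assumed to be milliseconds, and the `min`/`max` parameters correspond to the bracketed range in `after[ ... ]` and `before[ ... ]`.

```java
// Hypothetical value type for illustration; "this" plays the role of $eventA, "o" of $eventB.
final class Interval {
    final long start;
    final long end;

    Interval(long start, long end) {
        if (end < start) {
            throw new IllegalArgumentException("end must not precede start");
        }
        this.start = start;
        this.end = end;
    }

    // min <= this.start - o.end <= max
    boolean after(Interval o, long min, long max) {
        long distance = start - o.end;
        return min <= distance && distance <= max;
    }

    // min <= o.start - this.end <= max
    boolean before(Interval o, long min, long max) {
        long distance = o.start - end;
        return min <= distance && distance <= max;
    }

    boolean coincides(Interval o)    { return start == o.start && end == o.end; }
    boolean during(Interval o)       { return o.start < start && end < o.end; }
    boolean includes(Interval o)     { return o.during(this); }
    boolean finishes(Interval o)     { return o.start < start && end == o.end; }
    boolean finishedBy(Interval o)   { return o.finishes(this); }
    boolean meets(Interval o)        { return end == o.start; }
    boolean metBy(Interval o)        { return o.meets(this); }
    boolean overlaps(Interval o)     { return start < o.start && o.start < end && end < o.end; }
    boolean overlappedBy(Interval o) { return o.overlaps(this); }
    boolean starts(Interval o)       { return start == o.start && end < o.end; }
    boolean startedBy(Interval o)    { return o.starts(this); }
}
```

Writing the inverse operators (includes, finishedBy, metBy, overlappedBy, startedBy) in terms of their counterparts makes the symmetry explicit: `a.includes(b)` holds exactly when `b.during(a)` does.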
Event Processing Modes
CLOUD
The CLOUD processing mode is the default processing mode. Users of rules engines are familiar with this mode because it behaves in exactly the same way as any pure forward chaining rules engine, including previous versions of Drools. When running in CLOUD mode, the engine sees all facts in the working memory as a whole, regardless of whether they are regular facts or events. There is no notion of flow of time, although events have a timestamp as usual. In other words, although the engine knows that a given event was created, for instance, on January 1st 2009, at 09:35:40.767, it is not possible for the engine to determine how "old" the event is, because there is no concept of "now".
KnowledgeBaseConfiguration config = KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
config.setOption( EventProcessingOption.CLOUD );
STREAM
The STREAM processing mode is the mode of choice when the application needs to process streams of events. It adds a few common requirements to the regular processing, but enables a whole lot of features that make stream event processing a lot simpler.
KnowledgeBaseConfiguration config = KnowledgeBaseFactory.newKnowledgeBaseConfiguration();
config.setOption( EventProcessingOption.STREAM );
Sliding Window
A sliding window is a way to scope the events of interest as the ones belonging to a window that is constantly moving. The two most common sliding window implementations are time-based windows and length-based windows.
Sliding Time Windows
Sliding Time Windows allow the user to write rules that will only match events occurring in the last X time units. For instance, if the user wants to consider only the Stock Ticks that happened in the last 2 minutes, the pattern would look like this:
StockTick() over window:time( 2m )
Drools uses the "over" keyword to associate windows to patterns.
As a more elaborate example, if the user wants to sound an alarm in case the average temperature over the last 10 minutes read from a sensor is above the threshold value, the rule would look like:
rule "Sound the alarm in case temperature rises above threshold"
when
    TemperatureThreshold( $max : max )
    Number( doubleValue > $max ) from accumulate(
        SensorReading( $temp : temperature ) over window:time( 10m ),
        average( $temp ) )
then
    // sound the alarm
end
The engine will automatically discard any SensorReading older than 10 minutes and keep the calculated average consistent.
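To show what "discard old readings and keep the average consistent" amounts to, here is a plain-Java model of a time-based sliding window. It is a sketch under stated assumptions, not the Drools implementation: `SlidingTimeWindowAverage` is a hypothetical class, readings are assumed to arrive in non-decreasing timestamp order, and a reading exactly as old as the window is assumed to still be inside it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical plain-Java model of a time-based sliding window; not the Drools implementation.
final class SlidingTimeWindowAverage {
    private static final class Reading {
        final long timestampMillis;
        final double value;

        Reading(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    private final long windowMillis;
    private final Deque<Reading> readings = new ArrayDeque<>();
    private double sum;

    SlidingTimeWindowAverage(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Assumes readings arrive in non-decreasing timestamp order.
    void add(long timestampMillis, double value) {
        readings.addLast(new Reading(timestampMillis, value));
        sum += value;
    }

    // Discards readings older than the window, then averages what is left.
    // Returns NaN when no reading is inside the window.
    double average(long nowMillis) {
        while (!readings.isEmpty()
                && nowMillis - readings.peekFirst().timestampMillis > windowMillis) {
            sum -= readings.removeFirst().value;
        }
        return readings.isEmpty() ? Double.NaN : sum / readings.size();
    }

    // Mirrors the rule's condition: is the windowed average above the threshold?
    boolean exceeds(double max, long nowMillis) {
        return average(nowMillis) > max; // a NaN average never exceeds the threshold
    }
}
```

Note that the caller has to supply `nowMillis`: a time window only makes sense when there is a notion of "now", which is exactly what STREAM mode adds over CLOUD mode.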
Sliding Length Windows
Sliding Length Windows work the same way as Time Windows, but discard events based on the arrival of new events instead of flow of time. For instance, if the user wants to consider only the last 10 IBM Stock Ticks, independent of how old they are, the pattern would look like this:
StockTick( company == "IBM" ) over window:length( 10 )
As you can see, the pattern is similar to the one presented in the previous section, but instead of using window:time to define the sliding window, it uses window:length. Using a similar example to the one in the previous section, if the user wants to sound an alarm in case the average temperature over the last 100 readings from a sensor is above the threshold value, the rule would look like:
rule "Sound the alarm in case temperature rises above threshold"
when
    TemperatureThreshold( $max : max )
    Number( doubleValue > $max ) from accumulate(
        SensorReading( $temp : temperature ) over window:length( 100 ),
        average( $temp ) )
then
    // sound the alarm
end
The engine will keep only the last 100 readings.
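For comparison, a length-based window needs no clock at all: each new arrival pushes the oldest element out once the window is full. The sketch below is a plain-Java model with a hypothetical class name, `SlidingLengthWindowAverage`; it is not how Drools implements `window:length`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical plain-Java model of a length-based sliding window; not the Drools implementation.
final class SlidingLengthWindowAverage {
    private final int capacity;
    private final Deque<Double> values = new ArrayDeque<>();
    private double sum;

    SlidingLengthWindowAverage(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds a value; once the window is full, the oldest value is discarded.
    void add(double value) {
        values.addLast(value);
        sum += value;
        if (values.size() > capacity) {
            sum -= values.removeFirst();
        }
    }

    int size() {
        return values.size();
    }

    // Average of the values currently in the window, or NaN if it is empty.
    double average() {
        return values.isEmpty() ? Double.NaN : sum / values.size();
    }
}
```

Eviction happens inside `add`, not on a timer, which is the practical difference from the time-based window: old readings stay in the window indefinitely if nothing new arrives.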
Memory Management for Events
One of the benefits of running the engine in STREAM mode is that the engine can detect when an event can no longer match any rule due to its temporal constraints. When that happens, the engine can safely retract the event from the session without side effects and release any resources used by that event. There are basically 2 ways for the engine to calculate the matching window for a given event:
- explicitly, using the expiration policy
- implicitly, analyzing the temporal constraints on events
Explicit expiration offset
The first way of allowing the engine to calculate the window of interest for a given event type is by explicitly setting it. To do that, just use the declare statement and define an expiration for the fact type:
declare StockTick
    @expires( 30m )
end
Inferred expiration offset
Another way for the engine to calculate the expiration offset for a given event is implicitly, by analyzing the temporal constraints in the rules. For instance, given the following rule:
rule "correlate orders"
when
    $bo : BuyOrderEvent( $id : id )
    $ae : AckEvent( id == $id, this after[0,10s] $bo )
then
    // do something
end
Analyzing the above rule, the engine automatically calculates that whenever a BuyOrderEvent matches, it needs to store it for up to 10 seconds to wait for matching AckEvents. So, the implicit expiration offset for BuyOrderEvent will be 10 seconds. AckEvent, on the other hand, can only match existing BuyOrderEvents, and so its expiration offset will be zero seconds. The engine will make this analysis for the whole rulebase and find the offset for every event type. Whenever an implicit expiration offset clashes with the explicit expiration offset, the engine will use the greater of the two.
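The "greater of the two" rule, and what an offset means for a single event, can be written down in a few lines of plain Java. This is a sketch only: `ExpirationSketch` and its methods are hypothetical, offsets and timestamps are assumed to be milliseconds, and the exact instant at which the engine discards an event (here: strictly after timestamp plus offset) is an assumption.

```java
// Hypothetical helper for illustration; the engine performs this bookkeeping itself.
final class ExpirationSketch {

    // When the explicit (@expires) and inferred offsets disagree, the greater one wins.
    static long effectiveOffsetMillis(long explicitMillis, long inferredMillis) {
        return Math.max(explicitMillis, inferredMillis);
    }

    // Assumption: an event may be discarded once more than offsetMillis
    // have passed since its timestamp.
    static boolean canDiscard(long eventTimestampMillis, long offsetMillis, long nowMillis) {
        return nowMillis - eventTimestampMillis > offsetMillis;
    }
}
```

For the rule above, a BuyOrderEvent with an inferred offset of 10 seconds and no larger explicit offset would be kept for 10 seconds after its timestamp, while an AckEvent, with an offset of zero, has nothing to wait for once it has been matched against the existing BuyOrderEvents.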