Processing unit
The processing unit is the key function Joule provides to the user.
Overview
At its simplest form, a processing unit provides two key functions; a stream processing pipeline and a metrics engine that generates metrics based upon a time schedule policy.
Pipeline
A pipeline is a sequence of processors that compute functions on events. Joule provides out-of-the-box a set of core processors to enable you to build useful use cases, see processor documentation to learn more about the available processors.
Example
This example filters events that match the symbol 'A' and then places the event into a tumbling window. on every 5 seconds, the two aggregation functions execute against the stored events which are grouped by symbol, see the use cases overview for the complete definition. This pipeline will generate one event per group by criteria i.e. symbol.
The pipeline is defined as a list of processors whereby each processor defines its own DSL syntax.
Metrics Engine
The metrics engine provides the ability to compute complex metrics based upon SQL queries, the embedded SQL engine provides this feature which is enabled by default.
Example
policy
Metrics are generated by applying a time schedule policy. The three below DSL elements provide the required setting.
Required
timeUnit
Time unit used for the set frequency and startup delay elements.
Supported timeUnits: SECONDS, MINUTES, HOURS
Default: MINUTES
frequency
The time period between calculations.
Default: 1 Minute
startup delay
Time to wait before the first metric calculation are perform.
Default: 5 Minutes
Metrics Computations
The foreach metric compute
syntax defines a metric table, computation, management and assigns it to a named metric family.
name
A unique name for the metric. This is otherwise know as the metrics family name.
Required
metric key
This is required to dynamically generate optimised metric queries for user lookups and management functions.
Required
table definition
A SQL table needs to be defined for metrics to be stored and accessed. The definition required the schema to be added to the table name.
Required
query
The computation is an ANSI SQL query which is executed periodically with the results inserted into the defined metric table.
Required
Management
Metrics are managed on startup and during processing to support general good housekeeping and memory management. Default management processes are enabled on an hourly basis.
truncate on start
Truncates table on startup if set to true.
Optional: Default true
compaction
Compaction is used to remove old metrics over the period set.
Optional: Default hourly management
frequency
Optional: Default hourly
timeUnits
Time units applied to frequency setting.
Support Type: HOURS, MINUTES
Optional: Default HOURS
Last updated