Stream Join
Join independent stream events to trigger advance analytical insights and dynamic business rules
The stream join operation within stream processing is core potent function. It reduces time to insight latency for complex event processing by correlating event streams in to a simple meaningful event flow. And thereby supports use case scenarios that require analysing and understanding data in real-time. This not to be confused with real-time data enrichment as that addresses the need for additional data that may have be computed within process or imported in as a static data set.
Example use cases
Here are some common use cases where stream joins are applied:
Real-time ad performance monitoring: Gain real-time view on user ad clicks, impressions and user clicks within your website thereby enabling targeting ad spend and customer intelligence.
Dynamic promotions: Recognise a new prospective customer on your site, track product clicks and apply a promotion rule to generate a targeted promotion for presentation on the current web page.
Product recommendations: By joining user sessions, presented ads and site clicks a generate product recommendations and monitor reaction via a feedback loop.
Customer 360 proactive support: Analyse multiple customer telemetry streams to generate next best support actions.
Platform anomaly detection: Analyse and aggregate multiple device telemetry streams to determine compute and network utilisation abnormal loads.
Using Inner stream Joins
When you need a strict join between two event streams an inner join is requires. This operator will only emitted an event on a successful join otherwise it will wait until a join has occurred, depending upon policay configuration.
Using Outer stream Joins
Outer joins operate slightly differently to inner with respect to when the first event received on the outer stream. Whenever a new event on outer stream is received this immediately is passed on within the pipeline processing, (i.e. ideal to prime processing), and thereafter any events that cause a join are emitted.
For the above example when the sitevisits
stream generates a new customerId event this will be emitted by the join processor for initial processing. Whereas further any joins will cause new join events to be emitting until join expiry policy is invoked.
Join Attributes
These attributes define how the join is performed with respect to the expression, how to create the emitted event structure etc,.
Attribute | Description | Data Type | Required |
---|---|---|---|
expression | Simple join expression that evaluates two streams | String | |
merge events | Flag that either merge event attributes into a flattened event structure or places each event | Boolean Default: true | |
event type | User defined type of emitted event | String Default: left type - right type (i.e. typeA-typeB) | |
left / right join | See Join attribute section | Policy attribute Default |
Policy Attributes
Fine grained control can be applied to how joins are handled through the use of the policy attribute. This becomes important for large and fast event stream with respect to memory and processing overhead.
Attribute | Description | Data Type | Required |
---|---|---|---|
delete on join | Delete event from state management on a successful join | Boolean Default: false | |
time to live | Expire configuration for event stored within the state management system. Supported time units include nanoseconds, microseconds, milliseconds, seconds, hours and days | Formatted string Default: 60 minutes |
Default configuration
delete on join is set to false and therefore will join events until the time to live setting is honoured
time to live is set for 60 minutes per stored event. This is refreshed every time a new event is received with the same join attribute value
Last updated