# Outer stream joins

## Objective

Unlike inner joins, outer joins **immediately pass the first event received** on the outer stream into the processing pipeline, without waiting for a match. This can be useful for initialising downstream processors.

Any additional events that meet the join criteria are also emitted as they arrive.
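The emission behaviour described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the platform's implementation: the `outer_join` function, the `"outer"`/`"inner"` stream labels, and the sample event fields are all hypothetical, and the sketch assumes that an inner-stream event with no stored outer partner is simply dropped.

```python
def outer_join(events, key="customerId"):
    """Replay (stream, event) pairs through a toy left outer join.

    Assumptions for illustration only: the first outer event per key is
    emitted immediately, and unmatched inner events are dropped.
    """
    outer_seen = {}  # key value -> most recent outer-stream event
    emitted = []
    for stream, event in events:
        k = event[key]
        if stream == "outer":
            if k not in outer_seen:
                # Priming: emitted straight away, no match required.
                emitted.append(dict(event))
            outer_seen[k] = event
        elif k in outer_seen:
            # Inner event that meets the join criteria: emit the merged event.
            emitted.append({**outer_seen[k], **event})
    return emitted


events = [
    ("inner", {"customerId": 7, "ad": "shoes"}),    # no outer partner yet
    ("outer", {"customerId": 7, "page": "/home"}),  # first outer event
    ("inner", {"customerId": 7, "ad": "hats"}),     # joins the stored outer event
]
result = outer_join(events)
```

Here `result` holds two events: the unmatched outer event on its own, followed by the merged event produced when the second inner event arrives.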

{% hint style="info" %}
Outer stream joins are particularly useful for tackling the [cold start](https://en.wikipedia.org/wiki/Cold_start_\(recommender_systems\)) problem in streaming recommendation engines.
{% endhint %}

## Uses

Outer joins are **ideal for priming processing**.

Priming can improve performance by **reducing the latency to get initial results**, which are then refined or updated as more data becomes available.

<details>

<summary>What is priming?</summary>

In stream processing, priming refers to initialising or setting up a process to start working as soon as possible with partial or incomplete data, often before all necessary data has arrived.

For example, with an outer join in stream processing, as soon as an event arrives on the outer stream, it’s **immediately processed and sent downstream**. This priming action allows downstream processors to begin working with the initial data, while waiting for more events to join and complete the full picture.

It’s especially useful for processes that depend on getting an initial seed of data to start calculations, aggregations, or other operations quickly.

</details>

## Example

This code defines a **join** between the `sitevisits` and `webpage.adclicks` streams based on `customerId`.

1. <mark style="color:green;">**expression**</mark>\
   It matches all `customerId` values from `sitevisits` to `webpage.adclicks`.
2. <mark style="color:green;">**merge events**</mark>\
   Matching events are merged.
3. <mark style="color:green;">**left policy**</mark>\
   `sitevisits` events expire after 30 minutes.
4. <mark style="color:green;">**right policy**</mark>\
   Matched `webpage.adclicks` events are deleted after the join.

A `sitevisits` event with a new `customerId` triggers initial processing. Subsequent matching `webpage.adclicks` events continue to join with it until the `sitevisits` event expires, and each matched `webpage.adclicks` event is cleaned up after joining.

{% hint style="info" %}
Outer joins are specified with the `*=` operator in the join expression.
{% endhint %}

```yaml
streams join:
  expression: "sitevisits.customerId *= webpage.adclicks.customerId"
  merge events: true
  left policy:
    time to live: 30 minutes
  right policy:
    delete on join: true
```
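To make the two policies concrete, the following Python sketch simulates how the configuration above might behave. It is a simplified model under assumptions, not the engine's actual logic: the `SiteVisitAdClickJoin` class, its method names, the numeric `now` timestamps (in seconds), and the sample fields `page` and `adId` are hypothetical. It also assumes that ad clicks arriving without a live site visit are buffered until a visit arrives, which the source does not state.

```python
from datetime import timedelta

TTL_SECONDS = timedelta(minutes=30).total_seconds()  # left policy: time to live


class SiteVisitAdClickJoin:
    """Toy model of the join above, keyed on customerId (illustrative only)."""

    def __init__(self, ttl_seconds=TTL_SECONDS):
        self.ttl = ttl_seconds
        self.visits = {}  # customerId -> (arrival time, visit event)
        self.clicks = {}  # customerId -> unmatched click events (assumed buffering)

    def _live_visit(self, customer_id, now):
        """Return the stored visit, or None if absent or past its time to live."""
        entry = self.visits.get(customer_id)
        if entry is None:
            return None
        arrived, visit = entry
        if now - arrived > self.ttl:
            del self.visits[customer_id]  # expired under the left policy
            return None
        return visit

    def on_visit(self, visit, now):
        cid = visit["customerId"]
        is_new = self._live_visit(cid, now) is None
        self.visits[cid] = (now, visit)
        # Outer side: a visit for a new customerId is emitted immediately.
        out = [dict(visit)] if is_new else []
        # Right policy: buffered clicks are removed (popped) once joined.
        for click in self.clicks.pop(cid, []):
            out.append({**visit, **click})
        return out

    def on_click(self, click, now):
        cid = click["customerId"]
        visit = self._live_visit(cid, now)
        if visit is None:
            self.clicks.setdefault(cid, []).append(click)
            return []
        # Merged event is emitted; the click is not retained (delete on join).
        return [{**visit, **click}]


join = SiteVisitAdClickJoin()
out1 = join.on_visit({"customerId": "c1", "page": "/home"}, now=0)
out2 = join.on_click({"customerId": "c1", "adId": "a9"}, now=600)
out3 = join.on_click({"customerId": "c1", "adId": "a10"}, now=2000)
```

In this run, `out1` contains the site visit on its own, `out2` contains the merged visit and click, and `out3` is empty because the visit stored at time 0 has passed its 30-minute time to live by time 2000.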
