# Outer stream joins

## Objective

Outer joins differ from inner joins in that the **first event received on the outer stream is passed immediately** into the processing pipeline, which can be useful for initialising downstream processors.

Any additional events that meet the join criteria are also emitted as they arrive.

{% hint style="info" %}
Outer stream joins are particularly useful for tackling the [cold start](https://en.wikipedia.org/wiki/Cold_start_\(recommender_systems\)) problem in streaming recommendation engines.
{% endhint %}

## Uses

Outer joins are **ideal for priming processing**.

Priming can improve performance by **reducing the latency to get initial results**, which are then refined or updated as more data becomes available.

<details>

<summary>What is priming?</summary>

In stream processing, priming refers to initialising or setting up a process to start working as soon as possible with partial or incomplete data, often before all necessary data has arrived.

For example, with an outer join in stream processing, as soon as an event arrives on the outer stream, it’s **immediately processed and sent downstream**. This priming action allows downstream processors to begin working with the initial data, while waiting for more events to join and complete the full picture.

It’s especially useful for processes that depend on getting an initial seed of data to start calculations, aggregations, or other operations quickly.

</details>
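As a rough illustration of priming, the sketch below models a hypothetical downstream processor that starts producing a result from the first event it sees and refines it as joined events arrive. The class `RunningProfile`, the `adId` field, and the `adClicks` output field are assumptions made for this example and are not part of Joule.

```python
class RunningProfile:
    """Hypothetical downstream processor that is primed by the first event."""

    def __init__(self):
        # customerId -> number of joined ad-click events seen so far
        self.clicks = {}

    def process(self, event: dict) -> dict:
        cid = event["customerId"]
        # The first event for a customer seeds the state (priming) ...
        count = self.clicks.setdefault(cid, 0)
        # ... and later joined events, assumed to carry an adId, refine it.
        if "adId" in event:
            count += 1
            self.clicks[cid] = count
        return {"customerId": cid, "adClicks": count}


profile = RunningProfile()
initial = profile.process({"customerId": "c1"})                # primed result
refined = profile.process({"customerId": "c1", "adId": "a7"})  # refined result
```

The initial result is available as soon as the first event arrives, and later events only update it.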

## Example

The following configuration defines an outer **join** between the `sitevisits` and `webpage.adclicks` streams based on `customerId`.

1. <mark style="color:green;">**expression**</mark>\
   It matches all `customerId` values from `sitevisits` to `webpage.adclicks`.
2. <mark style="color:green;">**merge events**</mark>\
   Matching events are merged.
3. <mark style="color:green;">**left policy**</mark>\
   `sitevisits` events expire after 30 minutes.
4. <mark style="color:green;">**right policy**</mark>\
   Matched `webpage.adclicks` events are deleted after the join.

A `sitevisits` event with a new `customerId` triggers initial processing immediately. Subsequent matching `webpage.adclicks` events continue to be joined until the `sitevisits` event expires, and each matched `webpage.adclicks` event is cleaned up after the join.

{% hint style="info" %}
Outer joins are specified with the `*=` operator in the join expression.
{% endhint %}

```yaml
streams join:
  expression: "sitevisits.customerId *= webpage.adclicks.customerId"
  merge events: true
  left policy:
    time to live: 30 minutes
  right policy:
    delete on join: true
```
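To make the behaviour of this configuration more concrete, here is a toy Python model of the join. It is a sketch under stated assumptions, not Joule's implementation: the class `OuterJoinSketch`, its method names, the `page` and `adId` fields, and the use of plain seconds for time are all hypothetical. It also simplifies by keeping only the latest `sitevisits` event per customer and by dropping ad clicks that arrive before any visit.

```python
TTL_SECONDS = 30 * 60  # mirrors "time to live: 30 minutes" on the left policy


class OuterJoinSketch:
    """Toy model of the outer join above (hypothetical, not Joule code)."""

    def __init__(self):
        # customerId -> (arrival time in seconds, sitevisits event)
        self.left = {}

    def on_sitevisit(self, event: dict, now: float) -> list:
        # Retain the visit for later matches and emit it immediately,
        # which is the priming behaviour of the outer join.
        self.left[event["customerId"]] = (now, event)
        return [dict(event)]

    def on_adclick(self, event: dict, now: float) -> list:
        entry = self.left.get(event["customerId"])
        if entry is None:
            return []  # simplification: no visit seen yet, nothing to join
        arrived, visit = entry
        if now - arrived > TTL_SECONDS:
            del self.left[event["customerId"]]  # left policy: visit expired
            return []
        # merge events: combine both sides. The ad click is never stored,
        # mirroring "delete on join: true" on the right policy.
        return [{**visit, **event}]


join = OuterJoinSketch()
primed = join.on_sitevisit({"customerId": "c1", "page": "/home"}, now=0)
merged = join.on_adclick({"customerId": "c1", "adId": "a7"}, now=60)
late = join.on_adclick({"customerId": "c1", "adId": "a8"}, now=31 * 60)
```

In this model `primed` holds the visit on its own, `merged` holds the visit combined with the first ad click, and `late` is empty because the visit has passed its 30 minute time to live.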

