Data pipelines
Data (stream) pipelines enable business-specific, real-time analytics use cases
Visit the use case concept article to understand what use cases are in Joule.
Overview
At the core of the platform is the use case, defined by users who understand the business's specific computations and needs. This page gives an overview of how a use case is defined within the Joule platform.
It describes a sample use case that sets up a real-time data stream to monitor and analyse stock quote data, using features such as window aggregation, metrics calculations, filtering and event grouping.
Key features of data pipelines
The features described below equip users to build, manage and refine robust data-driven pipelines tailored to their business objectives.
Data priming
Joule allows the system to be primed at initialisation with static contextual data and pre-calculated metrics, ensuring that initial data needs are met before processing begins. This setup is essential for accurate data handling and for pre-populating relevant tables.
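To make this concrete, a priming step might look something like the fragment below. This is a hedged sketch: the key names (initialisation, data import, csv, table, file, index field) and the file path are illustrative assumptions, not necessarily the exact Joule DSL.

```yaml
# Hypothetical sketch - key names and file path are illustrative
# assumptions, not the verbatim Joule DSL.
initialisation:
  data import:
    csv:
      # Static contextual data loaded before event processing begins
      table: nasdaq_companies
      file: data/csv/nasdaq.csv
      index field: symbol
```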
Processing unit
The platform enables rapid, straightforward creation of use cases, providing tools for customisable computation and processing logic. This flexibility lets users define and execute business-specific workflows quickly.
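As an illustration, a processing unit could be sketched as a pipeline of steps. The keys below (processing unit, pipeline, filter, expression) are assumed names for the sake of the example, not a verbatim specification.

```yaml
# Hypothetical sketch - structure and key names are assumptions.
processing unit:
  pipeline:
    # Drop events without a symbol before any further computation
    - filter:
        expression: "symbol != null"
```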
Emit
Joule supports tailored output selection, allowing users to specify the exact fields to include in the final output. It also provides options for final filtering, streamlining the data sent to downstream systems.
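A hedged sketch of an emit definition, assuming a select list of output fields plus an optional final filter; the select and having keys are illustrative assumptions:

```yaml
# Hypothetical sketch - key names are assumptions.
emit:
  # Only these fields are forwarded to downstream systems
  select: "symbol, bid, ask"
  # Optional final filter applied before publishing
  having: "ask > 0"
```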
Group by
Grouping similar events enables more efficient data aggregation and reduces output size, helping optimise processing and storage requirements for downstream systems.
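For example, grouping quote events by their stock symbol might be declared as below; the group by key is an assumed name for illustration:

```yaml
# Hypothetical sketch - key name is an assumption.
group by:
  - symbol
```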
Telemetry auditing
With inbound and outbound event auditing, Joule supports rigorous testing, model validation and retraining processes. This enhances the reliability and ongoing refinement of data models.
Example
In this use case, we set up a real-time data stream pipeline called tumblingWindowQuoteStream to monitor and analyse stock quote data from a live source. The setup provides real-time analytics on stock quotes, capturing trends and key statistics for each stock symbol over defined time intervals.
The use case makes use of the metrics engine and a tumbling window before publishing a filtered stream of events to a connected publisher.
The example is split into sections, one for each element of the data pipeline.
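Putting the elements above together, the whole use case could be sketched roughly as follows. Every key name, the window size and the aggregate function list are illustrative assumptions for this page, not the verbatim Joule specification:

```yaml
# Hypothetical end-to-end sketch of tumblingWindowQuoteStream.
# All key names, sizes and functions are illustrative assumptions.
stream:
  name: tumblingWindowQuoteStream
  processing unit:
    pipeline:
      # Aggregate quotes over fixed, non-overlapping time intervals
      - time window:
          policy:
            type: tumblingWindow
            window size: 5000   # milliseconds per window
          aggregate functions:
            MIN: [ ask ]
            MAX: [ bid ]
  emit:
    # Final field selection sent to the connected publisher
    select: "symbol, ask_MIN, bid_MAX"
  group by:
    - symbol
```

A tumbling window was chosen here because its fixed, non-overlapping intervals match the "key statistics per symbol over defined time intervals" behaviour described above.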
Types of elements in a data pipeline