Configuration

Set up a contextual data-driven use case

Overview

Joule’s contextual data management is driven by a unified configuration file that defines all relevant data sources, ensuring they are consistent and available across all deployed use cases.

The configuration specifies one or more contextual data sources, with current support for:

  1. Disk access: MinIO for S3 object storage.

  2. Memory access: Apache Geode as a high-performance cached data source.

This file must be deployed before any use case is deployed.

Example & DSL attributes

The following is an example configuration for integrating Apache Geode with Joule as an external data source. For detailed guidance on leveraging this robust enterprise feature, refer to the Geode connector documentation.

This configuration specifies key data locations for nasdaqIndexCompanies and holidays. Although these data points may not change frequently, they may have high read demands, making them ideal candidates for hosting within Joule’s in-memory space.

reference data:
  name: banking market data 
  data sources:
    - geode stores:
        name: us markets
        connection:
          locator address: 192.168.86.39
          locator port: 41111
        stores:
          nasdaqIndexCompanies:
            region: nasdaq-companies
            keyClass: java.lang.String
            gii: true
          holidays:
            region: us-holidays
            keyClass: java.lang.Integer

Stores explanation

A key element of the configuration’s DSL is the stores syntax. In essence, the stores definition binds a logical store name to a specific implementation, with properties tailored to the underlying technology. For instance, an S3 implementation will have different configuration properties compared to a distributed caching solution.

This is best illustrated by an example in which company information must be added to each event and an ML model must be updated while the Joule process is running. In this case, the configurations for nasdaqIndexCompanies and predictors differ, reflecting their distinct underlying technologies and usage patterns.

reference data:
  name: banking market data 
  data sources:
    - geode stores:
        ...
        stores:
          nasdaqIndexCompanies:
            region: nasdaq-companies
            keyClass: java.lang.String
            gii: true
            
    - minio stores:
        ...
        stores:
          predictors:
            bucketId: models
            initial version Id: 12345
            download dir: /home/joule/nasdaq-companies-model/tmp
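To make the binding of logical store names more concrete, the following Python sketch indexes the store names across both data sources. It is illustrative only and does not reflect Joule's internals: the dictionary is a hand-written stand-in for the parsed configuration above (connection settings omitted), and `index_stores` is a hypothetical helper.

```python
# Hand-written stand-in for the parsed configuration above (connection
# settings omitted). Not produced by Joule; for illustration only.
config = {
    "reference data": {
        "name": "banking market data",
        "data sources": [
            {"geode stores": {"stores": {
                "nasdaqIndexCompanies": {
                    "region": "nasdaq-companies",
                    "keyClass": "java.lang.String",
                    "gii": True,
                },
            }}},
            {"minio stores": {"stores": {
                "predictors": {
                    "bucketId": "models",
                    "initial version Id": 12345,
                    "download dir": "/home/joule/nasdaq-companies-model/tmp",
                },
            }}},
        ],
    }
}


def index_stores(cfg):
    """Map each logical store name to (source type, properties).

    Hypothetical helper: it only shows that a store name resolves to one
    implementation with technology-specific properties.
    """
    index = {}
    for source in cfg["reference data"]["data sources"]:
        for source_type, body in source.items():
            for store_name, props in body.get("stores", {}).items():
                if store_name in index:
                    raise ValueError(f"duplicate store name: {store_name}")
                index[store_name] = (source_type, props)
    return index


stores = index_stores(config)
for name, (source_type, props) in stores.items():
    print(f"{name}: {source_type} -> {sorted(props)}")
```

Each logical name resolves to exactly one source type, and the property sets differ by technology: a region and key class for the Geode store, a bucket and download directory for the MinIO store.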

Enrichment example

In Joule, contextual data is applied to events through the enricher processor, which attaches the relevant contextual data objects to each StreamEvent.

Read the enrichment documentation for full feature information and further examples.

The example below populates each event's companyInformation attribute by looking up the event's symbol value in nasdaqCompanies, a local name bound to the nasdaqIndexCompanies contextual data store.

enricher:
  fields:
    companyInformation:
      key: symbol
      using: nasdaqCompanies
      
  stores:
    nasdaqCompanies:
      store name: nasdaqIndexCompanies
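The lookup described above can be sketched in a few lines of Python. This is a conceptual illustration, not Joule's enricher implementation: the in-memory dictionary stands in for the Geode-backed store, the company record is made-up sample data, and `enrich` is a hypothetical function.

```python
# In-memory stand-in for the nasdaqIndexCompanies store. The record is
# made-up sample data, not real reference data.
STORES = {
    "nasdaqIndexCompanies": {
        "ABC": {"name": "Example Corp", "sector": "Technology"},
    }
}

# Mirrors the enricher definition above: local name -> configured store,
# and target field -> lookup key plus the local store name to use.
STORE_ALIASES = {"nasdaqCompanies": "nasdaqIndexCompanies"}
FIELDS = {"companyInformation": {"key": "symbol", "using": "nasdaqCompanies"}}


def enrich(event, fields=FIELDS, aliases=STORE_ALIASES, stores=STORES):
    """Return a copy of the event with each configured field attached."""
    enriched = dict(event)
    for target, spec in fields.items():
        store = stores[aliases[spec["using"]]]
        # A missing key yields None in this sketch; how Joule handles
        # lookup misses is not covered in this section.
        enriched[target] = store.get(event.get(spec["key"]))
    return enriched


event = {"symbol": "ABC", "price": 101.5}
print(enrich(event))
```

The original event is left unchanged; the enriched copy carries the looked-up object under companyInformation.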

Attributes schema

| Attribute | Description | Data Type | Required |
|---|---|---|---|
| name | Contextual data store namespace | String | |
| data sources | List of data sources to connect and bind into the Joule processor | List of connector configurations | |
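As a rough illustration of the two attributes listed above, the Python sketch below checks their types on a parsed `reference data` section. `check_reference_data` is a hypothetical helper, not part of Joule. Because this section does not state which attributes are required, the sketch only checks types when an attribute is present.

```python
def check_reference_data(section):
    """Return a list of type problems for the two documented attributes."""
    problems = []
    # name: documented as a String.
    if "name" in section and not isinstance(section["name"], str):
        problems.append("name must be a String")
    # data sources: documented as a list of connector configurations.
    if "data sources" in section and not isinstance(section["data sources"], list):
        problems.append("data sources must be a list of connector configurations")
    return problems


print(check_reference_data({"name": "banking market data", "data sources": []}))
print(check_reference_data({"name": 42, "data sources": {}}))
```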
