Serialisation

Data emitted to downstream systems is serialised using provided or custom serialisers


Last updated 6 months ago


Overview

Joule provides event formatters and serialisers to transform StreamEvent objects into structured formats for stream processing, visualisation or analytics.

These tools enable seamless integration with diverse systems by converting raw data to formats like JSON, CSV or Parquet, tailored to downstream needs.

For advanced use cases, the CustomTransformer API allows developers to implement custom transformations, mapping StreamEvent objects to domain-specific types or unique formats not supported by the standard serialisers. This flexibility ensures Joule can meet specialised data integration requirements.

Each connector type has its own preferred serialisation method.

This page serves as a guide for configuring data serialisation and formatting, with examples and options for effective data integration in Joule.

This page includes the following:

  1. Serialisation DSL element: describes the serialisation element, which defines how StreamEvents are serialised for downstream systems using formatters such as JSON, CSV and Parquet.

  2. Kafka example: provides an example Kafka publisher configuration that serialises StreamEvents as JSON events.

  3. Custom transformer API: covers the CustomTransformer interface for custom serialisation when unique data transformations are required.

  4. AVRO serialisation: introduces how the AVRO serialiser can be used to transform Joule events into target domain events using a provided AVRO IDL schema file.

  5. Formatters: outlines the available formatters (JSON, CSV, Parquet) with example configurations for data storage needs.

Serialisation DSL element

The serializer DSL element defines how StreamEvents should be serialised for downstream consumption, specifying attributes such as data format and target configuration.

For example, when using Kafka to publish data, you can use a JSON formatter to structure StreamEvents into JSON objects, enabling efficient data exchange across multiple processes in a system topology.

Example

This code snippet shows a basic setup for a Kafka publisher within Joule, where StreamEvent objects are converted to JSON and published to a target topic.

This setup enables inter-process communication in a larger use case topology, where Joule processes communicate by publishing and subscribing to events.

kafkaPublisher:
  ...
  serializer:
    formatter:
      json formatter: {}

Attributes schema

| Attribute | Description | Type |
| --------- | ----------- | ---- |
| transformer | Convert a StreamEvent to a target domain type using a custom transformer developed using the Developer JDK | |
| formatter | Convert a StreamEvent to a target format such as JSON, CSV or Object | Default: JSON |
| compress | Compress the resulting serialised data for efficient network transfer | Boolean. Default: False |
| batch | Batch events into a single payload | Boolean. Default: False |
| properties | Specific properties required for a custom serialisation process | Properties |
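Putting these attributes together, a fuller serialiser configuration might look like the sketch below. The `compress` and `batch` values and the property key are illustrative assumptions, not defaults taken from a real deployment:

```yaml
kafkaPublisher:
  ...
  serializer:
    formatter:
      json formatter: {}
    compress: true          # compress the serialised payload for network transfer
    batch: true             # pack multiple events into a single payload
    properties:             # passed through to a custom serialisation process
      custom.option: value  # hypothetical key, for illustration only
```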

Custom transformer API

Convert StreamEvent to a target domain type using a custom transformer developed using the Developer JDK.

Two methods are provided:

  1. CustomTransformer API: for code-based transformations.

  2. AVRO serialisation: for automated event translation.

Object CustomTransformer API

This capability is used when the use case requires a specific domain data type for downstream consumption. Developers are expected to provide domain specific implementations by implementing the CustomTransformer interface.

Example

The following example converts a StreamEvent to a StockAnalyticRecord using custom code.

This data object is then converted into a JSON object by the Kafka serialisation framework.

kafkaPublisher:
  ...
  serializer:
    transform: 
      com.fractalworks.examples.banking.data.StockAnalyticRecordTransform
      
    key serializer: 
      org.apache.kafka.common.serialization.IntegerSerializer
    value serializer: 
      com.fractalworks.streams.transport.kafka.serializers.json.ObjectJsonSerializer
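A transformer such as the StockAnalyticRecordTransform referenced above is written against the CustomTransformer interface from the Developer JDK. The sketch below is a self-contained illustration only: the interface shape, the record fields and the attribute names are assumptions for this example, not the SDK's actual signatures.

```java
import java.util.Map;

// ASSUMPTION: the real CustomTransformer interface ships with the Joule
// Developer JDK; this local stand-in only illustrates the general shape.
interface CustomTransformer<T> {
    T transform(Map<String, Object> streamEvent);
}

// Hypothetical domain record for the banking example.
class StockAnalyticRecord {
    final String symbol;
    final double avgPrice;

    StockAnalyticRecord(String symbol, double avgPrice) {
        this.symbol = symbol;
        this.avgPrice = avgPrice;
    }
}

// Maps a generic StreamEvent attribute map onto the domain record.
class StockAnalyticRecordTransform implements CustomTransformer<StockAnalyticRecord> {
    @Override
    public StockAnalyticRecord transform(Map<String, Object> streamEvent) {
        return new StockAnalyticRecord(
                (String) streamEvent.get("symbol"),
                ((Number) streamEvent.get("avg_price")).doubleValue());
    }
}
```

Once registered in the serializer DSL, the resulting domain object is handed to the configured value serialiser (here, ObjectJsonSerializer) for final encoding.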

AVRO serialisation

This is an AVRO transform implementation utilising the CustomTransformer API. To simplify usage, an avro serializer DSL element is provided, along with configurable attributes.

The transformer automatically maps StreamEvent attributes onto the target domain data type attributes using a provided AVRO IDL schema. Currently only local schema files are supported, with schema registry support available on request.

Integrate Joule with any existing system using already established data structures.

Example

kafkaPublisher:
  ...
  serializer:
    transform:
      avro serializer:
        schema: /home/myapp/schema/customer.avsc
    ...

Attributes schema

| Attribute | Description | Data Type | Required |
| --------- | ----------- | --------- | -------- |
| schema | Path and name of the schema file | String | |
| field mapping | Custom mapping of source StreamEvent fields to target domain fields | Map<String,String> | |
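As a sketch, a field mapping could be supplied when StreamEvent attribute names differ from the schema's field names. The attribute and field names below are illustrative assumptions, not taken from a real schema:

```yaml
kafkaPublisher:
  ...
  serializer:
    transform:
      avro serializer:
        schema: /home/myapp/schema/customer.avsc
        field mapping:
          cust_id: customerId      # source StreamEvent field -> target AVRO field
          cust_name: customerName
```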

Data formatters

Formatters are mainly used for direct storage, where data tools such as PySpark, Apache Presto, DuckDB, MongoDB, Postgres and MySQL can use the data directly without further transformation overhead.

Available Implementations

Refer to the custom transform example documentation for detailed instructions on implementing a custom transformer.

See the data formatters documentation for configuration options.

  • JSON formatter

  • CSV formatter

  • Parquet formatter
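Following the same DSL convention as the JSON formatter example earlier on this page, switching the output format is a one-line change. The sketch below assumes the CSV and Parquet formatters accept the same empty-attribute form as `json formatter: {}`; see the data formatters documentation for their real options:

```yaml
serializer:
  formatter:
    parquet formatter: {}   # or: csv formatter: {} / json formatter: {}
```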
