Deserialisers

Consumed data is mapped to internal StreamEvents using a flexible deserialisation framework.

Overview

Joule's deserialisation framework enables seamless conversion of raw domain-specific data into StreamEvent objects for processing.

By leveraging built-in deserialisers (JSON, CSV and AVRO) or custom parsers, it ensures compatibility with various data formats and workflows, allowing precise transformation tailored to application needs.

For advanced scenarios, the StreamEventParser API supports custom parsing when standard deserializers cannot handle unique formats or structures. This flexibility allows developers to map domain-specific objects into StreamEvent collections, ensuring smooth integration with Joule's processing pipeline.

Every connector type has its preferred supported deserialisation method.

This page serves as a guide for configuring data deserialisation, with examples and options for effective data integration in Joule.

This page includes the following:

  1. Deserialiser DSL element: Describes the deserialisation element, which defines how received data is parsed ready for Joule processing.

  2. Kafka example: Provides an example Kafka consumer configuration that deserialises incoming domain events to StreamEvents.

  3. Custom parser API: Covers the StreamEventParser interface for custom deserialisation when unique data parsing is required.

  4. AVRO deserialiser: Introduces how AVRO can be used to transform domain events into Joule StreamEvent objects using a provided AVRO IDL schema file.

  5. Available Implementations: Learn how to use OOTB deserialisers for JSON and CSV event data.

Deserialiser DSL element

The deserialiser DSL element defines how received data is deserialised into StreamEvents ready for Joule processing.

It allows developers to specify how data should be transformed into StreamEvents using custom parsing logic, enabling precise handling of various data formats and structures tailored to specific application needs.

Example

This code snippet shows a basic setup for a Kafka consumer within Joule, where JSON StreamEvent strings are converted to StreamEvent objects ready for pipeline processing.

kafkaConsumer:
   ...
   deserializer:
      parser: 
         json deserializer: {} 

Attributes schema

  • parser: Parser interface to convert an Object to a collection of stream events. Type: StreamEventParser. Default: StreamEventJsonDeserializer
  • compressed: Whether the passed payload is compressed. Type: Boolean. Default: false
  • batch: Whether the payload is a batch of events. Type: Boolean. Default: false
  • properties: Specific properties required for a custom deserialisation process. Type: Properties
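
The optional attributes can be combined in a single deserializer block. The sketch below is illustrative only; it assumes compressed, batch and properties sit alongside parser as listed above, and the property values shown are assumptions rather than required settings.

kafkaConsumer:
   ...
   deserializer:
      parser:
         json deserializer: {}
      compressed: false
      batch: true
      properties:
         # illustrative custom key/value pair passed to the deserialisation process
         source.timezone: UTC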

Parser

Parser interface that converts a received Object to a collection of stream events.

StreamEventParser API

For complex use cases where AVRO cannot be used, a custom parser will be needed. In this case developers can provide a parsing solution using the StreamEventParser API.

Example

This example shows how a custom parser (QuoteToStreamEventParser) transforms Quote domain objects into StreamEvents that can be processed by Joule.

kafkaConsumer:
  ...
  deserializer:
    parser: com.fractalworks.examples.banking.data.QuoteToStreamEventParser
    key deserializer: org.apache.kafka.common.serialization.IntegerDeserializer
    value deserializer: com.fractalworks.streams.transport.kafka.serializers.object.ObjectDeserializer

AVRO deserialisation

This deserialiser has an extra setting to support loading the target schema. The transformer automatically maps domain attributes to StreamEvent attributes using the provided AVRO IDL schema. Currently only local schema files are supported, with schema registry support available on request.

Integrate Joule with any existing system using already established data structures.

Example

kafkaConsumer:
   ...
   deserializer:
      parser: 
         avro deserializer: 
            schema: /home/myapp/schema/customer.avro

Attributes schema

  • schema: Path and name of the schema file. Type: String
  • field mapping: Map of source fields to target StreamEvent fields. Type: Map<String,String>
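
Where incoming AVRO field names differ from the StreamEvent field names expected by the pipeline, a field mapping can be supplied alongside the schema. The sketch below is illustrative; the source and target field names are assumptions, not part of the example schema.

kafkaConsumer:
   ...
   deserializer:
      parser:
         avro deserializer:
            schema: /home/myapp/schema/customer.avro
            field mapping:
               # illustrative source field to target StreamEvent field names
               cust_id: customerId
               acct_bal: accountBalance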

Available Implementations

Joule provides several deserialiser implementations for different data formats, each requiring a full package namespace.

Key implementations include:

  1. JSON deserialiser: Parses StreamEvent JSON events into concrete objects.

  2. CSV deserialiser: Parses CSV formatted data, with options for custom headers and a type map to define data types such as STRING and DOUBLE.

  3. Object deserialiser: Parses StreamEvent binary events into concrete objects.

  • json: DSL element "json deserialiser", class StreamEventJsonDeserializer
  • csv: DSL element "csv deserialiser", class StreamEventCSVDeserializer
  • object: DSL element "object deserialiser", class StreamEventObjectDeserializer

JSON

This parser reads and converts a JSON string formatted StreamEvent to a StreamEvent object.

  • date format: Provide a custom date format. See the JavaDocs for further date format options. Type: String. Default: yyyy/MM/dd

Example

deserializer:
    parser: 
        json deserializer:
            date format: dd/MM/yyyy

CSV

This parser has extra optional settings that support custom parsing, such as defining a custom header.

  • type map: Map of fields to types. See Supported Types below. Type: Map<String,Type>
  • delimiter: Define a custom file delimiter. Type: String. Default: ","
  • date format: Provide a custom date format. Type: String. Default: yyyy/MM/dd

Example

deserializer:
    parser: 
        csv deserializer:
            type map:
               symbol: STRING
               rate:   DOUBLE

Supported Types

The following types are supported:

  • DOUBLE, FLOAT, LONG, INTEGER, SHORT, BYTE, BOOLEAN, STRING, DATE
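
Combining the optional attributes, a CSV deserialiser with a custom delimiter and date format might be configured as follows; the delimiter and date pattern values are illustrative only.

deserializer:
    parser: 
        csv deserializer:
            type map:
               symbol: STRING
               rate:   DOUBLE
            delimiter: "|"
            date format: dd/MM/yyyy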

Object

This parser reads and converts a binary formatted StreamEvent to a StreamEvent object.

Example

deserializer:
    parser: 
        object deserializer: {}
