Ingestion
Joule enables flexible Kafka data deserialisation for processing
Overview
When working with external data, deserialisation is often needed to convert raw data into formats compatible with Joule’s processing pipelines.
Joule provides a versatile deserialisation framework that extends the native Kafka deserialisation capabilities, supporting multiple formats to make ingestion and processing easier.
Available deserialisation options include:
Domain object deserialisation: Custom parsers can transform domain-specific objects into Joule-compatible formats.
AVRO deserialisation: AVRO schemas can convert structured data into Joule's internal StreamEvents.
Native Joule StreamEvent deserialisation: Joule's format for internal processing allows direct ingestion of data as StreamEvents.
Examples & DSL attributes
This example shows how a custom parser (QuoteToStreamEventParser) transforms Quote domain objects into StreamEvents that can be processed by Joule.
Deserialisation in action.
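A parser of this kind might look roughly like the following. This is a minimal sketch, not Joule's implementation: Quote, SimpleStreamEvent, DomainParser and QuoteToStreamEventParserSketch are hypothetical stand-ins defined here so the example is self-contained, and the real CustomTransformer contract and StreamEvent class in the Joule SDK may differ. The field names follow the Quote attributes mentioned on this page; the double types are an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical domain object; field names follow the Quote attributes named on this page.
record Quote(String symbol, double mid, double bid, double ask) {}

// Stand-in for Joule's StreamEvent: an event type plus named attributes.
// The real class ships with the Joule SDK and may look quite different.
record SimpleStreamEvent(String type, Map<String, Object> attributes) {}

// Hypothetical parser contract; Joule's CustomTransformer API is not reproduced here.
interface DomainParser<T> {
    SimpleStreamEvent parse(T source);
}

final class QuoteToStreamEventParserSketch implements DomainParser<Quote> {
    @Override
    public SimpleStreamEvent parse(Quote quote) {
        if (quote == null) {
            throw new IllegalArgumentException("quote must not be null");
        }
        // Copy each domain field into the event's attribute map, preserving insertion order.
        Map<String, Object> attributes = new LinkedHashMap<>();
        attributes.put("symbol", quote.symbol());
        attributes.put("mid", quote.mid());
        attributes.put("bid", quote.bid());
        attributes.put("ask", quote.ask());
        return new SimpleStreamEvent("quote", attributes);
    }
}
```

The parser is kept free of any Kafka types so that the same mapping logic could be unit tested without a broker.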
Processing steps
Consumption: The Kafka client receives the event data as a generic object via the specified value deserialiser.
Parsing: The object is passed to QuoteToStreamEventParser, which converts it from a Quote object to a StreamEvent.
Streaming pipeline: The StreamEvent is added to Joule’s streaming pipeline queue.
Acknowledgment: The Kafka infrastructure acknowledges the event asynchronously to manage processing flow.
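The four steps above can be pictured with a small, self-contained sketch. It uses only JDK classes and assumes nothing about Joule's or Kafka's internals: IngestionFlowSketch, its queue and its list of acknowledged offsets are hypothetical names chosen for illustration, and the asynchronous acknowledgment is modelled with a CompletableFuture rather than a real Kafka commit.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical model of the consume -> parse -> enqueue -> acknowledge flow.
// None of these types are Joule or Kafka classes.
final class IngestionFlowSketch<T, E> {
    private final Function<T, E> parser;                                          // parsing step
    private final BlockingQueue<E> pipelineQueue = new LinkedBlockingQueue<>();   // streaming pipeline step
    private final List<Long> acknowledgedOffsets = new CopyOnWriteArrayList<>();  // records acknowledgments

    IngestionFlowSketch(Function<T, E> parser) {
        this.parser = parser;
    }

    // Called once per consumed record; `value` is the already deserialised generic object.
    CompletableFuture<Void> onRecord(long offset, T value) {
        E event = parser.apply(value);
        pipelineQueue.add(event);
        // Acknowledge on another thread so the consuming thread is not blocked.
        return CompletableFuture.runAsync(() -> acknowledgedOffsets.add(offset));
    }

    BlockingQueue<E> pipelineQueue() {
        return pipelineQueue;
    }

    List<Long> acknowledgedOffsets() {
        return acknowledgedOffsets;
    }
}
```

In this sketch the event is queued before the acknowledgment is scheduled, so an acknowledged offset always corresponds to an event that has already reached the queue.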
Deserialisation attributes schema
parser: User-provided parser implementation. Type: implementation of CustomTransformer.
key deserializer: Domain class that maps to the partition key type. Maps to the Kafka key.deserializer property. Type: String. Default: IntegerDeserializer.
value deserializer: Domain class that deserialises the incoming object to a StreamEvent. Maps to the Kafka value.deserializer property. Type: String. Default: StringDeserializer.
avro setting: AVRO ingestion specification. Type: Avro specification.
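As a rough illustration of how the defaults could resolve, the sketch below builds a plain java.util.Properties object. It assumes the two attributes map onto Kafka's standard consumer properties key.deserializer and value.deserializer, and that the default class names refer to the deserialisers in Kafka's org.apache.kafka.common.serialization package. DeserializerDefaultsSketch and toConsumerProperties are hypothetical names; how Joule actually builds its consumer configuration is not shown on this page.

```java
import java.util.Properties;

// Hypothetical helper; not part of Joule or Kafka.
final class DeserializerDefaultsSketch {
    // Kafka's standard consumer property names.
    static final String KEY_PROPERTY = "key.deserializer";
    static final String VALUE_PROPERTY = "value.deserializer";

    // Defaults listed in the attribute schema, qualified with Kafka's serialization package (assumed).
    static final String DEFAULT_KEY = "org.apache.kafka.common.serialization.IntegerDeserializer";
    static final String DEFAULT_VALUE = "org.apache.kafka.common.serialization.StringDeserializer";

    private DeserializerDefaultsSketch() {
    }

    // A null argument means the attribute was not set, so the default applies.
    static Properties toConsumerProperties(String keyDeserializer, String valueDeserializer) {
        Properties props = new Properties();
        props.setProperty(KEY_PROPERTY, keyDeserializer != null ? keyDeserializer : DEFAULT_KEY);
        props.setProperty(VALUE_PROPERTY, valueDeserializer != null ? valueDeserializer : DEFAULT_VALUE);
        return props;
    }
}
```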
Provided value deserialisers
The following deserialisation options enable Joule to ingest external data structures seamlessly:
Custom data types: To handle custom data types without using an AVRO IDL schema, configure the system with com.fractalworks.streams.transport.kafka.serializers.object.ObjectDeserializer. This configuration requires implementing a custom parser that can interpret and transform the incoming data into a format Joule can process.
StreamEvent (Binary): When connecting Joule processes in a Directed Acyclic Graph (DAG), events are passed as internal StreamEvent objects in binary format. To deserialise these events, configure the system with com.fractalworks.streams.transport.kafka.serializers.avro.AvroStreamEventDeserializer. This ensures that the binary StreamEvent is correctly interpreted and processed.
StreamEvent (JSON): For testing purposes, you may have saved StreamEvent objects as JSON strings. These events can be consumed and converted into concrete objects by configuring the system with com.fractalworks.streams.transport.kafka.serializers.json.StreamEventJsonDeserializer. This setting ensures that the JSON data is deserialised into the appropriate StreamEvent objects.
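The three options can be summarised as a lookup from payload kind to deserialiser class name. The enum below is only an illustrative sketch: PayloadKind and its accessors are hypothetical and not part of Joule, while the class names are the ones listed above. The requiresCustomParser flag reflects that this page describes a custom parser as a requirement only for custom data types.

```java
// Hypothetical summary of the provided value deserialisers; not a Joule type.
enum PayloadKind {
    CUSTOM_OBJECT(
        "com.fractalworks.streams.transport.kafka.serializers.object.ObjectDeserializer", true),
    STREAM_EVENT_BINARY(
        "com.fractalworks.streams.transport.kafka.serializers.avro.AvroStreamEventDeserializer", false),
    STREAM_EVENT_JSON(
        "com.fractalworks.streams.transport.kafka.serializers.json.StreamEventJsonDeserializer", false);

    private final String valueDeserializer;
    private final boolean requiresCustomParser;

    PayloadKind(String valueDeserializer, boolean requiresCustomParser) {
        this.valueDeserializer = valueDeserializer;
        this.requiresCustomParser = requiresCustomParser;
    }

    // Fully qualified class name to use for the value deserializer attribute.
    String valueDeserializer() {
        return valueDeserializer;
    }

    // True when a user-provided parser must accompany the deserialiser.
    boolean requiresCustomParser() {
        return requiresCustomParser;
    }
}
```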
Avro Support
Joule includes AVRO support through locally stored schemas (AVRO IDL files) to map domain objects to StreamEvents. Although AVRO schema registries are not supported yet, this feature is planned for 2025.
Example AVRO deserialisation
Here’s a configuration that uses an AVRO schema to map Quote events into StreamEvents.
To review the schema for this example, please visit the Kafka sources page.
Sample AVRO IDL for quote schema
This AVRO schema defines a Quote object with attributes like symbol, mid, bid, and ask, allowing Joule to map incoming Quote data into StreamEvents automatically for streamlined ingestion.