Kafka

The standard Kafka consumer transport ingests data from subscribed cluster topics.

Overview

The Kafka consumer transport allows Joule to ingest data from specified Kafka cluster topics, enabling integration with a flexible data infrastructure.

Kafka streams are deserialized into Joule StreamEvents for pipeline processing. Depending on the configuration, this deserialization can occur automatically or be managed through a custom transformer.
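As an illustrative sketch of the custom-transformer case, a deserializer block might sit alongside the consumer settings. Note the `transformer` key and class name below are assumptions for illustration only; consult the Ingestion documentation for the options Joule actually supports.

```yaml
kafkaConsumer:
  name: nasdaq_quotes_stream
  cluster address: KAFKA_BROKER:9092
  consumerGroupId: nasdaq
  topics:
    - quotes
  deserializer:
    # Hypothetical custom transformer class; see the Ingestion
    # documentation for the supported deserializer configuration.
    transformer: com.example.QuoteToStreamEventTransformer
```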

Example

In the following example, events consumed from the quotes topic are automatically transformed from JSON strings into StreamEvent objects:

kafkaConsumer:
  name: nasdaq_quotes_stream
  cluster address: KAFKA_BROKER:9092
  consumerGroupId: nasdaq
  topics:
    - quotes
    
  properties:
    partition.assignment.strategy: org.apache.kafka.clients.consumer.StickyAssignor

Add the address of the Kafka host to the /etc/hosts file. This enables you to use the symbolic name rather than a specific IP address.

i.e. 127.0.0.1 KAFKA_BROKER

Attributes schema

| Attribute | Description | Data type | Required |
| --- | --- | --- | --- |
| name | Name of source stream | String | |
| cluster address | Kafka cluster details. Maps to `bootstrap.servers` | `<ip-address>:<port>` | |
| consumerGroupId | Consumer group to ensure an event is read only once. Maps to `group.id` | String | |
| topics | Message topics to subscribe to | List of strings | |
| pollingTimeout | Upper bound on the amount of time the consumer can be idle before fetching more records | Long. Default: 250ms | |
| deserializer | Deserialization configuration | See Ingestion documentation | |
| properties | Additional consumer properties to be applied to the Kafka client API. See the official Kafka documentation for available properties | Properties map | |
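As a rough illustration of how these attributes relate to raw Kafka client settings, the sketch below builds the property map the underlying consumer would receive. The function name and dictionary shape are hypothetical, not part of Joule; only the `bootstrap.servers` and `group.id` mappings come from the table above.

```python
# Hypothetical sketch: mapping the transport's attributes onto
# Kafka consumer properties. Not part of Joule's actual codebase.
def to_kafka_properties(cfg: dict) -> dict:
    props = {
        # "cluster address" maps to bootstrap.servers
        "bootstrap.servers": cfg["cluster address"],
        # consumerGroupId maps to group.id
        "group.id": cfg["consumerGroupId"],
    }
    # Any additional consumer properties are passed straight
    # through to the Kafka client API.
    props.update(cfg.get("properties", {}))
    return props

config = {
    "cluster address": "KAFKA_BROKER:9092",
    "consumerGroupId": "nasdaq",
    "properties": {
        "partition.assignment.strategy":
            "org.apache.kafka.clients.consumer.StickyAssignor",
    },
}
print(to_kafka_properties(config)["bootstrap.servers"])  # KAFKA_BROKER:9092
```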
