Joule
  • Welcome to Joule's Docs
  • Why Joule?
    • Joule capabilities
  • What is Joule?
    • Key features
    • The tech stack
  • Use case enablement
    • Use case building framework
  • Concepts
    • Core concepts
    • Low code development
    • Unified execution engine
    • Batch and stream processing
    • Continuous metrics
    • Key Joule data types
      • StreamEvent object
      • Contextual data
      • GeoNode
  • Tutorials
    • Getting started
    • Build your first use case
    • Stream sliding window quote analytics
    • Advanced tutorials
      • Custom missing value processor
      • Stateless Bollinger band analytics
      • IoT device control
  • FAQ
  • Glossary
  • Components
    • Pipelines
      • Use case anatomy
      • Data priming
        • Types of import
      • Processing unit
      • Group by
      • Emit computed events
      • Telemetry auditing
    • Processors
      • Common attributes
      • Filters
        • By type
        • By expression
        • Send on delta
        • Remove attributes
        • Drop all events
      • Enrichment
        • Key concepts
          • Anatomy of enrichment DSL
          • Banking example
        • Metrics
        • Dynamic contextual data
          • Caching architecture
        • Static contextual data
      • Transformation
        • Field Tokeniser
        • Obfuscation
          • Encryption
          • Masking
          • Bucketing
          • Redaction
      • Triggers
        • Change Data Capture
        • Business rules
      • Stream join
        • Inner stream joins
        • Outer stream joins
        • Join attributes & policy
      • Event tap
        • Anatomy of a Tap
        • SQL Queries
    • Analytics
      • Analytic tools
        • User defined analytics
          • Streaming analytics example
          • User defined analytics
          • User defined scripts
          • User defined functions
            • Average function library
        • Window analytics
          • Tumbling window
          • Sliding window
          • Aggregate functions
        • Analytic functions
          • Stateful
            • Exponential moving average
            • Rolling Sum
          • Stateless
            • Normalisation
              • Absolute max
              • Min max
              • Standardisation
              • Mean
              • Log
              • Z-Score
            • Scaling
              • Unit scale
              • Robust Scale
            • Statistics
              • Statistic summaries
              • Weighted moving average
              • Simple moving average
              • Count
            • General
              • Euclidean
        • Advanced analytics
          • Geospatial
            • Entity geo tracker
            • Geofence occupancy trigger
            • Geo search
            • IP address resolver
            • Reverse geocoding
            • Spatial Index
          • HyperLogLog
          • Distinct counter
      • ML inferencing
        • Feature engineering
          • Scripting
          • Scaling
          • Transform
        • Online predictive analytics
        • Model audit
        • Model management
      • Metrics engine
        • Create metrics
        • Apply metrics
        • Manage metrics
        • Priming metrics
    • Contextual data
      • Architecture
      • Configuration
      • MinIO S3
      • Apache Geode
    • Connectors
      • Sources
        • Kafka
          • Ingestion
        • RabbitMQ
          • Further RabbitMQ configurations
        • MQTT
          • Topic wildcards
          • Session management
          • Last Will and Testament
        • Rest endpoints
        • MinIO S3
        • File watcher
      • Sinks
        • Kafka
        • RabbitMQ
          • Further configurations
        • MQTT
          • Persistent messaging
          • Last Will and Testament
        • SQL databases
        • InfluxDB
        • MongoDB
        • Geode
        • WebSocket endpoint
        • MinIO S3
        • File transport
        • Slack
        • Email
      • Serialisers
        • Serialisation
          • Custom transform example
          • Formatters
        • Deserialisers
          • Custom parsing example
    • Observability
      • Enabling JMX for Joule
      • Meters
      • Metrics API
  • DEVELOPER GUIDES
    • Setting up developer environment
      • Environment setup
      • Build and deploy
      • Install Joule
        • Install Docker demo environment
        • Install with Docker
        • Install from source
        • Install Joule examples
    • Joulectl CLI
    • API Endpoints
      • Mangement API
        • Use case
        • Pipelines
        • Data connectors
        • Contextual data
      • Data access API
        • Query
        • Upload
        • WebSocket
      • SQL support
    • Builder SDK
      • Connector API
        • Sources
          • StreamEventParser API
        • Sinks
          • CustomTransformer API
      • Processor API
      • Analytics API
        • Create custom metrics
        • Define analytics
        • Windows API
        • SQL queries
      • Transformation API
        • Obfuscation API
        • FieldTokenizer API
      • File processing
      • Data types
        • StreamEvent
        • ReferenceDataObject
        • GeoNode
    • System configuration
      • System properties
  • Deployment strategies
    • Deployment Overview
    • Single Node
    • Cluster
    • GuardianDB
    • Packaging
      • Containers
      • Bare metal
  • Product updates
    • Public Roadmap
    • Release Notes
      • v1.2.0 Join Streams with stateful analytics
      • v1.1.0 Streaming analytics enhancements
      • v1.0.4 Predictive stream processing
      • v1.0.3 Contextual SQL based metrics
    • Change history
Powered by GitBook
On this page
  • Overview
  • Examples & DSL attributes
  • Attributes schema
  • Connection attributes schema
  • Credentials attributes schema
  • JouleProviderPlugin interface
  • Bucket attributes schema
  • Additional resources

Was this helpful?

  1. Components
  2. Connectors
  3. Sinks

MinIO S3

MinIO file producer using S3 cloud or local hosted buckets

PreviousWebSocket endpointNextFile transport

Last updated 6 months ago

Was this helpful?

Overview

S3 support is provided using the MinIO Publisher Transport. Processed event data is saved as files in the MinIO, a high-performance, S3-compatible object storage system.

This setup supports cloud or local MinIO storage and uses predefined file formats for each bucket. Configurable options include:

  1. Custom schemas

  2. Batch size

  3. Object naming formats

  4. Retries

making Joule's application with MinIO suitable for scalable data storage solutions.

Driver details:

Examples & DSL attributes

The example configures a MinIOPublisher to save event data in a local S3-compatible bucket named marketdata.

Files are stored under the stocks object using a schema defined in marketDataSchema.avsc and are organised by date format. The file format is provided in AVRO.

minioPublisher:
  name: "marketdata-S3Publisher"

  connection:
    endpoint: "https://localhost"
    port: 9000
    credentials:
      access key: "XXXXXX"
      secret key: "YYYYYYYYYYYYYYY"

  bucket:
    bucketId: "marketdata"
    object name: "stocks"
    date format: "yyyyMMdd/HH"
    versioning: ENABLED
    retries: 3

  serializer:
    formatter:
      parquet formatter:
        schema: "./avro/marketDataSchema.avsc"
        temp directory: "./tmp"

  batchSize: 500000  

Attributes schema

Attribute
Description
Data Type
Required

name

Name of source stream

String

connection

Connection details

bucket

S3 bucket to ingest object data from

serializer

Serialisation configuration

batchSize

Number of eventsents to be processed and written to a single file.

Long Default: 1024

Connection attributes schema

Attribute
Description
Data Type
Required

endpoint

S3 service endpoint

String Default: https://localhost

port

Port the S3 service is hosted on

Integer Default:9000

url

Provide a fully qualified url endpoint, i.e. AWS, GCP, Azure urls. This is used over the endpoint setting if provided

URL String

region

Region where the bucket is to be accessed

String

tls

Use a TLS connection

Boolean Default: false

credentials

IAM access credentials

Credentials attributes schema

For non-production use cases the access/secret keys can be used to prove data ingestion functionality. When migrating to a production environment implement a provider plugin using the provided JouleProviderPlugin interface, see basic example below.

Attribute
Description
Data Type
Required

access key

IAM user access key

String

secret key

IAM user secret key

String

provider plugin

Custom implementation of credentials ideal for production level deployments

JouleProviderPlugin implementation

JouleProviderPlugin interface

The JWTCredentialsProvider implements the JouleProviderPlugin interface, providing methods for initialisation, validation and setting properties, but with no functionality implemented in this case.

It's a template for custom credential providers.

public class JWTCredentialsProvider implements JouleProviderPlugin {
    @Override
    public Provider getProvider() {
        return null;
    }

    @Override
    public void initialize() throws CustomPluginException {
    }

    @Override
    public void validate() throws InvalidSpecificationException {
    }

    @Override
    public void setProperties(Properties properties) {
    }
}

Bucket attributes schema

Attribute
Description
Data Type
Required

bucketId

Bucket name

String

object name

Object name to listen for events too

String

versioning

Ability to use object versioning. Valid values are either ENABLED or SUSPENDED

ENUM

bucket policy

Policy file location to be applied

String

partition by date

Write files using date partitioning

Boolean Default: true

date format

Directory date format to apply when partitioning by date

String Default: yyyyMMdd

custom directory

Custom directory to be applied after the date path. Useful when writing multiple objects to to same bucket and date partition but want an independent directory

String

headers

Object header information

Map<String,String>

user metadata

Applied user metadata to object

Map<String,String>

Additional resources

See section

See section

See section

See

Official

MinIO

io.minio:minio:8.5.4
MinIO documentation
Docker image
Serialisation Attributes
Connection attributes
Bucket attributes
Credentials section