File Watcher

Joule offers a file watcher that processes large files using stream processing.

Configuration Guide

The file watcher is designed to process large files efficiently once they have been fully received. Processing starts when a new file event is detected in the watch directory. After processing, each file is moved to the local processed directory, where it is distinguished by a completion timestamp.
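The final hand-off step described above can be sketched in a few lines of Python. This is an illustration only, not Joule's implementation: the helper name `archive_processed_file` and the `<name>.<timestamp>.<extension>` naming scheme are assumptions, since the exact way Joule applies the completion timestamp is not specified here.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def archive_processed_file(src: Path, processed_dir: Path, completed_at: datetime) -> Path:
    """Move a fully processed file into processed_dir, tagged with a completion timestamp.

    Hypothetical helper: the target naming scheme is an assumption made
    for illustration, not the format Joule itself uses.
    """
    # Create the processed directory on first use.
    processed_dir.mkdir(parents=True, exist_ok=True)

    # Embed the completion time in the file name so repeated deliveries
    # of the same file name do not overwrite each other.
    stamp = completed_at.strftime("%Y%m%dT%H%M%S")
    target = processed_dir / f"{src.stem}.{stamp}{src.suffix}"

    shutil.move(str(src), str(target))
    return target
```

For example, calling `archive_processed_file(Path("nasdaq/downloads/nasdaq.parquet"), Path("nasdaq/processed"), datetime.now(timezone.utc))` would leave the watch directory empty and place a timestamped copy in the processed directory.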

Example file-based configuration

file watcher:
  name: nasdaq_quotes_file
  topic: quotes
  file name: nasdaq.parquet
  file format: PARQUET
  watch dir: nasdaq/downloads
  processed dir: nasdaq/processed

Core Attributes

Available configuration parameters

| Attribute | Description | Data Type | Required |
|---|---|---|---|
| topic | User-defined topic to be used as the final endpoint component | String | |
| file name | Name of the file to process | String | |
| file format | Expected format of the file to process. Defined as an enumeration; see below for supported file types. | Enum. Default: PARQUET | |
| watch dir | User-defined directory where files are received | String | |
| processed dir | Location where processed files are placed upon completion | String. Default: processed | |

Supported File Types

  • PARQUET

  • ARROW_IPC

  • ORC

  • CSV

  • JSON
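As a rough illustration of how the attributes and supported file types fit together, the sketch below parses the mapping under `file watcher` into a typed object and applies the documented defaults. It is not part of Joule: the names `FileFormat`, `FileWatcherConfig`, and `parse_file_watcher` are hypothetical, and treating `topic`, `file name`, and `watch dir` as mandatory is an assumption based only on the fact that no default is listed for them.

```python
from dataclasses import dataclass
from enum import Enum


class FileFormat(Enum):
    """Supported file types, as listed in this section."""
    PARQUET = "PARQUET"
    ARROW_IPC = "ARROW_IPC"
    ORC = "ORC"
    CSV = "CSV"
    JSON = "JSON"


@dataclass(frozen=True)
class FileWatcherConfig:
    # Assumed mandatory: no default is documented for these three.
    topic: str
    file_name: str
    watch_dir: str
    # Documented defaults.
    file_format: FileFormat = FileFormat.PARQUET
    processed_dir: str = "processed"


def parse_file_watcher(section: dict) -> FileWatcherConfig:
    """Build a config from the `file watcher` mapping, applying documented defaults."""
    fmt = section.get("file format", "PARQUET")
    try:
        file_format = FileFormat[fmt]
    except KeyError:
        supported = ", ".join(f.name for f in FileFormat)
        raise ValueError(
            f"unsupported file format {fmt!r}; expected one of: {supported}"
        ) from None

    return FileWatcherConfig(
        topic=section["topic"],
        file_name=section["file name"],
        watch_dir=section["watch dir"],
        file_format=file_format,
        processed_dir=section.get("processed dir", "processed"),
    )
```

With only `topic`, `file name`, and `watch dir` supplied, the resulting object falls back to the PARQUET format and the `processed` directory, while an unlisted format such as `XML` is rejected with a `ValueError`.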
