File watcher
Process large files using stream processing
Overview
Joule offers a file watcher that processes large files using stream processing.
The file watcher is designed to process large files efficiently once they have been fully received. Processing starts when a new-file event is detected in the watch directory.
After processing, each file is moved to the local processed directory and marked with a completion timestamp.
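The watch, process, and archive cycle described above can be sketched in plain Python. This is only an illustration of the behaviour, not Joule's implementation: the function names `archive_processed` and `scan_once` are hypothetical, the sketch polls the directory once instead of reacting to file events, and the naming scheme for the completion timestamp (appended as a suffix) is an assumption, since this page does not state the exact format.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, List, Optional


def archive_processed(path: Path, processed_dir: Path,
                      completed_at: Optional[datetime] = None) -> Path:
    """Move a processed file into processed_dir, tagged with a completion timestamp."""
    completed_at = completed_at or datetime.now(timezone.utc)
    stamp = completed_at.strftime("%Y%m%dT%H%M%S")
    processed_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: <original file name>.<completion timestamp>
    target = processed_dir / f"{path.name}.{stamp}"
    shutil.move(str(path), str(target))
    return target


def scan_once(watch_dir: Path, processed_dir: Path,
              handler: Callable[[Path], None]) -> List[Path]:
    """Process every regular file currently in watch_dir, then archive it."""
    archived = []
    # sorted() materialises the listing before any file is moved
    for path in sorted(watch_dir.iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        handler(path)  # stand-in for the stream-processing step
        archived.append(archive_processed(path, processed_dir))
    return archived
```

Calling `scan_once(Path("nasdaq/downloads"), Path("nasdaq/processed"), handler)` would pass each waiting file to `handler` and then move it. The sketch assumes files appear in the watch directory only once they are complete; detecting a partially written file is outside its scope.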
Examples & DSL attributes
This example configures a file watcher named nasdaq_quotes_file to monitor the nasdaq/downloads directory for new PARQUET files (e.g., nasdaq.parquet).
After processing the file, it moves it to the nasdaq/processed directory and publishes the data to the quotes topic.
Attributes schema
| Attribute | Description | Data type | Default |
| --- | --- | --- | --- |
| topic | User-defined topic used as the final endpoint component | String | |
| file name | Name of the file to process | String | |
| file format | Format of the file to process | Enum | PARQUET |
| watch dir | User-defined directory where files are received | String | |
| processed dir | Location where processed files are placed upon completion | String | processed |
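To make the defaults concrete, the schema can be mirrored as a small Python data class. This is a sketch for illustration only: `FileWatcherSettings` and its field names are hypothetical and are not part of Joule or its DSL. The instance uses the values from the NASDAQ example described earlier, with the file name taken as written in that text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileWatcherSettings:
    """Hypothetical Python mirror of the attributes schema; not a Joule class."""
    topic: str                         # final endpoint topic
    file_name: str                     # name of the file to process
    watch_dir: str                     # directory where files are received
    file_format: str = "PARQUET"       # default taken from the schema
    processed_dir: str = "processed"   # default taken from the schema


# Values from the NASDAQ example; file_format falls back to its default.
nasdaq_quotes_file = FileWatcherSettings(
    topic="quotes",
    file_name="nasdaq.parquet",
    watch_dir="nasdaq/downloads",
    processed_dir="nasdaq/processed",
)
```

Only the attributes without a documented default are mandatory in this sketch, so omitting the file format or the processed directory yields PARQUET and processed respectively.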
Supported File Types
PARQUET
ARROW_IPC
ORC
CSV
JSON
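A configuration check against this list could look like the following sketch. The `FileFormat` enum and the `parse_file_format` helper are hypothetical names introduced here, and accepting lower-case input is an assumption; Joule's own validation may behave differently.

```python
from enum import Enum


class FileFormat(Enum):
    """The file types listed above."""
    PARQUET = "PARQUET"
    ARROW_IPC = "ARROW_IPC"
    ORC = "ORC"
    CSV = "CSV"
    JSON = "JSON"


def parse_file_format(value: str = "PARQUET") -> FileFormat:
    """Return the matching FileFormat, or raise ValueError naming the supported types."""
    try:
        # Assumption: names are matched case-insensitively
        return FileFormat[value.strip().upper()]
    except KeyError:
        supported = ", ".join(f.name for f in FileFormat)
        raise ValueError(
            f"Unsupported file format {value!r}; expected one of: {supported}"
        ) from None
```

Called without an argument, the helper returns the documented default, PARQUET.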