File watcher
Process large files using stream processing
Joule offers a file watcher that processes large files using stream processing.
The file watcher is designed to process large files efficiently once they have been fully received. Processing starts when a new file event is detected in the watch directory.
After processing, each file is moved to the local processed directory, where it is distinguished by a completion timestamp.
This example configures a file watcher named `nasdaq_quotes_file` to monitor the `nasdaq/downloads` directory for new `PARQUET` files (e.g., `nasdaq.parquet`).
After processing the file, it moves it to the `nasdaq/processed` directory and publishes the data to the `quotes` topic.
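The watch, process, and move flow described above can be sketched in Python. This is only an illustration and not Joule's implementation: the function names `archive_processed_file` and `poll_once` are hypothetical, the timestamp format is an assumption, and a single polling pass stands in for the file events that Joule reacts to.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, List, Optional


def archive_processed_file(
    src: Path, processed_dir: Path, now: Optional[datetime] = None
) -> Path:
    """Move src into processed_dir with a completion timestamp in its name."""
    now = now or datetime.now(timezone.utc)
    # Assumed naming scheme: <stem>_<UTC timestamp><suffix>
    stamp = now.strftime("%Y%m%dT%H%M%S")
    processed_dir.mkdir(parents=True, exist_ok=True)
    target = processed_dir / f"{src.stem}_{stamp}{src.suffix}"
    shutil.move(str(src), str(target))
    return target


def poll_once(
    watch_dir: Path,
    processed_dir: Path,
    suffix: str,
    handle: Callable[[Path], None],
) -> List[Path]:
    """Process each matching file currently in watch_dir, then archive it."""
    archived = []
    for path in sorted(watch_dir.glob(f"*{suffix}")):
        handle(path)  # e.g. stream the records and publish them to a topic
        archived.append(archive_processed_file(path, processed_dir))
    return archived
```

Calling `poll_once(Path("nasdaq/downloads"), Path("nasdaq/processed"), ".parquet", handler)` with a caller-supplied `handler` would process each Parquet file found in the watch directory and leave a timestamped copy in the processed directory.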
Attribute | Description | Data Type | Required |
---|---|---|---|
topic | User-defined topic to be used as the final endpoint component | String | |
file name | Name of the file to process | String | |
file format | Expected file format to process. Defined as an enumeration; see below for supported file types | Enum. Default: PARQUET | |
watch dir | User-defined directory where incoming files are received | String | |
processed dir | Location where processed files are placed upon completion | String. Default: processed | |

Supported file formats: `PARQUET`, `ARROW_IPC`, `ORC`, `CSV`, `JSON`.
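To make the attribute defaults concrete, the following Python sketch models the attributes as a small settings object. It is not part of Joule's API: `FileFormat`, `FileWatcherSettings`, and `from_attributes` are hypothetical names, and treating the attributes without a documented default as required is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class FileFormat(Enum):
    """Supported file formats as listed in this page."""
    PARQUET = "PARQUET"
    ARROW_IPC = "ARROW_IPC"
    ORC = "ORC"
    CSV = "CSV"
    JSON = "JSON"


@dataclass(frozen=True)
class FileWatcherSettings:
    # Assumption: attributes without a documented default are required.
    topic: str
    file_name: str
    watch_dir: str
    # Documented defaults from the attribute table.
    file_format: FileFormat = FileFormat.PARQUET
    processed_dir: str = "processed"


def from_attributes(attrs: dict) -> FileWatcherSettings:
    """Build settings from a mapping keyed by the documented attribute names."""
    return FileWatcherSettings(
        topic=attrs["topic"],
        file_name=attrs["file name"],
        watch_dir=attrs["watch dir"],
        # FileFormat(...) raises ValueError for an unsupported format name.
        file_format=FileFormat(attrs.get("file format", "PARQUET")),
        processed_dir=attrs.get("processed dir", "processed"),
    )
```

Omitting `file format` or `processed dir` falls back to `PARQUET` and `processed` respectively, while an unsupported format name is rejected when the settings are built.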