# File watcher

## Overview

Joule offers a file watcher that **processes large files** using **stream processing**.

The file watcher is designed to process large files efficiently once they have been fully received. Processing starts when a new file event is detected in the watch directory.

After processing, files are moved to the local processed directory and distinguished by a completion timestamp.

## Examples & DSL attributes

This example configures a **file watcher** named `nasdaq_quotes_file` to monitor the `nasdaq/downloads` directory for new `PARQUET` files (e.g., `nasdaq.parquet`).

The watcher publishes the file's data to the `quotes` topic and, once processing completes, moves the file to the `nasdaq/processed` directory.

```yaml
file watcher:
  name: nasdaq_quotes_file
  topic: quotes
  file name: nasdaq.parquet
  file format: PARQUET
  watch dir: nasdaq/downloads
  processed dir: nasdaq/processed
```

### Attributes schema

<table><thead><tr><th width="178">Attribute</th><th width="281">Description</th><th width="338">Data Type</th><th data-type="checkbox">Required</th></tr></thead><tbody><tr><td>topic</td><td>User-defined topic to be used as the final endpoint component</td><td>String</td><td>true</td></tr><tr><td>file name</td><td>Name of the file to process</td><td>String</td><td>true</td></tr><tr><td>file format</td><td>Expected file format to process. Defined as an enumeration, <a href="#supported-file-types">see below for supported file types</a></td><td>Enum<br>Default: PARQUET</td><td>true</td></tr><tr><td>watch dir</td><td>User-defined directory for received files</td><td>String</td><td>true</td></tr><tr><td>processed dir</td><td>Location where processed files are placed upon completion</td><td>String<br>Default: processed</td><td>false</td></tr></tbody></table>

### Supported File Types

* PARQUET
* ARROW\_IPC
* ORC
* CSV
* JSON
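
As a further illustration, the sketch below shows how a watcher for one of the other listed formats might be configured. It reuses only the attributes documented in the schema above. The names `orders_file`, `orders`, `orders.csv` and `orders/incoming` are hypothetical placeholders, not values shipped with Joule.

```yaml
# Illustrative sketch only: all names and paths below are hypothetical.
file watcher:
  name: orders_file
  topic: orders
  file name: orders.csv
  file format: CSV            # any value from the supported file types list
  watch dir: orders/incoming
  # processed dir is optional; when omitted, the schema lists
  # a default value of "processed".
```

Because `processed dir` is not required, leaving it out should fall back to the documented default of `processed`. How that default path is resolved is not specified here, so set the attribute explicitly if the location matters.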
