# Feature engineering

> “Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering.” — Prof. Andrew Ng.

## Objective

**Joule provides** a feature engineering processor that lets users define how features are created for predictive analytics use cases.

The processor generates an engineered value for each declared feature field. Two methods are supported:

* Raw: event field values are copied as-is (`as values`)
* Computed: values are derived using custom expressions and plugins (`compute`)

On completion, a feature map containing all the required features is generated and placed in the `StreamEvent`, ready for the next processor in the pipeline.

## Example & DSL attributes

```yaml
feature engineering:
  name: retailProfilingFeatures
  versioned: true
  features:
    as values:
      - location_code
      - store_id

    compute:
      spend_ratio:
        scripting:
          macro:
            expression: 1 - spend/avg_spend
            language: js
            variables:
              avg_spend: 133.78
      age:
        function:
          age binning:
            source field: date_of_birth
      day:
        function:
          day-of-week transform:
            source field: date
```

### Top level attributes

<table><thead><tr><th width="155">Attribute</th><th width="358">Description</th><th width="133">Data Type</th><th data-type="checkbox">Required</th></tr></thead><tbody><tr><td>name</td><td>Name of the feature set used by a predictive model</td><td>String</td><td>true</td></tr><tr><td>versioned</td><td>Boolean flag that applies a unique version identifier to the resulting feature map</td><td><p>Boolean</p><p>Default: true</p></td><td>false</td></tr><tr><td>features</td><td>List of supported feature functions</td><td>List</td><td>true</td></tr></tbody></table>

{% hint style="info" %}
The `features` attribute provides two key elements, `as values` and `compute`. At least one of them must be defined.
{% endhint %}

```yaml
feature engineering:
  name: retailProfilingFeatures
  versioned: true
  features:
    as values:
      - location_code
      - store_id
```

### Attributes schema

<table><thead><tr><th width="155">Attribute</th><th width="358">Description</th><th width="133">Data Type</th><th data-type="checkbox">Required</th></tr></thead><tbody><tr><td>as values</td><td>List of event fields whose values are copied into the feature map without any changes</td><td>List</td><td>false</td></tr><tr><td>compute</td><td>List of output variables, each mapped to a supported feature function that is executed against the passed event</td><td>List</td><td>false</td></tr></tbody></table>

```yaml
feature engineering:
  ...
  features:
    as values:
      - event field1
      - event field2
      
    compute:
      output_field:
        scripting:
          ...
      other_output_field:
        function:
          plugin_name:
            ... < plugin setting > ...
            event fields:
              - f
            variables:
              varname: value
```

## Supported feature engineering

### As value

This is the most basic function: the `StreamEvent` field value is copied into the feature map unchanged.

#### Example

The following example copies the `location_code` and `store_id` values directly into the feature map.

```yaml
feature engineering:
  ...
  features:
    as values:
      - location_code
      - store_id
```

### Expression based

Joule core provides the ability to deploy declarative expressions using the [custom analytics processor](https://docs.fractalworks.io/joule/components/analytics/analytic-tools). This capability is reused within feature engineering so that users can define custom calculations within the DSL.

#### Example

The following example computes the spend ratio for each event using a JavaScript expression.

```yaml
feature engineering:
  ...
  features:
    compute:
      spend_ratio:
        scripting:
          macro:
            expression: 1 - spend/avg_spend
            variables:
              avg_spend: 133.78
```

## Custom Plugins

Developers can add their own feature engineering functions by extending the `AbstractFeatureEngineeringFunction` class.

See [CustomUserPlugin API documentation](https://docs.fractalworks.io/joule/developer-guides/builder-sdk/transformation-api) for further details.

### Example

The following example computes the scaled price for each event using the `MinMax` algorithm.

The function used in this example extends the `AbstractFeatureEngineeringFunction` class.

```yaml
feature engineering:
  ...
  features:
    compute:
      scaled_price:
        function:
          minmax scaler:
            source field: price
            variables:
              min: 10.00
              max: 12.78
```

## Available options

Joule provides a small set of out-of-the-box (OOTB) feature engineering functions.

<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td><mark style="color:orange;"><strong>Scripting</strong></mark></td><td>Define custom analytics with declarative expressions</td><td></td><td><a href="feature-engineering/scripting">scripting</a></td></tr><tr><td><mark style="color:orange;"><strong>Scaling</strong></mark></td><td>Normalise data with various scaling methods</td><td></td><td><a href="feature-engineering/scaling">scaling</a></td></tr><tr><td><mark style="color:orange;"><strong>Transform</strong></mark></td><td>Generate analytics-ready features from data</td><td></td><td><a href="feature-engineering/transform">transform</a></td></tr></tbody></table>

## Versioning

When `versioned` is enabled, every feature map created is versioned using a random UUID.

The version is placed directly into the resulting map and can be accessed using the `feature_version` key.
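The effect on the feature map can be pictured with the following Python sketch. It is illustrative only: the helper name `add_version` is hypothetical, and storing the UUID as a string is an assumption, since only the key name and the use of a random UUID are documented here.

```python
import uuid
from typing import Any, Dict


def add_version(feature_map: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the feature map with a random UUID version attached.

    Assumption: the version is stored as a string under `feature_version`.
    """
    versioned = dict(feature_map)  # leave the caller's map untouched
    versioned["feature_version"] = str(uuid.uuid4())
    return versioned


features = {"location_code": "LDN-01", "spend_ratio": 0.5}  # made-up values
versioned_features = add_version(features)
print(versioned_features["feature_version"])
```

Because the UUID is random, two feature maps built from identical inputs still carry different `feature_version` values in this sketch.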
