Feature Engineering
“Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine learning’ is basically feature engineering.” (Prof. Andrew Ng)
Joule provides a feature engineering processor that lets users define how features are created for predictive analytics use cases.
The processor generates an engineered value for each declared feature field. Two methods are supported: raw values copied from the event, and computed values produced by custom expressions or plugins. On completion, a feature map containing all the required features is generated and placed in the StreamEvent, ready for the next processor in the pipeline.
feature engineering:
  name: retailProfilingFeatures
  versioned: true
  features:
    as values:
      - location_code
      - store_id
    compute:
      spend_ratio:
        scripting:
          macro:
            expression: 1 - spend/avg_spend
            language: js
            variables:
              avg_spend: 133.78
      age:
        function:
          age binning:
            source field: date_of_birth
      day:
        function:
          day-of-week transform:
            source field: date
Attribute | Description | Data Type | Required |
---|---|---|---|
name | Name of the feature set, which is used by a predictive model | String | |
features | List of supported feature functions | List | |
versioned | A boolean flag to apply a unique version identifier to the resulting feature map | Boolean (default: true) | |
The features attribute provides two key elements: as values and compute.
features:
  as values:
    - event field1
    - event field2
  compute:
    output_field:
      scripting:
        ...
    other_output_field:
      function:
        plugin_name:
          ... < plugin setting > ...
          event fields:
            - f
          variables:
            varname: value
Attribute | Description | Data Type | Required |
---|---|---|---|
as values | List of event fields whose values are copied into the feature map without any changes | List | |
compute | List of supported feature functions, mapped to output variables, that are executed using the passed event | List | |
Note: At least one of these attributes must be defined.
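To make the two elements concrete, here is a minimal Python sketch of how a feature map could be assembled from an event. It is not Joule's implementation: the event is modeled as a plain dict, the field values are made up, and `build_feature_map` and its parameters are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]


def build_feature_map(
    event: Event,
    as_values: Optional[List[str]] = None,
    compute: Optional[Dict[str, Callable[[Event], Any]]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: copy raw fields, then add computed features."""
    if not as_values and not compute:
        # Assumption: a definition with neither element is rejected.
        raise ValueError("define 'as values', 'compute', or both")

    features: Dict[str, Any] = {}
    # 'as values': copy the event field unchanged.
    for field in as_values or []:
        features[field] = event[field]
    # 'compute': each output field is produced by a function of the event.
    for output_field, fn in (compute or {}).items():
        features[output_field] = fn(event)
    return features


# Illustrative event; the values are invented for this sketch.
event = {"location_code": "LDN", "store_id": 42, "spend": 100.0}
feature_map = build_feature_map(
    event,
    as_values=["location_code", "store_id"],
    compute={"double_spend": lambda e: e["spend"] * 2},
)
print(feature_map)
```

Raw fields are copied first and computed fields are added afterwards, so a computed output that reuses a raw field name would overwrite it in this sketch; whether Joule behaves the same way is not stated here.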
Copying values with as values is the most basic function: the StreamEvent field value is copied into the feature map unchanged.
The example below copies the listed event field values directly into the feature map.
features:
  as values:
    - location_code
    - store_id
Joule core provides the ability to deploy declarative expressions using the custom analytics processor. This capability is reused in feature engineering so that users can define custom calculations within the DSL.
The example below computes, per event, the spend ratio using a JavaScript expression.
features:
  compute:
    spend_ratio:
      scripting:
        macro:
          expression: 1 - spend/avg_spend
          variables:
            avg_spend: 133.78
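As a worked check of the arithmetic, the same expression can be evaluated outside Joule. The Python sketch below is only an illustration, with a hypothetical `spend_ratio` function and made-up spend values; Joule itself evaluates the expression through its scripting support.

```python
AVG_SPEND = 133.78  # the 'avg_spend' variable from the example


def spend_ratio(spend: float, avg_spend: float = AVG_SPEND) -> float:
    """Evaluate 1 - spend/avg_spend for a single event."""
    return 1 - spend / avg_spend


# Spending exactly the average gives 0; spending half the average gives 0.5.
print(spend_ratio(133.78))  # 0.0
print(spend_ratio(66.89))   # 0.5
```

A spend above the average produces a negative ratio, so downstream models should expect values on both sides of zero.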
Developers can extend the feature engineering capabilities by extending AbstractFeatureEngineeringFunction. See the CustomUserPlugin API documentation for further details.
The example below computes, per event, the scaled price using the MinMax algorithm. This example implements the AbstractFeatureEngineeringFunction class.
features:
  compute:
    scaled_price:
      function:
        minmax scaler:
          source field: price
          variables:
            min: 10.00
            max: 12.78
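For reference, min-max scaling maps a value to (value - min) / (max - min), so that values between min and max land in the range 0 to 1. The Python sketch below applies that standard formula to the variables from the example. It assumes this is the formula the plugin uses, and `minmax_scale` is a hypothetical name, not part of the Joule API.

```python
def minmax_scale(value: float, min_value: float, max_value: float) -> float:
    """Standard min-max scaling: (value - min) / (max - min)."""
    if max_value == min_value:
        raise ValueError("max and min must differ")
    return (value - min_value) / (max_value - min_value)


# Variables from the example: min = 10.00, max = 12.78.
print(minmax_scale(10.00, 10.00, 12.78))  # 0.0
print(minmax_scale(12.78, 10.00, 12.78))  # 1.0
print(round(minmax_scale(11.39, 10.00, 12.78), 3))  # 0.5
```

Prices outside the configured min and max fall outside the 0 to 1 range in this sketch; no clipping is applied.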
Joule provides a small set of out-of-the-box (OOTB) feature engineering functions.
Every feature map created is versioned using a random UUID. The version is placed directly into the resulting map and is accessed using the feature_version key.
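The effect can be pictured with a short Python sketch. It assumes a version-4 (random) UUID stored as a string under the feature_version key; the `add_version` helper is hypothetical, and the exact representation Joule uses may differ.

```python
import uuid
from typing import Any, Dict


def add_version(feature_map: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the feature map tagged with a random UUID."""
    versioned = dict(feature_map)
    versioned["feature_version"] = str(uuid.uuid4())
    return versioned


# Illustrative feature values; only the 'feature_version' key comes from the text.
features = add_version({"location_code": "LDN", "store_id": 42})
print(features["feature_version"])  # a different UUID on every run
```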