Feature engineering
Decorate a feature vector with enriched features specific to the deployed model
“Coming up with features is difficult, time-consuming, requires expert knowledge. 'Applied machine learning' is basically feature engineering.” — Prof. Andrew Ng.
Objective
Joule provides a feature engineering processor that lets users define how features are created for predictive analytics use cases.
The processor generates an engineered value for each declared feature field. Two methods are supported:
raw: copy the event field value into the feature map unchanged
compute: derive values using custom expressions and plugins
On completion, a feature map containing all the required features is generated and placed in the StreamEvent, ready for the next processor in the pipeline.
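The flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Joule's implementation: the function name `build_feature_map`, the dictionary-based event, and the sample field values are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


def build_feature_map(
    event: Event,
    as_values: List[str],
    compute: Dict[str, Callable[[Event], Any]],
) -> Dict[str, Any]:
    """Assemble a feature map from raw copies and computed values (illustrative only)."""
    # Raw features: copy the event field value unchanged.
    features = {field: event[field] for field in as_values}
    # Computed features: each output name maps to a function of the event.
    for output_name, fn in compute.items():
        features[output_name] = fn(event)
    return features


# Hypothetical event; only the field names come from the examples on this page.
event = {"location_code": "LDN-01", "store_id": 42, "spend": 100.0}
feature_map = build_feature_map(
    event,
    as_values=["location_code", "store_id"],
    compute={"spend_ratio": lambda e: 1 - e["spend"] / 133.78},
)
print(feature_map)
```

Only the declared fields end up in the feature map; the `spend` field is used by the computation but is not itself copied.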
Example & DSL attributes
feature engineering:
  name: retailProfilingFeatures
  versioned: true
  features:
    as values:
      - location_code
      - store_id
    compute:
      spend_ratio:
        scripting:
          macro:
            expression: 1 - spend/avg_spend
            language: js
            variables:
              avg_spend: 133.78
      age:
        function:
          age binning:
            source field: date_of_birth
      day:
        function:
          day-of-week transform:
            source field: date

Top level attributes
name (String): Name of the feature set, which is used by the predictive model.
versioned (Boolean, default: true): Flag that applies a unique version identifier to the resulting feature map.
features (List): List of supported feature functions.
feature engineering:
  name: retailProfilingFeatures
  versioned: true
  features:
    as values:
      - location_code
      - store_id

Attributes schema
as values (List): List of event fields whose values are copied into the feature map without any changes.
compute (List): List of supported feature functions, each mapped to an output variable and executed against the passed event.
feature engineering:
  ...
  features:
    as values:
      - event field1
      - event field2
    compute:
      output_field:
        scripting:
          ...
      other_output_field:
        function:
          plugin_name:
            ... < plugin setting > ...
            event fields:
              - f
            variables:
              varname: value

Supported feature engineering
As value
This is the most basic function: the StreamEvent field value is copied into the feature map unchanged.
Example
The following example copies the location_code and store_id values directly into the feature map.
feature engineering:
  ...
  features:
    as values:
      - location_code
      - store_id

Expression based
Joule core provides the ability to deploy declarative expressions using the custom analytics processor. This has been reused within the context of feature engineering to enable users to define custom calculations within the DSL.
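To make the arithmetic of such an expression concrete, here is a minimal Python sketch of the `1 - spend/avg_spend` calculation used in this page's examples. It assumes `spend` is read from the event and `avg_spend` is supplied as a constant variable; the function name and the sample spend values are hypothetical, and Joule itself evaluates the expression through its scripting support rather than through code like this.

```python
def spend_ratio(spend: float, avg_spend: float = 133.78) -> float:
    """Evaluate 1 - spend/avg_spend for a single event (illustrative only)."""
    return 1 - spend / avg_spend


# Hypothetical spend values chosen to show how the sign behaves.
below_average = spend_ratio(66.89)    # spend is half the average: positive ratio
at_average = spend_ratio(133.78)      # spend equals the average: ratio is zero
above_average = spend_ratio(200.67)   # spend is 1.5x the average: negative ratio

print(below_average, at_average, above_average)
```

The resulting feature is positive for below-average spenders, zero at the average, and negative for above-average spenders.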
Example
The following example computes the spend ratio for each event using a JavaScript expression.
feature engineering:
  ...
  features:
    compute:
      spend_ratio:
        scripting:
          macro:
            expression: 1 - spend/avg_spend
            variables:
              avg_spend: 133.78

Custom Plugins
Developers can add to the feature engineering capabilities by extending the AbstractFeatureEngineeringFunction class.
See CustomUserPlugin API documentation for further details.
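As an illustration of the kind of calculation a plugin might perform, the sketch below shows standard min-max scaling in Python, using the bounds 10.00 and 12.78 from the minmax scaler configuration on this page. It is an assumption that Joule's minmax scaler follows this textbook formula; the function name `minmax_scale`, the sample price, and the zero-range check are hypothetical and do not reflect the plugin API.

```python
def minmax_scale(value: float, minimum: float, maximum: float) -> float:
    """Rescale value linearly so that minimum maps to 0.0 and maximum to 1.0."""
    if maximum == minimum:
        # Guard added for this sketch: a zero-width range cannot be scaled.
        raise ValueError("minimum and maximum must differ")
    return (value - minimum) / (maximum - minimum)


# Hypothetical price halfway between the configured bounds.
scaled_price = minmax_scale(11.39, minimum=10.00, maximum=12.78)
print(scaled_price)  # approximately 0.5
```

Values outside the configured bounds fall outside the 0 to 1 range under this formula; whether a real plugin clips them is not covered here.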
Example
The following example computes the scaled price for each event using the MinMax algorithm.
The plugin used here extends the AbstractFeatureEngineeringFunction class.
feature engineering:
  ...
  features:
    compute:
      scaled_price:
        function:
          minmax scaler:
            source field: price
            variables:
              min: 10.00
              max: 12.78

Available options
Joule provides a small set of out-of-the-box feature engineering functions.
Versioning
Every feature map created is versioned using a random UUID.
The version is placed directly into the resulting map and can be accessed using the feature_version key.
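A minimal Python sketch of this behaviour, assuming the version is stored as the string form of a random (version 4) UUID. The function name `add_version` and the sample feature values are hypothetical; only the `feature_version` key comes from the text above.

```python
import uuid
from typing import Any, Dict


def add_version(feature_map: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the feature map stamped with a random UUID version."""
    versioned = dict(feature_map)  # leave the caller's map untouched
    # Assumption: the version is kept as a string under the feature_version key.
    versioned["feature_version"] = str(uuid.uuid4())
    return versioned


features = {"location_code": "LDN-01", "store_id": 42}
versioned_features = add_version(features)
print(versioned_features["feature_version"])
```

Because the UUID is random, two feature maps built from identical inputs still receive different versions.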