Transform
Generate analytics-ready features from data
Objective
Feature engineering prepares raw data for analysis by creating new, insightful features:
Log transform: Applies a log function to positive values; commonly used to handle skewed data.
Day of week transform: Converts a date to its day of the week as a number (1-7).
Day binning: Categorises a date as a weekday (1) or weekend (2).
Age binning: Categorises ages into specified age ranges for easier analysis.
Each method produces targeted features, simplifying data for analytics.
Log transform
The log transform replaces each value x with log(x), where x must be greater than zero.
Example
feature engineering:
  ...
  features:
    compute:
      log_spend:
        function:
          log transform:
            source field: spend
Attributes schema
source field
  Description: The column on which to perform the calculation
  Type: Double
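To illustrate what the log transform computes, here is a minimal Python sketch. It is not the product's implementation: the function name `log_transform` is hypothetical, the natural logarithm is assumed because the source does not state the base, and raising an error for non-positive input is an illustrative choice.

```python
import math


def log_transform(value: float) -> float:
    """Return log(value) for a strictly positive value.

    Assumption: natural log (base e); the documentation does not name the base.
    """
    if value <= 0:
        # Assumption: reject non-positive input, since log is undefined there.
        raise ValueError("log transform requires a value greater than zero")
    return math.log(value)


if __name__ == "__main__":
    # Skewed spend values are compressed onto a much narrower scale.
    for spend in (1.0, 10.0, 1000.0):
        print(spend, log_transform(spend))
```

Because the logarithm grows slowly, large outliers in a field such as spend are pulled closer to the bulk of the data, which is why the transform is commonly applied to skewed values.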
Day of week transform
Converts the passed date object to its day of the week as a number between 1 and 7, where the week starts on Monday (Monday = 1).
Supported date objects:
java.time.LocalDate
java.sql.Date
org.joda.time.DateTime
Example
feature engineering:
  ...
  features:
    compute:
      day_of_week:
        function:
          day-of-week transform:
            source field: date
Attributes schema
source field
  Description: The column on which to perform the calculation
  Type: Double
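The Monday = 1 numbering described for this transform matches the ISO weekday convention. The following Python sketch shows the mapping; the function name `day_of_week` is hypothetical and the code only illustrates the numbering, not how the product handles the Java date classes listed above.

```python
from datetime import date


def day_of_week(d: date) -> int:
    """Return the day of the week as 1 (Monday) through 7 (Sunday)."""
    # date.isoweekday() already uses the Monday = 1 ... Sunday = 7 convention.
    return d.isoweekday()


if __name__ == "__main__":
    print(day_of_week(date(2023, 1, 2)))  # a Monday
    print(day_of_week(date(2023, 1, 1)))  # a Sunday
```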
Day binning
Categorises a date as a weekday or a weekend day, following the Gregorian calendar:
Weekday (Mon-Fri) = 1
Weekend (Sat-Sun) = 2
Example
feature engineering:
  ...
  features:
    compute:
      day_bin:
        function:
          day binning:
            source field: date
Attributes schema
source field
  Description: The column on which to perform the calculation
  Type: Double
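As a rough illustration of the weekday and weekend categories, the sketch below maps a date to 1 or 2 in Python. The function name `day_bin` is hypothetical, and the code is only meant to show the rule stated in this section.

```python
from datetime import date


def day_bin(d: date) -> int:
    """Return 1 for a weekday (Mon-Fri) and 2 for a weekend day (Sat-Sun)."""
    # isoweekday() is 1..5 for Monday..Friday and 6..7 for Saturday..Sunday.
    return 1 if d.isoweekday() <= 5 else 2


if __name__ == "__main__":
    print(day_bin(date(2023, 1, 6)))  # Friday
    print(day_bin(date(2023, 1, 7)))  # Saturday
```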
Age binning
Assigns a passed age to one of the pre-configured age bins. The age can be passed either as an integer or as a date object.
Example
feature engineering:
  ...
  features:
    compute:
      age_bin:
        function:
          age binning:
            bins: [[0, 18], [19, 21], [22, 40], [41, 55], [56, 76]]
            base date: 2023-01-01
            source field: current_age
Attributes schema
bins
  Description: Array of age bins to use. Default bins are 0-9, 10-19, ..., 110-119.
  Type: Int[][]
as date
  Description: Whether the passed event field is a supported date object. Supported date classes: java.time.LocalDate, java.sql.Date.
  Type: Boolean
  Default: false
base date
  Description: The date used to calculate the age. Defaults to the date the process is started.
  Type: String
  Format: YYYY-MM-DD
source field
  Description: The column on which to perform the calculation
  Type: Double
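The interaction between the bins, as date, and base date attributes can be sketched in Python as follows. This is an illustration under several assumptions that the source does not confirm: bin bounds are treated as inclusive at both ends, a date input is treated as a birth date, the result is returned as the matching (low, high) pair, and None is returned when no bin matches. All function and parameter names are hypothetical.

```python
from datetime import date
from typing import List, Optional, Sequence, Tuple, Union

# Default bins as documented: 0-9, 10-19, ..., 110-119.
DEFAULT_BINS: List[Tuple[int, int]] = [(lo, lo + 9) for lo in range(0, 120, 10)]


def age_in_years(birth: date, base: date) -> int:
    """Whole years elapsed between birth and base."""
    had_birthday = (base.month, base.day) >= (birth.month, birth.day)
    return base.year - birth.year - (0 if had_birthday else 1)


def age_bin(
    value: Union[int, float, date],
    bins: Sequence[Sequence[int]] = DEFAULT_BINS,
    as_date: bool = False,
    base_date: Optional[date] = None,
) -> Optional[Tuple[int, int]]:
    """Return the (low, high) bin containing the age, or None if none matches."""
    if as_date:
        # Assumption: the date is a birth date, aged relative to base_date
        # (defaulting to today, mirroring "the date the process is started").
        age = age_in_years(value, base_date or date.today())
    else:
        # Assumption: fractional ages are truncated to whole years.
        age = int(value)
    for lo, hi in bins:
        if lo <= age <= hi:  # assumption: both bounds inclusive
            return (lo, hi)
    return None


if __name__ == "__main__":
    bins = [[0, 18], [19, 21], [22, 40], [41, 55], [56, 76]]
    print(age_bin(20, bins=bins))
    print(age_bin(date(1990, 6, 15), bins=bins, as_date=True,
                  base_date=date(2023, 1, 1)))
```

Fixing the base date, as in the example configuration, keeps the derived ages reproducible across runs, whereas the default ties the result to the day the process is executed.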