Transform
Generate analytics-ready features from data
Objective
Feature engineering prepares raw data for analysis by creating new, insightful features:
Log transform: Applies a log function to positive values, commonly used to handle skewed data.
Day of week transform: Converts a date to its day of the week as a number (1-7).
Day binning: Categorises a date as a weekday (1) or weekend (2).
Age binning: Categorises ages into specified age ranges for easier analysis.
Each method produces targeted features, simplifying data for analytics.
Log transform
Log transformation replaces each value x of a variable with log(x), where x must be a positive number (greater than zero).
Example
feature engineering:
  ...
  features:
    compute:
      log_spend:
        function:
          log transform:
            source field: spend
Attributes schema
source field (Double): The column to perform the calculation upon.
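To illustrate what the log transform computes, here is a minimal Python sketch. It is not the product's implementation: the function name `log_transform` is hypothetical, the natural logarithm is assumed because the source does not state a base, and returning `None` for non-positive input is also an assumption.

```python
import math
from typing import Optional


def log_transform(x: float) -> Optional[float]:
    """Return log(x) for strictly positive x, otherwise None."""
    # Assumption: the source only defines the transform for x > 0;
    # how non-positive values are handled is not documented, so this
    # sketch returns None for them.
    if x <= 0:
        return None
    # Assumption: natural logarithm (the source does not name a base).
    return math.log(x)


# Skewed spend values are compressed onto a much narrower scale.
spend = [1.0, 10.0, 100.0, 1000.0]
log_spend = [log_transform(v) for v in spend]
```

Because each tenfold increase in `spend` adds the same constant to `log_spend`, long right tails are pulled in, which is why the transform is commonly applied to skewed data.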
Day of week transform
Converts the passed date object to its day of the week as a number between 1 and 7, where the week starts on Monday (Monday = 1).
Supported date objects:
java.time.LocalDate
java.sql.Date
org.joda.time.DateTime
Example
feature engineering:
  ...
  features:
    compute:
      day_of_week:
        function:
          day-of-week transform:
            source field: date
Attributes schema
source field (Double): The column to perform the calculation upon.
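The Monday = 1 numbering described above matches the ISO 8601 weekday convention. The following Python sketch shows the mapping; the function name `day_of_week` is hypothetical, and Python's `datetime.date` stands in for the supported Java date classes.

```python
from datetime import date


def day_of_week(d: date) -> int:
    """Return the day of the week as 1 (Monday) through 7 (Sunday)."""
    # date.isoweekday() uses the ISO convention: Monday = 1, Sunday = 7.
    return d.isoweekday()


monday = day_of_week(date(2023, 1, 2))   # 2 January 2023 was a Monday
sunday = day_of_week(date(2023, 1, 1))   # 1 January 2023 was a Sunday
```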
Day binning
Categorise a day into one of two categories following the Gregorian calendar.
Weekday (Mon-Fri) = 1
Weekend (Sat-Sun) = 2
Example
feature engineering:
  ...
  features:
    compute:
      day_bin:
        function:
          day binning:
            source field: date
Attributes schema
source field (Double): The column to perform the calculation upon.
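The weekday/weekend rule above can be sketched in a few lines of Python. The function name `day_bin` is hypothetical, and this is an illustration of the rule rather than the product's code.

```python
from datetime import date

WEEKDAY = 1  # Monday to Friday
WEEKEND = 2  # Saturday and Sunday


def day_bin(d: date) -> int:
    """Return 1 for a weekday (Mon-Fri) and 2 for a weekend day (Sat-Sun)."""
    # isoweekday() is 1..5 for Monday..Friday and 6..7 for Saturday..Sunday.
    return WEEKDAY if d.isoweekday() <= 5 else WEEKEND


friday = day_bin(date(2023, 1, 6))     # 6 January 2023 was a Friday
saturday = day_bin(date(2023, 1, 7))   # 7 January 2023 was a Saturday
```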
Age binning
Categorise a passed age, supplied either as an integer or as a date object, into one of the pre-configured age bins.
Example
feature engineering:
  ...
  features:
    compute:
      age_bin:
        function:
          age binning:
            bins: [[0, 18], [19, 21], [22, 40], [41, 55], [56, 76]]
            base date: 2023-01-01
            source field: current_age
Attributes schema
bins (Int[][]): Array of age bins to use. Default: 0-9, 10-19, ..., 110-119.
as date (Boolean): Whether the passed event field is a supported date object. Supported date classes: java.time.LocalDate, java.sql.Date. Default: false.
base date (String): The date used to calculate the age. Format: YYYY-MM-DD. Default: the date on which the process is started.
source field (Double): The column to perform the calculation upon.
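The sketch below shows one way the age binning attributes could fit together. It rests on several assumptions that the source does not confirm: the function names are hypothetical, bin bounds are treated as inclusive, a date input is interpreted as a birth date and converted to whole years elapsed at the base date, the result is returned as the matching (low, high) pair, and ages outside every bin yield `None`.

```python
from datetime import date
from typing import List, Optional, Tuple, Union

Bin = Tuple[int, int]

# Default bins as described in the schema: 0-9, 10-19, ..., 110-119.
DEFAULT_BINS: List[Bin] = [(lo, lo + 9) for lo in range(0, 120, 10)]


def age_in_years(birth: date, base: date) -> int:
    """Whole years elapsed between birth and base."""
    years = base.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the base year.
    if (base.month, base.day) < (birth.month, birth.day):
        years -= 1
    return years


def age_bin(
    value: Union[int, float, date],
    bins: List[Bin] = DEFAULT_BINS,
    base_date: Optional[date] = None,
    as_date: bool = False,
) -> Optional[Bin]:
    """Return the (low, high) bin containing the age, or None if no bin matches."""
    if as_date:
        # Assumption: the date is a birth date; base date defaults to today.
        age = age_in_years(value, base_date or date.today())
    else:
        # Assumption: fractional ages are truncated to whole years.
        age = int(value)
    for low, high in bins:
        if low <= age <= high:  # Assumption: both bounds are inclusive.
            return (low, high)
    return None


bins = [(0, 18), (19, 21), (22, 40), (41, 55), (56, 76)]
from_int = age_bin(30, bins)
from_date = age_bin(date(2000, 6, 15), bins, base_date=date(2023, 1, 1), as_date=True)
```

Passing an explicit base date, as the configuration example does with 2023-01-01, keeps the computed ages reproducible across runs instead of drifting with the date on which the process starts.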