FieldTokenizer API

Tokenisation is commonly used in data engineering pipelines to extract key data elements from aggregate values.

Stream processing uses the same functionality to drive further processing, such as enriching events with external data or decoding a composite value into its component parts to drive complex logic paths. Joule provides an API to define a tokenisation process.

API

/**
 * Field tokenizer that will provide a map of tokenized attributes
 * from the passed object
 */
public interface FieldTokenizer {
    Optional<Map<String, Object>> decode(Object value);
}
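
To illustrate the contract, the following is a minimal sketch of one possible implementation. The interface is redeclared locally so that the snippet compiles without the Joule SDK. `KeyValueTokenizer` and the `key=value;key=value` input format are hypothetical and chosen only for illustration. The sketch also assumes that returning `Optional.empty()` is the intended way to signal a value that cannot be tokenised.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Local copy of the interface shown above, so the sketch is self-contained.
interface FieldTokenizer {
    Optional<Map<String, Object>> decode(Object value);
}

// Hypothetical tokenizer: splits "k1=v1;k2=v2" into a map of attributes.
class KeyValueTokenizer implements FieldTokenizer {
    @Override
    public Optional<Map<String, Object>> decode(Object value) {
        // Only string values are supported in this sketch
        if (!(value instanceof String)) {
            return Optional.empty();
        }
        Map<String, Object> tokens = new LinkedHashMap<>();
        for (String pair : ((String) value).split(";")) {
            String[] kv = pair.split("=", 2);
            // A malformed pair makes the whole value undecodable
            if (kv.length != 2 || kv[0].trim().isEmpty()) {
                return Optional.empty();
            }
            tokens.put(kv[0].trim(), kv[1].trim());
        }
        return Optional.of(tokens);
    }
}
```

Returning an empty `Optional` rather than throwing lets the caller decide how to handle values that do not match the expected format.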

Code example

Below is a straightforward example that splits a comma-separated string and extracts the latitude and longitude values from it.

import com.fractalworks.streams.sdk.referenceData.FieldTokenizer;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class LatitudeLongitudeDecoder implements FieldTokenizer {

    public LatitudeLongitudeDecoder() {}

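    @Override
    public Optional<Map<String, Object>> decode(Object o) {
        // Only string values can be tokenised
        if (o instanceof String) {
            String[] co = ((String) o).split(",");
            // Guard against input with fewer than two comma-separated tokens
            if (co.length < 2) {
                return Optional.empty();
            }
            Map<String, Object> map = new HashMap<>();
            // The second token is read as latitude and the first as longitude
            map.put("latitude", Float.parseFloat(co[1]));
            map.put("longitude", Float.parseFloat(co[0]));
            return Optional.of(map);
        }
        // Any other type cannot be decoded
        return Optional.empty();
    }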
}
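
The decoder above might be exercised as in the following sketch. `FieldTokenizer` is redeclared locally as a stand-in for the SDK interface so that the snippet runs on its own, and `TokenizerDemo` and the sample coordinate string are hypothetical. How Joule itself wires a tokenizer into a pipeline is not covered here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Local stand-in for the SDK interface, so the sketch compiles without Joule.
interface FieldTokenizer {
    Optional<Map<String, Object>> decode(Object value);
}

// Same decoding logic as the example above, with a guard for short input.
class LatitudeLongitudeDecoder implements FieldTokenizer {
    @Override
    public Optional<Map<String, Object>> decode(Object o) {
        if (o instanceof String) {
            String[] co = ((String) o).split(",");
            if (co.length < 2) {
                return Optional.empty();
            }
            Map<String, Object> map = new HashMap<>();
            // Second token is latitude, first token is longitude
            map.put("latitude", Float.parseFloat(co[1]));
            map.put("longitude", Float.parseFloat(co[0]));
            return Optional.of(map);
        }
        return Optional.empty();
    }
}

// Hypothetical driver class showing how a caller might consume the result.
class TokenizerDemo {
    public static void main(String[] args) {
        FieldTokenizer tokenizer = new LatitudeLongitudeDecoder();

        // A well-formed value yields a map holding both attributes
        tokenizer.decode("-0.1276,51.5072").ifPresent(fields ->
                System.out.println(fields.get("latitude") + " / " + fields.get("longitude")));

        // A non-string value yields an empty Optional, so nothing is printed
        tokenizer.decode(42).ifPresent(fields -> System.out.println(fields));
    }
}
```

Because the result is an `Optional`, the caller handles undecodable values explicitly instead of checking for `null`.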
