Tokenisation is used in many data engineering pipelines to extract key data elements from aggregate values.
Stream processing uses the same functionality to drive further processing requirements, such as external data enrichment and decoding a composite value into its component parts to drive complex logic paths. Joule provides an API for defining a tokenisation process.
API
```java
/**
 * Field tokenizer that will provide a map of tokenized attributes
 * from the passed object
 */
public interface FieldTokenizer {
    Optional<Map<String, Object>> decode(Object value);
}
```
Code example
Below is a straightforward example that splits a comma-separated field value into its component parts.
```java
import com.fractalworks.streams.sdk.referenceData.FieldTokenizer;

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class LatitudeLongitudeDecoder implements FieldTokenizer {

    public LatitudeLongitudeDecoder() {}

    @Override
    public Optional<Map<String, Object>> decode(Object o) {
        if (o instanceof String) {
            // Expected input format: "longitude,latitude"
            String[] co = ((String) o).split(",");
            Map<String, Object> map = new HashMap<>();
            map.put("latitude", Float.parseFloat(co[1]));
            map.put("longitude", Float.parseFloat(co[0]));
            return Optional.of(map);
        }
        // Any non-string value cannot be tokenized
        return Optional.empty();
    }
}
```
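To illustrate how a pipeline might invoke such a tokenizer, the sketch below declares a local stand-in for the `FieldTokenizer` interface (mirroring the SDK signature so the snippet compiles on its own; the SDK import is replaced only for self-containment) and runs the same decoding logic against a sample coordinate string:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Local stand-in for com.fractalworks.streams.sdk.referenceData.FieldTokenizer,
// declared here only so this example is self-contained.
interface FieldTokenizer {
    Optional<Map<String, Object>> decode(Object value);
}

public class TokenizerDemo {

    // Same decoding logic as LatitudeLongitudeDecoder, expressed as a lambda.
    static final FieldTokenizer TOKENIZER = value -> {
        if (value instanceof String) {
            // Input is assumed to be "longitude,latitude", matching the
            // index order used in the decoder above.
            String[] co = ((String) value).split(",");
            Map<String, Object> map = new HashMap<>();
            map.put("latitude", Float.parseFloat(co[1]));
            map.put("longitude", Float.parseFloat(co[0]));
            return Optional.of(map);
        }
        return Optional.empty();
    };

    public static void main(String[] args) {
        Optional<Map<String, Object>> tokens = TOKENIZER.decode("-0.1278,51.5074");
        tokens.ifPresent(m ->
                System.out.println("lat=" + m.get("latitude")
                        + " lon=" + m.get("longitude")));

        // A non-string input yields an empty Optional rather than an exception.
        System.out.println(TOKENIZER.decode(42).isPresent());
    }
}
```

Returning `Optional.empty()` for unrecognised inputs lets downstream processing skip the record cleanly instead of handling an exception.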