HyperLogLog

Probabilistic counter for large or high-cardinality datasets

Overview

The HyperLogLog processor implements HyperLogLog, an algorithm for the count-distinct problem that approximates the number of distinct elements in a multiset.

Calculating the exact number of distinct elements in a multiset requires memory proportional to that number, which is impractical for very large data sets. HyperLogLog instead keeps a small, fixed-size set of registers and returns an estimate with a bounded error.
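
To make the memory trade-off concrete, here is a minimal Python sketch of the general HyperLogLog idea. It is an illustration only and not this processor's implementation: the class name HyperLogLogSketch, the 64-bit hash derived from SHA-256, and the log2m value used below are all assumptions chosen for the example.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Toy HyperLogLog: m = 2**log2m registers, each storing a small rank."""

    def __init__(self, log2m: int = 12) -> None:
        if not 4 <= log2m <= 16:
            raise ValueError("log2m must be in [4, 16] for this sketch")
        self.log2m = log2m
        self.m = 1 << log2m
        self.registers = [0] * self.m
        # Bias-correction constants from the original HyperLogLog analysis.
        if self.m == 16:
            self.alpha = 0.673
        elif self.m == 32:
            self.alpha = 0.697
        elif self.m == 64:
            self.alpha = 0.709
        else:
            self.alpha = 0.7213 / (1 + 1.079 / self.m)

    @staticmethod
    def _hash64(value: str) -> int:
        # Hypothetical hash choice: first 8 bytes of SHA-256 as a 64-bit integer.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value: str) -> None:
        h = self._hash64(value)
        rest_bits = 64 - self.log2m
        index = h >> rest_bits                    # top log2m bits select a register
        rest = h & ((1 << rest_bits) - 1)         # remaining bits feed the rank
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = rest_bits - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self) -> float:
        harmonic = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / harmonic
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


# 100,000 events but only 20,000 distinct keys; memory stays at 4,096 registers.
sketch = HyperLogLogSketch(log2m=12)
for i in range(100_000):
    sketch.add(f"imsi-{i % 20_000}")
print(round(sketch.count()))
```

The printed estimate should land close to 20,000 while the sketch stores only 4,096 small integers, regardless of how many events are added. Duplicates never change the state, because a register is only ever raised to a higher rank.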

Example & DSL attributes

hyperloglog:
  name: distinctIMSICounter
  hash type: HASH64
  log2m: 3
  registry width: 3
  fields:
    - imsi
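
The schema for these attributes is not reproduced here, so the following is a hedged reading of the two numeric settings. It assumes they follow the usual HyperLogLog conventions: log2m is the base-2 logarithm of the number of registers, and registry width is the number of bits per register. The helper name hll_footprint is hypothetical, and the 1.04 / sqrt(m) error figure is the asymptotic result from the HyperLogLog analysis, so it is only a rough guide for very small m.

```python
import math


def hll_footprint(log2m: int, register_width: int) -> dict:
    """Derive sketch size and theoretical error from the two settings.

    Assumes log2m = log2(number of registers) and register_width = bits
    per register; this processor may interpret them differently.
    """
    m = 1 << log2m
    return {
        "registers": m,
        "total_bits": m * register_width,
        "max_rank": (1 << register_width) - 1,      # largest value a register can hold
        "relative_std_error": 1.04 / math.sqrt(m),  # asymptotic approximation
    }


# Values from the example configuration above.
for key, value in hll_footprint(log2m=3, register_width=3).items():
    print(key, value)
```

Under those assumptions, the example values describe 8 registers of 3 bits each, which gives a very coarse estimate; a larger log2m trades more memory for a lower error.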

Response

The processor adds a distinctIMSICounter attribute with the following result:

distinctIMSICounter:
  505010000011111 : 101
  904130454090869 : 78

Attributes schema

Additional Information
