HyperLogLog

The HyperLogLog processor provides an implementation of HyperLogLog, an algorithm for the count-distinct problem that approximates the number of distinct elements in a multiset. Calculating the exact cardinality of a multiset requires an amount of memory proportional to that cardinality, which is impractical for very large data sets.

Example

hyperloglog:
  name: distinctIMSICounter
  hash type: HASH64
  log2m: 3
  registry width: 3
  fields:
    - imsi

Attributes

| Attribute | Description | Data Type | Required |
| --- | --- | --- | --- |
| name | Name of the counter. | String | |
| hash type | Hashing algorithm applied to the event field. Supported types: DEFAULT (object hashcode), HASH32, HASH64. | Long | Default: DEFAULT |
| log2m | Base-2 logarithm of the number of probabilistic HLL registers (the sketch uses 2^log2m registers). | Integer | Default: 11 |
| registry width | The size (width) of each register in bits. Supported range: 1 to 8 bits. | Integer | Default: 5 |
| fields | List of fields on which to perform distinct counts. | List | |
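The two sizing attributes trade memory for accuracy. As a rough guide, the sketch below computes the register count, the raw register storage, and the expected relative error for a given configuration. It assumes that log2m is the base-2 logarithm of the register count and uses the standard HyperLogLog error estimate of 1.04 / sqrt(registers), which is an asymptotic figure and only approximate for very small register counts. The function name `hll_footprint` is hypothetical and not part of the processor.

```python
import math


def hll_footprint(log2m, register_width):
    """Return (register count, register storage in bytes, expected relative error).

    Assumption: storage is the raw register array only; any per-object
    overhead in a real implementation is ignored.
    """
    registers = 2 ** log2m
    storage_bytes = math.ceil(registers * register_width / 8)
    relative_error = 1.04 / math.sqrt(registers)  # standard HLL error estimate
    return registers, storage_bytes, relative_error


for label, log2m, width in [("defaults", 11, 5), ("example config", 3, 3)]:
    registers, size, err = hll_footprint(log2m, width)
    print(f"{label}: {registers} registers, {size} bytes, ~{err:.1%} error")
```

With the defaults (log2m 11, register width 5) this gives 2048 registers in 1280 bytes and an expected error of roughly 2.3%. The values in the example configuration (log2m 3, register width 3) give only 8 registers, so estimates from that configuration should be expected to be very coarse.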

Computation

The distinct count computation occurs for every event, with the result added to the output event as a named map of fields and their associated counts, using the provided name.
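To illustrate the per-event flow, here is a minimal sketch in Python. It is not the processor's actual code: the class `TinyHLL`, the function `process`, the use of SHA-256 as a stand-in for the HASH64 option, and the exact shape of the output map are all assumptions made for illustration.

```python
import hashlib
import math


class TinyHLL:
    """Illustrative HyperLogLog sketch (hypothetical, not the processor's code)."""

    def __init__(self, log2m=11, register_width=5):
        self.log2m = log2m
        self.m = 1 << log2m                        # number of registers
        self.max_rank = (1 << register_width) - 1  # largest value a register can hold
        self.registers = [0] * self.m

    def add(self, value):
        # 64-bit hash of the value (assumed stand-in for HASH64).
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h & (self.m - 1)                   # low log2m bits pick the register
        rest = h >> self.log2m                     # remaining hash bits
        # Rank = 1-based position of the lowest set bit in the remaining bits.
        rank = (rest & -rest).bit_length() if rest else 64 - self.log2m + 1
        rank = min(rank, self.max_rank)            # clamp to the register width
        self.registers[index] = max(self.registers[index], rank)

    def _alpha(self):
        # Bias-correction constants from the HyperLogLog paper; the formula is
        # only an approximation for very small register counts.
        return {16: 0.673, 32: 0.697, 64: 0.709}.get(
            self.m, 0.7213 / (1 + 1.079 / self.m))

    def count(self):
        raw = self._alpha() * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


def process(event, counters, name="distinctIMSICounter", fields=("imsi",)):
    """Update one sketch per configured field, then attach the counts to the event."""
    for field in fields:
        if field in event:
            counters.setdefault(field, TinyHLL()).add(event[field])
    event[name] = {field: sketch.count() for field, sketch in counters.items()}
    return event


counters = {}
for i in range(1000):
    out = process({"imsi": f"imsi-{i % 100}"}, counters)
print(out["distinctIMSICounter"])
```

Under these assumptions, each output event carries a map keyed by the counter name, for example `distinctIMSICounter`, whose entries map each configured field to its current estimated distinct count.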

Additional Information

See the Wikipedia article on HyperLogLog for more information.
