Joule

HyperLogLog

The HyperLogLog processor provides an implementation of the HyperLogLog algorithm for the count-distinct problem, which approximates the number of distinct elements in a multiset. Calculating the exact cardinality of a multiset requires an amount of memory proportional to that cardinality, which is impractical for very large data sets. HyperLogLog instead uses a small, fixed amount of memory and accepts a bounded estimation error.
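
To make the trade-off concrete, the following is a minimal sketch of the general HyperLogLog technique in Python. It is not Joule's implementation: the class name `HyperLogLogSketch`, the use of SHA-1 as the hash function, and the omission of large-range correction are all assumptions made for illustration. Each value is hashed, the leading bits select a register, and the register keeps the largest "rank" (position of the first 1-bit) seen in the remaining bits.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Toy HyperLogLog estimator; illustrative only, not Joule's implementation."""

    def __init__(self, log2m: int = 11):
        self.log2m = log2m
        self.m = 1 << log2m                  # number of registers = 2^log2m
        self.registers = [0] * self.m
        # Bias-correction constant; this closed form is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value: str) -> None:
        # 64-bit hash from the first 8 bytes of SHA-1 (an arbitrary choice).
        h = int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")
        tail_bits = 64 - self.log2m
        index = h >> tail_bits               # top log2m bits pick a register
        tail = h & ((1 << tail_bits) - 1)    # remaining bits
        # Rank = position of the leftmost 1-bit in the tail (1-based).
        rank = tail_bits - tail.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        # Harmonic-mean estimate over all registers.
        raw = self.alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            raw = self.m * math.log(self.m / zeros)
        return round(raw)


if __name__ == "__main__":
    sketch = HyperLogLogSketch()
    for i in range(20000):
        sketch.add(f"imsi-{i}")
    print("estimated distinct values:", sketch.count())
```

Memory use stays fixed at the register array no matter how many values are added, and adding a value that was already seen never changes the estimate.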

Example

hyperloglog:
  name: distinctIMSICounter
  hash type: HASH64
  log2m: 3
  registry width: 3
  fields:
    - imsi

Attributes

| Attribute | Description | Data Type | Required |
| --- | --- | --- | --- |
| name | Name of the counter | String | |
| hash type | Hashing algorithm to be applied to the event field. Supported types: DEFAULT (object hashcode), HASH32, HASH64 | Long | Default: DEFAULT |
| log2m | Base-2 logarithm of the number of probabilistic HLL registers | Integer | Default: 11 |
| registry width | The size (width) of each register in bits. Supported range: 1 to 8 bits. | Integer | Default: 5 |
| fields | List of fields on which to perform distinct counts | List | |
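
The `log2m` and `registry width` attributes together determine the memory footprint and accuracy of a counter. The sketch below estimates both, assuming that `log2m` is the base-2 logarithm of the register count (as the name suggests), that registers are bit-packed with no per-counter overhead, and that the usual HyperLogLog approximation of 1.04 / sqrt(m) for the relative standard error applies. The helper name `hll_footprint` is hypothetical, and the error formula is asymptotic, so it is only a rough guide for very small register counts.

```python
import math


def hll_footprint(log2m: int = 11, registry_width: int = 5):
    """Return (registers, packed size in bytes, approximate relative standard error)."""
    registers = 2 ** log2m                               # m = 2^log2m
    size_bytes = math.ceil(registers * registry_width / 8)
    std_error = 1.04 / math.sqrt(registers)              # standard HLL approximation
    return registers, size_bytes, std_error


if __name__ == "__main__":
    # Documented defaults, followed by the values used in the example above.
    for log2m, width in [(11, 5), (3, 3)]:
        registers, size_bytes, err = hll_footprint(log2m, width)
        print(f"log2m={log2m} width={width}: "
              f"{registers} registers, {size_bytes} bytes, ~{err:.1%} error")
```

Under these assumptions the defaults give 2048 registers in 1280 bytes with roughly 2.3% error. Raising `log2m` by one doubles the memory and reduces the error by a factor of about 1.4.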

Computation

The distinct count computation occurs for every event. The result is added to the output event as a named map of each field and its associated count, using the provided name.
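
The per-event flow described above can be sketched as follows. This is a hypothetical illustration, not Joule's API: events are modelled as plain dictionaries, the function name `process` is invented, and exact Python sets stand in for the HyperLogLog sketches so that the example stays short and deterministic.

```python
from collections import defaultdict


def process(events, name, fields):
    """Yield each event enriched with a map of field -> running distinct count."""
    seen = defaultdict(set)          # exact sets stand in for HLL sketches here
    for event in events:
        for field in fields:
            if field in event:
                seen[field].add(event[field])
        enriched = dict(event)       # leave the input event untouched
        enriched[name] = {field: len(seen[field]) for field in fields}
        yield enriched


if __name__ == "__main__":
    stream = [{"imsi": "a"}, {"imsi": "b"}, {"imsi": "a"}]
    for out in process(stream, "distinctIMSICounter", ["imsi"]):
        print(out)
```

Each output event carries the running count under the configured counter name, so a repeated value leaves the count unchanged.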

Additional Information

See the Wikipedia article on HyperLogLog for more information.