HyperLogLog
Probabilistic counter for large or high-cardinality datasets
Overview
The HyperLogLog processor implements an algorithm for the count-distinct problem: it approximates the number of distinct elements in a multiset.
Calculating the exact cardinality of the distinct elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets.
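To illustrate how an approximate counter avoids that memory cost, the following is a minimal Python sketch of the general HyperLogLog technique. It is not the processor's implementation: the class name, the SHA-1 based 64-bit hash, and the unpacked register list are assumptions made for illustration only. The point is that memory stays fixed at 2^log2m registers no matter how many values are added.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch with 2**log2m registers (hypothetical class)."""

    def __init__(self, log2m: int = 11) -> None:
        self.log2m = log2m
        self.m = 1 << log2m                # number of registers
        self.registers = [0] * self.m      # plain ints here, not packed to a bit width
        # Bias-correction constant; this closed form applies for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value: str) -> None:
        # Assumption: a 64-bit hash taken from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        tail_bits = 64 - self.log2m
        index = h >> tail_bits                  # top log2m bits select the register
        rest = h & ((1 << tail_bits) - 1)       # remaining bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = tail_bits - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        # Raw estimate: normalized harmonic mean of 2**register values.
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(50_000):
        hll.add(f"imsi-{i}")
        hll.add(f"imsi-{i}")   # duplicates do not change the estimate
    print(hll.count())         # an estimate near 50000, not an exact count
```

Adding the same value twice leaves the registers unchanged, which is why the structure counts distinct values rather than events.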
Example & DSL attributes
Response
The processor adds a distinctIMSICounter attribute with the following result:
Attributes schema
name
    Name of the counter.
    Type: String

hash type
    Hashing algorithm to be applied to the event field.
    Supported types: DEFAULT (object hashcode), HASH32, HASH64
    Type: Long
    Default: DEFAULT

log2m
    Base-2 logarithm of the number of probabilistic HLL registers, i.e. the counter uses 2^log2m registers.
    Type: Integer
    Default: 11

registry width
    The size (width) of each register in bits. Supported range: 1 to 8 bits.
    Type: Integer
    Default: 5

fields
    List of fields on which to perform distinct counts.
    Type: List
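As a rough guide to choosing log2m and the register width, the sketch below computes the raw register payload and the expected relative error for a given pair of values. The function and key names are hypothetical, and the byte figure covers only the packed registers, not any overhead the processor may add. The error figure uses the standard HyperLogLog approximation of 1.04 / sqrt(number of registers).

```python
import math


def hll_footprint(log2m: int = 11, regwidth: int = 5) -> dict:
    """Estimate register payload and typical error for an HLL configuration."""
    registers = 1 << log2m                 # 2**log2m registers
    total_bits = registers * regwidth      # each register is regwidth bits wide
    return {
        "registers": registers,
        "bytes": math.ceil(total_bits / 8),
        "max_register_value": (1 << regwidth) - 1,
        # Standard error of the HyperLogLog estimator: about 1.04 / sqrt(m).
        "relative_error": 1.04 / math.sqrt(registers),
    }


if __name__ == "__main__":
    # With the documented defaults (log2m=11, register width=5):
    # 2048 registers, 1280 bytes of register data, roughly 2.3% relative error.
    print(hll_footprint())
```

Raising log2m by one doubles the memory and reduces the expected error by a factor of about 1.41, so accuracy gains become progressively more expensive.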
Additional Information
Article about HyperLogLog for more information