# HyperLogLog

## Overview

The HyperLogLog processor implements HyperLogLog, an algorithm for the count-distinct problem that approximates the number of distinct elements in a multiset.

Calculating the exact cardinality of the distinct elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets.
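To illustrate the memory trade-off described above, here is a minimal Python sketch of the general HyperLogLog technique. It is not this processor's implementation: the class name `HyperLogLogSketch`, the use of SHA-256 as the hash, and the omission of a register-width cap are all assumptions made for illustration. Memory use depends only on the number of registers, not on how many values are added.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Illustrative HyperLogLog estimator (hypothetical, not the processor's code)."""

    def __init__(self, log2m: int = 11):
        # The bias constant used in count() is only valid for m >= 128 registers.
        if log2m < 7:
            raise ValueError("this sketch assumes log2m >= 7")
        self.p = log2m
        self.m = 1 << log2m              # number of registers = 2^log2m
        self.registers = [0] * self.m    # fixed size, independent of input volume

    def add(self, value: str) -> None:
        # Assumption: take the first 64 bits of SHA-256 as the hash.
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        # A real implementation would also cap the rank at what a register can hold.
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias correction for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)         # small-range (linear counting) correction
        return raw
```

Adding the same value repeatedly never changes the registers, so duplicates do not affect the estimate; only previously unseen values can raise a register.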

## Example & DSL attributes

```yaml
hyperloglog:
  name: distinctIMSICounter
  hash type: HASH64
  log2m: 3
  registry width: 3
  fields:
    - imsi
```

### Response

The processor adds a `distinctIMSICounter` attribute with the following result:

```
distinctIMSICounter:
  505010000011111 : 101
  904130454090869 : 78
```

### Attributes schema

<table><thead><tr><th width="193">Attribute</th><th width="217">Description</th><th width="219">Data Type</th><th data-type="checkbox">Required</th></tr></thead><tbody><tr><td>name</td><td>Name of the counter</td><td>String</td><td>true</td></tr><tr><td>hash type</td><td><p>Hashing algorithm to be applied to the event field.</p><p></p><p>Supported types:</p><p>DEFAULT (object hashcode), HASH32, HASH64</p></td><td><p>Long</p><p>Default: DEFAULT</p></td><td>false</td></tr><tr><td>log2m</td><td>Base-2 logarithm of the number of probabilistic HLL registers (the number of registers is 2^log2m)</td><td><p>Integer</p><p>Default: 11</p></td><td>false</td></tr><tr><td>registry width</td><td><p>The size (width) of each register, in bits. Supported range: 1 to 8 bits.</p></td><td><p>Integer</p><p>Default: 5</p></td><td>false</td></tr><tr><td>fields</td><td>List of fields on which to perform distinct counts</td><td>List</td><td>true</td></tr></tbody></table>
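As a rough guide to choosing `log2m` and `registry width`, the following Python sketch estimates register count, register memory, and the expected relative error. It assumes that `log2m` is the base-2 logarithm of the register count and uses the standard asymptotic HyperLogLog error figure of about 1.04/√m; the helper name `hll_sizing` is hypothetical, and the processor's actual memory layout and accuracy may differ.

```python
import math


def hll_sizing(log2m: int = 11, register_width: int = 5) -> dict:
    """Back-of-the-envelope sizing for an HLL sketch (illustrative only)."""
    m = 1 << log2m  # assumed register count: 2^log2m
    return {
        "registers": m,
        # Raw register storage only; any per-sketch overhead is ignored.
        "memory_bytes": math.ceil(m * register_width / 8),
        # Asymptotic standard error; less reliable for very small m.
        "relative_std_error": 1.04 / math.sqrt(m),
    }


defaults = hll_sizing()       # log2m = 11, register width = 5
example = hll_sizing(3, 3)    # the values used in the YAML example
```

Under these assumptions, the defaults give 2048 registers in 1280 bytes with a standard error of roughly 2.3%, while the example values (`log2m: 3`) give only 8 registers and a much larger error, so small `log2m` values are best kept for demonstrations.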

## Additional Information

* See the Wikipedia article on [HyperLogLog](https://en.wikipedia.org/wiki/HyperLogLog) for more information
