# Bucketing

## Objective

The objective of this page is to explain how **date** and **number** values in a `StreamEvent` can be obfuscated using variance tolerances to protect sensitive information, such as date of birth, salary and age.

This method, similar to blurring, adjusts each value within a specified range while approximately preserving the original distribution, so privacy is protected without a significant loss of data utility.

{% hint style="info" %}
This technique is more akin to blurring than to obfuscation.
{% endhint %}

## Date variance

Each date value for a specified field will be varied by a **random number of days**, whilst maintaining the original variance, range and distribution.

This can be useful where it would otherwise be possible to identify individuals by an exact match, such as date of birth.

### Example

This code defines an **obfuscation** strategy named `dateBucketing` applied to the `dateOfBirth` field.

It uses **date bucketing** with a **variance of 30**, meaning that the actual date of birth is obscured by randomly shifting it by up to 30 days.

This protects the exact date while maintaining some level of accuracy.

```yaml
obfuscation:
  name: dateBucketing
  fields:
    dateOfBirth:
      date bucketing:
        variance: 30
```
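The effect of the `variance` setting can be illustrated with a minimal Python sketch. This is not Joule's implementation: the function name `vary_date`, the symmetric shift (earlier or later) and the uniform choice of offset are assumptions made for illustration only. The default of 120 days mirrors the attribute default documented on this page.

```python
import random
from datetime import date, timedelta


def vary_date(source, variance_days=120, rng=None):
    """Shift `source` by a random whole number of days.

    Assumption: the offset is drawn uniformly from
    [-variance_days, +variance_days]; the real processor may differ.
    """
    if rng is None:
        rng = random.Random()
    offset = rng.randint(-variance_days, variance_days)  # inclusive on both ends
    return source + timedelta(days=offset)


# Seeded only so that the illustration is repeatable.
rng = random.Random(42)
original = date(1985, 6, 15)
shifted = vary_date(original, variance_days=30, rng=rng)
print(original, "->", shifted)
```

Under these assumptions the shifted date never lies more than 30 days from the original, so an exact-match lookup on date of birth fails while age-band style analysis remains broadly usable.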

### Attributes schema

<table><thead><tr><th width="193">Attribute</th><th width="217">Description</th><th width="219">Data Type</th><th data-type="checkbox">Required</th></tr></thead><tbody><tr><td>variance</td><td>Maximum number of days to vary the source date</td><td><p>Integer</p><p>Default: 120</p></td><td>false</td></tr></tbody></table>

## Number variance

Each number can be varied by a **random percentage**, whilst maintaining the original variance, range and distribution.

This can be useful where it would otherwise be possible to identify an individual by an exact match on a value such as salary.

### Example

This code defines an **obfuscation** strategy called `numberBucketing` applied to two fields: `salary` and `age`.

1. <mark style="color:green;">**salary**</mark>\
   The value of `salary` will be obscured by a **variance of 0.25**, meaning the value can fluctuate by up to 25% up or down.
2. <mark style="color:green;">**age**</mark>\
   The value of `age` will be obscured by a **variance of 0.10**, meaning the age can fluctuate by up to 10% up or down.

This technique hides the exact values while maintaining general data accuracy within the specified variance.

```yaml
obfuscation:
  name: numberBucketing
  fields:
    salary:
      number bucketing:
        variance: 0.25
    age:
      number bucketing:
        variance: 0.10
```
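A percentage variance of this kind can be sketched in Python as follows. This is a hedged illustration rather than Joule's implementation: `vary_number`, the uniform random factor and the sample event values are all assumptions. The default of 0.15 mirrors the attribute default documented on this page.

```python
import random


def vary_number(value, variance=0.15, rng=None):
    """Scale `value` by a random factor in [1 - variance, 1 + variance].

    Assumption: the factor is drawn uniformly; the real processor may differ.
    """
    if rng is None:
        rng = random.Random()
    factor = 1.0 + rng.uniform(-variance, variance)
    return value * factor


# Seeded only so that the illustration is repeatable.
rng = random.Random(7)
event = {"salary": 52000.0, "age": 41}      # hypothetical event fields
variances = {"salary": 0.25, "age": 0.10}   # matches the YAML above
blurred = {field: vary_number(event[field], variances[field], rng) for field in event}
print(blurred)
```

With a variance of 0.25, a salary of 52000 lands somewhere between 39000 and 65000. Note that this sketch returns a floating-point value even for an integer field such as `age`; whether the actual processor rounds the result is not stated on this page.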

### Attributes schema

<table><thead><tr><th width="193">Attribute</th><th width="217">Description</th><th width="219">Data Type</th><th data-type="checkbox">Required</th></tr></thead><tbody><tr><td>variance</td><td>Variance multiplier applied by the random masking process, expressed as a fraction (for example, 0.25 for 25%)</td><td><p>Double</p><p>Default: 0.15</p></td><td>false</td></tr></tbody></table>

