28 November, 2023
min reading time
I help you get insights from your data. Easier. Faster.
We’ve all had the question where an organisation wants to give insight into their data to a broader audience, but without breaking any privacy guidelines. Usually this question consists of words like “hide”, “mask” or “obscure”.
My first reaction would be to point to Row Level Security where named users are given access to parts of the data they’re allowed to see. But this doesn’t work for the situation where we have a lot of non-named users (guests) that are allowed to see the big picture, but shouldn’t be able to see the data once they drill down to a certain subgroup of the data or a sensitivity threshold has been met.
Unfortunately, the answer to this question isn’t that straightforward as Tableau doesn’t have native functionality to do any of this hiding, masking or obscuring of data dynamically.
So let’s see if we can create that ourselves!
The client asks you to mask data in a dashboard based on certain rules. The rules are dynamic and based on how the data is partitioned and filtered in a visualisation.
These are the requirements:
General rules:
There are two ways to mask data in Tableau and it depends on your data which of these two you might go for:
So, do you have NULLS in your data? Go with option 2.
Do you have negative numbers in your data? Go with option 1.
I went with option 2 as I didn’t have negatives in the data.
Since these calculations are based on the partitioning ánd filtering in the viz, you cannot pre-calculate this and thus you need either an LOD (EXCLUDE, not FIXED!) or a Table Calc. I went with the LOD route.
Let's say we have this table and we need masking for every region/year combination that has a quantity of less than 1400. This is the first privacy rule and is calculated along the red horizontal lines.
Also we need masking if the Office Supplies category differs less than 750 from the total of the region/year combination because Office Supplies is a sensitive category. This is the difference between the two columns marked in blue.
Then, we need 3 calculated fields:
This first rule calculates the total of the row (region/year combination) by excluding the category from the level of detail. If the row total is below 1400, then we set this value to -999999.
This second rule calculates the difference between the Office Supplies column and the total of the row (region/year combination). If this difference is smaller than 750, then we set this value to -999999.
Then we need another calculation to combine these previous two calculations and check if one of them triggered the masking (or: if one of them is a negative number!). An extra complication here is that we also need to account for the (sub)total rows.
When we then display these three calculations, it becomes clearer what we’re actually doing:
The final result then looks like this:
I help you get insights from your data. Easier. Faster.
Read more articles of this author