This is a short post with a useful recommendation regarding performance when working with parameters and extracts in Tableau. One of the common uses of parameters is to create something that can be called "dynamic measures". An example:
Say we are working on the usual Superstore data, but this time we have exchange rate data as well:
We'll extract the data first. Then, what we want, is to be able to dynamically select which currency to display our sales. This can be achieved using a well-known method involving a parameter:
- We create a parameter to select which currency we want to display sales in:
- We can then create the calculated field that will apply the "magic":
case [Select Currency]
when 0 then [Sales EUR]
when 1 then [Sales EUR] * [Exchange USD]
when 2 then [Sales EUR] * [Exchange PLN]
when 3 then [Sales EUR] * [Exchange AUD]
when 4 then [Sales EUR] * [Exchange BRL]
- And there we are! We can dynamically choose which currency to display our sales in:
Now this is not quite the point of this post though. As mentioned, we're focusing on optimizing the performance of our sheet fed by data that was extracted. Take a moment and try to find out what is not optimal about this setup before continuing.
To find the answer, you should know something about how Tableau Extracts work. One of the things Tableau does when extracting or refreshing data (or when you optimize an extract), is materializing columns. That is, Tableau will pre-calculate all calculated fields it can at row-level, and store those in the extract to gain time not having to calculate them. There are a few exceptions to this rule. Tableau will not (be able to) materialize fields in these cases:
- Calculations containing table calculation
- Calculations that are non-deterministic (using functions with varying results, such as NOW() or with parameters, user functions (USERNAME(), ISMEMBEROF()) or LoD calculations (including FIXED))
- Calculations using external functions such as RAWSQL and R or Python
You'll be quick to see that the second item in this list will prevent Tableau from pre-calculating our field. In this case that wouldn't make a big difference, but in some, this may have a considerable impact. There's something very simple that can be done to make sure Tableau does materialize our calculation though. We can simply split it up into separate calculations that are stable, and only apply the parameter trick afterwards:
Now that these fields are stable, Tableau can materialize them to optimize performance! All we have to do is adapt the last calculation in the stream as well:
While this may seem like an inconspicuous change, the impact can be important. There are cases, when working on large amounts of data (say, 50M rows) where this will make a difference between a 30 second loading time or a 3 second loading time.
In summary: when working with parameters and when you already have extracted data, keep the calculations involving the parameters as far "down the stream" as possible, i.e. involve them as late as possible, so Tableau can materialize as many fields as possible.