## Scatter Plots & Outliers

Scatter plots are a great way to see relationships between two measures. They help us identify correlations and outliers. Outliers can be interesting: they show you where you are straying from the norm. But they can also make the rest of your data very difficult to see!

Katie Liddle shows you how to hide these outliers from your visualisation, or focus on them instead.

## Identifying the Outliers

First we need to identify our outliers. To do this, we will look at which values fall within the normal range for each measure and see what falls outside this range. To calculate the normal range I’ve used the common rule of three standard deviations either side of the mean. The two measures we will be using are Sales and Profit.

To find the mean and standard deviation we need to use a level of detail calculation. There are two options for this, depending on how you have broken down the data on your scatter plot. We then add three times the standard deviation to the mean to get the upper limit, and subtract it from the mean to get the lower limit. You don’t have to calculate both if you only want to remove the high outliers (use the upper limit) or the low outliers (use the lower limit).
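The arithmetic behind the limits can be sketched outside Tableau in a few lines of Python. This is only an illustration of the rule described above, not the Tableau calculation itself: the function name `three_sigma_limits` and the Sales figures are made up for the example, and it assumes the sample standard deviation is the one you want.

```python
from statistics import mean, stdev


def three_sigma_limits(values):
    """Return (lower, upper): the mean minus and plus three standard deviations."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption of this sketch)
    return centre - 3 * spread, centre + 3 * spread


# Hypothetical Sales values, for illustration only
sales = [120.0, 95.0, 130.0, 110.0, 105.0, 98.0, 115.0, 125.0]
lower, upper = three_sigma_limits(sales)
print(f"lower={lower:.2f}, upper={upper:.2f}")
```

If you only care about high outliers you would keep `upper` and ignore `lower`, and the other way round for low outliers.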

## Aggregated to a Dimension

If you are using a dimension on the ‘Detail’ card to aggregate your measures, you will need to fix the level of detail calculation to that dimension and then average the results, like below.

##### Upper Limit:

##### Lower Limit:

## Row Level Data

If you are using row-level data you will still need to use a level of detail calculation, but you won’t need to fix it to a dimension, like below.

##### Upper Limit:

##### Lower Limit:

Do this for both measures on the scatter plot so you can calculate whether a point is an outlier for either measure.
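To see why the two approaches give different limits, here is a rough Python sketch of the same idea on a tiny, invented dataset. The customer names, the values, the helper `limits` and the choice of summing each customer's rows are all assumptions made for illustration; in Tableau the equivalent work is done by the level of detail calculations described above.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical rows of (customer, sales); names and values are illustrative only
rows = [
    ("Ann", 100.0), ("Ann", 50.0),
    ("Bob", 80.0),
    ("Cy", 40.0), ("Cy", 30.0), ("Cy", 20.0),
]


def limits(values, k=3):
    """Lower and upper limits at k sample standard deviations from the mean."""
    centre, spread = mean(values), stdev(values)
    return centre - k * spread, centre + k * spread


# Aggregated to a dimension: total each customer first (summing is an assumption),
# then take the mean and standard deviation of those totals, one value per mark.
totals = defaultdict(float)
for customer, amount in rows:
    totals[customer] += amount
aggregated_limits = limits(list(totals.values()))

# Row-level data: every row is its own mark, so use the raw values directly.
row_limits = limits([amount for _, amount in rows])

print("aggregated:", aggregated_limits)
print("row level: ", row_limits)
```

The limits are centred on different means because the aggregated version works with one total per customer while the row-level version works with every individual row, which is why the calculation has to match how the scatter plot is broken down.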

Next we need to compare the values of our data to these upper and lower limits. Again, this differs depending on whether you are aggregating up to a dimension or using row-level data.

## Aggregated to a Dimension

## Row Level Data

This will give you a true or false result: if any of these conditions is true, the point is an outlier!
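As a sketch of that comparison, the following Python flags a point when either measure falls outside its own limits. The function names and the Sales and Profit figures are invented for the example; it uses twenty points so that one extreme value can actually land beyond three standard deviations.

```python
from statistics import mean, stdev


def limits(values, k=3):
    """Lower and upper limits at k sample standard deviations from the mean."""
    centre, spread = mean(values), stdev(values)
    return centre - k * spread, centre + k * spread


def flag_outliers(sales, profit):
    """Return one boolean per point: True if Sales or Profit is outside its limits."""
    s_low, s_high = limits(sales)
    p_low, p_high = limits(profit)
    return [
        s < s_low or s > s_high or p < p_low or p > p_high
        for s, p in zip(sales, profit)
    ]


# Hypothetical data: nineteen ordinary points and one very large sale
sales = [100.0] * 19 + [1000.0]
profit = [10.0, 12.0] * 10
flags = flag_outliers(sales, profit)
print(flags.count(True), "outlier(s) found")
```

Because the four conditions are joined with `or`, a point only needs to break one limit on one measure to be flagged.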

## Setting up the User Filter

We’ve identified our outliers; now we need to give our users the option to filter the data. We are going to have three options:

• Show all values
• Exclude outliers
• Show only outliers

The first step is to set up a parameter with these three options using the settings below. I have used integers to optimise the calculations, but you can put the strings straight into the parameter if you prefer!

Then combine the parameter and the result of the outlier calculation in a new calculation to use as a filter. This calculation checks the data against different criteria, depending on which parameter option you choose. Drag this calculation to the filter shelf and set it to only show the ‘True’ results.
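The logic of that filter can be sketched in Python as follows. The integer codes 1, 2 and 3, the function `keep_point` and the sample points are assumptions for illustration; the parameter values you choose in Tableau may differ.

```python
# Assumed integer codes for the three parameter options
SHOW_ALL, EXCLUDE_OUTLIERS, ONLY_OUTLIERS = 1, 2, 3


def keep_point(option, is_outlier):
    """Return True if a point should stay visible for the chosen option."""
    if option == SHOW_ALL:
        return True
    if option == EXCLUDE_OUTLIERS:
        return not is_outlier
    if option == ONLY_OUTLIERS:
        return is_outlier
    raise ValueError(f"unknown option: {option}")


# Hypothetical points as (label, is_outlier)
points = [("A", False), ("B", True), ("C", False)]


def visible(option):
    """Labels of the points that pass the filter for the given option."""
    return [label for label, flag in points if keep_point(option, flag)]


for option in (SHOW_ALL, EXCLUDE_OUTLIERS, ONLY_OUTLIERS):
    print(option, visible(option))
```

Keeping only the points where `keep_point` is true mirrors setting the filter to show only the ‘True’ results.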

Finally, show the parameter so you can control it and flick between the three options to see how the visualisation changes!

###### Author

Katie Liddle
Senior Analytics Consultant & UK Team Lead
Biztory