I was introduced to control charts about 15 years ago while working in the NHS, looking for variation in a process. In the NHS it was about flow and patient volumes; however, control charts came out of industry and there is more than one type. While working for a large confectionery business I was asked to make a P control chart - commence frantic googling of P control charts and the underlying maths.
In short, a P control chart is very useful when looking for variation in a rate over time where the population varies from period to period and directly affects the rate you would expect to see - putting my NHS hat back on, I instantly thought about A&E waiting time breaches. It is important to understand that P charts are best suited to binomial data (i.e. there are only two possible outcomes, pass or fail).
The maths behind the chart: you use an average and a form of the binomial distribution to calculate the control limits, weighting each period's control limits for population. In a stats textbook it looks like this:
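The usual textbook statement of the P chart limits is given here for reference. The notation is an assumption rather than the post's own: $x_i$ is the number of events and $n_i$ the population in period $i$, and $k$ is the number of standard deviations (conventionally 3).

$$
\bar{p} = \frac{\sum_i x_i}{\sum_i n_i}, \qquad
\mathrm{UCL}_i = \bar{p} + k\sqrt{\frac{\bar{p}\,(1-\bar{p})}{n_i}}, \qquad
\mathrm{LCL}_i = \max\!\left(0,\; \bar{p} - k\sqrt{\frac{\bar{p}\,(1-\bar{p})}{n_i}}\right)
$$

Because $n_i$ sits in the denominator, periods with a larger population get tighter limits, which is why the control lines on a P chart step up and down from period to period.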
This means that for a P control chart you need one extra piece of information over a normal control chart: population. Normally I could simply plot the frequency of an event occurring over time, put control limits in place and voilà, done. For the P chart we are adjusting the control limits for each time bucket by the total number of times the event could have happened.
For this example I am using a CSV downloadable here - P chart example data
I simply have [Period], [Population] and [Error] as fields in this data set. To create the P control chart we:
Create the base 'Rate' calculation; in this example it is [Error Rate]. This will be used to draw the line we want to compare the control limits against.
Add in the upper and lower control limits. To make the calculation clearer I have used a parameter to set the number of population-weighted standard deviations from the mean.
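Outside Tableau, the same two calculations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculated fields used in the workbook: the function name `p_chart_limits`, the `sigmas` argument (standing in for the parameter) and the sample numbers are all made up, the average is assumed to be pooled (total errors over total population), and the limits are clamped to the 0 to 1 range.

```python
import math

def p_chart_limits(errors, populations, sigmas=3.0):
    """Return (p_bar, rows); each row is (rate, lower_limit, upper_limit) for one period.

    Assumption: the centre line is the pooled rate, total errors / total population.
    """
    p_bar = sum(errors) / sum(populations)
    rows = []
    for x, n in zip(errors, populations):
        rate = x / n  # the [Error Rate] line for this period
        # Binomial standard error, weighted by this period's population
        se = math.sqrt(p_bar * (1 - p_bar) / n)
        upper = min(1.0, p_bar + sigmas * se)
        lower = max(0.0, p_bar - sigmas * se)  # a rate cannot go below zero
        rows.append((rate, lower, upper))
    return p_bar, rows

# Made-up illustrative data, NOT the sample CSV from this post
errors = [10, 20, 10]
populations = [100, 400, 100]
p_bar, rows = p_chart_limits(errors, populations)

for period, (rate, lower, upper) in enumerate(rows, start=1):
    print(f"period {period}: rate={rate:.3f} LCL={lower:.3f} UCL={upper:.3f}")
```

The second period has four times the population of the others, so its limits come out narrower, which is exactly the per-period adjustment the P chart is designed to make.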
To create the chart we need to add [Error Rate] to rows and the continuous month of [Period] to columns, displayed in a discrete way (I really should do a blog post on this). When you right-click on the [Period] field on the columns shelf, the options need to look like this.
Next we need to add each of the control line and average fields to the detail part of the marks card.
As in my first blog post about Control Charts, we now need to add the reference lines to the chart, the only difference being that the upper and lower control limits should be calculated per cell instead of per table.
If you read my second blog post about Pimping your Control Charts, you'll know I believe in making it very easy for someone to see if a point is an outlier, so we create a calculated field to identify them.
Again, as described in my second blog post in this series, we now create a dual axis chart using the [Error Rate] field and change its visual type to circle. We then use the [Outlier] calculated field on both colour and size.
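The logic behind an outlier flag like [Outlier] is simply "is the rate outside this period's limits?". A minimal Python sketch of that test follows; the function name `flag_outliers` and the values are hypothetical, and the actual calculated field in the workbook may be written differently.

```python
def flag_outliers(rates, lower_limits, upper_limits):
    """Return True for each period whose rate falls outside its own control limits."""
    return [
        rate < lower or rate > upper
        for rate, lower, upper in zip(rates, lower_limits, upper_limits)
    ]

# Made-up values for three periods, each with its own limits
rates = [0.10, 0.05, 0.20]
lower_limits = [0.00, 0.03, 0.00]
upper_limits = [0.14, 0.10, 0.14]

flags = flag_outliers(rates, lower_limits, upper_limits)
print(flags)
```

Note that each point is compared with the limits for its own period, not a single pair of limits for the whole chart.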
If you have followed these instructions using the supplied sample data, you should have a chart very similar to the one below (I played around with size ranges and colour choices to give this chart).
If you have found this blog post useful, please let me know. Also, if you want any help taking on any other complex maths in Tableau, please do contact us.