## Featured Stories

Filter By Categories

In this series about Control Charts in Tableau, we have learnt the basics about creating them, understood the extra calculations to identify a single outlier and parameterise the control limit calculations and learnt about an alternative type of control chart (P-Chart).

In this blog post we are going a step further than identifying a single outlier, we will implement a series of calculations that identify signals in the data.

Control charts come from a discipline called Statistical Process Control and Six Sigma. Six sigma standing for the  3 Standard Deviations below and above the mean that the rule set uses, the rules are in place to identify a number of different 'signals' in time series data.

The original rules used for signal detection were put in place by Western Electric in 1956, to ensure their employees interpreted control charts in a uniform way. In 1984 these rules were updated by Lloyd S Nelson

The calculations involved for the most part require the nesting of logic statements within Table Calculations (that are themselves nested), so what I will do is start from the logic and move outwards adding each layer.

If we take Nelson Rule 2 - '9 or more points consecutively above or below the mean.'

1. The Basic logic calculation is; 'Is the point being evaluated above the mean'

`IIF(SUM([Value]) > WINDOW_AVG(SUM([Value])), 1, 0 )`

From now on i'll refer to the above statement as [Logic1], as the scenario requires us to test for 9 above or 9 below lets assume I have also created a second statement to test the 9 below and call it [Logic2].

2. Next we need to check that at least 9 points are consecutively true;

`WINDOW_SUM( [Logic1] ) , -9+1, 0) = 9`

In Table Calculations term we are taking a range of 8 points before the current one and the current one (making 9), this is the '-9+1,0)' element of the formula, I have left it as -9+1 for my own benefit so i am clear when reusing the calculation that the range goes where the 9 is. Adding together the output of the logic statement that gives us a 1 or 0. We want this to always be true therefore the sum must equal 9.

3. If we need to apply multiple logic statements, as in this case we can wrap it into a larger IIF function;

`IIF( index()>9 AND(WINDOW_SUM( [Logic1] ), -9+1, 0) = 9OR WINDOW_SUM( [Logic2] ), -9+1, 0) = 9), 1, 0 )`

In this case we have [Logic1] where 9 consecutive points are above the mean or [Logic 2] where 9 consecutive points are below the mean, both wrapped in the window_sum looking back 9 periods. In addition i have added index()>9 so that the output is always 1 or 0, as normally if a table calculation can't complete due not enough data points you get a null, so in this case the statement can't be properly completed until at least the 9th value exists. This statement will give you a 1 on the last point in the series when the logic statements are all true.

Now we could stop here and simply identify the last point in a sequence that meets a criteria - but why not add in the ability to highlight all of them?

4. To do this we wrap the above calculations in another table calculation.

`WINDOW_MAX( [Stacked Logic] , 0, 9-1 )`

By wrapping the stacked logic in a window_max and again with a defined range for that window max to be applied ',0,9-1)' - read as this value to 9-1 values ahead, we are able to duplicate the 1 from the last point in the range back to all the members of the range. (The range in this case is 9 points).

So that is it in this example calculation, below are the calculations written out in full for each of the Nelson Rules for process control charts, using the above structural information hopefully you'll be able to deconstruct the formulae.

If not and you just want them to work, you can copy and paste them replacing [Value] with the measure you are interested in, and assuming your charts are horizontal timelines set your table calculations to 'Table Across' - this should be the default setting.

 Rule Purpose Calculation Nelson Rule 1 (Western Electric Rule 1) One point is more than 3 standard deviations from the Mean. One period is grossly out of control. `INT(SUM([Value]) > (WINDOW_AVG(SUM([Value]))+3*WINDOW_STDEV(SUM([Value])))OR SUM([Value]) < (WINDOW_AVG(SUM([Value]))-3*WINDOW_STDEV(SUM([Value])))` Nelson Rule 2 (Western Electric Rule 4) Nine (or more) points in a row are on the same side of the mean. Some prolonged bias exists. `WINDOW_MAX( IIF( index()>9 AND(WINDOW_SUM( IIF(SUM([Value]) > WINDOW_AVG(SUM([Value])), 1, 0 ) , -9+1, 0) = 9OR WINDOW_SUM( IIF(SUM([Value]) < WINDOW_AVG(SUM([Value])), 1, 0 ) , -9+1, 0) = 9) , 1, 0 ), 0, 9-1 )` Nelson Rule 3 Six (or more) points in a row are continually increasing (or decreasing). A trend exists. `WINDOW_MAX( IIF( INDEX()>6 and (WINDOW_SUM( IIF(SUM([Value]) < LOOKUP( SUM([Value]), -1 ) , 1, 0 ) , -6+1, 0) = 6OR WINDOW_SUM( IIF(SUM([Value]) > LOOKUP( SUM([Value]), -1 ) , 1, 0 ) , -6+1, 0) = 6) , 1, 0 ), 0, 6 )` Nelson Rule 4 Fourteen (or more) points in a row alternate in direction, increasing then decreasing. This much oscillation is beyond noise.Note that the rule is concerned with directionality only. The position of the mean and the size of the standard deviation have no bearing. `WINDOW_MAX(IIF( INDEX()>14 AND WINDOW_SUM(IIF( (SUM([Value]) < LOOKUP( SUM([Value]), -1 ) AND LOOKUP( SUM([Value]), -1 )> LOOKUP( SUM([Value]), -2 ))OR (SUM([Value]) > LOOKUP( SUM([Value]), -1 ) AND LOOKUP( SUM([Value]), -1 )< LOOKUP( SUM([Value]), -2 )) , 1, 0 ) , -14+1, 0) = 14 , 1, 0 ), 0, 14 )` Nelson Rule 5 (Western Electric Rule 2) Two (or three) out of three points in a row are more than 2 standard deviations from the mean in the same direction. There is a medium tendency for samples to be partially out of control. `WINDOW_MAX(IIF( INDEX()>3 AND(WINDOW_SUM(IIF( SUM([Value]) > (WINDOW_AVG(SUM([Value]))+2*WINDOW_STDEV(SUM([Value]))), 1, 0 ) , -3+1, 0) >= 2OR WINDOW_SUM(IIF( SUM([Value]) < (WINDOW_AVG(SUM([Value]))-2*WINDOW_STDEV(SUM([Value]))), 1, 0 ) , -3+1, 0) >= 2) , 1, 0 ), 0, 3-1 )// mutiplying by 2 if this point is above boundary* IIF( SUM([Value]) > (WINDOW_AVG(SUM([Value]))+2*WINDOW_STDEV(SUM([Value]))) or SUM([Value]) < (WINDOW_AVG(SUM([Value]))-2*WINDOW_STDEV(SUM([Value]))),2,1)` Nelson Rule 6 (Western Electric Rule 3) Four (or five) out of five points in a row are more than 1 standard deviation from the mean in the same direction. There is a strong tendency for samples to be slightly out of control. `WINDOW_MAX( IIF( INDEX()>5 AND(WINDOW_SUM(IIF( SUM([Value]) >(WINDOW_AVG(SUM([Value]))+WINDOW_STDEV(SUM([Value]))), 1, 0 ) , -5+1, 0) >= 4OR WINDOW_SUM(IIF(SUM([Value]) < (WINDOW_AVG(SUM([Value]))-WINDOW_STDEV(SUM([Value]))), 1, 0 ), -5+1, 0) >= 4), 1, 0 ), 0, 5-1 )// mutiplying by 2 if this point is above boundary* IIF(SUM([Value]) > (WINDOW_AVG(SUM([Value]))+WINDOW_STDEV(SUM([Value]))) OR SUM([Value]) < (WINDOW_AVG(SUM([Value]))-WINDOW_STDEV(SUM([Value]))), 2, 1 )` Nelson Rule 7 Fifteen points in a row are all within 1 standard deviation of the mean on either side of the mean. With 1 standard deviation, greater variation would be expected. `WINDOW_MAX( IIF( INDEX()>15 AND WINDOW_SUM( IIF(SUM([Value]) < (WINDOW_AVG(SUM([Value]))+WINDOW_STDEVP(SUM([Value]))) AND SUM([Value]) > (WINDOW_AVG(SUM([Value]))-WINDOW_STDEV(SUM([Value]))), 1, 0 ) , -15+1, 0) = 15 , 1, 0 ), 0, 15-1 )` Nelson Rule 8 Eight points in a row exist, but none within 1 standard deviation of the mean, and the points are in both directions from the mean. Jumping from above to below whilst missing the first standard deviation band is rarely random. `WINDOW_MAX(IIF( INDEX()>8 AND WINDOW_SUM(IIF(SUM([Value]) >(WINDOW_AVG(SUM([Value]))+WINDOW_STDEV(SUM([Value]))) OR SUM([Value]) <(WINDOW_AVG(SUM([Value]))-WINDOW_STDEV(SUM([Value]))), 1, 0) , -8+1, 0) = 8 AND WINDOW_SUM(IIF(SUM([Value]) > (WINDOW_AVG(SUM([Value]))+WINDOW_STDEV(SUM([Value]))), 1, 0 ) , -8+1, 0) >= 1ANDWINDOW_SUM(IIF(SUM([Value]) < (WINDOW_AVG(SUM([Value]))-WINDOW_STDEV(SUM([Value]))), 1, 0 ) , -8+1, 0) >= 1 , 1, 0 ), 0, 8-1 )`

The next thing to do to use these is to combine them into a single calculation that you can put on colour size etc.

 `IF[Nelson1 WE1 - Outside 3 STDEV]=1OR [Nelson2 WE4 - 9 Above or below Mean]=1OR [Nelson3 - 6 Consecutive up or down] =1OR [Nelson4 - 14 Consecutive Alternate]=1OR [Nelson5 WE2 2 out of 3 outside 2sd]=2OR [Nelson6 WE3 4 out of 5 Outside 1sd]=2OR [Nelson7- 15 Inside 1 SD]=1OR [Nelson8- 8 Outside 1SD]=1THEN 'Alert'ELSEIF [Nelson5 WE2 2 out of 3 outside 2sd]=1OR [Nelson6 WE3 4 out of 5 Outside 1sd]=1THEN 'Member of Alert Group'END`

By this point in the series you should be pretty comfortable with building Control Charts, so I wont cover that again, but if you use the above calculation on size and colour, using the sample data set  (See download link below) you should end up with something like this;

One thing I really struggled with in writing this blog post was finding a data set I could use to demonstrate each rule, where a single rule is broken but the others are not, I couldn't find one so created one which I am more than happy to share. - WE_Nelson_rules_Data.

To see this and the other charts described in this blog series you can download the workbook - SPC blog.

So in the immortal words of Forest Gump - 'That is all I have to say about that.'

I hope this blog series has been informative/useful to some, I personally have learnt about the various different Control Chart types and found the time to work out the Table Calculations in this blog - so it has been a useful experience personally, if you have any requests for blog subjects please add them in the comments.