New Test on the Block – Introducing the Tetrad Test

Sensory difference tests are probably the most widely used form of sensory methodology. We use difference tests to determine whether, with a specified level of confidence, respondents can tell a difference between two experimental samples. These might be a standard product and a recipe variant, a standard and a product made by a different process, an aged and a fresh product, a new pack and an existing pack, or even your imitation of a competitor’s product and the original. As such, the tests support quality assurance, new product development and market research programmes in the food, beverage, home and personal care, and textile industries.

Sometimes difference testing is combined with other techniques such as sensory profiling to determine the nature of any difference detected, or consumer acceptance testing to establish if the difference impacts on consumer choice.

Types of Difference Test

The most commonly used tests are the triangle, duo-trio and paired comparison. The triangle and duo-trio tests are examples of unspecified difference tests. In other words, the nature of the difference is not specified to the assessors. So in the triangle test, the assessor is presented with three samples, two of which are the same and the other different, and is asked to select the ‘odd’ or different sample. In a duo-trio test, the assessor receives a reference sample and two coded samples, one of which will be the same as the reference. She is asked to select which of the two coded samples is most different from the reference sample. Usually we force the assessor to guess if she cannot tell the difference. In a triangle test the probability of getting the test correct by guessing is 1/3, whereas in a duo-trio it is 1/2.

The paired comparison test is a specified difference test. The assessor is told which sensory attribute to focus on when making her choice. For example, in a paired comparison test, presented with two coded samples, the assessor may be asked to select the sweeter sample. The 3-AFC test is another example of a specified difference test: the assessor receives three samples, two of which are the same, and a third which has more of a specified attribute. The assessor’s task is to select the third sample, so the instruction would be ‘please select the sweeter sample’, as in the paired comparison test. The guessing probabilities for these tests are again 1/2 for the paired comparison and 1/3 for the 3-AFC.

Interpretation of Difference Test Results

Difference test results are traditionally analysed using binomial statistics. For a given panel size, we calculate the probability of obtaining at least the observed number of ‘correct’ responses if every assessor were simply guessing. If that probability is small (conventionally below 0.05), we conclude that a difference was detected between the samples; if it is large, we conclude that no difference was detected.
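As a rough illustration of the binomial calculation, here is a minimal Python sketch. The function name and the panel figures (30 assessors, 16 correct) are hypothetical and chosen only for the example; the guessing probability of 1/3 is the triangle test value given earlier.

```python
from math import comb


def binomial_p_value(n_assessors: int, n_correct: int, p_guess: float) -> float:
    """One-sided probability of seeing n_correct or more correct
    responses out of n_assessors if every assessor is guessing."""
    return sum(
        comb(n_assessors, k) * p_guess**k * (1 - p_guess) ** (n_assessors - k)
        for k in range(n_correct, n_assessors + 1)
    )


# Hypothetical triangle test: 30 assessors, 16 correct, guessing probability 1/3.
p_value = binomial_p_value(30, 16, 1 / 3)
print(f"p-value = {p_value:.4f}")
```

The same function covers the duo-trio and paired comparison tests if the guessing probability is changed to 1/2.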

What are we thinking?

Anyone who has used the 3-AFC test and the triangle test to compare the same two samples will know that assessors get more tests correct when doing 3-AFC than when doing triangle. This is interesting since the guessing probability in each case is 1/3. Assessors also get more 3-AFCs correct than duo-trios. So put simply, for a comparison of the same two samples, you are more likely to find a statistically significant difference with a 3-AFC test than with a triangle test.

The 3-AFC is, in other words, more powerful.

Thurstonian modelling provides the basis for explaining these differences and is described excellently in various papers and texts by Mike O’Mahony. From Thurstonian modelling comes the concept of d’: put simply, d’ is a measure of the size of the sensory difference between two products. If two products are identical, d’ is 0 (zero); at threshold d’ = 1.0, and its value increases as the difference between the products increases.

Computer simulation based on Thurstonian modelling has allowed curves to be drawn that compare the power of various test methods across different sizes of sensory difference, or d’. These curves show that the 3-AFC is more powerful at finding a given difference than a triangle or a duo-trio.
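To give a feel for this kind of simulation, here is a minimal Monte Carlo sketch in Python. It is not the simulation behind the published curves. It assumes the simplest Thurstonian set-up: each sample produces a percept drawn from a normal distribution with unit variance, the two products' means are d’ apart, and the assessor follows the decision rule noted in the comments for each test. The function name, trial count and seed are illustrative choices. The sketch estimates the proportion of correct responses, which is what drives power; the duo-trio starts from a guessing probability of 1/2, so its raw proportion is not directly comparable with the other three.

```python
import random


def simulate_proportion_correct(method, d_prime, trials=20000, seed=1):
    """Estimate the proportion of correct responses for one test method."""
    rng = random.Random(seed)

    def weaker():  # percept of the product with less of the attribute
        return rng.gauss(0.0, 1.0)

    def stronger():  # percept of the product with more of the attribute
        return rng.gauss(d_prime, 1.0)

    correct = 0
    for _ in range(trials):
        if method == "3-AFC":
            # Correct when the stronger sample gives the largest percept.
            w1, w2, s = weaker(), weaker(), stronger()
            correct += s > max(w1, w2)
        elif method == "triangle":
            # Correct when the two identical samples are the closest pair.
            w1, w2, s = weaker(), weaker(), stronger()
            correct += abs(w1 - w2) < min(abs(w1 - s), abs(w2 - s))
        elif method == "duo-trio":
            # Correct when the matching sample is closer to the reference.
            ref, same, diff = weaker(), weaker(), stronger()
            correct += abs(same - ref) < abs(diff - ref)
        elif method == "tetrad":
            # Correct when both samples of one product fall below
            # both samples of the other.
            w1, w2, s1, s2 = weaker(), weaker(), stronger(), stronger()
            correct += max(w1, w2) < min(s1, s2) or max(s1, s2) < min(w1, w2)
        else:
            raise ValueError(f"unknown method: {method}")
    return correct / trials


for name in ("3-AFC", "tetrad", "triangle", "duo-trio"):
    print(name, round(simulate_proportion_correct(name, d_prime=1.5), 3))
```

Setting d_prime to 0 should return values close to the guessing probabilities, which is a quick check that the decision rules are coded sensibly.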

Limitations of 3-AFC

So what does this mean? It means that we can use fewer people, that is, a smaller panel, for a 3-AFC test than for a triangle test and be just as confident in our result. The consequence is large savings in panel time and effort! However, the downside is that often we can’t specify in advance which sensory difference we are interested in…we don’t know what will change when we alter our recipe or process, or what consumers will notice…or maybe more than one thing has changed. What is needed is an unspecified difference test that is as powerful as the 3-AFC. Enter the tetrad test!

The Tetrad Test

The (unspecified) tetrad is a four-sample difference test. The assessor receives two samples of one product and two of the second. Her task is to sort them into two groups of two such that the samples in each group are more similar to each other than to the other samples. Like the triangle, the probability of getting the correct answer by guessing is 1/3. However, the power curves show how the tetrad compares to the triangle, duo-trio and 3-AFC. Whilst not as powerful as the 3-AFC, the tetrad is more powerful than the other unspecified difference tests. The reason for this is explained by Thurstonian modelling, but the significance for your business is a potential saving in panel time and money! Fewer respondents will be needed in a tetrad test to attain a statistically significant result at a given test power. For example, Ennis cites that for a test with a significance level of 0.05 (95% confidence) and a power of 90%, for which d’ is quite high (1.5), you would need 87 assessors for a duo-trio test and 78 for a triangle, but only 25 for a tetrad panel!
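The guessing probability of 1/3 can be checked by listing every way of sorting four samples into two pairs. The short Python sketch below does this; the sample labels are hypothetical, with the first letter standing for the product.

```python
from fractions import Fraction

# Hypothetical labels: two samples of product A and two of product B.
samples = ["A1", "A2", "B1", "B2"]

# A split into two pairs is fixed by the partner chosen for the first
# sample; the remaining two samples form the other pair.
first = samples[0]
groupings = []
for partner in samples[1:]:
    pair = (first, partner)
    rest = tuple(s for s in samples if s not in pair)
    groupings.append((pair, rest))

# A grouping is correct when each pair holds samples of one product only.
correct = [
    grouping
    for grouping in groupings
    if all(group[0][0] == group[1][0] for group in grouping)
]

p_guess = Fraction(len(correct), len(groupings))
print(groupings)
print("guessing probability =", p_guess)
```

There are only three possible groupings and one of them is correct, so an assessor sorting at random is right one time in three.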

There is a caveat, of course: for some sample types, for example very highly fragranced or flavoured personal care products or foods, the power advantage of the tetrad may be lost due to the extra fatigue introduced by tasting or smelling the fourth sample, or even by sheer memory overload. But it is worth making the comparison for your products…if we can help you with this, or you would just like to discuss further, then please don’t hesitate to get in touch.

Further Reading

Ennis, J. M. and Jesionka, V. (2011) The power of sensory discrimination methods revisited. Journal of Sensory Studies, 26, 371-382

O’Mahony, M. and Rousseau, B. (2003) Discrimination testing: a few ideas, old and new. Food Quality and Preference, 14, 157-164