February 15, 2012 (Vol. 32, No. 4)
A Useful Strategy to Assist in Comparability Criteria Determination
A quality practitioner may need to assess whether two process means are statistically equivalent, e.g., whether a historical process and a new process produce equivalent results for a quality attribute. Statistical equivalency tests, such as the two one-sided t-tests (TOST) procedure, are widely accepted for demonstrating equivalency.
In contrast with traditional hypothesis testing, where the null hypothesis assumes equality of two parameters of interest (e.g., equal means), the null hypotheses in TOST state that the average difference is at least as large as a comparability criterion known as the EAC (equivalency acceptance criterion). They can be written as:
H01: µ1 – µ2 ≤ -EAC, and
H02: µ1 – µ2 ≥ EAC,
where µ1 represents the pre-change mean, and µ2 represents the post-change mean.
To show average equivalency, a 90% two-sided confidence interval for the difference between the two means must fall completely within the range from –EAC to EAC. This corresponds to testing each of the two one-sided hypotheses at the 5% significance level.
In some instances this region may be mandated, e.g., the 80% to 125% limits required in bioequivalence testing. In most cases, the EAC is developed with subject matter experts. This article describes a graphical approach, based on the work of Burdick et al. (Quality and Reliability Engineering International, 2011), for demonstrating the appropriateness of an EAC before post-change data are collected.
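The confidence-interval form of the test can be sketched in a few lines of Python. This is a minimal illustration rather than a validated implementation: the function name tost_equivalent and the sample values are hypothetical, the pooled standard deviation assumes equal but unknown variances, and the t critical value (the upper 5% point with n1 + n2 – 2 degrees of freedom) is passed in by the caller rather than computed.

```python
from math import sqrt
from statistics import mean, variance


def tost_equivalent(pre, post, eac, t_crit):
    """Return (lower, upper, equivalent) for the 90% CI on mean(pre) - mean(post).

    t_crit is the upper 5% point of a t distribution with
    len(pre) + len(post) - 2 degrees of freedom, supplied by the caller
    (for example from a t table or scipy.stats.t.ppf(0.95, df)).
    """
    n1, n2 = len(pre), len(post)
    # Pooled variance, assuming equal but unknown variances (an assumption).
    pooled_var = ((n1 - 1) * variance(pre) + (n2 - 1) * variance(post)) / (n1 + n2 - 2)
    margin = t_crit * sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    diff = mean(pre) - mean(post)
    lower, upper = diff - margin, diff + margin
    # Equivalency is concluded only if the whole interval lies inside (-EAC, EAC).
    return lower, upper, (-eac < lower and upper < eac)


# Hypothetical data: four pre-change and four post-change results.
pre_change = [10, 12, 11, 13]
post_change = [11, 12, 10, 13]
lower, upper, ok = tost_equivalent(pre_change, post_change, eac=2, t_crit=1.943)
print(f"90% CI: ({lower:.2f}, {upper:.2f}); equivalent: {ok}")
```

Passing the critical value in keeps the sketch free of third-party dependencies; in practice it would come from a statistics package or a t table.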
Assume we have nH values from a historical process and that an EAC has been determined. The true mean (µH) and standard deviation (σH) are estimated using the sample mean (x̄H) and sample standard deviation (sH).
A sample of size nN will be used to calculate the sample mean (x̄N) and sample standard deviation (sN) from the new process once it has been brought online.
Rewriting equation (3) from Burdick et al., we will be able to conclude average equivalency using the TOST procedure if the sample mean calculated with a sample of size nN (x̄N) meets the following conditions:
x̄H – EAC + ME < x̄N < x̄H + EAC – ME,
where ME is the margin of error. Assuming equal but unknown variances, and using sH as the estimate of the common standard deviation until post-change data are available, the ME can be calculated as:
ME = t x sH x √(1/nH + 1/nN),
where t is the upper α/2 value (α = 0.10 for a 90% two-sided interval) from a t distribution with nH + nN – 2 degrees of freedom. The upper/lower equivalence determination limits (UEDL/LEDL), which bound the region where average equivalency would be concluded, are:
UEDL = x̄H + EAC – t x sH x √(1/nH + 1/nN)
LEDL = x̄H – EAC + t x sH x √(1/nH + 1/nN)
For example, assume there are nH = 20 historical results, there will be nN = 20 values from the new process, and the EAC = 8 units. If the sample mean and standard deviation from the nH values are 44.7 and 3.4 units, respectively, then with t = 1.686 (the upper 0.05 value from a t distribution with 38 degrees of freedom) we may calculate the highest and lowest mean values for the new process that would result in a conclusion of equivalent means as:
UEDL = 44.7 + 8 – (1.686) x (3.4) x √(1/10) = 50.9
LEDL = 44.7 – 8 + (1.686) x (3.4) x √(1/10) = 38.5
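The arithmetic above can be reproduced with a short Python sketch. The function name equivalence_limits and its parameter names are illustrative only, and the t value of 1.686 is taken from the worked example rather than computed.

```python
from math import sqrt


def equivalence_limits(mean_h, sd_h, n_h, n_n, eac, t_crit):
    """Return (LEDL, UEDL) for the post-change sample mean.

    The historical standard deviation sd_h is used in the margin of error,
    as in the worked example; t_crit is supplied by the caller.
    """
    margin = t_crit * sd_h * sqrt(1 / n_h + 1 / n_n)
    return mean_h - eac + margin, mean_h + eac - margin


ledl, uedl = equivalence_limits(mean_h=44.7, sd_h=3.4, n_h=20, n_n=20, eac=8, t_crit=1.686)
print(f"LEDL = {ledl:.1f}, UEDL = {uedl:.1f}")  # LEDL = 38.5, UEDL = 50.9
```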
These results can be shown alongside the historical values, as illustrated in Figure 1. This allows the subject matter expert (SME) to compare the region where the future mean must fall to conclude equivalency with the historical values and their associated variation. Once the results from the new process are obtained, this graph may be extended to show the nN results, and equivalency can be assessed graphically as discussed in Burdick et al.
Using the UEDL and LEDL values, the utility of the chosen EAC can be objectively assessed before data from the new process are collected. This methodology may be of particular use in the planning stage for evaluating EAC values when there are no pre-determined criteria available.
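As one way to carry out this planning-stage evaluation, the sketch below tabulates the region for several candidate EAC values using the historical summary statistics from the example. The candidate values and the function name region_for_eac are hypothetical. The sketch also flags the case in which the EAC does not exceed the margin of error, where LEDL would be at or above UEDL and no post-change mean could demonstrate equivalency.

```python
from math import sqrt


def region_for_eac(mean_h, sd_h, n_h, n_n, eac, t_crit):
    """Return (LEDL, UEDL), or None when the EAC is too tight for any mean to pass."""
    margin = t_crit * sd_h * sqrt(1 / n_h + 1 / n_n)
    if eac <= margin:
        # The limits would cross: no sample mean can satisfy both conditions.
        return None
    return mean_h - eac + margin, mean_h + eac - margin


# Hypothetical candidate EAC values, evaluated against the historical data
# from the example (mean 44.7, standard deviation 3.4, nH = nN = 20).
for eac in (1, 4, 8):
    region = region_for_eac(44.7, 3.4, 20, 20, eac, t_crit=1.686)
    if region is None:
        print(f"EAC = {eac}: no post-change mean can demonstrate equivalency")
    else:
        print(f"EAC = {eac}: new mean must fall within {region[0]:.1f} to {region[1]:.1f}")
```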
Keith M. Bower ([email protected]) is a principal quality engineer in global quality engineering at Amgen. Web: www.amgen.com.