This third part in a series on non-linearity looks at other tests and how they can be applied in laboratories that must meet
FDA regulations.
We continue here what our last column started (1): discussions of other ways to test data for non-linearity. We'll begin by
reviewing what we want to test. FDA/ICH guidelines, starting from a univariate perspective, consider the relationship between
the actual analyte concentration and what they generically call the "test result," a term that is independent of the technology
used to ascertain the analyte concentration. This term therefore holds good for every analytical methodology from manual wet
chemistry to the latest high-tech instrument. In the end, even the latest instrumental methods have to produce a number, representing
the final answer for that instrument's quantitative assessment of the concentration, which is the test result from that instrument.
This is a univariate concept to be sure, but it is the same concept that applies to all other analytical methods. Things might change
in the future, but currently this is the way analytical results are reported and evaluated.
The question to be answered, then, is this: for any given method of analysis, is the relationship between instrument readings
(test results) and the actual concentration linear?
Three tests of this characteristic were discussed in previous columns on this topic — the FDA/ICH recommendation of linear
regression with a report of various regression statistics, visual inspection of a plot of test results versus the actual concentrations,
and use of the Durbin-Watson statistic. Because we analyzed these tests previously we will not discuss them further here,
but a summary is provided in Table I, along with other tests for non-linearity that we explain and discuss in this column.
We now proceed to present various linearity tests that can be found in the statistical literature.
Table I. Various tests for (non)linearity that have been proposed and a summary of their characteristics.
F-Test
Figure 1 shows a schematic representation of the F-test for linearity. Note that there are some similarities to the Durbin-Watson test. The key difference between this test
and the Durbin-Watson test is that in order to use the F-test as a test for (non)linearity, you must have measured many repeat samples at each value of the analyte. The variabilities
of the readings for each sample are pooled, providing an estimate of the within-sample variance. This is indicated by the
label "Operative difference for denominator." By analysis of variance, we know that the total variation of residuals around
the calibration line is the sum of the within-sample variance ($S^2_{\mathrm{within}}$) and the variance of the means around the calibration line. Now, if the residuals truly are random and unbiased, and in particular
if the model is linear, then we know that the means for each sample will cluster randomly around the calibration line, and
that their variance will equal $S^2_{\mathrm{within}}/n$, where $n$ is the number of repeat readings per sample (indicated by the label "Operative difference for numerator"). The ratio of these two variances, with the variance of the means multiplied by $n$ so that both estimate the same quantity, will be distributed as the
F-distribution, with an expected value of unity. If there is non-linearity, such as is shown in Figure 1, then the variance
corresponding to the means will be inflated by the systematic offset of each sample, and the computed F-ratio will be statistically
significantly larger than unity.
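The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular guideline: it uses the classical lack-of-fit form of the F-test, fits an ordinary least-squares straight line, and the function name `linearity_f_ratio` and the example data are hypothetical. The numerator mean square weights each sample mean by its number of repeats, so that numerator and denominator estimate the same variance when the model is linear.

```python
from statistics import mean


def linearity_f_ratio(levels, readings):
    """Return (F, df_numerator, df_denominator) for a straight-line calibration.

    levels   -- the distinct analyte concentrations
    readings -- one list of repeat test results per concentration
    """
    if len(levels) != len(readings) or len(levels) < 3:
        raise ValueError("need at least three concentration levels")

    # Flatten the data so the line is fitted to every individual reading.
    xs = [x for x, reps in zip(levels, readings) for _ in reps]
    ys = [y for reps in readings for y in reps]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Denominator: pooled within-sample variance of the repeat readings.
    sample_means = [mean(reps) for reps in readings]
    ss_within = sum(
        (y - m) ** 2 for reps, m in zip(readings, sample_means) for y in reps
    )
    df_within = len(ys) - len(levels)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("repeat readings with nonzero scatter are required")

    # Numerator: scatter of the sample means around the fitted line,
    # weighted by the number of repeats at each concentration.
    ss_means = sum(
        len(reps) * (m - (intercept + slope * x)) ** 2
        for x, reps, m in zip(levels, readings, sample_means)
    )
    df_means = len(levels) - 2  # two parameters (slope, intercept) were fitted

    f_ratio = (ss_means / df_means) / (ss_within / df_within)
    return f_ratio, df_means, df_within


if __name__ == "__main__":
    # Hypothetical data: five concentrations, three repeat readings each.
    levels = [1, 2, 3, 4, 5]
    scatter = [-0.1, 0.0, 0.1]
    linear = [[2 * x + e for e in scatter] for x in levels]
    curved = [[2 * x + 0.2 * x * x + e for e in scatter] for x in levels]
    print("linear data:", linearity_f_ratio(levels, linear))
    print("curved data:", linearity_f_ratio(levels, curved))
```

The computed ratio is then compared with a tabulated critical value of the F-distribution for the returned degrees of freedom; for 3 and 10 degrees of freedom at the 5% level that value is about 3.71. A ratio well above the critical value indicates that the sample means depart from the straight line by more than the within-sample scatter can explain.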
This test thus shares several characteristics with the Durbin-Watson test. It is based on well-known and rigorously sound
statistics. It is amenable to computerized calculation and is therefore suitable for unattended use in an automated process
situation. It does not have the "fatal flaw" of the Durbin-Watson statistic.