Kiser and Dolan begin their practical and useful review "Selecting the Best Curve" (1) with a quote from FDA guidelines (2)
that bears repeating here: "Standard curve fitting is determined by applying the simplest model that adequately describes
the concentration–response relationship using appropriate weighting and statistical tests for goodness of fit." As the last
installment of "Mass Spectrometry Forum" stated (3): "In a quantitative analysis using values measured with mass spectrometry
[MS] — usually values associated with ion intensities — a nonlinear regression (weighted or nonweighted) might model the proportionality
between measured instrument response and sample amount more accurately." And that brings up the following simple and direct
question: How does the analyst know what the "best" fit, and the most accurate model, actually is? Analytical data are not
always best described by the model of a simple straight line, and "straight" is a limiting term. The following discussion
about analyzing data with impartiality assumes that replicate data have been recorded across the recommended range (2) of concentrations
and are available for statistical analysis.
Even when a straight-line calibration (measured response versus sample concentration) appears satisfactory to the eye, and
even when the correlation coefficient for that straight line closely approaches the ideal value of 1, an F-test and a
residuals analysis should be used to assess the quality of the data, and to uncover properties not apparent in the straight-line
plot. Both can be completed easily using standard tools in software packages; the residuals analysis especially provides visual
plots that alert the analyst to hidden properties in the data set. The F-test compares the variances calculated for two different
data sets. These data sets commonly are chosen to be replicate measurements of instrument response taken at the high and
low sample-concentration ends of a putative calibration range. Given the confidence level specified for the regression,
the F-test indicates whether the ratio of the variances (the squares of the standard deviations) falls within an "allowable" range. An F-test
value falls within the allowable range when the data are homoscedastic, that is, when the standard deviations of the data sets
measured at the different sample concentrations are statistically indistinguishable. In such a situation, use of a weighted regression would not be
appropriate, and would not be supported within the analytical guidelines (2). A simple unweighted straight-line model will then describe the
data adequately. Commonly, however, especially across a broader range of sample concentrations, the standard deviations will
vary with sample concentration, with larger absolute standard deviations seen at higher sample concentrations. In an unweighted
fit, these larger standard deviations dominate the regression line, at the expense of accuracy at low sample concentrations. An F-test value outside the allowable range
indicates that the data sets are heteroscedastic, and that a weighted regression is appropriate.
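
The F-test comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than a validated procedure: the replicate values are invented, the function names are hypothetical, and the critical value is assumed to have been read from a standard F table (here the upper 2.5% point for 4 and 4 degrees of freedom, corresponding to a two-sided test at the 95% confidence level).

```python
from statistics import variance


def f_ratio(reps_a, reps_b):
    """Return (F, df_numerator, df_denominator), larger variance on top."""
    var_a, var_b = variance(reps_a), variance(reps_b)  # sample variances (n - 1)
    if min(var_a, var_b) == 0:
        raise ValueError("a replicate set has zero variance; F is undefined")
    if var_a >= var_b:
        return var_a / var_b, len(reps_a) - 1, len(reps_b) - 1
    return var_b / var_a, len(reps_b) - 1, len(reps_a) - 1


def is_homoscedastic(reps_a, reps_b, f_critical):
    """True when the variance ratio does not exceed the tabulated critical F."""
    f_value, _, _ = f_ratio(reps_a, reps_b)
    return f_value <= f_critical


# Hypothetical replicate responses (arbitrary units), five per level.
low_level = [98.0, 101.0, 100.0, 102.0, 99.0]
high_level = [9800.0, 10100.0, 10000.0, 10200.0, 9900.0]

# Assumed critical value: upper 2.5% point of F(4, 4) from a standard table.
F_CRITICAL_4_4 = 9.60

f_value, df_num, df_den = f_ratio(low_level, high_level)
print(f"F = {f_value:.1f} with ({df_num}, {df_den}) degrees of freedom")
print("homoscedastic" if is_homoscedastic(low_level, high_level, F_CRITICAL_4_4)
      else "heteroscedastic: consider a weighted regression")
```

With these invented numbers the sample variances are 2.5 and 25,000, so the ratio is 10,000, far beyond the tabulated limit, even though the relative standard deviation is the same at both levels. A different number of replicates or a different confidence level calls for a different critical value from the table.
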
For such heteroscedastic data, the relative standard deviation might be more or less constant across the concentration range
of interest. When a simple plot of the measured values at each sample concentration of interest is created (as might presage
the construction of a linear calibration plot), the heteroscedastic nature of the data might be obscured. Variations in the
data (especially at lower sample concentrations) might be smaller than the size of the symbol used to mark the values on the
graphical plot. In a residuals analysis, a value such as the percentage deviation from the mean of the replicate measurements
at each sample concentration level (y axis) is plotted against sample concentration (x axis), and a wide disparity might then
become evident. Statistical tables exist (2) that describe a reasonable and expected
variation as a function of sample concentration, as described in previous columns in this series (4). Values outside this
range indicate either overt error or, in the present context, the heteroscedastic nature of the data and the appropriateness of
a weighted regression.
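
The quantities plotted in such a residuals analysis can be computed as in the following sketch. The data set and function names are hypothetical, and the output is left as plain (x, y) pairs so that any plotting tool can be used. The sketch does not include the comparison against the tabulated expected variation.

```python
from statistics import mean, stdev


def percent_deviations(replicates_by_conc):
    """Return (concentration, % deviation from that level's mean) plot points."""
    points = []
    for conc in sorted(replicates_by_conc):
        reps = replicates_by_conc[conc]
        level_mean = mean(reps)
        points.extend((conc, 100.0 * (r - level_mean) / level_mean) for r in reps)
    return points


def spread_by_level(replicates_by_conc):
    """Return (concentration, absolute SD, relative SD in %) for each level."""
    rows = []
    for conc in sorted(replicates_by_conc):
        reps = replicates_by_conc[conc]
        sd = stdev(reps)
        rows.append((conc, sd, 100.0 * sd / mean(reps)))
    return rows


# Hypothetical calibration replicates: concentration -> measured responses.
calibration = {
    1.0: [98.0, 101.0, 100.0, 102.0, 99.0],
    10.0: [980.0, 1010.0, 1000.0, 1020.0, 990.0],
    100.0: [9800.0, 10100.0, 10000.0, 10200.0, 9900.0],
}

for conc, sd, rsd in spread_by_level(calibration):
    print(f"conc {conc:6.1f}  SD {sd:8.2f}  RSD {rsd:5.2f}%")

# x and y values ready for plotting (x: concentration, y: % deviation).
xs, ys = zip(*percent_deviations(calibration))
```

In this invented example the absolute standard deviation grows a hundredfold between the lowest and highest levels while the relative standard deviation stays near 1.6%. That is the pattern described above for heteroscedastic data with a roughly constant relative standard deviation.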