In part I of this column series, we worked out the relationship between the calculus-based and matrix algebra approaches
to least-squares calculations, using a chemometrics-based approach (1). Now we need to discuss
a topic squarely based in the science of statistics.
The topic is analysis of variance (ANOVA), which we have discussed several times before.
Put into words, ANOVA shows that when several different sources of variation act on a datum, the total variance of
the datum equals the sum of the variances introduced by each individual source. We first introduced the
underlying mathematics in (2), then discussed its relationship to precision and accuracy (3), its connection to statistical
design of experiments (4–6), and its relation to calibration results (7,8).
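In symbols, the verbal statement above can be sketched as follows. This assumes the k sources of variation act independently of one another; the labels s_1^2 through s_k^2 for the variance contributed by each source are hypothetical notation introduced here, not taken from the earlier columns.

```latex
\[
  s_{\mathrm{total}}^{2} = s_{1}^{2} + s_{2}^{2} + \cdots + s_{k}^{2}
\]
```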
All of those discussions, however, were based upon considerations of the effects of multiple sources of variability on only
a single variable. To compare statistics with chemometrics, we need to enter the multivariate domain, and so we ask the question:
"Can ANOVA be calculated on multivariate data?" The answer to this question, as our long-time readers will undoubtedly guess,
is "Of course, otherwise we wouldn't have brought it up!"
Multivariate ANOVA
Therefore, we come to the examination of ANOVA of data depending upon more than one variable. The basic operation of any ANOVA
is the partitioning of the sums of squares.
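For the familiar univariate, one-way case, that partitioning can be illustrated with a minimal sketch. The function name `partition_sums_of_squares` and the small data set are hypothetical and chosen only for illustration; the sketch splits the total sum of squares about the grand mean into a between-group part and a within-group part.

```python
def partition_sums_of_squares(groups):
    """Return (total, between-group, within-group) sums of squares."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Total sum of squares: every reading about the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)

    # Between-group: each group mean about the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )

    # Within-group: every reading about its own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    return ss_total, ss_between, ss_within


# Made-up readings in three groups (illustrative values only).
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
ss_total, ss_between, ss_within = partition_sums_of_squares(groups)
print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```

For these values the between-group and within-group parts (54 and 6) add up to the total (60), which is the partitioning referred to above.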
A multivariate ANOVA, however, has some properties that differ from those of the univariate ANOVA. To be multivariate, there
must be more than one variable involved. As we like to do, then, we consider the simplest possible case, and the simplest
case beyond univariate is to have two variables. The ANOVA for the simplest multivariate case (that is, the partitioning
of sums of squares of two random variables, X and Y) proceeds as follows. From the definition of variance:
expanding equation 1 and noting that
results in:
expanding still further:
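One way to write out the steps just described is sketched below. It assumes that the quantity being expanded is the variance of the sum X + Y, computed over n paired observations with an n - 1 denominator. The label (1) follows the text's reference to equation 1; the labels (2) and (3) and the index i are assumptions introduced here.

```latex
\begin{align}
\operatorname{Var}(X+Y)
  &= \frac{\sum_{i=1}^{n}\left[(X_i+Y_i)-\overline{(X+Y)}\right]^{2}}{n-1} \tag{1}\\
\intertext{expanding equation 1 and noting that}
\overline{(X+Y)} &= \bar{X}+\bar{Y} \notag\\
\intertext{results in:}
\operatorname{Var}(X+Y)
  &= \frac{\sum_{i=1}^{n}\left[(X_i-\bar{X})+(Y_i-\bar{Y})\right]^{2}}{n-1} \tag{2}\\
\intertext{expanding still further:}
\operatorname{Var}(X+Y)
  &= \frac{\sum_{i=1}^{n}(X_i-\bar{X})^{2}
         + 2\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})
         + \sum_{i=1}^{n}(Y_i-\bar{Y})^{2}}{n-1} \tag{3}
\end{align}
```

In this form, the first and last sums in the numerator are the sums of squares of X and Y taken separately, and the middle sum is the cross-product term between the two variables.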