In practice, the usual procedure is to remove the contribution of the first (or, in general, the current) principal component
from the data set. Equation 16 of reference (2) demonstrates how this is done. The idea is to compute the error term
from fitting the first (or current) principal component to the data set. The steps needed are these:
- Normalize the principal component just computed, by dividing each element by $\sqrt{\sum_{i=1}^{m} L_i^2}$, as described earlier.
- Compute the matrix product of the normalized principal component with the [X] matrix ([S] = [X][L]). This provides the set
of scores ([S]) that is the contribution of that principal component to each data spectrum.
- For each sample, multiply the normalized principal component by the score for that sample; that is the contribution of that
principal component to that spectrum.
- For each spectrum, subtract that contribution due to the first principal component, wavelength by wavelength, from the corresponding
spectrum.
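The four steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the function name `deflate` is hypothetical, [X] is assumed to hold one spectrum per row (samples by wavelengths), and "normalize" is taken to mean scaling the principal component to unit length.

```python
import numpy as np

def deflate(X, loading):
    """Remove one principal component's contribution from X.

    X       : (n_samples, n_wavelengths) array, one spectrum per row (assumed layout)
    loading : (n_wavelengths,) principal component, not necessarily normalized
    """
    # Step 1: normalize the component (assumption: unit Euclidean length).
    L = loading / np.sqrt(np.sum(loading ** 2))
    # Step 2: scores, [S] = [X][L]; one score per spectrum.
    S = X @ L
    # Step 3: contribution of the component to each spectrum (score times component).
    contribution = np.outer(S, L)
    # Step 4: subtract that contribution, wavelength by wavelength.
    residual = X - contribution
    return S, residual

# Usage with made-up numbers (not data from the column).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
loading = np.array([1.0, 2.0, 0.0, -1.0])
scores, E = deflate(X, loading)
# The residual spectra should have no remaining projection on the component.
print(np.allclose(E @ loading, 0.0))
```

The residual matrix returned here plays the role of the error terms that the next principal component is fitted to.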
The next principal component calculated, from the data set with that contribution removed, will be principal component 2: the
function that is the least-squares fitting function to the error terms remaining from the first principal component. The two
functions, together, will account for more variance from the original data set than any other two functions, of any sort, can.
Summary
We have now plowed our way through considerable mathematical territory, while deliberately taking two dead-end side trips.
In the process, we demonstrated the following:
- We showed algebraically that the maximum variance functions describing a set of data are the least-squares functions that approximate
that data set.
- We showed algebraically that the least-squares solution is not simply a collection of multiple univariate solutions, in which
the least-squares value is found separately at each individual wavelength (or other variable), but requires the full
variance–covariance matrix to properly describe the least-squares nature of the problem.
- We showed algebraically that even so, the least-squares formulation of the problem results in a trivial solution, unless somehow
you can impose a constraint on the solutions so that nontrivial solutions can be found.
- We showed algebraically that the method of Lagrange multipliers is a way to impose the necessary constraint.
- We showed that finding the least-squares solutions required the use of derivatives, because the extremum is found by setting
the derivative equal to zero. This result comes from elementary calculus, our only departure from a purely algebraic approach.
- We then showed algebraically how the Lagrange multiplier comes out of the equations as a polynomial (called the "characteristic
equation"); therefore, there are as many solutions to the problem as there are roots to the polynomial. These multiple roots
correspond to the multiple principal components we can calculate.
- To calculate the functions corresponding to the roots of the characteristic equation, which are the functions that are the
least-squares fit to the original data set, we recast the equations into the form of an eigenvalue equation. From this we
found that the eigenvalue corresponds to the root of the characteristic equation, and the eigenfunction is the function that
is actually the least-squares estimator of the original data set.
- Then, we subtract the contribution of the given principal component from the data set, and do the whole procedure all over
again, to find the next principal component. Iterate until you've approximated the data set to a sufficient degree of accuracy,
or run out of time, money, computer capacity, or patience.
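As a rough illustration of the last two points taken together, the loop below finds one component at a time from an eigenvalue equation and then subtracts its contribution before repeating. It is a sketch under assumptions, not the procedure as coded by anyone in particular: the function names are hypothetical, the data matrix is assumed to be mean-centered with one spectrum per row, and the eigenvalue problem is posed on $X^{T}X$ and solved with `numpy.linalg.eigh`.

```python
import numpy as np

def leading_component(X):
    """Return the largest eigenvalue of X^T X and its unit-length eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)  # eigh returns eigenvalues in ascending order
    return eigvals[-1], eigvecs[:, -1]

def principal_components(X, n_components):
    """Extract components one at a time, subtracting each from the data before the next."""
    residual = np.array(X, dtype=float)  # work on a copy of the data
    eigenvalues, loadings = [], []
    for _ in range(n_components):
        lam, L = leading_component(residual)
        scores = residual @ L                      # score of each spectrum on this component
        residual = residual - np.outer(scores, L)  # remove this component's contribution
        eigenvalues.append(lam)
        loadings.append(L)
    return np.array(eigenvalues), np.array(loadings), residual

# Usage with made-up numbers (not data from the column).
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 6))
X = X - X.mean(axis=0)  # assumption: mean-centered data
eigenvalues, loadings, residual = principal_components(X, 3)
# Successive components should come out as mutually orthogonal unit vectors.
print(np.allclose(loadings @ loadings.T, np.eye(3)))
```

In this sketch the stopping rule is simply a fixed number of components; a check on the size of the residual could be used instead.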
Thus ends the story of the long, complicated, tedious, and difficult route to principal components. That's why it took us
six of our columns to explicate it all. And THAT'S why principal components are always done with matrices! However, don't miss
the coda we will present in our next installment.
References
(1) H. Mark and J. Workman, Spectroscopy 22(9), 20–29 (2007).
(2) H. Mark and J. Workman, Spectroscopy 23(2), 30–37 (2008).
(3) H. Mark and J. Workman, Spectroscopy 23(5), 14–17 (2008).
(4) H. Mark and J. Workman, Spectroscopy 23(6), 22–24 (2008).
(5) H. Mark and J. Workman, Spectroscopy 23(10), 24–29 (2008).
(6) H. Mark and J. Workman, Spectroscopy 21(6), 34–36 (2006).
Howard Mark serves on the Editorial Advisory Board of Spectroscopy and runs a consulting service, Mark Electronics (Suffern, NY). He
can be reached via e-mail: hlmark@prodigy.net
Jerome Workman, Jr. serves on the Editorial Advisory Board of Spectroscopy and is currently with Luminous Medical, Inc., a company dedicated
to providing automated glucose management systems to empower health care professionals.