The Long, Complicated, Tedious, and Difficult Route to Principal Components: Part VI - (Or, "When you're through reading this set you'll know why it's always done with matrices.") - Spectroscopy



Spectroscopy


In practice, the usual procedure is to remove the contribution of the first (or, in general, the current) principal component from the data set. Equation 16 of reference 2 shows how this is done. The idea is to compute the error term that remains after fitting the first (or current) principal component to the data set. The steps needed are these:

  • Normalize the principal component just computed by dividing each element by the square root of the sum of the squares of its elements, $\sqrt{\sum_{i=1}^{m} L_i^2}$, as described earlier.
  • Compute the matrix product of the normalized principal component with the [X] matrix ([S] = [X][L]). This provides the set of scores ([S]) that is the contribution of that principal component to each data spectrum.
  • For each sample, multiply the normalized principal component by the score for that sample; that is the contribution of that principal component to that spectrum.
  • For each spectrum, subtract that contribution due to the first principal component, wavelength by wavelength, from the corresponding spectrum.
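The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `deflate`, the three two-wavelength "spectra" in `X`, and the loading `L` are all hypothetical, and `L` is simply taken as given rather than computed from `X`.

```python
import math

def deflate(X, L):
    """Remove the contribution of the loading L from every row (spectrum) of X."""
    # Step 1: normalize the loading to unit length.
    norm = math.sqrt(sum(v * v for v in L))
    unit = [v / norm for v in L]
    # Step 2: one score per spectrum, i.e. [S] = [X][L].
    scores = [sum(x * u for x, u in zip(row, unit)) for row in X]
    # Steps 3 and 4: the contribution to each spectrum is score * loading;
    # subtract it wavelength by wavelength.
    residual = [[x - s * u for x, u in zip(row, unit)]
                for row, s in zip(X, scores)]
    return unit, scores, residual

# Hypothetical data: 3 spectra measured at 2 wavelengths.
X = [[3.0, 4.0], [6.0, 8.0], [1.0, 0.0]]
L = [3.0, 4.0]  # unnormalized loading, assumed already computed
unit, scores, residual = deflate(X, L)
for row in residual:
    print(row)
```

Every row of `residual` is orthogonal to the normalized loading, which is what "the contribution has been removed" means here.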

The principal component calculated from the resulting residuals will be principal component 2: the function that is the least-squares fit to the error terms remaining after the first principal component has been removed. Together, the two functions account for more of the variance in the original data set than any other pair of functions, of any sort, can.
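As a rough numerical check of this, the sketch below finds the first component of a small data set, removes it, and then finds the leading component of what is left. Note the assumptions: the components are found by power iteration, which is a stand-in for (not a reproduction of) the Lagrange-multiplier route taken in this series; the helper names and the mean-centered data in `X` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def leading_component(X, iterations=200):
    """Unit-length leading eigenvector of X^T X, found by power iteration."""
    n = len(X[0])
    v = [1.0] * n  # starting guess; assumed not orthogonal to the answer
    for _ in range(iterations):
        s = [dot(row, v) for row in X]                      # X v
        w = [sum(row[j] * s_i for row, s_i in zip(X, s))    # X^T (X v)
             for j in range(n)]
        norm = math.sqrt(dot(w, w))
        v = [a / norm for a in w]
    return v

def remove_component(X, unit):
    """Subtract score * loading from every spectrum."""
    return [[x - dot(row, unit) * u for x, u in zip(row, unit)] for row in X]

def captured(X, unit):
    """Sum of squared scores: the variation accounted for by one component."""
    return sum(dot(row, unit) ** 2 for row in X)

# Hypothetical mean-centered data: 4 spectra at 3 wavelengths.
X = [[ 2.0,  1.0, -1.0],
     [ 0.0,  1.0,  1.0],
     [-2.0, -1.0,  1.0],
     [ 0.0, -1.0, -1.0]]

pc1 = leading_component(X)
residual = remove_component(X, pc1)
pc2 = leading_component(residual)

total = sum(dot(row, row) for row in X)
print("PC1:", pc1, "captures", captured(X, pc1), "of", total)
print("PC2:", pc2, "captures", captured(X, pc2), "of", total)
print("PC1 . PC2 =", dot(pc1, pc2))
```

The second component comes out orthogonal to the first because every residual spectrum is already orthogonal to it. In this particular example the data have only two independent directions, so the two components together account for all of the variation.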

Summary

We have now plowed our way through considerable mathematical territory, while deliberately taking two dead-end side trips. In the process, we demonstrated the following:

  • We showed algebraically that the maximum-variance functions describing a set of data are the least-squares functions that approximate that data set.
  • We showed algebraically that the least-squares solution is not simply a collection of multiple univariate solutions, in which the least-squares value is found separately at each individual wavelength (or other variable), but requires the full variance–covariance matrix to properly describe the least-squares nature of the problem.
  • We showed algebraically that even so, the least-squares formulation of the problem results in a trivial solution, unless somehow you can impose a constraint on the solutions so that nontrivial solutions can be found.
  • We showed algebraically that the method of Lagrange multipliers is a way to impose the necessary constraint.
  • We showed that finding the least-squares solutions required the use of derivatives, because the extremum is found by setting the derivative equal to zero. This result comes from elementary calculus, our only departure from a purely algebraic approach.
  • We then showed algebraically how the Lagrange multiplier comes out of the equations as the root of a polynomial (whose defining equation is called the "characteristic equation"); therefore, there are as many solutions to the problem as there are roots of the polynomial. These multiple roots correspond to the multiple principal components we can calculate.
  • To calculate the functions corresponding to the roots of the characteristic equation, which are the functions that are the least-squares fit to the original data set, we recast the equations into the form of an eigenvalue equation. From this we found that each eigenvalue corresponds to a root of the characteristic equation, and the corresponding eigenfunction is the function that is actually the least-squares estimator of the original data set.
  • Then, we subtract the contribution of the given principal component from the data set, and do the whole procedure all over again, to find the next principal component. Iterate until you've approximated the data set to a sufficient degree of accuracy, or run out of time, money, computer capacity, or patience.
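For the smallest possible case, two wavelengths, the characteristic-equation and eigenvalue steps in this summary can be written out by hand. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and the symmetric 2 × 2 matrix [[2, 1], [1, 2]] stands in for a variance–covariance matrix and is not taken from any data in this series.

```python
import math

def characteristic_roots(a, b, c):
    """Roots of det([[a, b], [b, c]] - lam*I) = 0, largest first.

    Expanding the determinant gives lam^2 - (a + c)*lam + (a*c - b^2) = 0.
    """
    trace = a + c
    det = a * c - b * b
    # (a - c)^2 + 4*b^2, so never negative for a symmetric matrix.
    disc = math.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0

def eigenvector(a, b, lam):
    """Unit-length solution of (a - lam)*v1 + b*v2 = 0, assuming b != 0."""
    v = (b, lam - a)
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

a, b, c = 2.0, 1.0, 2.0  # hypothetical matrix [[2, 1], [1, 2]]
roots = characteristic_roots(a, b, c)
vectors = [eigenvector(a, b, lam) for lam in roots]

for lam, (v1, v2) in zip(roots, vectors):
    # Check the eigenvalue equation A v = lam v, one row at a time.
    print(lam, (v1, v2), (a * v1 + b * v2, b * v1 + c * v2))
```

With more than two wavelengths the polynomial has a higher degree and no longer has a convenient closed-form solution, which is one practical reason the calculation is handed over to matrix eigenvalue routines.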

Thus ends the story of the long, complicated, tedious, and difficult route to principal components. That's why it took us six of our columns to explicate it all.

And THAT'S why principal components are always done with matrices!

However, don't miss the coda we will present in our next installment.

References

(1) H. Mark and J. Workman, Spectroscopy 22(9), 20–29 (2007).

(2) H. Mark and J. Workman, Spectroscopy 23(2), 30–37 (2008).

(3) H. Mark and J. Workman, Spectroscopy 23(5), 14–17 (2008).

(4) H. Mark and J. Workman, Spectroscopy 23(6), 22–24 (2008).

(5) H. Mark and J. Workman, Spectroscopy 23(10), 24–29 (2008).

(6) H. Mark and J. Workman, Spectroscopy 21(6), 34–36 (2006).

Howard Mark serves on the Editorial Advisory Board of Spectroscopy and runs a consulting service, Mark Electronics (Suffern, NY). He can be reached via e-mail:

Jerome Workman, Jr. serves on the Editorial Advisory Board of Spectroscopy and is currently with Luminous Medical, Inc., a company dedicated to providing automated glucose management systems to empower health care professionals.


Source: Spectroscopy,