Figure 1.

Geometric meaning of PCA, illustrated with two bivariate normally distributed variables. The sample points are scattered roughly in the shape of an ellipse. The original rectangular coordinate system formed by X1 and X2 is rotated orthogonally by an angle θ, which transforms the two correlated original variables (X1, X2) into two integrated, uncorrelated variables (Y1, Y2). Because the variance of the data is greater along the Y1 axis than along the Y2 axis, the least information is lost if the integrated variable Y1 alone is used to replace the original variables. Hence, Y1 is defined as the first principal component. In contrast, the variance along the Y2 axis is smaller and explains only a minor part of the information relative to Y1, so Y2 is called the second principal component.
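
The rotation described in the caption can be reproduced numerically. The following is a minimal sketch, not the authors' code: the covariance matrix, sample size, and random seed are illustrative assumptions, and the angle θ is obtained from the standard closed form for the major axis of a 2×2 covariance matrix.

```python
import numpy as np

# Illustrative assumptions (not taken from the paper): seed, covariance, sample size.
rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 1.2],
                     [1.2, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=2000)

# Center the data so the rotation is about the centroid of the point cloud.
Xc = X - X.mean(axis=0)

# Sample covariance matrix of (X1, X2).
S = np.cov(Xc, rowvar=False)

# Rotation angle theta of the ellipse's major axis:
# tan(2*theta) = 2*S12 / (S11 - S22).
theta = 0.5 * np.arctan2(2.0 * S[0, 1], S[0, 0] - S[1, 1])

# Orthogonal rotation: Y1 =  cos(theta)*X1 + sin(theta)*X2
#                      Y2 = -sin(theta)*X1 + cos(theta)*X2
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
Y = Xc @ R.T

# Variances along the new axes and the correlation between Y1 and Y2.
var_y = Y.var(axis=0, ddof=1)
corr_y = np.corrcoef(Y, rowvar=False)[0, 1]

# Share of the total variance carried by each principal component.
explained = var_y / var_y.sum()

print("theta (degrees):", np.degrees(theta))
print("variance along Y1, Y2:", var_y)
print("correlation of Y1 and Y2:", corr_y)
print("explained variance ratio:", explained)
```

Under these assumptions, Y1 and Y2 come out uncorrelated, the variance along Y1 exceeds that along Y2, and the total variance is unchanged by the rotation, which is why keeping only Y1 loses the least information.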

Zhao et al. BMC Genomics 2012 13:435   doi:10.1186/1471-2164-13-435