15.4 Factor Extraction Methods

Methods

  1. Principal Components
  2. Iterated Components
  3. Maximum Likelihood

15.4.1 Principal components (PC Factor model)

Recall that \(\mathbf{C} = \mathbf{A}\mathbf{X}\): the \(C\)'s are a function of the \(X\)'s.

\[ C_{1} = a_{11}X_{1} + a_{12}X_{2} + \ldots + a_{1P}X_{P} \]

We want the reverse: the \(X\)'s as a function of the \(C\)'s (and ultimately the \(F\)'s).

  • Use the inverse! If \(c = 5x\), then \(x = 5^{-1}c\)

The inverse PC model is \(\mathbf{X} = \mathbf{A}^{-1}\mathbf{C}\).

Since \(\mathbf{A}\) is orthogonal, \(\mathbf{A}^{-1} = \mathbf{A}^{T} = \mathbf{A}^{'}\), so

\[ X_{1} = a_{11}C_{1} + a_{21}C_{2} + \ldots + a_{P1}C_{P} \]

But there are more PC’s than Factors…

\[ \begin{equation} \begin{aligned} X_{i} &= \sum_{j=1}^{P}a_{ji}C_{j} \\ &= \sum_{j=1}^{m}a_{ji}C_{j} + \sum_{j=m+1}^{P}a_{ji}C_{j} \\ &= \sum_{j=1}^{m}l_{ij}F_{j} + e_{i} \\ \end{aligned} \end{equation} \]

where the error term \(e_{i} = \sum_{j=m+1}^{P}a_{ji}C_{j}\) collects the \(P-m\) discarded components.

Adjustment

  • \(V(C_{j}) = \lambda_{j}\) not 1
  • We transform: \(F_{j} = C_{j}\lambda_{j}^{-1/2}\)
  • Now \(V(F_{j}) = 1\)
  • Loadings: \(l_{ij} = \lambda_{j}^{1/2}a_{ji}\)
  • \(l_{ij}\) is the correlation coefficient between variable \(i\) and factor \(j\).

This is similar to \(a_{ij}\) in PCA.
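To see why \(l_{ij}\) behaves like a correlation: the \(C_{j}\)'s are uncorrelated and both \(X_{i}\) and \(F_{j}\) have variance 1, so the covariance below is also the correlation:

\[ Corr(X_{i}, F_{j}) = Cov\left(\sum_{k=1}^{P}a_{ki}C_{k},\ \lambda_{j}^{-1/2}C_{j}\right) = \lambda_{j}^{-1/2}a_{ji}V(C_{j}) = \lambda_{j}^{1/2}a_{ji} = l_{ij} \]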

15.4.1.1 R code

Factor extraction via principal components can be done using the principal function in the psych package. We choose nfactors=2 here because we know there are 2 underlying factors in the data generation model.

library(psych)
pc.extract.norotate <- principal(stan.dta, nfactors=2, rotate="none")
print(pc.extract.norotate)
## Principal Components Analysis
## Call: principal(r = stan.dta, nfactors = 2, rotate = "none")
## Standardized loadings (pattern matrix) based upon correlation matrix
##     PC1   PC2   h2    u2 com
## X1 0.57  0.75 0.89 0.112 1.9
## X2 0.61  0.72 0.89 0.113 1.9
## X3 0.58 -0.51 0.59 0.406 2.0
## X4 0.87 -0.38 0.89 0.109 1.4
## X5 0.92 -0.27 0.91 0.086 1.2
## 
##                        PC1  PC2
## SS loadings           2.63 1.55
## Proportion Var        0.53 0.31
## Cumulative Var        0.53 0.83
## Proportion Explained  0.63 0.37
## Cumulative Proportion 0.63 1.00
## 
## Mean item complexity =  1.7
## Test of the hypothesis that 2 components are sufficient.
## 
## The root mean square of the residuals (RMSR) is  0.09 
##  with the empirical chi square  16.62  with prob <  4.6e-05 
## 
## Fit based upon off diagonal values = 0.96

\[ \begin{equation} \begin{aligned} X_{1} &= 0.57F_{1} + 0.75F_{2} + e_{1} \\ X_{2} &= 0.61F_{1} + 0.72F_{2} + e_{2} \\ X_{3} &= 0.58F_{1} - 0.51F_{2} + e_{3} \\ X_{4} &= 0.87F_{1} - 0.38F_{2} + e_{4} \\ X_{5} &= 0.92F_{1} - 0.27F_{2} + e_{5} \\ \end{aligned} \end{equation} \]

These equations come from the top of the output, under Standardized loadings.
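As a cross-check on the adjustment \(l_{ij} = \lambda_{j}^{1/2}a_{ji}\), here is a minimal sketch (assuming stan.dta is the standardized data frame used above) that rebuilds these loadings directly from the eigendecomposition of the correlation matrix. Columns may differ from principal()'s output by a sign flip, which is arbitrary.

R <- cor(stan.dta)                           # correlation matrix of the standardized data
eig <- eigen(R)                              # eigenvalues (lambda_j) and eigenvectors (columns a_j)
L <- eig$vectors %*% diag(sqrt(eig$values))  # l_ij = sqrt(lambda_j) * a_ji
round(L[, 1:2], 2)                           # compare (up to sign) with the PC1/PC2 loadings above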

15.4.2 Iterated components

Select common factors to maximize the total communality

  1. Get initial communality estimates
  2. Use these (instead of original variances) to get the PC’s and factor loadings
  3. Get new communality estimates
  4. Rinse and repeat
  5. Stop when no appreciable changes occur.

R code for this method is not shown in the notes; note that factanal() in base R performs maximum likelihood extraction (covered next), not iterated components. One option using the psych package is sketched below.
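A minimal sketch, assuming the psych package's fa() function with the principal-axis method (fm = "pa") as the iterated-communality estimator; the object name is illustrative and this is not the notes' original code.

ipa.extract.norotate <- fa(stan.dta, nfactors=2, rotate="none", fm="pa")  # iterated principal-axis extraction
print(ipa.extract.norotate$loadings, digits=2)                            # unrotated loadings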

15.4.3 Maximum Likelihood

  • Assume that all the variables are normally distributed
  • Use Maximum Likelihood to estimate the parameters

15.4.3.1 R code

The cutoff argument hides loadings below that value for ease of interpretation. Here I set the cutoff to 0 so that all loadings are displayed. I encourage you to adjust this cutoff value in practice to see how it can reduce the cognitive load of scanning a grid of numbers.

ml.extract.norotate <- factanal(stan.dta, factors=2, rotation="none")
print(ml.extract.norotate, digits=2, cutoff=0)
## 
## Call:
## factanal(x = stan.dta, factors = 2, rotation = "none")
## 
## Uniquenesses:
##   X1   X2   X3   X4   X5 
## 0.33 0.06 0.72 0.08 0.01 
## 
## Loadings:
##    Factor1 Factor2
## X1  0.35    0.74  
## X2  0.41    0.88  
## X3  0.50   -0.18  
## X4  0.94   -0.19  
## X5  0.99   -0.07  
## 
##                Factor1 Factor2
## SS loadings       2.41    1.39
## Proportion Var    0.48    0.28
## Cumulative Var    0.48    0.76
## 
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 0.4 on 1 degree of freedom.
## The p-value is 0.526

The factor equations now are:

\[ \begin{equation} \begin{aligned} X_{1} &= 0.35F_{1} + 0.74F_{2} + e_{1} \\ X_{2} &= 0.41F_{1} + 0.88F_{2} + e_{2} \\ X_{3} &= 0.50F_{1} - 0.18F_{2} + e_{3} \\ \vdots \end{aligned} \end{equation} \]

15.4.4 Uniqueness

Recall that factor analysis splits the variance of each observed \(X\) into a part due to the communality \(h_{i}^{2}\) and a part due to the specificity \(u_{i}^{2}\). The latter is the portion of the variance due to the unique factor. Let's look at how these differ depending on the extraction method:

pc.extract.norotate$uniquenesses
##         X1         X2         X3         X4         X5 
## 0.11151283 0.11336123 0.40564130 0.10890203 0.08591098
ml.extract.norotate$uniquenesses
##         X1         X2         X3         X4         X5 
## 0.33432315 0.05506386 0.71685548 0.07508356 0.01414656

Here we see that the uniqueness for X2, X4 and X5 under ML is quite low compared to the PC extraction method, but that is nearly offset by a much higher uniqueness for X1 and X3.

Ideally we want the variance in the X’s to be captured by the factors. So we want to see a low unique variance.
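As a quick check (a sketch, not part of the original notes): for standardized variables the uniqueness is one minus the communality, i.e., one minus the sum of squared loadings over the retained factors.

h2 <- rowSums(pc.extract.norotate$loadings[, 1:2]^2)  # communalities from the PC solution
u2 <- 1 - h2                                          # uniquenesses
round(cbind(h2, u2), 3)   # should match the h2/u2 columns in the principal() output above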

15.4.5 Resulting factors

par(mfrow=c(1,2)) # plotting grid: 1 row, 2 columns

pc.load <- pc.extract.norotate$loadings[,1:2]
plot(pc.load, type="n", main="PCA Extraction") # set up the plot but don't put points down
text(pc.load, labels=rownames(pc.load)) # put names instead of points

ml.load <- ml.extract.norotate$loadings[,1:2]
plot(ml.load, type="n", main="ML Extraction") 
text(ml.load, labels=rownames(ml.load))

PCA Extraction

  • X1 and X2 load high on PC2, and lower on PC1.
  • X3, X4 and X5 are negative on PC2, and moderate to high on PC1.
  • PC1 is not highly correlated with X3.

ML Extraction

  • Same overall split; X3 still does not load high on Factor 1.
  • X1 loads lower on Factor 1 than under the PCA extraction method.

Neither extraction method reproduced our true hypothetical factor model. Rotating the factors will achieve our desired results.