I have always found the common definition of the generalized inverse of a matrix quite unsatisfactory, because it is usually defined by a mere property, $$A A^{-} A = A$$, which does not really give intuition on when such a matrix exists or on how it can be constructed, etc… But recently, I came across a much more satisfactory definition for the case of symmetric (or more general, normal) matrices.

As is well known, any symmetric matrix $$A$$ is diagonalizable,

$A = QDQ^T,$

where $$D$$ is a diagonal matrix with the eigenvalues of $$A$$ on its diagonal, and $$Q$$ is an orthogonal matrix with eigenvectors of $$A$$ as its columns (which magically form an orthogonal set , just kidding, absolutely no magic involved).

### The Definition

Assume that $$A$$ is a real symmetric matrix of size $$n\times n$$ and has rank $$k \leq n$$. Denoting the $$k$$ non-zero eigenvalues of $$A$$ by $$\lambda_1, \dots, \lambda_k$$ and the corresponding $$k$$ columns of $$Q$$ by $$q_1, \dots, q_k$$, we have that

$A = QDQ^T = \sum_{i=1}^k \lambda_i q_i q_i^T.$

We define the generalized inverse of $$A$$ by

$$$\label{TheDefinition} A^{-} := \sum_{i=1}^k \frac{1}{\lambda_i} q_i q_i^T.$$$

### Why this definition makes sense

1. The common definition/property of generalized inverse still holds:

$\begin{eqnarray} A A^{-} A &=& \sum_{i=1}^k \lambda_i q_i q_i^T \sum_{m=1}^k \frac{1}{\lambda_m} q_m q_m^T \sum_{j=1}^k \lambda_j q_j q_j^T \nonumber \\ &=& \sum_{i,m=1,\dots,k} \lambda_i \frac{1}{\lambda_m} q_i q_i^T q_m q_m^T \sum_{j=1}^k \lambda_j q_j q_j^T \nonumber \\ &=& \sum_{i=1}^k q_i q_i^T \sum_{j=1}^k \lambda_j q_j q_j^T \nonumber \\ &=& \sum_{i,j=1,\dots,k} \lambda_j q_i q_i^T q_j q_j^T \nonumber \\ &=& \sum_{i=1}^k \lambda_i q_i q_i^T = A, \nonumber \end{eqnarray}$

where we used the fact that $$q_i^T q_j = 0$$ unless $$i = j$$ (i.e., orthogonality of $$Q$$).

2. By a similar calculation, if $$A$$ is invertible, then $$k = n$$ and it holds that

$A A^{-} = \sum_{i=1}^n q_i q_i^T = QQ^T = I.$
3. If $$A$$ is invertible, then $$A^{-1}$$ has eigenvalues $$\frac{1}{\lambda_i}$$ and eigenvectors $$q_i$$ (because $$A^{-1}q_i = \frac{1}{\lambda_i} A^{-1} \lambda_i q_i = \frac{1}{\lambda_i} A^{-1} A q_i = \frac{1}{\lambda_i} q_i$$ for all $$i = 1,\dots,n$$).

Thus, Definition ($$\ref{TheDefinition}$$) is simply the diagonalization of $$A^{-1}$$ if $$A$$ is invertible.

4. Since $$q_1, \dots, q_k$$ form an orthonormal basis for the range of A, it follows that the matrix

$A A^{-} = \sum_{i=1}^k q_i q_i^T = Q_{1:k} Q_{1:k}^T$

is the projection operator onto the range of $$A$$.

### But what if A is not symmetric?

Well, then $$A$$ is not diagonalizable (in general), but instead we can use the singular value decomposition

$A = U \Sigma V^T = \sum_{i = 1}^k \sigma_i u_i v_i^T,$

and define

$A^{-} := \sum_{i = 1}^k \frac{1}{\sigma_i} v_i u_i^T.$

Easy.

### References

Definition $$(\ref{TheDefinition})$$ is mentioned in passing on page 87 in

• Morris L. Eaton, Multivariate Statistics: A Vector Space Approach. Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2007.