I have always found the common definition of the generalized inverse of a matrix quite unsatisfactory, because it is usually defined by a mere property, $A A^+ A = A$, which does not really give intuition on when such a matrix exists or on how it can be constructed, etc… But recently, I came across a much more satisfactory definition for the case of symmetric (or, more generally, normal) matrices.
As is well known, any symmetric matrix $A$ is diagonalizable,
$$A = U D U^T,$$
where $D$ is a diagonal matrix with the eigenvalues of $A$ on its diagonal, and $U$ is an orthogonal matrix with eigenvectors of $A$ as its columns (which magically form an orthonormal set $u_1, \dots, u_n$, just kidding, absolutely no magic involved).
Assume that $A$ is a real symmetric matrix of size $n \times n$ and has rank $r$. Denoting the non-zero eigenvalues of $A$ by $\lambda_1, \dots, \lambda_r$ and the corresponding columns of $U$ by $u_1, \dots, u_r$, we have that
$$A = \sum_{i=1}^{r} \lambda_i u_i u_i^T.$$
We define the generalized inverse of $A$ by
$$A^+ = \sum_{i=1}^{r} \frac{1}{\lambda_i} u_i u_i^T.$$
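This construction is easy to check numerically. The following sketch (using NumPy, with a randomly generated rank-deficient symmetric matrix as an assumed example; it is not from the original text) builds $A^+$ by inverting only the non-zero eigenvalues and compares it against NumPy's built-in Moore–Penrose pseudoinverse:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a rank-2 symmetric 4x4 matrix from a random orthonormal basis
# (illustrative example).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([3.0, -1.0, 0.0, 0.0]) @ Q.T

# Spectral decomposition of the symmetric matrix A.
w, U = np.linalg.eigh(A)

# Invert only the non-zero eigenvalues (tolerance guards against round-off).
w_inv = np.zeros_like(w)
nonzero = np.abs(w) > 1e-10
w_inv[nonzero] = 1.0 / w[nonzero]

# A^+ = sum_i (1 / lambda_i) u_i u_i^T
A_plus = U @ np.diag(w_inv) @ U.T

# Agrees with NumPy's built-in Moore-Penrose pseudoinverse.
assert np.allclose(A_plus, np.linalg.pinv(A))
```

For symmetric matrices the spectral construction and the SVD-based `np.linalg.pinv` coincide, which is exactly the point of the definition.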
Why this definition makes sense
The common definition/property of the generalized inverse still holds:
$$A A^+ A = \sum_{i=1}^{r} \sum_{j=1}^{r} \sum_{k=1}^{r} \frac{\lambda_i \lambda_k}{\lambda_j} u_i (u_i^T u_j)(u_j^T u_k) u_k^T = \sum_{i=1}^{r} \lambda_i u_i u_i^T = A,$$
where we used the fact that $u_i^T u_j = 0$ unless $i = j$ (i.e., the orthogonality of the columns of $U$).
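The identity above can also be verified numerically; a minimal sketch (again with an assumed random rank-deficient example, standing in for the symbolic calculation):

```python
import numpy as np

rng = np.random.default_rng(1)

# Rank-2 symmetric 4x4 matrix (illustrative example).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([2.0, 5.0, 0.0, 0.0]) @ Q.T

# Generalized inverse via the spectral definition.
w, U = np.linalg.eigh(A)
w_inv = np.zeros_like(w)
w_inv[np.abs(w) > 1e-10] = 1.0 / w[np.abs(w) > 1e-10]
A_plus = U @ np.diag(w_inv) @ U.T

# The defining property of a generalized inverse holds:
assert np.allclose(A @ A_plus @ A, A)
# ... as does the symmetric counterpart:
assert np.allclose(A_plus @ A @ A_plus, A_plus)
```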
By a similar calculation, if $A$ is invertible, then $r = n$ and it holds that
$$A A^+ = A^+ A = I,$$
i.e., $A^+ = A^{-1}$.
If $A$ is invertible, then $A^{-1}$ has eigenvalues $1/\lambda_1, \dots, 1/\lambda_n$ and eigenvectors $u_1, \dots, u_n$ (because $A u_i = \lambda_i u_i$ implies $A^{-1} u_i = \frac{1}{\lambda_i} u_i$ for all $i$).
Thus, the definition above is simply the diagonalization of $A^{-1}$ if $A$ is invertible.
Since $u_1, \dots, u_r$ form an orthonormal basis for the range of $A$, it follows that the matrix
$$A A^+ = A^+ A = \sum_{i=1}^{r} u_i u_i^T$$
is the projection operator onto the range of $A$.
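Numerically, the projection property amounts to $A A^+$ being symmetric, idempotent, and fixing every vector of the form $Ax$. A short sketch (same assumed random rank-deficient construction as before):

```python
import numpy as np

rng = np.random.default_rng(2)

# Rank-2 symmetric 4x4 matrix and its spectral generalized inverse
# (illustrative example).
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([4.0, -2.0, 0.0, 0.0]) @ Q.T
w, U = np.linalg.eigh(A)
w_inv = np.zeros_like(w)
w_inv[np.abs(w) > 1e-10] = 1.0 / w[np.abs(w) > 1e-10]
A_plus = U @ np.diag(w_inv) @ U.T

P = A @ A_plus  # candidate projection onto range(A)

# P is symmetric and idempotent, hence an orthogonal projection...
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)

# ...and it leaves vectors in range(A) untouched.
x = rng.standard_normal(4)
assert np.allclose(P @ (A @ x), A @ x)
```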
But what if $A$ is not symmetric?
Well, then $A$ is not diagonalizable (in general), but instead we can use the singular value decomposition
$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^T,$$
and invert the non-zero singular values in the same spirit.
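The SVD-based construction can be sketched as follows (with an assumed random non-symmetric, rank-deficient matrix as the example); inverting only the non-zero singular values and swapping the roles of $u_i$ and $v_i$ recovers the Moore–Penrose pseudoinverse:

```python
import numpy as np

rng = np.random.default_rng(3)

# A non-symmetric, rank-2, 5x3 matrix (illustrative example).
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))

# Thin SVD; invert only the non-zero singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
s_inv = np.zeros_like(s)
s_inv[s > 1e-10] = 1.0 / s[s > 1e-10]

# A^+ = sum_i (1 / sigma_i) v_i u_i^T
A_plus = Vt.T @ np.diag(s_inv) @ U.T

# Matches NumPy's pseudoinverse and satisfies the defining property.
assert np.allclose(A_plus, np.linalg.pinv(A))
assert np.allclose(A @ A_plus @ A, A)
```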
The definition above is mentioned in passing on page 87 in
- Morris L. Eaton, Multivariate Statistics: A Vector Space Approach. Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2007.