Given any $m \times n$ matrix $A$, the singular value decomposition (SVD) is

$$A = U \Sigma V^*$$

where $U$ is an $m \times m$ unitary matrix, $V$ is an $n \times n$ unitary matrix, and $\Sigma$ is an $m \times n$ diagonal matrix where all off-diagonal entries are 0 and the diagonal entries are all non-negative real values. The diagonal entries of $\Sigma$ are referred to as "singular values".
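The following NumPy sketch illustrates the shapes and properties of the three factors; the example matrix and variable names are arbitrary choices for illustration, not part of the definition above.

```python
import numpy as np

# A hypothetical 3x2 example matrix; any values would do.
A = np.array([[1.0,  2.0],
              [0.0,  1.0],
              [3.0, -1.0]])
m, n = A.shape

U, s, Vh = np.linalg.svd(A)            # s holds the singular values
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)   # embed them in an m x n diagonal matrix

assert np.allclose(U.conj().T @ U, np.eye(m))    # U is m x m unitary
assert np.allclose(Vh @ Vh.conj().T, np.eye(n))  # V is n x n unitary
assert np.all(s >= 0)                            # singular values are non-negative reals
assert np.allclose(U @ Sigma @ Vh, A)            # A = U Sigma V*
```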
As an example, consider the shear transformation $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. The singular value decomposition of $A$ is:

$$A = \underbrace{\frac{1}{\sqrt{1+\varphi^{2}}}\begin{bmatrix} \varphi & -1 \\ 1 & \varphi \end{bmatrix}}_{U}\;\underbrace{\begin{bmatrix} \varphi & 0 \\ 0 & \varphi^{-1} \end{bmatrix}}_{\Sigma}\;\underbrace{\frac{1}{\sqrt{1+\varphi^{2}}}\begin{bmatrix} 1 & \varphi \\ -\varphi & 1 \end{bmatrix}}_{V^{*}} \qquad \text{where } \varphi = \frac{1+\sqrt{5}}{2}.$$

The set of all unit length vectors $v$ such that $\|v\| = 1$ forms a sphere of radius 1 around the origin. When $A$ is applied to this sphere, it becomes an ellipsoid. The principal radii of this ellipsoid are the singular values, and their directions form the columns of $U$.
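A quick numerical check of this decomposition (a sketch; the variable names are arbitrary):

```python
import numpy as np

phi = (1 + np.sqrt(5)) / 2                    # golden ratio
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])                    # the shear transformation

c = 1 / np.sqrt(1 + phi**2)
U = c * np.array([[phi, -1.0],
                  [1.0,  phi]])
Sigma = np.diag([phi, 1 / phi])
Vh = c * np.array([[1.0,  phi],
                   [-phi, 1.0]])

assert np.allclose(U @ Sigma @ Vh, A)                    # the factors reproduce A
assert np.allclose(np.linalg.svd(A)[1], [phi, 1 / phi])  # singular values (1+sqrt(5))/2 and (sqrt(5)-1)/2
```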
One fact that is not immediately obvious is that the singular value decomposition always exists: every $m \times n$ matrix $A$ admits a factorization $A = U \Sigma V^*$ of the form described above.
In essence, any linear transformation is a rotation, followed by stretching or shrinking parallel to each axis (with some dimensions added or zeroed out of existence), followed by another rotation.
The following proof will demonstrate that the singular value decomposition always exists. An outline of the proof will be given first:
Proof outline
We need to prove that an arbitrary linear transform $A$ is a rotation: $V^*$, followed by scaling parallel to each axis: $\Sigma$, and lastly followed by another rotation: $U$, giving $A = U \Sigma V^*$.
If the columns of $A$ are already mutually orthogonal, then the first rotation is not necessary: $V = I$ and $A = U \Sigma$. The entries of $\Sigma$ are the lengths of the vectors formed by the columns of $A$, and $U$ is a rotation that rotates the elementary basis vectors $e_1, e_2, \ldots, e_n$ to be parallel with the columns of $A$.
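For instance (an illustrative example, not taken from the text above), a matrix whose columns are already orthogonal factors with $V = I$:

$$\begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix} = \underbrace{\frac{1}{\sqrt{5}}\begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}}_{U}\;\underbrace{\begin{bmatrix} \sqrt{5} & 0 \\ 0 & \sqrt{5} \end{bmatrix}}_{\Sigma}\;\underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{V^{*}}$$

Here the diagonal of $\Sigma$ holds the column lengths (both $\sqrt{5}$), and the columns of $U$ are the columns of $A$ normalized to unit length.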
In most cases, however, the columns of $A$ are not mutually orthogonal. In this case, the rotation $V^*$ is non-trivial. $A = U \Sigma V^*$ implies $AV = U \Sigma$, so $V$ must be chosen so that the columns of $AV$ are mutually orthogonal. Let $V = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}$. We need to choose orthonormal vectors $v_1, v_2, \ldots, v_n$ so that $Av_1, Av_2, \ldots, Av_n$ are all mutually orthogonal. This can be done iteratively. Imagine that we have chosen $v_1$ so that when given any vector $w$ that is orthogonal to $v_1$, $Aw$ is orthogonal to $Av_1$. The effort now switches to finding an orthonormal set of vectors $v_2, \ldots, v_n$ confined to the space of vectors that are perpendicular to $v_1$ such that $Av_2, \ldots, Av_n$ are mutually orthogonal.
Let $V_1$ be a unitary matrix with $v_1$ as the first column. Factoring $V_1$ from the left side of $V$ to get $V = V_1 W$ (with $W = V_1^* V$) results in a new set of orthonormal vectors that are the columns of $W$. The goal of having the columns of $AV$ be mutually orthogonal is converted to having the columns of $(A V_1) W$ be mutually orthogonal, with $A V_1$ effectively replacing $A$. Under this change of coordinates, $v_1$ transforms to $e_1$, and the space of vectors orthogonal to $v_1$ transforms to the space spanned by the standard basis vectors $e_2, e_3, \ldots, e_n$. The first column of $A V_1$ is $A v_1$ and so is orthogonal to all other columns.
If $U_1$ is a unitary matrix where the first column of $U_1$ is $A v_1$ normalized to unit length, then factoring $U_1$ from the left side of $A V_1$ to get $A V_1 = U_1 (U_1^* A V_1)$ results in a matrix $U_1^* A V_1$ in which the first column is parallel to the standard basis vector $e_1$. The first column of $A V_1$ is orthogonal to all other columns, so the first column of $U_1^* A V_1$ is orthogonal to all other columns, and hence the first row of $U_1^* A V_1$ contains all 0s except for the first column. The remaining factors can now be determined recursively with the dimension reduced to $(m-1) \times (n-1)$, and $A$ replaced with $U_1^* A V_1$ with the first row and column removed. This forms the inductive component of the coming proof.
Lastly, how do we know that there exists $v_1$ so that when given any vector $w$ that is orthogonal to $v_1$, $Aw$ is orthogonal to $Av_1$? The answer will be that the unit length $v$ that maximizes $\|Av\|$ is a valid $v_1$.
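The outline above already suggests an algorithm. The NumPy sketch below follows it for real matrices; it is illustrative only, and the helper names (`top_right_singular_vector`, `complete_to_unitary`, `recursive_svd`) are hypothetical. It finds $v_1$ by naive power iteration on $A^T A$, and completes $v_1$ and $u_1$ to orthonormal bases with a QR factorization in place of Gram-Schmidt; none of these implementation choices come from the proof itself.

```python
import numpy as np

def complete_to_unitary(x):
    """Return an orthogonal matrix whose first column is the unit vector x."""
    Q, _ = np.linalg.qr(np.column_stack([x, np.eye(len(x))]))
    if Q[:, 0] @ x < 0:          # QR may flip the sign of the first column
        Q[:, 0] *= -1
    return Q

def top_right_singular_vector(A, iters=500):
    """Unit vector v that (approximately) maximizes ||A v||, via power iteration on A^T A."""
    v = np.ones(A.shape[1]) / np.sqrt(A.shape[1])
    for _ in range(iters):
        w = A.T @ (A @ v)
        if np.linalg.norm(w) == 0:
            break                # A v is already 0; any unit vector is a maximizer
        v = w / np.linalg.norm(w)
    return v

def recursive_svd(A):
    """Return (U, Sigma, V) with A = U @ Sigma @ V.T, mirroring the proof's recursion."""
    m, n = A.shape
    Sigma = np.zeros((m, n))
    if m == 1 or n == 1:                         # base cases: a single row or column
        sigma = np.linalg.norm(A)
        Sigma[0, 0] = sigma
        x = A.flatten() / sigma if sigma > 0 else np.eye(max(m, n))[:, 0]
        if m == 1:
            return np.eye(1), Sigma, complete_to_unitary(x)
        return complete_to_unitary(x), Sigma, np.eye(1)
    v1 = top_right_singular_vector(A)            # maximizer of ||A v|| over unit vectors
    sigma1 = np.linalg.norm(A @ v1)
    u1 = A @ v1 / sigma1 if sigma1 > 0 else np.eye(m)[:, 0]
    U1, V1 = complete_to_unitary(u1), complete_to_unitary(v1)
    Ap = U1.T @ A @ V1                           # first row and column vanish apart from (1,1) = sigma1
    U2, S2, V2 = recursive_svd(Ap[1:, 1:])       # recurse on the (m-1) x (n-1) trailing block
    Sigma[0, 0] = sigma1
    Sigma[1:, 1:] = S2
    U = U1 @ np.block([[np.eye(1), np.zeros((1, m - 1))], [np.zeros((m - 1, 1)), U2]])
    V = V1 @ np.block([[np.eye(1), np.zeros((1, n - 1))], [np.zeros((n - 1, 1)), V2]])
    return U, Sigma, V

A = np.array([[1.0, 1.0], [0.0, 1.0]])           # the shear example from earlier
U, Sigma, V = recursive_svd(A)
assert np.allclose(U @ Sigma @ V.T, A)
```

This construction is far from numerically robust (power iteration converges slowly when singular values are close), but it makes the induction in the proof below concrete.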
We are now ready to give the proof in full detail:
- Proof of the existence of the singular value decomposition
This proof will proceed using induction on both $m$ and $n$.
- Base Case: $m = 1$
$A$ has a single row, and therefore has the form $A = \sigma v^*$ where $v$ is a unit length vector and $\sigma$ is a non-negative real number. Note that $\sigma$ and $v$ exist for any single row matrix $A$.

Let $U = \begin{bmatrix} 1 \end{bmatrix}$, $\Sigma = \begin{bmatrix} \sigma & 0 & \cdots & 0 \end{bmatrix}$, and $V = \begin{bmatrix} v & v_2 & \cdots & v_n \end{bmatrix}$ where $v, v_2, \ldots, v_n$ together form a mutually orthogonal set of unit length vectors. $v_2, \ldots, v_n$ can be determined via Gram-Schmidt orthogonalization. It is clear that: $U \Sigma V^* = \sigma v^* = A$.
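For instance (an illustrative example, not part of the original argument), the single-row matrix $\begin{bmatrix} 3 & 4 \end{bmatrix}$ has $\sigma = 5$ and $v = \begin{bmatrix} 3/5 & 4/5 \end{bmatrix}^*$, giving

$$\begin{bmatrix} 3 & 4 \end{bmatrix} = \underbrace{\begin{bmatrix} 1 \end{bmatrix}}_{U}\;\underbrace{\begin{bmatrix} 5 & 0 \end{bmatrix}}_{\Sigma}\;\underbrace{\begin{bmatrix} 3/5 & 4/5 \\ -4/5 & 3/5 \end{bmatrix}}_{V^{*}}$$

where $v_2 = \begin{bmatrix} -4/5 & 3/5 \end{bmatrix}^*$ completes $v$ to an orthonormal basis.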
- Base Case: $n = 1$
$A$ has a single column, and therefore has the form $A = \sigma u$ where $u$ is a unit length vector and $\sigma$ is a non-negative real number. Note that $\sigma$ and $u$ exist for any single column matrix $A$.

Let $U = \begin{bmatrix} u & u_2 & \cdots & u_m \end{bmatrix}$, $\Sigma = \begin{bmatrix} \sigma & 0 & \cdots & 0 \end{bmatrix}^{T}$, and $V = \begin{bmatrix} 1 \end{bmatrix}$ where $u, u_2, \ldots, u_m$ together form a mutually orthogonal set of unit length vectors. $u_2, \ldots, u_m$ can be determined via Gram-Schmidt orthogonalization. It is clear that: $U \Sigma V^* = \sigma u = A$.
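Similarly (again, just an illustration), the single-column matrix $\begin{bmatrix} 3 \\ 4 \end{bmatrix}$ has $\sigma = 5$ and $u = \begin{bmatrix} 3/5 \\ 4/5 \end{bmatrix}$:

$$\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \underbrace{\begin{bmatrix} 3/5 & -4/5 \\ 4/5 & 3/5 \end{bmatrix}}_{U}\;\underbrace{\begin{bmatrix} 5 \\ 0 \end{bmatrix}}_{\Sigma}\;\underbrace{\begin{bmatrix} 1 \end{bmatrix}}_{V^{*}}$$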
- Inductive Case: $m > 1$ and $n > 1$
Let $e_i$ denote the $i^{\text{th}}$ standard basis vector (of $\mathbb{C}^m$ or $\mathbb{C}^n$, as the context requires). Let $0_{j \times k}$ denote a $j \times k$ matrix of 0s.
Maximize $\|Av\|$ subject to the constraint $\|v\| = 1$. Let $v_1$ be a unit length vector that maximizes $\|Av\|$, and let $\sigma_1 = \|Av_1\|$. Let $u_1 = \frac{Av_1}{\sigma_1}$ (if $\sigma_1 = 0$, then $u_1$ is an arbitrary unit length vector).
Using Gram-Schmidt orthogonalization, unitary matrices $V_1$ and $U_1$ can be determined such that the first columns of $V_1$ and $U_1$ respectively are $v_1$ and $u_1$: $V_1 = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}$ and $U_1 = \begin{bmatrix} u_1 & u_2 & \cdots & u_m \end{bmatrix}$.
Let $A' = U_1^* A V_1$, so that $A = U_1 A' V_1^*$ where $A'$ is again an $m \times n$ matrix. It will now be proven that the first column and row of $A'$ contain all 0s except for the (1,1) entry which contains $\sigma_1$:

$$A' = \begin{bmatrix} \sigma_1 & 0_{1 \times (n-1)} \\ 0_{(m-1) \times 1} & A'' \end{bmatrix}.$$
$A' e_1 = U_1^* A V_1 e_1 = U_1^* A v_1 = \sigma_1 U_1^* u_1 = \sigma_1 e_1$. This means that the first column of $A'$ is $\sigma_1 e_1$.
To show that the first row of $A'$ is $\begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \end{bmatrix}$, we will show that the first column of $A V_1$ is orthogonal to all of the other columns of $A V_1$. This will require exploiting the fact that $v_1$ maximizes $\|Av\|$ subject to the constraint $\|v\| = 1$.
Let $v(t)$ be a parameterized unit length vector. Let $f(t) = \|A v(t)\|^2 = \langle A v(t), A v(t) \rangle$.

Taking the derivative of the constraint $\langle v(t), v(t) \rangle = 1$ gives $2\operatorname{Re}\langle v'(t), v(t) \rangle = 0$. $f(t)$ being maximized at $t = 0$ gives:

$$f'(0) = 2\operatorname{Re}\langle A v'(0), A v(0) \rangle = 0.$$

($\operatorname{Re}$ and $\operatorname{Im}$ denote the real and imaginary components of a complex number respectively.)
Let $j \in \{2, 3, \ldots, n\}$ be arbitrary. Let $v(t) = \cos(t)\, v_1 + \sin(t)\, v_j$. $v(0) = v_1$ and $v'(0) = v_j$ are orthogonal, so $v(t)$ has unit length for every $t$. This gives:

$$\operatorname{Re}\langle A v_j, A v_1 \rangle = 0.$$
Now let $v(t) = \cos(t)\, v_1 + i \sin(t)\, v_j$ where the $i$ outside of the subscript is the imaginary constant. $v(0) = v_1$ and $v'(0) = i v_j$ are orthogonal. This gives:

$$\operatorname{Re}\langle A (i v_j), A v_1 \rangle = 0, \quad \text{which is equivalent to} \quad \operatorname{Im}\langle A v_j, A v_1 \rangle = 0.$$
Hence:

$$\langle A v_j, A v_1 \rangle = 0.$$
Therefore, the first column of $A V_1$ is orthogonal to all of the other columns of $A V_1$. Since multiplying by the unitary $U_1^*$ preserves inner products, the same holds for the columns of $A' = U_1^* A V_1$; combined with the fact that the first column of $A'$ is $\sigma_1 e_1$, this means $A'$ has the form:

$$A' = \begin{bmatrix} \sigma_1 & 0_{1 \times (n-1)} \\ 0_{(m-1) \times 1} & A'' \end{bmatrix}.$$
$A''$ is an $(m-1) \times (n-1)$ matrix, and therefore by an inductive argument, $A'' = U'' \Sigma'' (V'')^*$. Finally, $A = U \Sigma V^*$ where

$$U = U_1 \begin{bmatrix} 1 & 0_{1 \times (m-1)} \\ 0_{(m-1) \times 1} & U'' \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \sigma_1 & 0_{1 \times (n-1)} \\ 0_{(m-1) \times 1} & \Sigma'' \end{bmatrix}, \quad \text{and} \quad V = V_1 \begin{bmatrix} 1 & 0_{1 \times (n-1)} \\ 0_{(n-1) \times 1} & V'' \end{bmatrix}.$$
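As a sanity check on this assembly (a step that follows directly from the definitions above), the product of the three factors telescopes back to $A$, and $U$ and $V$ are unitary because each is a product of unitary matrices:

$$U \Sigma V^* = U_1 \begin{bmatrix} 1 & 0 \\ 0 & U'' \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \Sigma'' \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & (V'')^* \end{bmatrix} V_1^* = U_1 \begin{bmatrix} \sigma_1 & 0 \\ 0 & U'' \Sigma'' (V'')^* \end{bmatrix} V_1^* = U_1 A' V_1^* = A.$$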