Linear Algebra/Matrices

Matrices and Linear Transformations

It turns out that linear transformations between finite-dimensional vector spaces can be represented in a one-to-one fashion by matrices, once bases have been chosen. This chapter will most likely be a review, as the topic has probably already been covered in high school. Establishing this one-to-one correspondence between linear transformations and matrices is very important in the study of linear transformations.

Suppose you have a set of basis vectors x₁, x₂, x₃, ..., x_m of a vector space X and basis vectors y₁, y₂, y₃, ..., y_n of a vector space Y.

Consider a linear transformation T from X to Y, and the vectors

T(x₁)=y₁a₁₁+y₂a₂₁+y₃a₃₁+...+y_na_n1,
T(x₂)=y₁a₁₂+y₂a₂₂+y₃a₃₂+...+y_na_n2,
T(x₃)=y₁a₁₃+y₂a₂₃+y₃a₃₃+...+y_na_n3,
...
T(x_m)=y₁a_1m+y₂a_2m+y₃a_3m+...+y_na_nm,

You can arrange these coefficients in a matrix

$M={\begin{pmatrix}a_{11}&a_{12}&a_{13}&\ldots &a_{1m}\\a_{21}&a_{22}&a_{23}&\ldots &a_{2m}\\a_{31}&a_{32}&a_{33}&\ldots &a_{3m}\\\vdots &\vdots &\vdots &\vdots &\vdots \\a_{n1}&a_{n2}&a_{n3}&\ldots &a_{nm}\\\end{pmatrix}}$ .

Thus, if you have any vector

$x=\sum _{j=1}^{m}x_{j}b_{j}$ ,

then

$T(x)=T(\sum _{j=1}^{m}x_{j}b_{j})=\sum _{j=1}^{m}b_{j}T(x_{j})=\sum _{j=1}^{m}b_{j}\sum _{i=1}^{n}y_{i}a_{ij}=\sum _{i=1}^{n}y_{i}\sum _{j=1}^{m}a_{ij}b_{j}$

Thus, T(x) is a linear combination of basis vectors

$\sum _{i=1}^{n}y_{i}c_{i}$ , where

$c_{i}=\sum _{j=1}^{m}a_{ij}b_{j}$ .

Thus knowing the matrix of a linear transformation with respect to the chosen bases is enough to determine the value of the transformation at any vector.

Conversely, given any n by m matrix with entries a_ij, there is a corresponding function that sends the vector with coordinates b_j to

$\sum _{i=1}^{n}y_{i}c_{i}$ , where

$c_{i}=\sum _{j=1}^{m}a_{ij}b_{j}$ .

This is obviously a linear operator, whose matrix coincides with the matrix used. This establishes the fact that every n by m matrix can determine a linear operator mapping an m dimensional vector space into an n dimensional vector space.
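The formula $c_{i}=\sum _{j=1}^{m}a_{ij}b_{j}$ can be sketched in a few lines of Python. The function name `apply_matrix` and the sample 2 by 3 matrix are assumptions made for illustration only. Vectors are represented by their coordinate lists with respect to the chosen bases.

```python
def apply_matrix(a, b):
    """Return c with c[i] = sum_j a[i][j] * b[j].

    a is an n-by-m matrix stored as a list of n rows; b holds the m
    coordinates of x with respect to the basis x_1, ..., x_m.
    """
    if any(len(row) != len(b) for row in a):
        raise ValueError("each row of a must have len(b) entries")
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]


# Hypothetical 2-by-3 matrix: a map from a 3-dimensional space
# into a 2-dimensional space.
M = [[1, 0, 2],
     [-1, 3, 1]]

# Feeding in the first basis vector returns the first column of M,
# i.e. the coordinates of T(x_1).
print(apply_matrix(M, [1, 0, 0]))  # [1, -1]
print(apply_matrix(M, [3, 2, 1]))  # [5, 4]
```

Note that column j of the matrix holds the coordinates of T(x_j), which is exactly how the coefficients were arranged above.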

Algebra of Transformations

Addition

Define the sum A+B of two linear transformations A and B from X to Y to be the function (A+B)(x)=A(x)+B(x). One can easily verify that this is also a linear transformation. You can also verify that, given linear transformations A, B, and C from X to Y,

A+B=B+A
(A+B)+C=A+(B+C)
A+0=A
A+(-A)=0

where 0 is the zero operator and -A is the function (-A)(x)=-(A(x)), which one can easily verify to be a linear transformation.

Scalar multiplication

Given a linear transformation A, define the function $\mu A$ , where $\mu$ is an element of the underlying field, by $(\mu A)(x)=\mu (A(x))$ .

You can easily verify that, given linear transformations A and B and elements $\mu$ , $\mu _{1}$ , and $\mu _{2}$ of the field,

$\mu _{1}(\mu _{2}A)=(\mu _{1}\mu _{2})A$
$1A=A$
$(\mu _{1}+\mu _{2})A=\mu _{1}A+\mu _{2}A$
$\mu (A+B)=\mu A+\mu B$

Together with the properties of addition listed above, this implies that the linear transformations from X to Y form a vector space.
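As a rough illustration, sums and scalar multiples of transformations can be modelled as Python functions acting on coordinate lists. The helper names `add` and `scale` and the two sample maps are hypothetical. The check below only tests one of the identities at a single vector, so it is a sanity check rather than a proof.

```python
def add(A, B):
    """Sum of two maps: (A + B)(x) = A(x) + B(x), coordinate by coordinate."""
    return lambda x: [p + q for p, q in zip(A(x), B(x))]


def scale(mu, A):
    """Scalar multiple of a map: (mu A)(x) = mu * A(x)."""
    return lambda x: [mu * p for p in A(x)]


# Two hypothetical linear maps from a 2-dimensional space to itself.
def A(x):
    return [x[0] + 2 * x[1], 3 * x[1]]


def B(x):
    return [-x[0], x[0] + x[1]]


x = [4, 5]
lhs = scale(3, add(A, B))(x)            # mu (A + B)
rhs = add(scale(3, A), scale(3, B))(x)  # mu A + mu B
print(lhs, rhs)  # [30, 72] [30, 72]
```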

Multiplication

Given a linear transformation B from X to Y and a linear transformation A from Y to Z, define the function AB from X to Z to be the composition of the two functions, (AB)(x)=A(B(x)), so that B is applied first. One can easily verify that this is also a linear transformation.

Here are some useful relations that can easily be verified:

$\mu (AB)=(\mu A)B$
$(A+B)C=AC+BC$
$C(A+B)=CA+CB$
$(AB)C=A(BC)$ .
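The relations above can be spot-checked with a small sketch, assuming that AB means "apply B first, then A", which is the convention that matches the matrix product used in this chapter. The helpers `compose` and `add` and the three sample maps are hypothetical. Evaluating at one vector is a sanity check only.

```python
def compose(A, B):
    """Product of maps: (AB)(x) = A(B(x)), so B is applied first."""
    return lambda x: A(B(x))


def add(A, B):
    """Sum of maps: (A + B)(x) = A(x) + B(x)."""
    return lambda x: [p + q for p, q in zip(A(x), B(x))]


# Three hypothetical linear maps on coordinate lists of length 2.
def A(x):
    return [x[0] + x[1], x[1]]


def B(x):
    return [2 * x[0], x[0] - x[1]]


def C(x):
    return [x[1], x[0]]


x = [3, 7]

# (A + B)C = AC + BC
dist_left = compose(add(A, B), C)(x)
dist_right = add(compose(A, C), compose(B, C))(x)

# (AB)C = A(BC)
assoc_left = compose(compose(A, B), C)(x)
assoc_right = compose(A, compose(B, C))(x)

print(dist_left, dist_right)    # [24, 7] [24, 7]
print(assoc_left, assoc_right)  # [18, 4] [18, 4]
```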

Corresponding algebra of matrices

Since there is a one-to-one correspondence between linear transformations from m-dimensional spaces to n-dimensional spaces and n-by-m matrices (once bases are fixed), addition, scalar multiplication, and multiplication of matrices are defined through this correspondence, and all properties stated above hold for matrices. For instance, the sum of two matrices M and N is defined as the matrix that corresponds to the sum of the linear transformations corresponding to M and N. The other operations are defined similarly.

Addition

Let A = |a_ij| and B = |b_ij| be two matrices of dimension n by m. Consider A and B which are the corresponding linear transformations from an m-dimensional vector space M to an n-dimensional vector space N. Let m₁, m₂, m₃, ..., m_m, be basis vectors of M and n₁, n₂, n₃, ..., n_n be basis vectors of N. Then

$A(m_{i})=\sum _{j=1}^{n}a_{ji}n_{j}$ , and $B(m_{i})=\sum _{j=1}^{n}b_{ji}n_{j}$ .

Thus

$(A+B)(m_{i})=\sum _{j=1}^{n}(a_{ji}+b_{ji})n_{j}$

so the matrix of this operator has entries |a_ij+b_ij|. In other words, the entries of the sum of two matrices are the sums of the corresponding entries of the two matrices.

Examples:

${\begin{pmatrix}1&3\\1&0\\1&2\end{pmatrix}}+{\begin{pmatrix}0&0\\7&5\\2&1\end{pmatrix}}={\begin{pmatrix}1+0&3+0\\1+7&0+5\\1+2&2+1\end{pmatrix}}={\begin{pmatrix}1&3\\8&5\\3&3\end{pmatrix}}$

Once addition is defined we have obviously also defined subtraction. A - B is computed by subtracting corresponding elements of A and B, and has the same dimensions as A and B. For example:

${\begin{pmatrix}1&3\\1&0\\1&2\end{pmatrix}}-{\begin{pmatrix}0&0\\7&5\\2&1\end{pmatrix}}={\begin{pmatrix}1-0&3-0\\1-7&0-5\\1-2&2-1\end{pmatrix}}={\begin{pmatrix}1&3\\-6&-5\\-1&1\end{pmatrix}}$
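The two worked examples can be reproduced with a short sketch in which a matrix is stored as a list of rows. The function names `mat_add` and `mat_sub` are chosen here for illustration.

```python
def _same_shape(A, B):
    """True when A and B have the same number of rows and columns."""
    return len(A) == len(B) and all(len(r) == len(s) for r, s in zip(A, B))


def mat_add(A, B):
    """Entrywise sum of two matrices of equal dimensions."""
    if not _same_shape(A, B):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]


def mat_sub(A, B):
    """Entrywise difference of two matrices of equal dimensions."""
    if not _same_shape(A, B):
        raise ValueError("matrices must have the same dimensions")
    return [[a - b for a, b in zip(r, s)] for r, s in zip(A, B)]


A = [[1, 3], [1, 0], [1, 2]]
B = [[0, 0], [7, 5], [2, 1]]

print(mat_add(A, B))  # [[1, 3], [8, 5], [3, 3]]
print(mat_sub(A, B))  # [[1, 3], [-6, -5], [-1, 1]]
```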

Scalar multiplication

Scalar multiplication of matrices is defined through the same correspondence: the product of a scalar and a matrix is the matrix of the scalar multiple of the corresponding linear transformation.

Consider a matrix A with entries |a_ij| and its corresponding linear transformation A from M to N, and an element of a field $\mu$ , and let m₁, m₂, m₃, ..., m_m, be basis vectors of M and n₁, n₂, n₃, ..., n_n be basis vectors of N. Since

$(\mu A)(m_{j})=\mu \sum _{i=1}^{n}a_{ij}n_{i}=\sum _{i=1}^{n}\mu a_{ij}n_{i}$ , the corresponding matrix has entries | $\mu$ a_ij|.

For example, multiplication by 2 of a matrix:

$2\cdot {\begin{pmatrix}1&8&-3\\4&-2&5\end{pmatrix}}={\begin{pmatrix}2\cdot 1&2\cdot 8&2\cdot -3\\2\cdot 4&2\cdot -2&2\cdot 5\end{pmatrix}}={\begin{pmatrix}2&16&-6\\8&-4&10\end{pmatrix}}$

Scalar multiplication has the following properties, which follow from the one-to-one correspondence with linear transformations:

Left distributivity: (α+β)A = αA+βA.
Right distributivity: α(A+B) = αA+αB.
Associativity: (αβ)A = α(βA).
1A = A.
0A= 0.
(-1)A = -A.
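A minimal sketch of scalar multiplication on a list-of-rows matrix follows. The name `scalar_mul` is an assumption for illustration. The property checks use one sample matrix and sample scalars, so they illustrate the identities without proving them.

```python
def scalar_mul(mu, A):
    """Multiply every entry of the matrix A by the scalar mu."""
    return [[mu * a for a in row] for row in A]


A = [[1, 8, -3],
     [4, -2, 5]]

print(scalar_mul(2, A))  # [[2, 16, -6], [8, -4, 10]]

# Associativity: (alpha beta) A = alpha (beta A), here with alpha = 2, beta = 3.
print(scalar_mul(2 * 3, A) == scalar_mul(2, scalar_mul(3, A)))  # True

# 1A = A and 0A = 0.
print(scalar_mul(1, A) == A)  # True
print(scalar_mul(0, A))       # [[0, 0, 0], [0, 0, 0]]
```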

Matrix multiplication

As above, matrix multiplication is defined through the correspondence with linear transformations: the product of two matrices is the matrix of the product of the two corresponding linear transformations.

Consider an o by n matrix A with entries |a_ij| and an n by m matrix B with entries |b_ij|. Let A be the linear transformation from n-dimensional N to o-dimensional O that corresponds to A, and let B be the linear transformation from m-dimensional M to n-dimensional N that corresponds to B. Let m₁, m₂, m₃, ..., m_m be basis vectors of M, n₁, n₂, n₃, ..., n_n be basis vectors of N, and o₁, o₂, o₃, ..., o_o be basis vectors of O. Then

$(AB)(m_{i})=A(\sum _{j=1}^{n}b_{ji}n_{j})=\sum _{j=1}^{n}b_{ji}A(n_{j})=\sum _{j=1}^{n}b_{ji}\sum _{k=1}^{o}a_{kj}o_{k}=\sum _{k=1}^{o}(\sum _{j=1}^{n}a_{kj}b_{ji})o_{k}$

Thus the corresponding matrix has entries |p_ij| that are given by:

$p_{ij}=\sum _{k=1}^{n}a_{ik}b_{kj}$

For example:

${\begin{pmatrix}1&0&2\\-1&3&1\\\end{pmatrix}}\times {\begin{pmatrix}3&1\\2&1\\1&0\\\end{pmatrix}}={\begin{pmatrix}(1\times 3+0\times 2+2\times 1)&(1\times 1+0\times 1+2\times 0)\\(-1\times 3+3\times 2+1\times 1)&(-1\times 1+3\times 1+1\times 0)\\\end{pmatrix}}={\begin{pmatrix}5&1\\4&2\\\end{pmatrix}}$ .
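The entry formula $p_{ij}=\sum _{k=1}^{n}a_{ik}b_{kj}$ translates almost directly into code. The following is a minimal sketch with matrices stored as lists of rows. The name `mat_mul` is an assumption for illustration, and indices start at 0 rather than 1.

```python
def mat_mul(A, B):
    """Return the product P with P[i][j] = sum_k A[i][k] * B[k][j].

    A is o-by-n and B is n-by-m, so the result is o-by-m.
    """
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    m = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(m)]
            for i in range(len(A))]


A = [[1, 0, 2],
     [-1, 3, 1]]
B = [[3, 1],
     [2, 1],
     [1, 0]]

print(mat_mul(A, B))  # [[5, 1], [4, 2]]
```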

Matrix multiplication has the following properties, which have been verified due to the fact that they are also true of linear transformations.

Associativity: A(BC) = (AB)C.
Left distributivity: A(B+C) = AB+AC.
Right distributivity: (A+B)C = AC+BC.
IA = A = AI.
α(BC) = (αB)C = B(αC).

Matrix multiplication is in general not commutative, i.e. there exist matrices for which AB $\neq$ BA. An example is given by $A={\begin{pmatrix}7&8\\9&10\\11&12\end{pmatrix}}$ and $B={\begin{pmatrix}1&2&3\\4&5&6\end{pmatrix}}$ : here AB is a 3 by 3 matrix while BA is a 2 by 2 matrix, so the two products cannot be equal.
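The two products for this example can be computed with a short sketch. The helper `mat_mul` is an illustrative name. The second pair of matrices, P and Q, is an extra example that does not come from the text, added to show that the products can differ even when both are square matrices of the same size.

```python
def mat_mul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[7, 8], [9, 10], [11, 12]]
B = [[1, 2, 3], [4, 5, 6]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)  # [[39, 54, 69], [49, 68, 87], [59, 82, 105]]
print(BA)  # [[58, 64], [139, 154]]

# Extra illustration (not from the text): two 2-by-2 matrices.
P = [[0, 1], [0, 0]]
Q = [[1, 0], [0, 0]]
print(mat_mul(P, Q))  # [[0, 0], [0, 0]]
print(mat_mul(Q, P))  # [[0, 1], [0, 0]]
```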

The way matrix multiplication is defined may seem strange at first; why can matrix multiplication not be defined as just multiplying corresponding entries, as in the case of addition and scalar multiplication? Unfortunately the full answer will be available to us only later on, in Chapter 3. In the meantime we will satisfy ourselves by noting the advantage that matrix multiplication gives us in representing a linear system in matrix form. This will be clear in the following section.

At this point we see fit to make another definition. An n by n matrix A is invertible if and only if there exists a matrix B such that

AB = I_n = BA.

In this case, B is the inverse matrix of A, denoted by A⁻¹. Clearly the inverse of the identity matrix is itself. We will study invertible matrices in detail later.
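To make the definition concrete, the following sketch checks both conditions AB = I and BA = I for one hypothetical pair of 2 by 2 matrices. The matrices and the helper `mat_mul` are assumptions chosen for illustration. The sketch only verifies a given candidate inverse and does not compute one.

```python
def mat_mul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


I2 = [[1, 0], [0, 1]]   # 2-by-2 identity matrix
A = [[2, 1], [1, 1]]
B = [[1, -1], [-1, 2]]  # candidate inverse of A

print(mat_mul(A, B) == I2)  # True
print(mat_mul(B, A) == I2)  # True
```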

One more point should be noted here. The product obtained by simply multiplying the corresponding entries of two matrices of equal dimensions also has a name: it is called the Hadamard product. We shall not use this kind of multiplication. Throughout the book, matrix multiplication will always refer to the matrix product defined above.
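The difference between the two products is easy to see on a small example. The sketch below uses hypothetical helper names `hadamard` and `mat_mul` and two sample 2 by 2 matrices that are not taken from the text.

```python
def hadamard(A, B):
    """Entrywise (Hadamard) product of two matrices of equal dimensions."""
    return [[a * b for a, b in zip(r, s)] for r, s in zip(A, B)]


def mat_mul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(hadamard(A, B))  # [[5, 12], [21, 32]]
print(mat_mul(A, B))   # [[19, 22], [43, 50]]
```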

Determinants of Products of Matrices (Binet's Theorem)

In addition, the determinant is a multiplicative map in the sense that

$\det(AB)=\det(A)\det(B)$

for all n-by-n matrices A and B.

This is generalized by the Cauchy-Binet formula to products of non-square matrices.
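For 2 by 2 matrices the multiplicative property can be checked numerically, using the standard formula ad - bc for the determinant of a 2 by 2 matrix. The helper names `det2` and `mat_mul` and the sample matrices are assumptions for illustration. The check covers a single pair of matrices and is not a proof.

```python
def det2(M):
    """Determinant of a 2-by-2 matrix [[a, b], [c, d]], namely a*d - b*c."""
    (a, b), (c, d) = M
    return a * d - b * c


def mat_mul(A, B):
    """Matrix product: entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]

print(det2(A), det2(B))        # -2 -5
print(det2(mat_mul(A, B)))     # 10
print(det2(A) * det2(B))       # 10
```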

Matrices and system of linear equations

The concept of a matrix was historically introduced to simplify the solution of linear systems, although matrices today have much wider applications. Let's see how a linear system can be represented using a matrix.

Consider a general system of m linear equations with n unknowns:

${\begin{alignedat}{7}a_{11}x_{1}&&\;+\;&&a_{12}x_{2}&&\;+\cdots +\;&&a_{1n}x_{n}&&\;=\;&&&b_{1}\\a_{21}x_{1}&&\;+\;&&a_{22}x_{2}&&\;+\cdots +\;&&a_{2n}x_{n}&&\;=\;&&&b_{2}\\\vdots \;\;\;&&&&\vdots \;\;\;&&&&\vdots \;\;\;&&&&&\;\vdots \\a_{m1}x_{1}&&\;+\;&&a_{m2}x_{2}&&\;+\cdots +\;&&a_{mn}x_{n}&&\;=\;&&&b_{m}\\\end{alignedat}}$

The system is equivalent to a matrix equation of the form

$A\mathbf {x} =\mathbf {b}$

where A is an m×n matrix, x is a column matrix with n entries, and b is a column matrix with m entries.

$A={\begin{bmatrix}a_{11}&a_{12}&\cdots &a_{1n}\\a_{21}&a_{22}&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}&a_{m2}&\cdots &a_{mn}\end{bmatrix}},\quad \mathbf {x} ={\begin{bmatrix}x_{1}\\x_{2}\\\vdots \\x_{n}\end{bmatrix}},\quad \mathbf {b} ={\begin{bmatrix}b_{1}\\b_{2}\\\vdots \\b_{m}\end{bmatrix}}$

This representation relies on our definition of matrix multiplication: the i-th entry of the product Ax is a_i1 x_1 + a_i2 x_2 + ... + a_in x_n, which is exactly the left-hand side of the i-th equation, so the product of the matrix A and the matrix x gives us precisely the matrix b.
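As a small illustration, consider the hypothetical system x_1 + 2x_2 + x_3 = 8, 3x_1 - x_2 = 1, which has m = 2 equations and n = 3 unknowns. The sketch below stores its coefficients in a matrix and checks that a candidate solution satisfies Ax = b. The system, the candidate, and the helper name `mat_vec` are assumptions made for this example.

```python
def mat_vec(A, x):
    """Product of an m-by-n matrix A with a column of n entries x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have len(x) entries")
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


# Coefficient matrix and right-hand side of the sample system.
A = [[1, 2, 1],
     [3, -1, 0]]
b = [8, 1]

x = [1, 2, 3]  # candidate solution

print(mat_vec(A, x))       # [8, 1]
print(mat_vec(A, x) == b)  # True
```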

Representing linear systems in this fashion also enables us to easily prove the following theorem:

Theorem 1: Any system of linear equations has either no solution, exactly one solution or infinitely many solutions.

Proof: Suppose a linear system Ax = b has two different solutions X and Y, and let Z = X - Y. Then Z is nonzero, and A(X + kZ) = AX + kAZ = b + k(AX - AY) = b + k(b - b) = b, so X + kZ is a solution of the system for every scalar k. Since Z is nonzero, different values of k give different solutions, and since k can assume infinitely many values (for example, when the scalars are real numbers), the system has infinitely many solutions. Hence a system with more than one solution has infinitely many.
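The construction in the proof can be traced on a hypothetical system with two known solutions. The system below, the two solutions X and Y, and the helper name `mat_vec` are assumptions chosen for illustration. The code checks that X + kZ solves the system for several integer values of k.

```python
def mat_vec(A, x):
    """Product of a matrix A (list of rows) with a column x."""
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


# Sample system: x_1 + x_2 = 3 and 2x_1 + 2x_2 = 6.
A = [[1, 1],
     [2, 2]]
b = [3, 6]

X = [1, 2]  # one solution
Y = [3, 0]  # a different solution
Z = [p - q for p, q in zip(X, Y)]  # Z = X - Y, nonzero

# X + kZ for k = -3, ..., 3
solutions = [[x_i + k * z_i for x_i, z_i in zip(X, Z)] for k in range(-3, 4)]
all_solve = all(mat_vec(A, s) == b for s in solutions)

print(Z)          # [-2, 2]
print(all_solve)  # True
```

Each value of k yields a different vector because Z is nonzero, which is the step that turns two solutions into infinitely many.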

Exercises

Hints to many of the exercises can be found at Famous Theorems of Mathematics/Algebra/Matrix Theory.

1. Let A and B be m × n matrices and k a scalar. Then:

(i) $(kA)^{T}=kA^{T}$

(ii) $(A+B)^{T}=A^{T}+B^{T}$

(iii) $(AB)^{T}=B^{T}A^{T}$ (whenever the product AB is defined)

2. Let a triangular matrix be a square matrix whose (i,j) entries are all zero either for i<j (in which case it is called a lower triangular matrix) or for j<i (in which case it is called an upper triangular matrix). Show that any triangular matrix satisfying $AA^{T}=A^{T}A$ is a diagonal matrix.

3. For a square matrix A show that:

(i) $AA^{T}$ and $A+A^{T}$ are symmetric

(ii) $A-A^{T}$ is skew symmetric

(iii) A can be expressed as the sum of a symmetric matrix, ${\frac {1}{2}}(A+A^{T})$ , and a skew symmetric matrix, ${\frac {1}{2}}(A-A^{T})$

4. Suppose A is an m×n matrix and x is an n×1 column vector. Show that if $x={\begin{pmatrix}x_{1}\\x_{2}\\\vdots \\x_{n}\end{pmatrix}}$ and $A={\begin{pmatrix}c_{1}&c_{2}&\cdots &c_{n}\end{pmatrix}}$ where $c_{j}={\begin{pmatrix}A_{1j}\\A_{2j}\\\vdots \\A_{mj}\end{pmatrix}}$ then $Ax=x_{1}c_{1}+x_{2}c_{2}+\cdots +x_{n}c_{n}$ . This is also expressed by saying that Ax is a linear combination of the columns of A.
