Lecture 12 Basic Matrix Algebra
While the use of matrix notation may seem unnecessarily complicated at first, there are good reasons for learning about it:
Many results for linear models are much more easily derived and understood using matrix notation than without it.
Matrix formulation of linear models is the norm in some areas of science, so it will help you understand the literature.
In this lecture we will cover the necessary terminology and matrix algebra.
Even if you already have a good understanding of matrices, you will see how the various matrix operations are carried out in R.
You do not need to remember all the R commands presented here; you can always come back and look them up when you need them.
What you need to know before we start is that matrix algebra is the way that R does practically all the calculations needed for fitting linear models, no matter what kind of linear model we are fitting. The linear model work comes in the next lecture.
12.1 What is a Matrix?
A matrix is a collection of numbers arranged in a rectangular array.
Matrices are often denoted using capital letters, but notation is far from consistent due to changes in preference and typesetting technology.
If a matrix has r rows and c columns then we say the matrix is r by c, or \(r \times c\).
\[X = \left [ \begin{array}{cccccc} x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1c} \\ x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2c} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{ic} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ x_{r1} & x_{r2} & \cdots & x_{rj} & \cdots & x_{rc} \end{array} \right ] \label{eq:matrix1}\]
- \(x_{ij}\) denotes the i,jth element of the matrix X.
12.2 Types of matrices
12.2.1 Square Matrices
A matrix with the same number of rows as columns is called square.
Square matrices have some nice properties, such as possibly having an inverse (coming soon).
The matrix A defined by \[A = \left [ \begin{array}{cc} 1.74 & 4.11 \\ 3.11 & 3.16 \\ \end{array} \right ]\] is a square matrix. The main diagonal of this matrix contains the elements 1.74 and 3.16.
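In R (here storing the matrix as A.mat, a name chosen purely for illustration) the main diagonal can be extracted with diag():
A.mat = matrix(c(1.74, 4.11, 3.11, 3.16), nrow = 2, byrow = TRUE)
diag(A.mat) # elements on the main diagonal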
[1] 1.74 3.16
12.2.2 Symmetric Matrices
A matrix whose elements satisfy \(x_{ij} = x_{ji}\) for all i and j is called symmetric.
The matrix A defined by \[A = \left [ \begin{array}{ccc} 3 & -2 & 0 \\ -2 & 1 & 4 \\ 0 & 4 & 5 \\ \end{array} \right ]\] is a symmetric matrix.
12.2.3 Diagonal Matrices
A square matrix with zero entries everywhere away from the main (or leading) diagonal, which runs from top left to bottom right, is called diagonal.
The matrix A defined by \[A = \left [ \begin{array}{ccc} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 5 \\ \end{array} \right ]\] is a diagonal matrix.
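One way to construct this matrix in R is with diag(), which builds a diagonal matrix from a vector of diagonal entries (A.mat is again just an illustrative name):
A.mat = diag(c(3, 1, 5))
A.mat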
[,1] [,2] [,3]
[1,] 3 0 0
[2,] 0 1 0
[3,] 0 0 5
12.2.4 Identity Matrices
The matrix with ones along the main diagonal and zero elements everywhere else is called the identity matrix.
The identity matrix \(I_n\) is said to be of size (or order) n, with dimensions \(n \times n\).
- Identity matrices can be thought of as the “units” in matrix algebra, in that they play much the same role as the number 1 in standard multiplication.
The matrix \[I_3 = \left [ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} \right ]\] is the identity matrix of order 3.
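In R, calling diag() with a single whole number n returns the n by n identity matrix, for example:
diag(3) # the 3 x 3 identity matrix I_3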
12.2.5 Vectors
A matrix with a single column is a column vector.
A matrix with a single row is a row vector.
Vectors are typically assumed to be column vectors unless explicitly stated otherwise.
Vectors are a special type of matrix, and have their own notation. Specifically, vectors are usually denoted by a bold, lower case character, with elements specified by a single subscript.
R can store a single column or a single row as a matrix; an ordinary R vector has no notion of direction (row versus column).
The vector \[\boldsymbol{x} = \left [ \begin{array}{c} x_1 \\ x_2 \\ x_3 \\ \end{array} \right ] = \left [ \begin{array}{c} 1.4 \\ 0.5 \\ -0.3 \\ \end{array} \right ]\] is a column vector. The vector \(\boldsymbol{v} = (4.2, 5.0)\) is a row vector.
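If you want the row/column distinction to be explicit in R, you can store a vector as a one-column or one-row matrix; for example, mirroring the vectors above:
x = matrix(c(1.4, 0.5, -0.3), ncol = 1) # column vector (3 x 1 matrix)
v = matrix(c(4.2, 5.0), nrow = 1) # row vector (1 x 2 matrix)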
12.3 Matrix operations
12.3.1 Matrix Addition and Subtraction
Only matrices of the same size can be added or subtracted.
Addition and subtraction are then done element by element.
An example of addition: \[\left [ \begin{array}{cc} 3 & 4 \\ 2 & 2 \\ 4 & 1 \\ \end{array} \right ] + \left [ \begin{array}{cc} 1 & 6 \\ 4 & 2 \\ 1 & 2 \\ \end{array} \right ] = \left [ \begin{array}{cc} 4 & 10 \\ 6 & 4 \\ 5 & 3 \\ \end{array} \right ].\] An example of subtraction: \[\left [ \begin{array}{cc} 3 & 4 \\ 2 & 2 \\ 4 & 1 \\ \end{array} \right ] - \left [ \begin{array}{cc} 1 & 6 \\ 4 & 2 \\ 1 & 2 \\ \end{array} \right ] = \left [ \begin{array}{cc} 2 & -2 \\ -2 & 0 \\ 3 & -1 \\ \end{array} \right ].\]
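These calculations are easy to check in R: the ordinary + and - operators work element by element on matrices of the same size (the names M1 and M2 are just for illustration):
M1 = matrix(c(3, 4, 2, 2, 4, 1), nrow = 3, byrow = TRUE)
M2 = matrix(c(1, 6, 4, 2, 1, 2), nrow = 3, byrow = TRUE)
M1 + M2 # element-by-element addition
M1 - M2 # element-by-element subtraction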
12.3.2 Multiplication by a Scalar
Multiplication of a matrix by a scalar (i.e. a single number) is achieved by multiplying each element of the matrix by that scalar.
If \[A = \left [ \begin{array}{cc} 3 & 4 \\ 2 & 2 \\ 4 & 1 \\ \end{array} \right ]\] then \[4A = \left [ \begin{array}{cc} 12 & 16 \\ 8 & 8 \\ 16 & 4 \\ \end{array} \right ].\]
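In R, multiplying a matrix by a scalar with the ordinary * operator scales every element, for example:
A.mat = matrix(c(3, 4, 2, 2, 4, 1), nrow = 3, byrow = TRUE)
4 * A.mat # every element is multiplied by 4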
12.3.3 Matrix by Matrix Multiplication
For matrices A and B we can evaluate the product AB only if the number of columns of A is the same as the number of rows of B.
If the matrices are conformable in this way, then the i,jth element of the product C=AB is given by \[c_{ij} = \sum_{k} a_{ik} b_{kj}.\]
12.3.4 A Simple Matrix Product
Define \[A = \left [ \begin{array}{ccc} 2 & 1 & 3\\ 1 & 3 & 4 \\ \end{array} \right ]\] and \[B = \left [ \begin{array}{cc} 3 & 4 \\ 2 & 2 \\ 4 & 1 \\ \end{array} \right ]\]
Then \[AB = \left [ \begin{array}{ccc} 2 & 1 & 3\\ 1 & 3 & 4 \\ \end{array} \right ] \left [ \begin{array}{cc} 3 & 4 \\ 2 & 2 \\ 4 & 1 \\ \end{array} \right ] = \left [ \begin{array}{cc} 20 & 13 \\ 25 & 14 \\ \end{array} \right ].\]
A.mat = matrix(c(2, 1, 3, 1, 3, 4), nrow = 2, byrow = TRUE)
B.mat = matrix(c(3, 4, 2, 2, 4, 1), nrow = 3, byrow = TRUE)
A.mat %*% B.mat # do not use a single star here!
[,1] [,2]
[1,] 20 13
[2,] 25 14
N.B. You might like to see what happens if you use * instead of the correct operator %*% in this last calculation.
Consider the matrices from the previous example. We will now compute the product BA. We have
\[BA = \left [ \begin{array}{cc} 3 & 4 \\ 2 & 2 \\ 4 & 1 \\ \end{array} \right ] \left [ \begin{array}{ccc} 2 & 1 & 3\\ 1 & 3 & 4 \\ \end{array} \right ] = \left [ \begin{array}{ccc} 10 & 15 & 25 \\ 6 & 8 & 14 \\ 9 & 7 & 16 \\ \end{array} \right ]\]
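In R this is the same multiplication with the order of the matrices reversed, reusing A.mat and B.mat from above:
B.mat %*% A.mat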
[,1] [,2] [,3]
[1,] 10 15 25
[2,] 6 8 14
[3,] 9 7 16
Note that \(AB \ne BA\). In general, matrix multiplication is non-commutative; that is, the order matters.
12.3.5 Product of Matrix by Vector
This follows exactly the same rules as for matrix by matrix multiplication.
But if we use the single subscript notation for elements of vectors, then the equations look a bit different.
Let A be an \(r \times c\) matrix, and let \(\boldsymbol{v}\) be a column vector with c elements.
Then if \(A\boldsymbol{v} = \boldsymbol{x}\), we find that \(\boldsymbol{x}\) is also a column vector but with r elements, defined by \[x_i = \sum_{j=1}^c a_{ij}v_j.\]
12.3.6 A Simple Product
Let \(A = \left [ \begin{array}{ccc} 2 & 1 & 3\\ 1 & 3 & 4 \\ \end{array} \right ]\) and \(\boldsymbol{v} = \left [ \begin{array}{c} 4\\ 0\\ 2\\ \end{array} \right ]\)
Then
\[A\boldsymbol{v} = \left [ \begin{array}{ccc} 2 & 1 & 3\\ 1 & 3 & 4\\ \end{array} \right ] \left [ \begin{array}{c} 4\\ 0\\ 2\\ \end{array} \right ] = \left [ \begin{array}{c} 14\\ 12\\ \end{array} \right ] = \boldsymbol{x} \]
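A quick check in R, reusing A.mat from above; an ordinary R vector appearing on the right of %*% is treated as a column vector:
v = c(4, 0, 2)
A.mat %*% v # a 2 x 1 result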
12.4 Operations specific to matrices
12.4.1 Matrix Transpose
The transpose of a matrix is obtained by interchanging rows and columns.
- The transpose of \(\boldsymbol{A}\) is denoted \(\boldsymbol{A}^T\) or sometimes \(\boldsymbol{A}^t\) or \(\boldsymbol{A}^\prime\).
Using B as defined above, its transpose is given by \(\boldsymbol{B}^T = \left [ \begin{array}{ccc} 3 & 2 & 4\\ 4 & 2 & 1\\ \end{array} \right ]\).
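In R the transpose is computed with the function t():
t(B.mat)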
[,1] [,2] [,3]
[1,] 3 2 4
[2,] 4 2 1
12.4.2 Properties of Transposition
There are numerous matrix manipulations that are used by software to make calculations more efficient.
As an example, \((\boldsymbol{A} \boldsymbol{B})^T = \boldsymbol{B}^T \boldsymbol{A}^T\).
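This identity is easy to verify numerically with the matrices A.mat and B.mat defined earlier:
t(A.mat %*% B.mat) # transpose of the product
t(B.mat) %*% t(A.mat) # the same 2 x 2 matrix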
Matrix/vector transposition proves useful in computing sums of squares in statistics.
\(\boldsymbol{v}^T \boldsymbol{v} = \left [ v_1, v_2, \ldots, v_n \right ] \left [ \begin{array}{c} v_1 \\ v_2 \\ \vdots \\ v_n \\ \end{array} \right ] = v_1 \times v_1 + v_2 \times v_2 + \cdots + v_n \times v_n = \sum_{i=1}^{n} {v_i^2}\)
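In R this sum of squares can be computed in several equivalent ways (v here is just an illustrative numeric vector):
v = c(1.4, 0.5, -0.3)
t(v) %*% v # returns a 1 x 1 matrix
crossprod(v) # an efficient shorthand for t(v) %*% v
sum(v^2) # the same value as an ordinary number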
12.4.3 Matrix Inverse
For a given matrix A, the inverse matrix is denoted \(A^{-1}\) and satisfies \(A A^{-1} = I = A^{-1} A\).
Only square matrices can have inverses, and even then some square matrices have no inverse; such matrices are said to be singular.
12.4.4 A Simple Matrix and Its Inverse
Consider the matrix A defined by
\(A = \left [ \begin{array}{cc} 3 & 2 \\ 1 & 2 \\ \end{array} \right ]\)
The inverse of this matrix is \(A^{-1} = \left [ \begin{array}{cc} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{4} & \tfrac{3}{4} \\ \end{array} \right ]\)
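In R a matrix inverse is computed with solve() (note that A.mat is redefined here to hold the 2 by 2 matrix above):
A.mat = matrix(c(3, 2, 1, 2), nrow = 2, byrow = TRUE)
solve(A.mat) # the inverse of A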
[,1] [,2]
[1,] 0.50 -0.50
[2,] -0.25 0.75
As an exercise you should perform the matrix multiplication to confirm that
\[A A^{-1} = \left [ \begin{array}{cc} 3 & 2 \\ 1 & 2 \\ \end{array} \right ] \left [ \begin{array}{cc} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{4} & \tfrac{3}{4} \\ \end{array} \right ] = \left [ \begin{array}{cc} 1 & 0 \\ 0 & 1 \\ \end{array} \right ] = I\]
and similarly that
\[A^{-1} A = \left [ \begin{array}{cc} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{4} & \tfrac{3}{4} \\ \end{array} \right ] \left [ \begin{array}{cc} 3 & 2 \\ 1 & 2 \\ \end{array} \right ] = \left [ \begin{array}{cc} 1 & 0 \\ 0 & 1 \\ \end{array} \right ] = I.\]
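Both checks can also be done in R; the rounding simply tidies away any tiny floating-point errors that can appear in such products:
round(A.mat %*% solve(A.mat), 10) # should be the 2 x 2 identity matrix
round(solve(A.mat) %*% A.mat, 10)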
12.4.5 Inverse of a 2 by 2 Matrix
In general evaluation of a matrix inverse is a tedious matter that is best left to a computer. However, the calculation of the inverse of a 2 by 2 matrix is easy to do by hand.
The inverse of a 2 by 2 matrix is given by \(\left [ \begin{array}{cc} a & b \\ c & d \\ \end{array} \right ]^{-1} = \frac{1}{ad-bc} \left [ \begin{array}{cc} d & -b \\ -c & a \\ \end{array} \right ]\), provided that \(ad-bc \ne 0\); if \(ad-bc = 0\) the matrix is singular and has no inverse.
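As a rough sketch, this formula can be turned into a small R function (inv2 is a made-up name, not a built-in function):
inv2 = function(M) {
  det2 = M[1, 1] * M[2, 2] - M[1, 2] * M[2, 1] # this is ad - bc
  if (det2 == 0) stop("matrix is singular")
  matrix(c(M[2, 2], -M[1, 2], -M[2, 1], M[1, 1]), nrow = 2, byrow = TRUE) / det2
}
inv2(matrix(c(3, 2, 1, 2), nrow = 2, byrow = TRUE)) # agrees with solve()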