
Matrices

 

Matrix Arithmetic: Having considered the rings and fields $Z_n$, we are now going to consider a different algebraic structure, matrices. A matrix is a rectangular array of numbers, e.g. $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, where a, b, c, and d are numbers. While this example has two rows and two columns, you can build matrices with any number of rows and columns. If a matrix has n rows and k columns, we say the shape of the matrix is n×k. We will stick to the 2×2 case for the next day or two. The rules for manipulating 2×2 matrices are as follows.

 

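Addition: $\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} a+e & b+f \\ c+g & d+h \end{pmatrix}$.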

 

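Multiplication: $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$.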

 

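We can check that with these rules, 2×2 matrices satisfy field laws 1-4, 5, 7, and 9. Laws 1-4 concern addition, and since matrix addition is the same as regular addition in each position, it is easy to check that it satisfies the same rules. The associative law of multiplication isn't as obvious, but multiplying things out shows that law 5 is satisfied as well. The commutative law of multiplication, law 6, fails for matrices. For example, $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ but $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}$. There is an identity matrix, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, so law 7 is satisfied. The rule for multiplicative inverses is $\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. This only makes sense for matrices such that $ad - bc \neq 0$, and if $ad - bc = 0$ then the matrix doesn't have an inverse, so law 8 fails. Finally, we can check that the distributive law works. Since the commutative law fails, we want to check two different situations. If A, B, and C are matrices, then A(B + C) = AB + AC and (B + C)A = BA + CA.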
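The rules above can be tried out in a few lines of Python. This is only a sketch: storing a 2×2 matrix as a pair of rows `((a, b), (c, d))` and the function names `add`, `mul`, and `inverse` are choices made for this illustration, not notation from the notes.

```python
from fractions import Fraction

# Assumption of this sketch: a 2x2 matrix is a tuple of two rows, ((a, b), (c, d)).

def add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(m[i][j] + n[i][j] for j in range(2)) for i in range(2))

def mul(m, n):
    """Row-by-column product of two 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(m):
    """Inverse from the ad - bc rule; fails when ad - bc = 0."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("ad - bc = 0, so the matrix has no inverse")
    det = Fraction(det)  # exact arithmetic instead of floating point
    return ((d / det, -b / det), (-c / det, a / det))

I = ((1, 0), (0, 1))
P = ((1, 1), (0, 1))
Q = ((1, 0), (1, 1))

print(mul(P, Q))                # ((2, 1), (1, 1))
print(mul(Q, P))                # ((1, 1), (1, 2)), so law 6 fails
print(mul(P, inverse(P)) == I)  # True
# Both forms of the distributive law hold for this sample.
print(mul(P, add(Q, I)) == add(mul(P, Q), mul(P, I)))  # True
print(mul(add(Q, I), P) == add(mul(Q, P), mul(I, P)))  # True
```

Calling `inverse(((1, 2), (2, 4)))` raises an error, since ad - bc = 0 for that matrix.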

 

Non-Commutativity and Factoring: Consider a matrix-valued equation in which the unknown matrix x appears in two terms such as $Ax + xB$, where A and B are known matrices.

It is tempting to rewrite the x terms as $(A + B)x$, but this is not justified. Since matrix multiplication is non-commutative, we can’t factor out an x from the right side of one term and the left side of the other term. The distributive law only tells us that we can do this when we have the same term on the same side. You may recall that when we first mentioned factoring rules for $Z_n$, I noted that we needed the commutative law for factoring. Because of issues like this, dealing with polynomial equations in matrix rings can be complicated. In this class, we will stick to linear equations when we are dealing with matrices.
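A small numerical check makes the factoring problem concrete. The matrices `A`, `B`, and `X` below are hypothetical values chosen only for this sketch, and the tuple-of-rows layout is likewise an assumption of the example.

```python
# Hypothetical 2x2 matrices, stored as tuples of rows, chosen only for illustration.

def add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(m[i][j] + n[i][j] for j in range(2)) for i in range(2))

def mul(m, n):
    """Row-by-column product of two 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 2), (0, 1))
B = ((0, 1), (1, 0))
X = ((1, 0), (2, 1))

mixed = add(mul(A, X), mul(X, B))      # X on the right of A, on the left of B
same_side = add(mul(A, X), mul(B, X))  # X on the right in both terms
factored = mul(add(A, B), X)           # (A + B)X

print(mixed == factored)      # False: factoring across sides is not valid
print(same_side == factored)  # True: the distributive law applies
```

Only the expression with X on the same side of both terms agrees with the factored form.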

 

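Linear Equations: An application of matrix algebra is in the solution of simultaneous linear equations. Suppose we want to solve the system of equations 3x + y = 5, x - 3y = 7. We can rewrite these two equations as a single matrix equation $\begin{pmatrix} 3 & 1 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5 \\ 7 \end{pmatrix}$. We solve this equation by multiplying both sides of the equation on the left by $\begin{pmatrix} 3 & 1 \\ 1 & -3 \end{pmatrix}^{-1} = \begin{pmatrix} 0.3 & 0.1 \\ 0.1 & -0.3 \end{pmatrix}$, which gives $\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0.3 & 0.1 \\ 0.1 & -0.3 \end{pmatrix} \begin{pmatrix} 5 \\ 7 \end{pmatrix} = \begin{pmatrix} 2.2 \\ -1.6 \end{pmatrix}$, so the solution to the system of equations is x = 2.2 and y = -1.6.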
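The calculation for the system 3x + y = 5, x - 3y = 7 can be reproduced with exact fractions. The helper name `solve_2x2` is an assumption of this sketch; it simply applies the ad - bc inverse rule and multiplies the right-hand side on the left by the inverse.

```python
from fractions import Fraction

def solve_2x2(m, rhs):
    """Solve m * (x, y) = rhs by multiplying on the left by the inverse of m."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("ad - bc = 0, so the matrix has no inverse")
    inv = ((d / det, -b / det), (-c / det, a / det))
    return tuple(inv[i][0] * rhs[0] + inv[i][1] * rhs[1] for i in range(2))

x, y = solve_2x2(((3, 1), (1, -3)), (5, 7))
print(x, y)                # 11/5 -8/5
print(float(x), float(y))  # 2.2 -1.6
```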

 

For the linear equation example above, the only change a non-commutative ring makes is that it is not enough to multiply both sides of the equation by the inverse matrix; we must multiply both sides by the same matrix on the same side (here, on the left). In solving linear equations as in the example above, we will find a unique solution if our matrix has an inverse, and we will have either 0 or many (in this case infinitely many) solutions if our matrix doesn't have an inverse. This is exactly the same situation as we encountered with $Z_n$. Next we will step outside the ring of 2×2 matrices to discuss a practical situation where we need to find an approximate solution by finding an approximate inverse.
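The three possible outcomes for a 2×2 system over the real numbers can be sketched as follows. The function name `count_solutions` and the two singular systems are made up for this illustration; only the first system comes from the notes.

```python
def count_solutions(m, rhs):
    """Classify the real solutions of m * (x, y) = rhs for a 2x2 matrix m."""
    (a, b), (c, d) = m
    e, f = rhs
    if a * d - b * c != 0:
        return "unique"  # the matrix has an inverse
    if (a, b, c, d) == (0, 0, 0, 0):
        # The zero matrix sends everything to (0, 0).
        return "infinitely many" if (e, f) == (0, 0) else "none"
    # No inverse: the two equations describe parallel or identical lines.
    # They are consistent exactly when the right-hand side is in the same proportion.
    consistent = a * f - c * e == 0 and b * f - d * e == 0
    return "infinitely many" if consistent else "none"

print(count_solutions(((3, 1), (1, -3)), (5, 7)))  # unique
print(count_solutions(((1, 2), (2, 4)), (3, 6)))   # infinitely many
print(count_solutions(((1, 2), (2, 4)), (3, 7)))   # none
```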

 

Linear Regression: Suppose we want to fit a line y = ax + b to the data

 

 x     y
 3    14
 4    20
 6    27
 8    41
12    63
15    73

 

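We could write this as a matrix equation

$$\begin{pmatrix} 3 & 1 \\ 4 & 1 \\ 6 & 1 \\ 8 & 1 \\ 12 & 1 \\ 15 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} 14 \\ 20 \\ 27 \\ 41 \\ 63 \\ 73 \end{pmatrix}.$$

Unfortunately, this equation has no solution. This is not surprising, since we are trying to find a single line that passes through 6 different points. In general, we will write our matrix equation as $A\mathbf{a} = \mathbf{y}$ where

$$A = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix}, \qquad \mathbf{a} = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix},$$

and the $x_i$ and $y_i$ are the data values. As long as we have more than two data points, we probably won't be able to find an exact solution. Algebraically, this corresponds to the fact that A doesn't have an inverse matrix, since it isn't square. Since we can't find an exact solution, we will try to find an approximate solution. Geometrically, since there is no line that passes through all our data points, we will find a line that comes close to all the points. We measure the error of our line using the sum of squared error, $\mathrm{SSE} = \sum_{i=1}^{n} (a x_i + b - y_i)^2$. To minimize SSE, we differentiate with respect to both variables a and b and set the results equal to 0. This will give us the following two equations

$$\frac{\partial\,\mathrm{SSE}}{\partial a} = \sum_{i=1}^{n} 2 x_i (a x_i + b - y_i) = 0, \qquad \frac{\partial\,\mathrm{SSE}}{\partial b} = \sum_{i=1}^{n} 2 (a x_i + b - y_i) = 0.$$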

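Collecting the a and b terms together in these equations gives us the system of two equations in two unknowns (a and b)

$$a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i$$

$$a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i$$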

Recalling our matrix equation was $A\mathbf{a} = \mathbf{y}$, we observe that this pair of equations can be written in the form $A^T A \mathbf{a} = A^T \mathbf{y}$, where $A^T$ is the transpose matrix formed by flipping A about its main diagonal, $A^T = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{pmatrix}$. We now solve the equation by multiplying both sides on the left by $(A^T A)^{-1}$ to get $\mathbf{a} = (A^T A)^{-1} A^T \mathbf{y}$. For our initial example, this gives a = 559/110 and b = -163/165, and you can check that the line y = (559/110)x - 163/165 does pass very near all the data points (a graphing calculator can be quite useful here).
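The values of a and b quoted above can be checked with exact arithmetic. The sketch below builds the 2×2 system from the sums of the data and applies the ad - bc inverse rule; the variable names (`sxx`, `sx`, `sxy`, `sy`) are assumptions of this example.

```python
from fractions import Fraction

xs = [3, 4, 6, 8, 12, 15]
ys = [14, 20, 27, 41, 63, 73]

n = len(xs)
sxx = sum(x * x for x in xs)             # sum of x_i^2
sx = sum(xs)                             # sum of x_i
sxy = sum(x * y for x, y in zip(xs, ys)) # sum of x_i * y_i
sy = sum(ys)                             # sum of y_i

# The 2x2 system is [[sxx, sx], [sx, n]] * (a, b) = (sxy, sy).
det = Fraction(sxx * n - sx * sx)
a = (n * sxy - sx * sy) / det
b = (sxx * sy - sx * sxy) / det
print(a, b)  # 559/110 -163/165

# Compare the fitted line with the data, point by point.
for x, y in zip(xs, ys):
    print(x, y, float(a * x + b))
```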

 

Pseudoinverses: The above process is called linear regression. Your students will use linear regression to fit lines to data in their science classes, so it is nice to see how this is an application of the matrices we've been studying. But it would be nicer if we had some tie to the algebra of matrices, this being an algebra class and all. Fortunately, there is such a connection. We have avoided non-square matrices, because the field laws don't apply to them. We can't even define addition and multiplication between arbitrary non-square matrices, since the shapes may not match. But while the non-square matrix A can't have an inverse, it does have what is called a pseudoinverse. A pseudoinverse is a matrix B so that BAB = B and ABA = A. The pseudoinverse of an n×2 matrix A (with n > 2) is $B = (A^T A)^{-1} A^T$, provided $A^T A$ has an inverse. So our approximate solution to $A\mathbf{a} = \mathbf{y}$ is $\mathbf{a} = B\mathbf{y}$, where B is the pseudoinverse of A. In other words, we find an approximate solution by multiplying through by an approximate inverse.
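The two defining identities can be verified directly for the regression data. This is a sketch using lists of rows and exact fractions; the helper names `matmul`, `transpose`, and `inverse_2x2` are assumptions of the example.

```python
from fractions import Fraction

def matmul(m, n):
    """Product of an r x s matrix and an s x t matrix, both lists of rows."""
    return [
        [sum(m[i][k] * n[k][j] for k in range(len(n))) for j in range(len(n[0]))]
        for i in range(len(m))
    ]

def transpose(m):
    """Flip a matrix about its main diagonal."""
    return [list(row) for row in zip(*m)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix from the ad - bc rule."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

xs = [3, 4, 6, 8, 12, 15]
ys = [14, 20, 27, 41, 63, 73]

A = [[x, 1] for x in xs]                    # 6 x 2
At = transpose(A)                           # 2 x 6
B = matmul(inverse_2x2(matmul(At, A)), At)  # pseudoinverse, 2 x 6

print(matmul(matmul(B, A), B) == B)  # True
print(matmul(matmul(A, B), A) == A)  # True
coeffs = matmul(B, [[y] for y in ys])
print(coeffs[0][0], coeffs[1][0])    # 559/110 -163/165
```

Note that BA is the 2×2 identity matrix here, while AB is a 6×6 matrix that is not the identity, which is why B is only a one-sided, approximate inverse.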

 

The same ideas we've used for linear regression apply to other forms than y = ax + b. For example, if we want to fit a quadratic curve $y = ax^2 + bx + c$ to a set of n data points, we could write this as a matrix equation for an n×3 matrix in the same fashion we used an n×2 matrix for linear regression. We would use the pseudoinverse to find the values of a, b, and c that minimize SSE in the exact same way. Because of these applications some advanced statistics classes deal with the algebra of pseudoinverses. For this class, we will just leave them as an example of what can be done in more complicated algebraic structures, and as an example of the applications of matrices to a standard topic in secondary mathematics.
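A rough sketch of the quadratic case follows. The names `solve` and `fit_polynomial` are assumptions of this example, and the five sample points are hypothetical: they were chosen to lie exactly on y = 2x^2 - 3x + 1 so that the fit can be checked by eye. Instead of writing out a 3×3 inverse formula, the sketch solves the system formed from A^T A and A^T y by elimination, which gives the same answer as multiplying by the pseudoinverse.

```python
from fractions import Fraction

def solve(m, rhs):
    """Solve the square system m * v = rhs by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(m, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("A^T A has no inverse")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients, highest power first, from (A^T A) a = A^T y."""
    A = [[x ** p for p in range(degree, -1, -1)] for x in xs]
    size = degree + 1
    ata = [[sum(row[i] * row[j] for row in A) for j in range(size)] for i in range(size)]
    aty = [sum(row[i] * y for row, y in zip(A, ys)) for i in range(size)]
    return solve(ata, aty)

# Hypothetical points lying exactly on y = 2x^2 - 3x + 1.
a, b, c = fit_polynomial([0, 1, 2, 3, 4], [1, 0, 3, 10, 21], 2)
print(a, b, c)  # 2 -3 1

# With degree 1 the same function reproduces the linear regression example.
print(*fit_polynomial([3, 4, 6, 8, 12, 15], [14, 20, 27, 41, 63, 73], 1))  # 559/110 -163/165
```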

 


Please report any problems with this page to bennett@math.ksu.edu
©2000 Andrew G. Bennett