\displaystyle \left[\begin{matrix}56753.6889897841\\8626.56072644719\\4.02951191827469\end{matrix}\right]
We all know what “linear” means: straight!
When we put them all together, we get a linear system:
\begin{cases} a_{11} x_1 + a_{12} x_2 +\dots + a_{1n} x_n = b_1 \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n = b_2 \\ \vdots\\ a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n = b_m, \end{cases}
We use equations to represent constraints (physics, data, budgets, flows, etc.) When there are multiple constraints, we get a system of equations.
Examples:
Of course, many equations we can think of won’t be linear… But many things are!
A chemical manufacturer wants to lease a fleet of 24 railroad tank cars with a combined carrying capacity of 520,000 gallons.
Will we have a single solution?
Wait, what? How is this a linear system?
There are three years that our graph needs to pass through: 1900, 1950, and 2000. We can write these as t_1 = 0, t_2 = 50, and t_3 = 100.
We can write the population as y_1 = 75, y_2 = 150, and y_3 = 275.
We are asked to find a quadratic equation, which means it’s of the form y = ax^2 + bx + c. We need to find the values of a, b, and c so that the equation passes through our three points.
For the first point, we have a t_1^2 + b t_1 + c = y_1. Since t_1=0, this becomes a 0^2 + b 0 + c = 75.
If we do this for the other two points, we get the system:
\begin{cases} a t_1^2 + b t_1 + c = y_1 \\ a t_2^2 + b t_2 + c = y_2 \\ a t_3^2 + b t_3 + c = y_3 \end{cases}
Is this linear?
Yes, it is linear in a, b, and c, because none of them is raised to a power other than 1.
The terms that are squared are just coefficients. We can even calculate them: t_2^2 = 2500, for instance. Then we can rewrite the system as:
\begin{cases} 0 a + b 0 + c = 75 \\ 2500 a + 50 b + c = 150 \\ 10000 a + 100 b + c = 275 \end{cases}
We see that we now have a standard system of linear equations.
An algorithm should be:
We’ll get started with an example that is easy enough to solve by hand. We’ll pay attention to what we do, and why.
\begin{gather*} 2 x-y=1 \\ 4x+4y=20 \end{gather*}
It was a little hard to keep track of the variables just there.
(Not really! It shouldn’t have been! But it’s late when I’m writing this, and my brain doesn’t work so well, I need something easier…)
Make an augmented matrix to represent the problem.
Rules for an augmented matrix: - The numbers in the first column are the coefficients of the first variable, for each equation. - The numbers in the second column are the coefficients of the second variable, for each equation. And so on. - The numbers in the last column are the right-hand sides of the equations.
Remember, our two equations were:
\begin{gather*} 2 x-y=1 \\ 4x+4y=20 \end{gather*}
So as an augmented matrix, they become:
\begin{aligned} & x \quad y=r . h . s . \\ & {\left[\begin{array}{rrr} 2 & -1 & 1 \\ 4 & 4 & 20 \end{array}\right]} \end{aligned}
Our goal: manipulate the augmented matrix to solve the problem.
We are going to change around the numbers in our matrix without changing the solution.
What does that mean? It means that we will manipulate the matrix so it represents different systems of equations that have the same solution.
For example, these two systems have the same solution:
\begin{gather*} 2 x-y=1 \\ 4x+4y=20 \end{gather*}
\begin{gather*} 4x+4y=20 \\ 2 x-y=1 \end{gather*}
We have simply swapped the order of the equations. They still have the same solution.
How does this change the matrix? It swaps the first and second rows.
\begin{aligned} & x \quad y=r . h . s . \\ & {\left[\begin{array}{rrr} 4 & 4 & 20 \\ 2 & -1 & 1 \end{array}\right]} \end{aligned}
Swapping the rows of the matrix is equivalent to swapping the order of the equations, and it does not change the solution.
We can also multiply one of our equations by a constant. For example, if we multiply the second equation by 2, we get:
\begin{gather*} 2 x-y=1 \\ 8x+8y=40 \end{gather*}
This has the same solution as the original system.
How does this change the matrix? It multiplies the second row by 2.
\begin{aligned} & x \quad y=r . h . s . \\ & {\left[\begin{array}{rrr} 2 & -1 & 1 \\ 8 & 8 & 40 \end{array}\right]} \end{aligned}
We did this when we were solving the problem by hand. We subtracted two times the first row from the second row.
How does this change the matrix? It adds -2 times the first row to the second row.
\begin{aligned} & x \quad y=r . h . s . \\ & {\left[\begin{array}{rrr} 2 & -1 & 1 \\ 0 & 6 & 18 \end{array}\right]} \end{aligned}
When we do these on augmented matrices, the solutions are unchanged…
We are going to use a sequence of elementary row operations to transform our augmented matrix into a form where we will be able to read off the solution.
This form is called reduced row echelon form.
Here is an example of an augmented matrix in reduced row echelon form:
\begin{aligned} & x \quad y=r . h . s . \\ & {\left[\begin{array}{rrr} 1 & 4 & 2 \\ 0 & 1 & 3 \end{array}\right]} \end{aligned}
It corresponds to the following system of equations:
\begin{gather*} x + 4y = 2 \\ y = 3 \end{gather*}
See how easy that is to solve? We can just read from the bottom up: y=3, and then x + 4(3) = 2, so x = 2 - 12 = -10.
Any time we can get an augmented matrix into reduced row echelon form, it will be this easy to solve.
Every matrix can be reduced by a sequence of elementary row operations to one and only one reduced row echelon form:
We just need to know which row operations to do, and in what order. Then we can solve any linear system!
Vocabulary: process of getting a 1 in a location, and then making all other entries zeros in that column, is pivoting.
The number that is made a 1 is called the pivot element, and the row that contains the pivot element is called the pivot row.
\left[\begin{array}{rrr} 4 & 4 & 20 \\ 2 & -1 & 1 \end{array}\right] \overrightarrow{E_{1}(\frac{1}{4}}\left[\begin{array}{lll} 1 & 1 & 5 \\ 2 & -1 & 1 \end{array}\right] \overrightarrow{E_{12}(-2)}\left[\begin{array}{lll} 1 & 1 & 5 \\ 0 & -3 & -9 \end{array}\right]
and then…
\left[\begin{array}{rrr} 1 & 1 & 5 \\ 0 & -3 & -9 \end{array}\right] \overrightarrow{E_{2}(-1 / 3)}\left[\begin{array}{lll} 1 & 1 & 5 \\ 0 & 1 & 3 \end{array}\right] \overrightarrow{E_{12}(-1)}\left[\begin{array}{lll} 1 & 0 & 2 \\ 0 & 1 & 3 \end{array}\right] .
Writing this back as a linear system, we have:
\begin{gather*} 2 x-y=1 \\ 4 x+4 y=20 \end{gather*} \Longrightarrow {\begin{aligned} & 1 \cdot x+0 \cdot y=2 \\ & 0 \cdot x+1 \cdot y=3 \end{aligned}} \Longrightarrow {\begin{aligned}x &= 2 \\ y &= 3 \end{aligned}}
There are 2 trees in a garden (tree “A” and “B”) and in both trees are some birds.
The birds of tree A say to the birds of tree B that if one of you comes to our tree, then our population will be double yours.
Then the birds of tree B tell the birds of tree A that if one of you comes here, then our population will be equal to yours.
How many birds in each tree?
(Solve by making an augmented matrix and doing G-J elimination.)
A mine produces silica, iron, and gold
Needs Money (in $$), operating time (in hours), and labor (in person-hours).
1 pound of silica needs: $.0055, . 0011 hours of operating time, and .0093 hours of labor.
1 pound of iron needs: $.095, . 01 operating hours, and .025 labor hours.
1 pound of gold needs: $ 960, 112 operating hours, and 560 labor hours.
Suppose that during 600 hours of operation, exactly $ 5000 and 3000 labor-hours are used.
How much silica (x), iron (y), and gold (z) were produced?
Set up the linear system whose solution will yield the values for x, y, and z.
\begin{aligned} .0055 x + .095 y + 960 z &= 5000 \quad \text{(dollars)}\\ .0011 x + .01 y + 112 z &= 600 \quad \text{(operating hours)}\\ .0093 x + .025 y + 560 z &= 3000\quad \text{(labor hours)} \end{aligned}
Make the augmented matrix:
\left[\begin{array}{@{}ccc|c@{}} .0055 & .095 & 960 & 5000 \\ .0011 & .01 & 112 & 600 \\ .0093 & .025 & 560 & 3000 \end{array}\right]
We can solve this using Gauss-Jordan elimination.
\left[\begin{matrix}0.0055 & 0.095 & 960 & 5000\\0.0011 & 0.01 & 112 & 600\\0.0093 & 0.025 & 560 & 3000\end{matrix}\right] \overrightarrow{E_{12}(\frac{-1}{5})} \left[\begin{matrix}0.0055 & 0.095 & 9.6 \cdot 10^{2} & 5.0 \cdot 10^{3}\\0 & -0.009 & -80.0 & -4.0 \cdot 10^{2}\\0.0093 & 0.025 & 5.6 \cdot 10^{2} & 3.0 \cdot 10^{3}\end{matrix}\right]
\overrightarrow{E_{13}(\frac{93}{55})} \left[\begin{matrix}0.0055 & 0.095 & 9.6 \cdot 10^{2} & 5.0 \cdot 10^{3}\\0 & -0.009 & -80.0 & -4.0 \cdot 10^{2}\\0 & -0.14 & -1.1 \cdot 10^{3} & -5.4 \cdot 10^{3}\end{matrix}\right]
\overrightarrow{E_{23}(\frac{-14}{.9})} \left[\begin{matrix}0.0055 & 0.095 & 9.6 \cdot 10^{2} & 5.0 \cdot 10^{3}\\0 & -0.009 & -80.0 & -4.0 \cdot 10^{2}\\0 & 0 & 1.4 \cdot 10^{2} & 5.7 \cdot 10^{2}\end{matrix}\right]
This has solutions \left[\begin{matrix}5.7 \cdot 10^{4}\\8.9 \cdot 10^{3}\\4.0\end{matrix}\right].
How do these compare to the exact solutions? These are
\displaystyle \left[\begin{matrix}56753.6889897841\\8626.56072644719\\4.02951191827469\end{matrix}\right]
\left[\begin{matrix}0.0055 & 0.095 & 960 & 5000\\0.0011 & 0.01 & 112 & 600\\0.0093 & 0.025 & 560 & 3000\end{matrix}\right] \rightarrow \left[\begin{matrix}0.0055 & 0.095 & 960.0 & 5000.0\\0 & -0.009 & -80.0 & -400.0\\0.0093 & 0.025 & 560.0 & 3000.0\end{matrix}\right]
\rightarrow \left[\begin{matrix}0.0055 & 0.095 & 960.0 & 5000.0\\0 & -0.009 & -80.0 & -400.0\\0 & -0.135636363636364 & -1063.27272727273 & -5454.54545454545\end{matrix}\right] \rightarrow \left[\begin{matrix}0.0055 & 0.095 & 960.0 & 5000.0\\0 & -0.009 & -80.0 & -400.0\\0 & 0 & 142.383838383838 & 573.737373737372\end{matrix}\right]
This has solutions \left[\begin{matrix}56753.6889897845\\8626.56072644727\\4.02951191827468\end{matrix}\right].
Reminder, the exact solution was:
\displaystyle \left[\begin{matrix}56753.6889897841\\8626.56072644719\\4.02951191827469\end{matrix}\right]
An algorithm should be:
Try this on your calculator. What do you get?
\left(\left(\frac{2}{3}+100\right)-100\right)-\frac{2}{3}
Should this be (0)?
Let \epsilon be a number so small that our calculator yields 1+\epsilon=1.
With this calculator, 1+1 / \epsilon=(\epsilon+1) / \epsilon=1 / \epsilon
Want to solve the linear system
\begin{align*} \epsilon x_{1}+x_{2} & =1\\ x_{1}-x_{2} & =0 . \end{align*}
With our calculator,
\left[\begin{array}{rrr} \epsilon & 1 & 1 \\ 1 & -1 & 0 \end{array}\right] \overrightarrow{E_{21}\left(-\frac{1}{\epsilon}\right)}\left[\begin{array}{ccc} \epsilon & 1 & 1 \\ 0 & \frac{1}{\epsilon} & -\frac{1}{\epsilon} \end{array}\right] \overrightarrow{E_{2}(\epsilon)}\left[\begin{array}{ccc} \epsilon & 1 & 1 \\ 0 & 1 & 1 \end{array}\right] \\ \overrightarrow{E_{12}(-1)}\left[\begin{array}{ccc} \epsilon & 0 & 0 \\ 0 & 1 & 1 \end{array}\right] \overrightarrow{E_{1}\left(\frac{1}{\epsilon}\right)}\left[\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 1 \end{array}\right] .
Calculated solution: x_{1}=0, x_{2}=1
Correct answer should be
x_{1}=x_{2}=\frac{1}{1+\epsilon}=0.999999099999990 \ldots
Problem arose because we took a computational step where we added two numbers of very different scale, essentially losing the smaller number.
Led to a big changes in output!
There is no general cure for these difficulties…
Want to be aware of them, know when we are doing computations that might be susceptible.
We can improve performance of G-J by introducing a new step into the algorithm:
Find the entry in the left column with the largest absolute value. This entry is called the pivot. Perform row interchange (if necessary), so that the pivot is in the first row.
Use a row operation to get a 1 as the entry in the first row and first column.
Use row operations to make all other entries as zeros in column one.
Interchange rows if necessary to obtain a nonzero number with the largest absolute value in the second row, second column. Use a row operation to make this entry 1. Use row operations to make all other entries as zeros in column two.
Repeat step 4 for row 3, column 3. Continue moving along the main diagonal until you reach the last row, or until the number is zero.
A system of linear equations is said to be ill-conditioned when some small perturbation in the system (in the bs) can produce relatively large changes in the exact solution (in the x’s). Otherwise, the system is said to be well-conditioned.
Consider \begin{aligned} & .835 x+.667 y=.168, \\ & .333 x+.266 y=.067, \end{aligned}
Exact solution:
x=1 \quad \text { and } \quad y=-1 .
But if we change just one digit…
\begin{aligned} & .835 x+.667 y=.168, \\ & .333 x+.266 y=.06\color{red}{6}, \end{aligned}
Now exact solution:
\hat{x}=-666 \quad \text { and } \quad \hat{y}=834
What if b_1 and b_2 are the results of an experiment, need to be read off a dial? Suppose:
\begin{equation*} (x, y)=(1,-1) \end{equation*}
\begin{equation*} .167 \leq b_{1} \leq .169 \quad \text { and } \quad .066 \leq b_{2} \leq .068 \end{equation*}
| b_1 | b_2 | x | y |
|---|---|---|---|
| .168 | .067 | 1 | -1 |
| .167 | .068 | 934 | -1169 |
| .169 | .066 | -932 | 1169 |
If two straight lines are almost parallel and if one of the lines is moved only slightly, then the point of intersection is drastically altered.
The point of intersection is the solution of the associated 2 \times 2 linear system, so this is also drastically altered.
Suppose we’d like to find a polynomial that can interpolate a given function.
Take points 𝑥0,…𝑥𝑁 and values 𝑦0,…𝑦𝑁
Simple way: finding the coefficients $ c_1, c_n $ where
P(x) = \sum_{i=0}^N c_i x^i
This is a system of equations:
\begin{array} cy_0 = c_0 + c_1 x_0 + \ldots c_N x_0^N\\ \,\ldots\\ \,y_N = c_0 + c_1 x_N + \ldots c_N x_N^N \end{array}
Or, stacking, c = \begin{bmatrix} c_0 & \ldots & c_N\end{bmatrix}, y = \begin{bmatrix} y_0 & \ldots & y_N\end{bmatrix} , and
A = \begin{bmatrix} 1 & x_0 & x_0^2 & \ldots &x_0^N\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_N & x_N^2 & \ldots & x_N^N \end{bmatrix}
Let’s try solving this for a simple function, y=exp(x).
x*y
We’ll start with picking 4 interpolating points: \vec x = \mathtt{\text{[ 1. 4.66666667 8.33333333 12. ]}}. Then we have our matrix A:
A = \begin{bmatrix} 1 & x_0 & x_0^2 & \ldots &x_0^N\\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_N & x_N^2 & \ldots & x_N^N \end{bmatrix} = \left[\begin{matrix}1.0 & 1.0 & 1.0 & 1.0\\1.0 & 4.66666666666667 & 21.7777777777778 & 101.62962962963\\1.0 & 8.33333333333333 & 69.4444444444444 & 578.703703703703\\1.0 & 12.0 & 144.0 & 1728.0\end{matrix}\right]
Our y values are \vec y = \mathtt{\text{[ 2.71828183 106.3426754 4160.26200538 162754.791419 ]}}. We could solve this using Gauss-Jordan elimination, or here the solving algorithm built into Numpy.