CL9 - Linear Regression
.pdf
keyboard_arrow_up
School
University of Illinois, Urbana Champaign *
*We aren’t endorsed by this school
Course
257
Subject
Mathematics
Date
Apr 3, 2024
Type
Pages
17
Uploaded by CoachGuanaco4012 on coursehero.com
12/13/22, 12:11 AM
Least Squares
https://www.prairielearn.org/pl/workspace/626847
1/17
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-45lty7mr becaus
e the default path (/tmp/cache/matplotlib) is not a writable directory; it is highly rec
ommended to set the MPLCONFIGDIR environment variable to a writable directory, in partic
ular to speed up the import of Matplotlib and to better support multiprocessing.
Lesson 9: Linear Least Squares
Linear Least Squares and Normal Equations
Consider a set of $m$ data points $\{(t_1,y_1),(t_2,y_2), ...,(t_m,y_m)\}$. Suppose we want to find a
straight line that best fits these data points. Mathematically, we want to find the slope and intercept
($x_1$ and $x_2$ respectively) such that
$$y_i = x_1 t_i + x_2, \qquad i \in \{1,2,...,m\} $$
We can write this more explicitly as ${\bf y} = {\bf t}\cdot x_1 + 1\cdot x_2$, which lets us separate
the coefficients we are trying to solve for ($x_1$ and $x_2$) from the input data that we have (${\bf
y}$ and ${\bf t}$). Now we can put these values into a matrix format:
$$\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_m\end{bmatrix} = \begin{bmatrix}t_1 & 1 \\ t_2 & 1 \\
\vdots & \vdots \\ t_m & 1\end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} \qquad
\Leftrightarrow \qquad {\bf y} = {\bf A}{\bf x}$$
where ${\bf A}$ is the design matrix that is a function of the $t_i$ data, ${\bf y}$ is the vector with
the $y_i$ data and ${\bf x}$ is the vector with the coefficients $x_i$. Notice how the values that
multiply our coefficients $x_i$ are the columns of ${\bf A}$.
Generally, if we have a linear system where ${\bf A}$ is an $m \times n$ matrix and $m > n$ we call
this system overdetermined
. For these systems, the equality is usually not exactly satisfiable as
${\bf y}$ may not lie in the column space of ${\bf A}$. Therefore, an overdetermined system is better
written as
$${\bf A}{\bf x} \cong {\bf y} $$
For an overdetermined system ${\bf A}{\bf x} \cong {\bf y}$, we are typically looking for a solution
${\bf x}$ that minimizes the squared norm of the residual vector ${\bf r} = {\bf y}-{\bf A x}$,
$$ \min_{\bf x} ||{\bf r}||^2 = \min_{\bf x} ||{\bf y} - {\bf A x}||^2$$
This problem is called a linear least-squares problem
, and the solution ${\bf x}$ is called least-
squares solution. You learned during lecture that the solution of the least squares problem is also
the solution of the system of normal equations
:
$${\bf A}^T {\bf A}{\bf x} = {\bf A}^T {\bf y} $$
Example 1 - fitting a line
In [1]:
import
numpy as
np
import
numpy.linalg as
la
import
scipy.linalg as
sla
import
matplotlib.pyplot as
plt
12/13/22, 12:11 AM
Least Squares
https://www.prairielearn.org/pl/workspace/626847
2/17
Consider the data set given by pts
:
[2 3 5 6]
Check your answers:
Define the matrix ${\bf A}$ that can be used to solve for the coefficients $x_1$ and $x_2$.
Store the value of it in variable A_line
.
Hint: Try to define the design matrix programmatically (i.e. not hard-coded). Rather than iterating
through a loop, you can directly construct ${\bf A}^T$ as a NumPy array whose first entry is t_line
and whose second entry is a vector of 1
s, then apply the transpose with .T
to obtain ${\bf A}$ as
desired.
Hint: To obtain the vector of 1
s with appropriate length, you may use t_line**0
or
np.ones_like(t_line)
. Recalling the syntax for power in Python and broadcasting, can you
reason why the first option works?
[[2. 1.]
[3. 1.]
In [2]:
pts =
np
.
array
([[
2
, 4
], [
3
, 2
], [
5
, 1
], [
6
, 0
]])
t_line =
pts
[:,
0
]
y_line =
pts
[:,
1
]
plt
.
plot
(
t_line
, y_line
, 'o'
)
print
(
t_line
)
In [19]:
#grade (enter your code in this cell - DO NOT DELETE THIS LINE) # Define A_line here
A_line =
np
.
ones
(
pts
.
shape
)
width
, length =
pts
.
shape
for
i in
range
(
width
):
A_line
[
i
][
0
] =
pts
[
i
][
0
]
A_line
[
i
][
1
] =
1
print
(
A_line
)
12/13/22, 12:11 AM
Least Squares
https://www.prairielearn.org/pl/workspace/626847
3/17
[5. 1.]
[6. 1.]]
Check your answers:
Use the normal equations to find the coefficients in ${\bf x}$.
Store the value of it in variable x_line
.
Plot the points and fit curve:
[<matplotlib.lines.Line2D at 0x7f3b849b5d60>]
Linear Least Squares and QR
While easy to understand and remember, using the normal equations can be prone to numerical
instabilities (you will learn more about this if you take a Numerical Methods or Analysis class! Hint
hint!)
Instead of solving the normal equations using the design matrix ${\bf A}$, we will instead use its QR
decomposition
$$ {\bf A} = {\bf QR} $$
where ${\bf Q}$ is an orthogonal matrix and ${\bf R}$ is an upper triangular matrix. Hence the least
squares solution ${\bf x}$ can be obtained as:
$$ {\bf A}^T{\bf Ax} = {\bf A}^T{\bf y} \quad \Leftrightarrow \quad {\bf Rx} = {\bf Q}^T{\bf y}$$
Check your answers:
In [4]:
#grade (enter your code in this cell - DO NOT DELETE THIS LINE) x_line =
np
.
linalg
.
solve
((
A_line
.
T @
A_line
), (
A_line
.
T@y_line
))
# x_line = (A_line.T@y_line) @ (np.linalg.inv(A_line.T @ A_line))
In [5]:
plt
.
plot
(
t_line
, y_line
, 'o'
)
plt
.
plot
(
t_line
, t_line *
x_line
[
0
] +
x_line
[
1
])
Out[5]:
12/13/22, 12:11 AM
Least Squares
https://www.prairielearn.org/pl/workspace/626847
4/17
Write the least_sq
function that uses QR decomposition and finds the least squares solution
${\bf x}$ given matrix ${\bf A}$ and vector ${\bf y}$. Do not use the numpy.linalg.inv function.
Hint: Try using the functions numpy.linalg.qr and scipy.linalg.solve_triangular
.
Let's confirm that we get the same answer as we did above with the normal equations.
[<matplotlib.lines.Line2D at 0x7f3b84930640>]
Example 2: quadratic curve
Lets look at another set of points:
[<matplotlib.lines.Line2D at 0x7f3b84910dc0>]
In [6]:
#grade (enter your code in this cell - DO NOT DELETE THIS LINE) def
least_sq
(
A
, y
):
# Find the least squares solution to Ax = y
# by solving Rx = Q^T y for x where A = QR
Q
, R =
np
.
linalg
.
qr
(
A
)
x =
np
.
linalg
.
solve
(
R
, (
Q
.
T@y
))
return
x
In [7]:
xqr =
least_sq
(
A_line
, y_line
)
In [8]:
plt
.
plot
(
t_line
, y_line
, 'o'
)
plt
.
plot
(
t_line
, t_line *
xqr
[
0
] +
xqr
[
1
])
plt
.
plot
(
t_line
, t_line *
x_line
[
0
] +
x_line
[
1
],
'x'
)
Out[8]:
In [9]:
n_quad =
50
t_quad =
np
.
linspace
(
-
1
, 1
, n_quad
)
y_quad =
t_quad **
2 +
np
.
random
.
randn
(
n_quad
) *
0.05 +
5
plt
.
plot
(
t_quad
, y_quad
, 'o'
)
Out[9]:
12/13/22, 12:11 AM
Least Squares
https://www.prairielearn.org/pl/workspace/626847
5/17
This looks a lot like a quadratic function! Let's try to fit a curve to this:
$$y = x_1 \,t^2 + x_2\,t + x_3$$
Check your answer:
We want to find the coefficients $x_1$, $x_2$, and $x_3$ that solve the least squares problem ${\bf A
x} \cong {\bf y}$; first, we need to construct the design matrix. Taking ${\bf x} = [x_1, x_2,x_3]^T$,
define the design matrix ${\bf A}$ corresponding to the model written above.
Store the value of it in variable A_quad
.
Hint: As with example A, try defining the matrix ${\bf A}^T$ then apply the transpose to obtain ${\bf
A}$ in a single line of code. This time, what will the first entry (row of the transpose/column of the
design matrix) be?
[[ 1.00000000e+00 -1.00000000e+00 1.00000000e+00]
[ 9.20033319e-01 -9.59183673e-01 1.00000000e+00]
[ 8.43398584e-01 -9.18367347e-01 1.00000000e+00]
[ 7.70095793e-01 -8.77551020e-01 1.00000000e+00]
[ 7.00124948e-01 -8.36734694e-01 1.00000000e+00]
[ 6.33486047e-01 -7.95918367e-01 1.00000000e+00]
[ 5.70179092e-01 -7.55102041e-01 1.00000000e+00]
[ 5.10204082e-01 -7.14285714e-01 1.00000000e+00]
[ 4.53561016e-01 -6.73469388e-01 1.00000000e+00]
[ 4.00249896e-01 -6.32653061e-01 1.00000000e+00]
[ 3.50270721e-01 -5.91836735e-01 1.00000000e+00]
[ 3.03623490e-01 -5.51020408e-01 1.00000000e+00]
[ 2.60308205e-01 -5.10204082e-01 1.00000000e+00]
[ 2.20324865e-01 -4.69387755e-01 1.00000000e+00]
[ 1.83673469e-01 -4.28571429e-01 1.00000000e+00]
[ 1.50354019e-01 -3.87755102e-01 1.00000000e+00]
[ 1.20366514e-01 -3.46938776e-01 1.00000000e+00]
[ 9.37109538e-02 -3.06122449e-01 1.00000000e+00]
In [10]:
#grade (enter your code in this cell - DO NOT DELETE THIS LINE) A_quad =
np
.
ones
((
50
,
3
))
for
i in
range
(
50
):
A_quad
[
i
][
0
] =
(
t_quad
[
i
])
**
2
A_quad
[
i
][
1
] =
t_quad
[
i
]
A_quad
[
i
][
2
] =
1
print
(
A_quad
)
12/13/22, 12:11 AM
Least Squares
https://www.prairielearn.org/pl/workspace/626847
6/17
[ 7.03873386e-02 -2.65306122e-01 1.00000000e+00]
[ 5.03956685e-02 -2.24489796e-01 1.00000000e+00]
[ 3.37359434e-02 -1.83673469e-01 1.00000000e+00]
[ 2.04081633e-02 -1.42857143e-01 1.00000000e+00]
[ 1.04123282e-02 -1.02040816e-01 1.00000000e+00]
[ 3.74843815e-03 -6.12244898e-02 1.00000000e+00]
[ 4.16493128e-04 -2.04081633e-02 1.00000000e+00]
[ 4.16493128e-04 2.04081633e-02 1.00000000e+00]
[ 3.74843815e-03 6.12244898e-02 1.00000000e+00]
[ 1.04123282e-02 1.02040816e-01 1.00000000e+00]
[ 2.04081633e-02 1.42857143e-01 1.00000000e+00]
[ 3.37359434e-02 1.83673469e-01 1.00000000e+00]
[ 5.03956685e-02 2.24489796e-01 1.00000000e+00]
[ 7.03873386e-02 2.65306122e-01 1.00000000e+00]
[ 9.37109538e-02 3.06122449e-01 1.00000000e+00]
[ 1.20366514e-01 3.46938776e-01 1.00000000e+00]
[ 1.50354019e-01 3.87755102e-01 1.00000000e+00]
[ 1.83673469e-01 4.28571429e-01 1.00000000e+00]
[ 2.20324865e-01 4.69387755e-01 1.00000000e+00]
[ 2.60308205e-01 5.10204082e-01 1.00000000e+00]
[ 3.03623490e-01 5.51020408e-01 1.00000000e+00]
[ 3.50270721e-01 5.91836735e-01 1.00000000e+00]
[ 4.00249896e-01 6.32653061e-01 1.00000000e+00]
[ 4.53561016e-01 6.73469388e-01 1.00000000e+00]
[ 5.10204082e-01 7.14285714e-01 1.00000000e+00]
[ 5.70179092e-01 7.55102041e-01 1.00000000e+00]
[ 6.33486047e-01 7.95918367e-01 1.00000000e+00]
[ 7.00124948e-01 8.36734694e-01 1.00000000e+00]
[ 7.70095793e-01 8.77551020e-01 1.00000000e+00]
[ 8.43398584e-01 9.18367347e-01 1.00000000e+00]
[ 9.20033319e-01 9.59183673e-01 1.00000000e+00]
[ 1.00000000e+00 1.00000000e+00 1.00000000e+00]]
Now we solve for the coefficients ${\bf x}$ using the least_sq
function you have defined above.
Store the value of it in variable x_quad
.
array([9.35089178e-01, 1.33628054e-03, 5.01979902e+00])
[<matplotlib.lines.Line2D at 0x7f3b84876880>]
In [11]:
x_quad =
least_sq
(
A_quad
, y_quad
)
x_quad
Out[11]:
In [12]:
plt
.
plot
(
t_quad
, y_quad
, 'o'
)
plt
.
plot
(
t_quad
, x_quad
[
0
] *
t_quad **
2 +
x_quad
[
1
] *
t_quad +
x_quad
[
2
])
Out[12]:
Your preview ends here
Eager to read complete document? Join bartleby learn and gain access to the full version
- Access to all documents
- Unlimited textbook solutions
- 24/7 expert homework help