CL9 - Linear Regression

.pdf

School

University of Illinois, Urbana Champaign *

*We aren’t endorsed by this school

Course

257

Subject

Mathematics

Date

Apr 3, 2024

Type

pdf

Pages

Uploaded by CoachGuanaco4012 on coursehero.com

12/13/22, 12:11 AM Least Squares https://www.prairielearn.org/pl/workspace/626847 1/17 Matplotlib created a temporary config/cache directory at /tmp/matplotlib-45lty7mr becaus e the default path (/tmp/cache/matplotlib) is not a writable directory; it is highly rec ommended to set the MPLCONFIGDIR environment variable to a writable directory, in partic ular to speed up the import of Matplotlib and to better support multiprocessing. Lesson 9: Linear Least Squares Linear Least Squares and Normal Equations Consider a set of $m$ data points $\{(t_1,y_1),(t_2,y_2), ...,(t_m,y_m)\}$. Suppose we want to find a straight line that best fits these data points. Mathematically, we want to find the slope and intercept ($x_1$ and $x_2$ respectively) such that $$y_i = x_1 t_i + x_2, \qquad i \in \{1,2,...,m\} $$ We can write this more explicitly as ${\bf y} = {\bf t}\cdot x_1 + 1\cdot x_2$, which lets us separate the coefficients we are trying to solve for ($x_1$ and $x_2$) from the input data that we have (${\bf y}$ and ${\bf t}$). Now we can put these values into a matrix format: $$\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_m\end{bmatrix} = \begin{bmatrix}t_1 & 1 \\ t_2 & 1 \\ \vdots & \vdots \\ t_m & 1\end{bmatrix} \begin{bmatrix}x_1 \\ x_2\end{bmatrix} \qquad \Leftrightarrow \qquad {\bf y} = {\bf A}{\bf x}$$ where ${\bf A}$ is the design matrix that is a function of the $t_i$ data, ${\bf y}$ is the vector with the $y_i$ data and ${\bf x}$ is the vector with the coefficients $x_i$. Notice how the values that multiply our coefficients $x_i$ are the columns of ${\bf A}$. Generally, if we have a linear system where ${\bf A}$ is an $m \times n$ matrix and $m > n$ we call this system overdetermined . For these systems, the equality is usually not exactly satisfiable as ${\bf y}$ may not lie in the column space of ${\bf A}$. Therefore, an overdetermined system is better written as $${\bf A}{\bf x} \cong {\bf y} $$ For an overdetermined system ${\bf A}{\bf x} \cong {\bf y}$, we are typically looking for a solution ${\bf x}$ that minimizes the squared norm of the residual vector ${\bf r} = {\bf y}-{\bf A x}$, $$ \min_{\bf x} ||{\bf r}||^2 = \min_{\bf x} ||{\bf y} - {\bf A x}||^2$$ This problem is called a linear least-squares problem , and the solution ${\bf x}$ is called least- squares solution. You learned during lecture that the solution of the least squares problem is also the solution of the system of normal equations : $${\bf A}^T {\bf A}{\bf x} = {\bf A}^T {\bf y} $$ Example 1 - fitting a line In [1]: import numpy as np import numpy.linalg as la import scipy.linalg as sla import matplotlib.pyplot as plt

12/13/22, 12:11 AM Least Squares https://www.prairielearn.org/pl/workspace/626847 2/17 Consider the data set given by pts : [2 3 5 6] Check your answers: Define the matrix ${\bf A}$ that can be used to solve for the coefficients $x_1$ and $x_2$. Store the value of it in variable A_line . Hint: Try to define the design matrix programmatically (i.e. not hard-coded). Rather than iterating through a loop, you can directly construct ${\bf A}^T$ as a NumPy array whose first entry is t_line and whose second entry is a vector of 1 s, then apply the transpose with .T to obtain ${\bf A}$ as desired. Hint: To obtain the vector of 1 s with appropriate length, you may use t_line**0 or np.ones_like(t_line) . Recalling the syntax for power in Python and broadcasting, can you reason why the first option works? [[2. 1.] [3. 1.] In [2]: pts = np . array ([[ 2 , 4 ], [ 3 , 2 ], [ 5 , 1 ], [ 6 , 0 ]]) t_line = pts [:, 0 ] y_line = pts [:, 1 ] plt . plot ( t_line , y_line , 'o' ) print ( t_line ) In [19]: #grade (enter your code in this cell - DO NOT DELETE THIS LINE) # Define A_line here A_line = np . ones ( pts . shape ) width , length = pts . shape for i in range ( width ): A_line [ i ][ 0 ] = pts [ i ][ 0 ] A_line [ i ][ 1 ] = 1 print ( A_line )

12/13/22, 12:11 AM Least Squares https://www.prairielearn.org/pl/workspace/626847 3/17 [5. 1.] [6. 1.]] Check your answers: Use the normal equations to find the coefficients in ${\bf x}$. Store the value of it in variable x_line . Plot the points and fit curve: [<matplotlib.lines.Line2D at 0x7f3b849b5d60>] Linear Least Squares and QR While easy to understand and remember, using the normal equations can be prone to numerical instabilities (you will learn more about this if you take a Numerical Methods or Analysis class! Hint hint!) Instead of solving the normal equations using the design matrix ${\bf A}$, we will instead use its QR decomposition $$ {\bf A} = {\bf QR} $$ where ${\bf Q}$ is an orthogonal matrix and ${\bf R}$ is an upper triangular matrix. Hence the least squares solution ${\bf x}$ can be obtained as: $$ {\bf A}^T{\bf Ax} = {\bf A}^T{\bf y} \quad \Leftrightarrow \quad {\bf Rx} = {\bf Q}^T{\bf y}$$ Check your answers: In [4]: #grade (enter your code in this cell - DO NOT DELETE THIS LINE) x_line = np . linalg . solve (( A_line . T @ A_line ), ( A_line . T@y_line )) # x_line = (A_line.T@y_line) @ (np.linalg.inv(A_line.T @ A_line)) In [5]: plt . plot ( t_line , y_line , 'o' ) plt . plot ( t_line , t_line * x_line [ 0 ] + x_line [ 1 ]) Out[5]:

12/13/22, 12:11 AM Least Squares https://www.prairielearn.org/pl/workspace/626847 4/17 Write the least_sq function that uses QR decomposition and finds the least squares solution ${\bf x}$ given matrix ${\bf A}$ and vector ${\bf y}$. Do not use the numpy.linalg.inv function. Hint: Try using the functions numpy.linalg.qr and scipy.linalg.solve_triangular . Let's confirm that we get the same answer as we did above with the normal equations. [<matplotlib.lines.Line2D at 0x7f3b84930640>] Example 2: quadratic curve Lets look at another set of points: [<matplotlib.lines.Line2D at 0x7f3b84910dc0>] In [6]: #grade (enter your code in this cell - DO NOT DELETE THIS LINE) def least_sq ( A , y ): # Find the least squares solution to Ax = y # by solving Rx = Q^T y for x where A = QR Q , R = np . linalg . qr ( A ) x = np . linalg . solve ( R , ( Q . T@y )) return x In [7]: xqr = least_sq ( A_line , y_line ) In [8]: plt . plot ( t_line , y_line , 'o' ) plt . plot ( t_line , t_line * xqr [ 0 ] + xqr [ 1 ]) plt . plot ( t_line , t_line * x_line [ 0 ] + x_line [ 1 ], 'x' ) Out[8]: In [9]: n_quad = 50 t_quad = np . linspace ( - 1 , 1 , n_quad ) y_quad = t_quad ** 2 + np . random . randn ( n_quad ) * 0.05 + 5 plt . plot ( t_quad , y_quad , 'o' ) Out[9]:

12/13/22, 12:11 AM Least Squares https://www.prairielearn.org/pl/workspace/626847 5/17 This looks a lot like a quadratic function! Let's try to fit a curve to this: $$y = x_1 \,t^2 + x_2\,t + x_3$$ Check your answer: We want to find the coefficients $x_1$, $x_2$, and $x_3$ that solve the least squares problem ${\bf A x} \cong {\bf y}$; first, we need to construct the design matrix. Taking ${\bf x} = [x_1, x_2,x_3]^T$, define the design matrix ${\bf A}$ corresponding to the model written above. Store the value of it in variable A_quad . Hint: As with example A, try defining the matrix ${\bf A}^T$ then apply the transpose to obtain ${\bf A}$ in a single line of code. This time, what will the first entry (row of the transpose/column of the design matrix) be? [[ 1.00000000e+00 -1.00000000e+00 1.00000000e+00] [ 9.20033319e-01 -9.59183673e-01 1.00000000e+00] [ 8.43398584e-01 -9.18367347e-01 1.00000000e+00] [ 7.70095793e-01 -8.77551020e-01 1.00000000e+00] [ 7.00124948e-01 -8.36734694e-01 1.00000000e+00] [ 6.33486047e-01 -7.95918367e-01 1.00000000e+00] [ 5.70179092e-01 -7.55102041e-01 1.00000000e+00] [ 5.10204082e-01 -7.14285714e-01 1.00000000e+00] [ 4.53561016e-01 -6.73469388e-01 1.00000000e+00] [ 4.00249896e-01 -6.32653061e-01 1.00000000e+00] [ 3.50270721e-01 -5.91836735e-01 1.00000000e+00] [ 3.03623490e-01 -5.51020408e-01 1.00000000e+00] [ 2.60308205e-01 -5.10204082e-01 1.00000000e+00] [ 2.20324865e-01 -4.69387755e-01 1.00000000e+00] [ 1.83673469e-01 -4.28571429e-01 1.00000000e+00] [ 1.50354019e-01 -3.87755102e-01 1.00000000e+00] [ 1.20366514e-01 -3.46938776e-01 1.00000000e+00] [ 9.37109538e-02 -3.06122449e-01 1.00000000e+00] In [10]: #grade (enter your code in this cell - DO NOT DELETE THIS LINE) A_quad = np . ones (( 50 , 3 )) for i in range ( 50 ): A_quad [ i ][ 0 ] = ( t_quad [ i ]) ** 2 A_quad [ i ][ 1 ] = t_quad [ i ] A_quad [ i ][ 2 ] = 1 print ( A_quad )

12/13/22, 12:11 AM Least Squares https://www.prairielearn.org/pl/workspace/626847 6/17 [ 7.03873386e-02 -2.65306122e-01 1.00000000e+00] [ 5.03956685e-02 -2.24489796e-01 1.00000000e+00] [ 3.37359434e-02 -1.83673469e-01 1.00000000e+00] [ 2.04081633e-02 -1.42857143e-01 1.00000000e+00] [ 1.04123282e-02 -1.02040816e-01 1.00000000e+00] [ 3.74843815e-03 -6.12244898e-02 1.00000000e+00] [ 4.16493128e-04 -2.04081633e-02 1.00000000e+00] [ 4.16493128e-04 2.04081633e-02 1.00000000e+00] [ 3.74843815e-03 6.12244898e-02 1.00000000e+00] [ 1.04123282e-02 1.02040816e-01 1.00000000e+00] [ 2.04081633e-02 1.42857143e-01 1.00000000e+00] [ 3.37359434e-02 1.83673469e-01 1.00000000e+00] [ 5.03956685e-02 2.24489796e-01 1.00000000e+00] [ 7.03873386e-02 2.65306122e-01 1.00000000e+00] [ 9.37109538e-02 3.06122449e-01 1.00000000e+00] [ 1.20366514e-01 3.46938776e-01 1.00000000e+00] [ 1.50354019e-01 3.87755102e-01 1.00000000e+00] [ 1.83673469e-01 4.28571429e-01 1.00000000e+00] [ 2.20324865e-01 4.69387755e-01 1.00000000e+00] [ 2.60308205e-01 5.10204082e-01 1.00000000e+00] [ 3.03623490e-01 5.51020408e-01 1.00000000e+00] [ 3.50270721e-01 5.91836735e-01 1.00000000e+00] [ 4.00249896e-01 6.32653061e-01 1.00000000e+00] [ 4.53561016e-01 6.73469388e-01 1.00000000e+00] [ 5.10204082e-01 7.14285714e-01 1.00000000e+00] [ 5.70179092e-01 7.55102041e-01 1.00000000e+00] [ 6.33486047e-01 7.95918367e-01 1.00000000e+00] [ 7.00124948e-01 8.36734694e-01 1.00000000e+00] [ 7.70095793e-01 8.77551020e-01 1.00000000e+00] [ 8.43398584e-01 9.18367347e-01 1.00000000e+00] [ 9.20033319e-01 9.59183673e-01 1.00000000e+00] [ 1.00000000e+00 1.00000000e+00 1.00000000e+00]] Now we solve for the coefficients ${\bf x}$ using the least_sq function you have defined above. Store the value of it in variable x_quad . array([9.35089178e-01, 1.33628054e-03, 5.01979902e+00]) [<matplotlib.lines.Line2D at 0x7f3b84876880>] In [11]: x_quad = least_sq ( A_quad , y_quad ) x_quad Out[11]: In [12]: plt . plot ( t_quad , y_quad , 'o' ) plt . plot ( t_quad , x_quad [ 0 ] * t_quad ** 2 + x_quad [ 1 ] * t_quad + x_quad [ 2 ]) Out[12]:

Your preview ends here

Eager to read complete document? Join bartleby learn and gain access to the full version

Access to all documents
Unlimited textbook solutions
24/7 expert homework help