Hands-on: Linear least squares and error propagation

Perform a least squares fit of a parabola

$$ y(x) = \theta_0 + \theta_1 x + \theta_2 x^2$$

for the four independent measurments $(x_i, y_i)$ given by $(-0.6, 5 \pm 2)$, $(-0.2, 3 \pm 1)$, $(0.2, 5 \pm 1)$, $(0.6, 8 \pm 2)$.

a) Determine the best fit parameters $\hat \theta_i$ and their covariances.

b) Plot the fitted parabola and the $1\sigma$ error band around it. What is the prediced value $y$ at $x=1$ and its uncertainty?

This problem is taken from from W. Metzger, Statistical Methods in Data Analysis.

In [2]:
import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import inv
In [3]:
# data
x = np.array([-0.6, -0.2, 0.2, 0.6])
y = np.array([5., 3., 5., 8.])
sigma_y = np.array([2, 1, 1, 2])
In [4]:
# use formula for linear least squares to determine best fit parameters

# some hints:
# - diagonal matrix from vector v in numpy: A = np.diagflat(v)
# - matrix from column vectors v0, v1, v2: A = np.column_stack((v0, v1, v2))
# - multiplication of matrices A and B in numpy: C = A.dot(B)
# - transposed matrix: A_T = np.transpose(A)
# - inverse matrix: A_inv = inv(A)

V = np.diagflat(sigma_y**2) # covariance matrix of the data (diaogonal)
A = np.column_stack((np.ones(x.size), x, x**2)) # design matrix

# ...
# your code here
In [7]:
# function that returns uncertainty of the fit function at postion x
def sigma_y_pred(x):
    J = np.array([1., x, x**2], dtype=object) # vector J of gradients d/d theta_i y
    
    # your code here
    #return ... 
In [10]:
# plot data
fig, ax = plt.subplots()
plt.errorbar(x, y, yerr=sigma_y, fmt='o', color='blue')

# plot fit and error band
# hint for plotting the error band:
# A filled area between two curves can e.g. be drawn with
# ax.fill_between(xv, yv - sigma_yv, yv + sigma_yv, alpha=0.2, color='red')

# your code here
Out[10]:
<ErrorbarContainer object of 3 artists>
In [11]:
# prediction at x=1
# something like 
# print(f"Predicted value at x = {xv}: {theta[0] + theta[1] * xv + theta[2] * xv**2:.2f} +/- {sigma_y_pred(xv):.2f}")

# your code here