How do you find the y-intercept of the least squares regression line for the data set (1,8) (2,7) (3, 5)?

1 Answer
May 11, 2018

#hat{y} = -3/2 x + 29/3#

That's a slope of #alpha=-3/2# and a y-intercept of #beta=29/3#.

Explanation:

Let's derive least squares regression because I'm rusty.

Our model for the data is a linear equation with two parameters, the slope #alpha# and the y-intercept #beta#.

#hat{y} = alpha x + beta #

Our total error is the sum of the squared residuals for each data point.

# E=sum_{i=1}^n ( y_i - hat{y_i})^2 =sum (y_i - alpha x_i - beta )^2 #

To control clutter I'll just write #sum# for #sum_{i=1}^n.#

We minimize #E# by setting the partials to zero:

#0 = {partial E}/{partial alpha} = sum -x_i(y_i - alpha x_i - beta ) #

# sum x_i y_i = alpha sum x_i^2 + beta sum x_i #

#0 = {partial E}/{partial beta } = sum -(y_i - alpha x_i - beta ) #

# sum y_i = alpha sum x_i + n beta #

That last one comes from #sum_{i=1}^n beta = n beta .#

We have two equations in two unknowns. I remember #MA=S# has solution #A=M^{-1} S#, and for a two-by-two matrix #M=((a, b),(c, d))# and #S=(s,t)^T# we get

#M^{-1}S = 1/{ad-bc}(ds -bt, -cs+at )^T#
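
Since I'm quoting that inverse from memory, here is a minimal Python sketch that applies it and substitutes the result back in. The function name `solve_2x2` and the small test system are made up for illustration; they are not part of the problem.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, s, t):
    """Solve a*u + b*v = s, c*u + d*v = t with the explicit 2x2 inverse."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, no unique solution")
    u = Fraction(d * s - b * t, det)
    v = Fraction(-c * s + a * t, det)
    return u, v

# Made-up system for illustration: 2u + v = 5 and u + 3v = 10.
u, v = solve_2x2(2, 1, 1, 3, 5, 10)

# Substitute back to confirm both equations hold.
print(u, v, 2 * u + v == 5, u + 3 * v == 10)
```

Using `Fraction` keeps everything exact, which is handy when the answers come out as thirds.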

Back to the problem. Let's declutter even more and write

#sum x_i = n bar{x}, sum x_i y_i = n bar{xy}, quad sum x_i ^2 = n bar{x^2},# etc. We rewrite our system, cancelling the #n#s:

# bar{xy} = alpha bar{x^2} + beta bar{x} #

#bar{y} = alpha bar{x} + beta #

Applying our two-by-two solution, we substitute

#a=bar{x^2}, b=c=bar{x}, d=1, s=bar{xy}, t=bar{y}#

giving

# alpha = { bar{xy} - bar{x} \ bar{y}}/{ bar{x^2} - bar{x}^2 } #

#beta = { - bar{x} \ bar{xy} + bar{x^2} bar{y}}/{ bar{x^2} - bar{x}^2 } #
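
One quick way to gain some confidence in these two formulas is to feed them points that lie exactly on a known line and see whether the line comes back. The sketch below does that; the helper name `fit_line` and the three test points on #y = 2x + 1# are assumptions chosen for illustration, not data from the question.

```python
from fractions import Fraction

def fit_line(points):
    """Return (alpha, beta) for y-hat = alpha*x + beta using the mean formulas."""
    n = len(points)
    mean_x = Fraction(sum(x for x, _ in points), n)
    mean_y = Fraction(sum(y for _, y in points), n)
    mean_xx = Fraction(sum(x * x for x, _ in points), n)
    mean_xy = Fraction(sum(x * y for x, y in points), n)
    denom = mean_xx - mean_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; slope is undefined")
    alpha = (mean_xy - mean_x * mean_y) / denom
    beta = (-mean_x * mean_xy + mean_xx * mean_y) / denom
    return alpha, beta

# Illustrative points (not from the question) lying exactly on y = 2x + 1.
line_points = [(0, 1), (1, 3), (2, 5)]
alpha, beta = fit_line(line_points)
print(alpha, beta)
```

If the formulas are right, this should recover a slope of 2 and an intercept of 1, since a perfect line has zero squared error.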

I don't know if that's right, but it's giving me flashbacks. Let's try our numbers.

| | #x# | #y# | #x^2# | #y^2# | #xy# |
|---|---|---|---|---|---|
| | 1 | 8 | 1 | 64 | 8 |
| | 2 | 7 | 4 | 49 | 14 |
| | 3 | 5 | 9 | 25 | 15 |
| TOTALS | 6 | 20 | 14 | 138 | 37 |

#alpha = { 37/3 -(6/3)(20/3) } /{ (14/3) -(6/3)^2 } = -3/2 #

# beta = { - (6/3) (37/3) + (14/3)(20/3) }/{ (14/3) -(6/3)^2 } = 29/3 #
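
The arithmetic above is easy to fumble by hand, so here is a short Python sketch that rebuilds the column totals for the three data points and plugs the means into the same two formulas. The variable names are my own; only the data and the formulas come from the working above.

```python
from fractions import Fraction

points = [(1, 8), (2, 7), (3, 5)]
n = len(points)

# Column totals, in the same order as the table: x, y, x^2, y^2, xy.
totals = {
    "x": sum(x for x, _ in points),
    "y": sum(y for _, y in points),
    "x^2": sum(x * x for x, _ in points),
    "y^2": sum(y * y for _, y in points),
    "xy": sum(x * y for x, y in points),
}

# Means are the totals divided by n, kept exact with Fraction.
mean = {name: Fraction(total, n) for name, total in totals.items()}

denom = mean["x^2"] - mean["x"] ** 2
alpha = (mean["xy"] - mean["x"] * mean["y"]) / denom
beta = (-mean["x"] * mean["xy"] + mean["x^2"] * mean["y"]) / denom

print(totals)
print(alpha, beta)
```

The #y^2# total is not needed for the slope or intercept; it is computed only to match the table.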

Model:

#hat{y} = -3/2 x + 29/3#

Check:

Let's calculate the squared error:

# ( 8 - (-3/2(1) + (29/3) ) )^2 + ( 7 - (-3/2(2) + (29/3) ) )^2 + ( 5 - (-3/2(3) + (29/3) ) )^2 = 1/6 #

The theory says if we change #alpha# or #beta# by a little bit (or a lot) we'll always get a bigger error. Let's pop a few into the computer:

# ( 8 - (- 1.501 (1) + (29/3)) )^2 + ( 7 - (-1.501(2) + (29/3)) )^2 + ( 5 - (-1.501(3) + (29/3)) )^2 approx 0.16668067 #

# ( 8 - (-3/2(1) + 10 ) )^2 + ( 7 - (-3/2(2) + 10) )^2 + ( 5 - (-3/2(3) + 10 ) )^2 = 1/2#
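
For anyone who wants to repeat the check, the three evaluations above can be reproduced with a few lines of Python. This is a sketch using exact fractions; the function name `squared_error` is an assumption of mine, and the perturbed values are the same ones tried above.

```python
from fractions import Fraction

POINTS = [(1, 8), (2, 7), (3, 5)]

def squared_error(alpha, beta, points=POINTS):
    """Sum of squared residuals for the line y-hat = alpha*x + beta."""
    return sum((y - (alpha * x + beta)) ** 2 for x, y in points)

best = squared_error(Fraction(-3, 2), Fraction(29, 3))
nudged_slope = squared_error(Fraction(-1501, 1000), Fraction(29, 3))
nudged_intercept = squared_error(Fraction(-3, 2), Fraction(10))

print(best)                 # exact error at the fitted line
print(float(nudged_slope))  # slope changed from -1.5 to -1.501
print(nudged_intercept)     # intercept changed from 29/3 to 10
```

Both perturbed lines give a larger error than the fitted one, which is what minimizing #E# should guarantee.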

Let's call that checked.