We can approach this problem in two ways; I will do so using maximum likelihood estimation (MLE).

We know the following

#E(epsilon) = 0#, so we can drop the error term: its expected value is 0 regardless of its variance.

#f(y|xbeta) = 1/(sqrt(2pisigma^2))e^(-(y-xbeta)^2/(2sigma^2))# if we assume a normal distribution

Now, using maximum likelihood, we find the best estimate for #beta# given the observed #y# and #x# (in most cases these are known), so

#l(beta|x,y) = 1/(sqrt(2pisigma^2))e^(-(y-xbeta)^2/(2sigma^2))#

Now we take the log to simplify:

#log(l(beta|x,y)) = log(1/(sqrt(2pisigma^2))) + (-(y-xbeta)^2/(2sigma^2))#
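As a quick sketch (with made-up paired observations and #sigma# fixed at 1 purely for illustration), the log-likelihood above can be evaluated directly for any candidate #beta#:

```python
import math

# Hypothetical paired observations (x, y) -- not from the original answer
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
sigma = 1.0  # assumed known, fixed at 1 for illustration

def log_likelihood(beta):
    """Sum of per-observation log-densities log f(y | x*beta)."""
    return sum(
        math.log(1.0 / math.sqrt(2 * math.pi * sigma ** 2))
        - (y - x * beta) ** 2 / (2 * sigma ** 2)
        for x, y in data
    )

# A slope near the data-generating one gives a higher log-likelihood
print(log_likelihood(2.0), log_likelihood(0.0))
```

Maximizing this function over #beta# is exactly what the derivative step below does in closed form.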

Then we take the derivative with respect to #beta#, set it to 0, and solve:

#(x(y-xbeta))/(sigma^2)=0#

#rArr (x^2beta)/(sigma^2)= (xy)/(sigma^2)#

#rArr x^2beta=xy#

#rArr beta=(xy)/x^2#
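For a sample of #n# observations the same algebra produces sums, #beta=(sum xy)/(sum x^2)#. A minimal numerical sketch (using NumPy and synthetic data of my own, not from the question) confirms the closed form recovers the slope:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data for a no-intercept model y = x*beta + epsilon
x = rng.normal(size=100)
true_beta = 2.5
y = x * true_beta + rng.normal(scale=0.5, size=100)

# MLE / closed-form estimate: beta_hat = sum(x*y) / sum(x^2)
beta_hat = np.sum(x * y) / np.sum(x ** 2)
print(beta_hat)  # should land close to 2.5
```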

This agrees with what we get algebraically if we instead minimize the residuals directly, usually with a sum-of-squares loss function, e.g.

#f_beta=(y-xbeta)^2#

#f'(beta)=-2x(y-xbeta)=0#

#rArr 2x^2beta=2xy#

#rArr beta=(xy)/x^2#
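To see numerically that the two routes coincide, here is a sketch (again with invented data) comparing the closed-form estimate against NumPy's least-squares solver for the same no-intercept model:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 1.7 * x + rng.normal(scale=0.3, size=200)

# Closed-form estimate from the derivation: sum(x*y) / sum(x^2)
beta_closed = np.sum(x * y) / np.sum(x ** 2)

# Numerical least-squares fit of the no-intercept model y = x*beta
beta_lstsq, *_ = np.linalg.lstsq(x.reshape(-1, 1), y, rcond=None)

print(beta_closed, beta_lstsq[0])  # the two estimates agree
```

The agreement is not a coincidence: with normal errors, maximizing the likelihood and minimizing the sum of squared residuals are the same optimization problem.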