What is a feasible least squares regression?
It it is basically Weighted Least Square. The issue is ordinary least squares assumes that there is constant variance in the errors (which is called homoskedasticity). The method of weighted least squares can be used when the ordinary least squares assumption of constant variance in the errors is violated (which is called heteroskedasticity).
Heteroskedasticity makes OLS inefficient and invalidates the
standard estimator of its variance, and hence invalidates
inference (t tests, F tests, etc.).
The WLS estimator is efficient and unbiased but it depends on
knowing the variance structure. In practice is seldom available.
In practice it is more common to keep OLS and replace its
variance estimator by White’s consistent method, which does
not require any assumptions.
When we have heteroskedasticity, even if each noise term is still Gaussian, ordinary least squares is no longer the maximum likelihood estimate, and so no longer efficient. If however we know the noise variance σ at each measurement i, and set wi = 1/σ2, we get the heteroskedastic MLE, and recover efficiency.
To say the same thing slightly differently, there’s just no way that we can estimate the regression function as accurately where the noise is large as we can where the noise is small. Trying to give equal attention to all parts of the input space is a waste of time; we should be more concerned about fitting well where the noise is small, and expect to fit poorly where the noise is big.
3. Doing something else. There are a number of other optimization problems which can be transformed into, or approximated by, weighted least squares. The most important of these arises from generalized linear models, where the mean response is some nonlinear function of a linear predictor. Logistic regression is an example.