This is probably the wrong place to post this. Why are only Y residuals used when calculating a least-squares linear regression line from a data set? This seems to be the universally accepted method for finding a line of best fit. It seems to me, however, that this often creates lines that are more horizontal than they should be, especially when the data points depict a line with a very high or low slope. Using only Y residuals, as is accepted, the line will often turn out to be much too horizontal. Perhaps an average of X and Y residuals should be used instead? The current method makes no sense to me. (PS: I wish I knew more about computer programming so I could write a simple program to calculate a regression line using an average of X and Y residuals, and compare it to the bazillion applets out there that do it based on Y residuals only.)
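For what it's worth, here is a rough sketch of the experiment you describe (my own illustration, not anyone's official method). Instead of averaging X and Y residuals, the usual symmetric alternative is total least squares (orthogonal regression), which minimizes perpendicular distances to the line; the code compares its slope to the ordinary least-squares slope on deliberately steep data:

```python
# Sketch: compare ordinary least squares (minimizes vertical/Y residuals)
# with total least squares (minimizes perpendicular distances, treating
# X and Y symmetrically). Data and seed are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
true_slope = 10.0                                        # deliberately steep line
y = true_slope * x + rng.normal(scale=1.0, size=x.size)  # noise in y only

# OLS slope: cov(x, y) / var(x)
xc, yc = x - x.mean(), y - y.mean()
ols_slope = (xc @ yc) / (xc @ xc)

# TLS slope: direction of the first principal component of the centered data
cov = np.cov(np.vstack([xc, yc]))
eigvals, eigvecs = np.linalg.eigh(cov)
v = eigvecs[:, np.argmax(eigvals)]   # eigenvector of the largest eigenvalue
tls_slope = v[1] / v[0]

print(f"OLS slope: {ols_slope:.3f}, TLS slope: {tls_slope:.3f}")
```

On any data that is not perfectly collinear, the TLS slope comes out steeper than the OLS slope (in absolute value), which matches your intuition that the Y-residual fit looks "too horizontal" on steep data. Note, though, that when the noise really is only in Y, OLS is the appropriate estimator and TLS overshoots.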

Sorry. Against the advice of my professor, I tossed the book after I graduated. I had an A, too...you'd have thought more of it stuck. Mike

Boy, that course was about 30 years ago. However, conceptually, the thing that strikes me in a data set is that usually Y = f(X): Y is a function of X. In other words, you take X as a given, and measure (and plot) Y as a function of X. Since there is only one random quantity at each data point, not two, there is no need to fit the line to X; X does not vary. It is the Y value that varies, and the line approximates where Y would most likely be found over the given range of X.

I would also say that you are not trying to find a correlation or relationship between X and Y when doing this type of problem. Use a Pearson product-moment correlation (r) if you want to find out what the relationship is between all of X and all of Y; then you could work on a regression of some sort. There are a lot of cobwebs to dust off, but that was the first thought that came to my mind.

Do you use Minitab? It's just as easy as Excel, but a lot more statistically in depth. Their website has a free trial if your school does not have it. http://www.minitab.com/

With regression, you are trying to calculate y as a function of x; that is, given a value of x, what is the likely value of y. The error is how wrong that prediction is, squared, on average. So the x residuals aren't relevant -- actually they don't exist for the purposes of the regression. Do you see what I mean? The answer you get, R-squared, is the amount of variance in y accounted for by knowing the value of x, and not the other way around; in other words, it's an index of the predictability of y given x. Correlation is different, and perhaps (I don't remember) does deal with error in x. But correlation is how much shared variance there is between x and y, which is the answer to a different question. Edit: I should have read Thor's post first, ya?
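To put a number on the asymmetry (a made-up toy data set, just for illustration): the slope depends on which variable plays the role of "y", but in simple regression R-squared equals r-squared, which is symmetric in x and y:

```python
# Toy check: regressing y on x and x on y gives two different lines,
# but the squared correlation (= R^2 of simple regression) is the same.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def slope(a, b):
    """OLS slope of b regressed on a: cov(a, b) / var(a)."""
    ac, bc = a - a.mean(), b - b.mean()
    return (ac @ bc) / (ac @ ac)

b_yx = slope(x, y)              # predicts y from x
b_xy = slope(y, x)              # predicts x from y -- a different line
r = np.corrcoef(x, y)[0, 1]

# In simple regression, r^2 = b_yx * b_xy
print(b_yx, b_xy, r**2)
```

The two slopes only coincide (after inverting one of them) when the points lie exactly on a line, i.e. r = ±1.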

You are thinking about this incorrectly. A regression takes two series Y and X (either may be a vector) and explains Y as a linear function of X: Y = bX + e, so Y - bX = e is the error series. In Ordinary Least Squares, you assume these errors are all independent, and you want to choose b to minimize [e1^2 + e2^2 + e3^2 + ... + en^2]. I assume you are referring to the e as the Y-residuals ... I mean, they are the only residuals. And you need some criterion to choose b ... this is the standard OLS criterion. BTW, it's not the only way ... if the e are not independent, then you need to make assumptions about the covariance structure and you have a lot of variations. Additionally, computer-intensive approaches offer alternatives to squared-error criteria (which are an artifact from the hand-calculation era) ... you have robust and stepwise regression approaches, and so forth.
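A minimal sketch of that criterion (my own example data; with an intercept column added, the minimizer of the summed squared errors is b = (X'X)^{-1} X'Y):

```python
# Choose b to minimize e1^2 + ... + en^2 where e = Y - Xb.
# With an intercept column, the minimizer solves the normal equations.
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.9, 5.1, 7.2, 8.8])

X = np.column_stack([np.ones_like(x), x])   # intercept + slope columns
b = np.linalg.solve(X.T @ X, X.T @ y)       # normal equations: (X'X) b = X'y
e = y - X @ b                               # the residual (error) series

def sse(coefs):
    r = y - X @ coefs
    return r @ r                            # sum of squared errors

# Perturbing b in either coordinate can only increase the criterion
print(b, sse(b), sse(b + np.array([0.1, 0.0])), sse(b + np.array([0.0, 0.1])))
```

Nudging either coefficient away from the normal-equations solution strictly increases the sum of squares, which is the sense in which b is "chosen" by the OLS criterion.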

It may also help to think of this geometrically. Regression is a process that projects the Y vector onto the column space of X. You have different criteria for how you can do this, and they will yield different coefficients ('b'). Think of this as shining a flashlight (the Y vector) on a wall (the X column space). As you tilt the flashlight (change the criterion) you get a different oval of light (a different 'b').
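To make the projection picture concrete (my own sketch, assuming X has full column rank): the OLS fitted values are the orthogonal projection of Y onto the column space of X, via the "hat" matrix H = X(X'X)^{-1}X':

```python
# The hat matrix H projects y onto the column space of X; the residuals
# are the part of y orthogonal to that space.
import numpy as np

X = np.column_stack([np.ones(4), [0.0, 1.0, 2.0, 3.0]])
y = np.array([1.0, 2.2, 2.9, 4.1])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto col(X)
y_hat = H @ y                          # fitted values
resid = y - y_hat

print(np.allclose(H @ H, H))           # idempotent: projecting twice = once
print(np.allclose(X.T @ resid, 0))     # residuals orthogonal to col(X)
```

Idempotence and the orthogonality of the residuals to col(X) are exactly what make this an orthogonal projection; a different criterion (robust loss, weighted errors) projects differently and lands on a different b.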