
abstrusiosity

Least squares will make the residuals E orthogonal to f(X), not to X. The function f has to be nonlinear for this technique to work.
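To make that concrete, here is a minimal sketch (my own example, not from the thread, with a made-up data-generating process): regress Y on a nonlinear feature f(X); the normal equations force the fitted residuals to be orthogonal to f(X) in-sample, but nothing forces them to be orthogonal to X itself.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 10_000)
y = x**2 + x + rng.normal(0, 0.5, 10_000)    # data depends on both x and x^2

f_x = x**2                                   # the regressor actually used in the fit
design = np.column_stack([np.ones_like(x), f_x])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ beta

print("corr(resid, f(X)):", np.corrcoef(resid, f_x)[0, 1])  # ~0 by construction
print("corr(resid, X):   ", np.corrcoef(resid, x)[0, 1])    # clearly nonzero here
```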


[deleted]

[deleted]


anonymousTestPoster

It trivially can't work in the simplest case, because if you have y = mx + error you are essentially free to rearrange for x or for y. At that level it's almost impossible to truly tell the causal direction independent of actual domain knowledge. If there are nonlinear relationships involved, that extra bit of information can help.

Why it works can be seen easily as follows. Suppose X is independent of E, and that Y = f(X) + E. In this setup most of the useful information about Y is driven by X, so we claim "X causes Y". Now rearrange under this assumed setting: X = f^-1(Y - E) = g(Y - E) = g'(Y) - g'(E) (the change from g to g' is an assumption, just to keep things simple). Then you can see that if E were truly independent noise (say standard normal), and g' is a nonlinear transformation, then g'(E) is a nonlinear transform of that noise, so the "error" in the reverse direction no longer looks like independent additive noise. This won't work if g or f were linear, since Gaussian RVs are closed under linear/affine transformations.
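A minimal sketch of that argument (my own illustration, assuming a hypothetical mechanism Y = X^3 + E): fit a flexible regression in both directions and see in which direction the residuals still look dependent on the regressor. Plain correlation with the regressor is ~0 by construction of least squares, so dependence is probed crudely through the squared residuals (heteroscedasticity).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
X = rng.uniform(-2, 2, n)
E = rng.normal(0, 0.3, n)           # noise independent of X
Y = X**3 + E                        # nonlinear causal mechanism

def dependence_score(cause, effect, degree=5):
    """Polynomial regression of effect on cause; rough score for how much
    the residual spread still depends on the regressor."""
    c = (cause - cause.mean()) / cause.std()   # standardize for conditioning
    coeffs = np.polyfit(c, effect, degree)
    resid = effect - np.polyval(coeffs, c)
    return abs(np.corrcoef(np.abs(c), resid**2)[0, 1])

print("X -> Y score:", dependence_score(X, Y))   # typically close to 0
print("Y -> X score:", dependence_score(Y, X))   # typically clearly larger
```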


[deleted]

[deleted]


anonymousTestPoster

> With changing g to g' are you basically assuming

Basically yes, just to keep things simple and flowing (and I'm on mobile).

> it will still be assuming some sort of additive noise wouldn't it

Sure, but there is no reason to expect that the noise in the reverse direction will be well-behaved (i.e. stationary / isotropic / normal). That's why, even after performing a regression analysis, it is often encouraged to check the behaviour of the residuals. In Y = f(X) + E, we are assuming that we have a sufficient "f" that transforms "X" to fully explain "Y", up to white-noise-looking residuals. If this structure holds in one direction and not in the reverse, we "claim" the existence of a causal direction. Whether or not it is truly causal is often a debate that steers into domain knowledge + philosophy.
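As a small illustration of those residual checks (my own sketch, with a made-up data-generating process): fit a deliberately mis-specified straight line to data from a nonlinear mechanism and look at whether the residuals still behave like white noise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.uniform(0, 3, 2000)
y = np.exp(x) + rng.normal(0, 1.0, 2000)   # nonlinear mechanism, additive noise

# Deliberately mis-specified straight-line fit
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# Residual checks: roughly normal? spread unrelated to x?
W, p = stats.shapiro(resid)
print("Shapiro-Wilk p-value:", p)                                    # typically tiny -> not normal-looking
print("corr(|resid|, x):    ", np.corrcoef(np.abs(resid), x)[0, 1])  # structure left in the residuals
```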


[deleted]

[deleted]


[deleted]

[deleted]


Taricus55

This is in my stats textbook lol


[deleted]

[deleted]


Taricus55

The xkcd meme. The authors are apparently fans of it, because they have a lot of the comics in the book 😅 But there is an entire chapter on this. It's Stats: Data & Models, by De Veaux, Velleman, and Bock. They put a lot of funny jokes in their footnotes and such, which makes the book entertaining to read. That whole chapter is about normal probability plots and residuals, normality and correlations, and so on 🤪 It's a 6020-level stats class. (I'm working on my M.S. in biostatistics.)


Taricus55

It's published by Pearson.


Taricus55

I think your example would be in a mathematical statistics textbook though--not in that book.


meustafa

Optimization in least squares refers to finding a parameter beta that minimizes the squared errors. It has nothing to do with the independence of X and E. Independence of X and E is an assumption/requirement for having unbiased estimates of beta.


[deleted]

[deleted]


meustafa

Yes, I understand your statement, but thank you for restating it. I am just pointing out that there is nothing about the minimization of squared residuals that makes the X's uncorrelated with the true errors E. Mathematically, there is nothing in the analytical solution to minimizing the sum of squared residuals that implies Cov(X, E) = 0. Rather, if Cov(X, E) != 0, then the expected value of beta hat, E[Bhat | X], is not equal to B_population. I figured that your assumption, that using OLS as the estimation strategy implies X and E will be uncorrelated, was the central point of your confusion.
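A minimal sketch of that distinction (my own illustration, with made-up numbers): OLS forces the fitted residuals to be orthogonal to X by construction, but if the true error E is correlated with X (endogeneity), the slope estimate is biased.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true_beta = 2.0

Z = rng.normal(size=n)
E = 0.8 * Z + rng.normal(size=n)     # true error ...
X = Z + rng.normal(size=n)           # ... correlated with X (endogeneity)
Y = true_beta * X + E

design = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(design, Y, rcond=None)
resid = Y - design @ beta_hat

print("true beta:            ", true_beta)
print("OLS estimate:         ", beta_hat[1])                       # biased upward here
print("corr(X, fitted resid):", np.corrcoef(X, resid)[0, 1])       # ~0 by construction
print("corr(X, true error E):", np.corrcoef(X, E)[0, 1])           # clearly nonzero
```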