From association to causation via regression

David A. Freedman
J. Am. Stat. Assoc., v91, pp329-337, 1996.

For nearly a century, investigators in the social sciences have used regression models to deduce cause-and-effect relationships from patterns of association. Path models and automated search procedures are more recent developments. In my view, this enterprise has not been successful. The models tend to neglect the difficulties in establishing causal relations, and the mathematical complexities tend to obscure rather than clarify the assumptions on which the analysis is based.

Formal statistical inference is, by its nature, conditional. If maintained hypotheses A, B, C, ... hold, then H can be tested against the data. However, if A, B, C, ... remain in doubt, so must inferences about H. Careful scrutiny of maintained hypotheses should therefore be a critical part of empirical work, a principle honored more often in the breach than in the observance.

I will discuss modeling techniques that seem to convert association into causation. The object is to clarify the differences among the various uses of regression, and the difficulties in making causal inferences by modeling.
