Diagrams of linear regression
I made a big diagram describing some assumptions (MLR1-6) that are used in linear regression. In my diagram, there are categories (in rectangles with dotted lines) of mathematical facts that follow from different subsets of MLR1-6. References in brackets are to Hayashi (2000).
A couple of comments about the diagram are in order.
 $y$ and $\varepsilon$ are vectors of random variables. $X$ may contain numbers or random variables. $\beta$ is a vector of numbers.
 We measure: realisations of $y$, (realisations of) $X$. We do not measure: $\beta$, $\varepsilon$. We have one equation ($y = X\beta + \varepsilon$) and two unknowns: we need additional assumptions on $\varepsilon$.
 We make a set of assumptions (MLR1-6) about the joint distribution of the regressors $X$ and the error term $\varepsilon$. These assumptions imply some theorems relating the distribution of the OLS estimator to the distribution of $\varepsilon$.
 Note the difference between MLR4 and MLR4’. The point of using the stronger MLR4 is that, in some cases, provided MLR4, MLR2 is not needed. To prove unbiasedness, we don’t need MLR2. For finite sample inference, we also don’t need MLR2. But whenever the law of large numbers is involved, we do need MLR2 as a standalone condition.
 In the diagram, I stick to the brute mathematics, which is entirely independent of its (causal) interpretation.^{1}
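The unbiasedness claim can be checked numerically. What follows is a minimal simulation sketch, not something taken from the diagram or from Hayashi: the sample size, the coefficient values, the normal errors and the function name `simulate_ols` are all assumptions made for the example. The error term is drawn independently of the regressor, which is one simple way of making the zero conditional mean assumption hold.

```python
import numpy as np

def simulate_ols(n=200, reps=2000, beta=(1.0, 2.0), seed=0):
    """Return the OLS estimates from `reps` simulated samples of size `n`."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta)
    estimates = np.empty((reps, beta.size))
    for r in range(reps):
        x = rng.normal(size=n)
        X = np.column_stack([np.ones(n), x])   # intercept plus one regressor
        eps = rng.normal(size=n)               # independent of X, so E[eps | X] = 0
        y = X @ beta + eps                     # the linear model
        b, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimate for this sample
        estimates[r] = b
    return estimates

estimates = simulate_ols()
mean_b = estimates.mean(axis=0)  # average estimate across samples
print(mean_b)
```

Each individual estimate differs from the true coefficients, but their average across samples should sit very close to them. That is all unbiasedness says.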
The second diagram gives the asymptotic distribution of the IV estimator^{2}.
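To make the IV estimator concrete, here is a small sketch with one regressor and one instrument. It does not reproduce the diagram: the data-generating process, the variable names (`z`, `u`, `v`, `w`) and the coefficient value are assumptions chosen so that OLS is inconsistent while IV is not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
beta = 2.0

z = rng.normal(size=n)   # instrument: affects x, unrelated to the error
u = rng.normal(size=n)   # hypothetical unobserved factor shared by x and the error
v = rng.normal(size=n)
w = rng.normal(size=n)

x = z + u + v            # regressor is correlated with the error through u
eps = u + w              # error term
y = beta * x + eps

# OLS slope: sample cov(x, y) / var(x). Inconsistent here because cov(x, eps) != 0.
b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Simple IV slope: sample cov(z, y) / cov(z, x).
b_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(b_ols, b_iv)
```

Under these assumptions the OLS slope converges to $\beta + 1/3$ (the covariance of `x` and `eps` is 1 and the variance of `x` is 3), whereas the IV slope converges to $\beta$.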

But of course what really matters is the causal interpretation.
As Pearl (2009) writes, “behind every causal claim there must lie some causal assumption that is not discernible from the joint distribution and, hence, not testable in observational studies”. If we wish to interpret the coefficients $\beta$ (and hence their estimates) causally, we must interpret MLR4 causally; it becomes a (strong) causal assumption.
As far as I can tell, when econometricians give a causal interpretation it is typically done thus (they are rarely explicit about it):
 MLR1 holds in every possible world (alternatively: it specifies not just actual, but all potential outcomes), hence $\varepsilon$ is unobservable even in principle.
 Yet we make assumption MLR4 about $\varepsilon$.
This talk of the distribution of a fundamentally unobservable “variable” is a confusing device. Pearl’s method is more explicit: replace MLR with the causal graph below, where $:=$ is used to make it extra clear that the causation only runs one way. MLR1 corresponds to the expression for $y$ (and, redundantly, the two arrows towards $y$); MLR4 corresponds to the absence of arrows connecting $X$ and $\varepsilon$. We thus avoid “hiding causal assumptions under the guise of latent variables” (Pearl). (Because of the confusing device, econometricians, to put it kindly, don’t always sharply distinguish the mathematics of the diagram from its (causal) interpretation. To see me rant about this, see here.)
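The role of the missing arrow can be sketched in code. This is an illustrative toy model, not Pearl's or anyone else's implementation: the function `draw`, the hypothetical common cause `u` and the coefficients are assumptions. The assignment for `y` is the same in both worlds. Only the presence of a link between `x` and `eps` changes, and an intervention is modelled by overwriting `x` regardless of its usual causes.

```python
import numpy as np

def draw(n, rng, linked, do_x=None):
    """Sample from the toy structural model; `linked` adds a common cause of x and eps."""
    u = rng.normal(size=n)                   # hypothetical common cause
    shared = u if linked else 0.0
    eps = rng.normal(size=n) + shared
    x = rng.normal(size=n) + shared
    if do_x is not None:
        x = np.full(n, float(do_x))          # intervention: x no longer depends on u
    y = 1.0 + 2.0 * x + eps                  # y := 1 + 2 x + eps
    return x, y

rng = np.random.default_rng(2)
n = 200_000
results = {}
for linked in (False, True):
    x, y = draw(n, rng, linked)
    slope = np.polyfit(x, y, 1)[0]           # observational regression slope
    _, y1 = draw(n, rng, linked, do_x=1)
    _, y0 = draw(n, rng, linked, do_x=0)
    effect = y1.mean() - y0.mean()           # effect of setting x to 1 rather than 0
    results[linked] = (slope, effect)
    print(linked, slope, effect)
```

In both worlds the effect of the intervention is about 2, because the assignment for `y` is unchanged. The regression slope matches it only when there is no link between `x` and `eps`; with the link it converges to 2.5 under these assumptions.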

For IV, it’s even clearer that the only reason to care is the causal interpretation. But I follow good econometrics practice and make only mathematical claims.