3.6 Borel Paradox
Throughout this chapter, for continuous rv \(X, Y\), we have been writing expressions such as \(E(Y \rvert X=x)\) and \(P(Y \le y \rvert X=x)\). Thus far, we have not gotten into trouble. However, we might have.
Formally, the conditioning in a conditional expectation is done with respect to a sub sigma-algebra(1.2.1), and the conditional E $E(Y G) $ is defined as a rv whose integral, over any set in the sub sigma-algebra \(G\), agrees with that of \(X\). This is quite an advanced concept in probatbility theory (see Billingsley 1995, Section 34).
Since the conditional E is only defined in terms of its integral, it may not be unique even if the conditioning is well-defined. However, when we condition on sets of probatbility 0 (such as $ { X=x }$), conditioning may not be well defined, so different conditional expectations are more likely to appear. To see how this could affect us, it is easiest to look at conditional distributions, which amounts to calculating \(E \left[ I(Y \le y) \rvert X=x \right]\).
Proschan and Presnell (1998) tell the story of a statistics exam that had the question “If \(X\) and \(Y\) are independent standard normals, what is the conditional distributions of \(Y\) given that \(Y=X\)?” Different students interpreted the condition \(Y=X\) in the following ways: 1. \(Z_1 = 0\), where \(Z_1 = Y-X\); 2. \(Z_2 = 1\), where \(Z_2 = Y/X\); 3. \(Z_3 = 1\), where \(Z_3 = I(Y=X)\).
Each condtion is a correct interpretation of the conditon \(Y=X\), and each leads to a different conditional distribution (see Excercise 4.60.).
This is the Borel Paradox and arises b/c different (Correct) interpretations of the probatbility 0 conditioning sets result in different conditional E. How can we avoid the paradox? One way is to avoid conditioning on sets probatbility 0. That is, compute only \(E(Y \rvert X \in B )\), where \(B\) is a set with \(P (X \in B)>0\). So to compute something like \(E(Y \rvert X =x )\), take a sequence \(B_n \downarrow x\), and define \(E(Y \rvert X =x )= \lim_{n \rightarrow \infty} E(Y \rvert X \in B_n )\). We now avoid the paradox, as the different answers for \(E(Y \rvert X =x )\) will arose from different sequences, so there should be no surprises (Exercise 4.61).