6.7 Flat
6.7.1 Flat
Sometimes in statistical applications it is useful to consider a linear subspace that is shifted or translated from the origin. This will happen, for example, in models that include an intercept. It is therefore helpful to have the following definition of a space that is displaced from the origin.
- Definition 1 (Flat)
Suppose M⊂V is a linear subspace and y0∈V. Then a flat is a set of the form {x+y0 | x∈M}. We write y0+M, where M is a subspace, to denote a flat.
By translation, flats are equivalent to vector subspaces: if Y is a random vector whose domain is the flat y0+M then, for fixed y0, Y−y0 has domain M.
- example
The set S4={(1,1,1)′+z | z∈S2} is a flat; it is not a subspace, because 0∉S4.
- example
In R2, consider M={α(1,2)′ | α∈R} and y0=(2,2)′.
Then the flat y0+M is given by the set y0+M={(2,2)′+α(1,2)′ | α∈R},
which is just a straight line that does not pass through the origin, but rather through the point (2,2). The choice of y0 is not unique: it can be any point y=y0+yα, where yα=α(1,2)′. For example, if α=−2 then y=(0,−2)′, and if α=+1 then y=(3,4)′, and so on. For any y0 not of this form, we simply get a different flat. This is summarized in the next result.
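As a quick numerical sketch of this example (the helper `on_flat` and the variable names are mine):

```python
import numpy as np

# The flat y0 + M in R^2: points (2,2)' + alpha*(1,2)' for scalar alpha.
y0 = np.array([2.0, 2.0])
d = np.array([1.0, 2.0])  # direction spanning the subspace M

def on_flat(y, tol=1e-12):
    """Check whether y lies on the flat y0 + M, i.e. y - y0 is a multiple of d."""
    diff = y - y0
    # y - y0 is in M iff it is parallel to d (the 2x2 determinant vanishes)
    return abs(diff[0] * d[1] - diff[1] * d[0]) < tol

# Alternative representatives of the same flat (alpha = -2 and alpha = +1):
print(on_flat(np.array([0.0, -2.0])))  # True
print(on_flat(np.array([3.0, 4.0])))   # True
print(on_flat(np.array([1.0, 1.0])))   # False: a point on a different flat
```

Any point on the line serves equally well as the offset y0, matching the non-uniqueness noted above.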
- Theorem 1
The two flats
$ F_1=\{z \mid z=y_0+x,\ y_0\in V,\ x\in M\subset V\}, \qquad F_2=\{z \mid z=y_1+x,\ y_1\in F_1,\ x\in M\subset V\} $
are the same set, so the representation of a flat is not unique.
- Definition 2 (Sum and intersection of subspaces)
Let H,K be two linear subspaces of V. Then
$ H+K=\{x+y \mid x\in H,\ y\in K\}, \qquad H\cap K=\{x \mid x\in H,\ x\in K\}. $
- Theorem 2
Both H+K and H∩K are linear subspaces.
- Definition 3 (Disjoint subspaces)
Two subspaces H and K are disjoint if H∩K={0}, where 0 is the null vector.
- Theorem 3
If H∩K={0}, and z∈H+K, then the decomposition z=x+y with x∈H and y∈K is unique.
Proof: Suppose z=x+y and z=x′+y′ with x,x′∈H and y,y′∈K. Then x−x′∈H and y′−y∈K. From x+y=x′+y′ we get x−x′=y′−y, so this common vector lies in both H and K, which requires x−x′=y′−y=0, since 0 is the only vector common to H and K. Thus x=x′ and y=y′.
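Theorem 3 can be checked numerically; the following sketch (my choice of H, K, and z in R2) recovers the unique decomposition by solving a small linear system:

```python
import numpy as np

# Disjoint subspaces of R^2 (chosen for illustration):
# H spanned by h = (1,1)', K spanned by k = (1,-1)'; H ∩ K = {0}.
h = np.array([1.0, 1.0])
k = np.array([1.0, -1.0])

z = np.array([5.0, 1.0])  # any z in H + K = R^2

# Solve [h k] @ (a, b)' = z for the unique coefficients.
a, b = np.linalg.solve(np.column_stack([h, k]), z)
x, y = a * h, b * k  # x in H, y in K

print(x, y)                    # the unique decomposition z = x + y
print(np.allclose(x + y, z))   # True
```

Because h and k are linearly independent, the 2x2 system has exactly one solution, which is the uniqueness the theorem asserts.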
- Theorem 4
If H∩K={0}, then dim(H+K)=dim(H)+dim(K). In general,
$ \dim(H+K) = \dim(H) + \dim(K) - \dim(H\cap K). $
Proof: Exercise.
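Since the proof is left as an exercise, here is a numerical check of the dimension formula on a concrete pair of subspaces of R3 (bases are my choice):

```python
import numpy as np

# Check dim(H+K) = dim(H) + dim(K) - dim(H ∩ K) in R^3.
# H = span{e1, e2}, K = span{e2, e3}, so H ∩ K = span{e2} has dimension 1.
H = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # columns span H
K = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # columns span K

dim_H = np.linalg.matrix_rank(H)                     # 2
dim_K = np.linalg.matrix_rank(K)                     # 2
dim_sum = np.linalg.matrix_rank(np.hstack([H, K]))   # dim(H+K) = 3

print(dim_sum == dim_H + dim_K - 1)  # True, with dim(H ∩ K) = 1
```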
- Definition 4 (Complement of a space)
If M and Mc are disjoint subspaces of V and V=M+Mc, then Mc is called a complement of M.
- Remark 1: The complement is not unique. In R2, a subspace M of dimension 1 is a line through the origin. A complement of M is given by any other line Mc through the origin with Mc≠M, because linear combinations of vectors from two distinct such lines span R2.
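A small sketch of Remark 1 (the vectors are my choice): two different lines both complement M = span{(1,0)′} in R2, since in each case the pair of spanning vectors has rank 2.

```python
import numpy as np

# M = span{(1,0)'} in R^2; two candidate complements Mc.
m = np.array([1.0, 0.0])
for c in (np.array([0.0, 1.0]), np.array([1.0, 1.0])):
    # rank([m c]) == 2 means m and c are linearly independent, so
    # M + Mc = R^2 and M ∩ Mc = {0}: Mc is a complement of M.
    print(np.linalg.matrix_rank(np.column_stack([m, c])) == 2)  # True both times
```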
In the linear model Y=Xβ+ϵ, we have μ = E(Y) = Xβ, so that μ∈C(X). To estimate μ with ˆμ, we might want to require that ˆμ∈C(X) (note: C(X) is itself a subspace; flats arise when, for example, the coefficient of a constant column in X is held fixed). The estimate would then depend upon Y in a sensible way, by moving Y to the subspace. The method of moving is via projections, and the optimality of a move depends on how we measure distance, that is, on the inner product defined on the vector space.
6.7.2 Solutions to systems of linear equations
Consider the matrix equation Xn×pβp×1=yn×1. For given X and y, does there exist a solution β to these equations? Is it unique? If it is not unique, can we characterize all possible solutions?
- If n=p and X is nonsingular, the unique solution is β=X−1y.
- If y∈C(X), y can be expressed as a linear combination of the columns of X. If X is of full column rank, then the columns of X form a basis for C(X), and the solution β is just the coordinates of y relative to this basis. For any g-inverse X−, we have XX−y=y for all y∈C(X), and so a solution is given by β=X−y.
If ρ(X)=dim(C(X))<p, then the solution is not unique. If β0 is any solution, for example β0=X−y, then so is β0+z for any z∈N(X), the null space of X. The set of solutions is therefore β0+N(X), which is a flat.
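A sketch of the rank-deficient case, using numpy's Moore-Penrose pseudoinverse as the g-inverse X− (the matrix X below is my example):

```python
import numpy as np

# Rank-deficient system: X has 3 columns but rank 2, and y ∈ C(X).
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0],
              [1.0, 3.0, 4.0]])      # third column = first + second
y = X @ np.array([1.0, 1.0, 0.0])    # construction guarantees y ∈ C(X)

# One particular solution via a g-inverse (Moore-Penrose here):
beta0 = np.linalg.pinv(X) @ y
print(np.allclose(X @ beta0, y))     # True: beta0 solves the system

# Any vector in N(X) shifts beta0 to another solution:
_, _, Vt = np.linalg.svd(X)
z = Vt[-1]  # basis vector for the one-dimensional null space of X
print(np.allclose(X @ (beta0 + 5.0 * z), y))  # True: solutions form the flat beta0 + N(X)
```

Shifting beta0 by any multiple of z leaves Xβ unchanged, which is exactly the flat of solutions described above.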
- If y∉C(X), then there is no exact solution. This is the usual situation in linear models, and leads to the estimation problem discussed in the next chapter.
What we might do is find an approximate solution by replacing Y with another vector ˆY that is as close to Y as possible; if we define close as making \|Y - \hat{Y}\| small, we need to solve X \beta = P_{\mathcal{C}(X)}Y instead of the original equation. If X has full column rank, this leads to the familiar solution:
$ \begin{align} \beta_0 &= X^+ P_{\mathcal{C}(X)} Y \\ &= (X'X)^{-1} X' X (X'X)^{-1}X' Y \\ &= (X'X)^{-1}X'Y \tag{2} \end{align} $
which is unique. If X does not have full column rank, then the set of solutions again forms a flat of the form \beta_0 + N(X), with \beta_0 now given by the g-inverse version of (2), \beta_0 = X^- P_{\mathcal{C}(X)}Y.
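As a check on (2), the normal-equations solution agrees with numpy's least-squares routine on a full-column-rank example (simulated data; the dimensions are my choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(10), rng.normal(size=(10, 2))])  # full column rank
Y = rng.normal(size=10)  # generic Y, almost surely not in C(X)

# Normal-equations solution: beta0 = (X'X)^{-1} X'Y
beta0 = np.linalg.solve(X.T @ X, X.T @ Y)

# Same answer from the library least-squares routine:
beta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta0, beta_ls))  # True
```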