6.7 Flat

6.7.1 Flat

Sometimes in statistical applications it is useful to consider a linear subspace that is shifted or translated from the origin. This will happen, for example, in models that include an intercept. It is therefore helpful to have the following definition of a space that is displaced from the origin.



  • Definition 1 (Flat)

Suppose $M \subset V$ is a linear subspace, and $y_0 \in V$. Then a flat consists of the set $\{x + y_0 \mid x \in M\}$. We will write $y_0 + M$, where $M$ is a subspace, to indicate a flat.

By considering translations, flats are equivalent to vector spaces: if $Y$ is a random variable whose domain is the flat $y_0 + M$, then, for fixed $y_0$, $Y - y_0$ has domain $M$.



  • Example

The set $S_4 = \{(1,1,1) + z \mid z \in S_2\}$ is a flat; it is not a subspace, because $0 \notin S_4$.



  • Example

In $\mathbb{R}^2$, consider $M = \{\alpha (1,2)' \mid \alpha \in \mathbb{R}\}$ and $y_0 = (2,2)'$.

Then the flat $y_0 + M$ is given by the set $y_0 + M = \{(2,2)' + \alpha(1,2)' \mid \alpha \in \mathbb{R}\}$,

which is just a straight line that does not pass through the origin, but rather through the point $(2,2)'$. The choice of $y_0$ is not unique: it can be any point $y = y_0 + y_\alpha$, where $y_\alpha = \alpha(1,2)'$. For example, if $\alpha = -2$, then $y = (0,-2)'$, and if $\alpha = +1$, then $y = (3,4)'$, and so on. For any $y_0$ not of this form, we simply get a different flat. This is summarized in the next theorem.
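A small numerical check of this example, assuming NumPy; the helper `in_M` is a hypothetical name introduced here for the sketch.

```python
import numpy as np

# The flat y0 + M in R^2 with M = span{(1, 2)'} and y0 = (2, 2)'.
# Any point of the flat can serve as the representative y0: the
# difference of two candidate representatives must lie in M.
d = np.array([1.0, 2.0])          # direction spanning M
y0 = np.array([2.0, 2.0])
y1 = y0 + (-2) * d                # alpha = -2 -> (0, -2)
y2 = y0 + 1 * d                   # alpha = +1 -> (3, 4)

def in_M(v, d):
    # v lies in span{d} iff [d v] has rank 1 (for nonzero d)
    return np.linalg.matrix_rank(np.column_stack([d, v])) == 1

print(y1, y2)                             # the two representatives
print(in_M(y1 - y0, d), in_M(y2 - y0, d)) # both differences lie in M
```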



  • Theorem 1

The two sets

$ F_1 = \{z \mid z = y_0 + x,\ y_0 \in V,\ x \in M \subset V\} \qquad F_2 = \{z \mid z = y_1 + x,\ y_1 \in F_1,\ x \in M \subset V\} $

are the same flat, so the representation of the flat is not unique.



  • Definition 2 (Sum and intersection of subspaces)

Let $H, K$ be two linear subspaces. Then

$ H + K = \{x + y \mid x \in H,\ y \in K\} \qquad H \cap K = \{x \mid x \in H,\ x \in K\} $



  • Theorem 2

Both $H + K$ and $H \cap K$ are linear subspaces.



  • Definition 3 (Disjoint subspaces)

Two subspaces $H$ and $K$ are disjoint if $H \cap K = \{0\}$, where $0$ is the null vector.



  • Theorem 3

If $H \cap K = \{0\}$ and $z \in H + K$, then the decomposition $z = x + y$ with $x \in H$ and $y \in K$ is unique.

Proof. Suppose $z = x_1 + y_1$ and $z = x_2 + y_2$, with $x_1, x_2 \in H$ and $y_1, y_2 \in K$. Then $x_1 - x_2 \in H$ and $y_2 - y_1 \in K$. We must have $x_1 + y_1 = x_2 + y_2$, or $x_1 - x_2 = y_2 - y_1$, which in turn requires that $x_1 - x_2 = y_2 - y_1 = 0$, since $0$ is the only vector common to $H$ and $K$. Thus $x_1 = x_2$ and $y_1 = y_2$.
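The unique decomposition can be computed by stacking bases of $H$ and $K$ and solving for the coefficients. A minimal sketch assuming NumPy; the particular subspaces are illustrative choices, not from the text.

```python
import numpy as np

# With H ∩ K = {0}, the decomposition z = x + y (x in H, y in K)
# is unique: stack basis matrices of H and K and solve.
H = np.array([[1.0], [0.0], [0.0]])                  # a line in R^3
K = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # the yz-plane
z = np.array([3.0, -1.0, 2.0])

B = np.column_stack([H, K])     # basis of H + K (here all of R^3)
c = np.linalg.solve(B, z)       # unique coefficient vector
x, y = H @ c[:1], K @ c[1:]     # the unique components
print(np.allclose(x + y, z))    # the pieces recombine to z
```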



  • Theorem 4

If $H \cap K = \{0\}$, then $\dim(H+K) = \dim(H) + \dim(K)$. In general, $ \dim(H+K) = \dim(H) + \dim(K) - \dim(H \cap K) $.

Proof: Exercise.
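The dimension formula can be checked numerically. A minimal sketch assuming NumPy; the basis matrices are illustrative choices.

```python
import numpy as np

# Check dim(H + K) = dim(H) + dim(K) - dim(H ∩ K).
# Columns of each matrix span the subspace.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])      # the xy-plane in R^3
K = np.array([[0.0],
              [1.0],
              [1.0]])           # a line not contained in H

dim_H = np.linalg.matrix_rank(H)
dim_K = np.linalg.matrix_rank(K)
dim_sum = np.linalg.matrix_rank(np.column_stack([H, K]))  # dim(H + K)
dim_int = dim_H + dim_K - dim_sum                         # dim(H ∩ K)
print(dim_H, dim_K, dim_sum, dim_int)
```

Here `dim_int` is 0, so $H$ and $K$ are disjoint and, by Theorem 3, every vector of $H + K$ decomposes uniquely.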



  • Definition 4 (Complement of a space)

If $M$ and $M^c$ are disjoint subspaces of $V$ and $V = M + M^c$, then $M^c$ is called a complement of $M$.

  • Remark 1: The complement is not unique. In $\mathbb{R}^2$, a subspace $M$ of dimension 1 consists of a line through the origin. A complement of $M$ is given by any other line $M^c \neq M$ through the origin, because linear combinations of vectors from any two such lines span $\mathbb{R}^2$.

In the linear model $Y = X\beta + \epsilon$, we have $\mu = E(Y) = X\beta$, so that $\mu \in C(X)$. To estimate $\mu$ with $\hat{\mu}$, we might want to require that $\hat{\mu} \in C(X)$ (note: if $X$ includes a constant, then $C(X)$ is a flat; otherwise, it is a subspace). The estimate would then depend upon $Y$ in a sensible way, by moving $Y$ to the subspace. The method of moving is via projections. The optimality of the move depends on the way we measure distance, that is, on an inner product defined on the vector space.
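The projection step can be sketched as follows, assuming NumPy and the standard Euclidean inner product; the design matrix and response are illustrative.

```python
import numpy as np

# Moving Y to the column space C(X) by orthogonal projection.
rng = np.random.default_rng(0)
n = 6
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])  # intercept + trend
Y = rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection matrix onto C(X)
mu_hat = P @ Y                          # estimate of mu, lying in C(X)

# The residual Y - mu_hat is orthogonal to every column of X.
print(np.allclose(X.T @ (Y - mu_hat), 0))
```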







6.7.2 Solutions to systems of linear equations

Consider the matrix equation $X_{n \times p}\, \beta_{p \times 1} = y_{n \times 1}$. For a given $X$ and $y$, does there exist a $\beta$ satisfying these equations? Is it unique? If not unique, can we characterize all possible solutions?



  1. If $n = p$ and $X$ is nonsingular, the unique solution is $\beta = X^{-1}y$.



  2. If $y \in C(X)$, then $y$ can be expressed as a linear combination of the columns of $X$. If $X$ is of full column rank, then the columns of $X$ form a basis for $C(X)$, and the solution $\beta$ is just the vector of coordinates of $y$ relative to this basis. For any g-inverse $X^-$, we have $XX^-y = y$ for all $y \in C(X)$, and so a solution is given by $\beta = X^-y$.
    If $\rho(X) = \dim(C(X)) < p$, then the solution is not unique. If $\beta_0$ is any solution, for example $\beta_0 = X^-y$, then so is $\beta_0 + z$ for any $z \in N(X)$, the null space of $X$. The set of solutions is given by $\beta_0 + N(X)$, which is a flat.



  3. If $y \notin C(X)$, then there is no exact solution. This is the usual situation in linear models, and it leads to the estimation problem discussed in the next chapter.
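The first two cases can be sketched numerically; a minimal sketch assuming NumPy, with illustrative matrices that are not from the text.

```python
import numpy as np

# Case 1: n = p, X nonsingular -> unique solution beta = X^{-1} y.
X1 = np.array([[2.0, 1.0],
               [1.0, 3.0]])
y1 = np.array([5.0, 10.0])
b1 = np.linalg.solve(X1, y1)          # preferred over forming X^{-1}
print(np.allclose(X1 @ b1, y1))       # b1 solves the system exactly

# Case 2: rank(X) < p with y in C(X) -> solutions form the flat
# beta0 + N(X); the Moore-Penrose g-inverse supplies one beta0.
X2 = np.array([[1.0, 2.0, 3.0],
               [2.0, 4.0, 6.0],
               [1.0, 0.0, 1.0]])      # rank 2 < p = 3
y2 = X2 @ np.array([1.0, 1.0, 0.0])   # guarantees y2 in C(X2)
b0 = np.linalg.pinv(X2) @ y2          # one particular solution
_, _, Vt = np.linalg.svd(X2)
z = Vt[-1]                            # spans N(X2): X2 @ z = 0
print(np.allclose(X2 @ b0, y2))               # b0 is a solution
print(np.allclose(X2 @ (b0 + 2.5 * z), y2))   # so is any point of the flat
```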

What we might do is get the closest solution by replacing $Y$ by another vector $\hat{Y}$ that is as close to $Y$ as possible; if we define close as making $\lVert Y - \hat{Y} \rVert$ small, we need to solve $X\beta = P_{\mathcal{C}(X)}Y$ instead of the original equation. If $X$ has full column rank, this leads to the familiar solution:

$ \begin{align} \beta_0 &= X^+ P_{\mathcal{C}(X)} Y \\ &= (X'X)^{-1} X' \, X (X'X)^{-1}X' \, Y \\ &= (X'X)^{-1}X'Y \tag{2} \end{align} $

which is unique. If $X$ does not have full column rank, then the set of solutions again forms a flat of the form $\beta_0 + N(X)$, with $\beta_0$ given by the first line of (2), $\beta_0 = X^+ P_{\mathcal{C}(X)}Y$.
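A quick numerical check of (2), assuming NumPy; the simulated $X$ and $Y$ are illustrative.

```python
import numpy as np

# With full column rank, (X'X)^{-1} X'Y is the unique least-squares
# solution; it matches np.linalg.lstsq.
rng = np.random.default_rng(1)
n, p = 8, 3
X = rng.normal(size=(n, p))           # full column rank (almost surely)
Y = rng.normal(size=n)                # Y typically not in C(X)

beta_normal = np.linalg.inv(X.T @ X) @ X.T @ Y
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.allclose(beta_normal, beta_lstsq))   # the two solutions agree

# The fitted values X beta are the projection of Y onto C(X):
resid = Y - X @ beta_normal
print(np.allclose(X.T @ resid, 0))            # residual orthogonal to C(X)
```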