5.5 Multivariate Multiple Regression (wk6)
5.5.1 Overview
Recall:
univariate Linear Regression:
response variable $Y$, $r$ predictor variables $Z_1, \cdots, Z_r$.
- model:
$
Y_j = \beta_0 + \beta_1 Z_{j1} + \cdots + \beta_r Z_{jr} + \epsilon_j, \quad E(\epsilon_j) = 0, \; Var(\epsilon_j) = \sigma^2
$
$
\underset{n \times 1}{Y} = \underset{n \times (r+1)}{Z} \, \underset{(r+1) \times 1}{\beta} + \underset{n \times 1}{\epsilon}, \quad E(\epsilon) = 0, \; Var(\epsilon) = \sigma^2 I
$
- estimation:
$
\hat\beta = (Z'Z)^{-1} Z' Y, \quad
\hat\epsilon = Y - Z\hat\beta = Y - Z(Z'Z)^{-1}Z'Y = (I - Z(Z'Z)^{-1}Z')Y = (I - H)Y
$
- inference:
let $\epsilon \sim N_n(0, \sigma^2 I)$. then
$
\hat\beta \sim N_{r+1}(\beta, \sigma^2 (Z'Z)^{-1}), \quad
\hat\epsilon'\hat\epsilon \sim \sigma^2 \chi^2_{n-r-1}, \quad
E(\hat\epsilon) = 0, \quad
Cov(\hat\epsilon) = \sigma^2 (I - Z(Z'Z)^{-1}Z'), \quad
E\!\left( \frac{\hat\epsilon'\hat\epsilon}{n-r-1} \right) = \sigma^2
$
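The univariate recap above can be checked numerically. A minimal sketch with simulated data (the simulation setup and variable names are my own, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 2
# design matrix Z with an intercept column, n x (r+1)
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])
beta = np.array([1.0, 2.0, -0.5])
y = Z @ beta + rng.normal(scale=0.3, size=n)

# beta_hat = (Z'Z)^{-1} Z'y, computed via a linear solve (more stable
# than forming the inverse explicitly)
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)
resid = y - Z @ beta_hat
s2 = resid @ resid / (n - r - 1)   # unbiased estimate of sigma^2
```

The normal equations force $Z'\hat\epsilon = 0$, which the residuals above satisfy to machine precision.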
5.5.2 Multivariate Multiple Regression
Notation
Model
$
\underset{n \times m}{Y} = \underset{n \times (r+1)}{Z} \, \underset{(r+1) \times m}{\beta} + \underset{n \times m}{\epsilon}, \quad
E(\epsilon_{(i)}) = 0, \; Cov(\epsilon_{(i)}, \epsilon_{(k)}) = \sigma_{ik} I, \quad i, k = 1, \cdots, m
$
- Cov of m responses:
$
\Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1m} \\ \vdots & \ddots & \vdots \\ \sigma_{m1} & \cdots & \sigma_{mm} \end{bmatrix}, \quad
Var(\epsilon_{(i)}) = \sigma_{ii} I, \quad
Cov(\epsilon_{(i)}, \epsilon_{(k)}) = \begin{bmatrix} \sigma_{ik} & & 0 \\ & \ddots & \\ 0 & & \sigma_{ik} \end{bmatrix} = \sigma_{ik} I
$
- the meaning of
- 0: observations from different trials are uncorrelated
- σik: errors for different responses on the same trial are correlated
- $i$th response $Y_{(i)}$:
$
Y_{(i)} = Z \beta_{(i)} + \epsilon_{(i)}, \quad Cov(\epsilon_{(i)}) = \sigma_{ii} I, \quad
\hat\beta_{(i)} = (Z'Z)^{-1} Z' Y_{(i)}
$
##### Least Squares
- Collecting the univariate least squares estimates (LSE) and errors:
$
\hat\beta = [\hat\beta_{(1)}, \cdots, \hat\beta_{(m)}] = (Z'Z)^{-1} Z' Y, \quad
\hat\epsilon = Y - Z\hat\beta = [\hat\epsilon_{(1)}, \cdots, \hat\epsilon_{(m)}]
$
- Error Sum of Squares (SSE):
  - diagonal elements: the error SS for univariate least squares, $(Y_{(i)} - Z\beta_{(i)})'(Y_{(i)} - Z\beta_{(i)})$, is minimized.
  - the generalized variance $|(Y - Z\beta)'(Y - Z\beta)|$ is also minimized.
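Since the matrix formula just collects the univariate fits, both routes below give the same $\hat\beta$. A sketch on simulated data (names and setup are my own):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, m = 40, 3, 2
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])   # n x (r+1)
B = rng.normal(size=(r + 1, m))                              # true beta, (r+1) x m
Y = Z @ B + rng.normal(scale=0.2, size=(n, m))

# one matrix computation: beta_hat = (Z'Z)^{-1} Z' Y
B_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)

# m separate univariate fits, one per response column
B_cols = np.column_stack(
    [np.linalg.solve(Z.T @ Z, Z.T @ Y[:, i]) for i in range(m)]
)
```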
- Properties
$
\begin{align*}
\hat Y &= Z\hat\beta = Z(Z'Z)^{-1}Z'Y = HY \tag{1} \\
\hat\epsilon &= Y - \hat Y = (I - H)Y \tag{2} \\
Z'\hat\epsilon &= Z'(I - H)Y = [Z' - Z'Z(Z'Z)^{-1}Z']Y = 0 \tag{3} \\
\hat Y'\hat\epsilon &= \hat\beta' Z'(I - H)Y = [\hat\beta' Z' - \hat\beta' Z'Z(Z'Z)^{-1}Z']Y = 0 \tag{4}
\end{align*}
$
  - by (3), residuals are orthogonal to $Z$
  - by (4), residuals are orthogonal to $\hat Y$
- Error Sum of Squares
$
\begin{align*}
Y'Y &= (\hat Y + \hat\epsilon)'(\hat Y + \hat\epsilon) = \hat Y'\hat Y + \hat\epsilon'\hat\epsilon \\
\hat\epsilon'\hat\epsilon &= Y'Y - \hat Y'\hat Y = Y'Y - \hat\beta' Z'Z \hat\beta
\end{align*}
$
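The decomposition can be verified numerically: the cross terms vanish because residuals are orthogonal to fitted values, so total, fitted, and residual sums of squares add up exactly (sketch on simulated data; names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, m = 30, 2, 3
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])
Y = Z @ rng.normal(size=(r + 1, m)) + rng.normal(size=(n, m))

B_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
Y_hat = Z @ B_hat
resid = Y - Y_hat

# Y'Y = Yhat'Yhat + eps'eps, and eps'eps = Y'Y - Bhat' Z'Z Bhat
total = Y.T @ Y
sse = resid.T @ resid
```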
- Result 1
$
\begin{alignat*}{2}
E(\hat\beta_{(i)}) &= \beta_{(i)}, &\quad Cov(\hat\beta_{(i)}, \hat\beta_{(k)}) &= \sigma_{ik} (Z'Z)^{-1} \\
E(\hat\epsilon) &= 0, &\quad E(\hat\epsilon'\hat\epsilon) &= (n - r - 1)\Sigma
\end{alignat*}
$
  - here, $\hat\epsilon$ and $\hat\beta$ are uncorrelated, while the $\hat\beta_{(i)}$ across responses are correlated through $\sigma_{ik}$.
- Result 2
  - If $\epsilon_j$ has a $N_m(0, \Sigma)$ distribution, then $\hat\beta = (Z'Z)^{-1}Z'Y$ is the MLE of $\beta$ and
$
\begin{align*}
\hat\beta_{(i)} &\sim N_{r+1}(\beta_{(i)}, \sigma_{ii} (Z'Z)^{-1}) \\
\hat\Sigma &= \frac{1}{n}\hat\epsilon'\hat\epsilon = \frac{1}{n}(Y - Z\hat\beta)'(Y - Z\hat\beta) \tag{5}
\end{align*}
$
  - (5) is the MLE of $\Sigma$, and $n\hat\Sigma \sim W_{m,\, n-r-1}(\Sigma)$.
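A quick numerical illustration of the two covariance estimates: $\hat\epsilon'\hat\epsilon / n$ is the (biased) MLE from Result 2, while dividing by $n - r - 1$ gives the unbiased version implied by Result 1. A sketch; the true $\Sigma$ here is my own choice:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, m = 500, 2, 2
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])                    # true error covariance
L = np.linalg.cholesky(Sigma)
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])
Y = Z @ rng.normal(size=(r + 1, m)) + rng.normal(size=(n, m)) @ L.T

resid = Y - Z @ np.linalg.solve(Z.T @ Z, Z.T @ Y)
Sigma_mle = resid.T @ resid / n                   # MLE, slightly biased downward
Sigma_unb = resid.T @ resid / (n - r - 1)         # unbiased: E(eps'eps) = (n-r-1) Sigma
```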
- Comment
  - Multivariate regression introduces no new computational problems: the univariate least squares estimates $\hat\beta_{(i)}$ are computed individually for each response variable.
  - Diagnostic checks must be done as in univariate regression.
  - Residual vectors $[\hat\epsilon_{j1}, \cdots, \hat\epsilon_{jm}]$ can be examined for multivariate normality.
### Hypothesis Testing
- Note: to test whether the predictors $Z_{q+1}, Z_{q+2}, \cdots, Z_r$ aid in prediction, partition $\beta$ and state the hypothesis as
$
H_0: \beta_{(2)} = 0, \quad \text{where } \beta = \begin{bmatrix} \beta_{(1)} \\ \beta_{(2)} \end{bmatrix}, \; \underset{(q+1) \times m}{\beta_{(1)}}, \; \underset{(r-q) \times m}{\beta_{(2)}}
$
##### Full Model vs. Reduced Model
let $Z = [Z_1 \mid Z_2]$, then $Z\beta = Z_1\beta_{(1)} + Z_2\beta_{(2)}$.
under $H_0$, $Y = Z_1\beta_{(1)} + \epsilon$.
let
$
\begin{align*}
E &= n\hat\Sigma = (Y - Z\hat\beta)'(Y - Z\hat\beta) \\
H &= n(\hat\Sigma_1 - \hat\Sigma), \quad E_1 = n\hat\Sigma_1 = (Y - Z_1\hat\beta_{(1)})'(Y - Z_1\hat\beta_{(1)})
\end{align*}
$
- $E = n\hat\Sigma$: $E$ is the error matrix, i.e. the matrix collecting the errors obtained by repeating the univariate fit once per response (four times in the example below).
let $\lambda_1 \ge \cdots \ge \lambda_s$ be the non-zero eigenvalues of $HE^{-1}$, $s = \min(m, r - q)$.
- Four Test Statistics:
  - Wilks' Lambda:
$
\Lambda^* = \prod_{i=1}^s \frac{1}{1 + \lambda_i} = \frac{|E|}{|E + H|}
$
  - Pillai's Trace:
$
tr[H(H + E)^{-1}] = \sum_{i=1}^s \frac{\lambda_i}{1 + \lambda_i}
$
  - Lawley-Hotelling's Trace:
$
tr(H E^{-1}) = \sum_{i=1}^s \lambda_i
$
  - Roy's Largest Root:
    - the maximum eigenvalue of $H(H + E)^{-1}$, namely $\frac{\lambda_1}{1 + \lambda_1}$.
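All four statistics are functions of the same eigenvalues, so a small helper (my own naming, not from the notes) makes the identities above easy to verify:

```python
import numpy as np

def mv_test_stats(H, E):
    """Four multivariate test statistics from the eigenvalues of H E^{-1}."""
    # E^{-1} H has the same nonzero eigenvalues as H E^{-1}
    lam = np.sort(np.linalg.eigvals(np.linalg.solve(E, H)).real)[::-1]
    lam = np.clip(lam, 0.0, None)        # guard against tiny negative round-off
    return {
        "wilks": float(np.prod(1.0 / (1.0 + lam))),      # |E| / |E+H|
        "pillai": float(np.sum(lam / (1.0 + lam))),      # tr[H (H+E)^{-1}]
        "lawley_hotelling": float(np.sum(lam)),          # tr(H E^{-1})
        "roy": float(lam[0] / (1.0 + lam[0])),           # max ev of H (H+E)^{-1}
    }
```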
5.5.3 Example
fit the full model $Y = Z\beta + \epsilon$:
regress $Y_1, Y_2, Y_3, Y_4$ on $X_1, X_2, X_3$; then we acquire $E = n\hat\Sigma$.
1. $~H_0: \begin{bmatrix} \beta_{31}, \beta_{32}, \beta_{33}, \beta_{34} \end{bmatrix} = 0~$ (drop $X_3$),
2. $~H_0: \begin{bmatrix} \beta_{21} & \beta_{22} & \beta_{23} & \beta_{24} \\ \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = 0~$ (drop $X_2$ and $X_3$),
under $H_0$ (the second hypothesis), $Y = Z_1\beta_{(1)} + \epsilon$,
$
Z_1 = \begin{bmatrix} 1 & X_{11} \\ \vdots & \vdots \\ 1 & X_{n1} \end{bmatrix}_{n \times 2}, \quad
\beta_{(1)} = \begin{bmatrix} \beta_{01} & \cdots & \beta_{0m} \\ \beta_{11} & \cdots & \beta_{1m} \end{bmatrix}_{2 \times m}
$
now, regress $Y_1, Y_2, Y_3, Y_4$ on $X_1$ alone ($X_2$, $X_3$ excluded); then we acquire $E_1 = n\hat\Sigma_1$ and $H = n\hat\Sigma_1 - n\hat\Sigma = E_1 - E$.
let's calculate the eigenvalues of $HE^{-1}$ and compute Wilks' Lambda $\Lambda^* = \frac{|E|}{|E + H|}$.
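The example's recipe (fit the full and reduced models, form $E$, $E_1$, $H = E_1 - E$, then $\Lambda^*$) might look like this on simulated data (the data-generating setup and names are my own):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 60, 4
X = rng.normal(size=(n, 3))                  # columns X1, X2, X3
Z = np.column_stack([np.ones(n), X])         # full design, n x 4
B = np.zeros((4, m))
B[0], B[1] = 1.0, 2.0                        # only the intercept and X1 matter here
Y = Z @ B + rng.normal(size=(n, m))

def error_matrix(Z, Y):
    """E = n * Sigma_hat = (Y - Z B_hat)'(Y - Z B_hat)."""
    resid = Y - Z @ np.linalg.solve(Z.T @ Z, Z.T @ Y)
    return resid.T @ resid

E = error_matrix(Z, Y)            # full model: X1, X2, X3
E1 = error_matrix(Z[:, :2], Y)    # reduced model under H0: intercept and X1 only
H = E1 - E
wilks = np.linalg.det(E) / np.linalg.det(E + H)
```

Since the reduced model can never fit better, $H = E_1 - E$ is positive semi-definite and $\Lambda^*$ lies in $(0, 1)$.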
5.5.3.0.1 Sampling Distribution of Wilks' Lambda
let $Z$ have full rank $r + 1$, and $(r+1) + m \le n$.
let $\epsilon$ be normally distributed.
under $H_0$, approximately for large $n$ (Bartlett),
$
-\left( n - r - 1 - \frac{1}{2}(m - r + q + 1) \right) \ln \Lambda^* \sim \chi^2_{m(r-q)}
$
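Bartlett's approximation turns $\Lambda^*$ into an approximate chi-square statistic. A small helper (my own; `scipy.stats.chi2` supplies the tail probability):

```python
import numpy as np
from scipy.stats import chi2

def wilks_chi2(wilks, n, m, r, q):
    """Bartlett's large-sample approximation for Wilks' Lambda.

    Returns the statistic -(n - r - 1 - (m - r + q + 1)/2) * ln(Lambda*)
    and its approximate p-value from the chi^2 distribution with m(r-q) df.
    """
    stat = -(n - r - 1 - 0.5 * (m - r + q + 1)) * np.log(wilks)
    return stat, chi2.sf(stat, df=m * (r - q))
```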
5.5.3.0.2 Prediction
$
\underset{n \times m}{\hat Y} = Z \, \underset{(r+1) \times m}{\hat\beta}
$
assume fixed values $\underset{(r+1) \times 1}{Z_0}$ of the predictor variables. then
$
\underset{m \times (r+1)}{\hat\beta'} Z_0 \sim N_m\!\left( \beta' Z_0, \; Z_0'(Z'Z)^{-1}Z_0 \, \Sigma \right).
$
- 100(1−α)% simultaneous confidence intervals for $E(Y_i) = Z_0'\beta_{(i)}$:
$
Z_0'\hat\beta_{(i)} \;\pm\; \sqrt{\frac{m(n-r-1)}{n-r-m} F_{m,\, n-r-m}(\alpha)} \sqrt{Z_0'(Z'Z)^{-1}Z_0 \left( \frac{n}{n-r-1}\hat\sigma_{ii} \right)}, \quad i = 1, \cdots, m
$
  - where $\beta_{(i)}$ is the $i$th column of $\beta$,
  - and $\hat\sigma_{ii}$ is the $i$th diagonal element of $\hat\Sigma$.
- 100(1−α)% simultaneous prediction intervals for the individual responses $Y_{0i} = Z_0'\beta_{(i)} + \epsilon_{0i}$:
$
Z_0'\hat\beta_{(i)} \;\pm\; \sqrt{\frac{m(n-r-1)}{n-r-m} F_{m,\, n-r-m}(\alpha)} \sqrt{\left( 1 + Z_0'(Z'Z)^{-1}Z_0 \right) \left( \frac{n}{n-r-1}\hat\sigma_{ii} \right)}, \quad i = 1, \cdots, m
$
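The two interval formulas differ only in the extra 1 under the second square root. A sketch implementing both (function name and the `individual` flag are my own; `scipy.stats.f` supplies the critical value):

```python
import numpy as np
from scipy.stats import f

def simultaneous_intervals(Z, Y, z0, alpha=0.05, individual=False):
    """100(1-alpha)% simultaneous intervals at z0 for each of the m responses.

    individual=False: intervals for the mean responses E(Y_i) = z0' beta_(i);
    individual=True:  prediction intervals for new observations Y_0i
                      (adds 1 to z0'(Z'Z)^{-1} z0).
    """
    n, m = Y.shape
    r = Z.shape[1] - 1                            # predictors; Z includes the intercept
    B_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    resid = Y - Z @ B_hat
    sigma_hat = np.diag(resid.T @ resid) / n      # diagonal of Sigma_hat (MLE)
    center = B_hat.T @ z0                         # z0' beta_hat_(i), i = 1..m
    crit = m * (n - r - 1) / (n - r - m) * f.ppf(1 - alpha, m, n - r - m)
    quad = z0 @ np.linalg.solve(Z.T @ Z, z0)      # z0' (Z'Z)^{-1} z0
    if individual:
        quad = 1.0 + quad
    half = np.sqrt(crit * quad * (n / (n - r - 1)) * sigma_hat)
    return center - half, center + half
```

As expected, the prediction intervals are strictly wider than the mean-response intervals at the same $Z_0$.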