5.5 Multivariate Multiple Regression (wk6)

5.5.1 Overview

Recall:

univariate Linear Regression:

response variable \(Y\), \(r\) predictor variables \(Z_1 , \cdots, Z_r\).

  • model:

\[\begin{alignat*}{3}
Y_j &= \beta_0 + \beta_1 Z_{j1} + \cdots + \beta_r Z_{jr} + \epsilon_j , \; \; \; \; \; &&E(\epsilon_j) = 0, \; Var(\epsilon_j) = \sigma^2 \\
\pmb Y_{n \times 1} &= \pmb Z_{n \times (r+1)} \pmb \beta_{(r+1) \times 1} + \pmb \epsilon_{n \times 1}, \; \; \; \; \; &&E(\pmb \epsilon) = \pmb 0, \; Var(\pmb \epsilon) = \sigma^2 I
\end{alignat*}\]

  • estimation:

\[\begin{alignat*}{3}
\hat {\pmb \beta} &= (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y \\
\hat {\pmb \epsilon} &= \pmb Y - \pmb Z \hat {\pmb \beta} = \pmb Y - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y = (I - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ') \pmb Y = (I-H)\pmb Y
\end{alignat*}\]

  • inference:

let \(\epsilon \sim N_n (\pmb 0, \sigma^2 I)\). then

\[\begin{alignat*}{3}
\hat {\pmb \beta} &\sim N_{r+1} (\pmb \beta , \sigma^2(\pmb Z '\pmb Z )^{-1}) \\
\hat {\pmb \epsilon} ' \hat {\pmb \epsilon} &\sim \sigma^2 \chi^2_{n-r-1} \\
\\
E(\hat {\pmb \epsilon} ) &= \pmb 0 \\
Cov(\hat {\pmb \epsilon} ) &= \sigma^2 (I - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ') \\
E\left( \dfrac{\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}}{n-r-1} \right) &= \sigma^2
\end{alignat*}\]
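As a concrete check of the formulas above, here is a minimal numpy sketch of univariate least squares. The sample size, the true `beta`, `sigma2`, and the simulated `Z` and `Y` are hypothetical values chosen for illustration, not data from the notes.

```python
# Minimal sketch of univariate LS: beta_hat = (Z'Z)^{-1} Z'Y and residuals via
# the hat matrix H. All data below are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3
sigma2 = 2.0                                      # true error variance (assumed)

Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])   # n x (r+1), with intercept
beta = np.array([1.0, 0.5, -0.3, 2.0])                       # true coefficients (assumed)
Y = Z @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)      # (Z'Z)^{-1} Z'Y, no explicit inverse
H = Z @ np.linalg.solve(Z.T @ Z, Z.T)             # hat matrix Z (Z'Z)^{-1} Z'
eps_hat = (np.eye(n) - H) @ Y                     # residuals (I - H) Y

sigma2_hat = eps_hat @ eps_hat / (n - r - 1)      # unbiased: E[eps'eps/(n-r-1)] = sigma^2
print(beta_hat, sigma2_hat)
```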


5.5.2 Multivariate Multiple Regression

  • Notation

  • Model

\[\begin{alignat*}{3}
\pmb Y_{n \times m} = \pmb Z_{n \times (r+1)} \pmb \beta_{(r+1) \times m} + \pmb \epsilon_{n \times m}, \; \; \; \; \; E(\pmb \epsilon_{(i)}) = \pmb 0, \; Cov(\pmb \epsilon_{(i)}, \pmb \epsilon_{(k)}) = \sigma_{ik} I, \; \; \; i,k = 1, \cdots, m
\end{alignat*}\]

    • Cov of \(m\) responses:

\[\begin{alignat*}{3}
&\Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1m} \\ \vdots & \ddots & \vdots \\ \sigma_{m1} & \cdots & \sigma_{mm} \end{bmatrix}, \; \; \; \; \; &&Var(\pmb \epsilon_{(i)}) = \sigma_{ii} I,\; \; \; \; \; &&Cov(\pmb \epsilon_{(i)}, \pmb \epsilon_{(k)}) = \begin{bmatrix} \sigma_{ik} & & \pmb 0 \\ & \ddots & \\ \pmb 0 && \sigma_{ik} \end{bmatrix} = \sigma_{ik} I
\end{alignat*}\]

  • the meaning of
    • \(0\): observations from different trials are uncorrelated
    • \(\sigma_{ik}\): errors for different responses on the same trial are correlated
  • \(i\)th response \(\pmb Y_{(i)}\):

\[\begin{alignat*}{3}
\pmb Y_{(i)} = \pmb Z \pmb \beta_{(i)} + \pmb \epsilon_{(i)}, \; \; \; \; \; Cov(\pmb \epsilon_{(i)}) = \sigma_{ii} I, \; \; \; \; \; \hat {\pmb \beta}_{(i)} = (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y_{(i)}
\end{alignat*}\]
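The covariance structure above is easy to see in a small simulation: each error row \(\pmb \epsilon_j\) is drawn independently as \(N_m(\pmb 0, \Sigma)\), so responses are correlated within a trial but trials are independent. The sizes, seed, and the particular `Sigma` below are illustrative assumptions, not values from the notes.

```python
# Simulate the multivariate regression model: errors correlated across the m
# responses within a trial (via Sigma), independent across the n trials.
import numpy as np

rng = np.random.default_rng(1)
n, r, m = 100, 2, 3

Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 2.0, 0.3],
                  [0.2, 0.3, 1.5]])                          # m x m error covariance
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])   # n x (r+1) design
beta = rng.normal(size=(r + 1, m))                           # (r+1) x m coefficients

eps = rng.multivariate_normal(np.zeros(m), Sigma, size=n)    # rows iid N_m(0, Sigma)
Y = Z @ beta + eps                                           # n x m response matrix

# sample covariance of the error rows should be close to Sigma: within a trial
# the m responses are correlated, while different trials are uncorrelated
print(np.cov(eps, rowvar=False))
```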

##### Least Squares
- Collecting the univariate least squares estimates (LSE) column by column (a numerical check follows this list):

\[
\hat {\pmb \beta} = \begin{bmatrix} \hat {\pmb \beta}_{(1)} & \cdots & \hat {\pmb \beta}_{(m)} \end{bmatrix} = (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y
\]

- Errors:

\[
\hat {\pmb \epsilon} = \pmb Y - \pmb Z \hat {\pmb \beta}
\]

- Error Sum of Squares (SSE) \(\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}\):
  - diagonal elements: each univariate error SS \((\pmb Y_{(i)}-\pmb Z \pmb \beta_{(i)})' (\pmb Y_{(i)}-\pmb Z \pmb \beta_{(i)})\) is minimized (at \(\hat {\pmb \beta}_{(i)}\)).
  - the generalized variance \(\lvert (\pmb Y-\pmb Z \pmb \beta)' (\pmb Y-\pmb Z \pmb \beta) \rvert\) is also minimized.
- Properties

\[\begin{align*}
\hat {\pmb Y} &= \pmb Z \hat {\pmb \beta} = \pmb Z (\pmb Z ' \pmb Z)^{-1} \pmb Z ' \pmb Y = H \pmb Y \tag{1} \\
\hat {\pmb \epsilon} &= \pmb Y - \hat {\pmb Y} = (I-H) \pmb Y \tag{2} \\
\pmb Z ' \hat {\pmb \epsilon} &= \pmb Z ' (I-H) \pmb Y = [\pmb Z ' - \pmb Z ' \pmb Z (\pmb Z ' \pmb Z)^{-1} \pmb Z '] \pmb Y = \pmb 0 \tag{3} \\
\hat {\pmb Y} ' \hat {\pmb \epsilon} &= \hat {\pmb \beta} ' \pmb Z ' (I-H) \pmb Y = [\hat {\pmb \beta} ' \pmb Z ' - \hat {\pmb \beta} ' \pmb Z ' \pmb Z (\pmb Z ' \pmb Z)^{-1} \pmb Z '] \pmb Y = \pmb 0 \tag{4}
\end{align*}\]

  - by (3), residuals are orthogonal to \(\pmb Z\)
  - by (4), residuals are orthogonal to \(\hat {\pmb Y}\)
- Error Sum of Squares

\[\begin{align*}
\pmb Y ' \pmb Y &= (\hat {\pmb Y} + \hat {\pmb \epsilon}) ' (\hat {\pmb Y} + \hat {\pmb \epsilon}) = \hat {\pmb Y} ' \hat {\pmb Y} + \hat {\pmb \epsilon} ' \hat {\pmb \epsilon} \\
\hat {\pmb \epsilon} ' \hat {\pmb \epsilon} &= \pmb Y ' \pmb Y - \hat {\pmb Y} ' \hat {\pmb Y} = \pmb Y ' \pmb Y - \hat {\pmb \beta} ' \pmb Z ' \pmb Z \hat {\pmb \beta}
\end{align*}\]

  - the cross terms vanish by (4).
- Result 1

\[\begin{alignat*}{2}
E(\hat {\pmb \beta}_{(i)}) &= \pmb \beta_{(i)}, \; \; \; \; \; &&Cov(\hat {\pmb \beta}_{(i)}, \hat {\pmb \beta}_{(k)}) = \sigma_{ik} (\pmb Z ' \pmb Z)^{-1} \\
E(\hat {\pmb \epsilon}) &= \pmb 0, \; \; \; \; \; &&E \left( \dfrac{\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}}{n-r-1} \right) = \Sigma
\end{alignat*}\]

  - here, \(\hat {\pmb \epsilon}\) and \(\hat {\pmb \beta}\) are uncorrelated.
- Result 2
  - If \(\pmb \epsilon_j \sim N_m (\pmb 0 , \Sigma)\), then \(\hat {\pmb \beta}= (\pmb Z ' \pmb Z )^{-1}\pmb Z ' \pmb Y\) is the MLE of \(\pmb \beta\), and

\[\begin{align*}
\hat {\pmb \beta}_{(i)} &\sim N_{r+1} (\pmb \beta_{(i)}, \sigma_{ii} (\pmb Z ' \pmb Z )^{-1}) \\
\hat \Sigma &= \dfrac{1}{n} \hat {\pmb \epsilon} ' \hat {\pmb \epsilon} = \dfrac{1}{n} (\pmb Y - \pmb Z \hat {\pmb \beta}) ' (\pmb Y - \pmb Z \hat {\pmb \beta}) \tag{5}
\end{align*}\]

  - (5) is the MLE of \(\Sigma\)
  - \(n \hat \Sigma \sim W_{m,n-r-1} (\Sigma)\)
- Comment
  - Multivariate regression requires no new computational machinery: the univariate least squares estimates \(\hat {\pmb \beta}_{(i)}\) are computed individually for each response variable.
  - Diagnostic checks must be done as in univariate regression.
  - Residual vectors \([ \hat \epsilon_{j1}, \cdots, \hat \epsilon_{jm} ]\) can be examined for multivariate normality.
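As referenced in the list above, here is a numerical check that the multivariate LSE is just the univariate LSEs stacked column by column, and that the residuals satisfy (3). The simulated data are hypothetical stand-ins.

```python
# Check: multivariate beta_hat equals column-wise univariate fits, residuals
# are orthogonal to Z, and Sigma_hat = eps_hat' eps_hat / n. Simulated data.
import numpy as np

rng = np.random.default_rng(2)
n, r, m = 100, 2, 3
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])       # n x (r+1)
Y = Z @ rng.normal(size=(r + 1, m)) + rng.normal(size=(n, m))    # n x m

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)     # (r+1) x m, all responses at once
for i in range(m):                               # column-by-column univariate LSEs agree
    assert np.allclose(np.linalg.solve(Z.T @ Z, Z.T @ Y[:, i]), beta_hat[:, i])

eps_hat = Y - Z @ beta_hat                       # n x m residual matrix
assert np.allclose(Z.T @ eps_hat, 0.0)           # residuals orthogonal to Z, property (3)

Sigma_hat = eps_hat.T @ eps_hat / n              # MLE of Sigma; E = n * Sigma_hat
print(Sigma_hat)
```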
### Hypothesis Testing
- Note: testing whether the predictors \(Z_{q+1}, Z_{q+2}, \cdots, Z_{r}\) influence the responses amounts to

\[\begin{align*}
&H_0: \pmb \beta_{(2)} = \pmb 0, \; \; \; \; \; \pmb \beta = \begin{bmatrix} \pmb \beta_{(1)} \\ \pmb \beta_{(2)} \end{bmatrix}, \; \; \; \; \; \pmb \beta_{(1)}: (q+1) \times m, \; \pmb \beta_{(2)}: (r-q) \times m
\end{align*}\]
##### Full Model vs. Reduced Model
let \(\pmb Z = \begin{bmatrix} \pmb Z_1 & \vdots & \pmb Z_2 \end{bmatrix}\), then \(\pmb Z \pmb \beta = \pmb Z_1 \pmb \beta_{(1)} + \pmb Z_2 \pmb \beta_{(2)}\).
under \(H_0\), \(\pmb Y = \pmb Z_1 \pmb \beta_{(1)} + \pmb \epsilon\),
let
\[\begin{align*}
E &= n \hat \Sigma = (\pmb Y - \pmb Z \hat {\pmb \beta}) ' (\pmb Y - \pmb Z \hat {\pmb \beta}) \\
E_1 &= n \hat \Sigma_1 = (\pmb Y - \pmb Z_1 \hat {\pmb \beta}_{(1)}) ' (\pmb Y - \pmb Z_1 \hat {\pmb \beta}_{(1)}) \\
H &= n(\hat \Sigma_1 - \hat \Sigma) = E_1 - E
\end{align*}\]
- \(E = n \hat \Sigma\): here \(E\) is the error matrix, i.e. the matrix collecting the errors obtained by repeating the univariate regression once per response (four times in the example below).
let \(\lambda_1 \ge \cdots \ge \lambda_s\) be the non-zero eigenvalues of \(HE^{-1}\), where \(s=\min(m, r-q)\).
  • Four Test Statistics (all four are computed in the sketch after this list):
  1. Wilks’ Lambda:

\[
\Lambda^\ast = \dfrac{\vert E \vert}{\vert E+H \vert} = \prod_{i=1}^s \dfrac{1}{1+\lambda_i}
\]

  2. Pillai’s Trace:

\[
tr[H(H+E)^{-1}] = \sum_{i=1}^s \dfrac{\lambda_i}{1+\lambda_i}
\]

  3. Lawley-Hotelling’s Trace:

\[
tr(HE^{-1}) = \sum_{i=1}^s \lambda_i
\]

  4. Roy’s Largest Root:
    • the maximum eigenvalue of \(H(H+E)^{-1}\), which equals \(\dfrac{\lambda_1}{1+\lambda_1}\).
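All four statistics are functions of the eigenvalues \(\lambda_i\) of \(HE^{-1}\), so they can be computed together. A sketch, assuming `H` and `E` are the SSCP matrices defined above (the function name is mine):

```python
# Compute the four MANOVA test statistics from the eigenvalues of H E^{-1}.
import numpy as np

def manova_stats(H, E):
    # eigenvalues of E^{-1} H; these equal the eigenvalues of H E^{-1}
    lam = np.sort(np.linalg.eigvals(np.linalg.solve(E, H)).real)[::-1]
    # zero eigenvalues contribute factors of 1 / terms of 0, so keeping them is harmless
    wilks = np.prod(1.0 / (1.0 + lam))        # |E| / |E + H|
    pillai = np.sum(lam / (1.0 + lam))        # tr[H (H+E)^{-1}]
    lawley_hotelling = np.sum(lam)            # tr(H E^{-1})
    roy = lam[0] / (1.0 + lam[0])             # largest eigenvalue of H (H+E)^{-1}
    return wilks, pillai, lawley_hotelling, roy
```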

5.5.3 Example

fit the full model (FM) \(\pmb Y = \pmb Z \pmb \beta + \pmb \epsilon\): regress \(Y_1 , Y_2 , Y_3 , Y_4\) on \(X_1, X_2, X_3\); this yields \(E = n \hat \Sigma\).


1. \(H_0: \begin{bmatrix} \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = 0\) (\(X_3\) does not affect any response),
2. \(H_0: \begin{bmatrix} \beta_{21} & \beta_{22} & \beta_{23} & \beta_{24} \\ \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = 0\) (\(X_2\) and \(X_3\) do not affect any response).

under \(H_0\) in (2), \(\pmb Y = \pmb Z_1 \pmb \beta_{(1)} + \pmb \epsilon\), where

\[\begin{alignat*}{3}
\pmb Z_1 = \begin{bmatrix} 1 & X_{11} \\ \vdots & \vdots \\ 1 & X_{n1} \end{bmatrix}_{n \times 2}, \; \; \; \; \; \pmb \beta_{(1)} = \begin{bmatrix} \beta_{01} & \cdots & \beta_{0m} \\ \beta_{11} & \cdots & \beta_{1m} \end{bmatrix}_{2 \times m}
\end{alignat*}\]

now, fit the reduced model \(Y_1 , Y_2 , Y_3 , Y_4 = X_1\) (\(X_2, X_3\) excluded); this yields \(E_1 = n \hat \Sigma_1\) and \(H = n \hat \Sigma_1 - n \hat \Sigma = E_1 - E\).

let’s calculate the eigenvalues of \(HE^{-1}\) and compute Wilks’ Lambda \(\Lambda^\ast = \dfrac{\vert E \vert }{\vert E+H\vert }\), as in the sketch below.
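A sketch of the full-vs-reduced computation just described. The notes do not include the dataset, so the `X` and `Y` below are simulated stand-ins with the example's shape (four responses, three predictors):

```python
# Full model (intercept, X1, X2, X3) vs. reduced model (intercept, X1):
# E = n Sigma_hat, E1 = n Sigma_hat_1, H = E1 - E, Wilks' Lambda = |E|/|E+H|.
import numpy as np

rng = np.random.default_rng(3)
n = 60
X = rng.normal(size=(n, 3))                                   # stand-in predictors
Y = np.column_stack([X @ rng.normal(size=3) + rng.normal(size=n) for _ in range(4)])

def error_sscp(Y, Zmat):
    """(Y - Z beta_hat)' (Y - Z beta_hat) = n * Sigma_hat for that model."""
    beta_hat = np.linalg.solve(Zmat.T @ Zmat, Zmat.T @ Y)
    resid = Y - Zmat @ beta_hat
    return resid.T @ resid

Z  = np.column_stack([np.ones(n), X])            # full model design
Z1 = np.column_stack([np.ones(n), X[:, 0]])      # reduced model design (X2, X3 dropped)

E, E1 = error_sscp(Y, Z), error_sscp(Y, Z1)
H = E1 - E
wilks = np.linalg.det(E) / np.linalg.det(E + H)  # equivalently |E| / |E1|
print(wilks)
```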


5.5.3.0.1 Sampling Distribution of Wilks’ Lambda

let \(\pmb Z\) have full rank \(r+1\), with \((r+1) + m \le n\).

let \(\epsilon\) be normally distributed.

under \(H_0\), for large \(n\),

\[
-\left( n - r - 1 - \dfrac{1}{2}(m - r + q + 1) \right) \ln \Lambda^\ast \; \approx \; \chi^2_{m(r-q)}
\]
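This approximation is a one-liner in code; a sketch, wrapped as a hypothetical helper, with the example's \(r=3\), \(q=1\), \(m=4\) shown in the usage comment:

```python
# Bartlett's chi-square approximation to the null distribution of Wilks' Lambda.
import numpy as np
from scipy.stats import chi2

def bartlett_wilks_test(wilks, n, r, q, m):
    """Approximate test of H0: beta_(2) = 0 via -c * ln(Lambda*) ~ chi2_{m(r-q)}."""
    stat = -(n - r - 1 - 0.5 * (m - r + q + 1)) * np.log(wilks)
    df = m * (r - q)
    return stat, df, chi2.sf(stat, df)          # statistic, d.f., p-value

# usage with the example's shapes (hypothetical numbers):
# stat, df, p = bartlett_wilks_test(wilks, n=60, r=3, q=1, m=4)
```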


5.5.3.0.2 Prediction

\[
\hat {\pmb Y}_{n \times m} = \pmb Z \hat {\pmb \beta}_{(r+1) \times m}
\]

assume fixed values \(\pmb {Z_0}_{(r+1) \times 1}\) of the predictor variables. then \(\hat {\pmb \beta}'_{m \times (r+1)} \pmb Z_0 \sim N_m(\pmb \beta ' \pmb Z_0 , \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \Sigma)\).




  • \(100(1-\alpha)\%\) simultaneous CI for \(E(Y_i) = \pmb Z_0 ' \pmb \beta_{(i)}\):

\[
\pmb Z_0 ' \hat {\pmb \beta}_{(i)} \pm \sqrt{ \dfrac{m(n-r-1)}{n-r-m} F_{m, n-r-m}(\alpha) } \sqrt{ \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \left( \dfrac{n}{n-r-1} \hat \sigma_{ii} \right) }, \; \; \; \; \; i=1,\cdots, m
\]

    • where \(\pmb \beta_{(i)}\) is the \(i\)th column of \(\pmb \beta\).
    • \(\hat \sigma_{ii}\) is the \(i\)th diagonal element of \(\hat \Sigma\).




  • \(100(1-\alpha)\%\) simultaneous C.I. for the individual responses \(Y_{0i} = \pmb Z_0 ' \pmb \beta_{(i)} + \epsilon_{0i}\):

\[
\pmb Z_0 ' \hat {\pmb \beta}_{(i)} \pm \sqrt{ \dfrac{m(n-r-1)}{n-r-m} F_{m, n-r-m}(\alpha) } \sqrt{ \left( 1 + \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \right) \left( \dfrac{n}{n-r-1} \hat \sigma_{ii} \right) }, \; \; \; \; \; i=1,\cdots, m
\]
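The two intervals share the same \(F\)-based critical factor and differ only by the extra \(1\) under the second square root (the prediction interval adds the new observation's own error). A sketch, wrapped as a hypothetical helper around the two formulas above; the inputs `Z`, `Y`, `z0` are assumed to come from a fitted model like the example's:

```python
# Simultaneous CIs for E(Y_i) = Z0' beta_(i) and prediction intervals for new
# responses Y_0i, at a fixed predictor vector z0 (intercept included).
import numpy as np
from scipy.stats import f

def simultaneous_intervals(Z, Y, z0, alpha=0.05):
    n, m = Y.shape
    r = Z.shape[1] - 1                                   # predictors excluding intercept
    beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    Sigma_hat = (Y - Z @ beta_hat).T @ (Y - Z @ beta_hat) / n

    center = beta_hat.T @ z0                             # Z0' beta_hat_(i), i = 1..m
    quad = z0 @ np.linalg.solve(Z.T @ Z, z0)             # Z0' (Z'Z)^{-1} Z0
    crit = m * (n - r - 1) / (n - r - m) * f.ppf(1 - alpha, m, n - r - m)
    scale = n / (n - r - 1) * np.diag(Sigma_hat)         # (n/(n-r-1)) sigma_hat_ii

    half_mean = np.sqrt(crit * quad * scale)             # half-width, CI for E(Y_i)
    half_pred = np.sqrt(crit * (1 + quad) * scale)       # half-width, PI for Y_0i
    return center, half_mean, half_pred
```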