5.5 Multivariate Multiple Regression (wk6)

5.5.1 Overview

Recall:

univariate Linear Regression:

response variable \(Y\), \(r\) predictor variables \(Z_1 , \cdots, Z_r\).

  • model:

\[\begin{alignat*}{3}
Y_j &= \beta_0 + \beta_1 Z_{j1} + \cdots + \beta_r Z_{jr} + \epsilon_j , \; \; \; \; \; &&E(\epsilon_j) = 0, \; Var(\epsilon_j) = \sigma^2 \\
\pmb Y_{n \times 1} &= \pmb Z_{n \times (r+1)} \pmb \beta_{(r+1) \times 1} + \pmb \epsilon_{n \times 1}, \; \; \; \; \; &&E(\pmb \epsilon) = \pmb 0, \; Var(\pmb \epsilon) = \sigma^2 I
\end{alignat*}\]

  • estimation:

\[\begin{alignat*}{3}
\hat {\pmb \beta} &= (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y \\
\hat {\pmb \epsilon} &= \pmb Y - \pmb Z \hat {\pmb \beta} = \pmb Y - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y = (I - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ') \pmb Y = (I-H)\pmb Y
\end{alignat*}\]

  • inference:

let \(\epsilon \sim N_n (\pmb 0, \sigma^2 I)\). then

\[\begin{alignat*}{3}
\hat {\pmb \beta} &\sim N_{r+1} (\pmb \beta , \sigma^2(\pmb Z '\pmb Z )^{-1}) \\
\hat {\pmb \epsilon} ' \hat {\pmb \epsilon} &\sim \sigma^2 \chi^2_{n-r-1} \\
\\
E(\hat {\pmb \epsilon} ) &= \pmb 0 \\
Cov(\hat {\pmb \epsilon} ) &= \sigma^2 (I - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ') \\
E\left( \dfrac{\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}}{n-r-1} \right) &= \sigma^2
\end{alignat*}\]
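As a concrete check of the formulas above, here is a minimal numpy sketch of univariate least squares. The sample size, the true `beta`, `sigma2`, and the simulated `Z` and `Y` are hypothetical values chosen for illustration, not data from the notes.

```python
# Minimal sketch of univariate LS: beta_hat = (Z'Z)^{-1} Z'Y and residuals via
# the hat matrix H. All data below are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3
sigma2 = 2.0                                      # true error variance (assumed)

Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])   # n x (r+1), with intercept
beta = np.array([1.0, 0.5, -0.3, 2.0])                       # true coefficients (assumed)
Y = Z @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)      # (Z'Z)^{-1} Z'Y, no explicit inverse
H = Z @ np.linalg.solve(Z.T @ Z, Z.T)             # hat matrix Z (Z'Z)^{-1} Z'
eps_hat = (np.eye(n) - H) @ Y                     # residuals (I - H) Y

sigma2_hat = eps_hat @ eps_hat / (n - r - 1)      # unbiased: E[eps'eps/(n-r-1)] = sigma^2
print(beta_hat, sigma2_hat)
```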


5.5.2 Multivariate Multiple Regression

  • Notation

  • Model

\[\begin{alignat*}{3}
\pmb Y_{n \times m} = \pmb Z_{n \times (r+1)} \pmb \beta_{(r+1) \times m} + \pmb \epsilon_{n \times m}, \; \; \; \; \; E(\pmb \epsilon_{(i)}) = \pmb 0, \; Cov(\pmb \epsilon_{(i)}, \pmb \epsilon_{(k)}) = \sigma_{ik} I, \; \; \; i,k = 1, \cdots, m
\end{alignat*}\]

    • Cov of \(m\) responses:

\[\begin{alignat*}{3}
&\Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1m} \\ \vdots & \ddots & \vdots \\ \sigma_{m1} & \cdots & \sigma_{mm} \end{bmatrix}, \; \; \; \; \; &&Var(\pmb \epsilon_{(i)}) = \sigma_{ii} I,\; \; \; \; \; &&Cov(\pmb \epsilon_{(i)}, \pmb \epsilon_{(k)}) = \begin{bmatrix} \sigma_{ik} & & \pmb 0 \\ & \ddots & \\ \pmb 0 && \sigma_{ik} \end{bmatrix} = \sigma_{ik} I
\end{alignat*}\]

  • the meaning of
    • \(0\): observations from different trials are uncorrelated
    • \(\sigma_{ik}\): errors for different responses on the same trial are correlated
  • \(i\)th response \(\pmb Y_{(i)}\):

\[\begin{alignat*}{3}
\pmb Y_{(i)} = \pmb Z \pmb \beta_{(i)} + \pmb \epsilon_{(i)}, \; \; \; \; \; Cov(\pmb \epsilon_{(i)}) = \sigma_{ii} I, \; \; \; \; \; \hat {\pmb \beta}_{(i)} = (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y_{(i)}
\end{alignat*}\]
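The covariance structure above is easy to see in a small simulation: each error row \(\pmb \epsilon_j\) is drawn independently as \(N_m(\pmb 0, \Sigma)\), so responses are correlated within a trial but trials are independent. The sizes, seed, and the particular `Sigma` below are illustrative assumptions, not values from the notes.

```python
# Simulate the multivariate regression model: errors correlated across the m
# responses within a trial (via Sigma), independent across the n trials.
import numpy as np

rng = np.random.default_rng(1)
n, r, m = 100, 2, 3

Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 2.0, 0.3],
                  [0.2, 0.3, 1.5]])                          # m x m error covariance
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])   # n x (r+1) design
beta = rng.normal(size=(r + 1, m))                           # (r+1) x m coefficients

eps = rng.multivariate_normal(np.zeros(m), Sigma, size=n)    # rows iid N_m(0, Sigma)
Y = Z @ beta + eps                                           # n x m response matrix

# sample covariance of the error rows should be close to Sigma: within a trial
# the m responses are correlated, while different trials are uncorrelated
print(np.cov(eps, rowvar=False))
```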

##### Least Squares
- Collecting the univariate least squares estimates (LSE) column by column (a numerical check follows this list):

\[
\hat {\pmb \beta} = \begin{bmatrix} \hat {\pmb \beta}_{(1)} & \cdots & \hat {\pmb \beta}_{(m)} \end{bmatrix} = (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y
\]

- Errors:

\[
\hat {\pmb \epsilon} = \pmb Y - \pmb Z \hat {\pmb \beta}
\]

- Error Sum of Squares (SSE) \(\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}\):
  - diagonal elements: each univariate error SS \((\pmb Y_{(i)}-\pmb Z \pmb \beta_{(i)})' (\pmb Y_{(i)}-\pmb Z \pmb \beta_{(i)})\) is minimized (at \(\hat {\pmb \beta}_{(i)}\)).
  - the generalized variance \(\lvert (\pmb Y-\pmb Z \pmb \beta)' (\pmb Y-\pmb Z \pmb \beta) \rvert\) is also minimized.
- Properties

\[\begin{align*}
\hat {\pmb Y} &= \pmb Z \hat {\pmb \beta} = \pmb Z (\pmb Z ' \pmb Z)^{-1} \pmb Z ' \pmb Y = H \pmb Y \tag{1} \\
\hat {\pmb \epsilon} &= \pmb Y - \hat {\pmb Y} = (I-H) \pmb Y \tag{2} \\
\pmb Z ' \hat {\pmb \epsilon} &= \pmb Z ' (I-H) \pmb Y = [\pmb Z ' - \pmb Z ' \pmb Z (\pmb Z ' \pmb Z)^{-1} \pmb Z '] \pmb Y = \pmb 0 \tag{3} \\
\hat {\pmb Y} ' \hat {\pmb \epsilon} &= \hat {\pmb \beta} ' \pmb Z ' (I-H) \pmb Y = [\hat {\pmb \beta} ' \pmb Z ' - \hat {\pmb \beta} ' \pmb Z ' \pmb Z (\pmb Z ' \pmb Z)^{-1} \pmb Z '] \pmb Y = \pmb 0 \tag{4}
\end{align*}\]

  - by (3), residuals are orthogonal to \(\pmb Z\)
  - by (4), residuals are orthogonal to \(\hat {\pmb Y}\)
- Error Sum of Squares

\[\begin{align*}
\pmb Y ' \pmb Y &= (\hat {\pmb Y} + \hat {\pmb \epsilon}) ' (\hat {\pmb Y} + \hat {\pmb \epsilon}) = \hat {\pmb Y} ' \hat {\pmb Y} + \hat {\pmb \epsilon} ' \hat {\pmb \epsilon} \\
\hat {\pmb \epsilon} ' \hat {\pmb \epsilon} &= \pmb Y ' \pmb Y - \hat {\pmb Y} ' \hat {\pmb Y} = \pmb Y ' \pmb Y - \hat {\pmb \beta} ' \pmb Z ' \pmb Z \hat {\pmb \beta}
\end{align*}\]

  - the cross terms vanish by (4).
- Result 1

\[\begin{alignat*}{2}
E(\hat {\pmb \beta}_{(i)}) &= \pmb \beta_{(i)}, \; \; \; \; \; &&Cov(\hat {\pmb \beta}_{(i)}, \hat {\pmb \beta}_{(k)}) = \sigma_{ik} (\pmb Z ' \pmb Z)^{-1} \\
E(\hat {\pmb \epsilon}) &= \pmb 0, \; \; \; \; \; &&E \left( \dfrac{\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}}{n-r-1} \right) = \Sigma
\end{alignat*}\]

  - here, \(\hat {\pmb \epsilon}\) and \(\hat {\pmb \beta}\) are uncorrelated.
- Result 2
  - If \(\pmb \epsilon_j \sim N_m (\pmb 0 , \Sigma)\), then \(\hat {\pmb \beta}= (\pmb Z ' \pmb Z )^{-1}\pmb Z ' \pmb Y\) is the MLE of \(\pmb \beta\), and

\[\begin{align*}
\hat {\pmb \beta}_{(i)} &\sim N_{r+1} (\pmb \beta_{(i)}, \sigma_{ii} (\pmb Z ' \pmb Z )^{-1}) \\
\hat \Sigma &= \dfrac{1}{n} \hat {\pmb \epsilon} ' \hat {\pmb \epsilon} = \dfrac{1}{n} (\pmb Y - \pmb Z \hat {\pmb \beta}) ' (\pmb Y - \pmb Z \hat {\pmb \beta}) \tag{5}
\end{align*}\]

  - (5) is the MLE of \(\Sigma\)
  - \(n \hat \Sigma \sim W_{m,n-r-1} (\Sigma)\)
- Comment
  - Multivariate regression requires no new computational machinery: the univariate least squares estimates \(\hat {\pmb \beta}_{(i)}\) are computed individually for each response variable.
  - Diagnostic checks must be done as in univariate regression.
  - Residual vectors \([ \hat \epsilon_{j1}, \cdots, \hat \epsilon_{jm} ]\) can be examined for multivariate normality.
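As referenced in the list above, here is a numerical check that the multivariate LSE is just the univariate LSEs stacked column by column, and that the residuals satisfy (3). The simulated data are hypothetical stand-ins.

```python
# Check: multivariate beta_hat equals column-wise univariate fits, residuals
# are orthogonal to Z, and Sigma_hat = eps_hat' eps_hat / n. Simulated data.
import numpy as np

rng = np.random.default_rng(2)
n, r, m = 100, 2, 3
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])       # n x (r+1)
Y = Z @ rng.normal(size=(r + 1, m)) + rng.normal(size=(n, m))    # n x m

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)     # (r+1) x m, all responses at once
for i in range(m):                               # column-by-column univariate LSEs agree
    assert np.allclose(np.linalg.solve(Z.T @ Z, Z.T @ Y[:, i]), beta_hat[:, i])

eps_hat = Y - Z @ beta_hat                       # n x m residual matrix
assert np.allclose(Z.T @ eps_hat, 0.0)           # residuals orthogonal to Z, property (3)

Sigma_hat = eps_hat.T @ eps_hat / n              # MLE of Sigma; E = n * Sigma_hat
print(Sigma_hat)
```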
### Hypothesis Testing
- Note: testing whether the predictors \(Z_{q+1}, Z_{q+2}, \cdots, Z_{r}\) influence the responses amounts to

\[\begin{align*}
&H_0: \pmb \beta_{(2)} = \pmb 0, \; \; \; \; \; \pmb \beta = \begin{bmatrix} \pmb \beta_{(1)} \\ \pmb \beta_{(2)} \end{bmatrix}, \; \; \; \; \; \pmb \beta_{(1)}: (q+1) \times m, \; \pmb \beta_{(2)}: (r-q) \times m
\end{align*}\]
##### Full Model vs. Reduced Model
let \(\pmb Z = \begin{bmatrix} \pmb Z_1 & \vdots & \pmb Z_2 \end{bmatrix}\), then \(\pmb Z \pmb \beta = \pmb Z_1 \pmb \beta_{(1)} + \pmb Z_2 \pmb \beta_{(2)}\).
under \(H_0\), \(\pmb Y = \pmb Z_1 \pmb \beta_{(1)} + \pmb \epsilon\),
let
\[\begin{align*}
E &= n \hat \Sigma = (\pmb Y - \pmb Z \hat {\pmb \beta}) ' (\pmb Y - \pmb Z \hat {\pmb \beta}) \\
E_1 &= n \hat \Sigma_1 = (\pmb Y - \pmb Z_1 \hat {\pmb \beta}_{(1)}) ' (\pmb Y - \pmb Z_1 \hat {\pmb \beta}_{(1)}) \\
H &= n(\hat \Sigma_1 - \hat \Sigma) = E_1 - E
\end{align*}\]
- \(E = n \hat \Sigma\): here \(E\) is the error matrix, i.e. the matrix collecting the errors obtained by repeating the univariate regression once per response (four times in the example below).
let \(\lambda_1 \ge \cdots \ge \lambda_s\) be the non-zero eigenvalues of \(HE^{-1}\), where \(s=\min(m, r-q)\).
  • Four Test Statistics (all four are computed in the sketch after this list):
  1. Wilks’ Lambda:

\[
\Lambda^\ast = \dfrac{\vert E \vert}{\vert E+H \vert} = \prod_{i=1}^s \dfrac{1}{1+\lambda_i}
\]

  2. Pillai’s Trace:

\[
tr[H(H+E)^{-1}] = \sum_{i=1}^s \dfrac{\lambda_i}{1+\lambda_i}
\]

  3. Lawley-Hotelling’s Trace:

\[
tr(HE^{-1}) = \sum_{i=1}^s \lambda_i
\]

  4. Roy’s Largest Root:
    • the maximum eigenvalue of \(H(H+E)^{-1}\), which equals \(\dfrac{\lambda_1}{1+\lambda_1}\).
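All four statistics are functions of the eigenvalues \(\lambda_i\) of \(HE^{-1}\), so they can be computed together. A sketch, assuming `H` and `E` are the SSCP matrices defined above (the function name is mine):

```python
# Compute the four MANOVA test statistics from the eigenvalues of H E^{-1}.
import numpy as np

def manova_stats(H, E):
    # eigenvalues of E^{-1} H; these equal the eigenvalues of H E^{-1}
    lam = np.sort(np.linalg.eigvals(np.linalg.solve(E, H)).real)[::-1]
    # zero eigenvalues contribute factors of 1 / terms of 0, so keeping them is harmless
    wilks = np.prod(1.0 / (1.0 + lam))        # |E| / |E + H|
    pillai = np.sum(lam / (1.0 + lam))        # tr[H (H+E)^{-1}]
    lawley_hotelling = np.sum(lam)            # tr(H E^{-1})
    roy = lam[0] / (1.0 + lam[0])             # largest eigenvalue of H (H+E)^{-1}
    return wilks, pillai, lawley_hotelling, roy
```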

5.5.3 Example

fit the full model (FM) \(\pmb Y = \pmb Z \pmb \beta + \pmb \epsilon\): regress \(Y_1 , Y_2 , Y_3 , Y_4\) on \(X_1, X_2, X_3\); this yields \(E = n \hat \Sigma\).


1. \(H_0: \begin{bmatrix} \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = 0\) (\(X_3\) does not affect any response),
2. \(H_0: \begin{bmatrix} \beta_{21} & \beta_{22} & \beta_{23} & \beta_{24} \\ \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = 0\) (\(X_2\) and \(X_3\) do not affect any response).

under \(H_0\) in (2), \(\pmb Y = \pmb Z_1 \pmb \beta_{(1)} + \pmb \epsilon\), where

\[\begin{alignat*}{3}
\pmb Z_1 = \begin{bmatrix} 1 & X_{11} \\ \vdots & \vdots \\ 1 & X_{n1} \end{bmatrix}_{n \times 2}, \; \; \; \; \; \pmb \beta_{(1)} = \begin{bmatrix} \beta_{01} & \cdots & \beta_{0m} \\ \beta_{11} & \cdots & \beta_{1m} \end{bmatrix}_{2 \times m}
\end{alignat*}\]

now, fit the reduced model \(Y_1 , Y_2 , Y_3 , Y_4 = X_1\) (\(X_2, X_3\) excluded); this yields \(E_1 = n \hat \Sigma_1\) and \(H = n \hat \Sigma_1 - n \hat \Sigma = E_1 - E\).

let’s calculate the eigenvalues of \(HE^{-1}\) and compute Wilks’ Lambda \(\Lambda^\ast = \dfrac{\vert E \vert }{\vert E+H\vert }\), as in the sketch below.
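A sketch of the full-vs-reduced computation just described. The notes do not include the dataset, so the `X` and `Y` below are simulated stand-ins with the example's shape (four responses, three predictors):

```python
# Full model (intercept, X1, X2, X3) vs. reduced model (intercept, X1):
# E = n Sigma_hat, E1 = n Sigma_hat_1, H = E1 - E, Wilks' Lambda = |E|/|E+H|.
import numpy as np

rng = np.random.default_rng(3)
n = 60
X = rng.normal(size=(n, 3))                                   # stand-in predictors
Y = np.column_stack([X @ rng.normal(size=3) + rng.normal(size=n) for _ in range(4)])

def error_sscp(Y, Zmat):
    """(Y - Z beta_hat)' (Y - Z beta_hat) = n * Sigma_hat for that model."""
    beta_hat = np.linalg.solve(Zmat.T @ Zmat, Zmat.T @ Y)
    resid = Y - Zmat @ beta_hat
    return resid.T @ resid

Z  = np.column_stack([np.ones(n), X])            # full model design
Z1 = np.column_stack([np.ones(n), X[:, 0]])      # reduced model design (X2, X3 dropped)

E, E1 = error_sscp(Y, Z), error_sscp(Y, Z1)
H = E1 - E
wilks = np.linalg.det(E) / np.linalg.det(E + H)  # equivalently |E| / |E1|
print(wilks)
```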


5.5.3.0.1 Sampling Distribution of Wilks’ Lambda

let \(\pmb Z\) have full rank \(r+1\), with \((r+1) + m \le n\).

let \(\epsilon\) be normally distributed.

under \(H_0\), for large \(n\),

\[
-\left( n - r - 1 - \dfrac{1}{2}(m - r + q + 1) \right) \ln \Lambda^\ast \; \approx \; \chi^2_{m(r-q)}
\]
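This approximation is a one-liner in code; a sketch, wrapped as a hypothetical helper, with the example's \(r=3\), \(q=1\), \(m=4\) shown in the usage comment:

```python
# Bartlett's chi-square approximation to the null distribution of Wilks' Lambda.
import numpy as np
from scipy.stats import chi2

def bartlett_wilks_test(wilks, n, r, q, m):
    """Approximate test of H0: beta_(2) = 0 via -c * ln(Lambda*) ~ chi2_{m(r-q)}."""
    stat = -(n - r - 1 - 0.5 * (m - r + q + 1)) * np.log(wilks)
    df = m * (r - q)
    return stat, df, chi2.sf(stat, df)          # statistic, d.f., p-value

# usage with the example's shapes (hypothetical numbers):
# stat, df, p = bartlett_wilks_test(wilks, n=60, r=3, q=1, m=4)
```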


5.5.3.0.2 Prediction

\[
\hat {\pmb Y}_{n \times m} = \pmb Z \hat {\pmb \beta}_{(r+1) \times m}
\]

assume fixed values \(\pmb {Z_0}_{(r+1) \times 1}\) of the predictor variables. then \(\hat {\pmb \beta}'_{m \times (r+1)} \pmb Z_0 \sim N_m(\pmb \beta ' \pmb Z_0 , \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \Sigma)\).




  • \(100(1-\alpha)\%\) simultaneous CI for \(E(Y_i) = \pmb Z_0 ' \pmb \beta_{(i)}\):

\[
\pmb Z_0 ' \hat {\pmb \beta}_{(i)} \pm \sqrt{ \dfrac{m(n-r-1)}{n-r-m} F_{m, n-r-m}(\alpha) } \sqrt{ \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \left( \dfrac{n}{n-r-1} \hat \sigma_{ii} \right) }, \; \; \; \; \; i=1,\cdots, m
\]

    • where \(\pmb \beta_{(i)}\) is the \(i\)th column of \(\pmb \beta\).
    • \(\hat \sigma_{ii}\) is the \(i\)th diagonal element of \(\hat \Sigma\).




  • \(100(1-\alpha)\%\) simultaneous C.I. for the individual responses \(Y_{0i} = \pmb Z_0 ' \pmb \beta_{(i)} + \epsilon_{0i}\):

\[
\pmb Z_0 ' \hat {\pmb \beta}_{(i)} \pm \sqrt{ \dfrac{m(n-r-1)}{n-r-m} F_{m, n-r-m}(\alpha) } \sqrt{ \left( 1 + \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \right) \left( \dfrac{n}{n-r-1} \hat \sigma_{ii} \right) }, \; \; \; \; \; i=1,\cdots, m
\]
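The two intervals share the same \(F\)-based critical factor and differ only by the extra \(1\) under the second square root (the prediction interval adds the new observation's own error). A sketch, wrapped as a hypothetical helper around the two formulas above; the inputs `Z`, `Y`, `z0` are assumed to come from a fitted model like the example's:

```python
# Simultaneous CIs for E(Y_i) = Z0' beta_(i) and prediction intervals for new
# responses Y_0i, at a fixed predictor vector z0 (intercept included).
import numpy as np
from scipy.stats import f

def simultaneous_intervals(Z, Y, z0, alpha=0.05):
    n, m = Y.shape
    r = Z.shape[1] - 1                                   # predictors excluding intercept
    beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    Sigma_hat = (Y - Z @ beta_hat).T @ (Y - Z @ beta_hat) / n

    center = beta_hat.T @ z0                             # Z0' beta_hat_(i), i = 1..m
    quad = z0 @ np.linalg.solve(Z.T @ Z, z0)             # Z0' (Z'Z)^{-1} Z0
    crit = m * (n - r - 1) / (n - r - m) * f.ppf(1 - alpha, m, n - r - m)
    scale = n / (n - r - 1) * np.diag(Sigma_hat)         # (n/(n-r-1)) sigma_hat_ii

    half_mean = np.sqrt(crit * quad * scale)             # half-width, CI for E(Y_i)
    half_pred = np.sqrt(crit * (1 + quad) * scale)       # half-width, PI for Y_0i
    return center, half_mean, half_pred
```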