5.5 Multivariate Multiple Regression (wk6)
5.5.1 Overview
Recall:
Univariate Linear Regression:
response variable \(Y\), \(r\) predictor variables \(Z_1 , \cdots, Z_r\).
- model:
$ \[\begin{alignat*}{3} Y_j &= \beta_0 + \beta_1 Z_{j1} + \cdots + \beta_r Z_{jr} + \epsilon_j , \; \; \; \; \; &E(\epsilon_j) = 0, \; Var(\epsilon_j) = \sigma^2 \\ \pmb Y_{n \times 1} &= \pmb Z_{n \times (r+1)} \pmb \beta_{(r+1) \times 1} + \pmb \epsilon_{n \times 1}, \; \; \; \; \; &E(\pmb \epsilon) = \pmb 0, \; Var(\pmb \epsilon) = \sigma^2 I \end{alignat*}\] $
- estimation:
$
\[\begin{alignat*}{3} \hat {\pmb \beta} &= (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y \\ \hat {\pmb \epsilon} &= (\pmb Y - \pmb Z \hat {\pmb \beta}) = \pmb Y - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y = (I - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ') \pmb Y \\ &= (I-H)\pmb Y \end{alignat*}\] $
- inference:
Let \(\pmb \epsilon \sim N_n (\pmb 0, \sigma^2 I)\). Then
$ \[\begin{alignat*}{3} \hat {\pmb \beta} &\sim N_{r+1} (\pmb \beta , \sigma^2(\pmb Z '\pmb Z )^{-1}) \\ \hat {\pmb \epsilon} ' \hat {\pmb \epsilon} &\sim \sigma^2 \chi^2_{n-r-1} \\ \\ E(\hat {\pmb \epsilon} ) &= \pmb 0 \\ Cov(\hat {\pmb \epsilon} ) &= \sigma^2 (I - \pmb Z (\pmb Z ' \pmb Z )^{-1} \pmb Z ') \\ E\left( \dfrac{\hat {\pmb \epsilon} ' \hat {\pmb \epsilon}}{n-r-1} \right) &= \sigma^2 \end{alignat*}\] $
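As a quick numerical check of the estimation and variance formulas above, here is a minimal NumPy sketch; the data are simulated and all names (`Z`, `beta_hat`, etc.) are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3                                                  # n observations, r predictors
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])    # n x (r+1) design matrix with intercept
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = Z @ beta_true + rng.normal(scale=0.3, size=n)

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ y)                  # (Z'Z)^{-1} Z'Y
resid = y - Z @ beta_hat                                      # (I - H) Y
sigma2_hat = resid @ resid / (n - r - 1)                      # unbiased estimate of sigma^2
```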
5.5.2 Multivariate Multiple Regression
Notation
Model
$
\pmb Y_{n \times m} = \pmb Z_{n \times (r+1)} \pmb \beta_{(r+1) \times m} + \pmb \epsilon_{n \times m}, \; \; \; \; \; E(\pmb \epsilon_{(i)}) = \pmb 0, \; Cov(\pmb \epsilon_{(i)}, \pmb \epsilon_{(k)}) = \sigma_{ik} I, \; \; \; i,k = 1, \cdots, m
$
- Cov of \(m\) responses:
$ \[\begin{alignat*}{3} &\Sigma = \begin{bmatrix} \sigma_{11} & & \sigma_{1m} \\ & \ddots & \\ \sigma_{m1}&& \sigma_{mm} \end{bmatrix}, \; \; \; \; \; &&Var(\pmb \epsilon_{(i)}) = \sigma_{ii} I,\; \; \; \; \; &&Cov(\pmb \epsilon_{(i)}, \pmb \epsilon_{(k)}) = \begin{bmatrix} \sigma_{ik} & & \pmb 0 \\ & \ddots & \\ \pmb 0 && \sigma_{ik} \end{bmatrix} \end{alignat*}\] $
- the meaning of the entries:
  - \(0\): observations from different trials are uncorrelated
  - \(\sigma_{ik}\): errors for different responses on the same trial are correlated
- \(i\)th response \(\pmb Y_{(i)}\):
$ \pmb Y_{(i)} = \pmb Z \pmb \beta_{(i)} + \pmb \epsilon_{(i)}, \; \; \; \; \; Cov(\pmb \epsilon_{(i)}) = \sigma_{ii}I , \; \; \; \; \; \hat{\pmb \beta}_{(i)} = (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y_{(i)}
$
##### Least Squares
- Collecting Univariate Least Squares Estimates (LSE)
$
\hat{\pmb \beta} = \begin{bmatrix} \hat{\pmb \beta}_{(1)} & \cdots & \hat{\pmb \beta}_{(m)} \end{bmatrix} = (\pmb Z ' \pmb Z )^{-1} \pmb Z ' \pmb Y
$
- Errors
$
\hat{\pmb \epsilon} = \pmb Y - \pmb Z \hat{\pmb \beta} = \begin{bmatrix} \hat{\pmb \epsilon}_{(1)} & \cdots & \hat{\pmb \epsilon}_{(m)} \end{bmatrix}
$
- Error Sum of Squares (SSE)
  - diagonal elements: the error SS for univariate least squares, \((\pmb Y_{(i)}-\pmb Z \pmb \beta_{(i)})' (\pmb Y_{(i)}-\pmb Z \pmb \beta_{(i)})\), is minimized.
  - the generalized variance \(\lvert (\pmb Y-\pmb Z \pmb \beta)' (\pmb Y-\pmb Z \pmb \beta) \rvert\) is also minimized.
- Properties
$ \[\begin{align*}
\hat{\pmb Y} &= \pmb Z \hat{\pmb \beta} = \pmb Z(\pmb Z'\pmb Z)^{-1} \pmb Z' \pmb Y = H\pmb Y \tag{1} \\
\hat{\pmb \epsilon} &= \pmb Y - \hat{\pmb Y} = (I-H)\pmb Y \tag{2} \\
\pmb Z' \hat{\pmb \epsilon} &= \pmb Z'(I-H) \pmb Y = [\pmb Z'-\pmb Z'\pmb Z(\pmb Z'\pmb Z)^{-1}\pmb Z'] \pmb Y = \pmb 0 \tag{3} \\
\hat{\pmb Y}' \hat{\pmb \epsilon} &= \hat{\pmb \beta}' \pmb Z'(I-H) \pmb Y = [\hat{\pmb \beta}' \pmb Z'- \hat{\pmb \beta}' \pmb Z'\pmb Z(\pmb Z'\pmb Z)^{-1}\pmb Z'] \pmb Y = \pmb 0 \tag{4}
\end{align*}\] $
  - by (3), residuals are orthogonal to the columns of \(\pmb Z\)
  - by (4), residuals are orthogonal to \(\hat{\pmb Y}\)
- Error Sum of Squares
$ \[\begin{align*}
\pmb Y' \pmb Y &= (\hat{\pmb Y} + \hat{\pmb \epsilon})' (\hat{\pmb Y} + \hat{\pmb \epsilon}) \\
&= \hat{\pmb Y}' \hat{\pmb Y} + \hat{\pmb \epsilon}' \hat{\pmb \epsilon} \; \; \; \; \; \text{(the cross terms vanish by (4))} \\
\\
\hat{\pmb \epsilon}' \hat{\pmb \epsilon} &= \pmb Y' \pmb Y - \hat{\pmb Y}' \hat{\pmb Y} \\
&= \pmb Y' \pmb Y - \hat{\pmb \beta}' \pmb Z' \pmb Z \hat{\pmb \beta}
\end{align*}\] $
- Results 1
$ \[\begin{alignat*}{2}
E(\hat{\pmb \beta}_{(i)}) &= \pmb \beta_{(i)} , \; \; \; \; \; && Cov(\hat{\pmb \beta}_{(i)}, \hat{\pmb \beta}_{(k)}) = \sigma_{ik} (\pmb Z'\pmb Z)^{-1} \\
E(\hat{\pmb \epsilon}) &= \pmb 0 , \; \; \; \; \; && E(\hat{\pmb \epsilon}' \hat{\pmb \epsilon}) = (n-r-1)\Sigma
\end{alignat*}\] $
  - here, \(\hat {\pmb \epsilon}\) and \(\hat {\pmb \beta}\) are uncorrelated.
- Results 2
  - If \(\pmb \epsilon_j \sim N_m (\pmb 0 , \Sigma)\), then \(\hat {\pmb \beta}= (\pmb Z ' \pmb Z )^{-1}\pmb Z ' \pmb Y\) is the MLE of \(\pmb \beta\), and
$ \[\begin{align*}
\hat{\pmb \beta}_{(i)} &\sim N_{r+1} (\pmb \beta_{(i)}, \sigma_{ii} (\pmb Z ' \pmb Z )^{-1}) \\
\hat \Sigma &= \dfrac{1}{n} \hat{\pmb \epsilon}' \hat{\pmb \epsilon} = \dfrac{1}{n} (\pmb Y - \pmb Z \hat{\pmb \beta})' (\pmb Y - \pmb Z \hat{\pmb \beta}) \tag{5}
\end{align*}\] $
  - (5) is the MLE of \(\Sigma\)
  - \(n \hat \Sigma \sim W_{m,\, n-r-1} (\Sigma)\).
- Comment (a numerical sketch follows below)
  - Multivariate regression requires no new computational problems.
  - Univariate least squares estimates \(\hat {\pmb \beta}_{(i)}\) are computed individually for each response variable.
  - Diagnostic checks must be done as in univariate regression.
  - Residual vectors \([ \hat \epsilon_{j1}, \cdots, \hat \epsilon_{jm} ]\) can be examined for multivariate normality.
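To make the comment above concrete, here is a minimal simulated sketch (all variable names are illustrative, not from the notes): the multivariate LSE is just the univariate LSE applied column by column, the residuals satisfy properties (3) and (4), and \(\hat \Sigma = \hat{\pmb \epsilon}' \hat{\pmb \epsilon} / n\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, r, m = 60, 3, 4                                            # n trials, r predictors, m responses
Z = np.column_stack([np.ones(n), rng.normal(size=(n, r))])    # n x (r+1) design matrix
B = rng.normal(size=(r + 1, m))                               # "true" beta, (r+1) x m
Y = Z @ B + rng.normal(scale=0.3, size=(n, m))                # uncorrelated errors, for simplicity

# multivariate LSE: the same (Z'Z)^{-1}Z' applied to every response column
B_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)                     # (r+1) x m
E_hat = Y - Z @ B_hat                                         # residual matrix, n x m

# column-by-column univariate fits give the same estimates
B_cols = np.column_stack([np.linalg.solve(Z.T @ Z, Z.T @ Y[:, i]) for i in range(m)])
assert np.allclose(B_hat, B_cols)

# residuals are orthogonal to Z (property (3)) and to the fitted values (property (4))
assert np.allclose(Z.T @ E_hat, 0.0, atol=1e-8)
assert np.allclose((Z @ B_hat).T @ E_hat, 0.0, atol=1e-8)

Sigma_mle = E_hat.T @ E_hat / n                               # MLE of Sigma; divide by n-r-1 for the unbiased version
```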
### Hypothesis Testing
- Note:
$ \[\begin{align*}
&H_0: \text{the predictors } Z_{q+1}, Z_{q+2}, \cdots, Z_{r} \text{ do not influence the responses} \\
&H_0: \pmb \beta_{(2)} = \pmb 0 , \; \; \; \; \; \pmb \beta = \begin{bmatrix} \pmb \beta_{(1)\, ((q+1) \times m)} \\ \pmb \beta_{(2)\, ((r-q) \times m)} \end{bmatrix}
\end{align*}\] $
##### Full Model vs. Reduced Model
Let \(\pmb Z = \begin{bmatrix} \pmb Z_1 & \vdots & \pmb Z_2 \end{bmatrix}\), then \(\pmb Z \pmb \beta = \pmb Z_1 \pmb \beta_{(1)} + \pmb Z_2 \pmb \beta_{(2)}\).
Under \(H_0\), \(\pmb Y = \pmb Z_1 \pmb \beta_{(1)} + \pmb \epsilon\).
let
$ \[\begin{align*}
E &= n \hat \Sigma = (\pmb Y - \pmb Z \hat{\pmb \beta})'(\pmb Y - \pmb Z \hat{\pmb \beta}) \\
E_1 &= n \hat \Sigma_1 = (\pmb Y - \pmb Z_1 \hat{\pmb \beta}_{(1)})'(\pmb Y - \pmb Z_1 \hat{\pmb \beta}_{(1)}) \\
H &= n(\hat \Sigma_1 - \hat \Sigma) = E_1 - E
\end{align*}\] $
- \(E = n \hat \Sigma\): \(E\) is the error matrix, i.e. the matrix obtained by collecting the errors from the univariate fits, one per response (4 of them in the example below).
Let \(\lambda_1 \ge \cdots \ge \lambda_s\) be the non-zero eigenvalues of \(HE^{-1}\), where \(s=\min(m, r-q)\).
- Four Test Statistics (see the sketch after this list):
- Wilks' Lambda:
$ \Lambda^\ast = \dfrac{\vert E \vert}{\vert E + H \vert} = \prod_{i=1}^s \dfrac{1}{1+\lambda_i} $
- Pillai's Trace:
$ tr[H(H+E)^{-1}] = \sum_{i=1}^s \dfrac{\lambda_i}{1+\lambda_i} $
- Lawley-Hotelling Trace:
$ tr(HE^{-1}) = \sum_{i=1}^s \lambda_i $
- Roy's Largest Root:
- the maximum eigenvalue of \(H(H+E)^{-1}\), i.e. \(\dfrac{\lambda_1}{1+\lambda_1}\).
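Given \(H\) and \(E\), all four statistics are simple functions of the eigenvalues \(\lambda_i\) of \(HE^{-1}\). A minimal sketch (the helper `mv_test_stats` is a hypothetical name, not a library function):

```python
import numpy as np

def mv_test_stats(H, E):
    """The four test statistics, as functions of the non-zero eigenvalues of H E^{-1}."""
    # eigenvalues of E^{-1} H, which are the same as those of H E^{-1}
    lam = np.real(np.linalg.eigvals(np.linalg.solve(E, H)))
    lam = np.sort(lam[lam > 1e-10])[::-1]                  # keep the non-zero ones, largest first
    return {
        "Wilks"            : np.prod(1.0 / (1.0 + lam)),   # |E| / |E + H|
        "Pillai"           : np.sum(lam / (1.0 + lam)),    # tr[H (H + E)^{-1}]
        "Lawley-Hotelling" : np.sum(lam),                  # tr(H E^{-1})
        "Roy"              : lam[0] / (1.0 + lam[0]),      # largest eigenvalue of H (H + E)^{-1}
    }
```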
5.5.3 Example
Fit the full model (FM) \(\pmb Y = \pmb Z \pmb \beta + \pmb \epsilon\):
regress \(Y_1 , Y_2 , Y_3 , Y_4\) on \(X_1, X_2, X_3\); this gives \(E = n \hat \Sigma\).
Consider two hypotheses:
1. \(H_0: \begin{bmatrix} \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = \pmb 0\) (drop \(X_3\)),
2. \(H_0: \begin{bmatrix} \beta_{21} & \beta_{22} & \beta_{23} & \beta_{24} \\ \beta_{31} & \beta_{32} & \beta_{33} & \beta_{34} \end{bmatrix} = \pmb 0\) (drop \(X_2\) and \(X_3\)).
For the second hypothesis: under \(H_0\), \(\pmb Y = \pmb Z_1 \pmb \beta_{(1)} + \pmb \epsilon\), where
$
\pmb Z_1 = \begin{bmatrix} 1 & X_{11} \\ \vdots & \vdots \\ 1 & X_{n1} \end{bmatrix}_{n \times 2}, \; \; \; \; \;
\pmb \beta_{(1)} = \begin{bmatrix} \beta_{01} & \cdots & \beta_{0m} \\ \beta_{11} & \cdots & \beta_{1m} \end{bmatrix}_{2 \times m}
$
Now fit \(Y_1 , Y_2 , Y_3 , Y_4\) on \(X_1\) only (\(X_2, X_3\) excluded); this gives \(E_1 =n \hat \Sigma_1\) and \(H = n \hat \Sigma_1 - n \hat \Sigma = E_1 - E\).
Calculate the eigenvalues of \(HE^{-1}\), and compute Wilks' Lambda \(\Lambda^\ast = \dfrac{\vert E \vert }{\vert E+H\vert }\).
5.5.3.0.1 Sampling Distribution of Wilks' Lambda
Let \(\pmb Z\) have full rank \(r+1\), with \((r+1) + m \le n\),
and let \(\pmb \epsilon\) be normally distributed.
Under \(H_0\), for large \(n\), $ -\left( n - r - 1 - \dfrac{1}{2}(m - r + q + 1) \right) \ln \Lambda^\ast \; \sim \; \chi^2_{m(r-q)}$ approximately.
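Putting the example together: a sketch that fits the full and reduced models, forms \(E\) and \(E_1\), and applies the large-sample \(\chi^2\) approximation above. `wilks_test` and its arguments are hypothetical names, assuming the responses are the columns of `Y` (n x 4) and the predictors the columns of `X` (n x 3).

```python
import numpy as np
from scipy import stats

def wilks_test(Y, X, keep_cols):
    """H0: coefficients of the dropped predictors are 0 (Wilks' Lambda, Bartlett chi-square approx.)."""
    n, m = Y.shape
    r, q = X.shape[1], len(keep_cols)
    Z  = np.column_stack([np.ones(n), X])                  # full design,    n x (r+1)
    Z1 = np.column_stack([np.ones(n), X[:, keep_cols]])    # reduced design, n x (q+1)

    def error_matrix(D):
        B = np.linalg.solve(D.T @ D, D.T @ Y)
        R = Y - D @ B
        return R.T @ R                                     # n * Sigma_hat for that fit

    E, E1 = error_matrix(Z), error_matrix(Z1)              # E1 = E + H, since H = E1 - E
    wilks = np.linalg.det(E) / np.linalg.det(E1)
    bartlett = -(n - r - 1 - 0.5 * (m - r + q + 1)) * np.log(wilks)
    p_value = stats.chi2.sf(bartlett, df=m * (r - q))
    return wilks, bartlett, p_value

# hypothetical usage for the example: keep only X_1 (column 0), i.e. drop X_2 and X_3
# wilks, chi2_stat, p = wilks_test(Y, X, keep_cols=[0])
```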
5.5.3.0.2 Prediction
$
\hat{\pmb Y}_{n \times m} = \pmb Z \hat{\pmb \beta}_{(r+1) \times m}
$
Assume fixed values \(\pmb {Z_0}_{(r+1) \times 1}\) of the predictor variables. Then \(\hat {\pmb \beta}'_{m \times (r+1)} \pmb Z_0 \sim N_m(\pmb \beta ' \pmb Z_0 , \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \Sigma)\).
- \(100(1-\alpha)\%\) simultaneous CI for \(E(Y_i) = \pmb Z_0 ' \pmb \beta_{(i)}\):
$
\pmb Z_0 ' \hat{\pmb \beta}_{(i)} \pm \sqrt{\left( \dfrac{m(n-r-1)}{n-r-m} \right) F_{m,\, n-r-m}(\alpha)} \sqrt{\pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \left( \dfrac{n}{n-r-1} \hat \sigma_{ii} \right)} , \; \; \; \; \; i=1,\cdots, m
$
- where \(\pmb \beta_{(i)}\) is the \(i\)th column of \(\pmb \beta\).
- \(\hat \sigma_{ii}\) is the \(i\)th diagonal element of \(\hat \Sigma\).
- \(100(1-\alpha)\%\) simultaneous C.I. for the individual responses \(Y_{0i} = \pmb Z_0 ' \pmb \beta_{(i)} + \epsilon_{0i}\):
$
\pmb Z_0 ' \hat{\pmb \beta}_{(i)} \pm \sqrt{\left( \dfrac{m(n-r-1)}{n-r-m} \right) F_{m,\, n-r-m}(\alpha)} \sqrt{\left( 1 + \pmb Z_0 ' (\pmb Z ' \pmb Z)^{-1} \pmb Z_0 \right) \left( \dfrac{n}{n-r-1} \hat \sigma_{ii} \right)} , \; \; \; \; \; i=1,\cdots, m
$
- both intervals are sketched numerically below.
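A sketch of both sets of intervals under the same assumptions (the helper name and arguments are hypothetical; `z0` includes the leading 1 for the intercept):

```python
import numpy as np
from scipy import stats

def simultaneous_intervals(Z, Y, z0, alpha=0.05):
    """100(1-alpha)% simultaneous intervals at z0: for E(Y_i) and for a new observation Y_0i."""
    n, m = Y.shape
    r = Z.shape[1] - 1
    B_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)        # (r+1) x m
    resid = Y - Z @ B_hat
    Sigma_mle = resid.T @ resid / n                  # MLE of Sigma
    center = B_hat.T @ z0                            # m predicted means z0' beta_hat_(i)
    quad = z0 @ np.linalg.solve(Z.T @ Z, z0)         # z0'(Z'Z)^{-1} z0
    f_crit = stats.f.ppf(1 - alpha, m, n - r - m)    # F_{m, n-r-m}(alpha)
    scale = np.sqrt(m * (n - r - 1) / (n - r - m) * f_crit)
    sig = n / (n - r - 1) * np.diag(Sigma_mle)       # (n/(n-r-1)) * sigma_hat_ii
    ci_margin = scale * np.sqrt(quad * sig)          # margin for E(Y_i)
    pi_margin = scale * np.sqrt((1 + quad) * sig)    # margin for a new response Y_0i
    return center, ci_margin, pi_margin

# hypothetical usage with the design matrix Z and response matrix Y from the earlier sketch:
# z0 = np.array([1.0, 0.5, -0.2, 0.1])              # intercept followed by predictor values
# center, ci, pi = simultaneous_intervals(Z, Y, z0)
```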