5.3 Inference about Mean Vector (wk3)
5.3.1 Overview
Recall: univariate case \(x_1 , \cdots, x_n \overset {iid} {\sim} N(\mu, \sigma^2)\)
\(H_0 : \mu = \mu_0\)
$ \[\begin{alignat*}{2} \text{test stat } &t &&=\tfrac{\bar X - \mu_0}{\tfrac{S}{\sqrt{n}}} &\overset{H_0}{\sim} t_{n-1} \\ &t^2 &&=\tfrac{(\bar X - \mu_0)^2}{\tfrac{S^2}{n}} &\overset{H_0}{\sim} F_{1, \; n-1} \end{alignat*}\] $
Reject \(H_0\) if the inequality below holds, where \(F_{1,n-1}(\alpha)\) denotes the upper \(100\alpha\)th percentile of the \(F_{1,n-1}\) distribution.
$ \[\begin{alignat*}{1} \tfrac{(\bar X - \mu_0)^2}{\tfrac{S^2}{n}} = n(\bar X - \mu_0)\tfrac{1}{S^2}(\bar X - \mu_0) &> F_{1,n-1}(\alpha) \end{alignat*}\] $
therefore, with assumption \(\pmb X_1 , \cdots, \pmb X_n \overset {iid} {\sim} N_p (\pmb \mu , \Sigma)\),
$ \[H_0 : \pmb \mu = \pmb \mu_0\] $
$ \[\begin{alignat*}{3} \text{Hotelling's }T^2 \; \; T^2 &= n(\bar {\pmb X} - \pmb \mu_0)' S^{-1} (\bar {\pmb X} - \pmb \mu_0) \\ &\overset{H_0}{\sim} \tfrac{(n-1)p}{(n-p)} F_{p,n-p} \\ \iff \; \; \tfrac {(n-p)} {(n-1)p} T^2 &\overset{H_0}{\sim} F_{p,n-p} \end{alignat*}\] $
reject \(H_0\), if \(T^2 > \tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha)\).
assumption check: \(\pmb X_1 , \cdots, \pmb X_n \overset{iid}{\sim} N_p (\pmb \mu , \Sigma)\).
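The test above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names (`mean_vector`, `sample_cov`, `solve`, `hotelling_t2`, `reject_h0`) are made up for this sketch, \(S\) is assumed nonsingular, and the critical value \(F_{p,n-p}(\alpha)\) is assumed to be supplied by the caller (for example from a table).

```python
from statistics import fmean

def mean_vector(rows):
    """Column means of an n x p data set given as a list of rows."""
    return [fmean(col) for col in zip(*rows)]

def sample_cov(rows):
    """Unbiased sample covariance matrix S (divisor n - 1)."""
    n = len(rows)
    xbar = mean_vector(rows)
    p = len(xbar)
    return [[sum((r[i] - xbar[i]) * (r[j] - xbar[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes a is square and nonsingular."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (aug[i][m] - tail) / aug[i][i]
    return x

def hotelling_t2(rows, mu0):
    """T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0)."""
    n = len(rows)
    d = [xb - m for xb, m in zip(mean_vector(rows), mu0)]
    s_inv_d = solve(sample_cov(rows), d)   # S^{-1} d without forming S^{-1}
    return n * sum(di * wi for di, wi in zip(d, s_inv_d))

def reject_h0(t2, n, p, f_crit):
    """Compare T^2 with (n-1)p/(n-p) * F_{p,n-p}(alpha).
    f_crit is the F critical value, supplied by the caller (assumption)."""
    return t2 > (n - 1) * p / (n - p) * f_crit
```

Solving \(S \pmb w = \bar{\pmb X} - \pmb \mu_0\) instead of inverting \(S\) is a design choice of this sketch; it gives the same \(T^2\) with fewer operations.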
5.3.1.0.1 Remark
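The statistic \(T^2\) is invariant to the units of measurement. Proof:
Let \(\pmb Y_{p \times 1} = C_{p \times p} \pmb X_{p \times 1} + \pmb d_{p \times 1}\) with \(C\) nonsingular. Then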
$ \[\begin{align*} \bar {\pmb Y} &= C \bar {\pmb X} + \pmb d \\ \\ S_{\pmb y} &= CSC'\\ \\ \mu_y &= E(\pmb Y) \\ &=C \ast E(\pmb X) + \pmb d \\ &= C \pmb \mu_0 + \pmb d \end{align*}\] $
therefore,
$ \[\begin{align*} T^2 &= n(\bar {\pmb Y} - \pmb \mu_y)' S_y^{-1} (\bar {\pmb Y} - \pmb \mu_y) \\ &= n \left[ C(\bar {\pmb X} - \pmb \mu_0) \right]' (CSC')^{-1} \left[ C(\bar {\pmb X} - \pmb \mu_0) \right] \\ &= n (\bar {\pmb X} - \pmb \mu_0)' C' (C')^{-1} S^{-1}(C)^{-1} C(\bar {\pmb X} - \pmb \mu_0) \\ &= n (\bar {\pmb X} - \pmb \mu_0)' S^{-1}(\bar {\pmb X} - \pmb \mu_0) \end{align*}\] $
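Note: the cancellation above needs \(C^{-1}\) to exist. Nothing guarantees this for an arbitrary \(C\); it is an assumption of the remark. It is a natural one for a change of units, e.g. a diagonal \(C\) with nonzero scale factors.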
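The invariance can also be checked numerically. The sketch below is limited to the bivariate case so that \(S^{-1}\) can be written out explicitly; the function names, the data, and the particular \(C\) and \(\pmb d\) are all made up for illustration.

```python
def t2_2d(rows, mu0):
    """Hotelling's T^2 for bivariate data, using the explicit 2x2 inverse of S."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in rows) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows) / (n - 1)
    d1, d2 = m1 - mu0[0], m2 - mu0[1]
    det = s11 * s22 - s12 ** 2
    # d' S^{-1} d with S^{-1} = (1/det) [[s22, -s12], [-s12, s11]]
    return n * (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det

def affine(c, d, x):
    """Return C x + d for a 2x2 matrix C and 2-vectors d, x."""
    return (c[0][0] * x[0] + c[0][1] * x[1] + d[0],
            c[1][0] * x[0] + c[1][1] * x[1] + d[1])

rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]   # illustrative data
mu0 = (1.5, 1.5)
C = [[2.0, 1.0], [0.0, 3.0]]   # nonsingular: det(C) = 6
d = (10.0, -5.0)

t2_x = t2_2d(rows, mu0)
# Transform both the data and the hypothesized mean: Y = C X + d, mu_y = C mu0 + d
t2_y = t2_2d([affine(C, d, r) for r in rows], affine(C, d, mu0))
```

Up to floating-point rounding, `t2_x` and `t2_y` agree, as the proof predicts.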
5.3.2 1. Confidence Region
5.3.2.0.1 Confidence Region
A region \(R(\pmb X)\) is a \(100(1-\alpha)\%\) confidence region (CR) for \(\pmb \theta\) if the first line below holds. For \(\pmb \mu\), Hotelling's \(T^2\) gives the second line.
$ \[\begin{alignat*}{3} &P \left\{ R(\pmb X) \text{ will cover the true } \pmb \theta \right\} &&= 1-\alpha \\ &P \left\{ n (\bar {\pmb X} - \pmb \mu)' S^{-1}(\bar {\pmb X} - \pmb \mu) \le \tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha) \right\} &&= 1-\alpha \end{alignat*}\] $
the inequality \(n (\bar {\pmb X} - \pmb \mu)' S^{-1}(\bar {\pmb X} - \pmb \mu) \le \tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha)\) will define a region \(R(\pmb X)\).
The region is an ellipsoid centered at \(\bar {\pmb X}\).
Testing \(H_0 : \pmb \mu = \pmb \mu_0\) at level \(\alpha\) is equivalent to checking whether \(\pmb \mu_0\) falls within the \(100(1-\alpha)\%\) CR.
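- with eigenvalues \(\lambda_1 , \cdots, \lambda_p\) and eigenvectors \(\pmb e_1 , \cdots, \pmb e_p\) of \(S\),
- CR axes: \(\pm \sqrt{\lambda_i}\sqrt{\tfrac{(n-1)p}{n(n-p)} F_{p,n-p} (\alpha)} \; \pmb e_i\), \(i = 1, \cdots, p\)
- CR half-lengths: \(\sqrt{\lambda_i}\sqrt{\tfrac{(n-1)p}{n(n-p)} F_{p,n-p} (\alpha)}\)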
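For \(p = 2\) the eigenvalues and eigenvectors of \(S\) have a closed form, so the axes of the confidence ellipse can be sketched without a linear-algebra library. The function name `cr_axes_2d` is hypothetical, the half-length uses the factor \(\tfrac{(n-1)p}{n(n-p)}\) that follows from dividing the \(T^2\) inequality by \(n\), and the critical value \(F_{2,n-2}(\alpha)\) is assumed to be supplied by the caller.

```python
import math

def cr_axes_2d(s, n, f_crit):
    """Axes of the T^2 confidence ellipse for p = 2.

    s      : 2x2 sample covariance matrix (assumed positive definite)
    f_crit : F_{2, n-2}(alpha), supplied by the caller (assumption)
    Returns [(half_length, unit_direction), ...], major axis first.
    """
    p = 2
    a, b, c = s[0][0], s[0][1], s[1][1]
    # Eigenvalues of [[a, b], [b, c]]
    half_gap = math.sqrt((a - c) ** 2 / 4 + b ** 2)
    lams = [(a + c) / 2 + half_gap, (a + c) / 2 - half_gap]
    scale = math.sqrt((n - 1) * p / (n * (n - p)) * f_crit)
    if abs(b) > 1e-12:
        # (b, lam - a) solves (S - lam I) v = 0 when b != 0
        vecs = [(b, lam - a) for lam in lams]
    elif a >= c:
        vecs = [(1.0, 0.0), (0.0, 1.0)]
    else:
        vecs = [(0.0, 1.0), (1.0, 0.0)]
    axes = []
    for lam, v in zip(lams, vecs):
        norm = math.hypot(v[0], v[1])
        axes.append((math.sqrt(lam) * scale, (v[0] / norm, v[1] / norm)))
    return axes
```

The ellipse itself is centered at \(\bar{\pmb X}\); the function only returns the half-lengths and directions to add to that center.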
5.3.3 2. Simultaneous CI
Let \(\pmb X \sim N_p (\pmb \mu, \Sigma)\). Then the linear combination \(\pmb a' \pmb X\) is univariate normal: \(\pmb a' \pmb X \sim N (\pmb a' \pmb \mu, \pmb a' \Sigma \pmb a)\).
$ \[\begin{align*} t=\dfrac{\bar X - \mu} {S / \sqrt{n}} &\sim t_{n-1} \tag{recall: univariate}\\ t= \dfrac {\pmb a ' \bar {\pmb X} - \pmb a ' \pmb \mu} {\sqrt{\pmb a ' S \pmb a / n } } = \dfrac {\sqrt{n}(\pmb a ' \bar {\pmb X} - \pmb a ' \pmb \mu)} {\sqrt{\pmb a ' S \pmb a} } &\sim t_{n-1} \tag{MV} \end{align*}\] $
Therefore, a \(100(1-\alpha)\%\) CI for \(\pmb a ' \pmb \mu\) (here \(\pmb a\) is fixed) is \(\pmb a ' \bar {\pmb X} \pm t_{n-1} \left( \dfrac {\alpha} {2} \right) \dfrac{\sqrt{\pmb a ' S \pmb a} } {\sqrt{n}}\). This is not a simultaneous CI. For example, let \((a_1 , a_2)\) and \((b_1, b_2)\) be separate \(95\%\) CIs for \(\mu_1\) and \(\mu_2\). If the two intervals were independent, the probability that both cover their targets would be only \(95\% \times 95\% = 90.25\%\). Simultaneous coverage therefore needs wider intervals.
Let \(\pmb X_1 , \cdots, \pmb X_n \overset {iid} {\sim} N_p (\pmb \mu , \Sigma)\) be a random sample.
Then, simultaneously for all \(\pmb a\), the interval \(\pmb a ' \bar {\pmb X} \pm \sqrt{\dfrac{n-1}{n} \dfrac{p}{n-p} F_{p,n-p} (\alpha) \pmb a ' S \pmb a}\) will contain \(\pmb a ' \pmb \mu\) with probability \(1-\alpha\).
$ \[\begin{alignat*}{3} \because 1-\alpha &= P \left[ n (\bar {\pmb X } - \pmb \mu)' S^{-1} (\bar {\pmb X } - \pmb \mu) \le (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \right] \\ &= P \left[ \forall \pmb a : \; (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu)' (\pmb a' S \pmb a)^{-1} (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu) \le \dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \right] \\ &= P \left[ \forall \pmb a : \; (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu)' (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu) \le \dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a) \right] \tag{∵ Scalar} \\ &= P \left[ \forall \pmb a : \; (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu)^2 \le \dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a) \right] \\ &= P \left[ \forall \pmb a : \; - \sqrt{\dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a)} \le (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu) \le \sqrt{\dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a)} \right] \end{alignat*}\] $
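The step from the first to the second probability above is not an identity for one fixed \(\pmb a\). It rests on a maximization result, a consequence of the Cauchy-Schwarz inequality, sketched here under the assumption that \(S\) is positive definite and with \(c^2\) as shorthand for the critical constant:

$ \[\begin{align*} \max_{\pmb a \ne \pmb 0} \dfrac{\left[ \pmb a' (\bar {\pmb X} - \pmb \mu) \right]^2}{\pmb a' S \pmb a} &= (\bar {\pmb X} - \pmb \mu)' S^{-1} (\bar {\pmb X} - \pmb \mu), \quad \text{attained at } \pmb a \propto S^{-1} (\bar {\pmb X} - \pmb \mu) \\ \therefore \; T^2 \le c^2 &\iff \dfrac{n \left[ \pmb a' (\bar {\pmb X} - \pmb \mu) \right]^2}{\pmb a' S \pmb a} \le c^2 \; \; \text{for all } \pmb a \ne \pmb 0, \quad c^2 = (n-1) \dfrac{p}{n-p} F_{p,n-p} (\alpha) \end{align*}\] $

So the event \(T^2 \le c^2\) and the event that the interval covers \(\pmb a' \pmb \mu\) for every \(\pmb a\) are the same event, which is why a single probability statement serves all \(\pmb a\) at once.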
5.3.3.0.1 Simultaneous CI for \(\mu_i - \mu_k\)
Let \(\pmb a ' = [0,\cdots, 0, a_i, 0, \cdots, 0, a_k, 0, \cdots, 0]\) with \(a_i = 1\) and \(a_k = -1\). Then the following holds, where \(S =\begin{bmatrix} S_{11} & \cdots &S_{1p} \\ & \ddots & \\ S_{p1} & \cdots & S_{pp} \end{bmatrix}\).
$ \[\begin{align*} \pmb a ' \pmb \mu &= \mu_i - \mu_k \\ \pmb a ' S \pmb a &= S_{ii} -2 S_{ik} + S_{kk} \end{align*}\] $
Therefore, the simultaneous CI for \(\mu_i - \mu_k\) is \((\bar x_i - \bar x_k ) \pm \sqrt{\dfrac{n-1}{n} \dfrac{p}{n-p} F_{p, n-p}(\alpha) \left( S_{ii} -2 S_{ik} + S_{kk} \right)}\).
Similarly, if we let \(\pmb a ' = [1, 0, \cdots, 0]\), then
$ \[\begin{align*} \pmb a ' \pmb \mu &= \mu_1\\ \pmb a ' S \pmb a &= S_{11} \end{align*}\] $
Therefore, the simultaneous CI for \(\mu_1\) is \(\bar x_1 \pm \sqrt{\dfrac{n-1}{n} \dfrac{p}{n-p} F_{p, n-p}(\alpha)S_{11}}\).
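The general interval for \(\pmb a' \pmb \mu\) covers both special cases above by changing \(\pmb a\). A minimal sketch follows; the function name `simultaneous_ci` is hypothetical, and the critical value \(F_{p,n-p}(\alpha)\) is assumed to come from the caller.

```python
import math

def simultaneous_ci(a, xbar, s, n, f_crit):
    """T^2 simultaneous interval for a'mu.

    a, xbar : length-p lists (contrast vector and sample mean)
    s       : p x p sample covariance matrix
    f_crit  : F_{p, n-p}(alpha), supplied by the caller (assumption)
    """
    p = len(xbar)
    center = sum(ai * xi for ai, xi in zip(a, xbar))
    # Quadratic form a' S a
    a_s_a = sum(a[i] * s[i][j] * a[j] for i in range(p) for j in range(p))
    half = math.sqrt((n - 1) / n * p / (n - p) * f_crit * a_s_a)
    return center - half, center + half

# Illustrative numbers only; f_crit = 4.0 is a placeholder, not a table value.
xbar = [5.0, 3.0]
S = [[4.0, 1.0], [1.0, 9.0]]
diff_ci = simultaneous_ci([1.0, -1.0], xbar, S, n=10, f_crit=4.0)  # mu_1 - mu_2
mu1_ci = simultaneous_ci([1.0, 0.0], xbar, S, n=10, f_crit=4.0)    # mu_1
```

With \(\pmb a' = [1, -1]\) the quadratic form reduces to \(S_{11} - 2S_{12} + S_{22}\), and with \(\pmb a' = [1, 0]\) it reduces to \(S_{11}\), matching the two formulas above.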
5.3.4 3. Note: Bonferroni Multiple Comparison
When only the \(p\) component means are of interest, Bonferroni’s CI for \(\mu_1\), \(\bar x_1 \pm t_{n-1} \left( \dfrac{\alpha}{2p} \right) \sqrt{\dfrac{S_{11}}{n}}\), is typically more precise (narrower) than the \(T^2\) simultaneous CI.
5.3.5 4. Large Sample Inferences about a Mean Vector
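Recall the multivariate CLT:
Let \(\pmb X_1 , \cdots, \pmb X_n\) be a random sample from any population (not necessarily normal) with mean \(\pmb \mu\) and covariance \(\Sigma\), and let \(n-p\) be large. Then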
$ \[\begin{align*} \sqrt{n} (\bar {\pmb X} - \pmb \mu) &\overset {d}{\Longrightarrow} N_p (\pmb 0, \Sigma) \\ n (\bar {\pmb X} - \pmb \mu)' S^{-1}(\bar {\pmb X} - \pmb \mu) &\overset {d}{\Longrightarrow} \chi^2_p \end{align*}\] $
When the sample size is large, the MVN assumption is less critical. Therefore,
let \(\pmb X_1 , \cdots, \pmb X_n\) be a random sample from any population with mean \(\pmb \mu\) and covariance \(\Sigma\).
$ \[H_0: \pmb \mu = \pmb \mu_0\] $
When \(n-p\) is large, \(H_0\) is rejected if \(n (\bar {\pmb X} - \pmb \mu_0)' S^{-1}(\bar {\pmb X} - \pmb \mu_0) > \chi^2_p (\alpha)\).
Note: \((n-1) \dfrac{p}{n-p} F_{p,n-p} (\alpha) \simeq \chi_p^2(\alpha)\) for large \(n-p\).
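- CR:
$ \[P \left\{ n (\bar {\pmb X } - \pmb \mu)' S^{-1} (\bar {\pmb X } - \pmb \mu) \le \chi_p^2 (\alpha) \right\} \simeq 1- \alpha\] $
The inequality \(n (\bar {\pmb X } - \pmb \mu)' S^{-1} (\bar {\pmb X } - \pmb \mu) \le \chi_p^2 (\alpha)\) defines an approximate \(100(1-\alpha) \%\) confidence region for \(\pmb \mu\).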
- Simultaneous CI:
Let \(\pmb X_1 , \cdots, \pmb X_n\) be a random sample from any population with mean \(\pmb \mu\) and covariance \(\Sigma\), and let \(n-p\) be large. Then
\(\forall \pmb a\), the \(100(1-\alpha) \%\) simultaneous CI for \(\pmb a ' \pmb \mu\) is \(\pmb a ' \bar {\pmb X} \pm \sqrt{ \chi_p^2 (\alpha)} \sqrt{ \dfrac{\pmb a ' S \pmb a} {n}}\).
- Simultaneous CI for \(\mu_i\)
$ \[\bar x_i \pm \sqrt{\chi_p^2 (\alpha)} \sqrt{\dfrac{S_{ii}}{n}}\] $
- Bonferroni’s CI for \(\mu_i\)
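$ \[\bar x_i \pm z \left( \dfrac{\alpha}{2p} \right) \sqrt{\dfrac{S_{ii}}{n}}\] $
- As in the normal-theory case, Bonferroni’s CI is more precise (narrower).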
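The two large-sample intervals for \(\mu_i\) differ only in the multiplier of \(\sqrt{S_{ii}/n}\), so they can be compared directly. The sketch below is restricted to \(p = 2\), an assumption made so that the \(\chi^2_2\) percentile has the closed form \(-2\ln\alpha\) and only the standard library is needed; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def chi2_upper_2df(alpha):
    """Upper 100*alpha-th percentile of chi-square with 2 df.
    For 2 df the survival function is exp(-x/2), so the percentile is -2 ln(alpha)."""
    return -2.0 * math.log(alpha)

def multipliers_p2(alpha):
    """Multipliers of sqrt(S_ii / n) for p = 2: (simultaneous, Bonferroni)."""
    p = 2
    simultaneous = math.sqrt(chi2_upper_2df(alpha))
    # z(alpha / 2p): upper percentile of the standard normal
    bonferroni = NormalDist().inv_cdf(1 - alpha / (2 * p))
    return simultaneous, bonferroni

sim, bon = multipliers_p2(0.05)
```

At \(\alpha = 0.05\) the Bonferroni multiplier comes out smaller than the simultaneous one, which is the sense in which the Bonferroni interval is narrower.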
5.3.6 1. Profile Analysis (wk4, 5)
If \(\pmb X \sim N_p (\pmb \mu, \Sigma)\) and the variables in \(\pmb X\) are measured in the same unit, we may wish to compare the means \(\mu_1 , \cdots, \mu_p\) in \(\pmb \mu\).
Example (repeated measures): a measurement is taken on the same experimental unit at \(p\) successive times.
A profile is a plot, connecting \((i, \mu_i), i= 1, \cdots, p\)
Question: is the profile flat?
$ \[\begin{align*} &H_0: \mu_1 = \cdots = \mu_p \\ \iff &H_0: C_1 \pmb \mu = \pmb 0 , \left[ C_1\right]_{(p-1) \times p} \\ \iff &H_0: C_2 \pmb \mu = \pmb 0 , \left[ C_2\right]_{(p-1) \times p} \end{align*}\] $
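The notes do not spell out \(C_1\) and \(C_2\). Two common choices of contrast matrix, given here as an assumption about what is meant, are "each mean against the first" and "successive differences":

$ \[\begin{align*} C_1 = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ -1 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ -1 & 0 & 0 & \cdots & 1 \end{bmatrix}_{(p-1) \times p} , \qquad C_2 = \begin{bmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & -1 & 1 \end{bmatrix}_{(p-1) \times p} \end{align*}\] $

Both have rank \(p-1\) and rows that sum to zero, so \(C_1 \pmb \mu = \pmb 0\) and \(C_2 \pmb \mu = \pmb 0\) each say \(\mu_1 = \cdots = \mu_p\).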
If \(\pmb X \sim N_p (\pmb \mu, \Sigma)\), then \(C \pmb X \sim N_{p-1} (C \pmb \mu, C \Sigma C')\) and \(C \bar {\pmb X} \sim N_{p-1} (C \pmb \mu, \tfrac{1}{n} C \Sigma C')\). Thus when \(H_0 : C \pmb \mu = \pmb 0\) is true, \(C \bar {\pmb X} \sim N_{p-1} (\pmb 0, \tfrac{1}{n} C \Sigma C')\).
test stat \(T^2 = n (C \bar {\pmb X})' (C S C')^{-1} (C \bar {\pmb X}) \overset{H_0}{\sim} (n-1) \dfrac{p-1}{n-p+1} F_{p-1,n-p+1}\)
reject \(H_0\), if \(T^2 > (n-1) \dfrac{p-1}{n-p+1} F_{p-1,n-p+1} (\alpha)\).
Note: \(C_{(p-1) \times p}\) is not square, so it has no inverse. Thus \(C\) does not cancel out of the test statistic.
More generally, consider
$ \[H_0 : C \pmb \mu = \pmb 0\] $
where \(C\) is \(q \times p\) \((q \le p)\) with \(rank(C)=q\). Then
test stat \(T^2 = n (C \bar {\pmb X})' (C S C')^{-1} (C \bar {\pmb X}) \overset{H_0}{\sim} (n-1) \dfrac{q}{n-q} F_{q,n-q}\),
that is, \(p-1\) in the previous result is replaced by \(q\).
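The contrast-based statistic can be sketched as follows. All names are hypothetical, the contrast matrix is the successive-difference choice (an assumption), \(C S C'\) is assumed nonsingular, and the critical value \(F_{q,n-q}(\alpha)\) is left to the caller.

```python
def mat_vec(a, x):
    """Matrix-vector product a x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def mat_mul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (aug[i][m] - tail) / aug[i][i]
    return x

def successive_difference_contrasts(p):
    """(p-1) x p matrix whose i-th row encodes mu_{i+1} - mu_i."""
    return [[-1.0 if j == i else 1.0 if j == i + 1 else 0.0 for j in range(p)]
            for i in range(p - 1)]

def contrast_t2(xbar, s, n, c):
    """T^2 = n (C xbar)' (C S C')^{-1} (C xbar)."""
    cx = mat_vec(c, xbar)
    csc = mat_mul(mat_mul(c, s), transpose(c))
    w = solve(csc, cx)                      # (C S C')^{-1} C xbar
    return n * sum(u * v for u, v in zip(cx, w))

def t2_threshold(n, q, f_crit):
    """(n-1) q / (n-q) * F_{q,n-q}(alpha); f_crit supplied by the caller."""
    return (n - 1) * q / (n - q) * f_crit
```

Any full-rank contrast matrix for the same hypothesis gives the same \(T^2\), so the choice between "against the first" and "successive differences" is a matter of convenience.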
5.3.7 2. Test for Linear Trend
suppose \(p\) variables are measured across equally spaced time periods. Also suppose \(H_0 : \mu_1 = \cdots = \mu_p\) is rejected.
Question: Do the means fall onto a straight line?
$ \[\begin{align*} &H_0: \mu_2-\mu_1 = \cdots = \mu_p-\mu_{p-1} \\ \iff &H_0: \mu_3 -2 \mu_2+\mu_1 = 0, \; \; \cdots, \; \; \mu_p - 2 \mu_{p-1} + \mu_{p-2} = 0 \\ \iff C_{(p-2) \times p}, &H_0: C \pmb \mu = \pmb 0 \end{align*}\] $
Here the test statistic satisfies \(T^2 \overset {H_0} {\sim} (n-1) \dfrac{p-2}{n-p+2} F_{p-2,n-p+2}\), which is the general contrast result with \(q = p-2\).
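For concreteness, the \((p-2) \times p\) matrix of second differences that encodes the linear-trend hypothesis can be written out as follows; each row picks out \(\mu_{i+2} - 2\mu_{i+1} + \mu_i\):

$ \[\begin{align*} C = \begin{bmatrix} 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & -2 & 1 \end{bmatrix}_{(p-2) \times p} \end{align*}\] $

For example, with \(p = 4\) the rows are \([1, -2, 1, 0]\) and \([0, 1, -2, 1]\), and means such as \((1, 3, 5, 7)\) that lie on a straight line give \(C \pmb \mu = \pmb 0\).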
5.3.8 3. Inferences about a Covariance Matrix
Let \(\pmb X_1 , \cdots, \pmb X_n \overset {iid} {\sim} N_p (\pmb \mu , \Sigma)\) be a random sample.
$ \[H_0 : \Sigma = \Sigma_0\] $
let \(W = (n-1)S = \sum_{i=1}^n (\pmb X_i - \bar {\pmb X})(\pmb X_i - \bar {\pmb X})'\). then
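$ \[\Lambda^\ast = \left( \dfrac{e}{v} \right)^{\tfrac{pv}{2}} \vert \Sigma_0^{-1} W \vert^{\tfrac{v}{2}} \exp \left\{ -\tfrac{1}{2} tr (\Sigma_0^{-1} W) \right\} , \; \; \; \; \; \; \; v=n-1\] $
Then calculate \(L=-2 \ln \Lambda^\ast\), whose distribution under \(H_0\) is approximated by a function of the \(\chi^2\)-distribution.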
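As a numerical sketch of \(L\), assume the modified likelihood-ratio form \(\Lambda^\ast = (e/v)^{pv/2} \vert \Sigma_0^{-1} W \vert^{v/2} \exp\{-\tfrac{1}{2} tr(\Sigma_0^{-1} W)\}\) with \(W = vS\). Substituting \(W = vS\) and taking \(-2\ln\) gives \(L = v \left[ \ln \vert \Sigma_0 \vert - \ln \vert S \vert + tr(\Sigma_0^{-1} S) - p \right]\). The code below evaluates this for \(p = 2\) only, so that determinants and inverses can be written explicitly; the function names are hypothetical.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a nonsingular 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def neg2_log_lambda_star(s, sigma0, n):
    """L = -2 ln(Lambda*) for p = 2, using
    L = v [ ln|Sigma0| - ln|S| + tr(Sigma0^{-1} S) - p ],  v = n - 1.
    Assumes s and sigma0 are positive definite."""
    p, v = 2, n - 1
    inv0 = inv2(sigma0)
    trace = sum(inv0[i][k] * s[k][i] for i in range(2) for k in range(2))
    return v * (math.log(det2(sigma0)) - math.log(det2(s)) + trace - p)
```

When \(S = \Sigma_0\) the bracket is \(0 - 0 + p - p\), so \(L = 0\); the further \(S\) is from \(\Sigma_0\), the larger \(L\) becomes.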
- Test for Sphericity (Test for no Correlation)
$ \[H_0 : \Sigma = \sigma^2 I\] $
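$ \[\Lambda = \dfrac{\vert S \vert}{\left( tr(S) / p \right)^p}\] $
A scaled version of \(-\ln \Lambda\) is then referred to a function of the \(\chi^2\)-distribution.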
- Test for Compound Symmetry
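If \(\Sigma = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & & \vdots \\ \vdots & & \ddots & \rho \\ \rho & \cdots & \rho & 1 \\ \end{bmatrix}\), then \(\Sigma\) has compound symmetry: equal variances \(\sigma^2\) and equal covariances \(\sigma^2 \rho\).
$ \[H_0: \Sigma = \sigma^2 \left[ (1-\rho) I + \rho \pmb 1 \pmb 1' \right]\] $
Compute \(\Lambda = \dfrac{\vert S \vert} {(S^2)^p (1-r)^{p-1} (1+ (p-1)r)}\), where
- \(S^2 = \dfrac{1}{p} \sum_{i=1}^p S_{ii}\) (the average sample variance),
- \(r = \dfrac{2}{p(p-1) S^2} \sum_{i<j} S_{ij}\) (the average sample covariance divided by \(S^2\)).
Reject \(H_0\) if \(Q> \chi_f^2 (\alpha)\), where
- \(f= \tfrac{p(p+1)-4}{2}\),
- \(Q = -\left[ (n-1) - \dfrac{p(p+1)^2(2p-3)}{6(p-1)(p^2+p-4)} \right] \ln\Lambda\).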
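The compound-symmetry statistic can be sketched numerically. The code assumes that \(S^2\) is the average of the diagonal of \(S\), that \(r\) is the average off-diagonal covariance divided by \(S^2\), and that \(Q = -\left[ (n-1) - \tfrac{p(p+1)^2(2p-3)}{6(p-1)(p^2+p-4)} \right] \ln \Lambda\) with \(f = \tfrac{p(p+1)-4}{2}\). The function names are hypothetical and the \(\chi^2_f(\alpha)\) critical value is left to the caller.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n = len(a)
    sign, prod = 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                     # a row swap flips the sign
        prod *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return sign * prod

def compound_symmetry_stat(s, n):
    """Return (Lambda, Q, f) for testing compound symmetry.
    Assumes s is positive definite, p >= 2, and 1 + (p-1) r > 0."""
    p = len(s)
    s2 = sum(s[i][i] for i in range(p)) / p
    off = sum(s[i][j] for i in range(p) for j in range(i + 1, p))
    r = 2.0 * off / (p * (p - 1) * s2)
    lam = det(s) / (s2 ** p * (1 - r) ** (p - 1) * (1 + (p - 1) * r))
    correction = p * (p + 1) ** 2 * (2 * p - 3) / (6 * (p - 1) * (p * p + p - 4))
    q = -((n - 1) - correction) * math.log(lam)
    f = (p * (p + 1) - 4) // 2               # degrees of freedom
    return lam, q, f
```

If \(S\) is itself exactly compound symmetric, \(\vert S \vert\) equals the denominator, so \(\Lambda = 1\) and \(Q = 0\); departures from the pattern push \(\Lambda\) below 1 and \(Q\) above 0.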