5.3 Inference about Mean Vector (wk3)

5.3.1 Overview

Recall: univariate case \(x_1 , \cdots, x_n \overset {iid} {\sim} N(\mu, \sigma^2)\)

\(H_0 : \mu = \mu_0\)

$ \[\begin{alignat*}{2} \text{test stat } &t &&=\tfrac{\bar X - \mu_0}{\tfrac{S}{\sqrt{n}}} &\overset{H_0}{\sim} t_{n-1} \\ &t^2 &&=\tfrac{(\bar X - \mu_0)^2}{\tfrac{S^2}{n}} &\overset{H_0}{\sim} F_{1, \; n-1} \end{alignat*}\] $

reject \(H_0\) if as below, which means upper \((100-\alpha)\)th percentile.


$ \[\begin{alignat*}{1} \tfrac{(\bar X - \mu_0)^2}{\tfrac{S^2}{n}} = n(\bar X - \mu_0)\tfrac{1}{S^2}(\bar X - \mu_0) &> F_{1,n-1}(\alpha) \end{alignat*}\] $

therefore, with assumption \(\pmb X_1 , \cdots, \pmb X_n \overset {iid} {\sim} N_p (\pmb \mu , \Sigma)\),

$

H_0 : = _0

$

$ \[\begin{alignat*}{3} \text{Hotelling's }T^2 \; \; T^2 &= n(\bar {\pmb X} - \pmb \mu_0)' S^{-1} (\bar {\pmb X} - \pmb \mu_0) \\ &\overset{H_0}{\sim} \tfrac{(n-1)p}{(n-p)} F_{p,n-p} \\ \iff \; \; \tfrac {(n-p)} {(n-1)p} T^2 &\overset{H_0}{\sim} F_{p,n-p} \end{alignat*}\] $

reject \(H_0\), if \(T^2 > \tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha)\).



assumption check: \(\pmb X_1 , \cdots, \pmb X_n \overset{iid}{\sim} N_p (\pmb \mu , \Sigma)\).



5.3.1.0.1 Remark

stat \(T^2\)는 측정 단위에 invariant. proof)

let $Y_{p } = C_{p p} X_{p } + d_{p } $. then

$ \[\begin{align*} \bar {\pmb Y} &= C \bar {\pmb X} + \pmb d \\ \\ S_{\pmb y} &= CSC'\\ \\ \mu_y &= E(\pmb Y) \\ &=C \ast E(\pmb X) + \pmb d \\ &= C \pmb \mu_0 + \pmb d \end{align*}\] $

therefore,

$ \[\begin{align*} T^2 &= n(\bar {\pmb Y} - \pmb \mu_y)' S_y^{-1} (\bar {\pmb Y} - \pmb \mu_y) \\ &= n \left[ C(\bar {\pmb X} - \pmb \mu_0) \right]' (CSC')^{-1} \left[ C(\bar {\pmb X} - \pmb \mu_0) \right] \\ &= n (\bar {\pmb X} - \pmb \mu_0)' C' (C')^{-1} S^{-1}(C)^{-1} C(\bar {\pmb X} - \pmb \mu_0) \\ &= n (\bar {\pmb X} - \pmb \mu_0)' S^{-1}(\bar {\pmb X} - \pmb \mu_0) \end{align*}\] $

여기서 \(C^{-1}\)이 존재한다는게 뭔수로 보장되는거지?




5.3.2 1. Confidence Region

5.3.2.0.1 Confidence Region

region \(R(\pmb X)\), is $100(1-) % $ CR of

$ \[\begin{alignat*}{3} &P \left\{ R(\pmb X) \text{ will cover the true } \pmb \theta \right\} &&= 1-\alpha \\ &P \left\{ n (\hat {\pmb X} - \pmb \mu)' S^{-1}(\hat {\pmb X} - \pmb \mu) \le \tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha) \right\} &&= \end{alignat*}\] $

the inequality \(n (\bar {\pmb X} - \pmb \mu)' S^{-1}(\bar {\pmb X} - \pmb \mu) \le \tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha)\) will define a region \(R(\pmb X)\).

The region is an ellipsoid centered at \(\bar {\pmb X}\).

Testing \(H_0 : \mu = \mu_0\) at \(\alpha =.05\) is equivalent to see whether \(\mu_0\) falls within the CR.

  • with ev \(\lambda_1 , \cdots, \lambda_p\), evec \(\pmb e_1 , \cdots, \pmb e_p\) of \(S\),
    • CR Axis: \(\pm \sqrt{\lambda}\sqrt{\tfrac{(n-1)p}{(n-p)} F_{p,n-p} (\alpha)} \ast \pmb e_i'\)
    • CR half-length: $ $

5.3.3 2. Simultaneous CI

let \(\pmb X \sim N_p (\pmb \mu, \Sigma)\), then linear combination \(\pmb a' \pmb X \sim N_p (\pmb a' \pmb \mu, \pmb a' \Sigma \pmb a)\)

$ \[\begin{align*} t=\dfrac{\bar X - \mu} {S / \sqrt{n}} &\sim t_{n-1} \tag{recall: univariate}\\ t= \dfrac {\pmb a ' \bar X - \pmb a ' \pmb \mu} {\sqrt{\pmb a ' S \pmb a / n } } = \dfrac {\sqrt{n}(\pmb a ' \bar X - \pmb a ' \pmb \mu)} {\sqrt{\pmb a ' S \pmb a} } &\sim t_{n-1} \tag{MV} \end{align*}\] $

therefore, \(100(1-\alpha)\%\) CI for \(\pmb a ' \mu\) (at here, \(\pmb a\)is fixed) is \(\pmb a ' \bar {\pmb X} \pm t_{n-1} \dfrac {\alpha} {2} \dfrac{\sqrt{\pmb a ' S \pmb a} } {\sqrt{n}}\). This is not a simultaneous CI. let each \((a_1 , a_2), (b_1, b_2)\) be CI for \(\mu_1 , \mu_2\). then simultaneous CI \((a_1 , a_2), (b_1, b_2)\) has confidence \(95\% \ast 95\% = 90.25\%\). need a wider interval.

let rs \(\pmb X_1 , \cdots, \pmb X_n \overset {iid} {\sim} N_p (\pmb \mu , \Sigma)\).

then, simultaneously for all \(\pmb a\), the interval \(\pmb a ' \bar {\pmb X} \pm \sqrt{\dfrac{n-1}{n} \dfrac{p}{n-p} F_{p,n-p} (\alpha) \pmb a ' S \pmb a}\) will contain \(\pmb a ' \pmb \mu\) with probability \(1-\alpha\).

$ \[\begin{alignat*}{3} \because 1-\alpha &= P \left[ n (\bar {\pmb X } - \pmb \mu)' S^{-1} (\bar {\pmb X } - \pmb \mu) \le (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \right] \\ &= P \left[ (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu)' (\pmb a' S \pmb a)^{-1} (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu) \le \dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \right] \\ &= P \left[ (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu)' (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu) \le \dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a) \right] \tag{∵ Scalar} \\ &= P \left[ (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu)^2 \le \dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a) \right] \\ &= P \left[ - \sqrt{\dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a)} \le (\pmb a' \bar {\pmb X } - \pmb a' \pmb \mu) \le \sqrt{\dfrac{1}{n} (n-1) \dfrac {p}{n-p} F_{p,n-p} (\alpha) \; \ast \; (\pmb a' S \pmb a)} \right] \end{alignat*}\] $

5.3.3.0.1 Simultaneous CI for \(\mu_i - \mu_k\)

let \(\pmb a ' = [0,\cdots, 0, a_i, 0, \cdots, 0, a_k, 0, \cdots, 0]\). then as below, where \(S =\begin{bmatrix} S_{11} & \cdots &S_{1p} \\ & \ddots & \\ S_{p1} & \cdots & S_{pp} \end{bmatrix}\).

$ \[\begin{align*} \pmb a ' \pmb \mu &= \mu_i - \mu_k \\ \pmb a ' S \pmb a =S_{ii} -2 S_{ik} + S_kk \end{align*}\] $

therefore, the simultaneous CI for \(\mu_i - \mu_k\), is \((\bar x_i - \bar x_k ) \pm \sqrt{\dfrac{n-1}{n} \dfrac{p}{n-p} F_{p, n-p}(\alpha)S_{ii} -2 S_{ik} + S_kk}\).


at here, if we let \(\pmb a ' = [1, 0, \cdots, 0]\).

then

$ \[\begin{align*} \pmb a ' \pmb \mu &= \mu_1\\ \pmb a ' S \pmb a =S_{11} \end{align*}\] $

therefore, the simultaneous CI for $_1 $, is \(\bar x_1 \pm \sqrt{\dfrac{n-1}{n} \dfrac{p}{n-p} F_{p, n-p}(\alpha)S_{11}}\).


5.3.4 3. Note: Bonferroni Multiple Comparison

Bonferroni’s CI, \(\bar x_1 \pm \left\{ t_{n-1} \left( \dfrac{\alpha}{2p} \right) \right\} \sqrt{\dfrac{S_11}{n}}\), is more precise (narrower) than simultaneous CI.


5.3.5 4. Large Sample Inferences about a Mean Vector

Recall mv CLT:

let \(\pmb X_1 , \cdots, \pmb X_n {\sim} ?(\pmb \mu, \Sigma)\) and for \(n-p\) large. then

$ \[\begin{align*} \sqrt{n} (\bar {\pmb X} - \pmb \mu) &\overset {d}{\Longrightarrow} N_p (\pmb 0, \Sigma) \\ n (\bar {\pmb X} - \pmb \mu)' S^{-1}(\bar {\pmb X} - \pmb \mu) &\overset {d}{\Longrightarrow} \chi^2_p \end{align*}\] $


when the sample size is large, the MVN assumption is less critical. therefore,

let \(\pmb X_1 , \cdots, \pmb X_n {\sim} ?(\pmb \mu, \Sigma)\).

$ H_0: = _0 $

when \(n-p\) is large, the \(H_0\) is rejected if \(n (\bar {\pmb X} - \pmb \mu)' S^{-1}(\bar {\pmb X} - \pmb \mu) > \chi^2_p (\alpha)\).

Note: \((n-1) \dfrac{p}{n-p} F_{p,n-p} )\alpha \simeq \chi_p^2(\alpha)\), for large \(n-p\).


  • CI:

$ P = 1- $

the inequality $ n ({X } - )’ S^{-1} ({X } - ) _p^2 () $ will define a region, which means, \(100(1-\alpha) \%\) region.

  • Simultaneous CI:

let \(\pmb X_1 , \cdots, \pmb X_n {\sim} ?(\pmb \mu, \Sigma)\) and for \(n-p\) large. then

\(\forall \pmb a\), \(100(1-\alpha) \%\) simultaneous CI for \(\pmb a ' \pmb \mu\) \(= \pmb a ' \bar {\pmb X} \pm \sqrt{ \chi_p^2 (\alpha)} \sqrt{ \dfrac{\pmb a ' S \pmb a} {n}}\).

  • Simultaneous CI for \(\mu_i\)

$ x_i $

  • Bonferroni’s CI for \(\mu_i\)

$ x_i z_{} $ - Bonferroni’s CI is more precise. as also.

5.3.6 1. Profile Analysis (wk4, 5)

if \(\pmb X \sim N_p (\pmb \mu, \Sigma)\), and the variables in \(\pmb X\) are measured in the same unit, we may with to compare the means \(\mu_1 , \cdots, \mu_p\) in \(\pmb \mu\).

ex) repeated measure: a measurement is taken at the same experimental unit \(p\) successive times.

A profile is a plot, connecting \((i, \mu_i), i= 1, \cdots, p\)


Question: is the profile flat?

$ \[\begin{align*} &H_0: \mu_1 = \cdots = \mu_p \\ \iff &H_0: C_1 \pmb \mu = \pmb 0 , \left[ C_1\right]_{(p-1) \times p} \\ \iff &H_0: C_2 \pmb \mu = \pmb 0 , \left[ C_2\right]_{(p-1) \times p} \end{align*}\] $


if \(\pmb X \sim N_p (\pmb \mu, \Sigma)\), then \(C \pmb X \sim N_{p-1} (C \pmb \mu, C \Sigma C')\), thus when \(H_0 : C \pmb \mu = 0\) is true, then \(C \bar X \sim N_{p-1} (C \pmb \mu, C \Sigma C')\).

test stat \(T^2 = n (C \bar {\pmb X})' (C S C')^{-1} (C \bar {\pmb X}) \overset{H_0}{\sim} (n-1) \dfrac{p-1}{n-p+1} F_{p-1,n-p+1}\)

reject \(H_0\), if \(T^2 > (n-1) \dfrac{p-1}{n-p+1} F_{p-1,n-p+1} (\alpha)\).

**Note: \(C_{(p-1) \times p}\) is not square, so there’s no inverse. thus \(C\) in test stat doesn’t be canceled.

$ H_0 : C = 0 $

where \(C_{q \times p} (q \le p)\), and \(rank(C)=q\). then

test stat \(T^2 = n (C \bar {\pmb X})' (C S C')^{-1} (C \bar {\pmb X}) \overset{H_0}{\sim} (n-1) \dfrac{q}{n-q} F_{q,n-q}\)

which means \(p-1\) become \(q\).


5.3.7 2. Test for Linear Trend

suppose \(p\) variables are measured across equally spaced time periods. Also suppose \(H_0 : \mu_1 = \cdots = \mu_p\) is rejected.

Question: Do the means fall onto a straight line?

$ \[\begin{align*} &H_0: \mu_2-\mu_1 = \cdots = \mu_p-\mu_{p-1} \\ \iff &H_0: \mu_3 -2 \mu_2+\mu_1 = 0, \; \; \cdots, \; \; \mu_p - 2 \mu_{p-1} + \mu_{p-2} = 0 \\ \iff C_{(p-2) \times p}, &H_0: C \pmb \mu = \pmb 0 \end{align*}\] $

at here, we acquire test stat \(T^2 \overset {H_0} {\sim} (n-1) \dfrac{p-2}{n-p+2} F_{p-2,n-p+2}\).


5.3.8 3. Inferences about a Covariance Matrix

let rs \(\pmb X_1 , \cdots, \pmb X_n \overset {iid} {\sim} N_p (\pmb \mu , \Sigma)\).

$ H_0 : = _0 $

let \(W = (n-1)S = \sum_{i=1}^n (\pmb X_i - \bar {\pmb X})(\pmb X_i - \bar {\pmb X})'\). then

$

^

= ( )^{} _0^{-1} W ^{} , ; ; ; ; ; ; ; v=n-1

$

then calculate \(L=-2 ln \Lambda^\ast \; \; \; \; \; \; \; \overset {H_0}{\sim}\) function of \(\chi^2\)-distribution.


  • Test for Sphericity (Test for no Correlation)

$ H_0 : = ^2 I $

$ = $ function of \(\chi^2\)-distribution.


  • Test for Compound Symmetry

if \(\Sigma = \begin{bmatrix} \sigma^2 & \rho & \cdots & \rho \\\rho & \sigma^2 & & \vdots \\ \vdots & \rho & \ddots & \rho \\ \rho & \cdots & \rho & \sigma^2 \\ \end{bmatrix}\), then \(\Sigma\) has compound symmetry.

$ H_0: $

Compute \(\Lambda = \dfrac{\vert S \vert} {(S^2)^p (1-r)^{p-1} (1+ (p-1)r)}\), where - $S^2 = {i=1}^p S{ii} $. - $r = {i<j}^p S{ij} $.


reject \(H_0\) if \(Q> \chi_f^2 (\alpha), \; \; \; \; \; f= \tfrac{p(p+1)-4}{2}\) - \(Q = -\dfrac{(N-1)-p(p+1)^2(2p-3)}{6(p-1)(p^2+p-4)} \ast \ln\Lambda\).