5.1 Overview of mva (not ended)
Find relationships b/w \(\pmb x_p, \pmb y_q\), e.g., * Response variables (variable directed) * PCA * Factor Analysis * mv Regression * Cannonical Correlation Analysis * Experiment units (individual directed) * Discriminant Analysis * Cluster Analysis * MANOVA
Multivariate techniques tend to be exploratory. * i.e. not hypothesis testing type
Experimental units must be independent. Time series data are not appropriate for this course.
5.1.1 Notation
Variable \(y_1 , \cdots, y_p\)
One observation \(\pmb y ' = (y_1 , \cdots, y_p )\)
Data Matrix \(Y_{n \times p} = \begin{bmatrix} \pmb y ' = (y_1 , \cdots, y_p ) \\ \vdots \\ \pmb y ' = (y_1 , \cdots, y_p ) \end{bmatrix}\)
Random Samples: Suppose we intend to collect n sets of measurements on p variables, but not been observed yet. If $x_1 ‘, , x_n’ $ are independent observation from pdf \(f(\pmb x) = f(x_1, \cdots, x_p)\), then $x_1 ‘, , x_n’ $ are said to be rs from \(f(\pmb x)\).
rvec \(\pmb X = X_{p \times n} = \begin{bmatrix} \pmb x_1 \\ \vdots \\ \pmb x_p \end{bmatrix}\)
mean vector \(\pmb \mu = E (\pmb x) = \begin{bmatrix} \mu1 \\ \vdots \\ \mup \end{bmatrix}\)
Covariance Matrix \(\Sigma\)
Correlation Matrix \(\rho\), \(\rho_{ij} = \tfrac{\sigma_{ij}}{\sigma_{ii}\sigma_{jj}}\) * Correlation measures linear association. * Correlation is 0 if symmetric non-linear association exists.
5.1.2 Summary Statistics
- Sample Mean Vector \(\bar X=\) estimate of \(\pmb \mu\)
- Sample Covariance Matrix
- Sample Correlation Matrix
5.1.3 Statistical Inference on Correlation
$ H_0: = 0 $
test stat \(T=\dfrac {r \sqrt{n-2}}{\sqrt{1-r^2}} \sim t_{n-2}\), where \(r=Corr(x,y)\) and \(x \sim N_2, y \sim N_2\) Notes: 1. Correlation measures a linear relationships 2. it is still difficult to get a CI for \(\rho\).
5.1.3.0.1 CI for \(\rho\)
- Fisher’s Method:
\(100(1-\alpha)%\) CI for \(\rho\) $=((r) )
- Ruben’s Method
let \(u=z_{\alpha/2}, r^\ast = \dfrac {r}{\sqrt{1-r^2}\)
$ \[\begin{align*} a &= 2n-3-u^2 b &= r^\ast \ast \sqrt{(2n-3)(2n-5)} c &= (2n-5-u^2} \ast (r^\ast)^2 - 2u^2 \end{align*}\] $
set \(ay^2 - 2by +c =0\), then root of this equation will be \(y_1, y_2 = \dfrac{b}{a} \pm \dfrac {\sqrt{b^2-ac}}{a}\).
이때 \(100(1-\alpha)%\) CI for \(\rho\) \(=\left[ \dfrac{y_1}{\sqrt{1+y_1^2}, \dfrac{y_2}{\sqrt{1+y_2^2}, \right]\) * 이때, input은 \(n, \alpha, r\), output은 \(\rho\)의 CI.