The Multivariate Normal Distribution
In univariate statistical quality control, we generally use the normal distribution to describe the behavior of a continuous quality characteristic. The univariate normal probability density function is\[f(x)=\frac{1}{\sqrt{2\pi {{\sigma }^{2}}}}{{e}^{-\frac{1}{2}{{(\frac{x-\mu }{\sigma })}^{2}}}}\]
where −∞ < x < ∞.
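As a quick numerical illustration (the values below are arbitrary and not from the text), the density can be evaluated directly from this formula; scipy's norm.pdf returns the same number and serves only as a cross-check.

```python
import numpy as np
from scipy.stats import norm

# Arbitrary example values (not from the text)
x, mu, sigma = 12.0, 10.0, 2.0

# Density evaluated directly from the formula above
f = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / np.sqrt(2 * np.pi * sigma**2)

# Same value from scipy, as a cross-check
print(f, norm.pdf(x, loc=mu, scale=sigma))
```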
The mean of the normal distribution is μ and the variance is σ². Note that (apart from the minus sign) the term in the exponent of the normal distribution can be written as follows:
\[(x-\mu)(\sigma^{2})^{-1}(x-\mu)\]
This quantity measures the squared standardized distance from x to the mean μ, where by the term “standardized” we mean that the distance is expressed in standard deviation units. This same approach can be used in the multivariate normal distribution case. Suppose that we have p variables, given by x1, x2, . . . , xp. Arrange these variables in a p-component vector x′ = [x1, x2, . . . , xp]. Let μ′ = [μ1, μ2, . . . , μp] be the vector of the means of the x’s, and let the variances and covariances of the random variables in x be contained in a p × p covariance matrix Σ. The main diagonal elements of Σ are the variances of the x’s and the off-diagonal elements are the covariances. Now the squared standardized (generalized) distance from x to μ is
\[(x-\mu )'\Sigma^{-1}(x-\mu )\]
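The Python sketch below evaluates this generalized distance for made-up values of x, μ, and Σ (none of these numbers come from the text); for p = 1 the same expression reduces to the squared standardized distance shown earlier.

```python
import numpy as np

# Illustrative values (not from the text): p = 2 quality characteristics
x = np.array([12.0, 7.5])            # observed vector x
mu = np.array([10.0, 8.0])           # mean vector
Sigma = np.array([[4.0, 1.5],        # covariance matrix: variances on the
                  [1.5, 1.0]])       # diagonal, covariances off the diagonal

# Squared generalized distance (x - mu)' Sigma^{-1} (x - mu).
# Solving a linear system avoids forming Sigma^{-1} explicitly.
diff = x - mu
d2 = diff @ np.linalg.solve(Sigma, diff)
print(d2)
```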
The multivariate normal density function is obtained simply by replacing the standardized distance in equation (11.4) by the multivariate generalized distance in equation (11.5) and changing the constant term to a more general form that makes the area under the probability density function unity regardless of the value of p. Therefore, the multivariate normal probability density function is
\[f(x)=\frac{1}{(2\pi )^{p/2}\left| \Sigma \right|^{1/2}}\,e^{-\frac{1}{2}(x-\mu )'\Sigma^{-1}(x-\mu )}\]
where −∞ < xj < ∞, j = 1, 2, . . . , p.
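As a minimal sketch using the same made-up values as in the earlier distance example, the density can be computed directly from this formula and cross-checked against scipy.stats.multivariate_normal:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Arbitrary example values (not from the text)
p = 2
x = np.array([12.0, 7.5])
mu = np.array([10.0, 8.0])
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

# Density evaluated directly from the formula above
diff = x - mu
d2 = diff @ np.linalg.solve(Sigma, diff)
f = np.exp(-0.5 * d2) / ((2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(Sigma)))

# Same density from scipy, as a cross-check
f_scipy = multivariate_normal(mean=mu, cov=Sigma).pdf(x)
print(f, f_scipy)   # the two values should agree
```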
A multivariate normal distribution for p = 2 variables (called a bivariate normal) is shown in Fig. 11.3. Note that the density function is a surface. The correlation coefficient between the two variables in this example is 0.8, and this causes the probability to concentrate closely along a line.
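The figure itself is not reproduced here, but the shape it describes is easy to recreate. The text specifies only the correlation of 0.8, so the zero means and unit variances below are assumptions made for illustration; sampling from such a bivariate normal shows the points clustering tightly around a line, which is what concentrates the probability.

```python
import numpy as np

# The text gives only the correlation (rho = 0.8); the means and variances
# below are assumptions for illustration.
rho = 0.8
sigma1, sigma2 = 1.0, 1.0
mu = np.array([0.0, 0.0])
Sigma = np.array([[sigma1**2,             rho * sigma1 * sigma2],
                  [rho * sigma1 * sigma2, sigma2**2            ]])

# Draw samples from the bivariate normal; with rho = 0.8 the points fall
# close to a line, so the density surface is concentrated along it.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, Sigma, size=5000)
print(np.corrcoef(samples[:, 0], samples[:, 1])[0, 1])  # close to 0.8
```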