The Binomial Distribution
In later chapters we shall consider, in considerable detail, a number of important discrete random variables. For the moment we shall simply study one of these
and then use it to illustrate a number of important concepts.
EXAMPLE 7.
Suppose that items coming off a production line are classified
as defective (D) or nondefective (N). Suppose that three items are chosen at
random from a day's production and are classified according to this scheme.
The sample space for this experiment, say S, may be described as follows:
S = {DDD, DDN, DND, NDD, NND, NDN, DNN, NNN}.
(Another way of describing S is as S = S1 × S2 × S3, the Cartesian product of
S1, S2, and S3, where each Si = {D, N}.)
Let us suppose that with probability 0.2 an item is defective and hence with
probability 0.8 an item is nondefective. Let us assume that these probabilities are
the same for each item, at least throughout the duration of our study. Finally, let
us suppose that the classification of any particular item is independent of the
classification of any other item. Using these assumptions, it follows that the
probabilities associated with the various outcomes of the sample space S as
described above are
\[(0.2)^{3},\quad (0.8)(0.2)^{2},\quad (0.8)(0.2)^{2},\quad (0.8)(0.2)^{2},\quad (0.2)(0.8)^{2},\quad (0.2)(0.8)^{2},\quad (0.2)(0.8)^{2},\quad (0.8)^{3}.\]
Our interest usually is not focused on the individual outcomes of S. Rather, we
simply wish to know how many defectives were found (irrespective of the order
in which they occurred). That is, we wish to consider the random variable X
which assigns to each outcome s ∈ S the number of defectives found in s. Hence
the set of possible values of X is {0, 1, 2, 3}.
We can obtain the probability distribution for X, p(x_i) = P(X = x_i), as follows:
X = 0 if and only if NNN occurs;
X = 1 if and only if DNN, NDN, or NND occurs;
X = 2 if and only if DDN, DND, or NDD occurs;
X = 3 if and only if DDD occurs.
(Note that {NNN} is equivalent to {X = 0}, etc.) Hence
\[p(0)=P(X=0)=(0.8)^{3},\qquad p(1)=P(X=1)=3(0.2)(0.8)^{2},\]
\[p(2)=P(X=2)=3(0.8)(0.2)^{2},\qquad p(3)=P(X=3)=(0.2)^{3}.\]
Observe that the sum of these probabilities equals 1, for the sum may be written
as \((0.8+0.2)^{3}\).
Note: The above discussion illustrates how the probabilities in the range space R_X (in
this case {0, 1, 2, 3}) are induced by the probabilities defined over the sample space S.
For the assumption that the eight outcomes of
S = {DDD, DDN, DND, NDD, NND, NDN, DNN, NNN}
have the probabilities given in Example 7 determined the value of p(x) for every x ∈ R_X.
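This induced distribution can also be checked mechanically by enumerating S. The following is a minimal sketch in Python (the language and the names used are our own illustrative choices, not part of the text):

```python
from itertools import product

# Illustrative sketch: enumerate S = {D, N}^3 and accumulate P(X = k),
# where X counts the defectives in each outcome.
p_item = {"D": 0.2, "N": 0.8}
dist = {k: 0.0 for k in range(4)}
for outcome in product("DN", repeat=3):
    prob = 1.0
    for c in outcome:
        prob *= p_item[c]          # items are classified independently
    dist[outcome.count("D")] += prob

print(dist)  # approximately {0: 0.512, 1: 0.384, 2: 0.096, 3: 0.008}
```

The printed values agree with p(0), p(1), p(2), p(3) above.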
Let us now generalize the notions introduced in the above example.
Definition.
Consider an experiment ℰ and let A be some event associated with
ℰ. Suppose that P(A) = p and hence P(Ā) = 1 - p. Consider n independent
repetitions of ℰ. Hence the sample space consists of all possible sequences
{a1, a2, ..., an}, where each ai is either A or Ā, depending on whether A
or Ā occurred on the ith repetition of ℰ. (There are 2^n such sequences.)
Furthermore, assume that P(A) = p remains the same for all repetitions.
Let the random variable X be defined as follows: X = number of times the
event A occurred. We call X a binomial random variable with parameters n
and p. Its possible values are obviously 0, 1, 2, ..., n. (Equivalently, we say
that X has a binomial distribution.) The individual repetitions of ℰ will be
called Bernoulli trials.
Theorem 1. Let X be a binomial random variable based on n repetitions. Then
\[P(X=k)=\binom{n}{k}p^{k}(1-p)^{n-k},\qquad k=0,1,\ldots,n.\]
Proof. Consider a particular element of the sample space satisfying the condition that the first
k repetitions of ℰ resulted in the occurrence of A, while the last n - k repetitions
resulted in the occurrence of Ā; that is,
\[\underbrace{AA\cdots A}_{k}\;\underbrace{\bar{A}\bar{A}\cdots \bar{A}}_{n-k}.\]
Since all the repetitions are independent, the probability of this particular sequence would be
\[p^{k}(1-p)^{n-k}.\]
But exactly the same probability would be associated with any other outcome for which X = k. The total number of such outcomes equals \(\binom{n}{k}\), for we must choose exactly k positions (out of n) for the A's. But this yields the above result, since these \(\binom{n}{k}\) outcomes are all mutually exclusive.
Notes: (a) To verify our calculation we note that, using the binomial theorem, we have
\[\sum_{k=0}^{n}P(X=k)=\sum_{k=0}^{n}\binom{n}{k}p^{k}(1-p)^{n-k}=[p+(1-p)]^{n}=1^{n}=1,\]
as it should be. Since the probabilities \(\binom{n}{k}p^{k}(1-p)^{n-k}\) are obtained by expanding the binomial expression \([p+(1-p)]^{n}\), we call this the binomial distribution.
(b) Whenever we perform independent repetitions of an experiment and are interested
only in a dichotomy (defective or nondefective, hardness above or below a certain
standard, noise level in a communication system above or below a preassigned threshold),
we are potentially dealing with a sample space on which we may define a binomial random
variable. So long as the conditions of experimentation stay sufficiently uniform that
the probability of some attribute, say A, stays constant, we may use the above model.
(c) If n is small, the individual terms of the binomial distribution are relatively easy to
compute. However, if n is reasonably large, these computations become rather cumbersome.
Fortunately, the binomial probabilities have been tabulated. There are many
such tabulations. (See Appendix.)
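When no table is at hand, the individual terms are also easy to evaluate on a machine. Here is a minimal sketch in Python (the helper name binomial_pmf is our own illustrative choice, not part of the text); it also carries out the normalization check of Note (a):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Illustrative helper: P(X = k) for a binomial random variable
    with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Note (a)'s check: the n + 1 terms sum to [p + (1 - p)]^n = 1.
n, p = 3, 0.2
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(abs(total - 1.0) < 1e-12)   # True

# The same terms as Example 7: 0.512, 0.384, 0.096, 0.008.
print([round(binomial_pmf(k, n, p), 3) for k in range(n + 1)])
```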
EXAMPLE 8.
Suppose that a radio tube inserted into a certain type of set has
a probability of 0.2 of functioning more than 500 hours. If we test 20 tubes, what
is the probability that exactly k of these function more than 500 hours, k =
0, 1, 2, ..., 20?
If X is the number of tubes functioning more than 500 hours, we shall assume
that X has a binomial distribution. Thus
\[P(X=k)=\binom{20}{k}(0.2)^{k}(0.8)^{20-k},\qquad k=0,1,\ldots,20.\]
The following values may be read from Table 4.1.
P(X = 0) = 0.012 | P(X = 4) = 0.218 | P(X = 8) = 0.022
P(X = 1) = 0.058 | P(X = 5) = 0.175 | P(X = 9) = 0.007
P(X = 2) = 0.137 | P(X = 6) = 0.109 | P(X = 10) = 0.002
P(X = 3) = 0.205 | P(X = 7) = 0.055 | P(X = k) = 0+ for k ≥ 11
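These tabled entries can be spot-checked directly from the formula; a brief sketch (our own illustration, not part of the text):

```python
from math import comb

# Spot-check a few entries of the table above (n = 20, p = 0.2).
n, p = 20, 0.2
for k in (0, 3, 4, 10):
    print(k, round(comb(n, k) * p**k * (1 - p)**(n - k), 3))
# 0 0.012
# 3 0.205
# 4 0.218
# 10 0.002
```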
EXAMPLE 9.
In operating a certain machine, there is a certain probability that
the machine operator makes an error. It may be realistically assumed that the
operator learns in the sense that the probability of his making an error decreases
as he uses the machine repeatedly. Suppose that the operator makes n attempts
and that the n trials are statistically independent. Suppose specifically that
P(an error is made on the ith repetition) = 1/(i + 1), i = 1, 2, ..., n. Assume
that 4 attempts are contemplated (that is, n = 4) and we define the random
variable X as the number of machine operations made without error. Note that
X is not binomially distributed because the probability of "success" is not constant.
To compute the probability that X = 3, for instance, we proceed as follows:
X = 3 if and only if there is exactly one unsuccessful attempt. This can happen
on the first, second, third, or fourth trial. Hence
\[P(X=3)=\tfrac{1}{2}\cdot \tfrac{2}{3}\cdot \tfrac{3}{4}\cdot \tfrac{4}{5}+\tfrac{1}{2}\cdot \tfrac{1}{3}\cdot \tfrac{3}{4}\cdot \tfrac{4}{5}+\tfrac{1}{2}\cdot \tfrac{2}{3}\cdot \tfrac{1}{4}\cdot \tfrac{4}{5}+\tfrac{1}{2}\cdot \tfrac{2}{3}\cdot \tfrac{3}{4}\cdot \tfrac{1}{5}=\tfrac{5}{12}.\]
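More generally, P(X = k) for independent trials with unequal success probabilities can be built up one trial at a time. A minimal sketch (the function successes_pmf is our own illustrative name, not part of the text):

```python
def successes_pmf(probs):
    """Illustrative sketch: P(X = k), k = 0..n, for n independent trials
    whose ith trial succeeds with probability probs[i] (not necessarily equal)."""
    dist = [1.0]                       # before any trial, P(X = 0) = 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # this trial fails
            new[k + 1] += q * p        # this trial succeeds
        dist = new
    return dist

# Example 9: P(no error on attempt i) = i/(i + 1), i = 1, ..., 4.
probs = [i / (i + 1) for i in range(1, 5)]
print(successes_pmf(probs)[3])         # 0.41666... = 5/12
```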
EXAMPLE 10.
Consider a situation similar to the one described in Example 9. This time we shall assume that there is a constant probability p1 of making no error on the machine during each of the first n1 attempts and a constant probability p2 ≤ p1 of making no error on each of the next n2 repetitions. Let X be the number
of successful operations of the machine during the n = n1 + n2 independent attempts. Let us find a general expression for P(X = k). For the same reason as given in the preceding example, X is not binomially distributed. To obtain P(X = k) we proceed as follows.
Let Y1 be the number of correct operations during the first n1 attempts and let Y2 be the number of correct operations during the second n2 attempts. Hence Y1 and Y2 are independent random variables and X = Y1 + Y2. Thus X = k
if and only if Y1 = r and Y2 = k - r for some integer r satisfying 0 ≤ r ≤ n1 and 0 ≤ k - r ≤ n2.
The above restrictions on r are equivalent to 0 ≤ r ≤ n1 and k - n2 ≤ r ≤ k.
Combining these, we may write
\[\max(0,\,k-n_{2})\le r\le \min(k,\,n_{1}).\]
Hence we have
\[P(X=k)=\sum_{r=\max(0,\,k-n_{2})}^{\min(k,\,n_{1})}\binom{n_{1}}{r}p_{1}^{r}(1-p_{1})^{n_{1}-r}\binom{n_{2}}{k-r}p_{2}^{k-r}(1-p_{2})^{n_{2}-(k-r)}.\]
With our usual convention that \(\binom{a}{b}=0\) whenever b > a or b < 0, we may write
the above probability as
\[P(X=k)=\sum_{r=0}^{n_{1}}\binom{n_{1}}{r}p_{1}^{r}(1-p_{1})^{n_{1}-r}\binom{n_{2}}{k-r}p_{2}^{k-r}(1-p_{2})^{n_{2}-k+r}.\]
For instance, if p1 = 0.2, p2 = 0.1, n1 = n2 = 10, and k = 2, the above
probability becomes
\[P(X=2)=\sum_{r=0}^{2}\binom{10}{r}(0.2)^{r}(0.8)^{10-r}\binom{10}{2-r}(0.1)^{2-r}(0.9)^{8+r}\approx 0.23\]
after a straightforward calculation.
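The sum of products of the two binomial probabilities is easy to evaluate numerically; a sketch (our own code, confirming the value above):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Illustrative helper; returns 0 outside 0 <= k <= n, matching
    the convention binom(n, k) = 0."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example 10: p1 = 0.2, p2 = 0.1, n1 = n2 = 10, k = 2.
n1, n2, p1, p2, k = 10, 10, 0.2, 0.1, 2
prob = sum(binomial_pmf(r, n1, p1) * binomial_pmf(k - r, n2, p2)
           for r in range(n1 + 1))
print(round(prob, 4))                  # 0.2301
```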
Note: Suppose that p1 = p2. In this case, Eq. (6) should reduce to
\[\binom{n}{k}p_{1}^{k}(1-p_{1})^{n-k},\]
since now the random variable X does have a binomial distribution. To see that this is
so, note that we may write (since n1 + n2 = n)
\[P(X=k)=p_{1}^{k}(1-p_{1})^{n-k}\sum_{r=0}^{n_{1}}\binom{n_{1}}{r}\binom{n_{2}}{k-r}.\]
To show that the above sum equals \(\binom{n}{k}\), simply compare the coefficients of \(x^{k}\)
on both sides of the identity \((1+x)^{n_{1}}(1+x)^{n_{2}}=(1+x)^{n_{1}+n_{2}}\).
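For any particular n1, n2, and k this comparison of coefficients can also be checked mechanically (a throwaway sketch, our own illustration):

```python
from math import comb

# Check: sum over r of C(n1, r) * C(n2, k - r) equals C(n1 + n2, k).
n1, n2, k = 10, 10, 2
lhs = sum(comb(n1, r) * comb(n2, k - r)
          for r in range(max(0, k - n2), min(k, n1) + 1))
print(lhs == comb(n1 + n2, k))  # True
```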