Completely Randomized Design (CRD) - NayiPathshala

1/27/2018

Completely Randomized Design (CRD)

COMPLETELY RANDOMIZED DESIGN WITH AND WITHOUT SUB SAMPLES



Definition

Responses among experimental units vary due to many different causes, known and unknown. The process of separating and comparing sources of variation is called the Analysis of Variance (AOV). The process is more general than the t-test because any number of treatment means can be compared simultaneously. The sugar beet experiment discussed in Chapters 5 and 6 involved six rates of nitrogen fertilizer. Table 7-1 gives root yield data for the five replications of all six treatments.

Table 7-1. Root yields (tons/acre) of plots fertilized with six levels of nitrogen
Treatment                    Replications
(lb N/acre)     1       2       3       4       5       Total (Yi.)   Mean (Ȳi.)
A(0)            31.3    33.4    29.2    32.2    33.9    160.0         32.00
B(50)           38.8    37.5    37.4    35.8    38.4    187.9         37.58
C(100)          40.9    39.2    39.5    38.6    39.8    198.0         39.60
D(150)          40.9    41.7    39.4    40.1    40.0    202.1         40.42
E(200)          39.7    40.6    39.2    38.7    41.9    200.1         40.02
F(250)          40.6    41.0    41.5    41.1    39.8    204.0         40.80
Overall                                                 1152.1        38.40

In this case, the experimenter may want to compare the six treatment means simultaneously to decide if there is any difference among treatments. The AOV can be used for this purpose. It involves:

  1. The partitioning of the total sum of squares of the experiment into each specified source of variation. 
  2.  The estimation of the variance per experimental unit from these sources of variation. 
  3. The comparison of these variances by F-tests, which will lead to conclusions concerning the equality of the means.
For the experiment in Table 7-1, the total sum of squares for root yield can be separated into a sum of squares representing variability among treatment means (a between treatment sum of squares) and a sum of squares resulting from random variation among plots within treatments (a within treatment sum of squares). Each sum of squares divided by its appropriate df results in a mean square. The within treatment mean square measures the random variability among experimental units, an estimate of the population variance, σ². If there are no treatment effects, the between treatment mean square is also an estimate of σ². The ratio of the between treatment mean square to the within treatment mean square provides an F-test of the equality of treatment means.

Experiments must be designed to provide valid estimates of the population variance from various classifications of the experimental units. A principal feature of experimental design is the way in which experimental units are grouped, for example into treatments, blocks, locations, litters, years, etc., so that mean squares can be obtained for each source of variation. The exact form of the AOV therefore depends on the design used for the experiment. In chapters that follow, the AOV will be developed in the context of several designs.
Certain assumptions must be satisfied for an appropriate use of the AOV. These are:

  1. Measurements made on experimental units within a classification are normally distributed. For the data in Table 7-1, this means that root yields within the treatments are normally distributed.
  2. An observation made on one experimental unit is independent of any other experimental unit. That is, the root yield from one plot is not influenced by any other plot.
  3. The variances of different samples are homogeneous, i.e., each treatment variance estimates the same population variance.
  4. Treatment and environmental effects are additive.
 
Completely Randomized Design Without Sub samples

As the name implies, the completely randomized design (CRD) refers to the random assignment of experimental units to a set of treatments. It is essential to have more than one experimental unit per treatment to estimate the magnitude of experimental error and to make probability statements concerning treatment effects.

Randomization

To illustrate the procedure for the random assignment of experimental units to treatments, we will show how the treatments of Table 7-1 might have been assigned to the 30 experimental units (plots of land) of that experiment.
  1.  Arbitrarily number the experimental units (top left number in each plot of Figure 7-1). 
  2.  Refer to a table of random numbers (Appendix Table A-1). Note that some of our experimental units are two-digit numbers. Therefore we must use two lines or columns of the random number table. Start at some arbitrary point -- say we will read down columns 7 and 8 of Appendix Table A-1 and record the two digit numbers as we go, skipping those previously recorded, until we have a random number for each experimental unit (the number in the top middle of each plot of Figure 7-1). 
  3.  Rank the random numbers (top right number in each plot of Figure 7-1).
  4.  Assign each treatment in order (A through F) to plots according to the necessary ranks, to give as many replications as needed for each treatment. In this case, we want five replications per treatment.
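The four steps above can be sketched in Python. This is only an illustration: the random numbers are the ones recorded for plots 1 to 30 in Figure 7-1, and the variable names (`random_numbers`, `rank`, `assignment`) are chosen here for the example.

```python
# Random two-digit numbers read for plots 1..30, as recorded in Figure 7-1.
random_numbers = [58, 97, 42, 7, 49, 14, 96, 51, 74, 79,
                  13, 85, 64, 52, 62, 28, 92, 45, 20, 73,
                  44, 1, 31, 17, 25, 60, 95, 15, 53, 65]
treatments = "ABCDEF"
reps = 5  # replications wanted per treatment

# Step 3: rank the random numbers (1 = smallest); all numbers are distinct.
order = sorted(range(len(random_numbers)), key=lambda i: random_numbers[i])
rank = [0] * len(random_numbers)
for position, plot_index in enumerate(order, start=1):
    rank[plot_index] = position

# Step 4: ranks 1-5 receive A, ranks 6-10 receive B, and so on.
assignment = {plot + 1: treatments[(rank[plot] - 1) // reps]
              for plot in range(len(random_numbers))}

for plot in sorted(assignment):
    print(plot, random_numbers[plot - 1], rank[plot - 1], assignment[plot])
```

When no printed table is at hand, a shuffled list such as `random.sample(range(1, 31), 30)` from the standard library could play the role of the ranks directly; that substitution is a convenience assumed here, not part of the procedure described in the text.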
 1-58-18     7-96-29    13-64-21    19-20-07    25-25-08
 D(40.9)     F(41.0)    E(39.2)     B(37.5)     B(38.4)

 2-97-30     8-51-15    14-52-16    20-73-23    26-60-19
 F(40.6)     C(39.5)    D(41.7)     E(38.7)     D(40.1)

 3-42-11     9-74-24    15-62-20    21-44-12    27-95-28
 C(40.9)     E(39.7)    D(39.4)     C(39.8)     F(39.8)

 4-07-02    10-79-25    16-28-09    22-01-01    28-15-05
 A(31.3)     E(40.6)    B(38.8)     A(32.2)     A(33.9)

 5-49-14    11-13-03    17-92-27    23-31-10    29-53-17
 C(39.2)     A(29.2)    F(41.1)     B(37.4)     D(40.0)

 6-14-04    12-85-26    18-45-13    24-17-06    30-65-22
 A(33.4)     F(41.5)    C(38.6)     B(35.8)     E(41.9)

Figure 7-1. Thirty sugar beet plots numbered in sequence (left number); randomly assigned two-digit numbers from Appendix Table A-1 (middle number); the rank of each random number (right number); the assigned treatment (A through F); and the resulting root yields (in parentheses). See Table 7-1 for the root yields organized by treatments.
Analysis of Variance

The null hypothesis to be tested is:
H0: μ1 = μ2 = ... = μk for k treatments
The procedure for testing this hypothesis results in the construction and completion of an AOV table (Table 7-2). Note that there are only two sources of variation in the CRD, between and within treatments and that the total df in the experiment are partitioned into these two sources.


Table 7-2. Analysis of variance of a CRD.

Source                                   df         Sum of squares (SS)   Mean squares (MS)   Observed F
Total                                    kr - 1     TSS
Between treatments                       k - 1      SST                   MST                 MST/MSE
Within treatments (experimental error)   k(r - 1)   SSE                   MSE

where k is the number of treatments and r is the number of replications per treatment.

Table 7-3 is the completed AOV for the experiment of Figure 7-1.
Table 7-3. Analysis of variance for the experiment of Figure 7-1

Source                 df    Sum of squares (SS)   Mean squares (MS)   Observed F
Total                  29    311.13
Nitrogen treatments     5    277.69                55.54               39.95
Experimental error     24     33.44                 1.39


Since the observed F is greater than the 5% tabular F value with 5 and 24 degrees of freedom (2.62), the null hypothesis is rejected. The procedure involved in constructing such an AOV table is illustrated by the following steps.
Step 1:
Outline the AOV table and list the sources of variation and degrees of freedom. There are two sources of variation, between and within treatments. Degrees of freedom are one less than the number of observations in each source of variation. There are 6 treatments, therefore there are 5 degrees of freedom for the between treatment sum of squares (SST). There are 5 replications per treatment, therefore there are 4 degrees of freedom for each treatment times 6 treatments, which gives 24 degrees of freedom for the within treatment sum of squares (SSE). The degrees of freedom associated with the total variation in the experiment is one less than the total number of experimental units: 30 - 1 = 29. Note that the degrees of freedom associated with the sources of variation are additive, 5 + 24 = 29.

Step 2:
Calculate the correction term (C).

$$C = \frac{Y_{..}^2}{kr} = \frac{(1152.1)^2}{6(5)} = 44244.48$$

This is actually the sum of squares due to the mean.
Step 3: 
Calculate the total sum of squares (TSS).

$$TSS = \sum_i \sum_j (Y_{ij} - \bar{Y}_{..})^2 = \sum_i \sum_j Y_{ij}^2 - C = 31.3^2 + 38.8^2 + \dots + 39.8^2 - 44244.48 = 311.13$$

The correction term is used so that the sum of squares is calculated about the general mean $\bar{Y}_{..}$, not about 0.
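Steps 2 and 3 can be checked numerically with a short sketch like the one below. The data are the root yields of Table 7-1; the variable names are chosen for this example only.

```python
# Root yields (tons/acre) from Table 7-1, five replications per treatment.
yields = {
    "A": [31.3, 33.4, 29.2, 32.2, 33.9],
    "B": [38.8, 37.5, 37.4, 35.8, 38.4],
    "C": [40.9, 39.2, 39.5, 38.6, 39.8],
    "D": [40.9, 41.7, 39.4, 40.1, 40.0],
    "E": [39.7, 40.6, 39.2, 38.7, 41.9],
    "F": [40.6, 41.0, 41.5, 41.1, 39.8],
}

observations = [y for reps in yields.values() for y in reps]
grand_total = sum(observations)                      # Y..
correction = grand_total ** 2 / len(observations)    # C = Y..^2 / (k r)

# Shortcut form: uncorrected sum of squares minus the correction term.
tss = sum(y ** 2 for y in observations) - correction

# Definitional form: squared deviations about the general mean.
grand_mean = grand_total / len(observations)
tss_direct = sum((y - grand_mean) ** 2 for y in observations)

print(round(correction, 2), round(tss, 2), round(tss_direct, 2))
```

Both forms of TSS agree, which is the point of the remark that the correction term moves the sum of squares from "about 0" to "about the general mean".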

Step 4:
 

Calculate the sum of squares and mean square for treatments.

$$SST = r\sum_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 = \frac{\sum_i Y_{i.}^2}{r} - C = \frac{160.0^2 + 187.9^2 + \dots + 204.0^2}{5} - C = 44522.17 - 44244.48 = 277.69$$

A mean square is calculated by dividing the sum of squares by its degrees of freedom.

$$MST = \frac{SST}{k-1} = \frac{277.69}{6-1} = 55.54$$
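As a rough check of Step 4, the treatment sum of squares can be computed from the treatment totals of Table 7-1 in both the shortcut and the deviation form. The names below are illustrative.

```python
# Treatment totals (Yi.) from Table 7-1; each total is the sum of r = 5 plots.
treatment_totals = {"A": 160.0, "B": 187.9, "C": 198.0,
                    "D": 202.1, "E": 200.1, "F": 204.0}
r = 5
k = len(treatment_totals)

grand_total = sum(treatment_totals.values())
correction = grand_total ** 2 / (k * r)

# Shortcut form: sum of squared totals divided by r, minus C.
sst = sum(t ** 2 for t in treatment_totals.values()) / r - correction

# Deviation form: r times the squared deviations of treatment means.
grand_mean = grand_total / (k * r)
sst_dev = r * sum((t / r - grand_mean) ** 2 for t in treatment_totals.values())

mst = sst / (k - 1)
print(round(sst, 2), round(sst_dev, 2), round(mst, 2))
```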

Step 5 

Calculate the sum of squares and mean square for error.

$$SSE = TSS - SST = 311.13 - 277.69 = 33.44$$

$$MSE = \frac{SSE}{k(r-1)} = \frac{33.44}{24} = 1.39$$

The calculation of the sum of squares for error is based on the fact that the total degrees of freedom and total sum of squares can be partitioned into components, treatment and error. Thus the simplest method of obtaining the degrees of freedom for error and SSE is by subtraction.

The error sum of squares is actually the pooled within treatment sum of squares and can be directly calculated by:

$$SSE = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2$$

$$= \sum_j (Y_{1j} - \bar{Y}_{1.})^2 + \sum_j (Y_{2j} - \bar{Y}_{2.})^2 + \dots + \sum_j (Y_{kj} - \bar{Y}_{k.})^2$$

$$= \left(\sum_j Y_{1j}^2 - \frac{Y_{1.}^2}{r}\right) + \left(\sum_j Y_{2j}^2 - \frac{Y_{2.}^2}{r}\right) + \dots + \left(\sum_j Y_{kj}^2 - \frac{Y_{k.}^2}{r}\right)$$

$$= \left(31.3^2 + \dots + 33.9^2 - \frac{160.0^2}{5}\right) + \dots + \left(40.6^2 + \dots + 39.8^2 - \frac{204.0^2}{5}\right)$$

$$= 13.94 + \dots + 1.66 = 33.44$$

The mean square for error results from the pooling of within treatment variances.

$$MSE = \frac{SSE}{k(r-1)} = \frac{1}{k}\left\{\frac{\sum_j Y_{1j}^2 - Y_{1.}^2/r}{r-1} + \dots + \frac{\sum_j Y_{kj}^2 - Y_{k.}^2/r}{r-1}\right\} = \frac{S_1^2 + \dots + S_k^2}{k} = \frac{13.94/4 + \dots + 1.66/4}{6} = 1.39$$

The pooled mean square for error, MSE, is an estimate of the variability among experimental units not due to treatment effects, i.e., the mean square error estimates σ², the variance common to each of the populations from which the treatment samples were drawn. Thus pooling is only justified when each within treatment estimated variance, $S_i^2$, is a valid estimate of σ². This explains the requirement of the assumption that within treatment variances must be homogeneous in an AOV.

Step 6 

Compute F.
              F = MST/MSE
                 = 55.54/1.39 = 39.95

 In Appendix Table A-7, we see that for 5 and 24 degrees of freedom, an F value, 3.90, is the critical value at the 1% level. Since the observed F (39.95) greatly exceeds the 1% critical value, we have high confidence in rejecting the null hypothesis and conclude that there are significant differences among treatment means.
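Step 6 can be reproduced from the sums of squares in Table 7-3 with the sketch below. The critical value 3.90 is the tabular value quoted in the text, not computed here. Dividing unrounded mean squares gives an F of about 39.86, slightly below the 39.95 obtained in the text from the rounded values 55.54 and 1.39; the conclusion is the same.

```python
# Sums of squares and design constants taken from Table 7-3.
sst = 277.69   # between treatments
sse = 33.44    # within treatments (experimental error)
k = 6          # number of treatments
r = 5          # replications per treatment

mst = sst / (k - 1)
mse = sse / (k * (r - 1))
f_observed = mst / mse

# 1% critical value for 5 and 24 df, as quoted from Appendix Table A-7.
f_critical_1pct = 3.90

print(round(mst, 2), round(mse, 2), round(f_observed, 2))
print("reject H0 at the 1% level:", f_observed > f_critical_1pct)
```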
When Treatments Have Unequal Replications

When sample sizes (replications per treatment) are not equal, then:

$$SST = \sum_i r_i (\bar{Y}_{i.} - \bar{Y}_{..})^2 = \sum_i \frac{Y_{i.}^2}{r_i} - C$$

i.e., each treatment total is squared and divided by its own sample size before summation. Here the correction term is $C = Y_{..}^2 / \sum_i r_i$.

With equal replication, the MSE is simply the average of the within treatment variances. When treatments are not equally replicated, pooling involves the calculation of a weighted average of the within treatment variances, the weighting factor being the degrees of freedom for each treatment, i.e.:

$$MSE = \frac{(r_1 - 1)S_1^2 + (r_2 - 1)S_2^2 + \dots + (r_k - 1)S_k^2}{(r_1 - 1) + (r_2 - 1) + \dots + (r_k - 1)}$$

The numerator is the sum of squares for error, which can be calculated directly,

$$SSE = \left(\sum_j Y_{1j}^2 - \frac{Y_{1.}^2}{r_1}\right) + \dots + \left(\sum_j Y_{kj}^2 - \frac{Y_{k.}^2}{r_k}\right)$$

with $\sum_i r_i - k$ degrees of freedom.
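The unequal replication formulas can be sketched as follows. The data set is hypothetical (three made-up treatments with 3, 2 and 4 observations) and is used only to show that the weighted average of the variances equals SSE divided by its degrees of freedom.

```python
from statistics import variance

# Hypothetical data, invented for illustration: unequal replication per treatment.
groups = {
    "T1": [10.0, 12.0, 11.0],
    "T2": [14.0, 15.0],
    "T3": [9.0, 8.0, 10.0, 9.0],
}

n_total = sum(len(obs) for obs in groups.values())   # sum of the r_i
k = len(groups)
grand_total = sum(sum(obs) for obs in groups.values())
correction = grand_total ** 2 / n_total

# Each treatment total is squared and divided by its own sample size.
sst = sum(sum(obs) ** 2 / len(obs) for obs in groups.values()) - correction

# Direct SSE: within treatment sums of squares added together.
sse = sum(sum(y ** 2 for y in obs) - sum(obs) ** 2 / len(obs)
          for obs in groups.values())
df_error = n_total - k
mse = sse / df_error

# Same MSE as a weighted average of variances, weights = (r_i - 1).
mse_weighted = (sum((len(obs) - 1) * variance(obs) for obs in groups.values())
                / sum(len(obs) - 1 for obs in groups.values()))

print(round(sst, 4), round(sse, 4), df_error, round(mse, 4), round(mse_weighted, 4))
```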
The Nature of Error

To further understand the nature of "error" in this design, note that each observation contains an error component, which can be expressed in the following form:

$$Y_{ij} = \mu + (\mu_{i.} - \mu) + \varepsilon_{ij}$$

where $\mu$ is the overall mean, $\mu_{i.} - \mu$ represents the ith treatment effect, and $\varepsilon_{ij}$ is the random error component, without which all the observations of a given treatment would be the same. $\varepsilon_{ij}$ measures the deviation of an observation from the effects of known sources:

$$\varepsilon_{ij} = Y_{ij} - \mu - (\mu_{i.} - \mu) = Y_{ij} - \mu_{i.}$$

The error component for the first replicate of treatment C of Table 7-1 is estimated as:

$$\hat{\varepsilon}_{31} = 40.9 - 39.6 = 1.3$$

The sum of the squares of these estimated error components is the sum of squares for error, i.e.

$$SSE = \sum_i \sum_j \hat{\varepsilon}_{ij}^2 = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i.})^2$$
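As a check on the error components described above, the sketch below estimates every error term as the deviation of an observation from its treatment mean, using the Table 7-1 yields, and adds up their squares. The names are illustrative.

```python
# Root yields (tons/acre) from Table 7-1.
yields = {
    "A": [31.3, 33.4, 29.2, 32.2, 33.9],
    "B": [38.8, 37.5, 37.4, 35.8, 38.4],
    "C": [40.9, 39.2, 39.5, 38.6, 39.8],
    "D": [40.9, 41.7, 39.4, 40.1, 40.0],
    "E": [39.7, 40.6, 39.2, 38.7, 41.9],
    "F": [40.6, 41.0, 41.5, 41.1, 39.8],
}

# Estimated error component: observation minus its own treatment mean.
means = {name: sum(reps) / len(reps) for name, reps in yields.items()}
residuals = {name: [y - means[name] for y in reps]
             for name, reps in yields.items()}

e_31 = residuals["C"][0]   # first replicate of treatment C (third treatment)
sum_sq_residuals = sum(e ** 2 for res in residuals.values() for e in res)

print(round(e_31, 2), round(sum_sq_residuals, 2))
```

The squared residuals add up to the same 33.44 obtained for SSE by subtraction, and the residuals within each treatment sum to zero.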
Completely Randomized Design With Sub samples

As already discussed, the experimental unit is the unit of research material to which a treatment is applied. In many experiments, it is common to have the experimental unit consist of two or more observational units. For example, consider an experiment where 3 feeding rations are to be compared. Each ration is randomly assigned to 5 pens and each pen contains 4 animals. In this case the pen is the experimental unit and the observations made on individual animals within a pen are sub samples. There are two sources of random variation associated with any observation made on each animal. One is the random variation from pen to pen within treatments, and the other is random variation among animals within pens.

If the experimenter collects data on a pen basis, for example, weighs all animals in a pen together and expresses the result as total body weight or average body weight per animal, the appropriate AOV falls in the category of a one-way AOV without sub samples. The conclusions regarding treatment effects will be the same if individual animal data are analyzed. The additional information on animal variation can be useful in planning experiments with respect to a more efficient allocation of animals and pens to treatments. In some experiments, animals may be classified as experimental units. An example would be when the animals receive treatments individually, e.g., comparing the effects of hormones that can be injected into animals individually.

To illustrate the AOV with sub samples, we will use the same sugar beet experiment described at the beginning of the chapter. Now we will analyze the results of sucrose concentration obtained from two random sub samples per plot. Each sub sample consists of 10 beet roots. The observational unit is the sucrose concentration for each 10 beet root sub sample. The experimental unit is the plot which is represented by the total or the average of the two sub samples. The data are given in Table 7-4.

Table 7-4. % sucrose of two ten-beet sub samples per plot.

Treatment                       Replications
(lb N/acre)   Subsample   1       2       3       4       5       Total (Yi..)   Mean (Ȳi..)
A(0)          1           16.5    16.4    15.7    16.6    16.0
              2           16.4    15.8    15.3    16.1    16.8
              Y1j.        32.9    32.2    31.0    32.7    32.8    161.6          16.16
B(50)         1           16.0    14.4    15.5    15.6    16.4
              2           16.6    13.9    16.6    16.2    16.2
              Y2j.        32.6    28.3    32.1    31.8    32.6    157.4          15.74
C(100)        1           15.1    15.0    15.9    16.1    15.0
              2           15.6    14.3    16.2    15.2    14.5
              Y3j.        30.7    29.3    32.1    31.3    29.5    152.9          15.29
D(150)        1           15.6    14.7    15.6    15.4    15.6
              2           15.5    15.2    15.5    14.6    15.2
              Y4j.        31.1    29.9    31.1    30.0    30.8    152.9          15.29
E(200)        1           13.5    14.2    14.5    15.4    15.6
              2           14.3    13.3    15.1    15.1    15.2
              Y5j.        27.8    27.5    29.6    30.5    30.8    143.6          14.36
F(250)        1           14.2    12.5    15.1    14.0    14.3
              2           13.0    12.6    14.3    14.8    14.6
              Y6j.        27.2    25.1    29.4    28.8    28.9    139.4          13.94
Overall                                                           907.8          15.13

Note: the E(200) total and mean are the values consistent with the overall total (907.8) and with the treatment sum of squares in Table 7-5. The individual E(200) plot totals as printed sum to 146.2, so at least one E(200) entry appears to be misprinted (replication 5 repeats the D(150) values).


The AOV is given in Table 7-5, and the steps for completing the table are given below.


Table 7-5. Analysis of variance.

Source                    df    Sum of squares (SS)   Mean squares (MS)   Observed F
Total (samples)           59    62.65
Plots (exp. units)        29    55.71
  Nitrogen treatments      5    34.94                 6.99                8.13
  Experimental error      24    20.77                 0.86                3.74
Sampling error            30     6.94                 0.23


Step 1: 


  • List the sources of variation and degrees of freedom. There are three sources of variation: nitrogen treatments, plots within nitrogen treatments, and samples within plots. Total degrees of freedom are 59, one less than the total number of observations made. The combined variation of nitrogen and experimental error (plots within nitrogen) is the total variation among plots, with degrees of freedom equal to 29, one less than the total number of plots. There are six nitrogen levels, therefore there are 5 df for treatments. There are 5 replications per treatment, therefore there are 4 df per treatment, and with 6 treatments there are 24 df for experimental error. Note that 5 + 24 = 29 is the degrees of freedom due to total plots. Two observations per plot give 1 df per plot; with 30 plots, this gives 30 df for sampling error.
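The degrees of freedom bookkeeping in Step 1 can be written as a small helper. The function name `subsample_df` and its dictionary keys are made up for this sketch; k, r and n follow the notation of the text.

```python
def subsample_df(k, r, n):
    """Degrees of freedom for a CRD with k treatments, r plots per
    treatment and n sub samples per plot."""
    return {
        "total": k * r * n - 1,              # all observations minus one
        "plots": k * r - 1,                  # all experimental units minus one
        "treatments": k - 1,
        "experimental_error": k * (r - 1),   # plots within treatments
        "sampling_error": k * r * (n - 1),   # samples within plots
    }


df = subsample_df(k=6, r=5, n=2)
print(df)

# The pieces are additive: treatments + experimental error = plots,
# and plots + sampling error = total.
print(df["treatments"] + df["experimental_error"] == df["plots"])
print(df["plots"] + df["sampling_error"] == df["total"])
```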
Step 2: 

  • Correction factor, $C = Y_{...}^2/(krn)$, where n is the number of samples per plot, r is the number of replications per treatment and k is the number of treatments.

$$C = \frac{907.8^2}{6(5)(2)} = 13735.01$$
Step 3: 

  • Calculate the total sum of squares (TSS).

$$TSS = \sum_i \sum_j \sum_h (Y_{ijh} - \bar{Y}_{...})^2 = \sum_i \sum_j \sum_h Y_{ijh}^2 - C = 16.5^2 + 16.4^2 + \dots + 14.6^2 - 13735.01 = 62.65$$
Step 4: 

  • Calculate the sum of squares for experimental units (SSU).

$$SSU = n\sum_i \sum_j (\bar{Y}_{ij.} - \bar{Y}_{...})^2 = \frac{\sum_i \sum_j Y_{ij.}^2}{n} - C = \frac{32.9^2 + \dots + 28.9^2}{2} - 13735.01 = 55.71$$

This source of variation is computed to simplify the calculation of experimental and sampling error, i.e.,

$$SSE = SSU - SST$$

$$SSS = TSS - SSU$$

where SSS represents the sum of squares for sampling error.
Step 5: 

  • Sum of squares and the mean square for treatment.

$$SST = rn\sum_i (\bar{Y}_{i..} - \bar{Y}_{...})^2 = \frac{\sum_i Y_{i..}^2}{rn} - C = \frac{161.6^2 + \dots + 139.4^2}{5(2)} - 13735.01 = 34.94$$

$$MST = \frac{SST}{k-1} = \frac{34.94}{5} = 6.99$$
Step 6: 

  • Sum of squares and mean square for experimental error. 
 SSE = SSU - SST 
= 55.71 - 34.94 
= 20.77 
MSE = SSE/k(r-1) 
= 20.77/24 = 0.86 
 Step 7: 

  • Sum of squares and mean square for sampling error.

$$SSS = TSS - SSU = 62.65 - 55.71 = 6.94$$

$$MSS = \frac{SSS}{kr(n-1)} = \frac{6.94}{6(5)(1)} = 0.23$$

Step 8: 
  • Calculate F values. 
 For testing the hypothesis of equal treatment means, 
F = MST/MSE = 6.99/0.86 = 8.13


 The critical F-value for this test is based on 5 and 24 degrees of freedom. At the 1% level of significance the critical F value is 3.90, thus we will reject the null hypothesis and conclude that there are highly significant differences among treatment means. 
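Steps 4 through 8 can be traced from the three sums of squares that are computed from the data (TSS, SSU and SST), as in the sketch below. The values come from Table 7-5 and the critical value 3.90 is the one quoted in the text. With unrounded mean squares the treatment F is about 8.07 rather than the 8.13 obtained from the rounded values 6.99 and 0.86; the test conclusion is unchanged.

```python
# Design constants and sums of squares taken from the text and Table 7-5.
k, r, n = 6, 5, 2
tss = 62.65   # total (samples)
ssu = 55.71   # plots (experimental units)
sst = 34.94   # nitrogen treatments

# The two error sums of squares are obtained by subtraction.
sse = ssu - sst   # experimental error: plots within treatments
sss = tss - ssu   # sampling error: samples within plots

mst = sst / (k - 1)
mse = sse / (k * (r - 1))
mss = sss / (k * r * (n - 1))

# Treatments are tested against experimental error, not sampling error.
f_treatments = mst / mse
f_critical_1pct = 3.90   # 5 and 24 df, as quoted in the text

print(round(sse, 2), round(sss, 2))
print(round(mst, 2), round(mse, 2), round(mss, 2), round(f_treatments, 2))
print("reject H0 at the 1% level:", f_treatments > f_critical_1pct)
```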

 Since we now have information about the sampling variability, we can test a hypothesis related to the experimental error, which includes random variation among plots as well as between sucrose samples within plots.

 MSE estimates $\sigma_s^2 + n\sigma_e^2$, where $\sigma_s^2$ is the sampling variance (estimated by MSS) and $\sigma_e^2$ is the pure random plot to plot variation, which does not include sampling variation. The F test, F = MSE/MSS, can be performed to test

$$H_0: \sigma_e^2 = 0$$

 In our case,

$$F = 0.86/0.23 = 3.74$$

 with 24 and 30 df, which is significant beyond the 1% level. This indicates the existence of plot to plot variation, which is estimated as $S_e^2 = (MSE - MSS)/n$. For our experiment, $S_e^2 = (0.86 - 0.23)/2 \approx 0.31$. If the F test was not significant, i.e., $H_0: \sigma_e^2 = 0$ was not rejected, it would imply that the plot was not an important factor contributing to the variability in sucrose concentrations of sugar beet samples. Thus samples could be taken from fewer plots to test the nitrogen differences. The estimates of sampling and plot variances, $S_s^2$ and $S_e^2$, can be used to determine the effects of increasing the number of samples per plot (n) and/or the number of replications per treatment (r) on the precision of an experiment. For example, the variance of the difference between two means can be expressed as:

$$S_{\bar{d}}^2 = 2\left(\frac{S_s^2}{nr} + \frac{S_e^2}{r}\right)$$

Having estimates of $S_s^2$ and $S_e^2$ allows the experimenter to see how changes in n and r will affect the magnitude of $S_{\bar{d}}^2$ or the confidence interval for a mean difference ($L, U = \bar{d} \pm t_{\alpha/2} S_{\bar{d}}$). For instance, if $S_s^2$ is relatively large, then an increase in sampling number will effectively decrease $S_{\bar{d}}$. But if $S_e^2$ is relatively large, an increase in experimental units will be most effective in reducing $S_{\bar{d}}$.
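The trade-off between more samples per plot and more plots per treatment can be explored with a sketch like the following. It uses the rounded mean squares of Table 7-5 (MSE = 0.86, MSS = 0.23); the function name `variance_of_difference` is invented for this example.

```python
# Rounded mean squares from Table 7-5 and the design actually used.
mse, mss = 0.86, 0.23
n, r = 2, 5

f_plots = mse / mss          # F test of H0: plot to plot variance is zero
s2_s = mss                   # estimated sampling variance
s2_e = (mse - mss) / n       # estimated plot to plot variance


def variance_of_difference(n, r):
    """Variance of the difference between two treatment means with
    n samples per plot and r plots per treatment."""
    return 2 * (s2_s / (n * r) + s2_e / r)


print(round(f_plots, 2), round(s2_e, 3))
for n_new, r_new in [(2, 5), (4, 5), (2, 10)]:
    print(n_new, r_new, round(variance_of_difference(n_new, r_new), 3))
```

With these estimates, doubling the number of plots per treatment (n = 2, r = 10) lowers the variance of a mean difference more than doubling the number of samples per plot (n = 4, r = 5), which is what one expects when the plot to plot component is the larger of the two.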

Another example of an experiment with sub samples is an evaluation of pig sires where each sire is mated to several dams. In this case sires are considered as treatments, and dams are experimental units and serve as replications for sire evaluation. The body weights of newborn animals from each mating are sub samples. A significant F test, MSE/MSS, implies that a maternal effect exists which causes variation among body weights of newborn animals. The environmental influences are indicated by the magnitude of MSS. On the other hand, if this F test is not significant, it implies that the dam is not an important factor in terms of contributing to the variability in body weight of newborn animals. Perhaps baby animals can be sampled without identifying the dams they came from.
