Frequency Distributions - NayiPathshala


12/29/2017

Frequency Distributions

MEASURES OF CENTRAL TENDENCY

2.1. Frequency Distributions.

 When observations, discrete or continuous, are available on a single characteristic of a large number of individuals, it often becomes necessary to condense the data as far as possible without losing any
information of interest. Let us consider the marks in Statistics obtained by 250
candidates selected at random from among those appearing in a certain examination.
TABLE 1: MARKS IN STATISTICS OF 250 CANDIDATES

32 47 41 51 41 30 39 18 48 53
54 32 31 46 15 37 32 56 42 48
38 26 50 40 38 42 35 22 62 51
44 21 45 31 37 41 44 18 37 47
68 41 30 52 52 60 42 38 38 34
41 53 48 21 28 49 42 36 41 29
30 33 31 35 29 37 38 40 32 49
43 32 24 38 38 22 41 50 17 46
46 50 26 15 23 42 25 52 38 46
41 38 40 37 40 48 45 30 28 31
40 33 42 36 51 42 56 44 35 38
31 51 45 41 50 53 50 32 45 48
40 43 40 34 34 44 38 58 49 28
40 45 19 24 34 47 37 33 37 36
36 32 61 30 44 43 50 31 38 45
46 40 32 34 44 54 35 39 31 48
48 50 43 55 43 39 41 48 53 34
32 31 42 34 34 32 33 24 43 39
40 50 27 47 34 44 34 33 47 42
17 42 57 35 38 17 33 46 36 23
48 50 31 58 33 44 26 29 31 37
47 55 57 37 41 54 42 45 47 43
37 52 47 46 44 50 44 38 42 19
52 45 23 41 47 33 42 24 48 39
48 44 60 38 38
This representation of the data does not furnish any useful information and is
rather confusing to the mind. A better way may be to express the figures in an
ascending or descending order of magnitude, commonly termed as array. But this
does not reduce the bulk of the data. A much better representation is given on the
next page.
A bar ( | ) called tally mark is put against the number when it occurs. Having
occurred four times, the fifth occurrence is represented by putting a cross tally (/)
on the first four tallies. This technique facilitates the counting of the tally marks
at the end.
The representation of the data as above is known as frequency distribution.
Marks are called the variable (x) and the 'number of students' against the marks
is known as the frequency (f) of the variable. The word 'frequency' is derived
from 'how frequently' a variable occurs. For example, in the above case the
frequency of 31 is 10 as there are ten students getting 31 marks. This representation,
though better than an array, does not condense the data much and it is
quite cumbersome to go through this huge mass of data.
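
The tally procedure described above can be sketched in Python. This is only an illustration: it uses the first two rows of Table 1 as a small sample, and the helper name `tally` and the text form `||||/` for a crossed group of five are assumptions made here, not notation from the source.

```python
from collections import Counter

# First two rows of Table 1, used as a small sample (assumption: the
# source digits are read as two-digit marks).
marks = [32, 47, 41, 51, 41, 30, 39, 18, 48, 53,
         54, 32, 31, 46, 15, 37, 32, 56, 42, 48]

# Frequency f of each distinct value x of the variable.
frequency = Counter(marks)

def tally(f):
    """Render a frequency as tally marks; each full group of five is '||||/'."""
    groups, rest = divmod(f, 5)
    parts = ["||||/"] * groups
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)

# Print the frequency distribution in ascending order of marks.
for x in sorted(frequency):
    print(f"{x:>3}  {tally(frequency[x]):<12} {frequency[x]}")
```

Applied to all the marks in Table 1 instead of the sample, the same `Counter` call would give the full frequency distribution.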



If the identity of the individuals about whom a particular information is taken
is not relevant, nor the order in which the observations arise, then the first real step
of condensation is to divide the observed range of the variable into a suitable number
of class-intervals and to record the number of observations in each class. For
example, in the above case, the data may be expressed as shown in Table 3.
Such a table showing the distribution

all the values from 20 to 24, both inclusive, and the classification is termed as
inclusive type classification.
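
A minimal sketch of inclusive type classification follows. It assumes classes of width 5 starting at 15 (so that 20-24 is one of the classes) and uses only the first row of Table 1. The function name `inclusive_class` and its parameters are hypothetical.

```python
from collections import Counter

def inclusive_class(x, start=15, width=5):
    """Return the inclusive class (lower, upper) that contains the integer x.

    Assumes x >= start; both limits belong to the class.
    """
    lower = start + ((x - start) // width) * width
    return (lower, lower + width - 1)

marks = [32, 47, 41, 51, 41, 30, 39, 18, 48, 53]  # first row of Table 1

# Number of observations falling in each class.
grouped = Counter(inclusive_class(x) for x in marks)
for (lower, upper), f in sorted(grouped.items()):
    print(f"{lower}-{upper}: {f}")
```

Both 20 and 24 map to the class 20-24 here, while 25 falls in the next class, 25-29.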
In spite of the great importance of classification in statistical analysis, no hard and
fast rules can be laid down for it. The following points may be kept in mind for
classification:
(i) The classes should be clearly defined and should not lead to any ambiguity.
(ii) The classes should be exhaustive, i.e., each of the given values should be
included in one of the classes.
(iii) The classes should
(iv) The classes should be of equal width. The principle, however, cannot be
rigidly followed. If the classes are of varying width, the different class frequencies
will not be comparable. Comparable figures can be obtained by dividing the value
of the frequencies by the corresponding widths of the class intervals. The ratios
thus obtained are called 'frequency densities'.
(v) Indeterminate classes, e.g., the open-end classes, less than 'a' or greater
than 'b', should be avoided as far as possible since they create difficulty in analysis
and interpretation.
(vi) The number of classes should neither be too large nor too small. It should
preferably lie between 5 and 15. However, the number of classes may be more
than 15 depending upon the total frequency and the details required, but it is
desirable that it is not less than 5 since in that case the classification may not reveal
the essential characteristics of the population. The following formula due to
Sturges may be used to determine an approximate number k of classes:
k = 1 + 3.322 log10 N,
where N is the total frequency.
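
Two of the points above lend themselves to a short calculation: the frequency density from point (iv) and Sturges' formula from point (vi). The sketch below assumes N = 250, the number of candidates stated for Table 1. The function names and the class frequencies used for the densities are hypothetical.

```python
import math

def sturges_classes(n):
    """Approximate number of classes, k = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)

def frequency_density(frequency, width):
    """Class frequency divided by the width of its class interval."""
    return frequency / width

k = sturges_classes(250)
print(round(k, 3))   # about 8.966
print(round(k))      # suggests roughly 9 classes

# Hypothetical classes of unequal width: 20 observations in a class of
# width 5 and 30 observations in a class of width 10.
print(frequency_density(20, 5))    # 4.0
print(frequency_density(30, 10))   # 3.0
```

In the hypothetical case the wider class has the larger frequency but the smaller density, which is why the densities rather than the raw frequencies are compared when widths differ.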
The Magnitude of the Class Interval
Having fixed the number of classes, divide the range (the difference between
the greatest and the smallest observation) by it and the nearest integer to this value
gives the magnitude of the class interval. Broad class intervals (i.e., less number
of classes) will yield only rough estimates while for high degree of accuracy small
class intervals (i.e., large number of classes) are desirable.
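
As a rough illustration of this rule, the sketch below takes 15 and 68 as the smallest and greatest marks in Table 1 and assumes 9 classes. The function name `class_width` is hypothetical. Python's `round` sends exact halves to the nearest even integer, which does not arise in this example.

```python
def class_width(observations, k):
    """Nearest integer to (greatest - smallest observation) / k."""
    data_range = max(observations) - min(observations)
    return round(data_range / k)

# Only the two extreme marks are needed to obtain the range.
extremes = [15, 68]
print(class_width(extremes, 9))    # range 53, 53 / 9 is about 5.89, so 6
print(class_width(extremes, 10))   # 53 / 10 = 5.3, so 5
```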
Class Limits
The class limits should be chosen in such a way that the mid-value of the class
interval and actual average of the observations in that class interval are as near to
each other as possible. If this is not the case then the classification gives a distorted
picture of the characteristics of the data. If possible, class limits should be located
at the points which are multiple of 0, 2, 5, 10, ..., etc., so that the midpoints of the
classes are the common figures, viz., 0, 2, 5, 10, ..., etc., the figures capable of easy
and simple analysis.
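
One way to check a choice of class limits against this criterion is to compare each mid-value with the actual average of the observations in the class. The sketch below is an assumption-laden illustration: it uses inclusive classes of width 5 and the first row of Table 1, and the function name `midpoint_gap` is hypothetical.

```python
from statistics import mean

def midpoint_gap(lower, upper, observations):
    """Mid-value of the class minus the mean of the observations inside it.

    Both limits are treated as inclusive. An empty class raises
    statistics.StatisticsError, since it has no average.
    """
    inside = [x for x in observations if lower <= x <= upper]
    mid_value = (lower + upper) / 2
    return mid_value - mean(inside)

marks = [32, 47, 41, 51, 41, 30, 39, 18, 48, 53]  # first row of Table 1

print(midpoint_gap(30, 34, marks))   # mid-value 32, average of 32 and 30 is 31
print(midpoint_gap(45, 49, marks))   # mid-value 47, average of 47 and 48 is 47.5
```

Gaps close to zero across all classes suggest that the mid-values represent their classes well.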




2.1.1. Continuous Frequency Distribution.


 If we deal with a continuous
variable, it is not possible to arrange the data in the class intervals of above type.
Let us consider the distribution of age in years. If class intervals are 15-19,
20-24 then the persons with ages between 19 and 20 years are not taken into
consideration. In such a case we form the class intervals as shown below.
Age in years
Below 5
5 or more but less than 10
10 or more but less than 15
15 or more but less than 20
20 or more but less than 25
and so on.
Here all the persons with any fraction of age are included in one group or the
other. For practical purposes we rewrite the above classes as
0-5
5-10
10-15
15-20
20-25
This form of frequency distribution is known as continuous frequency distribution.
It should be clearly understood that in the above classes, the upper limits of
each class are excluded from the respective classes. Such classes in which the upper
limits are excluded from the respective classes and are included in the immediate
next class are known as 'exclusive classes' and the classification is termed as
'exclusive type classification'.
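
The exclusive rule can be sketched as follows, assuming classes of width 5 starting at 0 as in the age example. The function name `exclusive_class` is hypothetical.

```python
def exclusive_class(age, width=5):
    """Return the class (lower, upper) with lower <= age < upper.

    Assumes a non-negative age; the upper limit belongs to the next class.
    """
    lower = int(age // width) * width
    return (lower, lower + width)

print(exclusive_class(19.5))   # (15, 20): ages between 19 and 20 are covered
print(exclusive_class(20))     # (20, 25): the limit 20 goes to the next class
print(exclusive_class(4.9))    # (0, 5)
```

An age of exactly 20 is therefore counted in 20-25 and not in 15-20.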




2.2. Graphic Representation of a Frequency Distribution.

 It is often useful to represent a frequency distribution by means of a diagram which makes the
unwieldy data intelligible and conveys to the eye the general run of the observations. Diagrammatic representation also facilitates the comparison of two or more
frequency distributions. We consider below some important types of graphic
representation.
