INTRODUCTION
The description of statistical data may be quite elaborate or quite brief depending on
two factors: the nature of the data and the purpose for which the data have been
collected. While describing data statistically or verbally, one must ensure that the
description is neither too brief nor too lengthy. Measures of central tendency
enable us to compare two or more distributions pertaining to the same time period, or
the same distribution over time. For example, the consumption of tea
in two different territories for the same period, or in one territory for two years, say, 2003
and 2004, can be compared by means of an average.
ARITHMETIC MEAN
Adding all the observations and dividing the sum by the number of observations gives the arithmetic mean. Suppose we have the following observations:
10, 15, 30, 7, 42, 79 and 83
These are seven observations. Symbolically, the arithmetic mean, also called simply the mean, is
$\bar{x} = \frac{\sum x}{n}$, where $\bar{x}$ is the simple mean and $n$ is the number of observations.
$$\bar{x} = \frac{10 + 15 + 30 + 7 + 42 + 79 + 83}{7} = \frac{266}{7} = 38$$
It may be noted that the Greek letter μ is used to denote the mean of the population and n to denote the total number of observations in a population. Thus the population mean μ = ∑x/n. The formula given above is the basic formula that forms the definition of arithmetic mean and is used in case of ungrouped data where weights are not involved.
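The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `arithmetic_mean` is chosen here for illustration and does not come from the text.

```python
def arithmetic_mean(observations):
    """Return the simple (unweighted) arithmetic mean: sum of x divided by n."""
    if not observations:
        raise ValueError("at least one observation is required")
    return sum(observations) / len(observations)


# The seven observations used in the text.
observations = [10, 15, 30, 7, 42, 79, 83]

total = sum(observations)             # numerator, sum of x
n = len(observations)                 # denominator, number of observations
mean = arithmetic_mean(observations)

print(total, n, mean)
```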
UNGROUPED DATA-WEIGHTED AVERAGE
In case of ungrouped data where weights are involved, our approach for calculating arithmetic mean will be different from the one used earlier.
Example 2.1: Suppose a student has secured the following marks in three tests:
Mid-term test 30
Laboratory 25
Final 20
The simple arithmetic mean will be $\frac{30 + 25 + 20}{3} = 25$.
However, this will be wrong if the three tests carry different weights on the basis of their relative importance. Assume that the weights assigned to the three tests are:
Mid-term test 2 points
Laboratory 3 points
Final 5 points
Solution: On the basis of this information, we can now calculate a weighted mean as shown below:
Table 2.1: Calculation of a Weighted Mean
Type of Test    Relative Weight (w)    Marks (x)    wx
Mid-term        2                      30           60
Laboratory      3                      25           75
Final           5                      20           100
Total           ∑w = 10                             ∑wx = 235
$$\bar{x} = \frac{\sum wx}{\sum w} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3}{w_1 + w_2 + w_3} = \frac{60 + 75 + 100}{2 + 3 + 5} = \frac{235}{10} = 23.5 \text{ marks}$$
It will be seen that the weighted mean gives a more realistic picture than the simple or unweighted mean.
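The weighted mean of Example 2.1 can be sketched in Python as follows. The helper name `weighted_mean` is an illustrative choice, not something defined in the text; it simply evaluates ∑wx divided by ∑w.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Marks and weights from Example 2.1 (mid-term, laboratory, final).
marks = [30, 25, 20]
weights = [2, 3, 5]

simple = sum(marks) / len(marks)           # unweighted mean
weighted = weighted_mean(marks, weights)   # weighted mean

print(simple, weighted)
```

The final test carries the largest weight and the lowest mark, which is why the weighted figure falls below the simple one.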
Example 2.2: An investor is fond of investing in equity shares. During a period of falling prices on the stock exchange, a stock is sold at Rs 120 per share on one day, Rs 105 on the next and Rs 90 on the third day. The investor purchased 50 shares on the first day, 80 shares on the second day and 100 shares on the third day. What average price per share did the investor pay?
Solution:
Table 2.2: Calculation of Weighted Average Price
Day      Price per Share (Rs) (x)    No. of Shares Purchased (w)    Amount Paid (Rs) (wx)
1        120                         50                             6000
2        105                         80                             8400
3        90                          100                            9000
Total    -                           230                            23,400
$$\text{Weighted average} = \frac{\sum wx}{\sum w} = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3}{w_1 + w_2 + w_3} = \frac{6000 + 8400 + 9000}{50 + 80 + 100} = \frac{23{,}400}{230} \approx \text{Rs } 101.7$$
Therefore, the investor paid an average price of Rs 101.7 per share.
It will be seen that if merely prices of the shares for the three days (regardless of the number of shares purchased) were taken into consideration, then the average price would be
$$\frac{\text{Rs } (120 + 105 + 90)}{3} = \text{Rs } 105$$
This is an unweighted or simple average and, as it ignores the number of shares purchased, it fails to give a correct picture. A simple average, it may be noted, is also a weighted average in which the weight in each case is the same, that is, 1. When we use the term average alone, we always mean an unweighted or simple average.
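The remark that a simple average is a weighted average with every weight equal to 1 can be checked numerically with the figures of Example 2.2. The sketch below is illustrative only, and the variable names are assumptions made for this example.

```python
# Prices (Rs) and number of shares bought on each of the three days.
prices = [120, 105, 90]
shares = [50, 80, 100]

# Weighted average: total amount paid divided by total shares purchased.
amount_paid = sum(w * x for w, x in zip(shares, prices))
average_paid = amount_paid / sum(shares)

# The same formula with every weight set to 1 collapses to the simple average.
unit_weights = [1] * len(prices)
equal_weighted = sum(w * x for w, x in zip(unit_weights, prices)) / sum(unit_weights)
simple_average = sum(prices) / len(prices)

print(amount_paid, round(average_paid, 1), equal_weighted, simple_average)
```

Because most shares were bought on the cheapest day, the price actually paid per share is lower than the simple average of the three prices.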
2.2.2 GROUPED DATA-ARITHMETIC MEAN
For grouped data, arithmetic mean may be calculated by applying any of the following methods:
(i) Direct method, (ii) Short-cut method, (iii) Step-deviation method
In the case of the direct method, the formula $\bar{x} = \sum fm / n$ is used. Here m is the mid-point of each class, f is the frequency of each class and n is the total of the frequencies. The calculation of the arithmetic mean by the direct method is shown below.
Example 2.3: The following table gives the marks of 58 students in Statistics. Calculate the average marks of this group.
Marks No. of Students
0-10 4
10-20 8
20-30 11
30-40 15
40-50 12
50-60 6
60-70 2
Total 58
Solution:
Table 2.3: Calculation of Arithmetic Mean by Direct Method
Marks    Mid-point (m)    No. of Students (f)    fm
0-10     5                4                      20
10-20    15               8                      120
20-30    25               11                     275
30-40    35               15                     525
40-50    45               12                     540
50-60    55               6                      330
60-70    65               2                      130
Total                     n = 58                 ∑fm = 1940
$$\bar{x} = \frac{\sum fm}{n} = \frac{1940}{58} = 33.45 \text{ marks, or 33 marks approximately.}$$
It may be noted that the mid-point of each class is taken as a good approximation of the true mean of the class. This is based on the assumption that the values are distributed fairly evenly throughout the interval. When the class frequencies are large, this assumption is usually acceptable.
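The direct method of Example 2.3 can be sketched in Python as below. The representation of each class as a `(lower, upper, frequency)` tuple and the function name `grouped_mean_direct` are assumptions made for illustration.

```python
# Each class is (lower limit, upper limit, frequency), as in Example 2.3.
classes = [
    (0, 10, 4),
    (10, 20, 8),
    (20, 30, 11),
    (30, 40, 15),
    (40, 50, 12),
    (50, 60, 6),
    (60, 70, 2),
]


def grouped_mean_direct(classes):
    """Direct method: mean = sum(f * m) / n, with m the class mid-point."""
    n = sum(f for _, _, f in classes)
    sum_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return sum_fm / n


n = sum(f for _, _, f in classes)
sum_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
mean = grouped_mean_direct(classes)

print(n, sum_fm, round(mean, 2))
```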
In the case of short-cut method, the concept of arbitrary mean is followed. The formula for calculation of the arithmetic mean by the short-cut method is given below:
$$\bar{x} = A + \frac{\sum fd}{n}$$
where
A = arbitrary or assumed mean
f = frequency
d = deviation from the arbitrary or assumed mean
When the values are extremely large and/or in fractions, the use of the direct method would be very cumbersome. In such cases, the short-cut method is preferable. This is because the calculation work in the short-cut method is considerably reduced particularly for calculation of the product of values and their respective frequencies. However, when calculations are not made manually but by a machine calculator, it may not be necessary to resort to the short-cut method, as the use of the direct method may not pose any problem.
As can be seen from the formula used in the short-cut method, an arbitrary or assumed mean is used. The second term in the formula (∑fd ÷ n) is the correction factor for the difference between the actual mean and the assumed mean. If the assumed mean turns out to be equal to the actual mean, (∑fd ÷ n) will be zero. The use of the short-cut method is based on the principle that the total of deviations taken from the actual mean is equal to zero. As such, the total of the deviations taken from any other figure will depend on how the assumed mean is related to the actual mean. While one may choose any value as the assumed mean, it is advisable, in order to simplify calculations, to avoid extreme values, that is, values that are too small or too high. A value apparently close to the arithmetic mean should be chosen.
For the figures given earlier pertaining to marks obtained by 58 students, we calculate the average marks by using the short-cut method.
Example 2.4:
Table 2.4: Calculation of Arithmetic Mean by Short-cut Method
Marks    Mid-point (m)    f     d      fd
0-10     5                4     -30    -120
10-20    15               8     -20    -160
20-30    25               11    -10    -110
30-40    35               15    0      0
40-50    45               12    10     120
50-60    55               6     20     120
60-70    65               2     30     60
Total                     58           ∑fd = -90
It may be noted that we have taken the arbitrary mean as 35 and measured the deviations of the mid-points from it. In other words, the arbitrary mean has been subtracted from each mid-point and the resultant figure is shown in column d.
$$\bar{x} = A + \frac{\sum fd}{n} = 35 + \frac{-90}{58} = 35 - 1.55 = 33.45 \text{, or 33 marks approximately.}$$
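The short-cut calculation of Table 2.4 can be sketched as follows. The list and variable names are illustrative assumptions; the assumed mean of 35 is the one used in the text.

```python
# Class mid-points and frequencies from Table 2.4.
midpoints = [5, 15, 25, 35, 45, 55, 65]
frequencies = [4, 8, 11, 15, 12, 6, 2]

assumed_mean = 35  # A, the arbitrary mean chosen in the text

# d = m - A for every class, then f * d.
deviations = [m - assumed_mean for m in midpoints]
fd = [f * d for f, d in zip(frequencies, deviations)]

n = sum(frequencies)
sum_fd = sum(fd)

# Short-cut formula: mean = A + sum(fd) / n.
mean = assumed_mean + sum_fd / n

print(deviations, sum_fd, round(mean, 2))
```

The correction term ∑fd ÷ n is negative here because the assumed mean of 35 lies slightly above the actual mean.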
Now we take up the calculation of arithmetic mean for the same set of data using the step-deviation method. This is shown in Table 2.5.
Table 2.5: Calculation of Arithmetic Mean by Step-deviation Method
Marks    Mid-point (m)    f     d      d' = d/10    fd'
0-10     5                4     -30    -3           -12
10-20    15               8     -20    -2           -16
20-30    25               11    -10    -1           -11
30-40    35               15    0      0            0
40-50    45               12    10     1            12
50-60    55               6     20     2            12
60-70    65               2     30     3            6
Total                     58                        ∑fd' = -9
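$$\bar{x} = A + \frac{\sum fd'}{n} \times C = 35 + \frac{-9}{58} \times 10 = 35 - 1.55 = 33.45 \text{, or 33 marks approximately.}$$
Here C is the common factor by which the deviations were divided, which in this example is the class width of 10.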
It will be seen that the answer in each of the three cases is the same. The step-deviation method is the most convenient on account of simplified calculations. It may also be noted that if we select a different arbitrary mean and recalculate deviations from that figure, we would get the same answer.
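The claim that any choice of arbitrary mean leads to the same answer can be checked with a short sketch of the step-deviation method. The function name `step_deviation_mean` and its parameter names are assumptions made for illustration; `width` plays the role of C.

```python
midpoints = [5, 15, 25, 35, 45, 55, 65]
frequencies = [4, 8, 11, 15, 12, 6, 2]


def step_deviation_mean(midpoints, frequencies, assumed_mean, width):
    """Step-deviation method: mean = A + (sum(f * d') / n) * C, with d' = (m - A) / C."""
    n = sum(frequencies)
    sum_fd_step = sum(
        f * (m - assumed_mean) / width for m, f in zip(midpoints, frequencies)
    )
    return assumed_mean + (sum_fd_step / n) * width


# A = 35 reproduces Table 2.5; the other values test different arbitrary means.
results = {
    a: step_deviation_mean(midpoints, frequencies, a, 10) for a in (25, 35, 45)
}

for a, mean in results.items():
    print(a, round(mean, 2))
```

Changing the assumed mean changes ∑fd' but the correction term adjusts by exactly the same amount, so the final mean is unaffected.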
Now that we have learnt how the arithmetic mean can be calculated by using different methods, we are in a position to handle any problem where calculation of the arithmetic mean is involved.
Example 2.6: The mean of the following frequency distribution was found to be 1.46.
No. of Accidents No. of Days (frequency)
0 46
1 ?
2 ?
3 25
4 10
5 5
Total 200 days
Calculate the missing frequencies.
Solution:
Here we are given the total number of frequencies and the arithmetic mean. We have to determine the two frequencies that are missing. Let us assume that the frequency against 1 accident is x and against 2 accidents is y. If we can establish two simultaneous equations, then we can easily find the values of x and y.
$$1.46 = \frac{(0 \times 46) + (1 \times x) + (2 \times y) + (3 \times 25) + (4 \times 10) + (5 \times 5)}{200}$$
$$1.46 = \frac{x + 2y + 140}{200}$$
$$x + 2y + 140 = (200)(1.46) = 292$$
$$x + 2y = 152 \qquad \text{(i)}$$
Since the frequencies must total 200,
$$x + y = 200 - (46 + 25 + 10 + 5) = 200 - 86$$
$$x + y = 114 \qquad \text{(ii)}$$
Now subtracting equation (ii) from equation (i), we get
$$(x + 2y) - (x + y) = 152 - 114$$
$$y = 38$$
Substituting the value of y = 38 in equation (ii) above, x + 38 = 114. Therefore, x = 114 - 38 = 76.
Hence, the missing frequencies are:
Against 1 accident: 76
Against 2 accidents: 38
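The same pair of simultaneous equations can be set up and solved in Python. This is a minimal sketch under the assumptions of Example 2.6; the variable names are illustrative, and `Fraction` is used only so that 1.46 × 200 is computed exactly rather than in floating point.

```python
from fractions import Fraction

# Known frequencies: number of accidents -> number of days.
known = {0: 46, 3: 25, 4: 10, 5: 5}
total_days = 200
mean = Fraction("1.46")

# Equation (i):  x + 2y = mean * total - sum of (value * frequency) over known classes.
rhs_i = mean * total_days - sum(value * freq for value, freq in known.items())
# Equation (ii): x + y = total - sum of known frequencies.
rhs_ii = total_days - sum(known.values())

# Subtracting (ii) from (i) leaves y; substituting back into (ii) gives x.
y = rhs_i - rhs_ii
x = rhs_ii - y

# Check: the completed distribution should reproduce the stated mean.
check_mean = (1 * x + 2 * y + sum(v * f for v, f in known.items())) / total_days

print(x, y, float(check_mean))
```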
2.2.3 CHARACTERISTICS OF THE ARITHMETIC MEAN
Some of the important characteristics of the arithmetic mean are:
(i) The sum of the deviations of the individual items from the arithmetic mean is always zero. This means $\sum (x - \bar{x}) = 0$, where $x$ is the value of an item and $\bar{x}$ is the arithmetic mean. Since the sum of the deviations in the positive direction is equal to the sum of the deviations in the negative direction, the arithmetic mean is regarded as a measure of central tendency.
(ii) The sum of the squared deviations of the individual items from the arithmetic mean is always minimum. In other words, the sum of the squared deviations taken from any value other than the arithmetic mean will be higher.
(iii) As the arithmetic mean is based on all the items in a series, a change in the value of any item will lead to a change in the value of the arithmetic mean.
(iv) In the case of a highly skewed distribution, the arithmetic mean may get distorted on account of a few items with extreme values. In such a case, it may cease to be the representative characteristic of the distribution.
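These characteristics can be checked numerically on the seven observations used earlier in the chapter. The sketch below is illustrative only: the helper `sum_squared_deviations` and the extreme value of 830 are assumptions introduced for this example.

```python
data = [10, 15, 30, 7, 42, 79, 83]
mean = sum(data) / len(data)

# Deviations from the mean sum to zero.
sum_of_deviations = sum(x - mean for x in data)


def sum_squared_deviations(values, centre):
    """Sum of squared deviations of the values from an arbitrary centre."""
    return sum((x - centre) ** 2 for x in values)


# The sum of squared deviations is smaller about the mean than about nearby values.
about_mean = sum_squared_deviations(data, mean)
about_37 = sum_squared_deviations(data, 37)
about_40 = sum_squared_deviations(data, 40)

# Changing a single item changes the mean; an extreme value distorts it badly.
# Here 83 is replaced by a hypothetical extreme value of 830.
distorted = data[:-1] + [830]
distorted_mean = sum(distorted) / len(distorted)

print(sum_of_deviations, about_mean, about_37, about_40, round(distorted_mean, 1))
```

With the extreme value in place, the mean exceeds six of the seven observations, so it no longer represents the bulk of the data.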