# Chapter 14

**DESCRIPTIVE STATISTICS**

**Types of Descriptive Statistics**

The major types of descriptive statistics are measures of central tendency, measures of variability, measures of relative position, and measures of relationship. Measures of central tendency are used to determine the typical or average score of a group of scores. Measures of variability indicate how spread out a group of scores is. Measures of relative position describe a participant’s performance compared to the performance of all other participants. Measures of relationship indicate the degree to which two sets of scores are related. Before actually calculating any of these measures, it is often useful to present the data in graphic form.

**Graphing Data**

The most common method of graphing data is to construct a frequency polygon. The first step in constructing a frequency polygon is to list all scores and to tabulate how many participants received each score. If 85 tenth-grade students were administered an achievement test, the results might be as shown in Table 14.1.

Once the scores are tallied, the steps are as follows:

- Place all the scores on a horizontal axis, at equal intervals, from lowest score to highest.
- Place the frequencies of scores at equal intervals on the vertical axis, starting with zero.
- For each score, find the point where the score intersects with its frequency of occurrence and make a dot.
- Connect all the dots with straight lines.

Table 14.1

Frequency Distribution Based on 85 Hypothetical Achievement Test Scores

| Score | Frequency of Score |
|-------|--------------------|
| 78 | 1 |
| 79 | 4 |
| 80 | 5 |
| 81 | 7 |
| 82 | 7 |
| 83 | 9 |
| 84 | 9 |
| 85 | 12 |
| 86 | 10 |
| 87 | 7 |
| 88 | 6 |
| 89 | 3 |
| 90 | 4 |
| 91 | 1 |

Total: 85 students
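The tallying and plotting steps for a frequency polygon can be sketched in a few lines of Python. This is only an illustration: the frequencies are copied from Table 14.1, the variable names are our own, and the "graph" is a plain-text rendering rather than a drawn polygon.

```python
from collections import Counter

# Frequencies copied from Table 14.1 (score -> number of students).
table_14_1 = {78: 1, 79: 4, 80: 5, 81: 7, 82: 7, 83: 9, 84: 9,
              85: 12, 86: 10, 87: 7, 88: 6, 89: 3, 90: 4, 91: 1}

# Expand the table into raw scores, as if we had the 85 test papers.
raw_scores = [score for score, freq in table_14_1.items() for _ in range(freq)]

# Tally how many participants received each score.
tally = Counter(raw_scores)

# One (score, frequency) dot per score, ordered from lowest to highest;
# joining consecutive dots with straight lines gives the polygon.
polygon_points = sorted(tally.items())

# Text-only rendering: each asterisk stands for one student.
for score, freq in polygon_points:
    print(f"{score:>3} | {'*' * freq} ({freq})")
```

Passing `polygon_points` to any plotting tool as x and y values and connecting them with lines would produce the frequency polygon itself.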

**Measures of Central Tendency**

Measures of central tendency provide a convenient way of describing a set of data with a single number. The number resulting from computation of a measure of central tendency represents the average or typical score attained by a group of participants. The three most frequently encountered indices of central tendency are the mode, the median, and the mean. Each of these indices is used with a different scale of measurement: the mode is appropriate for describing nominal data, the median for describing ordinal data, and the mean for describing interval or ratio data. Most measurement in educational research uses an interval scale, so the mean is the most frequently used measure of central tendency.

**The Mode**

The mode is the score that is attained by more participants than any other score. The data presented in Table 14.1, for example, show that the group mode is 85, since more participants (12) achieved that score than any other. The mode is not established through calculation; it is determined by looking at a set of scores or at a graph of scores and seeing which score occurs most frequently.
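As a small illustration, "seeing which score occurs most frequently" amounts to picking the score with the largest frequency. The sketch below uses the frequencies from Table 14.1; the variable names are our own, and it assumes a single most frequent score (with ties, `max` would report only one of them).

```python
# Frequencies copied from Table 14.1 (score -> number of students).
table_14_1 = {78: 1, 79: 4, 80: 5, 81: 7, 82: 7, 83: 9, 84: 9,
              85: 12, 86: 10, 87: 7, 88: 6, 89: 3, 90: 4, 91: 1}

# The mode is the score whose frequency is largest.
mode_score = max(table_14_1, key=table_14_1.get)
mode_frequency = table_14_1[mode_score]

print(f"Mode = {mode_score} (attained by {mode_frequency} participants)")
```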

**The Median**

The median is that point, after scores are organized from low to high or high to low, above and below which are 50% of the scores. In other words, the median is the midpoint (like the median strip on a highway). If there is an odd number of scores, the median is the middle score (assuming the scores are arranged in order). For example, for the scores 75, 80, 82, 83, 87, the median is 82, because it is the middle score. If there is an even number of scores, the median is the point halfway between the two middle scores. For example, for the scores 21, 23, 24, 25, 26, 30, the median is 24.5; for the scores 50, 52, 55, 57, 59, 61, the median is 56. Thus, the median is not necessarily the same as one of the scores. There is no calculation of the median except finding the midpoint when there is an even number of scores.
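The odd and even cases can be written out as a short function. This is a minimal sketch that follows the rule as stated in the text; the function name `median` is our own choice.

```python
def median(scores):
    """Return the midpoint of a list of scores."""
    ordered = sorted(scores)      # the scores must be arranged in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: the median is the middle score.
        return ordered[mid]
    # Even number of scores: halfway between the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([75, 80, 82, 83, 87]))      # odd number of scores
print(median([21, 23, 24, 25, 26, 30]))  # even number of scores
print(median([50, 52, 55, 57, 59, 61]))  # even number of scores
```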

**The Mean**

The mean is the arithmetic average of the scores and is the most frequently used measure of central tendency. It is calculated by adding up all of the scores and dividing that total by the number of scores. In general, the mean is the preferred measure of central tendency. It is appropriate when the data represent either interval or ratio scores and is more precise than the median and the mode, because if equal-sized samples are randomly selected from the same population, the means of those samples will be more similar to each other than either the medians or the modes.

When there are one or more extreme scores, the mean is pulled toward them and the median becomes the better choice: the median will not be the most accurate representation of the performance of the total group, but it will be the best index of typical performance. As an example, suppose you had the following IQ scores: 96, 96, 97, 99, 100, 101, 102, 104, 195. For these scores, the three measures of central tendency are:

Mode = 96 (most frequent score)

Median = 100 (middle score)

Mean = 110 (arithmetic average)
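These three values can be checked with Python's standard `statistics` module. The last line is our own addition, included only to show how much the single extreme score of 195 pulls the mean upward.

```python
import statistics

iq_scores = [96, 96, 97, 99, 100, 101, 102, 104, 195]

mode = statistics.mode(iq_scores)      # most frequent score
median = statistics.median(iq_scores)  # middle score
mean = statistics.mean(iq_scores)      # arithmetic average

# Mean of the same scores with the extreme score (195) left out.
mean_without_extreme = statistics.mean(iq_scores[:-1])

print(mode, median, mean, mean_without_extreme)
```

With the extreme score removed, the mean drops from 110 to 99.375, which is close to the median of 100.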

**Measures of Variability**

Although measures of central tendency are useful statistics for describing a set of data, they are not sufficient. Two sets of data that are very different can have identical means or medians. As an example, consider the following sets of data:

Set A: 79 79 79 80 81 81 81

Set B: 50 60 70 80 90 100 110

The mean of both sets of scores is 80 and the median of both is 80, but set A is very different from set B. In set A the scores are all very close together and are clustered around the mean. In set B the scores are much more spread out; in other words, there is much more variation or variability in set B. Thus, there is a need for a measure that indicates how spread out the scores are, that is, how much variability there is. A number of descriptive statistics serve this purpose, and they are referred to as measures of variability. The three most frequently encountered are the *range*, the *quartile deviation*, and the *standard deviation*.

**The Range**

The range is simply the difference between the highest and the lowest score once the scores are arranged in order and is determined by subtraction. As an example, the range for the scores 79, 79, 79, 80, 81, 81, 81 is 2, while the range for the scores 50, 60, 70, 80, 90, 100, 110 is 60. Thus, if the range is small the scores are close together, whereas if the range is large the scores are more spread out. Like the mode, the range is not a very stable measure of variability, and its chief advantage is that it gives a quick, rough estimate of variability.
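The two sets of scores in the range example make a compact illustration of why a measure of variability is needed: they share a mean and a median but have very different ranges. The sketch below uses only the standard library; `score_range` is a helper name of our own.

```python
import statistics

set_a = [79, 79, 79, 80, 81, 81, 81]
set_b = [50, 60, 70, 80, 90, 100, 110]


def score_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


for name, scores in (("A", set_a), ("B", set_b)):
    print(
        f"Set {name}: mean = {statistics.mean(scores)}, "
        f"median = {statistics.median(scores)}, "
        f"range = {score_range(scores)}"
    )
```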

**The Quartile Deviation**

In “research talk” the quartile deviation is half of the difference between the upper quartile and the lower quartile in a distribution. In English, the upper quartile is the 75th percentile, the point below which are 75% of the scores. Correspondingly, the lower quartile is the 25th percentile, that point below which are 25% of the scores. By subtracting the lower quartile from the upper quartile and then dividing the result by two, we get a measure of variability. If the quartile deviation is small the scores are close together, whereas if the quartile deviation is large the scores are more spread out. The quartile deviation is a more stable measure of variability than the range and is appropriate whenever the median is appropriate. Calculation of the quartile deviation involves a process very similar to that used to calculate the median, which just happens to be the second quartile or the 50th percentile.
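A rough sketch of the calculation is shown below. It relies on `statistics.quantiles` from the Python standard library (Python 3.8 or later) with `method="inclusive"`. Be aware that textbooks and software use several slightly different conventions for locating percentiles, so the choice of method here is an assumption, and other conventions can give slightly different quartiles for small data sets. The example data are the two sets of scores from the range discussion.

```python
import statistics


def quartile_deviation(scores):
    """Half the distance between the upper and lower quartiles."""
    # With n=4, quantiles() returns three cut points:
    # the lower quartile, the median, and the upper quartile.
    lower_q, _median, upper_q = statistics.quantiles(
        scores, n=4, method="inclusive"
    )
    return (upper_q - lower_q) / 2


set_a = [79, 79, 79, 80, 81, 81, 81]
set_b = [50, 60, 70, 80, 90, 100, 110]

qd_a = quartile_deviation(set_a)
qd_b = quartile_deviation(set_b)

print(f"Quartile deviation, set A: {qd_a}")
print(f"Quartile deviation, set B: {qd_b}")
```

Under this convention the tightly clustered set yields a much smaller quartile deviation than the spread-out set, matching the interpretation given in the text.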

**Variance**

Variance indicates the amount of spread among test scores. If the variance is small, the scores are close together; if the variance is large, the scores are more spread out. The square root of the variance is called the *standard deviation*. As with variance, a small standard deviation indicates that scores are close together and a large standard deviation indicates that the scores are more spread out.

Calculation of the variance is quite simple. For example, five students took a test and received scores of 35, 25, 30, 40, and 30. The mean of these scores is what? Right, 32. The difference of each student’s score from the mean is

35 – 32 = 3

25 – 32 = -7

30 – 32 = -2

40 – 32 = 8

30 – 32 = -2

(Notice that the sum of the differences is 0. That’s why we have to square the differences in the next step.)

Squaring each difference and adding the results gives 9 + 49 + 4 + 64 + 4 = 130. Dividing this sum of squared differences by the number of scores gives us 130/5 = 26. This is called the variance of the scores. Variance is seldom used itself, but is used to obtain the standard deviation. The square root of 26 is 5.1, and this is the standard deviation of the five scores.
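The same arithmetic can be traced step by step in Python. This sketch follows the text in dividing by the number of scores (N); some textbooks and software divide by N - 1 instead when estimating a population value from a sample, which would give a different result. Variable names are our own.

```python
import math

scores = [35, 25, 30, 40, 30]

# Step 1: the mean of the scores.
mean = sum(scores) / len(scores)

# Step 2: the difference of each score from the mean (these sum to 0).
deviations = [score - mean for score in scores]

# Step 3: square each difference and add the squares together.
sum_of_squares = sum(d ** 2 for d in deviations)

# Step 4: divide by the number of scores to get the variance.
variance = sum_of_squares / len(scores)

# Step 5: the standard deviation is the square root of the variance.
standard_deviation = math.sqrt(variance)

print(f"mean = {mean}, variance = {variance}, SD = {standard_deviation:.1f}")
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` perform the same divide-by-N calculation and can serve as a cross-check.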

**The Standard Deviation**

The standard deviation is used when data are interval or ratio, and is by far the most frequently used index of variability. Like the mean, its central tendency counterpart, the standard deviation is the most stable measure of variability and includes every score in its calculation. In fact, the first step in calculating the standard deviation is to find out how far away each score is from the mean by subtracting the mean from each score.

As an example, suppose that the mean (X) of a set of scores is calculated to be 80 and the standard deviation (SD) to be 1. If the scores are approximately normally distributed, more than 99% of them fall within three standard deviations of the mean. In this case the mean plus 3 standard deviations, X + 3 SD, is equal to 80 + 3(1) = 80 + 3 = 83. The mean minus 3 standard deviations, X – 3 SD, is equal to 80 – 3(1) = 80 – 3 = 77. Thus, almost all the scores fall between 77 and 83. This makes sense because, as we mentioned before, a small standard deviation (in this case SD = 1) indicates that the scores are close together, not very spread out.
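The "almost all the scores" claim rests on an assumption that the scores are roughly normally distributed. Under that assumption, the share of scores expected within three standard deviations of the mean can be checked with `statistics.NormalDist` (Python 3.8 or later). The sketch below is illustrative only; real test scores will follow the normal curve only approximately.

```python
from statistics import NormalDist

mean = 80
sd = 1

# Limits three standard deviations below and above the mean.
lower = mean - 3 * sd
upper = mean + 3 * sd

# Assumption: scores follow a normal distribution with this mean and SD.
scores_model = NormalDist(mu=mean, sigma=sd)
share_within = scores_model.cdf(upper) - scores_model.cdf(lower)

print(f"Limits: {lower} to {upper}")
print(f"Expected share of scores within the limits: {share_within:.4f}")
```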

Thu, 12 May 2011 @12:36