## 5. DISTRIBUTION OF MEASUREMENTS
Up to this point, the discussion has treated the "scatter" of measurements in an intuitive way, without inquiring into the nature of the scatter. This chapter will explore some of the methods for accurately describing the nature of measurement distributions. In practice, one must deal with a finite set of values, so the nature of their distribution is never known precisely. As always, one proceeds on the basis of reasonable assumptions. Consider a large number of repeated measured values of a physical quantity. Suppose the
number of values is so large that a graph of the number of occurrences of each value approximates a smooth curve. That curve describes the distribution of the measurements.
One can often guess the shape of the curve, even with a finite set of values, especially
such features as symmetry and spread. Just as we represent a set of values by a single representative value (a measure of central tendency), we can represent the scatter of the values by a measure of dispersion.
Some of the "measures of central tendency" commonly used are listed here for reference:

MEAN (arithmetic mean, or average): the sum of the values divided by the number of values.
MEDIAN: the middle value when the values are arranged in order of size.
MODE: the value which occurs most frequently.
The difference between a measurement and the mean of its distribution is called the DEVIATION (or VARIATION) of that measurement. Measures of dispersion are defined in terms of the deviations. Some commonly used measures of dispersion are listed for reference:

AVERAGE DEVIATION: the mean of the absolute values of the deviations.
ROOT MEAN SQUARE (RMS) DEVIATION: the square root of the mean of the squared deviations.
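These measures can be computed directly from their definitions. A minimal sketch in Python (the measurement values here are hypothetical, chosen only for illustration):

```python
import math

# Hypothetical repeated measurements of a quantity.
data = [9.8, 10.1, 9.9, 10.2, 10.0]
n = len(data)
mean = sum(data) / n

# Deviation of each measurement from the mean.
deviations = [x - mean for x in data]

# Average deviation: the mean of the absolute deviations.
avg_dev = sum(abs(d) for d in deviations) / n

# Root mean square (rms) deviation: square root of the mean squared deviation.
rms_dev = math.sqrt(sum(d * d for d in deviations) / n)

print(mean, avg_dev, rms_dev)
```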
The distributions encountered in physics often have a mathematical shape given by

    f(x) = [1/(σ√(2π))] exp[-(x - <x>)²/(2σ²)]

where x is a measurement, <x> is the mean, f(x) is the ordinate of the distribution curve for that value of x, and σ is the standard deviation.
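This curve can be evaluated numerically. The sketch below (plain Python; the mean and σ values are assumed for illustration) shows the peak at x = <x> and the symmetry of the curve about the mean:

```python
import math

def gaussian(x, mean, sigma):
    """Ordinate of the Gaussian distribution curve at x."""
    return math.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

peak = gaussian(0.0, 0.0, 1.0)   # maximum height, at x equal to the mean
# The curve falls to half its peak height at x - <x> = sigma*sqrt(2 ln 2):
half = gaussian(math.sqrt(2 * math.log(2)), 0.0, 1.0)

print(peak, half / peak)
```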
Distributions which conform to this equation are called GAUSSIAN (or "normal") distributions. The Gaussian distribution is so common that much of the terminology of statistics and error analysis has been built upon it. Furthermore, when one must deal with an unknown distribution, it is usually assumed to be Gaussian until contrary evidence is found. The total width, or spread, of the Gaussian curve is infinite, as the equation shows. But
Gaussians do differ in how much f(x) decreases for a given value of (x - <x>). Physicists
sometimes define the "width" of such peaked curves by the "width at half height." This is
measured by finding the two points x₁ and x₂ at which the curve falls to half its maximum height; the width is the distance between them. Statisticians have devised better measures of the "width" of Gaussian curves by specifying a range of values of x which includes a specified fraction of the measurements. Some are listed here:
The dispersion measures listed in the last section described the dispersion of the data sample. Had we taken more data, we would expect slightly different answers; both the mean and the dispersion depend on the size of the sample. Ideally we want measures which estimate the dispersion of the parent distribution, the distribution an unlimited number of measurements would give. Statisticians show that this requires multiplying the root mean square deviation by the factor √[n/(n-1)].
When this factor is applied to the root mean square deviation, the result is simply to replace n by (n-1). This new expression is called the STANDARD DEVIATION:

    σ = √[ Σ(x_i - <x>)² / (n-1) ]    (5.6)
Note that Eqs. 5.3 and 5.6 become more nearly identical as n gets large; the distinction between the two is mainly important for small samples. Mathematical statistics texts may be consulted for an explanation of equation 5.5; also see the books by Topping, Parratt, Beers, Barford, and Pugh-Winslow. The replacement of n by (n-1) is called Bessel's correction. A plausibility argument reveals the need for the correction, so we state it briefly here. First, the case of n=1 can be eliminated from consideration: a single measurement tells us nothing about the scatter of the parent distribution, and with (n-1) in the denominator, Eq. 5.6 properly gives an indeterminate result when n=1.
When samples are small, the spread of values will likely be less than that of a larger sample. The (n-1) "corrects" for this small-sample effect, giving a more realistic estimate of the spread of the parent distribution. Quite a number of books presenting error analysis for the undergraduate laboratory ignore Bessel's correction entirely. There is some practical justification for this. The difference between n and (n-1) is only 2% when n = 50. As n gets larger, the difference becomes less. So, when "enough" measurements are made, the difference matters little. When very few measurements are made, the error estimates themselves will be of low
precision. It can be shown, using careful and correct mathematical techniques, that the percent uncertainty in the standard deviation itself is approximately

    100/√[2(n-1)]

So we'd have to average 50 independent values to obtain a 10% error in the determination of the error, and we would need 5000 measurements to get an error estimate good to 1%. If only 10 measurements were made, the uncertainty in the standard deviation is about 24%. This is why we have continually stressed that error estimates of 1 or 2 significant figures are sufficient when data samples are small.

This is one reason why the use of the standard deviation in the elementary laboratory is seldom justified. How often does one take more than a few measurements of each quantity? Does one even take enough measurements to determine the nature of the error distribution? Is it Gaussian, or something else? One usually doesn't know. If it isn't close to Gaussian, the whole apparatus of the usual statistical error rules for the standard deviation must be modified. But the rules for maximum error, limits of error, and average error are sufficiently conservative and robust that they can still be reliably used even for small samples. However, when three or more different quantities contribute to a result, a more realistic measure of error is obtained by using the "adding in quadrature" method described at the beginning of this section.
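Both points can be checked numerically. The sketch below (Python; the data values are hypothetical) compares the n and (n-1) forms of the dispersion, and tabulates the approximate percent uncertainty of the standard deviation itself for several sample sizes:

```python
import math
import statistics

data = [9.8, 10.1, 9.9, 10.2, 10.0]   # hypothetical measurements
n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)   # sum of squared deviations

rms_dev = math.sqrt(ss / n)        # divides by n
std_dev = math.sqrt(ss / (n - 1))  # Bessel's correction: divides by (n-1)

# The standard-library routines follow the same two conventions:
assert math.isclose(statistics.pstdev(data), rms_dev)
assert math.isclose(statistics.stdev(data), std_dev)

def sigma_uncertainty_percent(n):
    """Approximate percent uncertainty of a standard deviation based on n values."""
    return 100.0 / math.sqrt(2 * (n - 1))

for size in (10, 50, 5000):
    print(size, round(sigma_uncertainty_percent(size), 1))
```

Note that the (n-1) form always gives a slightly larger (more conservative) estimate than the n form, the difference shrinking as n grows.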
All of the measures of dispersion or "width" introduced above express how far individual measurements deviate from the "true" mean; they express our confidence in any one isolated measurement. But we are usually more interested in the accuracy of the mean itself. In chapter 3 we considered this problem, concluding that the error in an average is the error in each measurement divided by the square root of the number of measurements. This is one of three commonly used measures of confidence in the mean; we list them here for completeness:
STANDARD DEVIATION OF THE MEAN (written σ_m or σ_<Q>): the standard deviation divided by the square root of the number of measurements.

To illustrate the meaning of these, consider a set of, say, 100 measurements, distributed
like Fig. 5.2. These should be sufficient to make a rough sketch of the shape of the curve,
determine the mean, and calculate a standard deviation. Now suppose we took 10,000
measurements. Would the shape of the curve change much? Probably not. We would be able
to sketch the curve with more precision, but its width and the value of the mean would change
very little. Yet, with more measurements we are "more certain" of our calculated mean; the measures of error of the mean express this, for they decrease as the number of measurements grows.

Also, with more data, the calculation of the measures of dispersion improves. Imagine
the set of 10,000 measurements made up of 1000 sets of 10 measurements. From each set of
10 we calculate a mean. If we now look at these 1000 calculated means, they too form a
distribution. If the data are Gaussian, this distribution of means will also be Gaussian. But this
distribution of means will have a smaller width than the width of the data distribution itself.
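This narrowing can be demonstrated by simulation. A sketch (Python; the parent distribution is assumed Gaussian with mean 0 and standard deviation 1, and the sample counts match the example in the text):

```python
import math
import random
import statistics

random.seed(42)
SIGMA = 1.0          # standard deviation of the assumed parent distribution
SAMPLE_SIZE = 10
NUM_SAMPLES = 1000

# Draw 1000 samples of 10 measurements each; record the mean of each sample.
means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(0.0, SIGMA) for _ in range(SAMPLE_SIZE)]
    means.append(sum(sample) / SAMPLE_SIZE)

spread_of_means = statistics.pstdev(means)
expected = SIGMA / math.sqrt(SAMPLE_SIZE)   # about 0.316

print(spread_of_means, expected)
```

The spread of the 1000 sample means comes out close to σ/√n, in agreement with the rule for the error of an average.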
The standard deviation of this distribution of means is smaller than that of the data by the factor 1/√10, in accord with the rule that the error of an average of n values is 1/√n times the error of one value.

To carry this example further, if we calculate the standard deviation of the measurements
in each sample of 10, we will get 1000 different values of standard deviation. These too form
a distribution. It is shown in more advanced treatments that the standard deviation of this distribution of standard deviations is approximately

    σ/√[2(n-1)]

where σ is the standard deviation of the parent distribution and n is the size of each sample.

In scientific papers it is important to specify which measure of error is being used, and how many measurements were taken. Only then can readers properly interpret the quality of the results.
The root-mean-square deviation and standard deviation definitions (Eqs. 5.2 and 5.6) are
given in intuitively meaningful forms, but they are awkward for computation. An equivalent form, better suited to calculation, is easily derived. In the following derivation all summations are from i=1 to i=n. The standard deviation is defined by

    σ = √[ Σ(x_i - <x>)² / (n-1) ]

Expand the summand:

    Σ(x_i - <x>)² = Σx_i² - 2<x>Σx_i + n<x>²

Since Σx_i = n<x>, the middle term is -2n<x>², so:

    Σ(x_i - <x>)² = Σx_i² - n<x>²

and therefore

    σ = √[ (Σx_i² - n<x>²) / (n-1) ]
Many electronic calculators have a built-in routine which allows you to enter the x_i values one at a time. The calculator needs to accumulate only n, Σx_i, and Σx_i²; from these the mean and standard deviation are computed without storing the individual values. A similar procedure can be used for the rms deviation.
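The idea behind such a routine can be sketched in a few lines of Python (the data values are hypothetical):

```python
import math

class RunningStats:
    """Accumulate n, sum(x), and sum(x^2); no individual values are stored."""
    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def add(self, x):
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def mean(self):
        return self.sum_x / self.n

    def std_dev(self):
        # sigma = sqrt[(sum(x^2) - n*<x>^2) / (n-1)]
        return math.sqrt((self.sum_x2 - self.n * self.mean() ** 2) / (self.n - 1))

stats = RunningStats()
for x in [2.0, 3.0, 5.0, 7.0]:
    stats.add(x)

print(stats.mean(), stats.std_dev())
```

One caution: for data with a large mean and small spread, this expanded form can lose precision to rounding, a point worth remembering on a calculator as well.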
(8.1)* What percent of the measurements fall within the width at half height of a Gaussian curve?

(8.2) A set of measurements of a quantity is

    878  849  804  755  816
    833  781  735  964  795
    817  807  862  801  778
    810  778  799  819  797

Find the means, average deviations, and standard deviations for (1) each of the four groups, and (2) the whole group of twenty.

(8.3) Graph the distribution of problem 2. Note that a bar graph showing occurrences of each value would not be very informative, for few values occur more than once. It is better to graph the number of occurrences within a few equal-sized intervals of values.
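The grouping into intervals can be sketched as follows (Python; the data and bin width here are hypothetical, not the values of problem 8.2):

```python
# Count occurrences within equal-width intervals, then print a crude bar graph.
data = [812, 795, 803, 788, 820, 799, 807, 781, 816, 802]
WIDTH = 10

counts = {}
for x in data:
    lo = (x // WIDTH) * WIDTH        # lower edge of the interval containing x
    counts[lo] = counts.get(lo, 0) + 1

for lo in sorted(counts):
    print(f"{lo}-{lo + WIDTH - 1}: {'*' * counts[lo]}")
```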
This chapter has been included for three reasons: (1) to introduce the statistical measures of error needed in the following chapter, (2) to provide a reference list of commonly encountered measures of error, and related terminology, and (3) to explain the important distinction between measures of dispersion of the data, and errors of the mean. It is not expected that the student should memorize this material; it is included here as a reference source, to be used as needed. The definitions given here (and throughout this lab manual) are consistent with current
usage in physics, mathematical statistics and engineering. The student may (and
should) confirm this by consulting the error analysis books given in the bibliography,
other lab manuals in physics, and copies of current physics journals. The journals
are the best source of examples of accepted practice in methods of reporting errors.
The editors of the good journals insist that authors not be sloppy in these matters. But
do not take as your guide the popular, general interest publications, such as Popular
Science, news magazines, or the daily paper. Such publications are shamefully
negligent in these matters, with the result that scientific facts are often presented in
a most misleading manner. Chemistry, biology, earth sciences, astronomy, and even the social sciences will be found to adhere to careful standards in reporting errors in their journals. Unfortunately, instructors in elementary courses often take a more cavalier attitude, seemingly unaware of current practice and current terminology used in research papers. If the student has any doubts about correct style, he should check up-to-date books and journal articles in his discipline.

Standards and styles were different even as recently as a few decades ago. For example, in the 1950's one frequently found mention of the "probable error" as a measure of uncertainty. Today one seldom sees that term; the standard deviation is preferred instead. We list both in the table on the next page, to aid those who may read the older literature. The relations between probable error and standard deviation are summarized below; for a Gaussian distribution the probable error is 0.6745 times the standard deviation.
© 1999, 2004 by Donald E. Simanek. |