2. MEASURES OF ERROR
2.1 INTRODUCTION
The rules for significant digits given in the last chapter are too crude for a serious study of experimental error. A better analysis of error should include:
(1) A more precise way to measure and express the size of uncertainties in measurements.
(2) Rules to predict how the uncertainties in results depend on the uncertainties in the data.
Error analysis is an essential part of the experimental process. It allows us to make meaningful quantitative estimates of the reliability of results. A laboratory investigation done without concern for error analysis cannot properly be called a scientific experiment.
In this chapter we take the first step toward development of useful measures of the size of uncertainties.
In this text the word "error" is reserved for measurements or estimates of the average size of uncertainties. Older books sometimes used "error" with broader meaning or different meaning. Even today, almost all of the important quantities in error theory and mathematical statistics have several names. Seldom do two authors use completely identical sets of names.
However, recent books are approaching uniformity in the use of the word "error," if they use the word at all. Some even avoid the word "error" entirely, or treat "error" and "uncertainty" as synonyms.
2.2 INDETERMINATE ERRORS
In Sec. 1.4 we discussed indeterminate errors, those which are "random" in size and sign. Consider again the data set introduced in that section:
3.69 3.68 3.67 3.69 3.68 3.69 3.66 3.67
We'd like an estimate of the "true" value of this measurement, the value which is somewhat obscured by randomness in the measurement process. Common sense suggests that the "true" value probably lies somewhere between the extreme values 3.66 and 3.69, though it is possible that if we took more data we might find a value outside this range.
From this limited information we might say that the true value "lies between 3.66 and 3.69." This expresses the range of variation actually observed, and is a rather conservative statement.
Or we might quote the arithmetic mean (average) of the measurements as the best value. Then we could specify a maximum range of variation from that average:
3.68 ± 0.02
This is a standard way to express data and results. The first number is the experimenter's best estimate of the true value. The last number is a measure of the "maximum error."
ERROR: The number following the ± symbol is the experimenter's estimate of how far the quoted value might deviate from the "true" value.
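The two ways of quoting this result can be checked numerically. The following Python sketch is only an illustration. The variable names are ours, and rounding the mean to two decimals before taking deviations is an assumption chosen to match the rounded value 3.68 used in the text.

```python
# Data set from the text (eight repeated measurements).
data = [3.69, 3.68, 3.67, 3.69, 3.68, 3.69, 3.66, 3.67]

# Range of variation actually observed.
low, high = min(data), max(data)

# Arithmetic mean, rounded to the precision of the data (an assumption).
mean = round(sum(data) / len(data), 2)

# "Maximum error": the largest magnitude of deviation from the mean.
max_error = round(max(abs(x - mean) for x in data), 2)

print(f"true value lies between {low} and {high}")
print(f"{mean} ± {max_error}")
```

The second printed line reproduces the standard form, best value followed by the error estimate.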
There are many ways to express sizes of errors. This example used a conservative "maximum error" estimate. Other, more realistic, measures take advantage of the mathematical methods of statistics. These will be taken up more fully in chapter 5.
Quotes were placed around "true" in the previous discussion to warn the reader that we have not yet properly defined "true" values, and are using the word in a colloquial or intuitive sense. A really satisfying definition cannot be given at this level, but we can clarify the idea somewhat. One can think of the "true" value in two equivalent ways:
(1) The true value is the value one would measure if all sources of error were completely absent.
(2) The true value is the average of an infinite number of repeated measurements.
Since uncertainties can never be completely eliminated, and we haven't time to take an infinite set of measurements, we never obtain "true" values. With better equipment and more care we can narrow the range of scatter, and therefore improve our estimates of average values, confident that we are approaching "true" values more and more closely. More importantly, we can make useful estimates of how close the values are likely to be to the true value.
Note that we have introduced a special scientific meaning for the word "error." Colloquially the word means "mistake" or "blunder." But in science we use "error" to name estimates of the size of scatter of data.
Two other words, "deviation" and "discrepancy" are also used in science with very specific meanings:
DEVIATION: Suppose a set of measurements has been averaged. The magnitude of the difference between a particular measurement and the average is called the deviation of that measurement from the average. Deviations may be expressed as percents.
Example 1: The deviation of the first value in the set discussed above is 3.69 - 3.68 = 0.01. The percent deviation is 100 × 0.01/3.68 = 0.27%, or about 0.3%.
DISCREPANCY: The discrepancy between any two measurements is the magnitude of their difference.
Discrepancies may also be expressed as percents. The term "discrepancy" is usually used when several independent experimental determinations of the same quantity are compared. When a student compares a lab measurement or result with the value given in the textbook, the difference is called the "experimental discrepancy." Never make the mistake of calling this comparison an "error." Textbook or handbook values are not "true" values; they are merely someone else's experimental values. If several experimenters quote values of the same quantity, and have estimated their errors properly, the discrepancy between their values should be no larger than their estimated maximum errors.
Example 2: Suppose that in the preceding example, the "textbook value" of the quantity was 3.675. Our best average experimental value was 3.68. The discrepancy is therefore 0.005. Note that the maximum error is still 0.02. Do not confuse an error with a discrepancy.
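The definitions of deviation and discrepancy are simple enough to write as small functions. This is a hedged sketch, not part of the original text. The function names are hypothetical, and the choice of reference value for the percent calculation (here the mean) is an assumption.

```python
def deviation(value, mean):
    """Magnitude of the difference between one measurement and the mean."""
    return abs(value - mean)

def discrepancy(a, b):
    """Magnitude of the difference between two determinations."""
    return abs(a - b)

def as_percent(difference, reference):
    """Express a deviation or discrepancy as a percent of a reference value."""
    return 100 * difference / reference

mean = 3.68                          # rounded mean of the data set

d = deviation(3.69, mean)            # first value of the data set
percent_d = as_percent(d, mean)
print(round(d, 3), round(percent_d, 1))

disc = discrepancy(mean, 3.675)      # compare with the "textbook value"
print(round(disc, 3))
```

Both functions return magnitudes, so neither result carries a sign.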
2.3 DETERMINATE ERRORS
The indeterminate (random) errors we have been discussing are unbiased in sign. They are as likely to make a measurement "too high" as they are to make it "too low". Our assertion that the average value is an appropriate approximation to the "true" value was based on this assumption.
But there are also influences which affect the data so as to make values consistently too large or too small. These are called determinate errors. A few possible causes of determinate errors are listed here to illustrate their nature.
(1) The measuring instrument is miscalibrated.
(2) The observation technique has a consistent bias.
(3) There are unnoticed outside influences on the apparatus, or on the quantity being measured. Such influences may be determinate or indeterminate or both.
One common determinate error in elementary lab work is the miscalibrated scale (an example of cause 1). This can happen even with such a simple instrument as a meter stick. The millimeter divisions may vary in size. The end of the stick may have been sawed off inaccurately (or be worn from use) so that a fraction of the first millimeter is "lost." This error is easily eliminated by avoiding making measurements from the end of the stick; start at 10 cm instead, and subtract 10 cm from the reading.
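A constant calibration offset of this kind shifts every reading by the same amount, so it moves the mean without changing the scatter. The sketch below illustrates this with the data set of this chapter. The offset of 0.05 is a hypothetical value, not taken from the text, and the scatter is measured here by the plain mean of the deviation magnitudes.

```python
data = [3.69, 3.68, 3.67, 3.69, 3.68, 3.69, 3.66, 3.67]
OFFSET = 0.05  # hypothetical calibration offset, chosen only for illustration

def mean(values):
    return sum(values) / len(values)

def mean_of_magnitudes(values):
    """Mean magnitude of the deviations from the mean (divides by n)."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

# Every reading is too large by the same amount.
biased = [v + OFFSET for v in data]

shift = mean(biased) - mean(data)
scatter_change = mean_of_magnitudes(biased) - mean_of_magnitudes(data)

print(f"shift of the mean: {shift:.3f}")
print(f"change in scatter: {abs(scatter_change):.3f}")
```

Because the scatter is unchanged, repeating the measurement with the same instrument would not reveal an offset of this kind.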
All determinate errors may be eliminated, when they are recognized! But that's what makes them so troublesome. They may not be suspected until the final results are calculated and found to disagree with theory or with "book values." This is why it is important to do at least a rough calculation of all experimental results before leaving lab. Determinate errors and outright blunders may then be detected before it is too late to repeat the experiment.
More will be said of determinate errors in chapter 8.
2.4 IMPORTANCE OF REPEATED OBSERVATIONS
We strongly recommend that several independent measurements be made of each data quantity, preferably by different observers. This procedure has several advantages:
(1) Bias of a single observer is eliminated.
(2) Blunders may be discovered.
(3) The values will indicate whether the data is scale-limited. There is no other way to determine this, and unless it is determined you will have no idea of the size of the uncertainty.
There are cases where a single observer is better than several. If the measurement takes a considerable skill and practice, it may not be practical for all partners to attain this skill in the allotted time. But the measurements should at least be checked by another person, to eliminate blunders.
It is not sufficient to merely repeat the reading process; the entire measuring procedure should be repeated. For example, if the length of the room is measured with a meter stick, each observer should carry out the entire process of successive laying-off of the stick, for important errors are introduced in this procedure.
2.5 MEASURES OF ERROR
The goal of most experiments is to calculate numerical results from measured data. The goal of error analysis is to determine the reliability of these results. Just as we must take data from which to calculate a result, so we must also know the uncertainties in the data to calculate the uncertainty of a result.
The first step in any error analysis is to determine the error in each data quantity. In some cases this information may be known from past experience, or it may be supplied by the manufacturer of the measuring device.
The experimenter confronted with an unfamiliar measuring device must experimentally determine its reliability. This will often occupy a large fraction of the laboratory time. Laboratory manuals usually do not spell out this procedure, but it must be done anyway. Do not assume that data is scale-limited until you have shown it to be (by repeated measurements). Many common measuring procedures are not scale-limited.
One of the best ways to estimate the precision of a measurement is to make a number of independent measurements. As illustrated by the example of section 2.2, these values will show a distribution, or scatter. The distribution may be graphed. If only a few measurements are made, a bar-graph like Fig. 2.1 may be appropriate. If very many measurements are made, the distribution usually approaches a smooth curve. Figs. 2.2, 2.3, and 2.4 illustrate only a few of the many possible distribution curves. In all cases the graphs are a plot of the number of occurrences of each value (on the vertical axis) plotted against the values (on the horizontal axis).
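For a small data set, the number of occurrences of each value can be tallied directly. The following sketch, which is our illustration and not one of the figures, counts the values of the eight-measurement data set and prints a rough text bar graph. For simplicity the bars run horizontally rather than vertically.

```python
from collections import Counter

data = [3.69, 3.68, 3.67, 3.69, 3.68, 3.69, 3.66, 3.67]

# Number of occurrences of each distinct value.
counts = Counter(data)

# One row per value, one '#' per occurrence.
for value in sorted(counts):
    print(f"{value:.2f} | {'#' * counts[value]}")
```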
Error distributions like that of Fig. 2. aren't often encountered in science. Many of the distributions will resemble Fig. 2.4. This curve is the famous "normal" or Gaussian curve. Most of the mathematics of statistics, and of error theory, is based upon this curve.
Analysis of a sample set of repeated measurements allows us to calculate several important things:
(1) An "average" or "best" value representative of the set. If the distribution is normal or near-normal, the arithmetic mean is the best value.
(2) A measure of the dispersion (width or spread) of the distribution. This is also a measure of the "average" error of a typical measurement.
(3) An estimate of the error in the mean itself.
2.6 AVERAGE DEVIATION
A simple and useful measure of scatter is the average deviation:
AVERAGE DEVIATION: Designate the values of the measurements of a quantity Q by Qi, and the mean (average) of these by <Q>. The average deviation of the Qi is found by averaging the magnitudes of the deviations from the mean (i.e., ignoring the signs of the deviations). If the sample is small, the average of the deviations is done by dividing by n-1 rather than n, which is the "Bessel small sample correction."
The average of the deviations (including their signs) is, of course, zero (see exercise 2.3).
Example 3: Using the data set of sec. 2.2, we can calculate the deviation of each value from the average. In the table below, we have listed these deviations in the second column, and their magnitudes are in the last column. (The second column is included for completeness; you would not normally include it on your calculation sheet.) The calculation of the mean and the average deviation of the data set are also shown.
DATA      DEVIATIONS       MAGNITUDES OF
SET       FROM THE MEAN    THE DEVIATIONS
3.69      +0.01            0.01
3.68       0.0             0.0
3.67      -0.01            0.01
3.69      +0.01            0.01
3.68       0.0             0.0
3.69      +0.01            0.01
3.66      -0.02            0.02
3.67      -0.01            0.01
_____                      ____
29.43                      0.07

Mean = 29.43 / 8 = 3.67875, rounded to 3.68
Average deviation of the data set = 0.07 / 7 = 0.01
Result: 3.68 ± 0.01
Note that when calculating error estimates we are satisfied with values rounded off to about two significant digits. To calculate an error to as great a precision as the quantity itself is wasted effort.
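The average deviation calculation can be sketched in a few lines of Python. The function name and the `small_sample` flag are hypothetical. Note also that this sketch takes deviations from the unrounded mean, so it gives about 0.0104 rather than exactly 0.01. To two significant digits the results agree.

```python
def average_deviation(values, small_sample=True):
    """Average of the magnitudes of the deviations from the mean.

    With small_sample=True the sum is divided by n - 1 (the small
    sample correction described in the text); otherwise by n.
    """
    n = len(values)
    mean = sum(values) / n
    total = sum(abs(v - mean) for v in values)
    return total / (n - 1 if small_sample else n)

data = [3.69, 3.68, 3.67, 3.69, 3.68, 3.69, 3.66, 3.67]
mean = sum(data) / len(data)
avg_dev = average_deviation(data)

print(f"{mean:.2f} ± {avg_dev:.2f}")
```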
2.7 ESTIMATING THE ERROR OF THE MEAN
The mean (average) is often taken as the best representative value for a set of measurements. How good is the mean for this purpose? How far is it likely to deviate from the "true" value?
You may have an intuitive feeling that the mean will be "better" if you average a larger number of independently determined values. We will later show (in chapter 6) that this is true. It turns out that for each method of measuring dispersion of the measurements there is a corresponding estimate of error of the mean, related by
estimated error of the mean = (measure of data dispersion) / √n
where n is the number of values which were averaged. Therefore we may define:
AVERAGE DEVIATION OF THE MEAN: the average deviation divided by the square root of the number of measurements.
For the data set of Example 3, the average deviation of the measurements was 0.01, but the average deviation of the mean is
0.01 / √8 = 0.01 / 2.83 = 0.0035
When the "average deviation" is quoted as a measure of error of a result obtained from averaging, or from graphical curve fitting, it is the "average deviation of the mean" which is meant. This is implied by the way we express results:
3.68 ± 0.004
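As a check on the arithmetic, the division by the square root of n can be carried out directly. This is a minimal sketch with the inputs taken from Example 3. The variable names are ours.

```python
import math

average_dev = 0.01   # average deviation of the measurements (Example 3)
n = 8                # number of measurements that were averaged

# Average deviation of the mean: the dispersion divided by sqrt(n).
adm = average_dev / math.sqrt(n)

print(f"average deviation of the mean = {adm:.4f}")
print(f"3.68 ± {adm:.3f}")
```

Rounded to one significant digit, the error of the mean is 0.004.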
The 0.004 tells us the uncertainty of the mean (3.68). To illustrate the difference, consider again the process of Example 3. Eight measurements were averaged to obtain a mean. Suppose we did this again, with eight new measurements, and calculated another mean. Now repeat this many times, obtaining a large number of independent calculated values of the mean. Though the individual measurements in each set of eight would have an average deviation of 0.01 from their mean, the means would have an average deviation of 0.004 from their mean.
In other words, the average deviation expresses the uncertainty of the individual measurements, while the average deviation of the mean expresses the uncertainty of the mean itself from the "true" value.
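The thought experiment of repeating the whole set of measurements many times can be imitated with simulated data. The sketch below is a rough numerical illustration, not a proof. It assumes Gaussian scatter, and the "true" value, the spread, and the number of repeated sets are all hypothetical choices.

```python
import math
import random

random.seed(1)        # fixed seed so the sketch is repeatable
TRUE_VALUE = 3.68     # assumed "true" value for the simulation
SIGMA = 0.0125        # assumed spread of a single measurement
N = 8                 # measurements per set
SETS = 2000           # number of repeated sets

def mean(values):
    return sum(values) / len(values)

def mean_of_magnitudes(values):
    """Mean magnitude of the deviations from the mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

# Simulate many sets of N measurements and average each set.
sets = [[random.gauss(TRUE_VALUE, SIGMA) for _ in range(N)]
        for _ in range(SETS)]
all_values = [v for s in sets for v in s]
set_means = [mean(s) for s in sets]

scatter_of_values = mean_of_magnitudes(all_values)
scatter_of_means = mean_of_magnitudes(set_means)
ratio = scatter_of_values / scatter_of_means

print(f"scatter of single measurements: {scatter_of_values:.4f}")
print(f"scatter of the means:           {scatter_of_means:.4f}")
print(f"ratio = {ratio:.2f}, sqrt(N) = {math.sqrt(N):.2f}")
```

Under these assumptions the ratio of the two scatters should come out close to the square root of N, which is the factor that relates the dispersion of the data to the error of the mean.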
2.8 THE STANDARD DEVIATION
The standard deviation is a widely used measure of error, especially appropriate for Gaussian distributions:
We have included this definition here, for completeness, but will postpone discussion of it until chapter 5.
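For readers who want a number to compare with the average deviation, here is a minimal sketch. It assumes the usual sample form of the standard deviation (the square root of the sum of squared deviations from the mean, divided by n - 1), which may differ in detail from the form developed in chapter 5. It uses Python's standard `statistics` module.

```python
import math
import statistics

data = [3.69, 3.68, 3.67, 3.69, 3.68, 3.69, 3.66, 3.67]

# Sample standard deviation; statistics.stdev divides by n - 1.
std_dev = statistics.stdev(data)

# Corresponding error of the mean: the dispersion divided by sqrt(n).
std_dev_of_mean = std_dev / math.sqrt(len(data))

print(f"standard deviation:             {std_dev:.4f}")
print(f"standard deviation of the mean: {std_dev_of_mean:.4f}")
```

For this data set the standard deviation comes out slightly larger than the average deviation.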
2.9 PRECISION AND ACCURACY; SOME DEFINITIONS
Words and their meanings can cause difficulty in scientific discussion. This is especially true when words taken from common usage are given specialized scientific meanings. Such words as accurate, correct, precise, and reliable are examples of this problem. Their meanings, as given in standard dictionaries, overlap. Webster's New World Dictionary gives some help in its clarification of the meanings of possible synonyms of the word correct:
"...correct connotes little more than absence of error (a correct answer)...; accurate implies a positive exercise of care to obtain conformity with fact or truth...; exact stresses perfect conformity with fact, truth, or some standard (the exact time, an exact quotation); precise suggests minute accuracy of detail..."
Measuring processes never yield perfect precision or perfect accuracy, so absolutely correct or true results are not attainable. We can only speak of degrees of precision or degrees of accuracy expressed as numerical amounts or percents.
The following statements supply the essential scientific meanings of these terms:
ACCURATE: conforming closely to truth, or to some standard.
PRECISE: sharply or clearly defined.
A precise measurement is one in which repeated trials give very nearly the same value, with small fluctuation. Such a measurement may not be accurate, however, if the measuring instrument is miscalibrated. Accuracy implies freedom from all sources of error, while precision only implies absence of indeterminate error.
These words refer more to how we make the measurements than to the results of the measurements. True values are never known, so the accuracy of results can never really be determined. When a result is said to be accurate, this means that analysis of the experimental procedure has shown that all sources of error known to the experimenter have been kept small. Statements of experimental accuracy should be supported by description of experimental equipment and technique, to inform others of just what precautions were taken. Quite often in the history of science, results thought accurate were later found inaccurate because of unrecognized determinate error.
"Correct" is used in science primarily to indicate absence of mathematical error or blunder.
2.10 EXERCISES
(2.1) Two wooden meter sticks, picked at random from the supply room, are laid with their scales adjacent and their zero marks coinciding. It is observed that the markings line up well near the zero ends of the sticks, but as we go to larger readings they do not coincide, and when we reach the 100 cm. end of the stick, the markings are a full millimeter off. On the basis of this limited information, how would you express the reliability you would expect from a meter stick? Would it be better to express the error in millimeters, or as a percent? Why?
(2.2) A student measures a quantity four times, getting 4.86, 4.99, 4.80, and 5.02. What is the average value? The textbook value for this measurement was 5.01. What is the student's maximum error? What is the percent maximum error? What is the discrepancy? What is the percent discrepancy?
(2.3) When averaging deviations, we always insist that it is the magnitudes of the deviations we must average. If we were to average the deviations from the mean (i.e., retaining the signs of the deviations) we would always get an average of zero. Construct an algebraic proof of this assertion.
(2.4) Six measurements of a quantity were found to be: 14.68, 14.23, 13.91, 14.44, 13.85, 14.16. Calculate the average, the deviation of each measurement, the maximum deviation, and the average deviation. Round off the average and state it, with its error, in standard form.
2.11 SUMMARY OF CHAPTER 2.
The descriptive term "exact science" is a misnomer, for science deals with measurements, and the measurement processes can never be perfect. We can only try to improve the precision of measurements by reducing experimental uncertainties.
Repeated measurements of a quantity show a scatter (dispersion) throughout a range of values. A mathematical average of the values (for example, the arithmetic mean) is often taken as the best representative value of the set. But how good is that mean value? A clue is provided by the "width" of the scatter distribution of the original measurements about the mean. The average deviation of the measurements is a measure of this width. But we are more interested in how much the sample mean deviates from the "true" mean, and that "goodness" of the mean is better estimated by the average deviation of the mean:
average deviation of the mean = (average deviation of the values of x) / √n
where n is the number of values of x.
We suspect that the mean value is closer to "truth," and more reliable than any single measurement. The mathematical study of statistics provides tools which are very helpful for estimating just how good mean values are.
We used the term "true value" to facilitate discussion. We may think of the true value as representing either (1) the value we'd measure if all sources of uncertainty were eliminated, or (2) the mean value of an infinite set of measurements. Statisticians refer to the latter as the universe mean.
The standard form for measured values and results is:
(value) ± (est. error in the value)
for example: 3.68 ± 0.02 seconds
Errors and discrepancies may also be expressed in fractional form, or as percents. Always specify clearly what kind of error estimate you use (maximum, average deviation, standard deviation, etc.).
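Converting an error estimate to fractional or percent form is a one-line calculation. The sketch below uses the example value above. The variable names are ours.

```python
value, error = 3.68, 0.02    # standard form: 3.68 ± 0.02 seconds

fractional = error / value   # error as a fraction of the value
percent = 100 * fractional   # the same error as a percent

print(f"{value} ± {error} seconds")
print(f"fractional error: {fractional:.4f}")
print(f"percent error:    {percent:.2f}%")
```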
The proper style for scientific notation is:
(6.35 ± 0.003) × 10^6
© 1996, 2004 by Donald E. Simanek