Although it may not seem so to the beginner, it is important to examine the ways in which numbers are represented.
Humans normally represent a number in decimal (base 10) form, while modern computers use binary (base 2) and also hexadecimal (base 16) forms. Numerical calculations usually involve numbers that cannot be represented exactly by a finite number of digits. For instance, the arithmetical operation of division often gives a number which does not terminate; the decimal (base 10) representation of 2/3 is one example. Even a number such as 0.1, which terminates in decimal form, would not terminate if expressed in binary form. There are also the irrational numbers, such as π, which do not terminate. In order to carry out a numerical calculation involving such numbers, one must approximate them by a representation involving a finite number of significant digits (S). For practical reasons (for example, the size of the back of an envelope or the storage available in a machine), this number is usually quite small. Typically, a single precision number on a computer has an accuracy of only about 6 or 7 decimal digits (cf. below).
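The non-terminating binary expansion of 0.1 can be observed directly. As a small Python sketch (not part of the original text), printing the exact value of the double precision number nearest to 0.1 shows that it is not exactly 0.1:

```python
from decimal import Decimal

# Decimal(0.1) shows the exact value of the IEEE double nearest to 0.1;
# since 0.1 has no terminating binary expansion, this is not exactly 0.1.
print(Decimal(0.1))   # 0.1000000000000000055511151231257827021181583404541015625

# 2/3 does not terminate even in decimal; any finite representation is approximate.
print(2 / 3)          # 0.6666666666666666
```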
To five significant digits (5S), 2/3 is represented by 0.66667, π by 3.1416, and √2 by 1.4142. None of these is an exact representation, but all are correct to within half a unit of the fifth significant digit. Numbers should normally be presented in this sense, correct to the number of digits given.
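These 5S representations can be reproduced with Python's general-format specifier, which rounds to a given number of significant digits (a sketch, not from the original text):

```python
import math

# Round each value to five significant digits (5S) using the "g" format.
for name, x in [("2/3", 2 / 3), ("pi", math.pi), ("sqrt(2)", math.sqrt(2))]:
    print(f"{name} to 5S: {x:.5g}")
# 2/3 to 5S: 0.66667
# pi to 5S: 3.1416
# sqrt(2) to 5S: 1.4142
```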
If the numbers to be represented are very large or very small, it is convenient to write them in floating point notation (for example, the speed of light is 2.99792 × 10^8 m/s, and the electronic charge is 1.6022 × 10^-19 coulomb). As indicated, one separates the significant digits (the mantissa) from the power of ten (the exponent); the form in which the exponent is chosen so that the magnitude of the mantissa is less than 10, but not less than 1, is referred to as scientific notation.
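Most programming languages support this notation directly. The Python sketch below (the constants are taken from the text) prints both values in scientific notation, with the mantissa not less than 1 and less than 10:

```python
c = 2.99792e8     # speed of light, m/s
e = 1.6022e-19    # electronic charge, coulomb

# The "e" format separates the mantissa from the power-of-ten exponent.
print(f"{c:.5e}")   # 2.99792e+08
print(f"{e:.4e}")   # 1.6022e-19
```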
In 1985, the Institute of Electrical and Electronics Engineers published a standard for binary floating point arithmetic. This standard, known as the IEEE Standard 754, has been widely adopted (it is very common on workstations used for scientific computation). The standard specifies formats for single precision and double precision numbers. The single precision format allows 32 binary digits (known as bits) for a floating point number with 23 of these bits allocated to the mantissa. In the double precision format the values are 64 and 52 bits, respectively. On conversion from binary to decimal, it turns out that any IEEE Standard 754 single precision number has an accuracy of about six or seven decimal digits, and a double precision number an accuracy of about 15 or 16 decimal digits.
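The difference in accuracy can be demonstrated by storing π in both formats. The sketch below (an illustration, not from the original text) round-trips π through the 32-bit single precision format using Python's struct module:

```python
import math
import struct

# Pack pi into IEEE 754 single precision (32 bits) and unpack it again;
# the result retains only about 6 or 7 correct decimal digits.
single = struct.unpack("f", struct.pack("f", math.pi))[0]

print(f"double precision: {math.pi:.17g}")  # correct to about 15 or 16 digits
print(f"single precision: {single:.17g}")
print(f"difference      : {abs(single - math.pi):.3g}")
```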
The simplest way of reducing the number of significant digits in the representation of a number is merely to ignore the unwanted digits. This procedure, known as chopping, was used by many early computers. A more common and better procedure is rounding, which involves adding 5 to the first unwanted digit and then chopping. For example, π chopped to four decimal places (4D) is 3.1415, but it is 3.1416 when rounded; the representation 3.1416 is correct to five significant digits (5S). The error involved in the reduction of the number of digits is called round-off error. Since π is 3.14159..., note that chopping has introduced much more round-off error than rounding.
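The two procedures are easy to compare in code. In the sketch below, chop is a hypothetical helper (not a standard function) that discards the unwanted digits, while Python's built-in round rounds them:

```python
import math

def chop(x, d):
    """Chop (truncate) x to d decimal places by discarding the unwanted digits."""
    return math.trunc(x * 10**d) / 10**d

print(chop(math.pi, 4))    # 3.1415 (chopped to 4D)
print(round(math.pi, 4))   # 3.1416 (rounded to 4D)

# Rounding leaves a much smaller round-off error than chopping:
print(abs(3.1415 - math.pi) > abs(3.1416 - math.pi))   # True
```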
Numerical results are often obtained by truncating an infinite series or iterative process (cf. STEP 5). Whereas round-off error can be reduced by working to more significant digits, truncation errors can be reduced by retaining more terms in the series or more steps in the iteration; this, of course, involves extra work (and perhaps expense!).
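As an illustration of truncation error (a sketch, not from the original text), the Taylor series for e^x can be truncated after n terms; retaining more terms reduces the error, at the cost of the extra work mentioned above:

```python
import math

def exp_series(x, n):
    """Approximate e**x by the first n terms of its Taylor series."""
    total, term = 0.0, 1.0
    for k in range(n):
        total += term           # add the current term x**k / k!
        term *= x / (k + 1)     # form the next term from the previous one
    return total

# The truncation error shrinks as the number of retained terms grows.
for n in (4, 8, 12):
    print(n, abs(exp_series(1.0, n) - math.e))
```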
In the language of Numerical Analysis, a mistake (or blunder) is not an error! A mistake is due to fallibility (usually human, not machine). Mistakes may be trivial, with little or no effect on the accuracy of the calculation, or they may be so serious as to render the calculated results quite wrong. There are three things which may help to eliminate mistakes:
Common mistakes include: transposition of digits (for example, reading 6238 as 6328); misreading of repeated digits (for example, reading 62238 as 62338); misreading of tables (for example, referring to a wrong line or a wrong column); incorrectly positioning a decimal point; overlooking signs (especially near sign changes).
The following examples illustrate rounding to four decimal places (4D):
The following example illustrates rounding to four significant digits (4S):
5/3 = 1.66666..., which rounds to 1.667 (4S).

Given the decimal form of 5/3 above, determine the magnitude of the round-off error when it is represented by a number obtained from the decimal form by:
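Assuming the intended operations are chopping and rounding to four decimal places (4D), an assumption, since the original list of operations does not appear here, the round-off errors for 5/3 can be computed exactly with rational arithmetic:

```python
from fractions import Fraction

exact = Fraction(5, 3)
chopped = Fraction(16666, 10000)   # 5/3 chopped to 4D: 1.6666
rounded = Fraction(16667, 10000)   # 5/3 rounded to 4D: 1.6667

print(float(exact - chopped))   # error from chopping: 1/15000, about 6.67e-05
print(float(rounded - exact))   # error from rounding: 1/30000, about 3.33e-05
```

As in the earlier example with π, chopping introduces roughly twice the round-off error of rounding here.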