996
|
1 * Summary of routines in floatlib.i:
|
|
2
|
|
3 (5000) :3 <- :1 plus :2
|
|
4 (5010) :3 <- :1 minus :2
|
|
5 (5020) :2 <- the integer part of :1
|
|
6 :3 <- the fractional part of :1
|
|
7 (5030) :3 <- :1 times :2
|
|
8 (5040) :3 <- :1 divided by :2
|
|
9 (5050) :3 <- :1 modulo :2
|
|
10
|
|
11 (5060) :2 <- :1 cast from a two's-complement integer into a
|
|
12 floating-point number
|
|
13 (5070) :2 <- :1 cast from a floating-point number into the nearest
|
|
14 two's-complement integer
|
|
15 (5080) :2 <- :1 cast from a floating-point number into a decimal
|
|
16 representation in scientific notation
|
|
17 (5090) :2 <- :1 cast from a decimal representation in scientific
|
|
18 notation into a floating-point number
|
|
19
|
|
20 (5100) :2 <- the square root of :1
|
|
21 (5110) :2 <- the natural logarithm of :1
|
|
22 (5120) :2 <- e to the power of :1 (the exponential function)
|
|
23 (5130) :3 <- :1 to the power of :2
|
|
24
|
|
25 (5200) :2 <- sin :1
|
|
26 (5210) :2 <- cos :1
|
|
27 (5220) :2 <- tan :1
|
|
28
|
|
29 (5400) :1 <- uniform random number between zero and one exclusive
|
|
30 (5410) :2 <- :1 times phi
|
|
31 (5419) :2 <- :1 divided by phi
|
|
32
|
|
33 Note: All of the above routines except (5020), (5060), (5080), (5200),
|
|
34 (5210), and (5400) also modify .5 as follows: .5 will contain #3 if
|
|
35 the result overflowed or if the arguments were out of domain, #2 if
|
|
36 the result underflowed, #1 otherwise. (See below.)
|
|
37
|
|
38 * Description of floatlib.i:
|
|
39
|
|
40 The INTERCAL floating-point library uses the IEEE format for 32-bit
|
|
41 floating-point numbers, which uses bit 31 as a sign bit (1 being
|
|
42 negative), bits 30 through 23 hold the exponent with a bias of 127,
|
|
43 and bits 22 through 0 contain the fractional part of the mantissa with
|
|
44 an implied leading 1. In mathematical notation:
|
|
45
|
|
46 N = (1.0 + fraction) * 2^(exponent - 127) * -1^sign
|
|
47
|
|
48 Thus the range of floating-point magnitudes is, roughly, from
|
|
49 5.877472*10^-39 up to 6.805647*10^38, positive and negative. Zero is
|
|
50 specially defined as all bits 0. (Actually, to be precise, zero is
|
|
51 defined as bits 30 through 0 as being 0. Bit 31 can be 1 to represent
|
|
52 negative zero, which the library generally treats as equivalent to
|
|
53 zero, though don't hold me to that.)
|
|
54
|
|
55 Note that, contrary to the IEEE standard, exponents 0 and 255 are not
|
|
56 given special treatment (besides the representation for zero). Thus
|
|
57 there is no representation for infinity or not-a-numbers, and there is
|
|
58 no gradual underflow capability. Conformance with widely-accepted
|
|
59 standards was not considered to be a priority for an INTERCAL library.
|
|
60 (The fact that the general format conforms to IEEE at all is due to
|
|
61 sheer pragmatism.)
|
|
62
|
|
63 Here, for easy reference, are some common values as they would be
|
|
64 directly represented within an INTERCAL program:
|
|
65
|
|
66 Zero: #0
|
|
67 One: #30720$#28672
|
|
68 Two: #0$#32768
|
|
69 One million: #9280$#40480
|
|
70 One-half: #28672$#28672
|
|
71 Square root of two: #31757$#30509
|
|
72 e: #1760$#33742
|
|
73 Pi: #571$#35133
|
|
74
|
|
75 However, a more human-accessible representation of floating-point
|
|
76 numbers is made possible by the routines (5080) and (5090). For this
|
|
77 representation, scientific notation is used with six digits of
|
|
78 precision after the decimal point. The seven digits of the mantissa
|
|
79 are suffixed with the two digits of the exponent. If the number is
|
|
80 negative, a one is prefixed (in the billions' place), so there can be
|
|
81 ten decimal digits in all. Finally, if the exponent is negative, fifty
|
|
82 is added. As is the usually the case with scientific notation, the
|
|
83 digit to the left of the decimal point must be nonzero except for the
|
|
84 case of zero itself.
|
|
85
|
|
86 The above may sound convoluted, but it is not nearly as confusing as
|
|
87 it perhaps should be. As an example, if you wished to enter the value
|
|
88 of the speed of light measured in centimeters per seconds squared, you
|
|
89 would type TWO NINE NINE SEVEN NINE TWO FIVE ONE OH. The INTERCAL
|
|
90 program would then call (5080) to transform this into the
|
|
91 floating-point number 2.997925e+10. The same value printed out with
|
|
92 _____
|
|
93 the help of (5090) would appear as ccxcixDCCXCMMDX. Similarly, the
|
|
94 value -1.602189e-19 (the charge of an electron measured in Coulombs)
|
|
95 would be represented respectively as ONE ONE SIX ZERO TWO ONE EIGHT
|
|
96 _______
|
|
97 NINE SIX NINE and mclxCCXVIIICMLXIX. Note that the negative sign on
|
|
98 the exponent always translates to an L between the fraction and the
|
|
99 exponent when output.
|
|
100
|
|
101 The 16-bit variable .5 is used by the floating-point library as an
|
|
102 error indicator. Upon return from most of the functions, .5 will
|
|
103 normally be set to #1 if the return value is valid. If .5 is set to
|
|
104 #2, this indicates that the result of the function underflowed (that
|
|
105 is, the magnitude of the result is less than 2^-127). If .5 is set to
|
|
106 #3, this indicates either that the result overflowed (magnitude
|
|
107 greater than 2^128), or that the arguments to the function were
|
|
108 illegal. The latter can occur for the following situations: division
|
|
109 by zero (either via division or modulus), negative root (either via
|
|
110 square root or a fractional power), and non-positive logarithm. Also,
|
|
111 (5070) will set .5 to #3 if the magnitude of the argument is greater
|
|
112 than 2^31, and (5080) will do likewise if given a bad number (e.g., if
|
|
113 it is greater than two billion).
|
|
114
|
|
115 It may be worth nothing that there are some cases in which an under-
|
|
116 or overflow can be recovered from with tolerable grace. The sign and
|
|
117 fraction bits of such a value will usually still be correct, and the
|
|
118 exponent bits will just be the lower eight bits of the true exponent:
|
|
119 that is, the true exponent plus or minus 256 as appropriate. If such a
|
|
120 value is passed to another function as is, and the return value from
|
|
121 that over- or underflows in the opposite direction, it is likely that
|
|
122 the final result can be trusted. Of course, this depends entirely on
|
|
123 the nature of the operations involved, and this paragraph should not
|
|
124 be taken as advice to pursue such approaches in general, just as this
|
|
125 document should not be taken as advice to make use of this library at
|
|
126 all. Note also that in the case of (5110), the exponential function,
|
|
127 and (5130), the power function, it is quite easy to request a
|
|
128 massively under- and/or overflowed result. In these cases the
|
|
129 functions in question exit early, and attempting to salvage something
|
|
130 from such results, with the possible exception of the sign bit, is
|
|
131 pretty much guaranteed to be fruitless.
|
|
132
|
|
133 (5020) is the integral partition function (referred to as modf in the
|
|
134 C library). Both return values are floating-point numbers, and both
|
|
135 have the same sign as the argument, such that :2 + :3 = :1 while
|
|
136 keeping the magnitude of :3 less than 1.
|
|
137
|
|
138 (5060) and (5070) are the "type-casting" functions. They translate
|
|
139 values to and from 32-bit two's-complement integers. (5070) truncates
|
|
140 any fractional amounts from its argument, and will signal an error if
|
|
141 the magnitude of the value requires more than 32 bits.
|
|
142
|
|
143 The trigonometric functions (5200), (5210), and (5220) will return
|
|
144 erroneous results when given very large arguments. At magnitudes
|
|
145 around 8 million or so, the result is occasionally off by one bit.
|
|
146 The errors get progressively worse, so that with magnitudes in the
|
|
147 neighborhood of the quintillions and higher, the number is wholly
|
|
148 inaccurate. However, at magnitudes around 8 million, there is already
|
|
149 a difference of over 20pi between consecutive floating-point numbers,
|
|
150 so it is hard to conceive of a purpose for which improved accuracy
|
|
151 would be useful.
|
|
152
|
|
153 In the descriptions for (5410) and (5419), phi refers to the golden
|
|
154 ratio.
|
|
155
|
|
156 * Internal routines in floatlib.i:
|
|
157
|
|
158 A floating-point library involves a fair amount of mathematical
|
|
159 functionality to begin with, something that INTERCAL is not generally
|
|
160 mistaken for having. The floating-point library therefore has a number
|
|
161 of internal functions to provide it with some 64-bit arithmetic, among
|
|
162 other things. While these routines were designed specifically for
|
|
163 internal use, and in many cases one would be hard-pressed to find
|
|
164 other uses for them in their current form, it was felt that they were
|
|
165 nonetheless worth documenting. Besides providing a guide for anyone so
|
|
166 foolish as to examine the actual code, it provides a sort of "itemized
|
|
167 bill" that helps to justify the inordinate size of the library.
|
|
168
|
|
169 In the following list, the notation :1:2 simply indicates that two
|
|
170 32-bit variables, in this example :1 and :2, are being used to hold a
|
|
171 single 64-bit integer value. The notation :1:2.1, on the other hand,
|
|
172 indicates a kind of double-precision floating-point value, a format
|
|
173 that the library uses frequently when storing intermediate values. The
|
|
174 two 32-bit variables hold the fraction bits, with no implied bits, and
|
|
175 ideally with the highest 1 bit at bit 55. The 16-bit variable holds
|
|
176 the exponent, but with a bias of 639 instead of 127. This is done so
|
|
177 that underflows do not affect the sign, which is also stored in the
|
|
178 16-bit variable in bit 10. These intermediate values are rounded (to
|
|
179 even), truncated, and stored in a single 32-bit variable by the
|
|
180 floating-point normalization function, (5600).
|
|
181
|
|
182 Finally, note that the numbers in the tables accessed by routines
|
|
183 (5690) through (5693) are generally tailored for the function that
|
|
184 applies them, and all of them use slightly different representations
|
|
185 for the values they encode.
|
|
186
|
|
187 Here are brief descriptions of the internal routines:
|
|
188
|
|
189 (5500) :1:2 <- :1:2 plus :3:4
|
|
190 .5 <- #0 if no overflow, #1 otherwise
|
|
191 (5510) :1 <- :1 plus :6, where :6 has at most one 1 bit
|
|
192 :6 <- #0 on overflow, nonzero otherwise
|
|
193 (5519) :1 <- :1 minus :6, where :6 has at most one 1 bit
|
|
194 :6 <- #0 on underflow, nonzero otherwise
|
|
195 (5520) .1 <- .1 minus #1
|
|
196 (5530) :1:2:3:4 <- :1:2 times :3:4
|
|
197 (5540) :1 <- :1:2 divided by :3:4, with :1:2 aligned on bit 63, :3:4
|
|
198 aligned on bit 62, and the result aligned on bit 31
|
|
199 .1 <- the exponent of the quotient plus #126
|
|
200 .5 <- #1 if the result should be rounded up, #0 otherwise
|
|
201 (5550) :1:2.1 <- :1:2.1 modulo :3:4.2
|
|
202 (5560) :1:2 <- two's complement of :1:2
|
|
203 (5570) :1:2 <- :1:2 left-shifted logically so that bit 63 is 1
|
|
204 .1 <- .1 minus the no. of bits shifted out
|
|
205 (5580) :3:4 <- :3:4 shifted right arithmetically .3 bits
|
|
206 (5590) :1:2 <- :1:2 shifted so that bit 55 is the highest 1 bit
|
|
207 .1 <- .1 plus or minus the no. of bits shifted
|
|
208 .5 <- the bits right-shifted out of :1:2, if any
|
|
209 (5600) :1 <- the normalization of :1:2.1
|
|
210 (5610) :1:2.1 <- the natural logarithm of :1:2.1
|
|
211 (5620) :1:2.1 <- e to the power of :1:2.1
|
|
212 (5640) :1:2.1 <- :1:2.1 to the power of the two's-complement integer
|
|
213 in :3
|
|
214 (5650) :1:2.1 <- :1 to the power of :2
|
|
215 .5 <- #1 if the power is not real, #2 otherwise
|
|
216 (5670) :1:2 <- an angle between -pi/2 and pi/2 that has the same
|
|
217 value for sine as :1, and stored in two's-complement,
|
|
218 fixed-point form, with the one's position at bit 62
|
|
219 .1 <- #1024 if the cosine of :1:2 will have the opposite sign
|
|
220 as the cosine of :1, #2048 otherwise
|
|
221 (5680) :1:2 <- the sine of :1:2 in two's-complement, fixed-point
|
|
222 :3:4 <- the cosine of :1:2
|
|
223 (5690) ;1 <- a table of 32 significant bits of the numbers 10^i,
|
|
224 where i ranges from 1 to 39
|
|
225 ,1 <- a table of exponents for the above powers of ten
|
|
226 (5691) ;1 <- a table of bit patterns representing ln(1 + 1/(2^i - 1)),
|
|
227 where i ranges from -1 to -26
|
|
228 (5692) ;1 <- a table of bit patterns representing -ln(1 + 2^i), where
|
|
229 i ranges from -1 to -32
|
|
230 (5693) ;1 <- a table of bit patterns representing arctan(2^i), where
|
|
231 i ranges from 0 to -30
|
|
232
|
|
233 * Bibliography:
|
|
234
|
|
235 Knuth, Donald E. "The Art of Computer Programming: Vol. 2
|
|
236 (Seminumerical Algorithms)", 2nd edition. Addison-Wesley, 1981.
|
|
237 This book provided the essential background as well as algorithms
|
|
238 for the basic arithmetic operations. Mention should also be made
|
|
239 of Vol. 1, which provided an algorithm for the natural logarithm
|
|
240 as a side note.
|
|
241
|
|
242 Koren, Israel. "Computer Arithmetic Algorithms". Prentice-Hall, 1993.
|
|
243 In addition to information on the IEEE standard, this book
|
|
244 provided the algorithms used for the exponential and trigonometric
|
|
245 functions.
|
|
246
|
|
247 Plauger, P. J. "The Standard C Library". Prentice-Hall, 1992.
|
|
248 This book provided assistance in designing the modulus and power
|
|
249 functions, as well as other miscellaneous contributions.
|