comparison interps/c-intercal/pit/lib/floatlib.doc @ 996:859f9b4339e6

<Gregor> tar xf egobot.tar.xz
author HackBot
date Sun, 09 Dec 2012 19:30:08 +0000
parents
children
comparison
equal deleted inserted replaced
995:6883f5911eb7 996:859f9b4339e6
1 * Summary of routines in floatlib.i:
2
3 (5000) :3 <- :1 plus :2
4 (5010) :3 <- :1 minus :2
5 (5020) :2 <- the integer part of :1
6 :3 <- the fractional part of :1
7 (5030) :3 <- :1 times :2
8 (5040) :3 <- :1 divided by :2
9 (5050) :3 <- :1 modulo :2
10
11 (5060) :2 <- :1 cast from a two's-complement integer into a
12 floating-point number
13 (5070) :2 <- :1 cast from a floating-point number into the nearest
14 two's-complement integer
15 (5080) :2 <- :1 cast from a floating-point number into a decimal
16 representation in scientific notation
17 (5090) :2 <- :1 cast from a decimal representation in scientific
18 notation into a floating-point number
19
20 (5100) :2 <- the square root of :1
21 (5110) :2 <- the natural logarithm of :1
22 (5120) :2 <- e to the power of :1 (the exponential function)
23 (5130) :3 <- :1 to the power of :2
24
25 (5200) :2 <- sin :1
26 (5210) :2 <- cos :1
27 (5220) :2 <- tan :1
28
29 (5400) :1 <- uniform random number between zero and one exclusive
30 (5410) :2 <- :1 times phi
31 (5419) :2 <- :1 divided by phi
32
33 Note: All of the above routines except (5020), (5060), (5080), (5200),
34 (5210), and (5400) also modify .5 as follows: .5 will contain #3 if
35 the result overflowed or if the arguments were out of domain, #2 if
36 the result underflowed, #1 otherwise. (See below.)
37
38 * Description of floatlib.i:
39
40 The INTERCAL floating-point library uses the IEEE format for 32-bit
41 floating-point numbers, which uses bit 31 as a sign bit (1 being
42 negative), bits 30 through 23 hold the exponent with a bias of 127,
43 and bits 22 through 0 contain the fractional part of the mantissa with
44 an implied leading 1. In mathematical notation:
45
46 N = (1.0 + fraction) * 2^(exponent - 127) * -1^sign
47
48 Thus the range of floating-point magnitudes is, roughly, from
49 5.877472*10^-39 up to 6.805647*10^38, positive and negative. Zero is
50 specially defined as all bits 0. (Actually, to be precise, zero is
51 defined as bits 30 through 0 as being 0. Bit 31 can be 1 to represent
52 negative zero, which the library generally treats as equivalent to
53 zero, though don't hold me to that.)
54
55 Note that, contrary to the IEEE standard, exponents 0 and 255 are not
56 given special treatment (besides the representation for zero). Thus
57 there is no representation for infinity or not-a-numbers, and there is
58 no gradual underflow capability. Conformance with widely-accepted
59 standards was not considered to be a priority for an INTERCAL library.
60 (The fact that the general format conforms to IEEE at all is due to
61 sheer pragmatism.)
62
63 Here, for easy reference, are some common values as they would be
64 directly represented within an INTERCAL program:
65
66 Zero: #0
67 One: #30720$#28672
68 Two: #0$#32768
69 One million: #9280$#40480
70 One-half: #28672$#28672
71 Square root of two: #31757$#30509
72 e: #1760$#33742
73 Pi: #571$#35133
74
75 However, a more human-accessible representation of floating-point
76 numbers is made possible by the routines (5080) and (5090). For this
77 representation, scientific notation is used with six digits of
78 precision after the decimal point. The seven digits of the mantissa
79 are suffixed with the two digits of the exponent. If the number is
80 negative, a one is prefixed (in the billions' place), so there can be
81 ten decimal digits in all. Finally, if the exponent is negative, fifty
82 is added. As is the usually the case with scientific notation, the
83 digit to the left of the decimal point must be nonzero except for the
84 case of zero itself.
85
86 The above may sound convoluted, but it is not nearly as confusing as
87 it perhaps should be. As an example, if you wished to enter the value
88 of the speed of light measured in centimeters per seconds squared, you
89 would type TWO NINE NINE SEVEN NINE TWO FIVE ONE OH. The INTERCAL
90 program would then call (5080) to transform this into the
91 floating-point number 2.997925e+10. The same value printed out with
92 _____
93 the help of (5090) would appear as ccxcixDCCXCMMDX. Similarly, the
94 value -1.602189e-19 (the charge of an electron measured in Coulombs)
95 would be represented respectively as ONE ONE SIX ZERO TWO ONE EIGHT
96 _______
97 NINE SIX NINE and mclxCCXVIIICMLXIX. Note that the negative sign on
98 the exponent always translates to an L between the fraction and the
99 exponent when output.
100
101 The 16-bit variable .5 is used by the floating-point library as an
102 error indicator. Upon return from most of the functions, .5 will
103 normally be set to #1 if the return value is valid. If .5 is set to
104 #2, this indicates that the result of the function underflowed (that
105 is, the magnitude of the result is less than 2^-127). If .5 is set to
106 #3, this indicates either that the result overflowed (magnitude
107 greater than 2^128), or that the arguments to the function were
108 illegal. The latter can occur for the following situations: division
109 by zero (either via division or modulus), negative root (either via
110 square root or a fractional power), and non-positive logarithm. Also,
111 (5070) will set .5 to #3 if the magnitude of the argument is greater
112 than 2^31, and (5080) will do likewise if given a bad number (e.g., if
113 it is greater than two billion).
114
115 It may be worth nothing that there are some cases in which an under-
116 or overflow can be recovered from with tolerable grace. The sign and
117 fraction bits of such a value will usually still be correct, and the
118 exponent bits will just be the lower eight bits of the true exponent:
119 that is, the true exponent plus or minus 256 as appropriate. If such a
120 value is passed to another function as is, and the return value from
121 that over- or underflows in the opposite direction, it is likely that
122 the final result can be trusted. Of course, this depends entirely on
123 the nature of the operations involved, and this paragraph should not
124 be taken as advice to pursue such approaches in general, just as this
125 document should not be taken as advice to make use of this library at
126 all. Note also that in the case of (5110), the exponential function,
127 and (5130), the power function, it is quite easy to request a
128 massively under- and/or overflowed result. In these cases the
129 functions in question exit early, and attempting to salvage something
130 from such results, with the possible exception of the sign bit, is
131 pretty much guaranteed to be fruitless.
132
133 (5020) is the integral partition function (referred to as modf in the
134 C library). Both return values are floating-point numbers, and both
135 have the same sign as the argument, such that :2 + :3 = :1 while
136 keeping the magnitude of :3 less than 1.
137
138 (5060) and (5070) are the "type-casting" functions. They translate
139 values to and from 32-bit two's-complement integers. (5070) truncates
140 any fractional amounts from its argument, and will signal an error if
141 the magnitude of the value requires more than 32 bits.
142
143 The trigonometric functions (5200), (5210), and (5220) will return
144 erroneous results when given very large arguments. At magnitudes
145 around 8 million or so, the result is occasionally off by one bit.
146 The errors get progressively worse, so that with magnitudes in the
147 neighborhood of the quintillions and higher, the number is wholly
148 inaccurate. However, at magnitudes around 8 million, there is already
149 a difference of over 20pi between consecutive floating-point numbers,
150 so it is hard to conceive of a purpose for which improved accuracy
151 would be useful.
152
153 In the descriptions for (5410) and (5419), phi refers to the golden
154 ratio.
155
156 * Internal routines in floatlib.i:
157
158 A floating-point library involves a fair amount of mathematical
159 functionality to begin with, something that INTERCAL is not generally
160 mistaken for having. The floating-point library therefore has a number
161 of internal functions to provide it with some 64-bit arithmetic, among
162 other things. While these routines were designed specifically for
163 internal use, and in many cases one would be hard-pressed to find
164 other uses for them in their current form, it was felt that they were
165 nonetheless worth documenting. Besides providing a guide for anyone so
166 foolish as to examine the actual code, it provides a sort of "itemized
167 bill" that helps to justify the inordinate size of the library.
168
169 In the following list, the notation :1:2 simply indicates that two
170 32-bit variables, in this example :1 and :2, are being used to hold a
171 single 64-bit integer value. The notation :1:2.1, on the other hand,
172 indicates a kind of double-precision floating-point value, a format
173 that the library uses frequently when storing intermediate values. The
174 two 32-bit variables hold the fraction bits, with no implied bits, and
175 ideally with the highest 1 bit at bit 55. The 16-bit variable holds
176 the exponent, but with a bias of 639 instead of 127. This is done so
177 that underflows do not affect the sign, which is also stored in the
178 16-bit variable in bit 10. These intermediate values are rounded (to
179 even), truncated, and stored in a single 32-bit variable by the
180 floating-point normalization function, (5600).
181
182 Finally, note that the numbers in the tables accessed by routines
183 (5690) through (5693) are generally tailored for the function that
184 applies them, and all of them use slightly different representations
185 for the values they encode.
186
187 Here are brief descriptions of the internal routines:
188
189 (5500) :1:2 <- :1:2 plus :3:4
190 .5 <- #0 if no overflow, #1 otherwise
191 (5510) :1 <- :1 plus :6, where :6 has at most one 1 bit
192 :6 <- #0 on overflow, nonzero otherwise
193 (5519) :1 <- :1 minus :6, where :6 has at most one 1 bit
194 :6 <- #0 on underflow, nonzero otherwise
195 (5520) .1 <- .1 minus #1
196 (5530) :1:2:3:4 <- :1:2 times :3:4
197 (5540) :1 <- :1:2 divided by :3:4, with :1:2 aligned on bit 63, :3:4
198 aligned on bit 62, and the result aligned on bit 31
199 .1 <- the exponent of the quotient plus #126
200 .5 <- #1 if the result should be rounded up, #0 otherwise
201 (5550) :1:2.1 <- :1:2.1 modulo :3:4.2
202 (5560) :1:2 <- two's complement of :1:2
203 (5570) :1:2 <- :1:2 left-shifted logically so that bit 63 is 1
204 .1 <- .1 minus the no. of bits shifted out
205 (5580) :3:4 <- :3:4 shifted right arithmetically .3 bits
206 (5590) :1:2 <- :1:2 shifted so that bit 55 is the highest 1 bit
207 .1 <- .1 plus or minus the no. of bits shifted
208 .5 <- the bits right-shifted out of :1:2, if any
209 (5600) :1 <- the normalization of :1:2.1
210 (5610) :1:2.1 <- the natural logarithm of :1:2.1
211 (5620) :1:2.1 <- e to the power of :1:2.1
212 (5640) :1:2.1 <- :1:2.1 to the power of the two's-complement integer
213 in :3
214 (5650) :1:2.1 <- :1 to the power of :2
215 .5 <- #1 if the power is not real, #2 otherwise
216 (5670) :1:2 <- an angle between -pi/2 and pi/2 that has the same
217 value for sine as :1, and stored in two's-complement,
218 fixed-point form, with the one's position at bit 62
219 .1 <- #1024 if the cosine of :1:2 will have the opposite sign
220 as the cosine of :1, #2048 otherwise
221 (5680) :1:2 <- the sine of :1:2 in two's-complement, fixed-point
222 :3:4 <- the cosine of :1:2
223 (5690) ;1 <- a table of 32 significant bits of the numbers 10^i,
224 where i ranges from 1 to 39
225 ,1 <- a table of exponents for the above powers of ten
226 (5691) ;1 <- a table of bit patterns representing ln(1 + 1/(2^i - 1)),
227 where i ranges from -1 to -26
228 (5692) ;1 <- a table of bit patterns representing -ln(1 + 2^i), where
229 i ranges from -1 to -32
230 (5693) ;1 <- a table of bit patterns representing arctan(2^i), where
231 i ranges from 0 to -30
232
233 * Bibliography:
234
235 Knuth, Donald E. "The Art of Computer Programming: Vol. 2
236 (Seminumerical Algorithms)", 2nd edition. Addison-Wesley, 1981.
237 This book provided the essential background as well as algorithms
238 for the basic arithmetic operations. Mention should also be made
239 of Vol. 1, which provided an algorithm for the natural logarithm
240 as a side note.
241
242 Koren, Israel. "Computer Arithmetic Algorithms". Prentice-Hall, 1993.
243 In addition to information on the IEEE standard, this book
244 provided the algorithms used for the exponential and trigonometric
245 functions.
246
247 Plauger, P. J. "The Standard C Library". Prentice-Hall, 1992.
248 This book provided assistance in designing the modulus and power
249 functions, as well as other miscellaneous contributions.