.. _tut-fp-issues:

**************************************************
Floating Point Arithmetic: Issues and Limitations
**************************************************

.. sectionauthor:: Tim Peters <tim_one@users.sourceforge.net>


Floating-point numbers are represented in computer hardware as base 2 (binary)
fractions. For example, the decimal fraction ::

   0.125

has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction ::

   0.001

has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
real difference being that the first is written in base 10 fractional notation,
and the second in base 2.

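The positional expansions above can be checked with exact rational arithmetic.
The following is only a small sketch; the helper name ``positional_value`` is
made up for this illustration and is not part of any library.

```python
from fractions import Fraction

def positional_value(digits, base):
    """Exact value of the fractional digits 0.d1d2d3... in the given base."""
    # The k-th digit after the point is worth digit / base**k.
    return sum(Fraction(int(d), base ** k) for k, d in enumerate(digits, start=1))

decimal_value = positional_value("125", 10)   # 1/10 + 2/100 + 5/1000
binary_value = positional_value("001", 2)     # 0/2 + 0/4 + 1/8
print(decimal_value, binary_value)            # both print as 1/8
```

Both notations name the same rational number, which is why 0.125 happens to be
one of the decimal fractions a binary float can store exactly.
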
Unfortunately, most decimal fractions cannot be represented exactly as binary
fractions. A consequence is that, in general, the decimal floating-point
numbers you enter are only approximated by the binary floating-point numbers
actually stored in the machine.

The problem is easier to understand at first in base 10. Consider the fraction
1/3. You can approximate that as a base 10 fraction::

   0.3

or, better, ::

   0.33

or, better, ::

   0.333

and so on. No matter how many digits you're willing to write down, the result
will never be exactly 1/3, but will be an increasingly better approximation of
1/3.

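To watch the base 10 approximations of 1/3 improve without ever arriving, the
gap can be computed exactly. This is a minimal sketch using the
:mod:`fractions` module; the variable names are illustrative only.

```python
from fractions import Fraction

third = Fraction(1, 3)
errors = []
for digits in range(1, 6):
    # 0.3, 0.33, 0.333, ... written as the exact fractions 3/10, 33/100, ...
    approx = Fraction(int("3" * digits), 10 ** digits)
    errors.append(third - approx)
    print(approx, "misses 1/3 by", third - approx)
```

Each extra digit divides the error by ten, but the error never becomes zero.
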
In the same way, no matter how many base 2 digits you're willing to use, the
decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base
2, 1/10 is the infinitely repeating fraction ::

   0.0001100110011001100110011001100110011001100110011...

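The repeating pattern can be reproduced by ordinary long division carried out
in base 2. The function below is a sketch written for this explanation (the
name ``binary_digits`` is an assumption, not a standard function); it uses only
integer arithmetic, so no rounding is involved.

```python
def binary_digits(numerator, denominator, count):
    """First `count` base 2 digits after the point of numerator/denominator.

    Assumes 0 <= numerator < denominator.
    """
    digits = []
    remainder = numerator
    for _ in range(count):
        # Shift one binary place, then split off the next digit.
        remainder *= 2
        digit, remainder = divmod(remainder, denominator)
        digits.append(str(digit))
    return "".join(digits)

bits = binary_digits(1, 10, 24)
print("0." + bits)
```

The remainder cycles through the same values forever for 1/10, whereas for a
fraction such as 1/8 it reaches zero and the expansion terminates.
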
Stop at any finite number of bits, and you get an approximation. This is why
you see things like::

   >>> 0.1
   0.10000000000000001

On most machines today, that is what you'll see if you enter 0.1 at a Python
prompt. You may not, though, because the number of bits used by the hardware to
store floating-point values can vary across machines, and Python only prints a
decimal approximation to the true decimal value of the binary approximation
stored by the machine. On most machines, if Python were to print the true
decimal value of the binary approximation stored for 0.1, it would have to
display ::

   >>> 0.1
   0.1000000000000000055511151231257827021181583404541015625

instead! The Python prompt uses the built-in :func:`repr` function to obtain a
string version of everything it displays. For floats, ``repr(float)`` rounds
the true decimal value to 17 significant digits, giving ::

   0.10000000000000001

``repr(float)`` produces 17 significant digits because it turns out that's
enough (on most machines) so that ``eval(repr(x)) == x`` exactly for all finite
floats *x*, but rounding to 16 digits is not enough to make that true.

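The claims about 17 versus 16 digits can be tried out directly. The sketch
below asks for the digit counts explicitly with :func:`format`, so it does not
depend on how a particular interpreter's prompt chooses to display floats. It
assumes IEEE-754 doubles and a Python version in which ``Decimal`` accepts a
float argument.

```python
from decimal import Decimal

x = 0.1
exact = Decimal(x)               # exact decimal expansion of the stored binary value
seventeen = format(x, ".17g")    # the same value rounded to 17 significant digits
print(exact)
print(seventeen)

y = 0.1 + 0.2                    # a float for which 16 digits are too few
sixteen_ok = float(format(y, ".16g")) == y
seventeen_ok = float(format(y, ".17g")) == y
print(sixteen_ok, seventeen_ok)
```

Here 16 digits collapse *y* to the string ``'0.3'``, which reads back as a
different float, while the 17-digit string reads back as *y* itself.
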
Note that this is in the very nature of binary floating-point: this is not a bug
in Python, and it is not a bug in your code either. You'll see the same kind of
thing in all languages that support your hardware's floating-point arithmetic
(although some languages may not *display* the difference by default, or in all
output modes).

Python's built-in :func:`str` function produces only 12 significant digits, and
you may wish to use that instead. It's unusual for ``eval(str(x))`` to
reproduce *x*, but the output may be more pleasant to look at::

   >>> print(str(0.1))
   0.1

It's important to realize that this is, in a real sense, an illusion: the value
in the machine is not exactly 1/10; you're simply rounding the *display* of the
true machine value.

Other surprises follow from this one. For example, after seeing ::

   >>> 0.1
   0.10000000000000001

you may be tempted to use the :func:`round` function to chop it back to the
single digit you expect. But that makes no difference::

   >>> round(0.1, 1)
   0.10000000000000001

The problem is that the binary floating-point value stored for "0.1" was already
the best possible binary approximation to 1/10, so trying to round it again
can't make it better: it was already as good as it gets.

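One way to convince yourself that :func:`round` hands back the very same
stored value is to compare exact representations rather than printed digits.
This is a small sketch, assuming IEEE-754 doubles.

```python
from fractions import Fraction

x = 0.1
rounded = round(x, 1)

# Identical hexadecimal forms mean identical stored bits.
same_float = rounded.hex() == x.hex()
# The stored value is still not the rational number 1/10.
still_inexact = Fraction(rounded) != Fraction(1, 10)
print(same_float, still_inexact)
```
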
Another consequence is that since 0.1 is not exactly 1/10, summing ten values of
0.1 may not yield exactly 1.0, either::

   >>> total = 0.0
   >>> for i in range(10):
   ...     total += 0.1
   ...
   >>> total
   0.99999999999999989

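If the accumulated error matters, the standard :mod:`math` module offers two
helpers that are often suggested for this situation. The sketch below is one
possible way to use them, not the only remedy; it adds the values with an
explicit loop, as in the example above, and assumes a Python version that
provides :func:`math.isclose`.

```python
import math

values = [0.1] * 10

naive = 0.0
for v in values:
    naive += v                   # rounds after every single addition

careful = math.fsum(values)      # tracks the lost low-order bits, rounds once

print(naive == 1.0)              # False on IEEE-754 doubles
print(careful == 1.0)            # True
print(math.isclose(naive, 1.0))  # True: equal within a relative tolerance
```

Comparing with a tolerance is usually more robust than testing floats for
exact equality.
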
Binary floating-point arithmetic holds many surprises like this. The problem
with "0.1" is explained in precise detail below, in the "Representation Error"
section. See `The Perils of Floating Point <http://www.lahey.com/float.htm>`_
for a more complete account of other common surprises.

As that article says near the end, "there are no easy answers." Still, don't be
unduly wary of floating-point! The errors in Python float operations are
inherited from the floating-point hardware, and on most machines are on the
order of no more than 1 part in 2\*\*53 per operation. That's more than
adequate for most tasks, but you do need to keep in mind that it's not decimal
arithmetic, and that every float operation can suffer a new rounding error.

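The "1 part in 2\*\*53" figure can be checked exactly by converting floats to
rationals. The helper ``relative_error`` below is a name invented for this
sketch, and the bound shown assumes IEEE-754 doubles.

```python
import sys
from fractions import Fraction

def relative_error(stored, intended):
    """Exact relative error of a float against the rational value it stands for."""
    return abs(Fraction(stored) - intended) / abs(intended)

bound = Fraction(1, 2 ** 53)
err_tenth = relative_error(0.1, Fraction(1, 10))      # one rounding, on input
err_sum = relative_error(0.1 + 0.2, Fraction(3, 10))  # three roundings in total

print(sys.float_info.mant_dig)   # bits of precision, 53 for IEEE-754 doubles
print(err_tenth <= bound)
print(err_sum <= bound)
```

A single conversion stays within the bound, but ``0.1 + 0.2`` involves two
input conversions and an addition, and its combined error is larger than one
part in 2\*\*53. That is the "new rounding error" per operation at work.
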
While pathological cases do exist, for most casual use of floating-point
arithmetic you'll see the result you expect in the end if you simply round the
display of your final results to the number of decimal digits you expect.
:func:`str` usually suffices, and for finer control see the :meth:`str.format`
method's format specifiers in :ref:`formatstrings`.

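A few format specifiers cover most display needs. The following sketch only
changes how the value is shown; the float itself is untouched.

```python
value = 0.1 + 0.2            # stored as a float slightly above 3/10

two_places = "{:.2f}".format(value)     # fixed number of digits after the point
twelve_sig = "{:.12g}".format(value)    # 12 significant digits, trailing zeros dropped
width_ten = "{:>10.3f}".format(value)   # right-aligned in a field of width 10
print(two_places, twelve_sig, repr(width_ten))
```
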
For use cases which require exact decimal representation, try using the
:mod:`decimal` module, which implements decimal arithmetic suitable for
accounting and high-precision applications.

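As a brief sketch of what that looks like in practice, the example below
repeats the "ten times 0.1" sum with ``Decimal`` values. The price and tax rate
are hypothetical figures chosen only for illustration.

```python
from decimal import Decimal

# Build Decimals from strings so the decimal digits are taken literally;
# Decimal(0.1) would instead copy the float's binary approximation.
total = Decimal("0")
for _ in range(10):
    total += Decimal("0.1")

price = Decimal("19.99")                                         # hypothetical amount
with_tax = (price * Decimal("1.07")).quantize(Decimal("0.01"))   # round to cents
print(total, with_tax)
```

The sum comes out as exactly one, and ``quantize`` makes the rounding to two
decimal places an explicit step rather than a side effect of display.
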
Another form of exact arithmetic is supported by the :mod:`fractions` module,
which implements arithmetic based on rational numbers (so numbers like
1/3 can be represented exactly).

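A short comparison, offered as a sketch, shows the difference between rational
and binary floating-point arithmetic on the same kind of calculation.

```python
from fractions import Fraction

third = Fraction(1, 3)
exact_sum = third + third + third        # rational arithmetic, no rounding
float_sum = 0.1 + 0.1 + 0.1              # binary floats round at each step

print(exact_sum == 1)                            # True
print(Fraction(1, 10) * 3 == Fraction(3, 10))    # True
print(float_sum == 0.3)                          # False on IEEE-754 doubles
```
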
If you are a heavy user of floating-point operations, you should take a look
at the Numerical Python package and many other packages for mathematical and
statistical operations supplied by the SciPy project. See <http://scipy.org>.

Python provides tools that may help on those rare occasions when you really
*do* want to know the exact value of a float. The
:meth:`float.as_integer_ratio` method expresses the value of a float as a
fraction::

   >>> x = 3.14159
   >>> x.as_integer_ratio()
   (3537115888337719, 1125899906842624)

Since the ratio is exact, it can be used to losslessly recreate the
original value::

   >>> x == 3537115888337719 / 1125899906842624
   True

The :meth:`float.hex` method expresses a float in hexadecimal (base
16), again giving the exact value stored by your computer::

   >>> x.hex()
   '0x1.921f9f01b866ep+1'

This precise hexadecimal representation can be used to reconstruct
the float value exactly::

   >>> x == float.fromhex('0x1.921f9f01b866ep+1')
   True

Since the representation is exact, it is useful for reliably porting values
across different versions of Python (platform independence) and exchanging
data with other languages that support the same format (such as Java and C99).


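As a sketch of the porting use case, a list of floats can be written out as
hexadecimal strings and read back without any change in value. The sample
values are arbitrary and chosen only for illustration.

```python
from fractions import Fraction

values = [0.1, 3.14159, 1e-300, 2.0 ** 0.5]

# Serialize each float as a hexadecimal string, e.g. for a text file.
encoded = [v.hex() for v in values]

# Parsing the strings back yields the identical floats, with no rounding.
decoded = [float.fromhex(s) for s in encoded]
print(encoded[0], decoded == values)

# The integer ratio carries the same exact information as a fraction.
ratios = [Fraction(*v.as_integer_ratio()) for v in values]
```
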
.. _tut-fp-error:

Representation Error
====================

This section explains the "0.1" example in detail, and shows how you can perform
an exact analysis of cases like this yourself. Basic familiarity with binary
floating-point representation is assumed.

:dfn:`Representation error` refers to the fact that some (most, actually)
decimal fractions cannot be represented exactly as binary (base 2) fractions.
This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many
others) often won't display the exact decimal number you expect::

   >>> 0.1
   0.10000000000000001

Why is that? 1/10 is not exactly representable as a binary fraction. Almost all
machines today (November 2000) use IEEE-754 floating point arithmetic, and
almost all platforms map Python floats to IEEE-754 "double precision". 754
doubles contain 53 bits of precision, so on input the computer strives to
convert 0.1 to the closest fraction it can of the form *J*/2**\ *N* where *J* is
an integer containing exactly 53 bits. Rewriting ::

   1 / 10 ~= J / (2**N)

as ::

   J ~= 2**N / 10

and recalling that *J* has exactly 53 bits (is ``>= 2**52`` but ``< 2**53``),
the best value for *N* is 56::

   >>> 2**52
   4503599627370496
   >>> 2**53
   9007199254740992
   >>> 2**56/10
   7205759403792794.0

That is, 56 is the only value for *N* that leaves *J* with exactly 53 bits. The
best possible value for *J* is then that quotient rounded::

   >>> q, r = divmod(2**56, 10)
   >>> r
   6

Since the remainder is more than half of 10, the best approximation is obtained
by rounding up::

   >>> q+1
   7205759403792794

Therefore the best possible approximation to 1/10 in 754 double precision is
that over 2\*\*56, or ::

   7205759403792794 / 72057594037927936

Note that since we rounded up, this is actually a little bit larger than 1/10;
if we had not rounded up, the quotient would have been a little bit smaller than
1/10. But in no case can it be *exactly* 1/10!

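The steps above can be collected into one small function and checked against
the value Python actually stores. This is a sketch under stated assumptions:
the name ``best_double_fraction`` is invented here, the search only works for
positive fractions smaller than 2\*\*52, and exact ties or a rounded *J* that
reaches 2\*\*53 are not handled.

```python
from fractions import Fraction

def best_double_fraction(numerator, denominator):
    """Closest J / 2**N to numerator/denominator with J holding exactly 53 bits."""
    # Find the N that puts 2**N * numerator / denominator into [2**52, 2**53).
    n = 0
    while (2 ** n) * numerator // denominator < 2 ** 52:
        n += 1
    q, r = divmod((2 ** n) * numerator, denominator)
    # Round to nearest: go up when the remainder exceeds half the divisor.
    j = q + 1 if 2 * r > denominator else q
    return j, n

j, n = best_double_fraction(1, 10)
print(j, n)
print(Fraction(j, 2 ** n) == Fraction(0.1))
```

For 1/10 this reproduces *J* = 7205759403792794 and *N* = 56, and the resulting
fraction is equal to the exact value of the float ``0.1``.
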
So the computer never "sees" 1/10: what it sees is the exact fraction given
above, the best 754 double approximation it can get::

   >>> .1 * 2**56
   7205759403792794.0

If we multiply that fraction by 10\*\*30, we can see the (truncated) value of
its 30 most significant decimal digits::

   >>> 7205759403792794 * 10**30 // 2**56
   100000000000000005551115123125

meaning that the exact number stored in the computer is approximately equal to
the decimal value 0.100000000000000005551115123125. Rounding that to 17
significant digits gives the 0.10000000000000001 that Python displays (well,
will display on any 754-conforming platform that does best-possible input and
output conversions in its C library; yours may not!).

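The same digit extraction works for any float between 0 and 1 once its exact
ratio is known. The helper below is a sketch; its name,
``leading_decimal_digits``, is an assumption made for this example.

```python
def leading_decimal_digits(x, count):
    """First `count` digits after the decimal point of a float in [0, 1), truncated."""
    numerator, denominator = x.as_integer_ratio()   # exact value of the float
    # Pure integer arithmetic, so no further rounding error creeps in.
    return numerator * 10 ** count // denominator

digits = leading_decimal_digits(0.1, 30)
print(digits)
```

Floor division keeps the calculation in integers, which is why the digits can
be trusted all the way to the last place shown.
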