.. _tut-fp-issues:

**************************************************
Floating Point Arithmetic: Issues and Limitations
**************************************************

.. sectionauthor:: Tim Peters <tim_one@users.sourceforge.net>


Floating-point numbers are represented in computer hardware as base 2 (binary)
fractions. For example, the decimal fraction ::

   0.125

has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction ::

   0.001

has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
real difference being that the first is written in base 10 fractional notation,
and the second in base 2.
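
Because its denominator is a power of two, a fraction like 0.125 (that is, 1/8)
can be stored exactly in binary. As a quick check (an added illustration; the
:meth:`float.as_integer_ratio` method is covered later in this chapter)::

   >>> (0.125).as_integer_ratio()
   (1, 8)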

Unfortunately, most decimal fractions cannot be represented exactly as binary
fractions. A consequence is that, in general, the decimal floating-point
numbers you enter are only approximated by the binary floating-point numbers
actually stored in the machine.

The problem is easier to understand at first in base 10. Consider the fraction
1/3. You can approximate that as a base 10 fraction::

   0.3

or, better, ::

   0.33

or, better, ::

   0.333

and so on. No matter how many digits you're willing to write down, the result
will never be exactly 1/3, but will be an increasingly better approximation of
1/3.

In the same way, no matter how many base 2 digits you're willing to use, the
decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base
2, 1/10 is the infinitely repeating fraction ::

   0.0001100110011001100110011001100110011001100110011...

Stop at any finite number of bits, and you get an approximation. On most
machines today, floats are approximated using a binary fraction whose
numerator uses the first 53 bits, starting with the most significant bit, and
whose denominator is a power of two. In the case of 1/10, the binary fraction
is ``3602879701896397 / 2 ** 55``, which is close to, but not exactly
equal to, the true value of 1/10.
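
If you want to verify this, the :meth:`float.as_integer_ratio` method
(described near the end of this chapter) reports the exact numerator and
denominator stored for 0.1; the denominator is indeed ``2 ** 55``::

   >>> (0.1).as_integer_ratio()
   (3602879701896397, 36028797018963968)
   >>> 2 ** 55
   36028797018963968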

Many users are not aware of the approximation because of the way values are
displayed. Python only prints a decimal approximation to the true decimal
value of the binary approximation stored by the machine. On most machines, if
Python were to print the true decimal value of the binary approximation stored
for 0.1, it would have to display ::

   >>> 0.1
   0.1000000000000000055511151231257827021181583404541015625

That is more digits than most people find useful, so Python keeps the number
of digits manageable by displaying a rounded value instead ::

   >>> 1 / 10
   0.1

Just remember, even though the printed result looks like the exact value
of 1/10, the actual stored value is the nearest representable binary fraction.

Interestingly, there are many different decimal numbers that share the same
nearest approximate binary fraction. For example, the numbers ``0.1`` and
``0.10000000000000001`` and
``0.1000000000000000055511151231257827021181583404541015625`` are all
approximated by ``3602879701896397 / 2 ** 55``. Since all of these decimal
values share the same approximation, any one of them could be displayed
while still preserving the invariant ``eval(repr(x)) == x``.
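
As a quick illustration (assuming a Python 3.1 or later interpreter, where the
shortest form is chosen), the longer literal produces the very same float, and
its repr round-trips through :func:`eval`::

   >>> x = 0.10000000000000001
   >>> eval(repr(x)) == x
   True
   >>> repr(x)
   '0.1'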

Historically, the Python prompt and built-in :func:`repr` function would choose
the one with 17 significant digits, ``0.10000000000000001``. Starting with
Python 3.1, Python (on most systems) is now able to choose the shortest of
these and simply display ``0.1``.

Note that this is in the very nature of binary floating-point: this is not a bug
in Python, and it is not a bug in your code either. You'll see the same kind of
thing in all languages that support your hardware's floating-point arithmetic
(although some languages may not *display* the difference by default, or in all
output modes).

For more pleasant output, you may wish to use string formatting to produce a limited number of significant digits::

   >>> import math
   >>> format(math.pi, '.12g')  # give 12 significant digits
   '3.14159265359'

   >>> format(math.pi, '.2f')   # give 2 digits after the point
   '3.14'

   >>> repr(math.pi)
   '3.141592653589793'


It's important to realize that this is, in a real sense, an illusion: you're
simply rounding the *display* of the true machine value.
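
One way to peek behind the rounded display (a small added example) is to ask
for more digits than the default::

   >>> format(0.1, '.20f')
   '0.10000000000000000555'

The extra digits expose the stored binary approximation rather than the exact
decimal 0.1.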

One illusion may beget another. For example, since 0.1 is not exactly 1/10,
summing three values of 0.1 may not yield exactly 0.3, either::

   >>> .1 + .1 + .1 == .3
   False

Also, since 0.1 cannot get any closer to the exact value of 1/10, and
0.3 cannot get any closer to the exact value of 3/10, pre-rounding with the
:func:`round` function cannot help::

   >>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
   False

Though the numbers cannot be made closer to their intended exact values,
the :func:`round` function can be useful for post-rounding so that results
with inexact values become comparable to one another::

   >>> round(.1 + .1 + .1, 10) == round(.3, 10)
   True
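
Python 3.5 and later also provide :func:`math.isclose`, which compares two
floats to within a configurable tolerance instead of testing for exact
equality; shown here as a brief aside::

   >>> import math
   >>> math.isclose(.1 + .1 + .1, .3)
   True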

Binary floating-point arithmetic holds many surprises like this. The problem
with "0.1" is explained in precise detail below, in the "Representation Error"
section. See `The Perils of Floating Point <http://www.lahey.com/float.htm>`_
for a more complete account of other common surprises.

As that says near the end, "there are no easy answers." Still, don't be unduly
wary of floating-point! The errors in Python float operations are inherited
from the floating-point hardware, and on most machines are on the order of no
more than 1 part in 2\*\*53 per operation. That's more than adequate for most
tasks, but you do need to keep in mind that it's not decimal arithmetic and
that every float operation can suffer a new rounding error.

While pathological cases do exist, for most casual use of floating-point
arithmetic you'll see the result you expect in the end if you simply round the
display of your final results to the number of decimal digits you expect.
:func:`str` usually suffices, and for finer control see the :meth:`str.format`
method's format specifiers in :ref:`formatstrings`.

For use cases which require exact decimal representation, try using the
:mod:`decimal` module which implements decimal arithmetic suitable for
accounting applications and high-precision applications.
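
For example (a short added illustration), decimal values constructed from
strings sum exactly, avoiding the surprise shown earlier::

   >>> from decimal import Decimal
   >>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') == Decimal('0.3')
   True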

Another form of exact arithmetic is supported by the :mod:`fractions` module
which implements arithmetic based on rational numbers (so numbers like
1/3 can be represented exactly).
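
For example (again just an illustration), the corresponding rational
computation gives the exact result::

   >>> from fractions import Fraction
   >>> Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10) == Fraction(3, 10)
   True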

If you are a heavy user of floating-point operations you should take a look
at the NumPy package and many other packages for mathematical and
statistical operations supplied by the SciPy project. See <http://scipy.org>.

Python provides tools that may help on those rare occasions when you really
*do* want to know the exact value of a float. The
:meth:`float.as_integer_ratio` method expresses the value of a float as a
fraction::

   >>> x = 3.14159
   >>> x.as_integer_ratio()
   (3537115888337719, 1125899906842624)

Since the ratio is exact, it can be used to losslessly recreate the
original value::

   >>> x == 3537115888337719 / 1125899906842624
   True

The :meth:`float.hex` method expresses a float in hexadecimal (base
16), again giving the exact value stored by your computer::

   >>> x.hex()
   '0x1.921f9f01b866ep+1'

This precise hexadecimal representation can be used to reconstruct
the float value exactly::

   >>> x == float.fromhex('0x1.921f9f01b866ep+1')
   True

Since the representation is exact, it is useful for reliably porting values
across different versions of Python (platform independence) and exchanging
data with other languages that support the same format (such as Java and C99).

Another helpful tool is the :func:`math.fsum` function, which helps mitigate
loss-of-precision during summation. It tracks "lost digits" as values are
added onto a running total. That can make a difference in overall accuracy
so that the errors do not accumulate to the point where they affect the
final total:

   >>> import math
   >>> sum([0.1] * 10) == 1.0
   False
   >>> math.fsum([0.1] * 10) == 1.0
   True

.. _tut-fp-error:

Representation Error
====================

This section explains the "0.1" example in detail, and shows how you can perform
an exact analysis of cases like this yourself. Basic familiarity with binary
floating-point representation is assumed.

:dfn:`Representation error` refers to the fact that some (most, actually)
decimal fractions cannot be represented exactly as binary (base 2) fractions.
This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many
others) often won't display the exact decimal number you expect.

Why is that? 1/10 is not exactly representable as a binary fraction. Almost all
machines today (November 2000) use IEEE-754 floating point arithmetic, and
almost all platforms map Python floats to IEEE-754 "double precision". 754
doubles contain 53 bits of precision, so on input the computer strives to
convert 0.1 to the closest fraction it can of the form *J*/2**\ *N* where *J* is
an integer containing exactly 53 bits. Rewriting ::

   1 / 10 ~= J / (2**N)

as ::

   J ~= 2**N / 10

and recalling that *J* has exactly 53 bits (is ``>= 2**52`` but ``< 2**53``),
the best value for *N* is 56::

   >>> 2**52 <= 2**56 // 10 < 2**53
   True

That is, 56 is the only value for *N* that leaves *J* with exactly 53 bits. The
best possible value for *J* is then that quotient rounded::

   >>> q, r = divmod(2**56, 10)
   >>> r
   6

Since the remainder is more than half of 10, the best approximation is obtained
by rounding up::

   >>> q+1
   7205759403792794

Therefore the best possible approximation to 1/10 in 754 double precision is::

   7205759403792794 / 2 ** 56

Dividing both the numerator and denominator by two reduces the fraction to::

   3602879701896397 / 2 ** 55

Note that since we rounded up, this is actually a little bit larger than 1/10;
if we had not rounded up, the quotient would have been a little bit smaller than
1/10. But in no case can it be *exactly* 1/10!
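
If you would like to verify this yourself (an optional check, not part of the
original derivation), the :mod:`fractions` module can compare the rounded-up
fraction against 1/10 exactly::

   >>> from fractions import Fraction
   >>> Fraction(7205759403792794, 2 ** 56) == Fraction(1, 10)
   False
   >>> Fraction(7205759403792794, 2 ** 56) - Fraction(1, 10)
   Fraction(1, 180143985094819840)

The difference is positive, confirming that the stored value is slightly
larger than 1/10.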

So the computer never "sees" 1/10: what it sees is the exact fraction given
above, the best 754 double approximation it can get::

   >>> 0.1 * 2 ** 55
   3602879701896397.0

If we multiply that fraction by 10\*\*55, we can see the value out to
55 decimal digits::

   >>> 3602879701896397 * 10 ** 55 // 2 ** 55
   1000000000000000055511151231257827021181583404541015625

meaning that the exact number stored in the computer is equal to
the decimal value 0.1000000000000000055511151231257827021181583404541015625.
Instead of displaying the full decimal value, many languages (including
older versions of Python) round the result to 17 significant digits::

   >>> format(0.1, '.17f')
   '0.10000000000000001'

The :mod:`fractions` and :mod:`decimal` modules make these calculations
easy::

   >>> from decimal import Decimal
   >>> from fractions import Fraction

   >>> Fraction.from_float(0.1)
   Fraction(3602879701896397, 36028797018963968)

   >>> (0.1).as_integer_ratio()
   (3602879701896397, 36028797018963968)

   >>> Decimal.from_float(0.1)
   Decimal('0.1000000000000000055511151231257827021181583404541015625')

   >>> format(Decimal.from_float(0.1), '.17')
   '0.10000000000000001'