======================
Design and History FAQ
======================

.. only:: html

   .. contents::


Why does Python use indentation for grouping of statements?
-----------------------------------------------------------

Guido van Rossum believes that using indentation for grouping is extremely
elegant and contributes a lot to the clarity of the average Python program.
Most people learn to love this feature after a while.

Since there are no begin/end brackets there cannot be a disagreement between
grouping perceived by the parser and the human reader. Occasionally C
programmers will encounter a fragment of code like this::

   if (x <= y)
           x++;
           y--;
   z++;

Only the ``x++`` statement is executed if the condition is true, but the
indentation leads many to believe otherwise. Even experienced C programmers will
sometimes stare at it for a long time, wondering why ``y`` is being decremented
even for ``x > y``.

Because there are no begin/end brackets, Python is much less prone to
coding-style conflicts. In C there are many different ways to place the braces.
After becoming used to reading and writing code using a particular style,
it is normal to feel somewhat uneasy when reading (or being required to write)
in a different one.

Many coding styles place begin/end brackets on a line by themselves. This makes
programs considerably longer and wastes valuable screen space, making it harder
to get a good overview of a program. Ideally, a function should fit on one
screen (say, 20--30 lines). 20 lines of Python can do a lot more work than 20
lines of C. This is not solely due to the lack of begin/end brackets -- the
lack of declarations and the high-level data types are also responsible -- but
the indentation-based syntax certainly helps.


Why am I getting strange results with simple arithmetic operations?
-------------------------------------------------------------------

See the next question.


Why are floating-point calculations so inaccurate?
--------------------------------------------------

Users are often surprised by results like this::

   >>> 1.2 - 1.0
   0.19999999999999996

and think it is a bug in Python. It's not. This has little to do with Python,
and much more to do with how the underlying platform handles floating-point
numbers.

The :class:`float` type in CPython uses a C ``double`` for storage. A
:class:`float` object's value is stored in binary floating-point with a fixed
precision (typically 53 bits) and Python uses C operations, which in turn rely
on the hardware implementation in the processor, to perform floating-point
operations. This means that as far as floating-point operations are concerned,
Python behaves like many popular languages including C and Java.

Many numbers that can be written easily in decimal notation cannot be expressed
exactly in binary floating-point. For example, after::

   >>> x = 1.2

the value stored for ``x`` is a (very good) approximation to the decimal value
``1.2``, but is not exactly equal to it. On a typical machine, the actual
stored value is::

   1.0011001100110011001100110011001100110011001100110011 (binary)

which is exactly::

   1.1999999999999999555910790149937383830547332763671875 (decimal)

The typical precision of 53 bits provides Python floats with 15--16
decimal digits of accuracy.

For a fuller explanation, please see the :ref:`floating point arithmetic
<tut-fp-issues>` chapter in the Python tutorial.

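The stored value can be inspected from Python itself. The following is a
minimal sketch using only the standard library, assuming a platform with the
typical 53-bit precision described above; the variable names are illustrative.

```python
from decimal import Decimal
import math

x = 1.2

# Decimal(float) converts the stored binary value exactly, digit for digit.
exact = Decimal(x)
print(exact)

# float.hex() shows the stored significand in hexadecimal.
print(x.hex())

# Comparing with == exposes the rounding error; a tolerance hides it.
difference = 1.2 - 1.0
print(difference == 0.2)
print(math.isclose(difference, 0.2))
```

When exact decimal arithmetic matters, comparing with a tolerance as above or
using the ``decimal`` module throughout are the usual options.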

Why are Python strings immutable?
---------------------------------

There are several advantages.

One is performance: knowing that a string is immutable means we can allocate
space for it at creation time, and the storage requirements are fixed and
unchanging. This is also one of the reasons for the distinction between tuples
and lists.

Another advantage is that strings in Python are considered as "elemental" as
numbers. No amount of activity will change the value 8 to anything else, and in
Python, no amount of activity will change the string "eight" to anything else.

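A short sketch of what immutability looks like in practice; the variable names
are made up for illustration.

```python
s = "eight"

# Item assignment is rejected because str objects cannot be modified in place.
try:
    s[0] = "E"
except TypeError as exc:
    error = str(exc)

# Methods such as upper() return a new string and leave the original untouched.
t = s.upper()
print(s, t)
```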

.. _why-self:

Why must 'self' be used explicitly in method definitions and calls?
-------------------------------------------------------------------

The idea was borrowed from Modula-3. It turns out to be very useful, for a
variety of reasons.

First, it's more obvious that you are using a method or instance attribute
instead of a local variable. Reading ``self.x`` or ``self.meth()`` makes it
absolutely clear that an instance variable or method is used even if you don't
know the class definition by heart. In C++, you can sort of tell by the lack of
a local variable declaration (assuming globals are rare or easily recognizable)
-- but in Python, there are no local variable declarations, so you'd have to
look up the class definition to be sure. Some C++ and Java coding standards
call for instance attributes to have an ``m_`` prefix, so this explicitness is
still useful in those languages, too.

Second, it means that no special syntax is necessary if you want to explicitly
reference or call the method from a particular class. In C++, if you want to
use a method from a base class which is overridden in a derived class, you have
to use the ``::`` operator -- in Python you can write
``baseclass.methodname(self, <argument list>)``. This is particularly useful
for :meth:`__init__` methods, and in general in cases where a derived class
method wants to extend the base class method of the same name and thus has to
call the base class method somehow.

Finally, for instance variables it solves a syntactic problem with assignment:
since local variables in Python are (by definition!) those variables to which a
value is assigned in a function body (and that aren't explicitly declared
global), there has to be some way to tell the interpreter that an assignment was
meant to assign to an instance variable instead of to a local variable, and it
should preferably be syntactic (for efficiency reasons). C++ does this through
declarations, but Python doesn't have declarations and it would be a pity having
to introduce them just for this purpose. Using the explicit ``self.var`` solves
this nicely. Similarly, for using instance variables, having to write
``self.var`` means that references to unqualified names inside a method don't
have to search the instance's dictionary. To put it another way, local
variables and instance variables live in two different namespaces, and you need
to tell Python which namespace to use.

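The points above can be sketched as follows. ``Base`` and ``Derived`` are
hypothetical classes invented for this illustration.

```python
class Base:
    def __init__(self, name):
        # Assigning through self creates an instance attribute.
        self.name = name

    def describe(self):
        return "Base(" + self.name + ")"


class Derived(Base):
    def __init__(self, name, extra):
        # Call the base class method explicitly, passing self by hand.
        Base.__init__(self, name)
        self.extra = extra

    def describe(self):
        # A plain assignment creates a local variable, not an attribute,
        # so self.name is unaffected by the next line.
        name = "ignored"
        return Base.describe(self) + " + " + self.extra


d = Derived("spam", "eggs")
print(d.describe())
```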

Why can't I use an assignment in an expression?
-----------------------------------------------

Many people used to C or Perl complain that they want to use this C idiom:

.. code-block:: c

   while (line = readline(f)) {
       // do something with line
   }

where in Python you're forced to write this::

   while True:
       line = f.readline()
       if not line:
           break
       ...  # do something with line

The reason for not allowing assignment in Python expressions is a common,
hard-to-find bug in those other languages, caused by this construct:

.. code-block:: c

    if (x = 0) {
        // error handling
    }
    else {
        // code that only works for nonzero x
    }

The error is a simple typo: ``x = 0``, which assigns 0 to the variable ``x``,
was written while the comparison ``x == 0`` is certainly what was intended.

Many alternatives have been proposed. Most are hacks that save some typing but
use arbitrary or cryptic syntax or keywords, and fail the simple criterion for
language change proposals: it should intuitively suggest the proper meaning to a
human reader who has not yet been introduced to the construct.

An interesting phenomenon is that most experienced Python programmers recognize
the ``while True`` idiom and don't seem to be missing the assignment in
expression construct much; it's only newcomers who express a strong desire to
add this to the language.

There's an alternative way of spelling this that seems attractive but is
generally less robust than the ``while True`` solution::

   line = f.readline()
   while line:
       ...  # do something with line...
       line = f.readline()

The problem with this is that if you change your mind about exactly how you get
the next line (e.g. you want to change it into ``sys.stdin.readline()``) you
have to remember to change two places in your program -- the second occurrence
is hidden at the bottom of the loop.

The best approach is to use iterators, making it possible to loop through
objects using the ``for`` statement. For example, :term:`file objects
<file object>` support the iterator protocol, so you can write simply::

   for line in f:
       ...  # do something with line...

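As a runnable comparison, the sketch below applies the ``while True`` idiom and
the ``for`` loop to the same input. It uses ``io.StringIO`` as a stand-in for a
real file, and the sample text is made up.

```python
import io

text = "first\nsecond\nthird\n"  # made-up sample data

# Variant 1: the "while True" idiom with a single readline() call.
f = io.StringIO(text)
lines_while = []
while True:
    line = f.readline()
    if not line:
        break
    lines_while.append(line.rstrip("\n"))

# Variant 2: iterate over the file object directly.
f = io.StringIO(text)
lines_for = []
for line in f:
    lines_for.append(line.rstrip("\n"))

print(lines_while == lines_for)
```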


Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
----------------------------------------------------------------------------------------------------------------

As Guido said:

    (a) For some operations, prefix notation just reads better than
    postfix -- prefix (and infix!) operations have a long tradition in
    mathematics which likes notations where the visuals help the
    mathematician thinking about a problem. Compare the easy with which we
    rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of
    doing the same thing using a raw OO notation.

    (b) When I read code that says len(x) I *know* that it is asking for
    the length of something. This tells me two things: the result is an
    integer, and the argument is some kind of container. To the contrary,
    when I read x.len(), I have to already know that x is some kind of
    container implementing an interface or inheriting from a class that
    has a standard len(). Witness the confusion we occasionally have when
    a class that is not implementing a mapping has a get() or keys()
    method, or something that isn't a file has a write() method.

    -- https://mail.python.org/pipermail/python-3000/2006-November/004643.html


Why is join() a string method instead of a list or tuple method?
----------------------------------------------------------------

Strings became much more like other standard types starting in Python 1.6, when
methods were added which give the same functionality that has always been
available using the functions of the string module. Most of these new methods
have been widely accepted, but the one which appears to make some programmers
feel uncomfortable is::

   ", ".join(['1', '2', '4', '8', '16'])

which gives the result::

   "1, 2, 4, 8, 16"

There are two common arguments against this usage.

The first runs along the lines of: "It looks really ugly using a method of a
string literal (string constant)", to which the answer is that it might, but a
string literal is just a fixed value. If the methods are to be allowed on names
bound to strings there is no logical reason to make them unavailable on
literals.

The second objection is typically cast as: "I am really telling a sequence to
join its members together with a string constant". Sadly, you aren't. For some
reason there seems to be much less difficulty with having :meth:`~str.split` as
a string method, since in that case it is easy to see that ::

   "1, 2, 4, 8, 16".split(", ")

is an instruction to a string literal to return the substrings delimited by the
given separator (or, by default, arbitrary runs of white space).

:meth:`~str.join` is a string method because in using it you are telling the
separator string to iterate over a sequence of strings and insert itself between
adjacent elements. This method can be used with any argument which obeys the
rules for sequence objects, including any new classes you might define yourself.
Similar methods exist for bytes and bytearray objects.

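To illustrate the last paragraph, here is a small sketch in which the separator
joins the items of a user-defined class. ``Countdown`` is a hypothetical class
written for this example; in practice ``str.join`` accepts any iterable whose
items are strings.

```python
class Countdown:
    """Hypothetical iterable that yields strings; not part of any library."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        for n in range(self.start, 0, -1):
            yield str(n)


# The separator string iterates over the argument and inserts itself.
joined = ", ".join(Countdown(3))

# The bytes type offers the same method for bytes items.
joined_bytes = b", ".join([b"1", b"2", b"4"])

print(joined)
print(joined_bytes)
```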

How fast are exceptions?
------------------------

A try/except block is extremely efficient if no exceptions are raised. Actually
catching an exception is expensive. In versions of Python prior to 2.0 it was
common to use this idiom::

   try:
       value = mydict[key]
   except KeyError:
       mydict[key] = getvalue(key)
       value = mydict[key]

This only made sense when you expected the dict to have the key almost all the
time. If that wasn't the case, you coded it like this::

   if key in mydict:
       value = mydict[key]
   else:
       value = mydict[key] = getvalue(key)

For this specific case, you could also use ``value = mydict.setdefault(key,
getvalue(key))``, but only if the ``getvalue()`` call is cheap enough because it
is evaluated in all cases.

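The difference in evaluation can be made visible with a small sketch. Here
``getvalue()`` is a hypothetical stand-in for an expensive computation that
records every call it receives.

```python
calls = []


def getvalue(key):
    """Hypothetical expensive computation; records each call."""
    calls.append(key)
    return key.upper()


mydict = {"a": "A"}

# setdefault() receives an already evaluated argument, so getvalue()
# runs even though "a" is present in the dictionary.
value = mydict.setdefault("a", getvalue("a"))
calls_after_setdefault = list(calls)

# The membership test only calls getvalue() for the missing key "b".
for key in ("a", "b"):
    if key in mydict:
        value = mydict[key]
    else:
        value = mydict[key] = getvalue(key)

print(calls_after_setdefault, calls)
```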

Why isn't there a switch or case statement in Python?
-----------------------------------------------------

You can do this easily enough with a sequence of ``if... elif... elif... else``.
There have been some proposals for switch statement syntax, but there is no
consensus (yet) on whether and how to do range tests. See :pep:`275` for
complete details and the current status.

For cases where you need to choose from a very large number of possibilities,
you can create a dictionary mapping case values to functions to call. For
example::

   def function_1(...):
       ...

   functions = {'a': function_1,
                'b': function_2,
                'c': self.method_1, ...}

   func = functions[value]
   func()

For calling methods on objects, you can simplify yet further by using the
:func:`getattr` built-in to retrieve methods with a particular name::

   def visit_a(self, ...):
       ...
   ...

   def dispatch(self, value):
       method_name = 'visit_' + str(value)
       method = getattr(self, method_name)
       method()

It's suggested that you use a prefix for the method names, such as ``visit_`` in
this example. Without such a prefix, if values are coming from an untrusted
source, an attacker would be able to call any method on your object.

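Both dispatch techniques can be combined into a complete, runnable sketch.
``Visitor`` and its methods are hypothetical names chosen for illustration, and
the fallback handler is an addition that the snippets above do not include.

```python
class Visitor:
    """Hypothetical class used only to illustrate prefix-based dispatch."""

    def visit_a(self):
        return "handled a"

    def visit_b(self):
        return "handled b"

    def visit_default(self):
        return "no handler"

    def dispatch(self, value):
        # Only attribute names starting with "visit_" can be looked up.
        method = getattr(self, "visit_" + str(value), self.visit_default)
        return method()


v = Visitor()
results = [v.dispatch("a"), v.dispatch("b"), v.dispatch("zzz")]

# Dictionary-based dispatch with the same fallback.
handlers = {"a": v.visit_a, "b": v.visit_b}
fallback_result = handlers.get("c", v.visit_default)()

print(results, fallback_result)
```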

Can't you emulate threads in the interpreter instead of relying on an OS-specific thread implementation?
--------------------------------------------------------------------------------------------------------

Answer 1: Unfortunately, the interpreter pushes at least one C stack frame for
each Python stack frame. Also, extensions can call back into Python at almost
random moments. Therefore, a complete threads implementation requires thread
support for C.

Answer 2: Fortunately, there is `Stackless Python <https://github.com/stackless-dev/stackless/wiki>`_,
which has a completely redesigned interpreter loop that avoids the C stack.


Why can't lambda expressions contain statements?
------------------------------------------------

Python lambda expressions cannot contain statements because Python's syntactic
framework can't handle statements nested inside expressions. However, in
Python, this is not a serious problem. Unlike lambda forms in other languages,
where they add functionality, Python lambdas are only a shorthand notation if
you're too lazy to define a function.

Functions are already first class objects in Python, and can be declared in a
local scope. Therefore the only advantage of using a lambda instead of a
locally-defined function is that you don't need to invent a name for the
function -- but that's just a local variable to which the function object (which
is exactly the same type of object that a lambda expression yields) is assigned!

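A quick check of the claim that both forms yield the same type of object; the
function names are illustrative.

```python
square_lambda = lambda x: x * x


def square_def(x):
    return x * x


# Both are ordinary function objects of the same type.
same_type = type(square_lambda) is type(square_def)

# The visible difference is the name recorded on the object.
names = (square_lambda.__name__, square_def.__name__)

print(same_type, names)
```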

Can Python be compiled to machine code, C or some other language?
-----------------------------------------------------------------

`Cython <http://cython.org/>`_ compiles a modified version of Python with
optional annotations into C extensions. `Nuitka <http://www.nuitka.net/>`_ is
an up-and-coming compiler of Python into C++ code, aiming to support the full
Python language. For compiling to Java you can consider
`VOC <https://voc.readthedocs.io>`_.


How does Python manage memory?
------------------------------

The details of Python memory management depend on the implementation. The
standard implementation of Python, :term:`CPython`, uses reference counting to
detect inaccessible objects, and another mechanism to collect reference cycles,
periodically executing a cycle detection algorithm which looks for inaccessible
cycles and deletes the objects involved. The :mod:`gc` module provides functions
to perform a garbage collection, obtain debugging statistics, and tune the
collector's parameters.

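The cycle collector can be observed with the sketch below. It assumes CPython;
``Node`` is a hypothetical class, and automatic collection is switched off for
the duration of the demonstration so that the timing is predictable.

```python
import gc
import weakref


class Node:
    """Hypothetical class used to build a reference cycle."""

    def __init__(self):
        self.other = None


gc.disable()  # keep automatic collection from running during the demo

a = Node()
b = Node()
a.other = b
b.other = a
probe = weakref.ref(a)  # lets us check whether the object still exists

# Dropping the names leaves the cycle; reference counts never reach zero.
del a, b
alive_before = probe() is not None

# An explicit collection finds the unreachable cycle and frees it.
gc.collect()
alive_after = probe() is not None

gc.enable()
print(alive_before, alive_after)
```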
Other implementations (such as `Jython <http://www.jython.org>`_ or
`PyPy <http://www.pypy.org>`_), however, can rely on a different mechanism
such as a full-blown garbage collector. This difference can cause some
subtle porting problems if your Python code depends on the behavior of the
reference counting implementation.

In some Python implementations, the following code (which is fine in CPython)
will probably run out of file descriptors::

   for file in very_long_list_of_files:
       f = open(file)
       c = f.read(1)

Indeed, using CPython's reference counting and destructor scheme, each new
assignment to *f* closes the previous file. With a traditional GC, however,
those file objects will only get collected (and closed) at varying and possibly
long intervals.

If you want to write code that will work with any Python implementation,
you should explicitly close the file or use the :keyword:`with` statement;
this will work regardless of memory management scheme::

   for file in very_long_list_of_files:
       with open(file) as f:
           c = f.read(1)


Why doesn't CPython use a more traditional garbage collection scheme?
---------------------------------------------------------------------

For one thing, this is not a C standard feature and hence it's not portable.
(Yes, we know about the Boehm GC library. It has bits of assembler code for
*most* common platforms, not for all of them, and although it is mostly
transparent, it isn't completely transparent; patches are required to get
Python to work with it.)

Traditional GC also becomes a problem when Python is embedded into other
applications. While in a standalone Python it's fine to replace the standard
malloc() and free() with versions provided by the GC library, an application
embedding Python may want to have its *own* substitute for malloc() and free(),
and may not want Python's. Right now, CPython works with anything that
implements malloc() and free() properly.


Why isn't all memory freed when CPython exits?
----------------------------------------------

Objects referenced from the global namespaces of Python modules are not always
deallocated when Python exits. This may happen if there are circular
references. There are also certain bits of memory that are allocated by the C
library that are impossible to free (e.g. a tool like Purify will complain about
these). Python is, however, aggressive about cleaning up memory on exit and
does try to destroy every single object.

If you want to force Python to delete certain things on deallocation, use the
:mod:`atexit` module to run a function that will force those deletions.

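A minimal sketch of registering such a function follows. The ``cache``
dictionary and ``release_resources()`` are hypothetical names invented for this
example.

```python
import atexit

cache = {"connection": "open"}  # made-up module-level state


def release_resources():
    """Hypothetical cleanup hook that drops module-level references."""
    cache.clear()


# Registered functions run at normal interpreter termination.
atexit.register(release_resources)
```

Registered functions are not called when the process is killed by a signal that
Python does not handle or when ``os._exit()`` is used, so this is best treated
as a convenience rather than a guarantee.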

Why are there separate tuple and list data types?
-------------------------------------------------

Lists and tuples, while similar in many respects, are generally used in
fundamentally different ways. Tuples can be thought of as being similar to
Pascal records or C structs; they're small collections of related data which may
be of different types which are operated on as a group. For example, a
Cartesian coordinate is appropriately represented as a tuple of two or three
numbers.

Lists, on the other hand, are more like arrays in other languages. They tend to
hold a varying number of objects all of which have the same type and which are
operated on one-by-one. For example, ``os.listdir('.')`` returns a list of
strings representing the files in the current directory. Functions which
operate on this output would generally not break if you added another file or
two to the directory.

Tuples are immutable, meaning that once a tuple has been created, you can't
replace any of its elements with a new value. Lists are mutable, meaning that
you can always change a list's elements. Only immutable elements can be used as
dictionary keys, and hence only tuples and not lists can be used as keys.

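The dictionary-key restriction is easy to demonstrate. The coordinate values
and labels below are made up.

```python
point = (2.0, 3.5)  # a Cartesian coordinate as a tuple
labels = {point: "start"}

# An equal tuple finds the same entry.
found = labels[(2.0, 3.5)]

# A list with the same contents is rejected as a key.
try:
    labels[[2.0, 3.5]] = "oops"
except TypeError as exc:
    error = str(exc)

print(found, error)
```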
| 475 | |
Andrés Delfino | 8d41278 | 2018-07-07 20:25:47 -0300 | [diff] [blame] | 476 | How are lists implemented in CPython? |
| 477 | ------------------------------------- |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 478 | |
Andrés Delfino | 8d41278 | 2018-07-07 20:25:47 -0300 | [diff] [blame] | 479 | CPython's lists are really variable-length arrays, not Lisp-style linked lists. |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 480 | The implementation uses a contiguous array of references to other objects, and |
| 481 | keeps a pointer to this array and the array's length in a list head structure. |
| 482 | |
| 483 | This makes indexing a list ``a[i]`` an operation whose cost is independent of |
| 484 | the size of the list or the value of the index. |
| 485 | |
| 486 | When items are appended or inserted, the array of references is resized. Some |
| 487 | cleverness is applied to improve the performance of appending items repeatedly; |
| 488 | when the array must be grown, some extra space is allocated so the next few |
| 489 | times don't require an actual resize. |
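
One way to observe the extra allocation, assuming you are running CPython, is
to watch :func:`sys.getsizeof` while appending. The exact sizes are an
implementation detail and differ between versions, so this sketch only counts
how often the reported size changes::

   import sys

   items = []
   sizes = []
   for i in range(64):
       items.append(i)
       sizes.append(sys.getsizeof(items))

   # Each distinct size corresponds to one real resize of the underlying array.
   resize_count = len(set(sizes))

On CPython ``resize_count`` comes out much smaller than the 64 appends
performed, because most appends fit into space that was already allocated.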


How are dictionaries implemented in CPython?
--------------------------------------------

CPython's dictionaries are implemented as resizable hash tables. Compared to
B-trees, this gives better performance for lookup (the most common operation by
far) under most circumstances, and the implementation is simpler.

Dictionaries work by computing a hash code for each key stored in the dictionary
using the :func:`hash` built-in function. The hash code varies widely depending
on the key and a per-process seed; for example, "Python" could hash to
-539294296 while "python", a string that differs by a single bit, could hash
to 1142331976. The hash code is then used to calculate a location in an
internal array where the value will be stored. Assuming that you're storing
keys that all have different hash values, this means that dictionaries take
constant time -- O(1), in Big-O notation -- to retrieve a key.
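
The idea can be sketched as follows. The 8-slot table and the modulo
calculation are simplifying assumptions made for this example; the real
implementation differs in detail, for instance because it has to deal with
keys that land in the same slot::

   keys = ["Python", "python", "spam"]

   # hash() returns an integer; for strings it changes between interpreter
   # runs because of the per-process seed, but stays fixed within one run.
   codes = {key: hash(key) for key in keys}

   # Simplified slot calculation for a hypothetical table with 8 slots.
   TABLE_SIZE = 8
   slots = {key: codes[key] % TABLE_SIZE for key in keys}

   # Equal keys always produce equal hash codes within a run, which is what
   # lets a lookup go straight to the right slot.
   same_code = hash("Py" + "thon") == codes["Python"]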


Why must dictionary keys be immutable?
--------------------------------------

The hash table implementation of dictionaries uses a hash value calculated from
the key value to find the key. If the key were a mutable object, its value
could change, and thus its hash could also change. But since whoever changes
the key object can't tell that it was being used as a dictionary key, it can't
move the entry around in the dictionary. Then, when you try to look up the same
object in the dictionary it won't be found because its hash value is different.
If you tried to look up the old value it wouldn't be found either, because the
value of the object found in that hash bin would be different.

If you want a dictionary indexed with a list, simply convert the list to a tuple
first; the function ``tuple(L)`` creates a tuple with the same entries as the
list ``L``. Tuples are immutable and can therefore be used as dictionary keys.

Some unacceptable solutions that have been proposed:

- Hash lists by their address (object ID). This doesn't work because if you
  construct a new list with the same value it won't be found; e.g.::

     mydict = {[1, 2]: '12'}
     print(mydict[[1, 2]])

  would raise a :exc:`KeyError` exception because the id of the ``[1, 2]`` used in the
  second line differs from that in the first line. In other words, dictionary
  keys should be compared using ``==``, not using :keyword:`is`.

- Make a copy when using a list as a key. This doesn't work because the list,
  being a mutable object, could contain a reference to itself, and then the
  copying code would run into an infinite loop.

- Allow lists as keys but tell the user not to modify them. This would allow a
  class of hard-to-track bugs in programs when you forget, or modify a list by
  accident. It also invalidates an important invariant of dictionaries: every
  value in ``d.keys()`` is usable as a key of the dictionary.

- Mark lists as read-only once they are used as a dictionary key. The problem
  is that it's not just the top-level object that could change its value; you
  could use a tuple containing a list as a key. Entering anything as a key into
  a dictionary would require marking all objects reachable from there as
  read-only -- and again, self-referential objects could cause an infinite loop.

There is a trick to get around this if you need to, but use it at your own risk:
You can wrap a mutable structure inside a class instance which has both a
:meth:`__eq__` and a :meth:`__hash__` method. You must then make sure that the
hash value for all such wrapper objects that reside in a dictionary (or other
hash based structure) remains fixed while the object is in the dictionary (or
other structure). ::

   class ListWrapper:
       def __init__(self, the_list):
           self.the_list = the_list

       def __eq__(self, other):
           return self.the_list == other.the_list

       def __hash__(self):
           l = self.the_list
           result = 98767 - len(l)*555
           for i, el in enumerate(l):
               try:
                   result = result + (hash(el) % 9999999) * 1001 + i
               except Exception:
                   result = (result % 7777777) + i * 333
           return result

Note that the hash computation is complicated by the possibility that some
members of the list may be unhashable and also by the possibility of arithmetic
overflow.

Furthermore it must always be the case that if ``o1 == o2`` (i.e. ``o1.__eq__(o2)
is True``) then ``hash(o1) == hash(o2)`` (i.e. ``o1.__hash__() == o2.__hash__()``),
regardless of whether the object is in a dictionary or not. If you fail to meet
these restrictions, dictionaries and other hash based structures will misbehave.

In the case of ``ListWrapper``, whenever the wrapper object is in a dictionary the
wrapped list must not change to avoid anomalies. Don't do this unless you are
prepared to think hard about the requirements and the consequences of not
meeting them correctly. Consider yourself warned.
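
To see the kind of anomaly meant here, consider this sketch. It uses a
simplified variant of the wrapper whose :meth:`__hash__` just hashes a tuple
copy of the list (an assumption made to keep the example short; it only works
when all elements are hashable)::

   class ListWrapper:
       """Simplified variant of the wrapper above; assumes hashable elements."""

       def __init__(self, the_list):
           self.the_list = the_list

       def __eq__(self, other):
           return self.the_list == other.the_list

       def __hash__(self):
           return hash(tuple(self.the_list))


   key = ListWrapper([1, 2])
   d = {key: "value"}
   found_before = ListWrapper([1, 2]) in d         # equal value, equal hash

   key.the_list.append(3)                          # mutate the list behind the key

   # The entry is still filed under the hash of [1, 2], so neither the new
   # value nor the old value locates it any more.
   found_new_value = ListWrapper([1, 2, 3]) in d   # hash no longer matches
   found_old_value = ListWrapper([1, 2]) in d      # stored key no longer equal
   entry_count = len(d)                            # the entry itself remains

After the mutation the dictionary still holds one entry, but a lookup by
either value fails, which is exactly the hard-to-track behaviour the warning
above refers to.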


Why doesn't list.sort() return the sorted list?
-----------------------------------------------

In situations where performance matters, making a copy of the list just to sort
it would be wasteful. Therefore, :meth:`list.sort` sorts the list in place. In
order to remind you of that fact, it does not return the sorted list. This way,
you won't be fooled into accidentally overwriting a list when you need a sorted
copy but also need to keep the unsorted version around.

If you want to return a new list, use the built-in :func:`sorted` function
instead. This function creates a new list from a provided iterable, sorts
it and returns it. For example, here's how to iterate over the keys of a
dictionary in sorted order::

   for key in sorted(mydict):
       ...  # do whatever with mydict[key]...
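
The difference between the two can be checked directly; ``numbers`` is just a
sample list for the example::

   numbers = [3, 1, 2]

   # sorted() builds and returns a new list; the original is left untouched.
   ordered_copy = sorted(numbers)
   original_after_sorted = list(numbers)

   # list.sort() reorders the list in place and returns None.
   result = numbers.sort()

A common mistake is writing ``numbers = numbers.sort()``, which replaces the
list with ``None``.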


How do you specify and enforce an interface spec in Python?
-----------------------------------------------------------

An interface specification for a module as provided by languages such as C++ and
Java describes the prototypes for the methods and functions of the module. Many
feel that compile-time enforcement of interface specifications helps in the
construction of large programs.

The :mod:`abc` module, added in Python 2.6, lets you define Abstract Base Classes
(ABCs). You can then use :func:`isinstance` and :func:`issubclass` to check
whether an instance or a class implements a particular ABC. The
:mod:`collections.abc` module defines a set of useful ABCs such as
:class:`~collections.abc.Iterable`, :class:`~collections.abc.Container`, and
:class:`~collections.abc.MutableMapping`.
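
A minimal sketch of how this looks in practice follows. ``Shape`` and
``Square`` are hypothetical classes invented for the example::

   from abc import ABC, abstractmethod
   from collections.abc import Iterable


   class Shape(ABC):
       """Hypothetical interface: every shape must provide area()."""

       @abstractmethod
       def area(self):
           ...


   class Square(Shape):
       def __init__(self, side):
           self.side = side

       def area(self):
           return self.side ** 2


   square = Square(3)
   is_shape = isinstance(square, Shape)
   is_subclass = issubclass(Square, Shape)
   list_is_iterable = isinstance([1, 2], Iterable)
   int_is_iterable = isinstance(5, Iterable)

   # A class that leaves an abstract method unimplemented cannot be instantiated.
   try:
       Shape()
   except TypeError:
       abstract_rejected = True
   else:
       abstract_rejected = False

Note that the check happens when the class is instantiated, not at compile
time.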

For Python, many of the advantages of interface specifications can be obtained
by an appropriate test discipline for components. There is also a tool,
PyChecker, which can be used to find problems due to subclassing.

A good test suite for a module can both provide a regression test and serve as a
module interface specification and a set of examples. Many Python modules can
be run as a script to provide a simple "self test." Even modules which use
complex external interfaces can often be tested in isolation using trivial
"stub" emulations of the external interface. The :mod:`doctest` and
:mod:`unittest` modules or third-party test frameworks can be used to construct
exhaustive test suites that exercise every line of code in a module.

An appropriate testing discipline can help build large complex applications in
Python as well as having interface specifications would. In fact, it can be
better because an interface specification cannot test certain properties of a
program. For example, the :meth:`append` method is expected to add new elements
to the end of some internal list; an interface specification cannot test that
your :meth:`append` implementation will actually do this correctly, but it's
trivial to check this property in a test suite.
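
Such a check might look like the following sketch, which uses :mod:`unittest`.
The ``Stack`` class and its ``items()`` helper are hypothetical and exist only
to give the test something to exercise::

   import unittest


   class Stack:
       """Hypothetical container whose append() should add to the end."""

       def __init__(self):
           self._items = []

       def append(self, item):
           self._items.append(item)

       def items(self):
           return list(self._items)


   class AppendTest(unittest.TestCase):
       def test_append_adds_to_end(self):
           stack = Stack()
           stack.append("first")
           stack.append("second")
           self.assertEqual(stack.items(), ["first", "second"])

Saved in a module, the test can be run with ``python -m unittest``.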

Writing test suites is very helpful, and you might want to design your code so
that it is easy to test. One increasingly popular technique, test-driven
development, calls for writing parts of the test suite first, before you write
any of the actual code. Of course Python allows you to be sloppy and not write
test cases at all.


Why is there no goto?
---------------------

You can use exceptions to provide a "structured goto" that even works across
function calls. Many feel that exceptions can conveniently emulate all
reasonable uses of the "go" or "goto" constructs of C, Fortran, and other
languages. For example::

   class label(Exception): pass  # declare a label

   try:
       ...
       if condition: raise label()  # goto label
       ...
   except label:  # where to goto
       pass
   ...

This doesn't allow you to jump into the middle of a loop, but that's usually
considered an abuse of goto anyway. Use sparingly.
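
A concrete, runnable instance of the pattern is leaving two nested loops at
once. The ``Found`` exception and the sample ``grid`` are names chosen for
this sketch::

   class Found(Exception):
       """Plays the role of the label."""


   grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
   target = 5
   position = None

   try:
       for row_index, row in enumerate(grid):
           for col_index, value in enumerate(row):
               if value == target:
                   position = (row_index, col_index)
                   raise Found()          # "goto" the except clause below
   except Found:
       pass                               # execution resumes here

In many cases moving the loops into a function and using ``return`` is a
simpler alternative.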


Why can't raw strings (r-strings) end with a backslash?
-------------------------------------------------------

More precisely, they can't end with an odd number of backslashes: the unpaired
backslash at the end escapes the closing quote character, leaving an
unterminated string.

Raw strings were designed to ease creating input for processors (chiefly regular
expression engines) that want to do their own backslash escape processing. Such
processors consider an unmatched trailing backslash to be an error anyway, so
raw strings disallow that. In return, they allow you to pass on the string
quote character by escaping it with a backslash. These rules work well when
r-strings are used for their intended purpose.

If you're trying to build Windows pathnames, note that all Windows system calls
accept forward slashes too::

   f = open("/mydir/file.txt")  # works fine!

If you're trying to build a pathname for a DOS command, try e.g. one of ::

   dir = r"\this\is\my\dos\dir" "\\"
   dir = r"\this\is\my\dos\dir\ "[:-1]
   dir = "\\this\\is\\my\\dos\\dir\\"
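
The three spellings produce the same string, which this sketch verifies. It
uses different variable names to avoid shadowing the built-in :func:`dir`, and
also shows what an escaped quote looks like inside a raw string::

   # Three ways to spell the same directory name with a trailing backslash.
   concatenated = r"\this\is\my\dos\dir" "\\"
   sliced = r"\this\is\my\dos\dir\ "[:-1]
   escaped = "\\this\\is\\my\\dos\\dir\\"

   all_equal = concatenated == sliced == escaped

   # In a raw string, a backslash before a quote keeps the quote from ending
   # the literal, but the backslash itself stays in the result.
   quoted = r"\""
   quoted_length = len(quoted)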


Why doesn't Python have a "with" statement for attribute assignments?
---------------------------------------------------------------------

Python has a ``with`` statement that wraps the execution of a block, calling code
on the entrance and exit from the block. Some languages have a construct that
looks like this::

   with obj:
       a = 1               # equivalent to obj.a = 1
       total = total + 1   # obj.total = obj.total + 1

In Python, such a construct would be ambiguous.

Other languages, such as Object Pascal, Delphi, and C++, use static types, so
it's possible to know, in an unambiguous way, what member is being assigned
to. This is the main point of static typing -- the compiler *always* knows the
scope of every variable at compile time.

Python uses dynamic types. It is impossible to know in advance which attribute
will be referenced at runtime. Member attributes may be added or removed from
objects on the fly. This makes it impossible to know, from a simple reading,
what attribute is being referenced: a local one, a global one, or a member
attribute?

For instance, take the following incomplete snippet::

   def foo(a):
       with a:
           print(x)

The snippet assumes that "a" must have a member attribute called "x". However,
there is nothing in Python that tells the interpreter this. What should happen
if "a" is, let us say, an integer? If there is a global variable named "x",
will it be used inside the with block? As you see, the dynamic nature of Python
makes such choices much harder.

The primary benefit of "with" and similar language features (reduction of code
volume) can, however, easily be achieved in Python by assignment. Instead of::

   function(args).mydict[index][index].a = 21
   function(args).mydict[index][index].b = 42
   function(args).mydict[index][index].c = 63

write this::

   ref = function(args).mydict[index][index]
   ref.a = 21
   ref.b = 42
   ref.c = 63

This also has the side-effect of increasing execution speed because name
bindings are resolved at run-time in Python, and the second version only needs
to perform the resolution once.
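
The saving can be made visible with a small sketch. ``get_record`` is a
hypothetical accessor standing in for the longer expression, and it records
each call so the two styles can be compared::

   from types import SimpleNamespace

   calls = []
   record = SimpleNamespace()


   def get_record():
       """Hypothetical accessor that notes every time it is called."""
       calls.append(1)
       return record


   # Repeating the full expression evaluates get_record() three times.
   get_record().a = 21
   get_record().b = 42
   get_record().c = 63
   calls_without_ref = len(calls)

   # Binding the result to a local name evaluates it only once.
   calls.clear()
   ref = get_record()
   ref.a = 21
   ref.b = 42
   ref.c = 63
   calls_with_ref = len(calls)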


Why are colons required for the if/while/def/class statements?
--------------------------------------------------------------

The colon is required primarily to enhance readability (one of the results of
the experimental ABC language). Consider this::

   if a == b
       print(a)

versus ::

   if a == b:
       print(a)

Notice how the second one is slightly easier to read. Notice further how a
colon sets off the example in this FAQ answer; it's a standard usage in English.

Another minor reason is that the colon makes it easier for editors with syntax
highlighting; they can look for colons to decide when indentation needs to be
increased instead of having to do a more elaborate parsing of the program text.


Why does Python allow commas at the end of lists and tuples?
------------------------------------------------------------

Python lets you add a trailing comma at the end of lists, tuples, and
dictionaries::

   [1, 2, 3,]
   ('a', 'b', 'c',)
   d = {
       "A": [1, 5],
       "B": [6, 7],  # last trailing comma is optional but good style
   }


There are several reasons to allow this.

When you have a literal value for a list, tuple, or dictionary spread across
multiple lines, it's easier to add more elements because you don't have to
remember to add a comma to the previous line. The lines can also be reordered
without creating a syntax error.

Accidentally omitting the comma can lead to errors that are hard to diagnose.
For example::

   x = [
       "fee",
       "fie"
       "foo",
       "fum"
   ]

This list looks like it has four elements, but it actually contains three:
"fee", "fiefoo" and "fum". Always adding the comma avoids this source of error.

Allowing the trailing comma may also make programmatic code generation easier.
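
Both points can be checked with a short sketch. The code-generation part is
only an illustration: ``names`` and ``source`` are made-up names, and
:func:`ast.literal_eval` is used here simply to confirm that the generated
text is a valid list literal::

   import ast

   # A missing comma silently joins adjacent string literals.
   x = [
       "fee",
       "fie"
       "foo",
       "fum"
   ]
   element_count = len(x)

   # Generated code can emit "item," for every entry without treating the
   # last one specially; the trailing comma is simply accepted.
   names = ["alpha", "beta", "gamma"]
   source = "[\n" + "".join(f"    {name!r},\n" for name in names) + "]"
   parsed = ast.literal_eval(source)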