.. _socket-howto:

****************************
  Socket Programming HOWTO
****************************

:Author: Gordon McMillan


.. topic:: Abstract

   Sockets are used nearly everywhere, but are one of the most severely
   misunderstood technologies around. This is a 10,000 foot overview of sockets.
   It's not really a tutorial - you'll still have work to do in getting things
   operational. It doesn't cover the fine points (and there are a lot of them), but
   I hope it will give you enough background to begin using them decently.


Sockets
=======

I'm only going to talk about INET (i.e. IPv4) sockets, but they account for at least 99% of
the sockets in use. And I'll only talk about STREAM (i.e. TCP) sockets - unless you really
know what you're doing (in which case this HOWTO isn't for you!), you'll get
better behavior and performance from a STREAM socket than anything else. I will
try to clear up the mystery of what a socket is, as well as some hints on how to
work with blocking and non-blocking sockets. But I'll start by talking about
blocking sockets. You'll need to know how they work before dealing with
non-blocking sockets.

Part of the trouble with understanding these things is that "socket" can mean a
number of subtly different things, depending on context. So first, let's make a
distinction between a "client" socket - an endpoint of a conversation, and a
"server" socket, which is more like a switchboard operator. The client
application (your browser, for example) uses "client" sockets exclusively; the
web server it's talking to uses both "server" sockets and "client" sockets.


History
-------

Of the various forms of :abbr:`IPC (Inter Process Communication)`,
sockets are by far the most popular. On any given platform, there are
likely to be other forms of IPC that are faster, but for
cross-platform communication, sockets are about the only game in town.

They were invented in Berkeley as part of the BSD flavor of Unix. They spread
like wildfire with the Internet. With good reason --- the combination of sockets
with INET makes talking to arbitrary machines around the world unbelievably easy
(at least compared to other schemes).


Creating a Socket
=================

Roughly speaking, when you clicked on the link that brought you to this page,
your browser did something like the following::

   import socket

   # create an INET, STREAMing socket
   s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   # now connect to the web server on port 80 - the normal http port
   s.connect(("www.python.org", 80))

When the ``connect`` completes, the socket ``s`` can be used to send
in a request for the text of the page. The same socket will read the
reply, and then be destroyed. That's right, destroyed. Client sockets
are normally only used for one exchange (or a small set of sequential
exchanges).

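To make that exchange a little more concrete, here is a minimal sketch of the whole client side. It assumes a server that closes the connection once it has finished replying (as an HTTP/1.0 server does); the function name ``fetch`` and its parameters are made up for illustration.

```python
import socket


def fetch(host, port, request, bufsize=4096):
    """Send one request and return everything the server replies with.

    Assumption: the server closes the connection when its reply is complete.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(request)            # keep sending until the whole request is out
        chunks = []
        while True:
            chunk = s.recv(bufsize)
            if chunk == b"":          # 0 bytes: the server closed the connection
                break
            chunks.append(chunk)
    # leaving the "with" block closes (destroys) the client socket
    return b"".join(chunks)
```

Something like ``fetch("www.python.org", 80, b"GET / HTTP/1.0\r\nHost: www.python.org\r\n\r\n")`` would be one way to call it.
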
What happens in the web server is a bit more complex. First, the web server
creates a "server socket"::

   # create an INET, STREAMing socket
   serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   # bind the socket to a public host, and a well-known port
   serversocket.bind((socket.gethostname(), 80))
   # become a server socket
   serversocket.listen(5)

A couple of things to notice: we used ``socket.gethostname()`` so that the socket
would be visible to the outside world. If we had used ``s.bind(('localhost',
80))`` or ``s.bind(('127.0.0.1', 80))`` we would still have a "server" socket,
but one that was only visible within the same machine. ``s.bind(('', 80))``
specifies that the socket is reachable by any address the machine happens to
have.

A second thing to note: low number ports are usually reserved for "well known"
services (HTTP, SNMP, etc.). If you're playing around, use a nice high number (4
digits).

Finally, the argument to ``listen`` tells the socket library that we want it to
queue up as many as 5 connect requests (the normal max) before refusing outside
connections. If the rest of the code is written properly, that should be plenty.

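If you want to play with this, the three steps can be wrapped in a small helper. This is only a sketch: the name ``make_server_socket`` and its defaults (``localhost``, a high port, a backlog of 5) are my assumptions, not anything the socket library prescribes. Passing port ``0`` asks the operating system to pick a free port.

```python
import socket


def make_server_socket(host="localhost", port=8080, backlog=5):
    """Create an INET, STREAMing socket, bind it, and start listening."""
    serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        serversocket.bind((host, port))
        serversocket.listen(backlog)
    except OSError:
        # don't leak the socket if the address is unavailable
        serversocket.close()
        raise
    return serversocket
```

After the call, ``serversocket.getsockname()`` tells you the address and port the socket actually ended up bound to.
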
Now that we have a "server" socket, listening on port 80, we can enter the
mainloop of the web server::

   while True:
       # accept connections from outside
       (clientsocket, address) = serversocket.accept()
       # now do something with the clientsocket
       # in this case, we'll pretend this is a threaded server
       ct = client_thread(clientsocket)
       ct.run()

There are actually three general ways in which this loop could work: dispatching
a thread to handle ``clientsocket``, creating a new process to handle
``clientsocket``, or restructuring this app to use non-blocking sockets and
multiplexing between our "server" socket and any active ``clientsocket``\ s using
``select``. More about that later. The important thing to understand now is
this: this is *all* a "server" socket does. It doesn't send any data. It doesn't
receive any data. It just produces "client" sockets. Each ``clientsocket`` is
created in response to some *other* "client" socket doing a ``connect()`` to the
host and port we're bound to. As soon as we've created that ``clientsocket``, we
go back to listening for more connections. The two "clients" are free to chat it
up - they are using some dynamically allocated port which will be recycled when
the conversation ends.

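For the first of those options, a thread per connection, the loop might be fleshed out roughly as follows. This is a sketch under assumptions: ``handle_client`` and ``serve_forever`` are names I made up, the "protocol" is a plain echo, and :mod:`threading` is just one way to get a thread.

```python
import socket
import threading


def handle_client(clientsocket):
    """Echo back whatever the client sends until it closes the connection."""
    with clientsocket:
        while True:
            data = clientsocket.recv(4096)
            if data == b"":           # the client has closed its end
                break
            clientsocket.sendall(data)


def serve_forever(serversocket):
    """Accept connections and hand each one to its own thread."""
    while True:
        try:
            clientsocket, address = serversocket.accept()
        except OSError:
            break                     # accept() failed, e.g. the socket was closed
        ct = threading.Thread(target=handle_client, args=(clientsocket,),
                              daemon=True)
        ct.start()                    # start() runs handle_client in the new thread
```

Note that the "server" socket itself never sends or receives anything here; all the talking happens on the sockets that ``accept`` hands back.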

IPC
---

If you need fast IPC between two processes on one machine, you should look into
pipes or shared memory. If you do decide to use AF_INET sockets, bind the
"server" socket to ``'localhost'``. On most platforms, this will take a
shortcut around a couple of layers of network code and be quite a bit faster.

.. seealso::
   The :mod:`multiprocessing` module integrates cross-platform IPC into a
   higher-level API.


Using a Socket
==============

The first thing to note is that the web browser's "client" socket and the web
server's "client" socket are identical beasts. That is, this is a "peer to peer"
conversation. Or to put it another way, *as the designer, you will have to
decide what the rules of etiquette are for a conversation*. Normally, the
``connect``\ ing socket starts the conversation, by sending in a request, or
perhaps a signon. But that's a design decision - it's not a rule of sockets.

Now there are two sets of verbs to use for communication. You can use ``send``
and ``recv``, or you can transform your client socket into a file-like beast and
use ``read`` and ``write``. The latter is the way Java presents its sockets.
I'm not going to talk about it here, except to warn you that you need to use
``flush`` on sockets. These are buffered "files", and a common mistake is to
``write`` something, and then ``read`` for a reply. Without a ``flush`` in
there, you may wait forever for the reply, because the request may still be in
your output buffer.

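In Python the file-like beast comes from the socket's ``makefile`` method. The sketch below shows where the ``flush`` goes; the helper name ``ask`` and the one-line-per-message convention are assumptions made for the example.

```python
import socket


def ask(f, line):
    """Write one line to a buffered socket file and read one line back."""
    f.write(line + b"\n")
    f.flush()                 # without this, the request may sit in the buffer
    return f.readline()


# Typical use, given a connected socket ``sock``:
#
#     with sock.makefile("rwb") as f:
#         reply = ask(f, b"hello")
```

Keep one file object around for the life of the connection; the reading side is buffered too, so a throwaway wrapper can swallow bytes that belong to a later message.
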
Now we come to the major stumbling block of sockets - ``send`` and ``recv`` operate
on the network buffers. They do not necessarily handle all the bytes you hand
them (or expect from them), because their major focus is handling the network
buffers. In general, they return when the associated network buffers have been
filled (``send``) or emptied (``recv``). They then tell you how many bytes they
handled. It is *your* responsibility to call them again until your message has
been completely dealt with.

When a ``recv`` returns 0 bytes, it means the other side has closed (or is in
the process of closing) the connection. You will not receive any more data on
this connection. Ever. You may be able to send data successfully; I'll talk
more about this later.

A protocol like HTTP uses a socket for only one transfer. The client sends a
request, then reads a reply. That's it. The socket is discarded. This means that
a client can detect the end of the reply by receiving 0 bytes.

But if you plan to reuse your socket for further transfers, you need to realize
that *there is no* :abbr:`EOT (End of Transfer)` *on a socket.* I repeat: if a socket
``send`` or ``recv`` returns after handling 0 bytes, the connection has been
broken. If the connection has *not* been broken, you may wait on a ``recv``
forever, because the socket will *not* tell you that there's nothing more to
read (for now). Now if you think about that a bit, you'll come to realize a
fundamental truth of sockets: *messages must either be fixed length* (yuck), *or
be delimited* (shrug), *or indicate how long they are* (much better), *or end by
shutting down the connection*. The choice is entirely yours (but some ways are
righter than others).

Assuming you don't want to end the connection, the simplest solution is a fixed
length message::

   import socket

   # the fixed message length both ends agree on
   # (the value here is only an illustration)
   MSGLEN = 1024

   class MySocket:
       """demonstration class only
         - coded for clarity, not efficiency
       """

       def __init__(self, sock=None):
           if sock is None:
               self.sock = socket.socket(
                   socket.AF_INET, socket.SOCK_STREAM)
           else:
               self.sock = sock

       def connect(self, host, port):
           self.sock.connect((host, port))

       def mysend(self, msg):
           # keep calling send until all MSGLEN bytes have been handed over
           totalsent = 0
           while totalsent < MSGLEN:
               sent = self.sock.send(msg[totalsent:])
               if sent == 0:
                   raise RuntimeError("socket connection broken")
               totalsent = totalsent + sent

       def myreceive(self):
           # keep calling recv until exactly MSGLEN bytes have arrived
           chunks = []
           bytes_recd = 0
           while bytes_recd < MSGLEN:
               chunk = self.sock.recv(min(MSGLEN - bytes_recd, 2048))
               if chunk == b'':
                   raise RuntimeError("socket connection broken")
               chunks.append(chunk)
               bytes_recd = bytes_recd + len(chunk)
           return b''.join(chunks)

The sending code here is usable for almost any messaging scheme - in Python you
send bytes, and you can use ``len()`` to determine their length (even if they
have embedded ``\0`` characters). It's mostly the receiving code that gets more
complex. (And in C, it's not much worse, except you can't use ``strlen`` if the
message has embedded ``\0``\ s.)

The easiest enhancement is to make the first character of the message an
indicator of message type, and have the type determine the length. Now you have
two ``recv``\ s - the first to get (at least) that first character so you can
look up the length, and the second in a loop to get the rest. If you decide to
go the delimited route, you'll be receiving in some arbitrary chunk size (4096
or 8192 is frequently a good match for network buffer sizes), and scanning what
you've received for a delimiter.

One complication to be aware of: if your conversational protocol allows multiple
messages to be sent back to back (without some kind of reply), and you pass
``recv`` an arbitrary chunk size, you may end up reading the start of a
following message. You'll need to put that aside and hold onto it, until it's
needed.

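Here is one way the delimited route, including the "put it aside" part, could look. It is a sketch, not the one true design: the class name ``DelimitedReceiver``, the newline delimiter and the 4096 byte chunk size are all choices made for the example.

```python
import socket


class DelimitedReceiver:
    """Receive delimited messages, holding on to any extra bytes for later."""

    def __init__(self, sock, delimiter=b"\n", bufsize=4096):
        self.sock = sock
        self.delimiter = delimiter
        self.bufsize = bufsize
        self.buffer = b""     # bytes received but not yet handed out

    def receive(self):
        """Return the next complete message, without its delimiter."""
        while self.delimiter not in self.buffer:
            chunk = self.sock.recv(self.bufsize)
            if chunk == b"":
                raise RuntimeError("socket connection broken")
            self.buffer += chunk
        # whatever follows the delimiter is the start of a later message;
        # it stays in self.buffer until the next call
        message, _, self.buffer = self.buffer.partition(self.delimiter)
        return message
```

Because the leftover bytes live on the object, create one receiver per connection and keep using it for every message on that connection.
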
Prefixing the message with its length (say, as 5 numeric characters) gets more
complex, because (believe it or not), you may not get all 5 characters in one
``recv``. In playing around, you'll get away with it; but in high network loads,
your code will very quickly break unless you use two ``recv`` loops - the first
to determine the length, the second to get the data part of the message. Nasty.
This is also when you'll discover that ``send`` does not always manage to get
rid of everything in one pass. And despite having read this, you will eventually
get bitten by it!

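If you'd like to check your own attempt against something, here is a minimal sketch of the length-prefix scheme with a 5 digit ASCII length. The names ``PREFIX_LEN``, ``recv_exactly``, ``send_message`` and ``recv_message`` are invented for the example.

```python
import socket

PREFIX_LEN = 5   # the length travels as 5 ASCII digits, e.g. b"00042"


def recv_exactly(sock, count):
    """Loop on recv until exactly ``count`` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(min(remaining, 2048))
        if chunk == b"":
            raise RuntimeError("socket connection broken")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send ``payload`` preceded by its zero-padded length."""
    if len(payload) >= 10 ** PREFIX_LEN:
        raise ValueError("payload too long for the length prefix")
    msg = str(len(payload)).zfill(PREFIX_LEN).encode("ascii") + payload
    totalsent = 0
    while totalsent < len(msg):       # send may not take everything at once
        sent = sock.send(msg[totalsent:])
        if sent == 0:
            raise RuntimeError("socket connection broken")
        totalsent += sent


def recv_message(sock):
    """First loop gets the length, second loop gets the data."""
    length = int(recv_exactly(sock, PREFIX_LEN).decode("ascii"))
    return recv_exactly(sock, length)
```

Both receive steps go through the same loop, so a length field that arrives split across two ``recv`` calls is handled like any other short read.
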
In the interests of space, building your character, (and preserving my
competitive position), these enhancements are left as an exercise for the
reader. Let's move on to cleaning up.


Binary Data
-----------

It is perfectly possible to send binary data over a socket. The major problem is
that not all machines use the same formats for binary data. For example, a
Motorola chip will represent a 16 bit integer with the value 1 as the two hex
bytes 00 01. Intel and DEC, however, are byte-reversed - that same 1 is 01 00.
Socket libraries have calls for converting 16 and 32 bit integers - ``ntohl,
htonl, ntohs, htons`` where "n" means *network* and "h" means *host*, "s" means
*short* and "l" means *long*. Where network order is host order, these do
nothing, but where the machine is byte-reversed, these swap the bytes around
appropriately.

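In Python you can watch the two layouts side by side. The sketch below uses the standard :mod:`struct` module, which is my choice for the illustration, alongside the ``htons``/``ntohs`` calls named above.

```python
import socket
import struct

# "!" selects network (big-endian) byte order; "H" is an unsigned 16 bit integer
network_bytes = struct.pack("!H", 1)      # the bytes 00 01, on every machine
little_endian = struct.pack("<H", 1)      # the byte-reversed layout, 01 00

# unpacking with the same format recovers the value whatever the host order is
(value,) = struct.unpack("!H", network_bytes)

# host -> network -> host always gets you back where you started
round_trip = socket.ntohs(socket.htons(1))
```

Packing with an explicit ``"!"`` means the bytes on the wire do not depend on which kind of machine produced them.
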
In these days of 32 bit machines, the ASCII representation of binary data is
frequently smaller than the binary representation. That's because a surprising
amount of the time, all those longs have the value 0, or maybe 1. The string "0"
would be two bytes, while binary is four. Of course, this doesn't fit well with
fixed-length messages. Decisions, decisions.


Disconnecting
=============

Strictly speaking, you're supposed to use ``shutdown`` on a socket before you
``close`` it. The ``shutdown`` is an advisory to the socket at the other end.
Depending on the argument you pass it, it can mean "I'm not going to send
anymore, but I'll still listen", or "I'm not listening, good riddance!". Most
socket libraries, however, are so used to programmers neglecting to use this
piece of etiquette that normally a ``close`` is the same as ``shutdown();
close()``. So in most situations, an explicit ``shutdown`` is not needed.

One way to use ``shutdown`` effectively is in an HTTP-like exchange. The client
sends a request and then does a ``shutdown(1)``. This tells the server "This
client is done sending, but can still receive." The server can detect "EOF" by
a receive of 0 bytes. It can assume it has the complete request. The server
sends a reply. If the ``send`` completes successfully then, indeed, the client
was still receiving.

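Both halves of that exchange can be sketched in a few lines. The function names are made up, and the "reply" is a placeholder; ``socket.SHUT_WR`` is the named constant for the ``1`` used above.

```python
import socket


def read_until_eof(sock, bufsize=4096):
    """Collect data until recv returns 0 bytes."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if chunk == b"":
            break
        chunks.append(chunk)
    return b"".join(chunks)


def client_exchange(sock, request):
    """Send a request, say "done sending", then read the whole reply."""
    sock.sendall(request)
    sock.shutdown(socket.SHUT_WR)     # the same as shutdown(1)
    return read_until_eof(sock)


def server_exchange(sock):
    """Read the complete request, send a reply, and close."""
    request = read_until_eof(sock)    # ends when the client shuts down sending
    sock.sendall(b"got " + request)
    sock.close()
```

The client can still read after its ``shutdown`` because only the sending direction was shut down; the server's ``close`` is what finally ends the client's read loop.
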
Python takes the automatic shutdown a step further, and says that when a socket
is garbage collected, it will automatically do a ``close`` if it's needed. But
relying on this is a very bad habit. If your socket just disappears without
doing a ``close``, the socket at the other end may hang indefinitely, thinking
you're just being slow. *Please* ``close`` your sockets when you're done.


When Sockets Die
----------------

Probably the worst thing about using blocking sockets is what happens when the
other side comes down hard (without doing a ``close``). Your socket is likely to
hang. TCP is a reliable protocol, and it will wait a long, long time
before giving up on a connection. If you're using threads, the entire thread is
essentially dead. There's not much you can do about it. As long as you aren't
doing something dumb, like holding a lock while doing a blocking read, the
thread isn't really consuming much in the way of resources. Do *not* try to kill
the thread - part of the reason that threads are more efficient than processes
is that they avoid the overhead associated with the automatic recycling of
resources. In other words, if you do manage to kill the thread, your whole
process is likely to be screwed up.


Non-blocking Sockets
====================

If you've understood the preceding, you already know most of what you need to
know about the mechanics of using sockets. You'll still use the same calls, in
much the same ways. It's just that, if you do it right, your app will be almost
inside-out.

In Python, you use ``socket.setblocking(False)`` to make it non-blocking. In C, it's
more complex (for one thing, you'll need to choose between the POSIX flavor
``O_NONBLOCK`` and the almost indistinguishable older flavor ``O_NDELAY``, which
is completely different from ``TCP_NODELAY``), but it's the exact same idea. You
do this after creating the socket, but before using it. (Actually, if you're
nuts, you can switch back and forth.)

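To see what "non-blocking" means in practice, here is a small sketch. In Python, a non-blocking ``recv`` that has nothing to deliver raises :exc:`BlockingIOError` instead of waiting; the helper name ``try_recv`` is made up.

```python
import socket


def try_recv(sock, bufsize=4096):
    """Return received bytes, or None if nothing is available right now.

    Assumes ``sock.setblocking(False)`` has already been called.
    """
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        return None           # the call came back without doing anything
```

Calling this in a tight loop is exactly the kind of polling that will eat your CPU, so treat it as a demonstration of the behavior, not as a recommended way to structure a program.
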
The major mechanical difference is that ``send``, ``recv``, ``connect`` and
``accept`` can return without having done anything. You have (of course) a
number of choices. You can check return code and error codes and generally drive
yourself crazy. If you don't believe me, try it sometime. Your app will grow
large, buggy and suck CPU. So let's skip the brain-dead solutions and do it
right.

Use ``select``.

In C, coding ``select`` is fairly complex. In Python, it's a piece of cake, but
it's close enough to the C version that if you understand ``select`` in Python,
you'll have little trouble with it in C::

   ready_to_read, ready_to_write, in_error = select.select(
       potential_readers,
       potential_writers,
       potential_errs,
       timeout)

You pass ``select`` three lists: the first contains all sockets that you might
want to try reading; the second all the sockets you might want to try writing
to, and the last (normally left empty) those that you want to check for errors.
You should note that a socket can go into more than one list. The ``select``
call is blocking, but you can give it a timeout. This is generally a sensible
thing to do - give it a nice long timeout (say a minute) unless you have good
reason to do otherwise.

In return, you will get three lists. They contain the sockets that are actually
readable, writable and in error. Each of these lists is a subset (possibly
empty) of the corresponding list you passed in.

If a socket is in the output readable list, you can be
as-close-to-certain-as-we-ever-get-in-this-business that a ``recv`` on that
socket will return *something*. Same idea for the writable list. You'll be able
to send *something*. Maybe not all you want to, but *something* is better than
nothing. (Actually, any reasonably healthy socket will return as writable - it
just means outbound network buffer space is available.)

If you have a "server" socket, put it in the potential_readers list. If it comes
out in the readable list, your ``accept`` will (almost certainly) work. If you
have created a new socket to ``connect`` to someone else, put it in the
potential_writers list. If it shows up in the writable list, you have a decent
chance that it has connected.

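Putting those pieces together, one pass of a ``select``-driven echo server might look like the sketch below. The function name ``select_pass`` and the idea of keeping the connected sockets in a set are assumptions for the example, and the reply is sent directly to keep things short.

```python
import select
import socket


def select_pass(serversocket, clients, timeout=60):
    """Run one pass of a select-driven echo server.

    ``clients`` is a set of connected sockets; it is updated in place.
    """
    potential_readers = [serversocket] + list(clients)
    ready_to_read, _, _ = select.select(potential_readers, [], [], timeout)
    for sock in ready_to_read:
        if sock is serversocket:
            # a readable "server" socket means accept will (almost
            # certainly) work
            clientsocket, address = serversocket.accept()
            clients.add(clientsocket)
        else:
            data = sock.recv(4096)
            if data == b"":               # the peer closed the connection
                clients.discard(sock)
                sock.close()
            else:
                # kept simple: a fuller server would queue the data and
                # wait for the socket to show up as writable
                sock.sendall(data)
```

A real server would call this in a ``while True`` loop; splitting out a single pass just makes the flow easier to follow and to try out.
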
Actually, ``select`` can be handy even with blocking sockets. It's one way of
determining whether you will block - the socket returns as readable when there's
something in the buffers. However, this still doesn't help with the problem of
determining whether the other end is done, or just busy with something else.

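As a sketch of that use, the helper below (the name ``can_read_now`` is made up) asks ``select`` about a single socket. A timeout of ``0`` makes ``select`` return immediately instead of waiting.

```python
import select
import socket


def can_read_now(sock, timeout=0):
    """Return True if a recv on ``sock`` can return without blocking."""
    ready_to_read, _, _ = select.select([sock], [], [], timeout)
    return bool(ready_to_read)
```

Keep in mind that a socket whose peer has closed also shows up as readable; the ``recv`` then returns 0 bytes right away.
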
**Portability alert**: On Unix, ``select`` works with both sockets and
files. Don't try this on Windows. On Windows, ``select`` works with sockets
only. Also note that in C, many of the more advanced socket options are done
differently on Windows. In fact, on Windows I usually use threads (which work
very, very well) with my sockets.

