Table of Contents
=================

 - [Intro](#intro)
 - [git](#git)
 - [Portability](#Portability)
 - [Windows vs Unix](#winvsunix)
 - [Library](#Library)
   - [`Curl_connect`](#Curl_connect)
   - [`Curl_do`](#Curl_do)
   - [`Curl_readwrite`](#Curl_readwrite)
   - [`Curl_done`](#Curl_done)
   - [`Curl_disconnect`](#Curl_disconnect)
 - [HTTP(S)](#http)
 - [FTP](#ftp)
   - [Kerberos](#kerberos)
 - [TELNET](#telnet)
 - [FILE](#file)
 - [SMB](#smb)
 - [LDAP](#ldap)
 - [E-mail](#email)
 - [General](#general)
 - [Persistent Connections](#persistent)
 - [multi interface/non-blocking](#multi)
 - [SSL libraries](#ssl)
 - [Library Symbols](#symbols)
 - [Return Codes and Informationals](#returncodes)
 - [API/ABI](#abi)
 - [Client](#client)
 - [Memory Debugging](#memorydebug)
 - [Test Suite](#test)
 - [Asynchronous name resolves](#asyncdns)
   - [c-ares](#cares)
 - [`curl_off_t`](#curl_off_t)
 - [curlx](#curlx)
 - [Content Encoding](#contentencoding)
 - [hostip.c explained](#hostip)
 - [Track Down Memory Leaks](#memoryleak)
 - [`multi_socket`](#multi_socket)
 - [Structs in libcurl](#structs)

<a name="intro"></a>
curl internals
==============

 This project is split in two: the library and the client. The client part
 uses the library, but the library is designed to allow other applications to
 use it.

 The largest amount of code and complexity is in the library part.

<a name="git"></a>
git
===

 All changes to the sources are committed to the git repository as soon as
 they're somewhat verified to work. Changes shall be committed as independently
 as possible so that individual changes can be more easily spotted and tracked
 afterwards.

 Tagging shall be used extensively, and by the time we release new archives we
 should tag the sources with a name similar to the released version number.

<a name="Portability"></a>
Portability
===========

 We write curl and libcurl to compile with C89 compilers, on 32-bit and larger
 machines. Most of libcurl assumes more or less POSIX compliance, but that's
 not a requirement.

 We write libcurl to build and work with lots of third party tools, and we
 want it to remain functional and buildable with these and later versions
 (older versions may still work, but keeping them working is not something we
 work hard to maintain):

Dependencies
------------

 - OpenSSL 0.9.7
 - GnuTLS 1.2
 - zlib 1.1.4
 - libssh2 0.16
 - c-ares 1.6.0
 - libidn 0.4.1
 - cyassl 2.0.0
 - openldap 2.0
 - MIT Kerberos 1.2.4
 - GSKit V5R3M0
 - NSS 3.14.x
 - axTLS 1.2.7
 - PolarSSL 1.3.0
 - Heimdal ?
 - nghttp2 1.0.0

Operating Systems
-----------------

 On systems where configure runs, we aim at working on them all, provided
 they have a suitable C compiler. On systems that don't run configure, we
 strive to keep curl running fine on:

 - Windows 98
 - AS/400 V5R3M0
 - Symbian 9.1
 - Windows CE ?
 - TPF ?

Build tools
-----------

 When writing code (mostly for generating stuff included in release tarballs)
 we use a few "build tools" and we make sure that we remain functional with
 these versions:

 - GNU Libtool 1.4.2
 - GNU Autoconf 2.57
 - GNU Automake 1.7
 - GNU M4 1.4
 - perl 5.004
 - roffit 0.5
 - groff ? (any version that supports "groff -Tps -man [in] [out]")
 - ps2pdf (gs) ?

<a name="winvsunix"></a>
Windows vs Unix
===============

 There are a few differences in how to program curl the unix way compared to
 the Windows way. Perhaps the four most notable details are:

 1. Different function names for socket operations.

    In curl, this is solved with defines and macros, so that the source looks
    the same in all places except for the header file that defines them. The
    macros in use are sclose(), sread() and swrite().

 2. Windows requires a couple of init calls for the socket stuff.

    That's taken care of by the `curl_global_init()` call, but if other
    libraries in the application also do this, there might be reasons for the
    application to alter that behaviour.

 3. The file descriptors for network communication and file operations are
    not as easily interchangeable as in unix.

    We avoid this by not trying any funny tricks on file descriptors.

 4. When writing data to stdout, Windows makes end-of-lines the DOS way, thus
    destroying binary data, although you do want that conversion if it is
    text coming through... (sigh)

    We set stdout to binary under Windows.

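 As a rough illustration of item 1, a portability header could map the three
 macro names onto each platform's calls. This is only a sketch: the
 definitions curl actually uses differ, and `my_socket_t` and `echo_once()`
 are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Hypothetical portability header, loosely modeled on the sclose(),
   sread() and swrite() macros named above. */
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET my_socket_t;
#define sclose(s)        closesocket(s)
#define sread(s, b, n)   recv((s), (char *)(b), (int)(n), 0)
#define swrite(s, b, n)  send((s), (const char *)(b), (int)(n), 0)
#else
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int my_socket_t;
#define sclose(s)        close(s)
#define sread(s, b, n)   recv((s), (b), (n), 0)
#define swrite(s, b, n)  send((s), (b), (n), 0)
#endif

/* Caller code looks identical on both platforms: write a message on one
   socket and read it back from its peer. Returns bytes read, or -1. */
static long echo_once(my_socket_t from, my_socket_t to,
                      const char *msg, size_t len,
                      char *out, size_t outlen)
{
  if(swrite(from, msg, len) < 0)
    return -1;
  return (long)sread(to, out, outlen);
}
```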
 Inside the source code, we make an effort to avoid `#ifdef [Your OS]`. All
 conditionals that deal with features *should* instead be in the format
 `#ifdef HAVE_THAT_WEIRD_FUNCTION`. Since Windows can't run configure scripts,
 we maintain a `curl_config-win32.h` file in the lib directory that is
 supposed to look exactly like the `curl_config.h` file configure would have
 generated on a Windows machine!

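 A minimal sketch of a feature-based conditional follows. It assumes a
 `HAVE_STRCASECMP` define that would come from the generated (or
 hand-maintained) config header; `my_strequal()` is a hypothetical helper,
 not a libcurl function.

```c
#include <ctype.h>

/* HAVE_STRCASECMP is assumed to be provided by a config header.
   It is deliberately not defined in this file. */
#ifdef HAVE_STRCASECMP
#include <strings.h>
#endif

/* Hypothetical helper: returns 1 if the strings match, ignoring case. */
static int my_strequal(const char *a, const char *b)
{
#ifdef HAVE_STRCASECMP
  return strcasecmp(a, b) == 0;
#else
  /* Portable fallback for platforms without strcasecmp() */
  while(*a && *b) {
    if(tolower((unsigned char)*a) != tolower((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b;  /* equal only if both strings ended together */
#endif
}
```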
 Generally speaking: always remember that this will be compiled on dozens of
 operating systems. Don't walk on the edge.

<a name="Library"></a>
Library
=======

 (See `LIBCURL-STRUCTS` for a separate document describing all major internal
 structs and their purposes.)

 There are plenty of entry points to the library, namely each publicly defined
 function that libcurl offers to applications. All of those functions are
 rather small and easy to follow. All the ones prefixed with `curl_easy` are
 put in the lib/easy.c file.

 `curl_global_init()` and `curl_global_cleanup()` should be called by the
 application to initialize and clean up global stuff in the library. As of
 today, it can handle the global SSL initing if SSL is enabled and it can init
 the socket layer on Windows machines. libcurl itself has no "global" scope.

 All printf()-style functions use the supplied clones in lib/mprintf.c. This
 makes sure we stay absolutely platform independent.

 [`curl_easy_init()`][2] allocates an internal struct and makes some
 initializations. The returned handle does not reveal internals. This is the
 'SessionHandle' struct which works as an "anchor" struct for all `curl_easy`
 functions. All connections performed will get connect-specific data allocated
 that should be used for things related to particular connections/requests.

 [`curl_easy_setopt()`][1] takes three arguments: the handle, followed by the
 option, which must be passed as a pair of parameter-ID and parameter-value.
 The list of options is documented in the man page. This function mainly sets
 things in the 'SessionHandle' struct.

 `curl_easy_perform()` is just a wrapper function that makes use of the multi
 API. It basically calls `curl_multi_init()`, `curl_multi_add_handle()`,
 `curl_multi_wait()`, and `curl_multi_perform()` until the transfer is done
 and then returns.

 Some of the most important key functions in url.c are called from multi.c
 when certain key steps are to be made in the transfer operation.

<a name="Curl_connect"></a>
Curl_connect()
--------------

 Analyzes the URL, separates the different components and connects to the
 remote host. This may involve using a proxy and/or using SSL. The
 `Curl_resolv()` function in lib/hostip.c is used for looking up host names
 (it then uses the proper underlying method, which may vary between
 platforms and builds).

 When `Curl_connect` is done, we are connected to the remote site. Then it
 is time to tell the server to get a document/file. `Curl_do()` arranges
 this.

 This function makes sure there's an allocated and initiated 'connectdata'
 struct that is used for this particular connection only (although there may
 be several requests performed on the same connection). A bunch of things are
 inited/inherited from the SessionHandle struct.

<a name="Curl_do"></a>
Curl_do()
---------

 `Curl_do()` makes sure the proper protocol-specific function is called. The
 functions are named after the protocols they handle.

 The protocol-specific functions of course deal with protocol-specific
 negotiations and setup. They have access to the `Curl_sendf()` (from
 lib/sendf.c) function to send printf-style formatted data to the remote
 host, and when they're ready to make the actual file transfer they call the
 `Curl_setup_transfer()` function (in lib/transfer.c) to set up the transfer
 and then return.

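 The dispatch idea can be sketched with a small table of function pointers
 keyed by scheme. Everything here (`my_conn`, `my_http_do()`, `my_ftp_do()`,
 `my_do()`) is hypothetical and much simpler than the real structs and
 handlers in libcurl.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified connection type. */
struct my_conn { const char *scheme; int did; };
typedef int (*do_func)(struct my_conn *conn);

/* Stand-ins for protocol-specific DO functions, named after the protocol. */
static int my_http_do(struct my_conn *c) { c->did = 1; return 0; }
static int my_ftp_do(struct my_conn *c)  { c->did = 2; return 0; }

static const struct { const char *scheme; do_func do_it; } handlers[] = {
  { "http", my_http_do },
  { "ftp",  my_ftp_do }
};

/* Look up the handler for the connection's scheme and invoke it.
   Returns the handler's result, or -1 if the scheme is unsupported. */
static int my_do(struct my_conn *conn)
{
  size_t i;
  for(i = 0; i < sizeof(handlers) / sizeof(handlers[0]); i++) {
    if(strcmp(handlers[i].scheme, conn->scheme) == 0)
      return handlers[i].do_it(conn);
  }
  return -1;
}
```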
 If this DO function fails and the connection is being re-used, libcurl will
 then close this connection, set up a new connection and re-issue the DO
 request on that. This is because there is no way to be perfectly sure that
 we have discovered a dead connection before the DO function and thus we
 might wrongly be re-using a connection that was closed by the remote peer.

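 A toy model of that retry rule is shown here. The `my_conn2` struct and
 the helper functions are hypothetical; the sketch only captures the
 decision "retry once, and only if the connection was re-used".

```c
/* Hypothetical connection model: 'reused' marks a connection taken from
   the cache, 'dead' marks one the peer has silently closed. */
struct my_conn2 { int reused; int dead; };

/* Pretend DO: fails if the connection turns out to be dead. */
static int my_do_request(struct my_conn2 *c)
{
  return c->dead ? -1 : 0;
}

/* Pretend to close the old connection and open a fresh one. */
static void my_reconnect(struct my_conn2 *c)
{
  c->reused = 0;
  c->dead = 0;
}

/* Issue the request; if it fails on a re-used connection, reconnect and
   retry exactly once. A failure on a fresh connection is reported as is. */
static int my_do_with_retry(struct my_conn2 *c)
{
  int rc = my_do_request(c);
  if(rc != 0 && c->reused) {
    my_reconnect(c);
    rc = my_do_request(c);
  }
  return rc;
}
```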
 Some time during the DO function, the `Curl_setup_transfer()` function must
 be called with some basic info about the upcoming transfer: what socket(s)
 to read/write and the expected file transfer sizes (if known).

<a name="Curl_readwrite"></a>
Curl_readwrite()
----------------

 Called during the transfer of the actual protocol payload.

 During transfer, the progress functions in lib/progress.c are called at a
 frequent interval (or at the user's choice, a specified callback might get
 called). The speedcheck functions in lib/speedcheck.c are also used to
 verify that the transfer is as fast as required.

<a name="Curl_done"></a>
Curl_done()
-----------

 Called after a transfer is done. This function takes care of everything
 that has to be done after a transfer. It attempts to leave matters in a
 state where `Curl_do()` can be called again on the same connection (in a
 persistent connection case). The connection might also soon be closed with
 `Curl_disconnect()`.

<a name="Curl_disconnect"></a>
Curl_disconnect()
-----------------

 When doing normal connections and transfers, no one ever tries to close any
 connections so this is not normally called when `curl_easy_perform()` is
 used. This function is only used when we are certain that no more transfers
 are going to be made on the connection. A connection can also be closed by
 force, or this function can be called to make sure that libcurl doesn't keep
 too many connections alive at the same time.

 This function cleans up all resources that are associated with a single
 connection.

<a name="http"></a>
HTTP(S)
=======

 HTTP offers a lot and is the protocol in curl that uses the most lines of
 code. There is a special file (lib/formdata.c) that offers all the multipart
 post functions.

 base64 functions for user+password stuff (and more) are in lib/base64.c and
 all functions for parsing and sending cookies are found in lib/cookie.c.

 HTTPS uses almost exactly the same procedure as HTTP, with only two
 exceptions: the connect procedure is different and the function used to read
 or write from the socket is different, although the latter fact is hidden in
 the source by the use of `Curl_read()` for reading and `Curl_write()` for
 writing data to the remote server.

 `http_chunks.c` contains functions that understand HTTP 1.1 chunked transfer
 encoding.

 An interesting detail with the HTTP(S) request is the `Curl_add_buffer()`
 series of functions we use. They append data to one single buffer, and when
 the building is done the entire request is sent off in one single write. This
 is done this way to overcome problems with flawed firewalls and lame servers.

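 The buffer-then-send approach can be sketched as follows. The `my_buf`
 type and its functions are hypothetical stand-ins and do not mirror the
 real `Curl_add_buffer()` signatures; the point is only that the request is
 assembled completely in memory, so that a caller can hand `data` and `used`
 to a single write.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer. Start with all members zeroed. */
struct my_buf { char *data; size_t used; size_t alloc; };

/* Append len bytes, growing the allocation as needed. Returns 0 on success. */
static int my_buf_add(struct my_buf *b, const char *src, size_t len)
{
  if(b->used + len > b->alloc) {
    size_t want = (b->used + len) * 2;
    char *p = realloc(b->data, want);
    if(!p)
      return -1;
    b->data = p;
    b->alloc = want;
  }
  memcpy(b->data + b->used, src, len);
  b->used += len;
  return 0;
}

/* Convenience wrapper for zero-terminated strings. */
static int my_buf_adds(struct my_buf *b, const char *s)
{
  return my_buf_add(b, s, strlen(s));
}

/* Build a whole request in memory, piece by piece. */
static int my_build_request(struct my_buf *b, const char *path,
                            const char *host)
{
  if(my_buf_adds(b, "GET ") || my_buf_adds(b, path) ||
     my_buf_adds(b, " HTTP/1.1\r\nHost: ") || my_buf_adds(b, host) ||
     my_buf_adds(b, "\r\n\r\n"))
    return -1;
  return 0;
}
```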
<a name="ftp"></a>
FTP
===

 The `Curl_if2ip()` function can be used for getting the IP number of a
 specified network interface, and it resides in lib/if2ip.c.

 `Curl_ftpsendf()` is used for sending FTP commands to the remote server. It
 was made a separate function to prevent us programmers from forgetting that
 they must be CRLF terminated. They must also be sent in one single write() to
 make firewalls and similar happy.

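 To illustrate why a dedicated function helps, here is a hedged sketch of a
 formatter that always appends CRLF and yields one contiguous buffer
 suitable for a single write. `my_ftp_format()` is a hypothetical name, and
 the sketch relies on C99 `vsnprintf()`, which is an assumption on top of
 the C89 baseline stated earlier in this document.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the formatting half of Curl_ftpsendf():
   formats a command into 'out' and appends CRLF so callers cannot forget
   it. Returns the number of bytes to send in one write, or -1 if the
   command does not fit. */
static int my_ftp_format(char *out, size_t outlen, const char *fmt, ...)
{
  va_list ap;
  int n;

  if(outlen < 3)
    return -1;
  va_start(ap, fmt);
  n = vsnprintf(out, outlen - 2, fmt, ap);  /* reserve room for CRLF */
  va_end(ap);
  if(n < 0 || (size_t)n >= outlen - 2)
    return -1;                              /* error or truncated */
  out[n] = '\r';
  out[n + 1] = '\n';
  out[n + 2] = '\0';
  return n + 2;
}
```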
<a name="kerberos"></a>
Kerberos
--------

 Kerberos support is mainly in lib/krb5.c and lib/security.c but also
 `curl_sasl_sspi.c` and `curl_sasl_gssapi.c` for the email protocols and
 `socks_gssapi.c` and `socks_sspi.c` for SOCKS5 proxy specifics.

<a name="telnet"></a>
TELNET
======

 Telnet is implemented in lib/telnet.c.

<a name="file"></a>
FILE
====

 The file:// protocol is dealt with in lib/file.c.

<a name="smb"></a>
SMB
===

 The smb:// protocol is dealt with in lib/smb.c.

<a name="ldap"></a>
LDAP
====

 Everything LDAP is in lib/ldap.c and lib/openldap.c.

<a name="email"></a>
E-mail
======

 The e-mail related source code is in lib/imap.c, lib/pop3.c and lib/smtp.c.

<a name="general"></a>
General
=======

 URL encoding and decoding, called escaping and unescaping in the source code,
 is found in lib/escape.c.

 While transferring data in Transfer() a few functions might get used.
 `curl_getdate()` in lib/parsedate.c is for HTTP date comparisons (and more).

 lib/getenv.c offers `curl_getenv()` which is for reading environment
 variables in a neat platform independent way. That's used in the client, but
 also in lib/url.c when checking the proxy environment variables. Note that
 contrary to the normal unix getenv(), this returns an allocated buffer that
 must be free()ed after use.

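 The ownership difference can be shown with a small wrapper. This is a
 sketch under assumptions: `my_getenv()` is a hypothetical name, and
 treating an empty value like an unset one is a choice made for this
 example rather than a statement about lib/getenv.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper in the spirit of curl_getenv(): returns a heap copy
   of the variable's value, or NULL if it is unset or empty. The caller owns
   the result and must free() it. */
static char *my_getenv(const char *name)
{
  const char *val = getenv(name);
  char *copy;
  size_t len;

  if(!val || !*val)
    return NULL;
  len = strlen(val) + 1;   /* include the terminating zero */
  copy = malloc(len);
  if(copy)
    memcpy(copy, val, len);
  return copy;
}
```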
 lib/netrc.c holds the .netrc parser.

 lib/timeval.c features replacement functions for systems that don't have
 gettimeofday() and a few support functions for timeval conversions.

 A function named `curl_version()` that returns the full curl version string
 is found in lib/version.c.

<a name="persistent"></a>
Persistent Connections
======================

 The persistent connection support in libcurl requires some considerations on
 how to do things inside of the library.

 - The 'SessionHandle' struct returned in the [`curl_easy_init()`][2] call
   must never hold connection-oriented data. It is meant to hold the root data
   as well as all the options etc that the library-user may choose.

 - The 'SessionHandle' struct holds the "connection cache" (an array of
   pointers to 'connectdata' structs).

 - This enables the 'curl handle' to be reused on subsequent transfers.

 - When libcurl is told to perform a transfer, it first checks for an already
   existing connection in the cache that we can use. Otherwise it creates a
   new one and adds that to the cache. If the cache is already full when a new
   connection is added, it will first close the oldest unused one.

 - When the transfer operation is complete, the connection is left
   open. Particular options may tell libcurl not to, and protocols may signal
   closure on connections and then they won't be kept open of course.

 - When `curl_easy_cleanup()` is called, we close all still opened connections,
   unless of course the multi interface "owns" the connections.

 The curl handle must be re-used in order for the persistent connections to
 work.

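 The lookup-or-evict behaviour described in the list above can be modeled
 in a few lines. All types and names here (`my_connection`, `my_cache`,
 `my_cache_get()`) are hypothetical, the cache is a fixed array of structs
 rather than pointers, and "closing" a connection is reduced to wiping its
 slot.

```c
#include <stddef.h>
#include <string.h>

#define MY_CACHE_SIZE 2   /* tiny on purpose, to make eviction visible */

/* Hypothetical, much simplified stand-ins for 'connectdata' and the cache. */
struct my_connection { char host[64]; int inuse; long last_used; int open; };
struct my_cache { struct my_connection slot[MY_CACHE_SIZE]; long clock; };

/* Return an idle open connection to 'host', or claim a slot for a new one.
   A free slot is preferred; otherwise the oldest idle connection is closed
   and replaced. *reused reports which case happened. Returns NULL if every
   slot is busy. */
static struct my_connection *my_cache_get(struct my_cache *c,
                                          const char *host, int *reused)
{
  struct my_connection *victim = NULL;
  size_t i;

  for(i = 0; i < MY_CACHE_SIZE; i++) {
    struct my_connection *k = &c->slot[i];
    if(k->open && !k->inuse && strcmp(k->host, host) == 0) {
      *reused = 1;
      k->inuse = 1;
      k->last_used = ++c->clock;
      return k;
    }
    if(k->inuse)
      continue;
    if(!victim)
      victim = k;
    else if(victim->open && (!k->open || k->last_used < victim->last_used))
      victim = k;
  }
  if(!victim)
    return NULL;

  /* "close" whatever was there and set up the new connection */
  memset(victim, 0, sizeof(*victim));
  strncpy(victim->host, host, sizeof(victim->host) - 1);
  victim->open = 1;
  victim->inuse = 1;
  victim->last_used = ++c->clock;
  *reused = 0;
  return victim;
}
```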
<a name="multi"></a>
multi interface/non-blocking
============================

 The multi interface is a non-blocking interface to the library. To make that
 interface work as well as possible, low-level functions within libcurl must
 not be written to work in a blocking manner. (There are still a few spots
 violating this rule.)

 One of the primary reasons we introduced c-ares support was to allow the name
 resolve phase to be perfectly non-blocking as well.

 The FTP and the SFTP/SCP protocols are examples of how we adapt and adjust
 the code to allow non-blocking operations even on multi-stage
 command-response protocols. They are built around state machines that return
 when they would otherwise block waiting for data. The DICT, LDAP and TELNET
 protocols are crappy examples and they are subject to a rewrite in the future
 to better fit the libcurl protocol family.

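 A state machine of that kind might look like the following sketch. The
 states, the `my_machine` struct and `my_step()` are hypothetical, and the
 three-step login exchange is only loosely FTP-like; the key property is
 that the function returns instead of waiting when no reply is available.

```c
#include <stddef.h>

/* Hypothetical three-step login exchange, written as a state machine. */
enum my_state { MY_WAIT_GREETING, MY_WAIT_USER_OK, MY_WAIT_PASS_OK, MY_DONE };

struct my_machine { enum my_state state; };

/* Advance by at most one reply, then return instead of blocking. 'reply'
   is the next server response, or NULL if nothing has arrived yet.
   Returns 1 when the exchange is complete, 0 if more input is needed and
   -1 on an unexpected reply. */
static int my_step(struct my_machine *m, const char *reply)
{
  if(m->state == MY_DONE)
    return 1;
  if(!reply)
    return 0;                       /* would block: hand control back */

  switch(m->state) {
  case MY_WAIT_GREETING:
    if(reply[0] != '2')
      return -1;
    m->state = MY_WAIT_USER_OK;     /* the user name would be sent here */
    break;
  case MY_WAIT_USER_OK:
    if(reply[0] != '3')
      return -1;
    m->state = MY_WAIT_PASS_OK;     /* the password would be sent here */
    break;
  case MY_WAIT_PASS_OK:
    if(reply[0] != '2')
      return -1;
    m->state = MY_DONE;
    break;
  default:
    break;
  }
  return m->state == MY_DONE;
}
```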
<a name="ssl"></a>
SSL libraries
=============

 Originally libcurl supported SSLeay for SSL/TLS transports. That support was
 then extended to its successor OpenSSL and has since also been extended to
 several other SSL/TLS libraries. We expect and hope to further extend the
 support in future libcurl versions.

 To deal with this internally in the best way possible, we have a generic SSL
 function API as provided by the vtls/vtls.[ch] system, and they are the only
 SSL functions we must use from within libcurl. vtls is then crafted to use
 the appropriate lower-level function calls for whatever SSL library is in
 use. For example vtls/openssl.[ch] for the OpenSSL library.

<a name="symbols"></a>
Library Symbols
===============

 All symbols used internally in libcurl must use a `Curl_` prefix if they're
 used in more than a single file. Single-file symbols must be made static.
 Public ("exported") symbols must use a `curl_` prefix. (There are exceptions,
 but they are to be changed to follow this pattern in future versions.) Public
 API functions are marked with `CURL_EXTERN` in the public header files so
 that all others can be hidden on platforms where this is possible.

<a name="returncodes"></a>
Return Codes and Informationals
===============================

 I've made things simple. Almost every function in libcurl returns a CURLcode,
 which must be `CURLE_OK` if everything is OK or otherwise a suitable error
 code as defined in the curl/curl.h include file. The very spot that detects
 an error must use the `Curl_failf()` function to set the human-readable error
 description.

 To help the user understand what's happening and debug curl usage, we
 must supply a fair amount of informational messages by using the
 `Curl_infof()` function. Those messages are only displayed when the user
 explicitly asks for them. They are best used when revealing information that
 isn't otherwise obvious.

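 The error-reporting convention can be pictured with a miniature model.
 The enum, `my_handle` and the `my_` functions below are hypothetical; the
 real CURLcode values and the `Curl_failf()` signature are defined by
 libcurl and are not reproduced here. The sketch uses C99 `vsnprintf()`.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical miniature of the pattern. */
typedef enum { MY_OK = 0, MY_BAD_PORT } my_code;

struct my_handle { char errbuf[128]; };

/* Store a human-readable description at the spot the error is detected. */
static void my_failf(struct my_handle *h, const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  vsnprintf(h->errbuf, sizeof(h->errbuf), fmt, ap);
  va_end(ap);
}

/* The function that detects the problem sets the message and the code. */
static my_code my_check_port(struct my_handle *h, long port)
{
  if(port < 1 || port > 65535) {
    my_failf(h, "Port number out of range: %ld", port);
    return MY_BAD_PORT;
  }
  return MY_OK;
}

/* Callers only propagate the code; the message has already been set. */
static my_code my_connect(struct my_handle *h, long port)
{
  my_code rc = my_check_port(h, port);
  if(rc != MY_OK)
    return rc;
  return MY_OK;  /* the actual connect is omitted from this sketch */
}
```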
<a name="abi"></a>
API/ABI
=======

 We make an effort to not export or show internals or how internals work, as
 that makes it easier to keep a solid API/ABI over time. See docs/libcurl/ABI
 for our promise to users.

<a name="client"></a>
Client
======

 main() resides in `src/tool_main.c`.

 `src/tool_hugehelp.c` is automatically generated by the mkhelp.pl perl script
 to display the complete "manual" and the src/tool_urlglob.c file holds the
 functions used for the URL-"globbing" support. Globbing in the sense that the
 {} and [] expansion stuff is there.

 The client mostly messes around to set up its 'config' struct properly, then
 it calls the `curl_easy_*()` functions of the library and when it gets back
 control after the `curl_easy_perform()` it cleans up the library, checks
 status and exits.

 When the operation is done, the ourWriteOut() function in src/writeout.c may
 be called to report about the operation. That function uses the
 `curl_easy_getinfo()` function to extract useful information from the curl
 session.

 It may loop and do all this several times if many URLs were specified on the
 command line or config file.

<a name="memorydebug"></a>
Memory Debugging
================

 The file lib/memdebug.c contains debug versions of a few functions: functions
 such as malloc, free, fopen, fclose, etc that somehow deal with resources
 that might give us problems if we "leak" them. The functions in the memdebug
 system do nothing fancy; they do their normal function and then log
 information about what they just did. The logged data can then be analyzed
 after a complete session.

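 As a rough picture of "do the normal job, then log it", consider the
 following sketch. The wrapper names, the `my_memlog` variable and the log
 line format are all hypothetical and do not match what lib/memdebug.c
 writes or what memanalyze.pl expects.

```c
#include <stdio.h>
#include <stdlib.h>

/* Log destination, set by the application; NULL disables logging. */
static FILE *my_memlog;

/* Allocate as usual, then record the call site, size and result. */
static void *my_dbg_malloc(size_t size, int line, const char *source)
{
  void *mem = malloc(size);
  if(my_memlog)
    fprintf(my_memlog, "MEM %s:%d malloc(%lu) = %p\n",
            source, line, (unsigned long)size, mem);
  return mem;
}

/* Record the call site and pointer, then free as usual. */
static void my_dbg_free(void *ptr, int line, const char *source)
{
  if(my_memlog)
    fprintf(my_memlog, "MEM %s:%d free(%p)\n", source, line, ptr);
  free(ptr);
}

/* Route calls through the wrappers so the call site gets recorded. */
#define my_malloc(size) my_dbg_malloc(size, __LINE__, __FILE__)
#define my_free(ptr)    my_dbg_free(ptr, __LINE__, __FILE__)
```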
 memanalyze.pl is the perl script present in tests/ that analyzes a log file
 generated by the memory tracking system. It detects if resources are
 allocated but never freed and other kinds of errors related to resource
 management.

 Internally, the preprocessor symbol DEBUGBUILD guards code that is only
 compiled for debug enabled builds, and the symbol CURLDEBUG is used to
 differentiate code which is _only_ used for memory tracking/debugging.

 Use -DCURLDEBUG when compiling to enable memory debugging. This is also
 switched on by running configure with --enable-curldebug. Use -DDEBUGBUILD
 when compiling to enable a debug build or run configure with --enable-debug.

 curl --version will list the 'Debug' feature for debug enabled builds, and
 will list the 'TrackMemory' feature for builds capable of curl debug memory
 tracking. These features are independent and can be controlled when running
 the configure script. When --enable-debug is given both features will be
 enabled, unless some restriction prevents memory tracking from being used.

<a name="test"></a>
Test Suite
==========

 The test suite is placed in its own subdirectory directly off the root in the
 curl archive tree, and it contains a bunch of scripts and a lot of test case
 data.

 The main test script is runtests.pl, which invokes test servers like
 httpserver.pl and ftpserver.pl before all the test cases are performed. The
 test suite currently only runs on unix-like platforms.

 You'll find a description of the test suite in the tests/README file, and a
 description of the test case data files in the tests/FILEFORMAT file.

 The test suite automatically detects if curl was built with memory
 debugging enabled, and if it was it will detect memory leaks, too.

<a name="asyncdns"></a>
Asynchronous name resolves
==========================

 libcurl can be built to do name resolves asynchronously, using either the
 normal resolver in a threaded manner or by using c-ares.

<a name="cares"></a>
[c-ares][3]
------

### Build libcurl to use c-ares

1. ./configure --enable-ares=/path/to/ares/install
2. make

### c-ares on win32

 First I compiled c-ares. I changed the default C runtime library to be the
 single-threaded rather than the multi-threaded (this seems to be required to
 prevent linking errors later on). Then I simply built the areslib project
 (the other projects adig/ahost seem to fail under MSVC).

 Next was libcurl. I opened lib/config-win32.h and I added a:
 `#define USE_ARES 1`

 Next thing I did was I added the path for the ares includes to the include
 path, and the libares.lib to the libraries.

 Lastly, I also changed libcurl to be single-threaded rather than
 multi-threaded, again this was to prevent some duplicate symbol errors. I'm
 not sure why I needed to change everything to single-threaded, but when I
 didn't I got redefinition errors for several CRT functions (malloc, stricmp,
 etc.)

<a name="curl_off_t"></a>
`curl_off_t`
==========

 curl_off_t is a data type provided by the external libcurl include
 headers. It is the type meant to be used for the [`curl_easy_setopt()`][1]
 options that end with LARGE. The type is 64 bits wide on most modern
 platforms.

597curlx
598=====
599
600 The libcurl source code offers a few functions by source only. They are not
601 part of the official libcurl API, but the source files might be useful for
602 others so apps can optionally compile/build with these sources to gain
603 additional functions.
604
605 We provide them through a single header file for easy access for apps:
606 "curlx.h"
607
608`curlx_strtoofft()`
609-------------------
610 A macro that converts a string containing a number to a curl_off_t number.
611 This might use the curlx_strtoll() function which is provided as source
612 code in strtoofft.c. Note that the function is only provided if no
613 strtoll() (or equivalent) function exist on your platform. If curl_off_t
614 is only a 32 bit number on your platform, this macro uses strtol().
615
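 As a rough illustration of the conversion described above, here is a minimal
 sketch that assumes a platform with a 64 bit offset type and a working
 strtoll(). The names `sketch_off_t` and `sketch_strtoofft()` are hypothetical
 and not part of curlx; the real macro may behave differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for curl_off_t on a platform where it is 64 bits. */
typedef int64_t sketch_off_t;

/* Convert a decimal string to sketch_off_t.
   Returns 0 on success, -1 if no digits were found or the value overflowed. */
static int sketch_strtoofft(const char *str, sketch_off_t *out)
{
  char *end;
  long long value;

  errno = 0;
  value = strtoll(str, &end, 10);
  if(end == str || errno == ERANGE)
    return -1;
  *out = (sketch_off_t)value;
  return 0;
}

/* Usage sketch: returns 0 when all checks pass. */
static int sketch_strtoofft_demo(void)
{
  sketch_off_t n = 0;
  assert(sketch_strtoofft("5000000000", &n) == 0); /* larger than 32 bits */
  assert(n == 5000000000LL);
  assert(sketch_strtoofft("abc", &n) == -1);       /* no digits at all */
  return 0;
}
```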
`curlx_tvnow()`
---------------
 Returns a struct timeval for the current time.

`curlx_tvdiff()`
----------------
 Returns the difference between two timeval structs, in number of
 milliseconds.

`curlx_tvdiff_secs()`
---------------------
 Returns the same as `curlx_tvdiff()` but with full usec resolution (as a
 double).

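 The two difference calculations can be sketched as follows. This is an
 illustration only: `struct sketch_timeval` and the `sketch_` functions are
 hypothetical stand-ins, not the curlx sources, and the sketch assumes the
 newer time is passed first.

```c
#include <assert.h>

/* Hypothetical stand-in for struct timeval, to keep the sketch portable. */
struct sketch_timeval {
  long tv_sec;   /* whole seconds */
  long tv_usec;  /* microseconds, 0..999999 */
};

/* Difference newer - older in whole milliseconds. */
static long sketch_tvdiff_ms(struct sketch_timeval newer,
                             struct sketch_timeval older)
{
  return (newer.tv_sec - older.tv_sec) * 1000 +
         (newer.tv_usec - older.tv_usec) / 1000;
}

/* Same difference in seconds, keeping microsecond resolution. */
static double sketch_tvdiff_secs(struct sketch_timeval newer,
                                 struct sketch_timeval older)
{
  return (double)(newer.tv_sec - older.tv_sec) +
         (double)(newer.tv_usec - older.tv_usec) / 1000000.0;
}

/* Usage sketch: 10.25s to 12.75s is 2500 ms, or 2.5 seconds. */
static int sketch_tvdiff_demo(void)
{
  struct sketch_timeval start = { 10, 250000 };
  struct sketch_timeval end = { 12, 750000 };
  double secs = sketch_tvdiff_secs(end, start);
  assert(sketch_tvdiff_ms(end, start) == 2500);
  assert(secs > 2.499 && secs < 2.501);
  return 0;
}
```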
Future
------

 Several functions will be removed from the public `curl_` name space in a
 future libcurl release. They will then only be available as `curlx_`
 functions instead. To make the transition easier, we already provide these
 functions with the `curlx_` prefix today, to allow sources to get built
 properly with the new function names. The functions this concerns are:

 - `curlx_getenv`
 - `curlx_strequal`
 - `curlx_strnequal`
 - `curlx_mvsnprintf`
 - `curlx_msnprintf`
 - `curlx_maprintf`
 - `curlx_mvaprintf`
 - `curlx_msprintf`
 - `curlx_mprintf`
 - `curlx_mfprintf`
 - `curlx_mvsprintf`
 - `curlx_mvprintf`
 - `curlx_mvfprintf`

<a name="contentencoding"></a>
Content Encoding
================

## About content encodings

 [HTTP/1.1][4] specifies that a client may request that a server encode its
 response. This is usually used to compress a response using one of a set of
 commonly available compression techniques. These schemes are 'deflate' (the
 zlib algorithm), 'gzip' and 'compress'. A client requests that the server
 perform an encoding by including an Accept-Encoding header in the request
 document. The value of the header should be one of the recognized tokens
 'deflate', ... (there's a way to register new schemes/tokens, see sec 3.5 of
 RFC 2616). A server MAY honor the client's encoding request. When a response
 is encoded, the server includes a Content-Encoding header in the
 response. The value of the Content-Encoding header indicates which scheme was
 used to encode the data.

 A client may tell a server that it can understand several different encoding
 schemes. In this case the server may choose any one of those and use it to
 encode the response (indicating which one using the Content-Encoding header).
 It's also possible for a client to attach priorities to different schemes so
 that the server knows which it prefers. See sec 14.3 of RFC 2616 for more
 information on the Accept-Encoding header.

## Supported content encodings

 The 'deflate' and 'gzip' content encodings are supported by libcurl. Both
 regular and chunked transfers work fine. The zlib library is required for
 this feature.

## The libcurl interface

 To cause libcurl to request a content encoding use:

 [`curl_easy_setopt`][1](curl, [`CURLOPT_ACCEPT_ENCODING`][5], string)

 where string is the intended value of the Accept-Encoding header.

 Currently, libcurl only understands how to process responses that use the
 "deflate" or "gzip" Content-Encoding, so the only values for
 [`CURLOPT_ACCEPT_ENCODING`][5] that will work (besides "identity", which
 does nothing) are "deflate" and "gzip". If a response is encoded using
 "compress" or another unsupported method, libcurl will return an error
 indicating that the response could not be decoded. If string is NULL, no
 Accept-Encoding header is generated. If string is a zero-length string, an
 Accept-Encoding header containing all supported encodings will be
 generated.

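 The three cases for the option value (NULL, zero-length, anything else) can
 be sketched like this. The function `sketch_accept_encoding()` and the
 `SKETCH_ALL_ENCODINGS` list are assumptions made for illustration; this is
 not how libcurl builds the header internally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed list of encodings this hypothetical build can decode. */
#define SKETCH_ALL_ENCODINGS "deflate, gzip"

/* Write the Accept-Encoding header line for the given option value.
   Returns 0 when no header should be sent (option is NULL), else 1. */
static int sketch_accept_encoding(const char *option, char *buf, size_t len)
{
  if(!option)
    return 0;                        /* NULL: no header at all */
  if(option[0] == '\0')
    option = SKETCH_ALL_ENCODINGS;   /* "": advertise everything supported */
  snprintf(buf, len, "Accept-Encoding: %s", option);
  return 1;
}

/* Usage sketch covering the three cases. */
static int sketch_accept_encoding_demo(void)
{
  char buf[64];
  assert(sketch_accept_encoding(NULL, buf, sizeof(buf)) == 0);
  assert(sketch_accept_encoding("", buf, sizeof(buf)) == 1);
  assert(strcmp(buf, "Accept-Encoding: deflate, gzip") == 0);
  assert(sketch_accept_encoding("gzip", buf, sizeof(buf)) == 1);
  assert(strcmp(buf, "Accept-Encoding: gzip") == 0);
  return 0;
}
```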
 [`CURLOPT_ACCEPT_ENCODING`][5] must be set to a non-NULL value for content
 to be automatically decoded. If it is not set and the server still sends
 encoded content (despite not having been asked), the data is returned in
 its raw form and the Content-Encoding type is not checked.

## The curl interface

 Use the [--compressed][6] option with curl to cause it to ask servers to
 compress responses using any format supported by curl.

<a name="hostip"></a>
hostip.c explained
==================

 The main compile-time defines to keep in mind when reading the host*.c source
 files are these:

## `CURLRES_IPV6`

 This host has getaddrinfo() and family, and thus we use that. The host may
 not be able to resolve IPv6, but we don't really have to take that into
 account. Hosts that aren't IPv6-enabled have `CURLRES_IPV4` defined.

## `CURLRES_ARES`

 is defined if libcurl is built to use c-ares for asynchronous name
 resolves. This can be Windows or \*nix.

## `CURLRES_THREADED`

 is defined if libcurl is built to use threading for asynchronous name
 resolves. The name resolve will be done in a new thread, and the supported
 asynch API will be the same as for ares-builds. This is the default under
 (native) Windows.

 If either of the two previous is defined, `CURLRES_ASYNCH` is defined too.
 If libcurl is not built to use an asynchronous resolver, `CURLRES_SYNCH` is
 defined.

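 The relationship between these defines can be sketched with the
 preprocessor. The sketch below assumes a build that selected the threaded
 resolver and sets `CURLRES_THREADED` by hand; in libcurl the defines come
 from hostip.h and the config headers, and `sketch_resolver_kind()` is a
 hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Assumption for this sketch: pretend the build selected the threaded
   resolver. A real build derives this from its configuration. */
#define CURLRES_THREADED 1

/* Either asynchronous backend implies CURLRES_ASYNCH, else CURLRES_SYNCH. */
#if defined(CURLRES_ARES) || defined(CURLRES_THREADED)
#define CURLRES_ASYNCH 1
#else
#define CURLRES_SYNCH 1
#endif

/* Hypothetical helper reporting which resolver family was compiled in. */
static const char *sketch_resolver_kind(void)
{
#ifdef CURLRES_ASYNCH
  return "asynch";
#else
  return "synch";
#endif
}

static int sketch_resolver_demo(void)
{
  assert(strcmp(sketch_resolver_kind(), "asynch") == 0);
  return 0;
}
```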
## host*.c sources

 The host*.c source files are split up like this:

 - hostip.c - method-independent resolver functions and utility functions
 - hostasyn.c - functions for asynchronous name resolves
 - hostsyn.c - functions for synchronous name resolves
 - asyn-ares.c - functions for asynchronous name resolves using c-ares
 - asyn-thread.c - functions for asynchronous name resolves using threads
 - hostip4.c - IPv4 specific functions
 - hostip6.c - IPv6 specific functions

 hostip.h is the single united header file for all this. It defines the
 `CURLRES_*` defines based on the config*.h and curl_setup.h defines.

<a name="memoryleak"></a>
Track Down Memory Leaks
=======================

## Single-threaded

 Please note that this memory leak system is not adjusted to work in more
 than one thread. If you want/need to use it in a multi-threaded app, please
 adjust accordingly.

## Build

 Rebuild libcurl with -DCURLDEBUG (usually, rerunning configure with
 --enable-debug fixes this). Run 'make clean' first, then 'make' so that all
 files actually are rebuilt properly. It also makes sense to build libcurl
 with the debug option (usually -g to the compiler) so that debugging it
 will be easier if you actually do find a leak in the library.

 This will create a library that has memory debugging enabled.

## Modify Your Application

 Add a line in your application code:

 `curl_memdebug("dump");`

 This will make the malloc debug system output a full trace of all resource
 using functions to the given file name. Make sure you rebuild your program
 and that you link with the same libcurl you built for this purpose as
 described above.

## Run Your Application

 Run your program as usual. Watch the specified memory trace file grow.

 Make your program exit and use the proper libcurl cleanup functions etc., so
 that all non-leaks are returned/freed properly.

## Analyze the Flow

 Use the tests/memanalyze.pl perl script to analyze the dump file:

 `tests/memanalyze.pl dump`

 This outputs a report on which resources were allocated but never freed,
 etc. This report is well suited for posting to the list!

 If this doesn't produce any output, no leak was detected in libcurl. Then
 the leak is most likely to be in your code.

<a name="multi_socket"></a>
`multi_socket`
==============

 Implementation of the `curl_multi_socket` API

 The main ideas of this API are simply:

 1 - The application can use whatever event system it likes, as it gets info
     from libcurl about which file descriptors libcurl waits on for which
     action. (The previous API returns `fd_sets`, which is very
     select()-centric.)

 2 - When the application discovers action on a single socket, it calls
     libcurl and informs it that there was action on this particular socket,
     and libcurl can then act on that socket/transfer only and not care
     about any other transfers. (The previous API always had to scan through
     all the existing transfers.)

 The idea is that [`curl_multi_socket_action()`][7] calls a given callback
 with information about what socket to wait for what action on, and the
 callback only gets called if the status of that socket has changed.

 We also added a timer callback that makes libcurl call the application when
 the timeout value changes, and you set that with [`curl_multi_setopt()`][9]
 and the [`CURLMOPT_TIMERFUNCTION`][10] option. To get this to work, there is
 internally a struct added to each easy handle in which we store an "expire
 time" (if any). The structs are then "splay sorted" so that we can add and
 remove times from the linked list and yet somewhat swiftly figure out both
 how long there is until the next nearest timer expires and which timer
 (handle) we should take care of now. Of course, the upside of all this is
 that we get a [`curl_multi_timeout()`][8] that should also work with
 old-style applications that use [`curl_multi_perform()`][11].

 We created an internal "socket to easy handles" hash table that, given
 a socket (file descriptor), returns the easy handle that waits for action on
 that socket. This hash is made using the already existing hash code
 (previously only used for the DNS cache).

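 To make the "socket to easy handle" idea concrete, here is a minimal sketch
 of such a lookup table. It deliberately swaps in a small fixed-size table
 with linear probing instead of libcurl's generic hash code, has no removal,
 and assumes fewer entries than slots. All `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 64  /* assumed table size; must exceed the entry count */

struct sketch_easy { int id; };  /* stand-in for an easy handle */

struct sketch_sockhash {
  int used[SKETCH_SLOTS];
  int sock[SKETCH_SLOTS];
  struct sketch_easy *easy[SKETCH_SLOTS];
};

/* Find the slot holding 'sock', or the free slot where it would go. */
static size_t sketch_slot(const struct sketch_sockhash *h, int sock)
{
  size_t i = (size_t)(unsigned int)sock % SKETCH_SLOTS;
  while(h->used[i] && h->sock[i] != sock)
    i = (i + 1) % SKETCH_SLOTS;  /* linear probing */
  return i;
}

/* Register 'easy' as the handle waiting for action on 'sock'. */
static void sketch_sock_add(struct sketch_sockhash *h, int sock,
                            struct sketch_easy *easy)
{
  size_t i = sketch_slot(h, sock);
  h->used[i] = 1;
  h->sock[i] = sock;
  h->easy[i] = easy;
}

/* Return the handle waiting on 'sock', or NULL if none is registered. */
static struct sketch_easy *sketch_sock_lookup(const struct sketch_sockhash *h,
                                              int sock)
{
  size_t i = sketch_slot(h, sock);
  return h->used[i] ? h->easy[i] : NULL;
}

static int sketch_sockhash_demo(void)
{
  static struct sketch_sockhash table;  /* zero-initialized */
  static struct sketch_easy a = { 1 }, b = { 2 };
  sketch_sock_add(&table, 5, &a);
  sketch_sock_add(&table, 69, &b);      /* 69 % 64 == 5, so it collides */
  assert(sketch_sock_lookup(&table, 5) == &a);
  assert(sketch_sock_lookup(&table, 69) == &b);
  assert(sketch_sock_lookup(&table, 7) == NULL);
  return 0;
}
```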
 To make libcurl able to report plain sockets in the socket callback, we had
 to re-organize the internals of [`curl_multi_fdset()`][12] etc. so that
 the conversion from sockets to `fd_sets` for that function is only done in
 the last step before the data is returned. I also had to extend c-ares to
 get a function that can return plain sockets, as that library too returned
 only `fd_sets` and that is no longer good enough. The changes done to c-ares
 are available in c-ares 1.3.1 and later.

<a name="structs"></a>
Structs in libcurl
==================

This section should cover 7.32.0 pretty accurately, but will make sense even
for older and later versions as things don't change drastically that often.

## SessionHandle

 The SessionHandle struct is the one returned to the outside in the
 external API as a "CURL *". This is usually known as an easy handle in API
 documentation and examples.

 Information and state that is related to the actual connection is in the
 'connectdata' struct. When a transfer is about to be made, libcurl will
 either create a new connection or re-use an existing one. The particular
 connectdata that is used by this handle is pointed out by
 SessionHandle->easy_conn.

 Data and information that regards this particular single transfer is put in
 the SingleRequest sub-struct.

 When the SessionHandle struct is added to a multi handle, as it must be in
 order to do any transfer, the ->multi member will point to the `Curl_multi`
 struct it belongs to. The ->prev and ->next members will then be used by the
 multi code to keep a linked list of SessionHandle structs that are added to
 that same multi handle. libcurl always uses multi so ->multi *will* point to
 a `Curl_multi` when a transfer is in progress.

 ->mstate is the multi state of this particular SessionHandle. When
 `multi_runsingle()` is called, it will act on this handle according to which
 state it is in. The mstate is also what tells which sockets to return for a
 specific SessionHandle when [`curl_multi_fdset()`][12] is called etc.

 The libcurl source code generally uses the name 'data' for the variable that
 points to the SessionHandle.

 When doing multiplexed HTTP/2 transfers, each SessionHandle is associated
 with an individual stream, sharing the same connectdata struct. Multiplexing
 makes it even more important to keep things associated with the right thing!

## connectdata

 A general idea in libcurl is to keep connections around in a connection
 "cache" after they have been used, in case they will be used again, and then
 re-use an existing one instead of creating a new one, as this gives a
 significant performance boost.

 Each 'connectdata' identifies a single physical connection to a server. If
 the connection can't be kept alive, the connection will be closed after use
 and then this struct can be removed from the cache and freed.

 Thus, the same SessionHandle can be used multiple times and each time select
 another connectdata struct to use for the connection. Keep this in mind, as
 it is then important to consider if options or choices are based on the
 connection or the SessionHandle.

 Functions in libcurl will assume that connectdata->data points to the
 SessionHandle that uses this connection (for the moment).

 As a special complexity, some protocols supported by libcurl require a
 special disconnect procedure that is more than just shutting down the
 socket. It can involve sending one or more commands to the server before
 doing so. Since connections are kept in the connection cache after use, the
 original SessionHandle may no longer be around when the time comes to shut
 down a particular connection. For this purpose, libcurl holds a special
 dummy `closure_handle` SessionHandle in the `Curl_multi` struct to use when
 needed.

 FTP uses two TCP connections for a typical transfer but it keeps both in
 this single struct and thus can be considered a single connection for most
 internal concerns.

 The libcurl source code generally uses the name 'conn' for the variable that
 points to the connectdata.

## Curl_multi

 Internally, the easy interface is implemented as a wrapper around multi
 interface functions. This makes everything multi interface.

 `Curl_multi` is the multi handle struct exposed as "CURLM *" in external APIs.

 This struct holds a list of SessionHandle structs that have been added to
 this handle with [`curl_multi_add_handle()`][13]. The start of the list is
 ->easyp and ->num_easy is a counter of added SessionHandles.

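 The bookkeeping described above (a list start, a counter, and prev/next
 links on each handle) can be sketched as follows. The structs are reduced,
 hypothetical stand-ins that only borrow the member names mentioned in the
 text; this is not the code behind `curl_multi_add_handle()`.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_multi;

/* Reduced stand-in for SessionHandle: only the list-related members. */
struct sketch_handle {
  struct sketch_handle *prev;
  struct sketch_handle *next;
  struct sketch_multi *multi;  /* owning multi, NULL when not added */
};

/* Reduced stand-in for Curl_multi. */
struct sketch_multi {
  struct sketch_handle *easyp; /* start of the list */
  int num_easy;                /* number of added handles */
};

/* Append 'data' to the end of the multi's list and record the owner. */
static void sketch_add_handle(struct sketch_multi *multi,
                              struct sketch_handle *data)
{
  struct sketch_handle *last = multi->easyp;
  data->next = NULL;
  data->multi = multi;
  if(!last) {
    data->prev = NULL;
    multi->easyp = data;
  }
  else {
    while(last->next)
      last = last->next;
    last->next = data;
    data->prev = last;
  }
  multi->num_easy++;
}

static int sketch_multi_demo(void)
{
  static struct sketch_multi multi;
  static struct sketch_handle a, b;
  sketch_add_handle(&multi, &a);
  sketch_add_handle(&multi, &b);
  assert(multi.easyp == &a && a.next == &b && b.prev == &a);
  assert(multi.num_easy == 2 && b.multi == &multi);
  return 0;
}
```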
 ->msglist is a linked list of messages to send back when
 [`curl_multi_info_read()`][14] is called. Basically a node is added to that
 list when an individual SessionHandle's transfer has completed.

 ->hostcache points to the name cache. It is a hash table for looking up name
 to IP. The nodes have a limited life time in there and this cache is meant
 to reduce the time for when the same name is wanted within a short period of
 time.

 ->timetree points to a tree of SessionHandles, sorted by the remaining time
 until it should be checked - normally some sort of timeout. Each
 SessionHandle has one node in the tree.

 ->sockhash is a hash table that allows fast lookups from a socket
 descriptor to the SessionHandle that uses that descriptor. This is
 necessary for the `multi_socket` API.

 ->conn_cache points to the connection cache. It keeps track of all
 connections that are kept after use. The cache has a maximum size.

 ->closure_handle is described in the 'connectdata' section.

 The libcurl source code generally uses the name 'multi' for the variable that
 points to the `Curl_multi` struct.

## Curl_handler

 Each unique protocol that is supported by libcurl needs to provide at least
 one `Curl_handler` struct. It defines what the protocol is called and what
 functions the main code should call to deal with protocol specific issues.
 In general, there's a source file named [protocol].c in which there's a
 "struct `Curl_handler` `Curl_handler_[protocol]`" declared. In url.c there's
 then the main array that points to all the individual `Curl_handler`
 structs, and it is scanned through when a URL is given to libcurl to work
 with.

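 The scan through the handler array can be sketched roughly like this. The
 struct is a heavily reduced, hypothetical stand-in with only a scheme and a
 default port, and the three entries are examples picked for the sketch; the
 real `Curl_handler` and the array in url.c contain much more.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Reduced stand-in for Curl_handler. */
struct sketch_handler {
  const char *scheme;  /* URL scheme name, uppercase */
  int defport;         /* default port */
};

static const struct sketch_handler sketch_http  = { "HTTP", 80 };
static const struct sketch_handler sketch_https = { "HTTPS", 443 };
static const struct sketch_handler sketch_ftp   = { "FTP", 21 };

/* NULL-terminated array pointing to every known handler. */
static const struct sketch_handler *const sketch_protocols[] = {
  &sketch_http, &sketch_https, &sketch_ftp, NULL
};

/* Case-insensitive comparison of two scheme names. */
static int sketch_scheme_equal(const char *a, const char *b)
{
  while(*a && *b) {
    if(toupper((unsigned char)*a) != toupper((unsigned char)*b))
      return 0;
    a++;
    b++;
  }
  return *a == *b;  /* equal only if both strings ended together */
}

/* Scan the array for the handler matching 'scheme'; NULL if unsupported. */
static const struct sketch_handler *sketch_find_handler(const char *scheme)
{
  size_t i;
  for(i = 0; sketch_protocols[i]; i++) {
    if(sketch_scheme_equal(sketch_protocols[i]->scheme, scheme))
      return sketch_protocols[i];
  }
  return NULL;
}

static int sketch_handler_demo(void)
{
  assert(sketch_find_handler("https") == &sketch_https);
  assert(sketch_find_handler("ftp")->defport == 21);
  assert(sketch_find_handler("gopher") == NULL);
  return 0;
}
```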
 ->scheme is the URL scheme name, usually spelled out in uppercase. That's
 "HTTP" or "FTP" etc. SSL versions of the protocol need their own
 `Curl_handler` setup, so HTTPS is separate from HTTP.

 ->setup_connection is called to allow the protocol code to allocate protocol
 specific data that then gets associated with that SessionHandle for the rest
 of this transfer. It gets freed again at the end of the transfer. It will be
 called before the 'connectdata' for the transfer has been selected/created.
 Most protocols will allocate their private 'struct [PROTOCOL]' here and
 assign SessionHandle->req.protop to point to it.

 ->connect_it allows a protocol to do some specific actions after the TCP
 connect is done, that can still be considered part of the connection phase.

 Some protocols will alter the connectdata->recv[] and connectdata->send[]
 function pointers in this function.

 ->connecting is similarly a function that keeps getting called as long as the
 protocol considers itself still in the connecting phase.

 ->do_it is the function called to issue the transfer request. This is what
 we call the DO action internally. If the DO is not enough and things need to
 be kept getting done for the entire DO sequence to complete, ->doing is then
 usually also provided. Each protocol that needs to do multiple commands or
 similar for do/doing needs to implement its own state machine (see SCP,
 SFTP, FTP). Some protocols (only FTP and only due to historical reasons)
 have a separate piece of the DO state called `DO_MORE`.

 ->doing keeps getting called while issuing the transfer request command(s).

 ->done gets called when the transfer is complete and DONE. That's after the
 main data has been transferred.

 ->do_more gets called during the `DO_MORE` state. The FTP protocol uses this
 state when setting up the second connection.

 ->`proto_getsock`, ->`doing_getsock`, ->`domore_getsock` and
 ->`perform_getsock` are functions that return socket information: which
 socket(s) to wait for which action(s) during the particular multi state.

 ->disconnect is called immediately before the TCP connection is shut down.

 ->readwrite gets called during transfer to allow the protocol to do extra
 reads/writes.

 ->defport is the default TCP or UDP port this protocol uses.

 ->protocol is one or more bits in the `CURLPROTO_*` set. The SSL versions
 have their "base" protocol set and then the SSL variation, like
 "HTTP|HTTPS".

 ->flags is a bitmask with additional information about the protocol that will
 make it get treated differently by the generic engine:

 - `PROTOPT_SSL` - will make it connect and negotiate SSL

 - `PROTOPT_DUAL` - this protocol uses two connections

 - `PROTOPT_CLOSEACTION` - this protocol has actions to do before closing the
   connection. This flag is no longer used by code, yet still set for a bunch
   of protocol handlers.

 - `PROTOPT_DIRLOCK` - "direction lock". The SSH protocols set this bit to
   limit which "direction" of socket actions that the main engine will
   concern itself with.

 - `PROTOPT_NONETWORK` - a protocol that doesn't use the network (read:
   file:)

 - `PROTOPT_NEEDSPWD` - this protocol needs a password and will use a default
   one unless one is provided

 - `PROTOPT_NOURLQUERY` - this protocol can't handle a query part on the URL
   (?foo=bar)

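 How a generic engine might test such a bitmask is sketched below. The bit
 values, the `SKETCH_` prefixed names, the reduced struct and the flag
 combinations given to the two example protocols are all assumptions made
 for illustration; the real values live in the libcurl sources.

```c
#include <assert.h>

/* Bit values are assumptions made for this sketch only. */
#define SKETCH_PROTOPT_SSL        (1u << 0)
#define SKETCH_PROTOPT_DUAL       (1u << 1)
#define SKETCH_PROTOPT_NONETWORK  (1u << 4)
#define SKETCH_PROTOPT_NEEDSPWD   (1u << 5)

/* Reduced stand-in for a protocol handler. */
struct sketch_proto {
  const char *scheme;
  unsigned int flags;
};

/* True when the generic engine would need to negotiate SSL. */
static int sketch_uses_ssl(const struct sketch_proto *p)
{
  return (p->flags & SKETCH_PROTOPT_SSL) != 0;
}

/* True when the protocol needs a network connection at all. */
static int sketch_uses_network(const struct sketch_proto *p)
{
  return (p->flags & SKETCH_PROTOPT_NONETWORK) == 0;
}

static int sketch_flags_demo(void)
{
  /* Illustrative flag sets, not copied from libcurl. */
  struct sketch_proto ftps = { "FTPS",
    SKETCH_PROTOPT_SSL | SKETCH_PROTOPT_DUAL | SKETCH_PROTOPT_NEEDSPWD };
  struct sketch_proto file = { "FILE", SKETCH_PROTOPT_NONETWORK };
  assert(sketch_uses_ssl(&ftps) && sketch_uses_network(&ftps));
  assert(!sketch_uses_ssl(&file) && !sketch_uses_network(&file));
  return 0;
}
```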
## conncache

 This is a hash table with connections for later re-use. Each SessionHandle
 has a pointer to its connection cache. Each multi handle sets up a
 connection cache that all added SessionHandles share by default.

## Curl_share

 The libcurl share API allocates a `Curl_share` struct, exposed to the
 external API as "CURLSH *".

 The idea is that the struct can have its own set of caches and pools, and
 by providing this struct in the `CURLOPT_SHARE` option, those specific
 SessionHandles will use the caches/pools that this share handle holds.

 Then individual SessionHandle structs can be made to share specific things
 that they otherwise wouldn't, such as cookies.

 The `Curl_share` struct can currently hold cookies, DNS cache and the SSL
 session cache.

## CookieInfo

 This is the main cookie struct. It holds all known cookies and related
 information. Each SessionHandle has its own private CookieInfo even when
 they are added to a multi handle. They can be made to share cookies by using
 the share API.


[1]: http://curl.haxx.se/libcurl/c/curl_easy_setopt.html
[2]: http://curl.haxx.se/libcurl/c/curl_easy_init.html
[3]: http://c-ares.haxx.se/
[4]: https://tools.ietf.org/html/rfc7230 "RFC 7230"
[5]: http://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
[6]: http://curl.haxx.se/docs/manpage.html#--compressed
[7]: http://curl.haxx.se/libcurl/c/curl_multi_socket_action.html
[8]: http://curl.haxx.se/libcurl/c/curl_multi_timeout.html
[9]: http://curl.haxx.se/libcurl/c/curl_multi_setopt.html
[10]: http://curl.haxx.se/libcurl/c/CURLMOPT_TIMERFUNCTION.html
[11]: http://curl.haxx.se/libcurl/c/curl_multi_perform.html
[12]: http://curl.haxx.se/libcurl/c/curl_multi_fdset.html
[13]: http://curl.haxx.se/libcurl/c/curl_multi_add_handle.html
[14]: http://curl.haxx.se/libcurl/c/curl_multi_info_read.html