.\" **************************************************************************
.\" *                                  _   _ ____  _
.\" *  Project                     ___| | | |  _ \| |
.\" *                             / __| | | | |_) | |
.\" *                            | (__| |_| |  _ <| |___
.\" *                             \___|\___/|_| \_\_____|
.\" *
.\" * Copyright (C) 1998 - 2018, Daniel Stenberg, <daniel@haxx.se>, et al.
.\" *
.\" * This software is licensed as described in the file COPYING, which
.\" * you should have received as part of this distribution. The terms
.\" * are also available at https://curl.haxx.se/docs/copyright.html.
.\" *
.\" * You may opt to use, copy, modify, merge, publish, distribute and/or sell
.\" * copies of the Software, and permit persons to whom the Software is
.\" * furnished to do so, under the terms of the COPYING file.
.\" *
.\" * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY
.\" * KIND, either express or implied.
.\" *
.\" **************************************************************************
.\"
.TH libcurl-tutorial 3 "February 23, 2018" "libcurl 7.59.0" "libcurl programming"

.SH NAME
libcurl-tutorial \- libcurl programming tutorial
.SH "Objective"
This document attempts to describe the general principles and some basic
approaches to consider when programming with libcurl. The text will focus
mainly on the C interface but might apply fairly well to other interfaces as
well, as they usually follow the C one pretty closely.

This document will refer to 'the user' as the person writing the source code
that uses libcurl. That would probably be you or someone in your position.
What will be generally referred to as 'the program' will be the collected
source code that you write that is using libcurl for transfers. The program
is outside libcurl and libcurl is outside of the program.

To get more details on all options and functions described herein, please
refer to their respective man pages.

.SH "Building"
There are many different ways to build C programs. This chapter will assume a
Unix style build process. If you use a different build system, you can still
read this to get general information that may apply to your environment as
well.
.IP "Compiling the Program"
Your compiler needs to know where the libcurl headers are located. Therefore
you must set your compiler's include path to point to the directory where you
installed them. The 'curl-config'[3] tool can be used to get this information:

 $ curl-config --cflags

.IP "Linking the Program with libcurl"
After compiling the program, you need to link your object files to create
a single executable. For that to succeed, you need to link with libcurl and
possibly also with other libraries that libcurl itself depends on, such as the
OpenSSL libraries. Even some standard OS libraries may be needed on the
command line. To figure out which flags to use, once again the 'curl-config'
tool comes to the rescue:

 $ curl-config --libs

.IP "SSL or Not"
libcurl can be built and customized in many ways. One of the things that
varies between different libraries and builds is the support for SSL-based
transfers, like HTTPS and FTPS. If a supported SSL library was detected
properly at build-time, libcurl will be built with SSL support. To figure out
if an installed libcurl has been built with SSL support enabled, use
\&'curl-config' like this:

 $ curl-config --feature

If SSL is supported, the keyword 'SSL' will be written to stdout,
possibly together with a few other features that could be either on or off
for different libcurls.

See also the "Features libcurl Provides" section further down.
.IP "autoconf macro"
When you write your configure script to detect libcurl and set up variables
accordingly, we offer a prewritten macro that probably does everything you
need in this area. See the docs/libcurl/libcurl.m4 file, which includes docs
on how to use it.

.SH "Portable Code in a Portable World"
The people behind libcurl have put considerable effort into making libcurl
work on a large number of different operating systems and environments.

You program libcurl the same way on all platforms that libcurl runs on. There
are only a few minor considerations that differ. If you just make sure to
write your code portable enough, you may very well create yourself a very
portable program. libcurl shouldn't stop you from that.

.SH "Global Preparation"
The program must initialize some of the libcurl functionality globally. That
means it should be done exactly once, no matter how many times you intend to
use the library. Once for your program's entire lifetime. This is done using

 curl_global_init()

and it takes one parameter which is a bit pattern that tells libcurl what to
initialize. Using \fICURL_GLOBAL_ALL\fP will make it initialize all known
internal sub modules, and might be a good default option. The current two bits
that are specified are:
.RS
.IP "CURL_GLOBAL_WIN32"
which only does anything on Windows machines. When used on
a Windows machine, it'll make libcurl initialize the win32 socket
stuff. Without having that initialized properly, your program cannot use
sockets properly. You should only do this once for each application, so if
your program already does this or if another library in use does it, you
should not tell libcurl to do this as well.
.IP CURL_GLOBAL_SSL
which only does anything on libcurls compiled and built SSL-enabled. On these
systems, this will make libcurl initialize the SSL library properly for this
application. This only needs to be done once for each application so if your
program or another library already does this, this bit should not be needed.
.RE

libcurl has a default protection mechanism that detects if
\fIcurl_global_init(3)\fP hasn't been called by the time
\fIcurl_easy_perform(3)\fP is called and if that is the case, libcurl runs the
function itself with a guessed bit pattern. Please note that depending solely
on this is not considered good practice.

When the program no longer uses libcurl, it should call
\fIcurl_global_cleanup(3)\fP, which is the opposite of the init call. It will
then do the reversed operations to clean up the resources the
\fIcurl_global_init(3)\fP call initialized.

Repeated calls to \fIcurl_global_init(3)\fP and \fIcurl_global_cleanup(3)\fP
should be avoided. They should only be called once each.

.SH "Features libcurl Provides"
It is considered best-practice to determine libcurl features at run-time
rather than at build-time (if possible of course). By calling
\fIcurl_version_info(3)\fP and checking out the details of the returned
struct, your program can figure out exactly what the currently running libcurl
supports.

.SH "Two Interfaces"
libcurl first introduced the so called easy interface. All operations in the
easy interface are prefixed with 'curl_easy'. The easy interface lets you do
single transfers with a synchronous and blocking function call.

libcurl also offers another interface that allows multiple simultaneous
transfers in a single thread, the so called multi interface. More about that
interface is detailed in a separate chapter further down. You still need to
understand the easy interface first, so please continue reading for better
understanding.
.SH "Handle the Easy libcurl"
To use the easy interface, you must first create yourself an easy handle. You
need one handle for each easy session you want to perform. Basically, you
should use one handle for every thread you plan to use for transferring. You
must never share the same handle in multiple threads.

Get an easy handle with

 easyhandle = curl_easy_init();

It returns an easy handle. Using that you proceed to the next step: setting
up your preferred actions. A handle is just a logical entity for the upcoming
transfer or series of transfers.

You set properties and options for this handle using
\fIcurl_easy_setopt(3)\fP. They control how the subsequent transfer or
transfers will be made. Options remain set in the handle until set again to
something different. They are sticky. Multiple requests using the same handle
will use the same options.

If you at any point would like to blank all previously set options for a
single easy handle, you can call \fIcurl_easy_reset(3)\fP and you can also
make a clone of an easy handle (with all its set options) using
\fIcurl_easy_duphandle(3)\fP.

Many of the options you set in libcurl are "strings", pointers to data
terminated with a zero byte. When you set strings with
\fIcurl_easy_setopt(3)\fP, libcurl makes its own copy so that they don't need
to be kept around in your application after being set[4].

One of the most basic properties to set in the handle is the URL. You set your
preferred URL to transfer with \fICURLOPT_URL(3)\fP in a manner similar to:

.nf
 curl_easy_setopt(easyhandle, CURLOPT_URL, "http://domain.com/");
.fi

Let's assume for a while that you want to receive data as the URL identifies a
remote resource you want to get here. Since you write a sort of application
that needs this transfer, I assume that you would like to get the data passed
to you directly instead of simply getting it passed to stdout. So, you write
your own function that matches this prototype:

 size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp);

You tell libcurl to pass all data to this function by issuing a call
similar to this:

 curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, write_data);

You can control what data your callback function gets in the fourth argument
by setting another property:

 curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &internal_struct);

Using that property, you can easily pass local data between your application
and the function that gets invoked by libcurl. libcurl itself won't touch the
data you pass with \fICURLOPT_WRITEDATA(3)\fP.

libcurl offers its own default internal callback that will take care of the
data if you don't set the callback with \fICURLOPT_WRITEFUNCTION(3)\fP. It
will then simply output the received data to stdout. You can have the default
callback write the data to a different file handle by passing a 'FILE *' to a
file opened for writing with the \fICURLOPT_WRITEDATA(3)\fP option.

Now, we need to take a step back and take a deep breath. Here's one of those
rare platform-dependent nitpicks. Did you spot it? On some platforms[2],
libcurl won't be able to operate on files opened by the program. Thus, if you
use the default callback and pass in an open file with
\fICURLOPT_WRITEDATA(3)\fP, it will crash. You should therefore avoid this to
make your program run fine virtually everywhere.

(\fICURLOPT_WRITEDATA(3)\fP was formerly known as \fICURLOPT_FILE\fP. Both
names still work and do the same thing).

If you're using libcurl as a win32 DLL, you MUST use the
\fICURLOPT_WRITEFUNCTION(3)\fP if you set \fICURLOPT_WRITEDATA(3)\fP - or you
will experience crashes.

There are of course many more options you can set, and we'll get back to a few
of them later. Let's instead continue to the actual transfer:

 success = curl_easy_perform(easyhandle);

\fIcurl_easy_perform(3)\fP will connect to the remote site, do the necessary
commands and receive the transfer. Whenever it receives data, it calls the
callback function we previously set. The function may get one byte at a time,
or it may get many kilobytes at once. libcurl delivers as much as possible as
often as possible. Your callback function should return the number of bytes it
\&"took care of". If that is not the exact same number of bytes that was
passed to it, libcurl will abort the operation and return with an error code.
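
The write callback contract described above can be sketched as follows. This
is one possible callback, assuming the program wants the whole response
collected in memory; the struct memory type and its field names are
hypothetical and stand in for the internal_struct mentioned earlier.

.nf
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator passed to the callback as userp. */
struct memory {
  char *data;   /* heap buffer, zero terminated once non-NULL */
  size_t len;   /* number of payload bytes stored so far */
};

/* Matches the write callback prototype: appends each chunk to the
   accumulator and reports how many bytes it took care of. */
size_t write_data(void *buffer, size_t size, size_t nmemb, void *userp)
{
  size_t total = size * nmemb;
  struct memory *mem = (struct memory *)userp;
  char *grown = realloc(mem->data, mem->len + total + 1);

  if(!grown)
    return 0;   /* not equal to total, so libcurl aborts the transfer */

  memcpy(grown + mem->len, buffer, total);
  mem->data = grown;
  mem->len += total;
  mem->data[mem->len] = 0;
  return total;
}
```
.fi

Such a callback would be registered with \fICURLOPT_WRITEFUNCTION(3)\fP, and
a pointer to a zero-initialized struct memory passed with
\fICURLOPT_WRITEDATA(3)\fP. The program frees the data field when done.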

When the transfer is complete, the function returns a return code that informs
you if it succeeded in its mission or not. If a return code isn't enough for
you, you can use the \fICURLOPT_ERRORBUFFER(3)\fP to point libcurl to a buffer
of yours where it'll store a human readable error message as well.

If you then want to transfer another file, the handle is ready to be used
again. Mind you, it is even preferred that you re-use an existing handle if
you intend to make another transfer. libcurl will then attempt to re-use the
previous connection.

For some protocols, downloading a file can involve a complicated process of
logging in, setting the transfer mode, changing the current directory and
finally transferring the file data. libcurl takes care of all that
complication for you. Given simply the URL to a file, libcurl will take care
of all the details needed to get the file moved from one machine to another.

.SH "Multi-threading Issues"
libcurl is thread safe but there are a few exceptions. Refer to
\fIlibcurl-thread(3)\fP for more information.

.SH "When It Doesn't Work"
There will always be times when the transfer fails for some reason. You might
have set the wrong libcurl option or misunderstood what the libcurl option
actually does, or the remote server might return non-standard replies that
confuse the library which then confuses your program.

There's one golden rule when these things occur: set the
\fICURLOPT_VERBOSE(3)\fP option to 1. It'll cause the library to spew out the
entire protocol details it sends, some internal info and some received
protocol data as well (especially when using FTP). If you're using HTTP,
adding the headers in the received output to study is also a clever way to get
a better understanding why the server behaves the way it does. Include headers
in the normal body output with \fICURLOPT_HEADER(3)\fP set to 1.

Of course, there are bugs left. We need to know about them to be able to fix
them, so we're quite dependent on your bug reports! When you do report
suspected bugs in libcurl, please include as many details as you possibly can:
a protocol dump that \fICURLOPT_VERBOSE(3)\fP produces, library version, as
much as possible of your code that uses libcurl, operating system name and
version, compiler name and version etc.

If \fICURLOPT_VERBOSE(3)\fP is not enough, you can increase the level of debug
data your application receives by using the \fICURLOPT_DEBUGFUNCTION(3)\fP.

Getting some in-depth knowledge about the protocols involved is never wrong,
and if you're trying to do funny things, you might very well understand
libcurl and how to use it better if you study the appropriate RFC documents
at least briefly.

.SH "Upload Data to a Remote Site"
libcurl tries to keep a protocol independent approach to most transfers, thus
uploading to a remote FTP site is very similar to uploading data to an HTTP
server with a PUT request.

Of course, first you either create an easy handle or you re-use an existing
one. Then you set the URL to operate on just like before. This is the remote
URL that we will now upload to.

Since we write an application, we most likely want libcurl to get the upload
data by asking us for it. To make it do that, we set the read callback and
the custom pointer libcurl will pass to our read callback. The read callback
should have a prototype similar to:

 size_t function(char *bufptr, size_t size, size_t nitems, void *userp);

Where bufptr is the pointer to a buffer we fill in with data to upload and
size*nitems is the size of the buffer and therefore also the maximum amount
of data we can return to libcurl in this call. The 'userp' pointer is the
custom pointer we set to point to a struct of ours to pass private data
between the application and the callback.

 curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function);

 curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata);

Tell libcurl that we want to upload:

 curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L);

A few protocols won't behave properly when uploads are done without any prior
knowledge of the expected file size. So, set the upload file size using the
\fICURLOPT_INFILESIZE_LARGE(3)\fP for all known file sizes like this[1]:

.nf
 /* in this example, file_size must be a curl_off_t variable */
 curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size);
.fi

When you call \fIcurl_easy_perform(3)\fP this time, it'll perform all the
necessary operations and when it has invoked the upload it'll call your
supplied callback to get the data to upload. The program should return as much
data as possible in every invocation, as that is likely to make the upload
perform as fast as possible. The callback should return the number of bytes it
wrote in the buffer. Returning 0 will signal the end of the upload.
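
The read callback behavior described above can be sketched like this,
assuming the data to upload already sits in a memory buffer. The struct
upload type and its field names are hypothetical; a real program might read
from a file instead.

.nf
```c
#include <string.h>

/* Hypothetical upload source passed to the callback as userp. */
struct upload {
  const char *data;   /* next byte to send */
  size_t left;        /* bytes not yet handed to libcurl */
};

/* Matches the read callback prototype: fills bufptr with as much data as
   fits and returns the number of bytes written into it. */
size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp)
{
  struct upload *up = (struct upload *)userp;
  size_t room = size * nitems;
  size_t n = up->left < room ? up->left : room;

  memcpy(bufptr, up->data, n);
  up->data += n;
  up->left -= n;
  return n;   /* 0 once everything is sent, which ends the upload */
}
```
.fi

Each call hands over at most size*nitems bytes, so a buffer larger than that
is delivered over several invocations.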

.SH "Passwords"
Many protocols use or even require that user name and password are provided
to be able to download or upload the data of your choice. libcurl offers
several ways to specify them.

Most protocols let you specify the name and password in the URL
itself. libcurl will detect this and use them accordingly. This is written
like this:

 protocol://user:password@example.com/path/

If you need any odd letters in your user name or password, you should enter
them URL encoded, as %XX where XX is a two-digit hexadecimal number.
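
To illustrate the %XX form, the sketch below percent-encodes every byte that
is not an RFC 3986 unreserved character. The function name encode_userinfo is
an assumption made for this example; in a real program, libcurl's own
\fIcurl_easy_escape(3)\fP does this job.

.nf
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc()ed, percent-encoded copy of the
   input that the caller must free, or NULL if allocation fails. */
char *encode_userinfo(const char *in)
{
  static const char hex[] = "0123456789ABCDEF";
  static const char keep[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
  char *out = malloc(strlen(in) * 3 + 1);   /* worst case: all encoded */
  char *p = out;

  if(!out)
    return NULL;
  for(; *in; in++) {
    unsigned char c = (unsigned char)*in;
    if(strchr(keep, c))
      *p++ = (char)c;        /* unreserved: copy as is */
    else {
      *p++ = '%';
      *p++ = hex[c >> 4];    /* high four bits */
      *p++ = hex[c & 15];    /* low four bits */
    }
  }
  *p = 0;
  return out;
}
```
.fi

With this sketch, a password such as p@ss:w/rd would be written as
p%40ss%3Aw%2Frd in the URL.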

libcurl also provides options to set various passwords. The user name and
password as shown embedded in the URL can instead get set with the
\fICURLOPT_USERPWD(3)\fP option. The argument passed to libcurl should be a
char * to a string in the format "user:password". In a manner like this:

 curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret");

Another case where name and password might be needed at times is for those
users who need to authenticate themselves to a proxy they use. libcurl offers
another option for this, the \fICURLOPT_PROXYUSERPWD(3)\fP. It is used much
like the \fICURLOPT_USERPWD(3)\fP option:

 curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "myname:thesecret");

There's a long time Unix "standard" way of storing FTP user names and
passwords, namely in the $HOME/.netrc file. The file should be made private
so that only the user may read it (see also the "Security Considerations"
chapter), as it might contain the password in plain text. libcurl has the
ability to use this file to figure out what set of user name and password to
use for a particular host. As an extension to the normal functionality,
libcurl also supports this file for non-FTP protocols such as HTTP. To make
curl use this file, use the \fICURLOPT_NETRC(3)\fP option:

 curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L);

And a very basic example of what such a .netrc file may look like:

.nf
 machine myhost.mydomain.com
 login userlogin
 password secretword
.fi

All these examples have been cases where the password has been optional, or
at least you could leave it out and have libcurl attempt to do its job
without it. There are times when the password isn't optional, like when
you're using an SSL private key for secure transfers.

To pass the known private key password to libcurl:

 curl_easy_setopt(easyhandle, CURLOPT_KEYPASSWD, "keypassword");

.SH "HTTP Authentication"
The previous chapter showed how to set user name and password for getting
URLs that require authentication. When using the HTTP protocol, there are
many different ways a client can provide those credentials to the server and
you can control which way libcurl will (attempt to) use them. The default HTTP
authentication method is called 'Basic', which is sending the name and
password in clear-text in the HTTP request, base64-encoded. This is insecure.

At the time of this writing, libcurl can be built to use: Basic, Digest, NTLM,
Negotiate (SPNEGO). You can tell libcurl which one to use
with \fICURLOPT_HTTPAUTH(3)\fP as in:

 curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH, CURLAUTH_DIGEST);

And when you send authentication to a proxy, you can also set authentication
type the same way but instead with \fICURLOPT_PROXYAUTH(3)\fP:

 curl_easy_setopt(easyhandle, CURLOPT_PROXYAUTH, CURLAUTH_NTLM);

Both these options allow you to set multiple types (by ORing them together),
to make libcurl pick the most secure one out of the types the server/proxy
claims to support. This method does however add a round-trip since libcurl
must first ask the server what it supports:

 curl_easy_setopt(easyhandle, CURLOPT_HTTPAUTH,
 CURLAUTH_DIGEST|CURLAUTH_BASIC);

For convenience, you can use the 'CURLAUTH_ANY' define (instead of a list
with specific types) which allows libcurl to use whatever method it wants.

When asking for multiple types, libcurl will pick the available one it
considers "best" in its own internal order of preference.

.SH "HTTP POSTing"
We get many questions regarding how to issue HTTP POSTs with libcurl the
proper way. This chapter will thus include examples using both
versions of HTTP POST that libcurl supports.

The first version is the simple POST, the most common version, that most HTML
pages using the <form> tag use. We provide a pointer to the data and tell
libcurl to post it all to the remote site:

.nf
    char *data="name=daniel&project=curl";
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, data);
    curl_easy_setopt(easyhandle, CURLOPT_URL, "http://posthere.com/");

    curl_easy_perform(easyhandle); /* post away! */
.fi

Simple enough, huh? Since you set the POST options with the
\fICURLOPT_POSTFIELDS(3)\fP, this automatically switches the handle to use
POST in the upcoming request.

OK, so what if you want to post binary data that also requires you to set the
Content-Type: header of the post? Well, binary posts prevent libcurl from
being able to do strlen() on the data to figure out the size, so therefore we
must tell libcurl the size of the post data. Setting headers in libcurl
requests is done in a generic way, by building a list of our own headers and
then passing that list to libcurl.

.nf
    struct curl_slist *headers=NULL;
    headers = curl_slist_append(headers, "Content-Type: text/xml");

    /* post binary data */
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDS, binaryptr);

    /* set the size of the postfields data */
    curl_easy_setopt(easyhandle, CURLOPT_POSTFIELDSIZE, 23L);

    /* pass our list of custom made headers */
    curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);

    curl_easy_perform(easyhandle); /* post away! */

    curl_slist_free_all(headers); /* free the header list */
.fi
| 472 | |
While the simple examples above cover the majority of all cases where HTTP
POST operations are required, they don't do multi-part formposts. Multi-part
formposts were introduced as a better way to post (possibly large) binary data
and were first documented in RFC 1867 (updated in RFC 2388). They're called
multi-part because they're built by a chain of parts, each part being a single
unit of data. Each part has its own name and contents. You can in fact create
and post a multi-part formpost with the regular libcurl POST support described
above, but that would require that you build the formpost yourself and provide
it to libcurl. To make that easier, libcurl provides a MIME API consisting of
several functions: using those, you can create and fill a multi-part form.
Function \fIcurl_mime_init(3)\fP creates a multi-part body; you can then
append new parts to a multi-part body using \fIcurl_mime_addpart(3)\fP.
There are three possible data sources for a part: memory using
\fIcurl_mime_data(3)\fP, file using \fIcurl_mime_filedata(3)\fP and
user-defined data read callback using \fIcurl_mime_data_cb(3)\fP.
\fIcurl_mime_name(3)\fP sets a part's (i.e. form field) name, while
\fIcurl_mime_filename(3)\fP fills in the remote file name. With
\fIcurl_mime_type(3)\fP, you can tell the MIME type of a part, and
\fIcurl_mime_headers(3)\fP allows defining the part's headers. When a
multi-part body is no longer needed, you can destroy it using
\fIcurl_mime_free(3)\fP.

The following example sets two simple text parts with plain textual contents,
and then a file with binary contents and uploads the whole thing.
| 497 | |
| 498 | .nf |
 curl_mime *multipart = curl_mime_init(easyhandle);
 curl_mimepart *part = curl_mime_addpart(multipart);
 curl_mime_name(part, "name");
 curl_mime_data(part, "daniel", CURL_ZERO_TERMINATED);
 part = curl_mime_addpart(multipart);
 curl_mime_name(part, "project");
 curl_mime_data(part, "curl", CURL_ZERO_TERMINATED);
 part = curl_mime_addpart(multipart);
 curl_mime_name(part, "logotype-image");
 curl_mime_filedata(part, "curl.png");

 /* Set the form info */
 curl_easy_setopt(easyhandle, CURLOPT_MIMEPOST, multipart);

 curl_easy_perform(easyhandle); /* post away! */

 /* free the post data again */
 curl_mime_free(multipart);
| 517 | .fi |
| 518 | |
To post multiple files for a single form field, you must supply each file in
a separate part, all with the same field name. Although function
\fIcurl_mime_subparts(3)\fP implements nested multi-parts, this way of
posting multiple files is deprecated by RFC 7578, section 4.3.
| 523 | |
| 524 | To set the data source from an already opened FILE pointer, use: |
| 525 | |
| 526 | .nf |
| 527 | curl_mime_data_cb(part, filesize, (curl_read_callback) fread, |
| 528 | (curl_seek_callback) fseek, NULL, filepointer); |
| 529 | .fi |
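When the data lives in memory but should still be delivered piece by piece, a
read callback can be written by hand. The following is a minimal sketch of
such a callback; the struct and function names are made up for illustration
and are not part of libcurl.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical source descriptor: a memory buffer and a read position. */
struct mem_source {
  const char *data;   /* start of the data to send */
  size_t size;        /* total number of bytes */
  size_t pos;         /* bytes already handed out */
};

/* Read callback: copy at most size*nitems bytes into buffer and return
   the number of bytes copied. Returning 0 signals the end of the data. */
size_t mem_read(char *buffer, size_t size, size_t nitems, void *arg)
{
  struct mem_source *src = (struct mem_source *)arg;
  size_t room = size * nitems;
  size_t left = src->size - src->pos;
  size_t n = left < room ? left : room;

  if(n) {
    memcpy(buffer, src->data + src->pos, n);
    src->pos += n;
  }
  return n;
}
```

Such a callback would be attached to a part with
curl_mime_data_cb(part, (curl_off_t) src.size, mem_read, NULL, NULL, &src).
Passing NULL as the seek callback, as assumed here, means libcurl cannot
rewind the data should it need to send it again.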
| 530 | |
The deprecated \fIcurl_formadd(3)\fP function is still supported in libcurl.
It should however not be used for new designs, and programs using it
ought to be converted to the MIME API. It is described here only as an
aid to conversion.
| 535 | |
| 536 | Using \fIcurl_formadd\fP, you add parts to the form. When you're done adding |
| 537 | parts, you post the whole form. |
| 538 | |
| 539 | The MIME API example above is expressed as follows using this function: |
| 540 | |
| 541 | .nf |
 struct curl_httppost *post=NULL;
 struct curl_httppost *last=NULL;
 curl_formadd(&post, &last,
              CURLFORM_COPYNAME, "name",
              CURLFORM_COPYCONTENTS, "daniel", CURLFORM_END);
 curl_formadd(&post, &last,
              CURLFORM_COPYNAME, "project",
              CURLFORM_COPYCONTENTS, "curl", CURLFORM_END);
 curl_formadd(&post, &last,
              CURLFORM_COPYNAME, "logotype-image",
              CURLFORM_FILECONTENT, "curl.png", CURLFORM_END);

 /* Set the form info */
 curl_easy_setopt(easyhandle, CURLOPT_HTTPPOST, post);

 curl_easy_perform(easyhandle); /* post away! */

 /* free the post data again */
 curl_formfree(post);
| 561 | .fi |
| 562 | |
Multipart formposts are chains of parts using MIME-style separators and
headers. This means that each of these separate parts gets a few headers set
that describe its individual content-type, size etc. To enable your
application to handcraft this formpost even more, libcurl allows you to
supply your own set of custom headers to such an individual form part. You can
of course supply headers to as many parts as you like, but this little example
shows how you set headers on one specific part when you add it to the
post handle:
| 571 | |
| 572 | .nf |
| 573 | struct curl_slist *headers=NULL; |
| 574 | headers = curl_slist_append(headers, "Content-Type: text/xml"); |
| 575 | |
| 576 | curl_formadd(&post, &last, |
| 577 | CURLFORM_COPYNAME, "logotype-image", |
| 578 | CURLFORM_FILECONTENT, "curl.xml", |
| 579 | CURLFORM_CONTENTHEADER, headers, |
| 580 | CURLFORM_END); |
| 581 | |
| 582 | curl_easy_perform(easyhandle); /* post away! */ |
| 583 | |
| 584 | curl_formfree(post); /* free post */ |
| 585 | curl_slist_free_all(headers); /* free custom header list */ |
| 586 | .fi |
| 587 | |
Since all options on an easyhandle are "sticky", they remain the same until
changed, even after you call \fIcurl_easy_perform(3)\fP. You may therefore
need to tell curl to go back to a plain GET request if you intend to do one as
your next request. You force an easyhandle to go back to GET by using the
\fICURLOPT_HTTPGET(3)\fP option:

 curl_easy_setopt(easyhandle, CURLOPT_HTTPGET, 1L);

Just setting \fICURLOPT_POSTFIELDS(3)\fP to "" or NULL will *not* stop libcurl
from doing a POST. It will just make it POST without any data to send!

.SH "Converting from deprecated form API to MIME API"
Four rules have to be respected when building the multi-part:
| 601 | .br |
| 602 | - The easy handle must be created before building the multi-part. |
| 603 | .br |
| 604 | - The multi-part is always created by a call to curl_mime_init(easyhandle). |
| 605 | .br |
| 606 | - Each part is created by a call to curl_mime_addpart(multipart). |
| 607 | .br |
| 608 | - When complete, the multi-part must be bound to the easy handle using |
| 609 | \fICURLOPT_MIMEPOST(3)\fP instead of \fICURLOPT_HTTPPOST(3)\fP. |
| 610 | |
Here are some examples of \fIcurl_formadd\fP calls converted to MIME API sequences:
| 612 | |
| 613 | .nf |
 curl_formadd(&post, &last,
              CURLFORM_COPYNAME, "id",
              CURLFORM_COPYCONTENTS, "daniel",
              CURLFORM_CONTENTHEADER, headers,
              CURLFORM_END);
| 619 | .fi |
| 620 | becomes: |
| 621 | .nf |
| 622 | part = curl_mime_addpart(multipart); |
| 623 | curl_mime_name(part, "id"); |
| 624 | curl_mime_data(part, "daniel", CURL_ZERO_TERMINATED); |
| 625 | curl_mime_headers(part, headers, FALSE); |
| 626 | .fi |
| 627 | |
Setting the last \fIcurl_mime_headers\fP argument to TRUE would have caused
the headers to be automatically released when the multi-part is destroyed,
thus saving a clean-up call to \fIcurl_slist_free_all(3)\fP.
| 631 | |
| 632 | .nf |
| 633 | curl_formadd(&post, &last, |
| 634 | CURLFORM_PTRNAME, "logotype-image", |
| 635 | CURLFORM_FILECONTENT, "-", |
| 636 | CURLFORM_END); |
| 637 | .fi |
| 638 | becomes: |
| 639 | .nf |
 part = curl_mime_addpart(multipart);
 curl_mime_name(part, "logotype-image");
 curl_mime_data_cb(part, (curl_off_t) -1, (curl_read_callback) fread,
                   (curl_seek_callback) fseek, NULL, stdin);
| 643 | .fi |
| 644 | |
\fIcurl_mime_name\fP always copies the field name. The special file name "-"
is not supported by \fIcurl_mime_filedata\fP: to read from an already open
file, use a callback source using fread(). The transfer will be chunked since
the data size is unknown.
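Instead of handing fread() itself to libcurl, a small wrapper with exactly
the read callback signature can be used. This is only a sketch and the
function name is an assumption made for the example.

```c
#include <stdio.h>

/* Read callback with the signature libcurl expects for a part source.
   arg is the FILE pointer given as the last curl_mime_data_cb() argument. */
size_t file_read(char *buffer, size_t size, size_t nitems, void *arg)
{
  FILE *fp = (FILE *)arg;

  /* read single bytes so that the return value is a byte count */
  return fread(buffer, 1, size * nitems, fp);
}
```

The wrapper would be passed as the read callback in place of fread, with the
open FILE pointer (stdin in the example above) as the last argument. It avoids
calling fread() through a function pointer of a different type.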
| 649 | |
| 650 | .nf |
| 651 | curl_formadd(&post, &last, |
| 652 | CURLFORM_COPYNAME, "datafile[]", |
| 653 | CURLFORM_FILE, "file1", |
| 654 | CURLFORM_FILE, "file2", |
| 655 | CURLFORM_END); |
| 656 | .fi |
| 657 | becomes: |
| 658 | .nf |
| 659 | part = curl_mime_addpart(multipart); |
| 660 | curl_mime_name(part, "datafile[]"); |
| 661 | curl_mime_filedata(part, "file1"); |
| 662 | part = curl_mime_addpart(multipart); |
| 663 | curl_mime_name(part, "datafile[]"); |
| 664 | curl_mime_filedata(part, "file2"); |
| 665 | .fi |
| 666 | |
The deprecated multipart/mixed implementation of a multiple-file field is
translated to two distinct parts with the same name.
| 669 | |
| 670 | .nf |
| 671 | curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, myreadfunc); |
| 672 | curl_formadd(&post, &last, |
| 673 | CURLFORM_COPYNAME, "stream", |
| 674 | CURLFORM_STREAM, arg, |
| 675 | CURLFORM_CONTENTLEN, (curl_off_t) datasize, |
| 676 | CURLFORM_FILENAME, "archive.zip", |
| 677 | CURLFORM_CONTENTTYPE, "application/zip", |
| 678 | CURLFORM_END); |
| 679 | .fi |
| 680 | becomes: |
| 681 | .nf |
| 682 | part = curl_mime_addpart(multipart); |
| 683 | curl_mime_name(part, "stream"); |
| 684 | curl_mime_data_cb(part, (curl_off_t) datasize, |
| 685 | myreadfunc, NULL, NULL, arg); |
| 686 | curl_mime_filename(part, "archive.zip"); |
| 687 | curl_mime_type(part, "application/zip"); |
| 688 | .fi |
| 689 | |
The \fICURLOPT_READFUNCTION\fP callback is not used: it is replaced by directly
setting the part's source data from the callback read function.
| 692 | |
| 693 | .nf |
| 694 | curl_formadd(&post, &last, |
| 695 | CURLFORM_COPYNAME, "memfile", |
| 696 | CURLFORM_BUFFER, "memfile.bin", |
| 697 | CURLFORM_BUFFERPTR, databuffer, |
| 698 | CURLFORM_BUFFERLENGTH, (long) sizeof databuffer, |
| 699 | CURLFORM_END); |
| 700 | .fi |
| 701 | becomes: |
| 702 | .nf |
| 703 | part = curl_mime_addpart(multipart); |
| 704 | curl_mime_name(part, "memfile"); |
| 705 | curl_mime_data(part, databuffer, (curl_off_t) sizeof databuffer); |
| 706 | curl_mime_filename(part, "memfile.bin"); |
| 707 | .fi |
| 708 | |
\fIcurl_mime_data\fP always copies the initial data: the data buffer is thus
free for immediate reuse.
| 711 | |
| 712 | .nf |
| 713 | curl_formadd(&post, &last, |
| 714 | CURLFORM_COPYNAME, "message", |
| 715 | CURLFORM_FILECONTENT, "msg.txt", |
| 716 | CURLFORM_END); |
| 717 | .fi |
| 718 | becomes: |
| 719 | .nf |
| 720 | part = curl_mime_addpart(multipart); |
| 721 | curl_mime_name(part, "message"); |
| 722 | curl_mime_filedata(part, "msg.txt"); |
| 723 | curl_mime_filename(part, NULL); |
| 724 | .fi |
| 725 | |
| 726 | Use of \fIcurl_mime_filedata\fP sets the remote file name as a side effect: it |
| 727 | is therefore necessary to clear it for \fICURLFORM_FILECONTENT\fP emulation. |
| 728 | |
.SH "Showing Progress"

For historical and traditional reasons, libcurl has a built-in progress meter
that can be switched on to display transfer progress in your terminal.

Switch on the progress meter by, oddly enough, setting
\fICURLOPT_NOPROGRESS(3)\fP to zero. This option is set to 1 by default.

For most applications however, the built-in progress meter is useless and
what instead is interesting is the ability to specify a progress
callback. The function pointer you pass to libcurl will then be called at
irregular intervals with information about the current transfer.

Set the progress callback by using \fICURLOPT_PROGRESSFUNCTION(3)\fP, passing
a pointer to a function that matches this prototype:

.nf
| 747 | int progress_callback(void *clientp, |
| 748 | double dltotal, |
| 749 | double dlnow, |
| 750 | double ultotal, |
| 751 | double ulnow); |
| 752 | .fi |
| 753 | |
If any of the input arguments is unknown, a 0 will be passed. The first
argument, 'clientp', is the pointer you pass to libcurl with
\fICURLOPT_PROGRESSDATA(3)\fP. libcurl won't touch it.
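
A callback matching that prototype could look like the sketch below. The
state struct and its fields are assumptions made for the example; only the
callback signature comes from libcurl.

```c
/* Hypothetical application state reached through the clientp pointer. */
struct progress_state {
  double last_percent;  /* last download percentage computed, -1 if none */
  int cancel;           /* set to non-zero by the application to abort */
};

/* Matches the progress callback prototype. Returning non-zero makes
   libcurl abort the transfer. */
int progress_callback(void *clientp,
                      double dltotal, double dlnow,
                      double ultotal, double ulnow)
{
  struct progress_state *st = (struct progress_state *)clientp;

  (void)ultotal;  /* upload counters are not used in this sketch */
  (void)ulnow;

  if(dltotal > 0)   /* 0 means the total size is unknown */
    st->last_percent = dlnow * 100.0 / dltotal;

  return st->cancel;
}
```

The callback is only invoked when \fICURLOPT_NOPROGRESS(3)\fP is set to zero.
Since 7.32.0 libcurl also offers \fICURLOPT_XFERINFOFUNCTION(3)\fP, which
passes the same counters as curl_off_t values instead of doubles.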
.SH "libcurl with C++"
| 759 | |
| 760 | There's basically only one thing to keep in mind when using C++ instead of C |
| 761 | when interfacing libcurl: |
| 762 | |
| 763 | The callbacks CANNOT be non-static class member functions |
| 764 | |
| 765 | Example C++ code: |
| 766 | |
| 767 | .nf |
 class AClass {
   static size_t write_data(void *ptr, size_t size, size_t nmemb,
                            void *ourpointer)
   {
     /* do what you want with the data */
     return size * nmemb; /* tell libcurl all data was handled */
   }
 };
| 775 | .fi |
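
A static member function has no object of its own, so per-object state has to
travel through the user pointer, exactly as in plain C. The sketch below shows
that pattern in C; the collector struct is a made-up example type, and the
pointer is assumed to have been registered with \fICURLOPT_WRITEDATA(3)\fP.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size collector that the callback fills in. */
struct collector {
  char data[256];
  size_t used;
};

/* Same signature as write_data above. The object is reached through
   ourpointer, so no member function is needed. */
size_t write_data(void *ptr, size_t size, size_t nmemb, void *ourpointer)
{
  struct collector *c = (struct collector *)ourpointer;
  size_t n = size * nmemb;

  if(n > sizeof(c->data) - c->used)
    return 0;   /* a count different from n signals an error to libcurl */
  memcpy(c->data + c->used, ptr, n);
  c->used += n;
  return n;
}
```

In C++ the same idea applies: the static function casts ourpointer back to
the class type and forwards the data to the object.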
| 776 | |
| 777 | .SH "Proxies" |
| 778 | |
| 779 | What "proxy" means according to Merriam-Webster: "a person authorized to act |
| 780 | for another" but also "the agency, function, or office of a deputy who acts as |
| 781 | a substitute for another". |
| 782 | |
| 783 | Proxies are exceedingly common these days. Companies often only offer Internet |
| 784 | access to employees through their proxies. Network clients or user-agents ask |
| 785 | the proxy for documents, the proxy does the actual request and then it returns |
| 786 | them. |
| 787 | |
| 788 | libcurl supports SOCKS and HTTP proxies. When a given URL is wanted, libcurl |
| 789 | will ask the proxy for it instead of trying to connect to the actual host |
| 790 | identified in the URL. |
| 791 | |
| 792 | If you're using a SOCKS proxy, you may find that libcurl doesn't quite support |
| 793 | all operations through it. |
| 794 | |
For HTTP proxies: the fact that the proxy is an HTTP proxy puts certain
restrictions on what can actually happen. A requested URL that might not be an
HTTP URL will still be passed to the HTTP proxy to deliver back to
libcurl. This happens transparently, and an application may not need to
know. I say "may", because at times it is very important to understand that
all operations over an HTTP proxy use the HTTP protocol. For example, you
can't invoke your own custom FTP commands or even proper FTP directory
listings.
| 803 | |
| 804 | .IP "Proxy Options" |
| 805 | |
| 806 | To tell libcurl to use a proxy at a given port number: |
| 807 | |
| 808 | curl_easy_setopt(easyhandle, CURLOPT_PROXY, "proxy-host.com:8080"); |
| 809 | |
| 810 | Some proxies require user authentication before allowing a request, and you |
| 811 | pass that information similar to this: |
| 812 | |
| 813 | curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "user:password"); |
| 814 | |
If you want to, you can specify the host name only in the
\fICURLOPT_PROXY(3)\fP option, and set the port number separately with
\fICURLOPT_PROXYPORT(3)\fP.

Tell libcurl what kind of proxy it is with \fICURLOPT_PROXYTYPE(3)\fP (if not
set, libcurl assumes an HTTP proxy):

 curl_easy_setopt(easyhandle, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
| 823 | |
| 824 | .IP "Environment Variables" |
| 825 | |
libcurl automatically checks and uses a set of environment variables to know
what proxies to use for certain protocols. The names of the variables
follow an ancient de facto standard and are built up as "[protocol]_proxy"
(note the lower casing). The variable \&'http_proxy' is thus checked for the
name of a proxy to use when the input URL is HTTP. Following the same rule,
the variable named 'ftp_proxy' is checked for FTP URLs. Again, the proxies are
always HTTP proxies; the different names of the variables simply allow
different HTTP proxies to be used.
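
The naming rule can be illustrated with a few lines of C. This is a sketch of
the rule as described here, not libcurl's internal code, and the function name
is invented for the example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build "[protocol]_proxy" in lower case into out. Returns 0 on success
   or -1 if out is too small. */
int proxy_env_name(const char *protocol, char *out, size_t outlen)
{
  static const char suffix[] = "_proxy";
  size_t plen = strlen(protocol);
  size_t i;

  if(plen + sizeof(suffix) > outlen)   /* sizeof counts the final zero */
    return -1;
  for(i = 0; i < plen; i++)
    out[i] = (char)tolower((unsigned char)protocol[i]);
  memcpy(out + plen, suffix, sizeof(suffix));
  return 0;
}
```

The resulting name could then be given to getenv() to inspect what proxy an
application would pick up from its environment.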
| 834 | |
The proxy environment variable contents should be in the format
\&"[protocol://][user:password@]machine[:port]". The protocol:// part is
simply ignored if present (so http://proxy and bluerk://proxy will do the
same) and the optional port number specifies on which port the proxy operates
on the host. If not specified, the internal default port number will be used
and that is most likely *not* the one you would like it to be.
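
To make the format concrete, here is a rough sketch of how such a string
could be split into its components. The struct, the function name and the
buffer sizes are assumptions made for the example, IPv6 literals are not
handled, and libcurl's own parser is more thorough.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for the sketch. */
struct proxy_spec {
  char userpwd[64];  /* "user:password", empty if absent */
  char host[128];    /* machine name */
  long port;         /* 0 if no port was given */
};

/* Returns 0 on success, -1 if a component is malformed or does not fit. */
int parse_proxy(const char *str, struct proxy_spec *out)
{
  const char *p = strstr(str, "://");
  const char *at;
  size_t len;

  if(p)
    str = p + 3;               /* skip the protocol:// part */

  out->userpwd[0] = 0;
  out->port = 0;

  at = strrchr(str, '@');
  if(at) {                     /* optional user:password@ */
    len = (size_t)(at - str);
    if(len >= sizeof(out->userpwd))
      return -1;
    memcpy(out->userpwd, str, len);
    out->userpwd[len] = 0;
    str = at + 1;
  }

  len = strcspn(str, ":/");    /* machine name ends at ':' or '/' */
  if(len == 0 || len >= sizeof(out->host))
    return -1;
  memcpy(out->host, str, len);
  out->host[len] = 0;

  if(str[len] == ':') {        /* optional :port */
    char *end;
    out->port = strtol(str + len + 1, &end, 10);
    if(end == str + len + 1 || (*end && *end != '/') ||
       out->port < 1 || out->port > 65535)
      return -1;
  }
  return 0;
}
```

A port of 0 in the result means that none was given, in which case the
default port mentioned above would apply.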
| 841 | |
| 842 | There are two special environment variables. 'all_proxy' is what sets proxy |
| 843 | for any URL in case the protocol specific variable wasn't set, and |
| 844 | \&'no_proxy' defines a list of hosts that should not use a proxy even though a |
| 845 | variable may say so. If 'no_proxy' is a plain asterisk ("*") it matches all |
| 846 | hosts. |
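
The 'no_proxy' check could be approximated as shown below. This is only a
sketch: the function name is made up, the list is assumed to be
comma-separated, and only the plain asterisk and exact, case-sensitive host
names are handled, which is simpler than what libcurl itself does.

```c
#include <string.h>

/* Returns 1 if host should bypass the proxy according to the no_proxy
   value, 0 otherwise. */
int no_proxy_matches(const char *no_proxy, const char *host)
{
  size_t hostlen = strlen(host);

  if(!no_proxy || !*no_proxy)
    return 0;                   /* variable unset or empty */
  if(strcmp(no_proxy, "*") == 0)
    return 1;                   /* a plain asterisk matches all hosts */

  while(*no_proxy) {
    size_t len;

    while(*no_proxy == ',' || *no_proxy == ' ')
      no_proxy++;               /* skip separators */
    len = strcspn(no_proxy, ", ");
    if(len && len == hostlen && memcmp(no_proxy, host, len) == 0)
      return 1;
    no_proxy += len;
  }
  return 0;
}
```

An application that wants to predict libcurl's choice could call this with
the value of getenv("no_proxy") before consulting the protocol specific
variable and then 'all_proxy'.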
| 847 | |
To explicitly disable libcurl's checking for and using the proxy environment
variables, set the proxy name to "" - an empty string - with
\fICURLOPT_PROXY(3)\fP.
.IP "SSL and Proxies"
| 852 | |
SSL is for secure point-to-point connections. This involves strong encryption
and similar things, which effectively makes it impossible for a proxy to
operate as a "man in between", which is the proxy's task as previously
discussed. Instead, the only way to have SSL work over an HTTP proxy is to ask
the proxy to tunnel through everything without being able to check or fiddle
with the traffic.

Opening an SSL connection over an HTTP proxy is therefore a matter of asking
the proxy for a straight connection to the target host on a specified port.
This is done with the HTTP request CONNECT ("please mr proxy, connect me to
that remote host").
| 864 | |
Because of the nature of this operation, where the proxy has no idea what kind
of data is passed in and out through this tunnel, this breaks some of the
very few advantages that come from using a proxy, such as caching. Many
organizations prevent this kind of tunneling to destination port numbers
other than 443 (which is the default HTTPS port number).
| 870 | |
| 871 | .IP "Tunneling Through Proxy" |
As explained above, tunneling is required for SSL to work, and it is often
even restricted to the operation intended for SSL: HTTPS.
| 874 | |
| 875 | This is however not the only time proxy-tunneling might offer benefits to |
| 876 | you or your application. |
| 877 | |
| 878 | As tunneling opens a direct connection from your application to the remote |
| 879 | machine, it suddenly also re-introduces the ability to do non-HTTP |
| 880 | operations over a HTTP proxy. You can in fact use things such as FTP |
| 881 | upload or FTP custom commands this way. |
| 882 | |
| 883 | Again, this is often prevented by the administrators of proxies and is |
| 884 | rarely allowed. |
| 885 | |
| 886 | Tell libcurl to use proxy tunneling like this: |
| 887 | |
| 888 | curl_easy_setopt(easyhandle, CURLOPT_HTTPPROXYTUNNEL, 1L); |
| 889 | |
| 890 | In fact, there might even be times when you want to do plain HTTP |
| 891 | operations using a tunnel like this, as it then enables you to operate on |
| 892 | the remote server instead of asking the proxy to do so. libcurl will not |
| 893 | stand in the way for such innovative actions either! |
| 894 | |
| 895 | .IP "Proxy Auto-Config" |
| 896 | |
| 897 | Netscape first came up with this. It is basically a web page (usually using a |
| 898 | \&.pac extension) with a Javascript that when executed by the browser with the |
| 899 | requested URL as input, returns information to the browser on how to connect |
| 900 | to the URL. The returned information might be "DIRECT" (which means no proxy |
| 901 | should be used), "PROXY host:port" (to tell the browser where the proxy for |
| 902 | this particular URL is) or "SOCKS host:port" (to direct the browser to a SOCKS |
| 903 | proxy). |
| 904 | |
libcurl has no means to interpret or evaluate Javascript and thus it doesn't
support this. If you get yourself in a position where you face this nasty
invention, the following advice has been mentioned and used in the past:
| 908 | |
| 909 | - Depending on the Javascript complexity, write up a script that translates it |
| 910 | to another language and execute that. |
| 911 | |
| 912 | - Read the Javascript code and rewrite the same logic in another language. |
| 913 | |
| 914 | - Implement a Javascript interpreter; people have successfully used the |
| 915 | Mozilla Javascript engine in the past. |
| 916 | |
| 917 | - Ask your admins to stop this, for a static proxy setup or similar. |
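
As an illustration of the second piece of advice, here is how the logic of a
small, purely hypothetical PAC script (shown in the comment) might be
rewritten in C. The host names and function names are invented for the
example.

```c
#include <string.h>

/* Returns 1 if host ends with the given domain suffix. */
static int ends_with(const char *host, const char *suffix)
{
  size_t hlen = strlen(host);
  size_t slen = strlen(suffix);

  return hlen >= slen && strcmp(host + hlen - slen, suffix) == 0;
}

/* C version of this hypothetical PAC script:
 *
 *   function FindProxyForURL(url, host) {
 *     if (host == "localhost" ||
 *         dnsDomainIs(host, ".intra.example.com"))
 *       return "DIRECT";
 *     return "PROXY proxy.example.com:8080";
 *   }
 *
 * Returns NULL for "DIRECT", otherwise the proxy as "host:port". */
const char *find_proxy_for_host(const char *host)
{
  if(strcmp(host, "localhost") == 0 ||
     ends_with(host, ".intra.example.com"))
    return NULL;                      /* "DIRECT": use no proxy */
  return "proxy.example.com:8080";    /* "PROXY host:port" */
}
```

A NULL result would translate to setting \fICURLOPT_PROXY(3)\fP to an empty
string, and any other result could be passed to \fICURLOPT_PROXY(3)\fP as is.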
| 918 | |
| 919 | .SH "Persistence Is The Way to Happiness" |
| 920 | |
| 921 | Re-cycling the same easy handle several times when doing multiple requests is |
| 922 | the way to go. |
| 923 | |
| 924 | After each single \fIcurl_easy_perform(3)\fP operation, libcurl will keep the |
| 925 | connection alive and open. A subsequent request using the same easy handle to |
| 926 | the same host might just be able to use the already open connection! This |
| 927 | reduces network impact a lot. |
| 928 | |
Even if the connection is dropped, all subsequent SSL connections to the same
host will benefit from libcurl's session ID cache, which drastically
reduces re-connection time.

FTP connections that are kept alive save a lot of time, as the command-response
round-trips are skipped, and you also don't risk being refused when trying
to log in again, as can happen on the many FTP servers that only allow N
persons to be logged in at the same time.
| 937 | |
| 938 | libcurl caches DNS name resolving results, to make lookups of a previously |
| 939 | looked up name a lot faster. |
| 940 | |
| 941 | Other interesting details that improve performance for subsequent requests |
| 942 | may also be added in the future. |
| 943 | |
Each easy handle will attempt to keep the last few connections alive for a
while in case they are to be used again. You can set the size of this "cache"
with the \fICURLOPT_MAXCONNECTS(3)\fP option. Default is 5. There is very
seldom any point in changing this value, and if you think of changing this it
is often just a matter of thinking again.

To force your upcoming request to not use an already existing connection (it
will even close one first if there happens to be one alive to the same host
you're about to operate on), set
\fICURLOPT_FRESH_CONNECT(3)\fP to 1. In a similar spirit, you can also prevent
the connection used by the upcoming request from "lying" around and possibly
getting re-used afterwards by setting \fICURLOPT_FORBID_REUSE(3)\fP to 1.

.SH "HTTP Headers Used by libcurl"
When you use libcurl to do HTTP requests, it'll pass along a series of headers
automatically. It might be good for you to know and understand these. You
can replace or remove them by using the \fICURLOPT_HTTPHEADER(3)\fP option.

.IP "Host"
| 963 | This header is required by HTTP 1.1 and even many 1.0 servers and should be |
| 964 | the name of the server we want to talk to. This includes the port number if |
| 965 | anything but default. |
| 966 | |
.IP "Accept"
| 968 | \&"*/*". |
| 969 | |
| 970 | .IP "Expect" |
| 971 | When doing POST requests, libcurl sets this header to \&"100-continue" to ask |
| 972 | the server for an "OK" message before it proceeds with sending the data part |
| 973 | of the post. If the POSTed data amount is deemed "small", libcurl will not use |
| 974 | this header. |
| 975 | |
| 976 | .SH "Customizing Operations" |
| 977 | There is an ongoing development today where more and more protocols are built |
| 978 | upon HTTP for transport. This has obvious benefits as HTTP is a tested and |
| 979 | reliable protocol that is widely deployed and has excellent proxy-support. |
| 980 | |
| 981 | When you use one of these protocols, and even when doing other kinds of |
| 982 | programming you may need to change the traditional HTTP (or FTP or...) |
| 983 | manners. You may need to change words, headers or various data. |
| 984 | |
| 985 | libcurl is your friend here too. |
| 986 | |
| 987 | .IP CUSTOMREQUEST |
If just changing the actual HTTP request keyword is what you want, like when
GET, HEAD or POST is not good enough for you, \fICURLOPT_CUSTOMREQUEST(3)\fP
is there for you. It is very simple to use:

 curl_easy_setopt(easyhandle, CURLOPT_CUSTOMREQUEST, "MYOWNREQUEST");
| 993 | |
| 994 | When using the custom request, you change the request keyword of the actual |
| 995 | request you are performing. Thus, by default you make a GET request but you can |
| 996 | also make a POST operation (as described before) and then replace the POST |
| 997 | keyword if you want to. You're the boss. |
| 998 | |
| 999 | .IP "Modify Headers" |
HTTP-like protocols pass a series of headers to the server when doing the
request, and you're free to pass any number of extra headers that you
see fit. Adding headers is this easy:
| 1003 | |
| 1004 | .nf |
| 1005 | struct curl_slist *headers=NULL; /* init to NULL is important */ |
| 1006 | |
| 1007 | headers = curl_slist_append(headers, "Hey-server-hey: how are you?"); |
| 1008 | headers = curl_slist_append(headers, "X-silly-content: yes"); |
| 1009 | |
| 1010 | /* pass our list of custom made headers */ |
| 1011 | curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers); |
| 1012 | |
| 1013 | curl_easy_perform(easyhandle); /* transfer http */ |
| 1014 | |
| 1015 | curl_slist_free_all(headers); /* free the header list */ |
| 1016 | .fi |
| 1017 | |
| 1018 | \&... and if you think some of the internally generated headers, such as |
| 1019 | Accept: or Host: don't contain the data you want them to contain, you can |
| 1020 | replace them by simply setting them too: |
| 1021 | |
| 1022 | .nf |
| 1023 | headers = curl_slist_append(headers, "Accept: Agent-007"); |
| 1024 | headers = curl_slist_append(headers, "Host: munged.host.line"); |
| 1025 | .fi |
| 1026 | |
| 1027 | .IP "Delete Headers" |
| 1028 | If you replace an existing header with one with no contents, you will prevent |
| 1029 | the header from being sent. For instance, if you want to completely prevent the |
| 1030 | \&"Accept:" header from being sent, you can disable it with code similar to this: |
| 1031 | |
| 1032 | headers = curl_slist_append(headers, "Accept:"); |
| 1033 | |
| 1034 | Both replacing and canceling internal headers should be done with careful |
| 1035 | consideration and you should be aware that you may violate the HTTP protocol |
| 1036 | when doing so. |
| 1037 | |
| 1038 | .IP "Enforcing chunked transfer-encoding" |
| 1039 | |
| 1040 | By making sure a request uses the custom header "Transfer-Encoding: chunked" |
| 1041 | when doing a non-GET HTTP operation, libcurl will switch over to "chunked" |
| 1042 | upload, even though the size of the data to upload might be known. By default, |
| 1043 | libcurl usually switches over to chunked upload automatically if the upload |
| 1044 | data size is unknown. |
| 1045 | |
| 1046 | .IP "HTTP Version" |
| 1047 | |
All HTTP requests include the version number to tell the server which version
we support. libcurl speaks HTTP 1.1 by default. Some very old servers don't
like getting 1.1-requests and when dealing with stubborn old things like that,
you can tell libcurl to use 1.0 instead by doing something like this:
| 1052 | |
| 1053 | curl_easy_setopt(easyhandle, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_0); |
| 1054 | |
| 1055 | .IP "FTP Custom Commands" |
| 1056 | |
Not all protocols are HTTP-like, and thus the above may not help you when
you want to make, for example, your FTP transfers behave differently.
| 1059 | |
| 1060 | Sending custom commands to a FTP server means that you need to send the |
| 1061 | commands exactly as the FTP server expects them (RFC959 is a good guide |
| 1062 | here), and you can only use commands that work on the control-connection |
| 1063 | alone. All kinds of commands that require data interchange and thus need |
| 1064 | a data-connection must be left to libcurl's own judgement. Also be aware |
| 1065 | that libcurl will do its very best to change directory to the target |
| 1066 | directory before doing any transfer, so if you change directory (with CWD |
| 1067 | or similar) you might confuse libcurl and then it might not attempt to |
| 1068 | transfer the file in the correct remote directory. |
| 1069 | |
| 1070 | A little example that deletes a given file before an operation: |
| 1071 | |
| 1072 | .nf |
| 1073 | headers = curl_slist_append(headers, "DELE file-to-remove"); |
| 1074 | |
| 1075 | /* pass the list of custom commands to the handle */ |
| 1076 | curl_easy_setopt(easyhandle, CURLOPT_QUOTE, headers); |
| 1077 | |
| 1078 | curl_easy_perform(easyhandle); /* transfer ftp data! */ |
| 1079 | |
| 1080 | curl_slist_free_all(headers); /* free the header list */ |
| 1081 | .fi |
| 1082 | |
If you would instead want this operation (or chain of operations) to happen
_after_ the data transfer took place, the option to \fIcurl_easy_setopt(3)\fP
would instead be called \fICURLOPT_POSTQUOTE(3)\fP and used the exact same
way.

The custom FTP commands will be issued to the server in the same order they are
added to the list, and if a command gets an error code returned back from the
server, no more commands will be issued and libcurl will bail out with an
error code (CURLE_QUOTE_ERROR). Note that if you use \fICURLOPT_QUOTE(3)\fP to
send commands before a transfer, no transfer will actually take place when a
quote command has failed.

If you set the \fICURLOPT_HEADER(3)\fP to 1, you will tell libcurl to get
information about the target file and output "headers" about it. The headers
will be in "HTTP-style", looking like they do in HTTP.

The option to enable headers or to run custom FTP commands may be useful to
combine with \fICURLOPT_NOBODY(3)\fP. If this option is set, no actual file
content transfer will be performed.
Lucas Eckels | 9bd90e6 | 2012-08-06 15:07:02 -0700 | [diff] [blame] | 1102 | |
.IP "FTP Custom CUSTOMREQUEST"
If you do want to list the contents of an FTP directory using your own defined
FTP command, \fICURLOPT_CUSTOMREQUEST(3)\fP will do just that. "LIST" is the
default command for listing directories ("NLST" if
\fICURLOPT_DIRLISTONLY(3)\fP is set), but you're free to pass in your idea of
a good alternative.

.SH "Cookies Without Chocolate Chips"
In the HTTP sense, a cookie is a name with an associated value. A server sends
the name and value to the client, and expects it to get sent back on every
subsequent request to the server that matches the particular conditions
set. The conditions include that the domain name and path match and that the
cookie hasn't become too old.

In real-world cases, servers send new cookies to replace existing ones to
update them. Servers use cookies to "track" users and to keep "sessions".

Cookies are sent from server to clients with the header Set-Cookie: and
they're sent from clients to servers with the Cookie: header.

To just send whatever cookie you want to a server, you can use
\fICURLOPT_COOKIE(3)\fP to set a cookie string like this:

 curl_easy_setopt(easyhandle, CURLOPT_COOKIE, "name1=var1; name2=var2;");

In many cases, that is not enough. You might want to dynamically save
whatever cookies the remote server passes to you, and make sure those cookies
are then used accordingly on later requests.

One way to do this is to save all headers you receive in a plain file and,
when you make a request, tell libcurl to read the previous headers to
figure out which cookies to use. Set the header file to read cookies from with
\fICURLOPT_COOKIEFILE(3)\fP.

The \fICURLOPT_COOKIEFILE(3)\fP option also automatically enables the cookie
parser in libcurl. Until the cookie parser is enabled, libcurl will not parse
or understand incoming cookies and they will just be ignored. Once the parser
is enabled, cookies are understood, kept in memory and used properly in
subsequent requests made with the same handle. Many times this is enough, and
you may not have to save the cookies to disk at all. Note that the file you
specify to \fICURLOPT_COOKIEFILE(3)\fP doesn't have to exist to enable the
parser, so a common way to just enable the parser and not read any cookies is
to use the name of a file you know doesn't exist.

If you would rather use existing cookies that you've previously received with
your Netscape or Mozilla browsers, you can make libcurl use that cookie file
as input. \fICURLOPT_COOKIEFILE(3)\fP is used for that too, as libcurl
will automatically find out what kind of file it is and act accordingly.

Perhaps the most advanced cookie operation libcurl offers is saving the
entire internal cookie state back into a Netscape/Mozilla formatted cookie
file. We call that the cookie-jar. When you set a file name with
\fICURLOPT_COOKIEJAR(3)\fP, that file will be created and all received
cookies will be stored in it when \fIcurl_easy_cleanup(3)\fP is called. This
enables cookies to get passed on properly between multiple handles without any
information getting lost.

.SH "FTP Peculiarities We Need"

FTP transfers use a second TCP/IP connection for the data transfer. This is
usually a fact you can forget and ignore but at times this fact will come
back to haunt you. libcurl offers several different ways to customize how the
second connection is being made.

libcurl can either connect to the server a second time or tell the server to
connect back to it. The first option is the default and it is also what works
best for all the people behind firewalls, NATs or IP-masquerading setups.
libcurl then tells the server to open up a new port and wait for a second
connection. This is by default attempted with EPSV first, and if that doesn't
work it tries PASV instead. (EPSV is an extension to the original FTP spec
and is not supported by all FTP servers.)

You can prevent libcurl from first trying the EPSV command by setting
\fICURLOPT_FTP_USE_EPSV(3)\fP to zero.

In some cases, you will prefer to have the server connect back to you for the
second connection. This might be when the server is behind a firewall
and only allows connections on a single port. libcurl then
informs the remote server which IP address and port number to connect to.
This is done with the \fICURLOPT_FTPPORT(3)\fP option. If you set it to "-",
libcurl will use your system's "default IP address". If you want to use a
particular IP, you can set the full IP address, a host name to resolve to an
IP address or even a local network interface name that libcurl will get the IP
address from.

When doing the "PORT" approach, libcurl will attempt to use EPRT and then
LPRT before trying PORT, as they work with more protocols. You can disable
this behavior by setting \fICURLOPT_FTP_USE_EPRT(3)\fP to zero.

.SH "MIME API revisited for SMTP and IMAP"
In addition to supporting HTTP multi-part form fields, the MIME API can be
used to build structured e-mail messages and send them via SMTP or append such
messages to IMAP directories.

A structured e-mail message may contain several parts: some are displayed
inline by the MUA, some are attachments. Parts can also be structured as
multi-part, for example to include another e-mail message or to offer several
text format alternatives. This can be nested to any level.

To build such a message, you prepare the nth-level multi-part and then include
it as a source to the parent multi-part using function
\fIcurl_mime_subparts(3)\fP. Once it has been
bound to its parent multi-part, an nth-level multi-part belongs to it and
should not be freed explicitly.

E-mail message data is not supposed to contain non-ASCII characters and line
length is limited: fortunately, some transfer encodings are defined by the
standards to support the transmission of such incompatible data. Function
\fIcurl_mime_encoder(3)\fP tells a part that its source data must be encoded
before being sent. It also generates the corresponding header for that part.
If the part data you want to send is already encoded in such a scheme,
do not use this function (this would over-encode it), but explicitly set the
corresponding part header.

Upon sending such a message, libcurl prepends it with the header list
set with \fICURLOPT_HTTPHEADER(3)\fP, as 0th-level mime part headers.

Here is an example building an e-mail message with an inline plain/html text
alternative and a file attachment encoded in base64:

.nf
 curl_mime *message = curl_mime_init(easyhandle);

 /* The inline part is an alternative proposing the html and the text
    versions of the e-mail. */
 curl_mime *alt = curl_mime_init(easyhandle);

 /* HTML message. */
 curl_mimepart *part = curl_mime_addpart(alt);
 curl_mime_data(part, "<html><body><p>This is HTML</p></body></html>",
                      CURL_ZERO_TERMINATED);
 curl_mime_type(part, "text/html");

 /* Text message. */
 part = curl_mime_addpart(alt);
 curl_mime_data(part, "This is plain text message",
                      CURL_ZERO_TERMINATED);

 /* Create the inline part. */
 part = curl_mime_addpart(message);
 curl_mime_subparts(part, alt);
 curl_mime_type(part, "multipart/alternative");
 struct curl_slist *headers = curl_slist_append(NULL,
                   "Content-Disposition: inline");
 /* 1: the part takes ownership of the header list */
 curl_mime_headers(part, headers, 1);

 /* Add the attachment. */
 part = curl_mime_addpart(message);
 curl_mime_filedata(part, "manual.pdf");
 curl_mime_encoder(part, "base64");

 /* Build the mail headers. */
 headers = curl_slist_append(NULL, "From: me@example.com");
 headers = curl_slist_append(headers, "To: you@example.com");

 /* Set these into the easy handle. */
 curl_easy_setopt(easyhandle, CURLOPT_HTTPHEADER, headers);
 curl_easy_setopt(easyhandle, CURLOPT_MIMEPOST, message);
.fi

It should be noted that appending a message to an IMAP directory requires
the message size to be known prior to upload. It is therefore not possible to
include parts with unknown data size in this context.

.SH "Headers Equal Fun"

Some protocols provide "headers", meta-data separated from the normal
data. These headers are by default not included in the normal data stream, but
you can make them appear in the data stream by setting \fICURLOPT_HEADER(3)\fP
to 1.

What might be even more useful is libcurl's ability to separate the headers
from the data and thus make the callbacks differ. You can for example set a
different pointer to pass to the ordinary write callback by setting
\fICURLOPT_HEADERDATA(3)\fP.

Or, you can set an entirely separate function to receive the headers, by using
\fICURLOPT_HEADERFUNCTION(3)\fP.

The headers are passed to the callback function one by one, and you can
depend on that fact. It makes it easier for you to add custom header parsers
etc.

\&"Headers" for FTP transfers equal all the FTP server responses. They aren't
actually true headers, but in this case we pretend they are! ;-)

.SH "Post Transfer Information"
See \fIcurl_easy_getinfo(3)\fP.
.SH "The multi Interface"
The easy interface as described in detail in this document is a synchronous
interface that transfers one file at a time and doesn't return until it is
done.

The multi interface, on the other hand, allows your program to transfer
multiple files in both directions at the same time, without forcing you to use
multiple threads. The name might make it seem that the multi interface is for
multi-threaded programs, but the truth is almost the reverse. The multi
interface allows a single-threaded application to perform the same kinds of
multiple, simultaneous transfers that multi-threaded programs can perform. It
allows many of the benefits of multi-threaded transfers without the complexity
of managing and synchronizing many threads.

To complicate matters somewhat more, there are even two versions of the multi
interface: the event-based one, also called multi_socket, and the "normal one"
designed for use with select(). See the \fIlibcurl-multi(3)\fP man page for
details on the multi_socket event-based API; this description here is for the
select()-oriented one.

To use this interface, you are better off if you first understand the basics
of how to use the easy interface. The multi interface is simply a way to make
multiple transfers at the same time by adding multiple easy handles to
a "multi stack".

You create the easy handles you want, one for each concurrent transfer, and
you set all the options just like you learned above, and then you create a
multi handle with \fIcurl_multi_init(3)\fP and add all those easy handles to
that multi handle with \fIcurl_multi_add_handle(3)\fP.

When you've added the handles you have for the moment (you can still add new
ones at any time), you start the transfers by calling
\fIcurl_multi_perform(3)\fP.

\fIcurl_multi_perform(3)\fP is asynchronous. It will only perform what can be
done now and then return control to your program. It is designed to never
block. You need to keep calling the function until all transfers are
completed.

The best usage of this interface is when you do a select() on all possible
file descriptors or sockets to know when to call libcurl again. This also
makes it easy for you to wait and respond to actions on your own application's
sockets/handles. You figure out what to select() for by using
\fIcurl_multi_fdset(3)\fP, which fills in a set of fd_set variables for you
with the particular file descriptors libcurl uses for the moment.

When you then call select(), it'll return when one of the file handles signals
action and you then call \fIcurl_multi_perform(3)\fP to allow libcurl to do
what it wants to do. Take note that libcurl also features some time-out
code, so we advise you to never use very long timeouts on select() before you
call \fIcurl_multi_perform(3)\fP again. \fIcurl_multi_timeout(3)\fP is
provided to help you get a suitable timeout period.

Another precaution you should use: always call \fIcurl_multi_fdset(3)\fP
immediately before the select() call since the current set of file descriptors
may change in any curl function invocation.

If you want to stop the transfer of one of the easy handles in the stack, you
can use \fIcurl_multi_remove_handle(3)\fP to remove individual easy
handles. Remember that easy handles should still be cleaned up with
\fIcurl_easy_cleanup(3)\fP.

When a transfer within the multi stack has finished, the counter of running
transfers (as filled in by \fIcurl_multi_perform(3)\fP) will decrease. When
the number reaches zero, all transfers are done.

\fIcurl_multi_info_read(3)\fP can be used to get information about completed
transfers. It then returns the CURLcode for each easy transfer, to allow you
to figure out success on each individual transfer.

.SH "SSL, Certificates and Other Tricks"

 [ seeding, passwords, keys, certificates, ENGINE, ca certs ]

.SH "Sharing Data Between Easy Handles"
You can share some data between easy handles when the easy interface is used,
and some data is shared automatically when you use the multi interface.

When you add easy handles to a multi handle, these easy handles will
automatically share a lot of the data that otherwise would be kept on a
per-easy handle basis when the easy interface is used.

The DNS cache is shared between handles within a multi handle, making
subsequent name resolving faster, and the connection pool that is kept to
better allow persistent connections and connection re-use is also shared. If
you're using the easy interface, you can still share these between specific
easy handles by using the share interface, see \fIlibcurl-share(3)\fP.

Some things are never shared automatically, not even within multi handles.
Cookies are one example, so the only way to share them is with the share
interface.
.SH "Footnotes"

.IP "[1]"
libcurl 7.10.3 and later have the ability to switch over to chunked
Transfer-Encoding in cases where HTTP uploads are done with data of an unknown
size.
.IP "[2]"
This happens on Windows machines when libcurl is built and used as a
DLL. However, you can still do this on Windows if you link with a static
library.
.IP "[3]"
The curl-config tool is generated at build-time (on Unix-like systems) and
should be installed with the 'make install' or similar instruction that
installs the library, header files, man pages etc.
.IP "[4]"
This behavior was different in versions before 7.17.0, where strings had to
remain valid past the end of the \fIcurl_easy_setopt(3)\fP call.
.SH "SEE ALSO"
.BR libcurl-errors "(3), " libcurl-multi "(3), " libcurl-easy "(3) "