| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 1 | :mod:`csv` --- CSV File Reading and Writing | 
 | 2 | =========================================== | 
 | 3 |  | 
 | 4 | .. module:: csv | 
 | 5 |    :synopsis: Write and read tabular data to and from delimited files. | 
 | 6 | .. sectionauthor:: Skip Montanaro <skip@pobox.com> | 
 | 7 |  | 
 | 8 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 9 | .. index:: | 
 | 10 |    single: csv | 
 | 11 |    pair: data; tabular | 
 | 12 |  | 
 | 13 | The so-called CSV (Comma Separated Values) format is the most common import and | 
 | 14 | export format for spreadsheets and databases.  There is no "CSV standard", so | 
 | 15 | the format is operationally defined by the many applications which read and | 
 | 16 | write it.  The lack of a standard means that subtle differences often exist in | 
 | 17 | the data produced and consumed by different applications.  These differences can | 
 | 18 | make it annoying to process CSV files from multiple sources.  Still, while the | 
 | 19 | delimiters and quoting characters vary, the overall format is similar enough | 
 | 20 | that it is possible to write a single module which can efficiently manipulate | 
 | 21 | such data, hiding the details of reading and writing the data from the | 
 | 22 | programmer. | 
 | 23 |  | 
 | 24 | The :mod:`csv` module implements classes to read and write tabular data in CSV | 
 | 25 | format.  It allows programmers to say, "write this data in the format preferred | 
 | 26 | by Excel," or "read data from this file which was generated by Excel," without | 
 | 27 | knowing the precise details of the CSV format used by Excel.  Programmers can | 
 | 28 | also describe the CSV formats understood by other applications or define their | 
 | 29 | own special-purpose CSV formats. | 
 | 30 |  | 
 | 31 | The :mod:`csv` module's :class:`reader` and :class:`writer` objects read and | 
 | 32 | write sequences.  Programmers can also read and write data in dictionary form | 
 | 33 | using the :class:`DictReader` and :class:`DictWriter` classes. | 
 | 34 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 35 | .. seealso:: | 
 | 36 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 37 |    :pep:`305` - CSV File API | 
 | 38 |       The Python Enhancement Proposal which proposed this addition to Python. | 
 | 39 |  | 
 | 40 |  | 
 | 41 | .. _csv-contents: | 
 | 42 |  | 
 | 43 | Module Contents | 
 | 44 | --------------- | 
 | 45 |  | 
 | 46 | The :mod:`csv` module defines the following functions: | 
 | 47 |  | 
 | 48 |  | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 49 | .. function:: reader(csvfile, dialect='excel', **fmtparams) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 50 |  | 
 | 51 |    Return a reader object which will iterate over lines in the given *csvfile*. | 
| Georg Brandl | 9afde1c | 2007-11-01 20:32:30 +0000 | [diff] [blame] | 52 |    *csvfile* can be any object which supports the :term:`iterator` protocol and returns a | 
| Georg Brandl | b30f330 | 2011-01-06 09:23:56 +0000 | [diff] [blame] | 53 |    string each time its :meth:`!__next__` method is called --- :term:`file objects | 
| Antoine Pitrou | 11cb961 | 2010-09-15 11:11:28 +0000 | [diff] [blame] | 54 |    <file object>` and list objects are both suitable.   If *csvfile* is a file object, | 
 | 55 |    it should be opened with ``newline=''``. [#]_  An optional | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 56 |    *dialect* parameter can be given which is used to define a set of parameters | 
 | 57 |    specific to a particular CSV dialect.  It may be an instance of a subclass of | 
 | 58 |    the :class:`Dialect` class or one of the strings returned by the | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 59 |    :func:`list_dialects` function.  The other optional *fmtparams* keyword arguments | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 60 |    can be given to override individual formatting parameters in the current | 
 | 61 |    dialect.  For full details about the dialect and formatting parameters, see | 
 | 62 |    section :ref:`csv-fmt-params`. | 
 | 63 |  | 
| Skip Montanaro | 0468df3 | 2009-03-25 00:52:18 +0000 | [diff] [blame] | 64 |    Each row read from the csv file is returned as a list of strings.  No | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 65 |    automatic data type conversion is performed unless the ``QUOTE_NONNUMERIC`` format | 
 | 66 |    option is specified (in which case unquoted fields are transformed into floats). | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 67 |  | 
| Christian Heimes | b9eccbf | 2007-12-05 20:18:38 +0000 | [diff] [blame] | 68 |    A short usage example:: | 
| Georg Brandl | 48310cd | 2009-01-03 21:18:54 +0000 | [diff] [blame] | 69 |  | 
| Christian Heimes | b9eccbf | 2007-12-05 20:18:38 +0000 | [diff] [blame] | 70 |       >>> import csv | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 71 |       >>> spamReader = csv.reader(open('eggs.csv', newline=''), delimiter=' ', quotechar='|') | 
| Christian Heimes | b9eccbf | 2007-12-05 20:18:38 +0000 | [diff] [blame] | 72 |       >>> for row in spamReader: | 
| Georg Brandl | f694518 | 2008-02-01 11:56:49 +0000 | [diff] [blame] | 73 |       ...     print(', '.join(row)) | 
| Christian Heimes | b9eccbf | 2007-12-05 20:18:38 +0000 | [diff] [blame] | 74 |       Spam, Spam, Spam, Spam, Spam, Baked Beans | 
 | 75 |       Spam, Lovely Spam, Wonderful Spam | 
 | 76 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 77 |  | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 78 | .. function:: writer(csvfile, dialect='excel', **fmtparams) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 79 |  | 
 | 80 |    Return a writer object responsible for converting the user's data into delimited | 
 | 81 |    strings on the given file-like object.  *csvfile* can be any object with a | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 82 |    :func:`write` method.  An optional *dialect* | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 83 |    parameter can be given which is used to define a set of parameters specific to a | 
 | 84 |    particular CSV dialect.  It may be an instance of a subclass of the | 
 | 85 |    :class:`Dialect` class or one of the strings returned by the | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 86 |    :func:`list_dialects` function.  The other optional *fmtparams* keyword arguments | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 87 |    can be given to override individual formatting parameters in the current | 
 | 88 |    dialect.  For full details about the dialect and formatting parameters, see | 
 | 89 |    section :ref:`csv-fmt-params`. To make it | 
 | 90 |    as easy as possible to interface with modules which implement the DB API, the | 
 | 91 |    value :const:`None` is written as the empty string.  While this isn't a | 
 | 92 |    reversible transformation, it makes it easier to dump SQL NULL data values to | 
 | 93 |    CSV files without preprocessing the data returned from a ``cursor.fetch*`` call. | 
 | 94 |    All other non-string data are stringified with :func:`str` before being written. | 
 | 95 |  | 
| Christian Heimes | b9eccbf | 2007-12-05 20:18:38 +0000 | [diff] [blame] | 96 |    A short usage example:: | 
 | 97 |  | 
 | 98 |       >>> import csv | 
 | 99 |       >>> spamWriter = csv.writer(open('eggs.csv', 'w'), delimiter=' ', | 
| Skip Montanaro | f290cb5 | 2009-08-18 14:37:42 +0000 | [diff] [blame] | 100 |       ...                         quotechar='|', quoting=csv.QUOTE_MINIMAL) | 
| Christian Heimes | b9eccbf | 2007-12-05 20:18:38 +0000 | [diff] [blame] | 101 |       >>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans']) | 
 | 102 |       >>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam']) | 
 | 103 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 104 |  | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 105 | .. function:: register_dialect(name[, dialect], **fmtparams) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 106 |  | 
| Georg Brandl | f694518 | 2008-02-01 11:56:49 +0000 | [diff] [blame] | 107 |    Associate *dialect* with *name*.  *name* must be a string. The | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 108 |    dialect can be specified either by passing a sub-class of :class:`Dialect`, or | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 109 |    by *fmtparams* keyword arguments, or both, with keyword arguments overriding | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 110 |    parameters of the dialect. For full details about the dialect and formatting | 
 | 111 |    parameters, see section :ref:`csv-fmt-params`. | 
 | 112 |  | 
 | 113 |  | 
 | 114 | .. function:: unregister_dialect(name) | 
 | 115 |  | 
 | 116 |    Delete the dialect associated with *name* from the dialect registry.  An | 
 | 117 |    :exc:`Error` is raised if *name* is not a registered dialect name. | 
 | 118 |  | 
 | 119 |  | 
 | 120 | .. function:: get_dialect(name) | 
 | 121 |  | 
| Georg Brandl | 6554cb9 | 2007-12-02 23:15:43 +0000 | [diff] [blame] | 122 |    Return the dialect associated with *name*.  An :exc:`Error` is raised if | 
 | 123 |    *name* is not a registered dialect name.  This function returns an immutable | 
 | 124 |    :class:`Dialect`. | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 125 |  | 
 | 126 | .. function:: list_dialects() | 
 | 127 |  | 
 | 128 |    Return the names of all registered dialects. | 
 | 129 |  | 
 | 130 |  | 
 | 131 | .. function:: field_size_limit([new_limit]) | 
 | 132 |  | 
 | 133 |    Returns the current maximum field size allowed by the parser. If *new_limit* is | 
 | 134 |    given, this becomes the new limit. | 
 | 135 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 136 |  | 
 | 137 | The :mod:`csv` module defines the following classes: | 
 | 138 |  | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 139 | .. class:: DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 140 |  | 
 | 141 |    Create an object which operates like a regular reader but maps the information | 
 | 142 |    read into a dict whose keys are given by the optional  *fieldnames* parameter. | 
 | 143 |    If the *fieldnames* parameter is omitted, the values in the first row of the | 
| R. David Murray | 578ec90 | 2009-11-09 14:21:38 +0000 | [diff] [blame] | 144 |    *csvfile* will be used as the fieldnames.  If the row read has more fields | 
 | 145 |    than the fieldnames sequence, the remaining data is added as a sequence | 
 | 146 |    keyed by the value of *restkey*.  If the row read has fewer fields than the | 
 | 147 |    fieldnames sequence, the remaining keys take the value of the optional | 
 | 148 |    *restval* parameter.  Any other optional or keyword arguments are passed to | 
 | 149 |    the underlying :class:`reader` instance. | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 150 |  | 
 | 151 |  | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 152 | .. class:: DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 153 |  | 
 | 154 |    Create an object which operates like a regular writer but maps dictionaries onto | 
 | 155 |    output rows.  The *fieldnames* parameter identifies the order in which values in | 
 | 156 |    the dictionary passed to the :meth:`writerow` method are written to the | 
 | 157 |    *csvfile*.  The optional *restval* parameter specifies the value to be written | 
 | 158 |    if the dictionary is missing a key in *fieldnames*.  If the dictionary passed to | 
 | 159 |    the :meth:`writerow` method contains a key not found in *fieldnames*, the | 
 | 160 |    optional *extrasaction* parameter indicates what action to take.  If it is set | 
 | 161 |    to ``'raise'`` a :exc:`ValueError` is raised.  If it is set to ``'ignore'``, | 
 | 162 |    extra values in the dictionary are ignored.  Any other optional or keyword | 
 | 163 |    arguments are passed to the underlying :class:`writer` instance. | 
 | 164 |  | 
 | 165 |    Note that unlike the :class:`DictReader` class, the *fieldnames* parameter of | 
 | 166 |    the :class:`DictWriter` is not optional.  Since Python's :class:`dict` objects | 
 | 167 |    are not ordered, there is not enough information available to deduce the order | 
 | 168 |    in which the row should be written to the *csvfile*. | 
 | 169 |  | 
 | 170 |  | 
 | 171 | .. class:: Dialect | 
 | 172 |  | 
 | 173 |    The :class:`Dialect` class is a container class relied on primarily for its | 
 | 174 |    attributes, which are used to define the parameters for a specific | 
 | 175 |    :class:`reader` or :class:`writer` instance. | 
 | 176 |  | 
 | 177 |  | 
 | 178 | .. class:: excel() | 
 | 179 |  | 
 | 180 |    The :class:`excel` class defines the usual properties of an Excel-generated CSV | 
 | 181 |    file.  It is registered with the dialect name ``'excel'``. | 
 | 182 |  | 
 | 183 |  | 
 | 184 | .. class:: excel_tab() | 
 | 185 |  | 
 | 186 |    The :class:`excel_tab` class defines the usual properties of an Excel-generated | 
 | 187 |    TAB-delimited file.  It is registered with the dialect name ``'excel-tab'``. | 
 | 188 |  | 
 | 189 |  | 
| Georg Brandl | 7424dd3 | 2010-10-27 07:27:06 +0000 | [diff] [blame] | 190 | .. class:: unix_dialect() | 
 | 191 |  | 
 | 192 |    The :class:`unix_dialect` class defines the usual properties of a CSV file | 
 | 193 |    generated on UNIX systems, i.e. using ``'\n'`` as line terminator and quoting | 
 | 194 |    all fields.  It is registered with the dialect name ``'unix'``. | 
 | 195 |  | 
 | 196 |    .. versionadded:: 3.2 | 
 | 197 |  | 
 | 198 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 199 | .. class:: Sniffer() | 
 | 200 |  | 
 | 201 |    The :class:`Sniffer` class is used to deduce the format of a CSV file. | 
 | 202 |  | 
| Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 203 |    The :class:`Sniffer` class provides two methods: | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 204 |  | 
| Georg Brandl | c2a4f4f | 2009-04-10 09:03:43 +0000 | [diff] [blame] | 205 |    .. method:: sniff(sample, delimiters=None) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 206 |  | 
| Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 207 |       Analyze the given *sample* and return a :class:`Dialect` subclass | 
 | 208 |       reflecting the parameters found.  If the optional *delimiters* parameter | 
 | 209 |       is given, it is interpreted as a string containing possible valid | 
 | 210 |       delimiter characters. | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 211 |  | 
 | 212 |  | 
| Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 213 |    .. method:: has_header(sample) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 214 |  | 
| Benjamin Peterson | e41251e | 2008-04-25 01:59:09 +0000 | [diff] [blame] | 215 |       Analyze the sample text (presumed to be in CSV format) and return | 
 | 216 |       :const:`True` if the first row appears to be a series of column headers. | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 217 |  | 
| Christian Heimes | 7f04431 | 2008-01-06 17:05:40 +0000 | [diff] [blame] | 218 | An example for :class:`Sniffer` use:: | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 219 |  | 
| Christian Heimes | 7f04431 | 2008-01-06 17:05:40 +0000 | [diff] [blame] | 220 |    csvfile = open("example.csv") | 
 | 221 |    dialect = csv.Sniffer().sniff(csvfile.read(1024)) | 
 | 222 |    csvfile.seek(0) | 
 | 223 |    reader = csv.reader(csvfile, dialect) | 
 | 224 |    # ... process CSV file contents here ... | 
 | 225 |  | 
 | 226 |  | 
 | 227 | The :mod:`csv` module defines the following constants: | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 228 |  | 
 | 229 | .. data:: QUOTE_ALL | 
 | 230 |  | 
 | 231 |    Instructs :class:`writer` objects to quote all fields. | 
 | 232 |  | 
 | 233 |  | 
 | 234 | .. data:: QUOTE_MINIMAL | 
 | 235 |  | 
 | 236 |    Instructs :class:`writer` objects to only quote those fields which contain | 
 | 237 |    special characters such as *delimiter*, *quotechar* or any of the characters in | 
 | 238 |    *lineterminator*. | 
 | 239 |  | 
 | 240 |  | 
 | 241 | .. data:: QUOTE_NONNUMERIC | 
 | 242 |  | 
 | 243 |    Instructs :class:`writer` objects to quote all non-numeric fields. | 
 | 244 |  | 
 | 245 |    Instructs the reader to convert all non-quoted fields to type *float*. | 
 | 246 |  | 
 | 247 |  | 
 | 248 | .. data:: QUOTE_NONE | 
 | 249 |  | 
 | 250 |    Instructs :class:`writer` objects to never quote fields.  When the current | 
 | 251 |    *delimiter* occurs in output data it is preceded by the current *escapechar* | 
 | 252 |    character.  If *escapechar* is not set, the writer will raise :exc:`Error` if | 
 | 253 |    any characters that require escaping are encountered. | 
 | 254 |  | 
 | 255 |    Instructs :class:`reader` to perform no special processing of quote characters. | 
 | 256 |  | 
 | 257 | The :mod:`csv` module defines the following exception: | 
 | 258 |  | 
 | 259 |  | 
 | 260 | .. exception:: Error | 
 | 261 |  | 
 | 262 |    Raised by any of the functions when an error is detected. | 
 | 263 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 264 | .. _csv-fmt-params: | 
 | 265 |  | 
 | 266 | Dialects and Formatting Parameters | 
 | 267 | ---------------------------------- | 
 | 268 |  | 
 | 269 | To make it easier to specify the format of input and output records, specific | 
 | 270 | formatting parameters are grouped together into dialects.  A dialect is a | 
 | 271 | subclass of the :class:`Dialect` class having a set of specific methods and a | 
 | 272 | single :meth:`validate` method.  When creating :class:`reader` or | 
 | 273 | :class:`writer` objects, the programmer can specify a string or a subclass of | 
 | 274 | the :class:`Dialect` class as the dialect parameter.  In addition to, or instead | 
 | 275 | of, the *dialect* parameter, the programmer can also specify individual | 
 | 276 | formatting parameters, which have the same names as the attributes defined below | 
 | 277 | for the :class:`Dialect` class. | 
 | 278 |  | 
 | 279 | Dialects support the following attributes: | 
 | 280 |  | 
 | 281 |  | 
 | 282 | .. attribute:: Dialect.delimiter | 
 | 283 |  | 
 | 284 |    A one-character string used to separate fields.  It defaults to ``','``. | 
 | 285 |  | 
 | 286 |  | 
 | 287 | .. attribute:: Dialect.doublequote | 
 | 288 |  | 
 | 289 |    Controls how instances of *quotechar* appearing inside a field should be | 
 | 290 |    themselves be quoted.  When :const:`True`, the character is doubled. When | 
 | 291 |    :const:`False`, the *escapechar* is used as a prefix to the *quotechar*.  It | 
 | 292 |    defaults to :const:`True`. | 
 | 293 |  | 
 | 294 |    On output, if *doublequote* is :const:`False` and no *escapechar* is set, | 
 | 295 |    :exc:`Error` is raised if a *quotechar* is found in a field. | 
 | 296 |  | 
 | 297 |  | 
 | 298 | .. attribute:: Dialect.escapechar | 
 | 299 |  | 
 | 300 |    A one-character string used by the writer to escape the *delimiter* if *quoting* | 
 | 301 |    is set to :const:`QUOTE_NONE` and the *quotechar* if *doublequote* is | 
 | 302 |    :const:`False`. On reading, the *escapechar* removes any special meaning from | 
 | 303 |    the following character. It defaults to :const:`None`, which disables escaping. | 
 | 304 |  | 
 | 305 |  | 
 | 306 | .. attribute:: Dialect.lineterminator | 
 | 307 |  | 
 | 308 |    The string used to terminate lines produced by the :class:`writer`. It defaults | 
 | 309 |    to ``'\r\n'``. | 
 | 310 |  | 
 | 311 |    .. note:: | 
 | 312 |  | 
 | 313 |       The :class:`reader` is hard-coded to recognise either ``'\r'`` or ``'\n'`` as | 
 | 314 |       end-of-line, and ignores *lineterminator*. This behavior may change in the | 
 | 315 |       future. | 
 | 316 |  | 
 | 317 |  | 
 | 318 | .. attribute:: Dialect.quotechar | 
 | 319 |  | 
 | 320 |    A one-character string used to quote fields containing special characters, such | 
 | 321 |    as the *delimiter* or *quotechar*, or which contain new-line characters.  It | 
 | 322 |    defaults to ``'"'``. | 
 | 323 |  | 
 | 324 |  | 
 | 325 | .. attribute:: Dialect.quoting | 
 | 326 |  | 
 | 327 |    Controls when quotes should be generated by the writer and recognised by the | 
 | 328 |    reader.  It can take on any of the :const:`QUOTE_\*` constants (see section | 
 | 329 |    :ref:`csv-contents`) and defaults to :const:`QUOTE_MINIMAL`. | 
 | 330 |  | 
 | 331 |  | 
 | 332 | .. attribute:: Dialect.skipinitialspace | 
 | 333 |  | 
 | 334 |    When :const:`True`, whitespace immediately following the *delimiter* is ignored. | 
 | 335 |    The default is :const:`False`. | 
 | 336 |  | 
 | 337 |  | 
 | 338 | Reader Objects | 
 | 339 | -------------- | 
 | 340 |  | 
 | 341 | Reader objects (:class:`DictReader` instances and objects returned by the | 
 | 342 | :func:`reader` function) have the following public methods: | 
 | 343 |  | 
| Georg Brandl | c748506 | 2009-04-01 15:53:15 +0000 | [diff] [blame] | 344 | .. method:: csvreader.__next__() | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 345 |  | 
 | 346 |    Return the next row of the reader's iterable object as a list, parsed according | 
| Georg Brandl | c748506 | 2009-04-01 15:53:15 +0000 | [diff] [blame] | 347 |    to the current dialect.  Usually you should call this as ``next(reader)``. | 
 | 348 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 349 |  | 
 | 350 | Reader objects have the following public attributes: | 
 | 351 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 352 | .. attribute:: csvreader.dialect | 
 | 353 |  | 
 | 354 |    A read-only description of the dialect in use by the parser. | 
 | 355 |  | 
 | 356 |  | 
 | 357 | .. attribute:: csvreader.line_num | 
 | 358 |  | 
 | 359 |    The number of lines read from the source iterator. This is not the same as the | 
 | 360 |    number of records returned, as records can span multiple lines. | 
 | 361 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 362 |  | 
| Skip Montanaro | af8fcfa | 2008-08-09 19:44:22 +0000 | [diff] [blame] | 363 | DictReader objects have the following public attribute: | 
 | 364 |  | 
| Skip Montanaro | af8fcfa | 2008-08-09 19:44:22 +0000 | [diff] [blame] | 365 | .. attribute:: csvreader.fieldnames | 
 | 366 |  | 
 | 367 |    If not passed as a parameter when creating the object, this attribute is | 
 | 368 |    initialized upon first access or when the first record is read from the | 
 | 369 |    file. | 
 | 370 |  | 
| Skip Montanaro | af8fcfa | 2008-08-09 19:44:22 +0000 | [diff] [blame] | 371 |  | 
 | 372 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 373 | Writer Objects | 
 | 374 | -------------- | 
 | 375 |  | 
 | 376 | :class:`Writer` objects (:class:`DictWriter` instances and objects returned by | 
 | 377 | the :func:`writer` function) have the following public methods.  A *row* must be | 
 | 378 | a sequence of strings or numbers for :class:`Writer` objects and a dictionary | 
 | 379 | mapping fieldnames to strings or numbers (by passing them through :func:`str` | 
 | 380 | first) for :class:`DictWriter` objects.  Note that complex numbers are written | 
 | 381 | out surrounded by parens. This may cause some problems for other programs which | 
 | 382 | read CSV files (assuming they support complex numbers at all). | 
 | 383 |  | 
 | 384 |  | 
 | 385 | .. method:: csvwriter.writerow(row) | 
 | 386 |  | 
 | 387 |    Write the *row* parameter to the writer's file object, formatted according to | 
 | 388 |    the current dialect. | 
 | 389 |  | 
 | 390 |  | 
 | 391 | .. method:: csvwriter.writerows(rows) | 
 | 392 |  | 
 | 393 |    Write all the *rows* parameters (a list of *row* objects as described above) to | 
 | 394 |    the writer's file object, formatted according to the current dialect. | 
 | 395 |  | 
 | 396 | Writer objects have the following public attribute: | 
 | 397 |  | 
 | 398 |  | 
 | 399 | .. attribute:: csvwriter.dialect | 
 | 400 |  | 
 | 401 |    A read-only description of the dialect in use by the writer. | 
 | 402 |  | 
 | 403 |  | 
| R. David Murray | be0698b | 2010-02-23 22:57:58 +0000 | [diff] [blame] | 404 | DictWriter objects have the following public method: | 
 | 405 |  | 
 | 406 |  | 
 | 407 | .. method:: DictWriter.writeheader() | 
 | 408 |  | 
 | 409 |    Write a row with the field names (as specified in the constructor). | 
 | 410 |  | 
| R. David Murray | 19e4548 | 2010-02-23 23:00:34 +0000 | [diff] [blame] | 411 |    .. versionadded:: 3.2 | 
| R. David Murray | be0698b | 2010-02-23 22:57:58 +0000 | [diff] [blame] | 412 |  | 
 | 413 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 414 | .. _csv-examples: | 
 | 415 |  | 
 | 416 | Examples | 
 | 417 | -------- | 
 | 418 |  | 
 | 419 | The simplest example of reading a CSV file:: | 
 | 420 |  | 
 | 421 |    import csv | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 422 |    reader = csv.reader(open("some.csv", newline='')) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 423 |    for row in reader: | 
| Georg Brandl | 6911e3c | 2007-09-04 07:15:32 +0000 | [diff] [blame] | 424 |        print(row) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 425 |  | 
 | 426 | Reading a file with an alternate format:: | 
 | 427 |  | 
 | 428 |    import csv | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 429 |    reader = csv.reader(open("passwd"), delimiter=':', quoting=csv.QUOTE_NONE) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 430 |    for row in reader: | 
| Georg Brandl | 6911e3c | 2007-09-04 07:15:32 +0000 | [diff] [blame] | 431 |        print(row) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 432 |  | 
 | 433 | The corresponding simplest possible writing example is:: | 
 | 434 |  | 
 | 435 |    import csv | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 436 |    writer = csv.writer(open("some.csv", "w")) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 437 |    writer.writerows(someiterable) | 
 | 438 |  | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 439 | Since :func:`open` is used to open a CSV file for reading, the file | 
 | 440 | will by default be decoded into unicode using the system default | 
 | 441 | encoding (see :func:`locale.getpreferredencoding`).  To decode a file | 
 | 442 | using a different encoding, use the ``encoding`` argument of open:: | 
 | 443 |  | 
 | 444 |     import csv | 
 | 445 |     reader = csv.reader(open("some.csv", newline='', encoding='utf-8')) | 
 | 446 |     for row in reader: | 
 | 447 |         print(row) | 
 | 448 |  | 
 | 449 | The same applies to writing in something other than the system default | 
 | 450 | encoding: specify the encoding argument when opening the output file. | 
 | 451 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 452 | Registering a new dialect:: | 
 | 453 |  | 
 | 454 |    import csv | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 455 |    csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE) | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 456 |    reader = csv.reader(open("passwd"), 'unixpwd') | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 457 |  | 
 | 458 | A slightly more advanced use of the reader --- catching and reporting errors:: | 
 | 459 |  | 
 | 460 |    import csv, sys | 
 | 461 |    filename = "some.csv" | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 462 |    reader = csv.reader(open(filename, newline='')) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 463 |    try: | 
 | 464 |        for row in reader: | 
| Georg Brandl | 6911e3c | 2007-09-04 07:15:32 +0000 | [diff] [blame] | 465 |            print(row) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 466 |    except csv.Error as e: | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 467 |        sys.exit('file {}, line {}: {}'.format(filename, reader.line_num, e)) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 468 |  | 
 | 469 | And while the module doesn't directly support parsing strings, it can easily be | 
 | 470 | done:: | 
 | 471 |  | 
 | 472 |    import csv | 
 | 473 |    for row in csv.reader(['one,two,three']): | 
| Georg Brandl | 6911e3c | 2007-09-04 07:15:32 +0000 | [diff] [blame] | 474 |        print(row) | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 475 |  | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 476 |  | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 477 | .. rubric:: Footnotes | 
| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 478 |  | 
| R. David Murray | 8b7d4aa | 2009-04-04 01:38:38 +0000 | [diff] [blame] | 479 | .. [#] If ``newline=''`` is not specified, newlines embedded inside quoted fields | 
 | 480 |    will not be interpreted correctly.  It should always be safe to specify | 
 | 481 |    ``newline=''``, since the csv module does its own universal newline handling | 
 | 482 |    on input. |