:mod:`csv` --- CSV File Reading and Writing
===========================================

.. module:: csv
   :synopsis: Write and read tabular data to and from delimited files.
.. sectionauthor:: Skip Montanaro <skip@pobox.com>


.. versionadded:: 2.3

.. index::
   single: csv
   pair: data; tabular

The so-called CSV (Comma Separated Values) format is the most common import and
export format for spreadsheets and databases. There is no "CSV standard", so
the format is operationally defined by the many applications which read and
write it. The lack of a standard means that subtle differences often exist in
the data produced and consumed by different applications. These differences can
make it annoying to process CSV files from multiple sources. Still, while the
delimiters and quoting characters vary, the overall format is similar enough
that it is possible to write a single module which can efficiently manipulate
such data, hiding the details of reading and writing the data from the
programmer.

The :mod:`csv` module implements classes to read and write tabular data in CSV
format. It allows programmers to say, "write this data in the format preferred
by Excel," or "read data from this file which was generated by Excel," without
knowing the precise details of the CSV format used by Excel. Programmers can
also describe the CSV formats understood by other applications or define their
own special-purpose CSV formats.

The :mod:`csv` module's :class:`reader` and :class:`writer` objects read and
write sequences. Programmers can also read and write data in dictionary form
using the :class:`DictReader` and :class:`DictWriter` classes.

.. note::

   This version of the :mod:`csv` module doesn't support Unicode input. Also,
   there are currently some issues regarding ASCII NUL characters. Accordingly,
   all input should be UTF-8 or printable ASCII to be safe; see the examples in
   section :ref:`csv-examples`.


.. seealso::

   :pep:`305` - CSV File API
      The Python Enhancement Proposal which proposed this addition to Python.


.. _csv-contents:

Module Contents
---------------

The :mod:`csv` module defines the following functions:


.. function:: reader(csvfile, dialect='excel', **fmtparams)

   Return a reader object which will iterate over lines in the given *csvfile*.
   *csvfile* can be any object which supports the :term:`iterator` protocol and returns a
   string each time its :meth:`!next` method is called --- file objects and list
   objects are both suitable. If *csvfile* is a file object, it must be opened
   with the 'b' flag on platforms where that makes a difference. An optional
   *dialect* parameter can be given which is used to define a set of parameters
   specific to a particular CSV dialect. It may be an instance of a subclass of
   the :class:`Dialect` class or one of the strings returned by the
   :func:`list_dialects` function. The other optional *fmtparams* keyword arguments
   can be given to override individual formatting parameters in the current
   dialect. For full details about the dialect and formatting parameters, see
   section :ref:`csv-fmt-params`.

   Each row read from the csv file is returned as a list of strings. No
   automatic data type conversion is performed.

   A short usage example::

      >>> import csv
      >>> with open('eggs.csv', 'rb') as csvfile:
      ...     spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
      ...     for row in spamreader:
      ...         print ', '.join(row)
      Spam, Spam, Spam, Spam, Spam, Baked Beans
      Spam, Lovely Spam, Wonderful Spam

   .. versionchanged:: 2.5
      The parser is now stricter with respect to multi-line quoted fields. Previously,
      if a line ended within a quoted field without a terminating newline character, a
      newline would be inserted into the returned field. This behavior caused problems
      when reading files which contained carriage return characters within fields.
      The behavior was changed to return the field without inserting newlines. As a
      consequence, if newlines embedded within fields are important, the input should
      be split into lines in a manner which preserves the newline characters.


.. function:: writer(csvfile, dialect='excel', **fmtparams)

   Return a writer object responsible for converting the user's data into delimited
   strings on the given file-like object. *csvfile* can be any object with a
   :func:`write` method. If *csvfile* is a file object, it must be opened with the
   'b' flag on platforms where that makes a difference. An optional *dialect*
   parameter can be given which is used to define a set of parameters specific to a
   particular CSV dialect. It may be an instance of a subclass of the
   :class:`Dialect` class or one of the strings returned by the
   :func:`list_dialects` function. The other optional *fmtparams* keyword arguments
   can be given to override individual formatting parameters in the current
   dialect. For full details about the dialect and formatting parameters, see
   section :ref:`csv-fmt-params`. To make it
   as easy as possible to interface with modules which implement the DB API, the
   value :const:`None` is written as the empty string. While this isn't a
   reversible transformation, it makes it easier to dump SQL NULL data values to
   CSV files without preprocessing the data returned from a ``cursor.fetch*`` call.
   All other non-string data are stringified with :func:`str` before being written.

   A short usage example::

      import csv
      with open('eggs.csv', 'wb') as csvfile:
          spamwriter = csv.writer(csvfile, delimiter=' ',
                                  quotechar='|', quoting=csv.QUOTE_MINIMAL)
          spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
          spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])

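The conversion rules for :const:`None` and other non-string values can be
checked without touching the file system. The following is a minimal sketch,
assuming a hypothetical ``ListSink`` helper (not part of the :mod:`csv` module)
that offers nothing but a ``write`` method:

```python
import csv


class ListSink(object):
    """Hypothetical stand-in for a file: it only has a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # Collect every chunk the writer hands over.
        self.chunks.append(data)


sink = ListSink()
writer = csv.writer(sink)
# None becomes an empty field; the int and the float go through str().
writer.writerow(['spam', None, 42, 1.5])
output = ''.join(sink.chunks)
print(repr(output))
```

The second field comes out empty, so a reader cannot tell afterwards whether
the original value was ``None`` or an empty string.
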

.. function:: register_dialect(name[, dialect], **fmtparams)

   Associate *dialect* with *name*. *name* must be a string or Unicode object. The
   dialect can be specified either by passing a sub-class of :class:`Dialect`, or
   by *fmtparams* keyword arguments, or both, with keyword arguments overriding
   parameters of the dialect. For full details about the dialect and formatting
   parameters, see section :ref:`csv-fmt-params`.


.. function:: unregister_dialect(name)

   Delete the dialect associated with *name* from the dialect registry. An
   :exc:`Error` is raised if *name* is not a registered dialect name.


.. function:: get_dialect(name)

   Return the dialect associated with *name*. An :exc:`Error` is raised if *name*
   is not a registered dialect name.

   .. versionchanged:: 2.5
      This function now returns an immutable :class:`Dialect`. Previously an
      instance of the requested dialect was returned. Users could modify the
      underlying class, changing the behavior of active readers and writers.

.. function:: list_dialects()

   Return the names of all registered dialects.

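The four registry functions work together. As a rough sketch, using a made-up
dialect name ``'pipes'`` chosen only for illustration:

```python
import csv

# Register a dialect purely from keyword arguments.
csv.register_dialect('pipes', delimiter='|', quoting=csv.QUOTE_NONE)
registered = 'pipes' in csv.list_dialects()

# Look the dialect up and use it by name.
dialect = csv.get_dialect('pipes')
rows = list(csv.reader(['a|b|c'], 'pipes'))

# After unregistering, a lookup of the same name raises csv.Error.
csv.unregister_dialect('pipes')
removed = False
try:
    csv.get_dialect('pipes')
except csv.Error:
    removed = True

print(rows)
```
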

.. function:: field_size_limit([new_limit])

   Returns the current maximum field size allowed by the parser. If *new_limit* is
   given, this becomes the new limit.

   .. versionadded:: 2.5

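Because the limit is module-wide, code that changes it may want to restore the
previous value afterwards. A small sketch of that pattern, where the limit of
10 characters is an arbitrary value picked for the demonstration:

```python
import csv

original = csv.field_size_limit()      # query only, nothing changes
previous = csv.field_size_limit(10)    # set a new limit; the old one is returned

message = None
try:
    # The second field is 50 characters long and exceeds the limit.
    list(csv.reader(['short,' + 'x' * 50]))
except csv.Error as exc:
    message = str(exc)
finally:
    csv.field_size_limit(original)     # put the old limit back

print(message)
```
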
The :mod:`csv` module defines the following classes:


.. class:: DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

   Create an object which operates like a regular reader but maps the information
   read into a dict whose keys are given by the optional *fieldnames* parameter.
   If the *fieldnames* parameter is omitted, the values in the first row of the
   *csvfile* will be used as the fieldnames. If the row read has more fields
   than the fieldnames sequence, the remaining data is added as a sequence
   keyed by the value of *restkey*. If the row read has fewer fields than the
   fieldnames sequence, the remaining keys take the value of the optional
   *restval* parameter. Any other optional or keyword arguments are passed to
   the underlying :class:`reader` instance.

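The handling of rows that are too long or too short is easiest to see on a
small input. In this sketch the data and the ``'overflow'`` and ``'n/a'``
values are invented for illustration:

```python
import csv

lines = [
    'name,dept',                 # first row supplies the fieldnames
    'alice,eng,extra1,extra2',   # two fields too many
    'bob',                       # one field too few
]
reader = csv.DictReader(lines, restkey='overflow', restval='n/a')
records = [dict(record) for record in reader]

for record in records:
    print(record)
```

The surplus fields of the first data row are collected in a list under the
*restkey*, and the missing ``dept`` of the second row is filled with *restval*.
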

.. class:: DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)

   Create an object which operates like a regular writer but maps dictionaries onto
   output rows. The *fieldnames* parameter identifies the order in which values in
   the dictionary passed to the :meth:`writerow` method are written to the
   *csvfile*. The optional *restval* parameter specifies the value to be written
   if the dictionary is missing a key in *fieldnames*. If the dictionary passed to
   the :meth:`writerow` method contains a key not found in *fieldnames*, the
   optional *extrasaction* parameter indicates what action to take. If it is set
   to ``'raise'`` a :exc:`ValueError` is raised. If it is set to ``'ignore'``,
   extra values in the dictionary are ignored. Any other optional or keyword
   arguments are passed to the underlying :class:`writer` instance.

   Note that unlike the :class:`DictReader` class, the *fieldnames* parameter of
   the :class:`DictWriter` is not optional. Since Python's :class:`dict` objects
   are not ordered, there is not enough information available to deduce the order
   in which the row should be written to the *csvfile*.

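The effect of *restval* and *extrasaction* can be sketched as follows. The
``ListSink`` class is a hypothetical helper that only collects output, and the
field names are invented:

```python
import csv


class ListSink(object):
    """Hypothetical stand-in for a file: it only has a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


sink = ListSink()
lenient = csv.DictWriter(sink, fieldnames=['id', 'name'],
                         restval='?', extrasaction='ignore')
lenient.writeheader()
lenient.writerow({'id': 1, 'name': 'spam', 'colour': 'pink'})  # extra key dropped
lenient.writerow({'id': 2})                                    # missing key -> '?'
output = ''.join(sink.chunks)

# With the default extrasaction='raise', the extra key is an error.
picky = csv.DictWriter(ListSink(), fieldnames=['id', 'name'])
rejected = False
try:
    picky.writerow({'id': 3, 'name': 'eggs', 'colour': 'white'})
except ValueError:
    rejected = True

print(repr(output))
```
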

.. class:: Dialect

   The :class:`Dialect` class is a container class relied on primarily for its
   attributes, which are used to define the parameters for a specific
   :class:`reader` or :class:`writer` instance.


.. class:: excel()

   The :class:`excel` class defines the usual properties of an Excel-generated CSV
   file. It is registered with the dialect name ``'excel'``.


.. class:: excel_tab()

   The :class:`excel_tab` class defines the usual properties of an Excel-generated
   TAB-delimited file. It is registered with the dialect name ``'excel-tab'``.


.. class:: Sniffer()

   The :class:`Sniffer` class is used to deduce the format of a CSV file.

   The :class:`Sniffer` class provides two methods:

   .. method:: sniff(sample, delimiters=None)

      Analyze the given *sample* and return a :class:`Dialect` subclass
      reflecting the parameters found. If the optional *delimiters* parameter
      is given, it is interpreted as a string containing possible valid
      delimiter characters.


   .. method:: has_header(sample)

      Analyze the sample text (presumed to be in CSV format) and return
      :const:`True` if the first row appears to be a series of column headers.

An example for :class:`Sniffer` use::

   import csv
   with open('example.csv', 'rb') as csvfile:
       # Guess the dialect from the first kilobyte, then rewind.
       dialect = csv.Sniffer().sniff(csvfile.read(1024))
       csvfile.seek(0)
       reader = csv.reader(csvfile, dialect)
       # ... process CSV file contents here ...

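Both :class:`Sniffer` methods also work on a plain string. The sketch below
uses an invented semicolon-separated sample. Since sniffing is heuristic, the
results shown hold for this sample and are not guaranteed for arbitrary data:

```python
import csv

sample = 'name;age\nalice;30\nbob;25\ncarol;41\n'

sniffer = csv.Sniffer()
# Restrict the candidates to the two delimiters considered plausible here.
dialect = sniffer.sniff(sample, delimiters=';,')
has_header = sniffer.has_header(sample)

rows = list(csv.reader(sample.splitlines(), dialect))
if has_header:
    header, data = rows[0], rows[1:]
else:
    header, data = None, rows

print(dialect.delimiter)
print(header)
```
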

The :mod:`csv` module defines the following constants:

.. data:: QUOTE_ALL

   Instructs :class:`writer` objects to quote all fields.


.. data:: QUOTE_MINIMAL

   Instructs :class:`writer` objects to only quote those fields which contain
   special characters such as *delimiter*, *quotechar* or any of the characters in
   *lineterminator*.


.. data:: QUOTE_NONNUMERIC

   Instructs :class:`writer` objects to quote all non-numeric fields.

   Instructs the reader to convert all non-quoted fields to type *float*.


.. data:: QUOTE_NONE

   Instructs :class:`writer` objects to never quote fields. When the current
   *delimiter* occurs in output data it is preceded by the current *escapechar*
   character. If *escapechar* is not set, the writer will raise :exc:`Error` if
   any characters that require escaping are encountered.

   Instructs :class:`reader` to perform no special processing of quote characters.

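To compare the four quoting modes side by side, one row can be written once
per mode. In this sketch ``render`` and ``ListSink`` are hypothetical helpers
defined only for the example:

```python
import csv


class ListSink(object):
    """Hypothetical stand-in for a file: it only has a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


def render(row, **fmtparams):
    """Write one row with the given formatting parameters; return the text."""
    sink = ListSink()
    csv.writer(sink, **fmtparams).writerow(row)
    return ''.join(sink.chunks)


row = ['spam', 42, 'a,b']
minimal = render(row, quoting=csv.QUOTE_MINIMAL)
everything = render(row, quoting=csv.QUOTE_ALL)
nonnumeric = render(row, quoting=csv.QUOTE_NONNUMERIC)
escaped = render(row, quoting=csv.QUOTE_NONE, escapechar='\\')

# On input, QUOTE_NONNUMERIC turns unquoted fields into floats.
parsed = list(csv.reader(['"spam",42'], quoting=csv.QUOTE_NONNUMERIC))

for text in (minimal, everything, nonnumeric, escaped):
    print(repr(text))
```
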
The :mod:`csv` module defines the following exception:


.. exception:: Error

   Raised by any of the functions when an error is detected.


.. _csv-fmt-params:

Dialects and Formatting Parameters
----------------------------------

To make it easier to specify the format of input and output records, specific
formatting parameters are grouped together into dialects. A dialect is a
subclass of the :class:`Dialect` class having a set of specific attributes and a
single :meth:`validate` method. When creating :class:`reader` or
:class:`writer` objects, the programmer can specify a string or a subclass of
the :class:`Dialect` class as the dialect parameter. In addition to, or instead
of, the *dialect* parameter, the programmer can also specify individual
formatting parameters, which have the same names as the attributes defined below
for the :class:`Dialect` class.

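As an illustration of the three ways to specify a format, the sketch below
defines a hypothetical ``Semicolons`` dialect whose attribute values mirror
the ``'excel'`` dialect except for the delimiter:

```python
import csv


class Semicolons(csv.Dialect):
    """Hypothetical dialect: like 'excel', but fields are separated by ';'."""
    delimiter = ';'
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = csv.QUOTE_MINIMAL


lines = ['1;"two;2";3']

# 1. Pass the Dialect subclass.
by_class = list(csv.reader(lines, Semicolons))
# 2. Pass individual formatting parameters instead of a dialect.
by_params = list(csv.reader(lines, delimiter=';'))
# 3. Pass both: the keyword argument overrides the dialect's attribute.
overridden = list(csv.reader(['1|2|3'], Semicolons, delimiter='|'))

print(by_class)
```
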
Dialects support the following attributes:


.. attribute:: Dialect.delimiter

   A one-character string used to separate fields. It defaults to ``','``.


.. attribute:: Dialect.doublequote

   Controls how instances of *quotechar* appearing inside a field should
   themselves be quoted. When :const:`True`, the character is doubled. When
   :const:`False`, the *escapechar* is used as a prefix to the *quotechar*. It
   defaults to :const:`True`.

   On output, if *doublequote* is :const:`False` and no *escapechar* is set,
   :exc:`Error` is raised if a *quotechar* is found in a field.


.. attribute:: Dialect.escapechar

   A one-character string used by the writer to escape the *delimiter* if *quoting*
   is set to :const:`QUOTE_NONE` and the *quotechar* if *doublequote* is
   :const:`False`. On reading, the *escapechar* removes any special meaning from
   the following character. It defaults to :const:`None`, which disables escaping.

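The interplay of *doublequote* and *escapechar* can be sketched with a field
that contains the quote character. ``render`` and ``ListSink`` are
hypothetical helpers used only in this example:

```python
import csv


class ListSink(object):
    """Hypothetical stand-in for a file: it only has a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


def render(row, **fmtparams):
    """Write one row with the given formatting parameters; return the text."""
    sink = ListSink()
    csv.writer(sink, **fmtparams).writerow(row)
    return ''.join(sink.chunks)


row = ['say "hi"']

# Default: embedded quote characters are doubled.
doubled = render(row)

# doublequote=False with an escapechar: quotes are prefixed instead.
params = dict(doublequote=False, escapechar='\\')
escaped = render(row, **params)
round_trip = list(csv.reader(escaped.splitlines(), **params))

# doublequote=False without an escapechar cannot represent the field.
failed = False
try:
    render(row, doublequote=False)
except csv.Error:
    failed = True

print(repr(doubled))
print(repr(escaped))
```

Reading the escaped text back with the same parameters recovers the original
field, which is the property that usually matters when choosing between the
two styles.
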

.. attribute:: Dialect.lineterminator

   The string used to terminate lines produced by the :class:`writer`. It defaults
   to ``'\r\n'``.

   .. note::

      The :class:`reader` is hard-coded to recognise either ``'\r'`` or ``'\n'`` as
      end-of-line, and ignores *lineterminator*. This behavior may change in the
      future.


.. attribute:: Dialect.quotechar

   A one-character string used to quote fields containing special characters, such
   as the *delimiter* or *quotechar*, or which contain new-line characters. It
   defaults to ``'"'``.


.. attribute:: Dialect.quoting

   Controls when quotes should be generated by the writer and recognised by the
   reader. It can take on any of the :const:`QUOTE_\*` constants (see section
   :ref:`csv-contents`) and defaults to :const:`QUOTE_MINIMAL`.


.. attribute:: Dialect.skipinitialspace

   When :const:`True`, whitespace immediately following the *delimiter* is ignored.
   The default is :const:`False`.

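A short sketch of the difference *skipinitialspace* makes, using an invented
input line:

```python
import csv

line = ['a, b, c']

kept = list(csv.reader(line))                             # default: spaces kept
skipped = list(csv.reader(line, skipinitialspace=True))   # spaces after ',' dropped

print(kept)
print(skipped)
```
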

.. attribute:: Dialect.strict

   When ``True``, raise exception :exc:`Error` on bad CSV input.
   The default is ``False``.

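As a sketch of what "bad CSV input" can mean, the invented line below has
text directly after a closing quote. The lenient result shown is what the
parser produces for this particular input:

```python
import csv

bad = ['"abc"def,ghi']

# Default (strict=False): the parser carries on as best it can.
lenient = list(csv.reader(bad))

# strict=True: the same input is rejected.
rejected = False
try:
    list(csv.reader(bad, strict=True))
except csv.Error:
    rejected = True

print(lenient)
```
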
Reader Objects
--------------

Reader objects (:class:`DictReader` instances and objects returned by the
:func:`reader` function) have the following public methods:


.. method:: csvreader.next()

   Return the next row of the reader's iterable object as a list, parsed according
   to the current dialect.

Reader objects have the following public attributes:


.. attribute:: csvreader.dialect

   A read-only description of the dialect in use by the parser.


.. attribute:: csvreader.line_num

   The number of lines read from the source iterator. This is not the same as the
   number of records returned, as records can span multiple lines.

   .. versionadded:: 2.5

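The difference between lines and records shows up as soon as a quoted field
contains a newline. A minimal sketch with invented data:

```python
import csv

# Three physical lines, but only two records: the first record's quoted
# field continues on the second line.
lines = ['a,"multi\n', 'line",b\n', 'c,d\n']

reader = csv.reader(lines)
seen = [(reader.line_num, row) for row in reader]

for line_num, row in seen:
    print('%d %r' % (line_num, row))
```
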

DictReader objects have the following public attribute:


.. attribute:: csvreader.fieldnames

   If not passed as a parameter when creating the object, this attribute is
   initialized upon first access or when the first record is read from the
   file.

   .. versionchanged:: 2.6


Writer Objects
--------------

:class:`Writer` objects (:class:`DictWriter` instances and objects returned by
the :func:`writer` function) have the following public methods. A *row* must be
a sequence of strings or numbers for :class:`Writer` objects and a dictionary
mapping fieldnames to strings or numbers (by passing them through :func:`str`
first) for :class:`DictWriter` objects. Note that complex numbers are written
out surrounded by parens. This may cause some problems for other programs which
read CSV files (assuming they support complex numbers at all).


.. method:: csvwriter.writerow(row)

   Write the *row* parameter to the writer's file object, formatted according to
   the current dialect.


.. method:: csvwriter.writerows(rows)

   Write every element of the *rows* parameter (a list of *row* objects as
   described above) to the writer's file object, formatted according to the
   current dialect.

Writer objects have the following public attribute:


.. attribute:: csvwriter.dialect

   A read-only description of the dialect in use by the writer.


DictWriter objects have the following public method:


.. method:: DictWriter.writeheader()

   Write a row with the field names (as specified in the constructor).

   .. versionadded:: 2.7


.. _csv-examples:

Examples
--------

The simplest example of reading a CSV file::

   import csv
   with open('some.csv', 'rb') as f:
       reader = csv.reader(f)
       for row in reader:
           print row

Reading a file with an alternate format::

   import csv
   with open('passwd', 'rb') as f:
       reader = csv.reader(f, delimiter=':', quoting=csv.QUOTE_NONE)
       for row in reader:
           print row

The corresponding simplest possible writing example is::

   import csv
   with open('some.csv', 'wb') as f:
       writer = csv.writer(f)
       writer.writerows(someiterable)

Registering a new dialect::

   import csv
   csv.register_dialect('unixpwd', delimiter=':', quoting=csv.QUOTE_NONE)
   with open('passwd', 'rb') as f:
       reader = csv.reader(f, 'unixpwd')

A slightly more advanced use of the reader --- catching and reporting errors::

   import csv, sys
   filename = 'some.csv'
   with open(filename, 'rb') as f:
       reader = csv.reader(f)
       try:
           for row in reader:
               print row
       except csv.Error as e:
           sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))

And while the module doesn't directly support parsing strings, it can easily be
done::

   import csv
   for row in csv.reader(['one,two,three']):
       print row

The :mod:`csv` module doesn't directly support reading and writing Unicode, but
it is 8-bit-clean save for some problems with ASCII NUL characters. So you can
write functions or classes that handle the encoding and decoding for you as long
as you avoid encodings like UTF-16 that use NULs. UTF-8 is recommended.

:func:`unicode_csv_reader` below is a :term:`generator` that wraps :class:`csv.reader`
to handle Unicode CSV data (a list of Unicode strings). :func:`utf_8_encoder`
is a :term:`generator` that encodes the Unicode strings as UTF-8, one string (or row) at
a time. The encoded strings are parsed by the CSV reader, and
:func:`unicode_csv_reader` decodes the UTF-8-encoded cells back into Unicode::

   import csv

   def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
       # csv.py doesn't do Unicode; encode temporarily as UTF-8:
       csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                               dialect=dialect, **kwargs)
       for row in csv_reader:
           # decode UTF-8 back to Unicode, cell by cell:
           yield [unicode(cell, 'utf-8') for cell in row]

   def utf_8_encoder(unicode_csv_data):
       for line in unicode_csv_data:
           yield line.encode('utf-8')

For all other encodings the following :class:`UnicodeReader` and
:class:`UnicodeWriter` classes can be used. They take an additional *encoding*
parameter in their constructor and make sure that the data passes the real
reader or writer encoded as UTF-8::

   import csv, codecs, cStringIO

   class UTF8Recoder:
       """
       Iterator that reads an encoded stream and reencodes the input to UTF-8
       """
       def __init__(self, f, encoding):
           self.reader = codecs.getreader(encoding)(f)

       def __iter__(self):
           return self

       def next(self):
           return self.reader.next().encode("utf-8")

   class UnicodeReader:
       """
       A CSV reader which will iterate over lines in the CSV file "f",
       which is encoded in the given encoding.
       """

       def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
           f = UTF8Recoder(f, encoding)
           self.reader = csv.reader(f, dialect=dialect, **kwds)

       def next(self):
           row = self.reader.next()
           return [unicode(s, "utf-8") for s in row]

       def __iter__(self):
           return self

   class UnicodeWriter:
       """
       A CSV writer which will write rows to CSV file "f",
       which is encoded in the given encoding.
       """

       def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
           # Redirect output to a queue
           self.queue = cStringIO.StringIO()
           self.writer = csv.writer(self.queue, dialect=dialect, **kwds)
           self.stream = f
           self.encoder = codecs.getincrementalencoder(encoding)()

       def writerow(self, row):
           self.writer.writerow([s.encode("utf-8") for s in row])
           # Fetch UTF-8 output from the queue ...
           data = self.queue.getvalue()
           data = data.decode("utf-8")
           # ... and reencode it into the target encoding
           data = self.encoder.encode(data)
           # write to the target stream
           self.stream.write(data)
           # empty queue
           self.queue.truncate(0)

       def writerows(self, rows):
           for row in rows:
               self.writerow(row)
