Brett Cannon | 4b964f9 | 2008-05-05 20:21:38 +0000 | [diff] [blame^] | 1 | :mod:`json` JSON encoder and decoder |
| 2 | ==================================== |
| 3 | |
| 4 | .. module:: json |
| 5 | :synopsis: encode and decode the JSON format |
| 6 | .. moduleauthor:: Bob Ippolito <bob@redivi.com> |
| 7 | .. sectionauthor:: Bob Ippolito <bob@redivi.com> |
| 8 | .. versionadded:: 2.6 |
| 9 | |
| 10 | JSON (JavaScript Object Notation) <http://json.org> is a subset of JavaScript |
| 11 | syntax (ECMA-262 3rd edition) used as a lightweight data interchange format. |
| 12 | |
| 13 | :mod:`json` exposes an API familiar to users of the standard library :mod:`marshal` and |
| 14 | :mod:`pickle` modules. |
| 15 | |
| 16 | Encoding basic Python object hierarchies:: |
| 17 | |
| 18 | >>> import json |
| 19 | >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}]) |
| 20 | '["foo", {"bar": ["baz", null, 1.0, 2]}]' |
| 21 | >>> print json.dumps("\"foo\bar") |
| 22 | "\"foo\bar" |
| 23 | >>> print json.dumps(u'\u1234') |
| 24 | "\u1234" |
| 25 | >>> print json.dumps('\\') |
| 26 | "\\" |
| 27 | >>> print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True) |
| 28 | {"a": 0, "b": 0, "c": 0} |
| 29 | >>> from StringIO import StringIO |
| 30 | >>> io = StringIO() |
| 31 | >>> json.dump(['streaming API'], io) |
| 32 | >>> io.getvalue() |
| 33 | '["streaming API"]' |
| 34 | |
| 35 | Compact encoding:: |
| 36 | |
| 37 | >>> import json |
| 38 | >>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':')) |
| 39 | '[1,2,3,{"4":5,"6":7}]' |
| 40 | |
| 41 | Pretty printing:: |
| 42 | |
| 43 | >>> import json |
| 44 | >>> print json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4) |
| 45 | { |
| 46 | "4": 5, |
| 47 | "6": 7 |
| 48 | } |
| 49 | |
| 50 | Decoding JSON:: |
| 51 | |
| 52 | >>> import json |
| 53 | >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') |
| 54 | [u'foo', {u'bar': [u'baz', None, 1.0, 2]}] |
| 55 | >>> json.loads('"\\"foo\\bar"') |
| 56 | u'"foo\x08ar' |
| 57 | >>> from StringIO import StringIO |
| 58 | >>> io = StringIO('["streaming API"]') |
| 59 | >>> json.load(io) |
| 60 | [u'streaming API'] |
| 61 | |
| 62 | Specializing JSON object decoding:: |
| 63 | |
| 64 | >>> import json |
| 65 | >>> def as_complex(dct): |
| 66 | ... if '__complex__' in dct: |
| 67 | ... return complex(dct['real'], dct['imag']) |
| 68 | ... return dct |
| 69 | ... |
| 70 | >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}', |
| 71 | ... object_hook=as_complex) |
| 72 | (1+2j) |
| 73 | >>> import decimal |
| 74 | >>> json.loads('1.1', parse_float=decimal.Decimal) |
| 75 | Decimal('1.1') |
| 76 | |
| 77 | Extending JSONEncoder:: |
| 78 | |
| 79 | >>> import json |
| 80 | >>> class ComplexEncoder(json.JSONEncoder): |
| 81 | ... def default(self, obj): |
| 82 | ... if isinstance(obj, complex): |
| 83 | ... return [obj.real, obj.imag] |
| 84 | ... return json.JSONEncoder.default(self, obj) |
| 85 | ... |
| 86 | >>> json.dumps(2 + 1j, cls=ComplexEncoder) |
| 87 | '[2.0, 1.0]' |
| 88 | >>> ComplexEncoder().encode(2 + 1j) |
| 89 | '[2.0, 1.0]' |
| 90 | >>> list(ComplexEncoder().iterencode(2 + 1j)) |
| 91 | ['[', '2.0', ', ', '1.0', ']'] |
| 92 | |
| 93 | |
| 94 | .. highlight:: none |
| 95 | |
| 96 | Using json.tool from the shell to validate and pretty-print:: |
| 97 | |
| 98 | $ echo '{"json":"obj"}' | python -mjson.tool |
| 99 | { |
| 100 | "json": "obj" |
| 101 | } |
| 102 | $ echo '{ 1.2:3.4}' | python -mjson.tool |
| 103 | Expecting property name: line 1 column 2 (char 2) |
| 104 | |
| 105 | .. highlight:: python |
| 106 | |
| 107 | .. note:: |
| 108 | |
| 109 | Note that the JSON produced by this module's default settings is a subset of |
| 110 | YAML, so it may be used as a serializer for that as well. |
| 111 | |
| 112 | |
| 113 | Basic Usage |
| 114 | ----------- |
| 115 | |
| 116 | .. function:: dump(obj, fp[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]]) |
| 117 | |
| 118 | Serialize *obj* as a JSON formatted stream to *fp* (a |
| 119 | ``.write()``-supporting file-like object). |
| 120 | |
| 121 | If *skipkeys* is ``True`` (it is ``False`` by default), then ``dict`` keys |
| 122 | that are not basic types (``str``, ``unicode``, ``int``, ``long``, |
| 123 | ``float``, ``bool``, ``None``) will be skipped instead of raising a |
| 124 | :exc:`TypeError`. |
| 125 | |
| 126 | If *ensure_ascii* is ``False`` (it is ``True`` by default), then some |
| 127 | chunks written to *fp* may be ``unicode`` instances, subject to normal |
| 128 | Python ``str`` to ``unicode`` coercion rules. Unless ``fp.write()`` |
| 129 | explicitly understands ``unicode`` (as in ``codecs.getwriter()``) this is |
| 130 | likely to cause an error. |
| 131 | |
| 132 | If *check_circular* is ``False``, then the circular reference check for |
| 133 | container types will be skipped and a circular reference will result in an |
| 134 | :exc:`OverflowError` (or worse). |
| 135 | |
| 136 | If *allow_nan* is ``False``, then it will be a :exc:`ValueError` to |
| 137 | serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in |
| 138 | strict compliance with the JSON specification, instead of using the JavaScript |
| 139 | equivalents (``NaN``, ``Infinity``, ``-Infinity``). |
| 140 | |
| 141 | If *indent* is a non-negative integer, then JSON array elements and object |
| 142 | members will be pretty-printed with that indent level. An indent level of 0 |
| 143 | will only insert newlines. ``None`` is the most compact representation. |
| 144 | |
| 145 | If *separators* is an ``(item_separator, dict_separator)`` tuple then it |
| 146 | will be used instead of the default ``(', ', ': ')`` separators. ``(',', |
| 147 | ':')`` is the most compact JSON representation. |
| 148 | |
| 149 | *encoding* is the character encoding for ``str`` instances; the default is UTF-8. |
| 150 | |
| 151 | *default(obj)* is a function that should return a serializable version of |
| 152 | obj or raise :exc:`TypeError`. The default simply raises :exc:`TypeError`. |
| 153 | |
| 154 | To use a custom :class:`JSONEncoder` subclass (e.g. one that overrides the |
| 155 | ``.default()`` method to serialize additional types), specify it with the |
| 156 | *cls* kwarg. |
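
The options above can be combined in a single call. The following is a minimal sketch, not part of the module: ``encode_set`` is a hypothetical *default* function, and the ``io`` fallback is an assumption added so the snippet also runs on interpreters without ``StringIO``.

```python
import json

try:
    from StringIO import StringIO  # same class as in the examples above
except ImportError:
    from io import StringIO  # assumption: fallback where StringIO is absent


def encode_set(obj):
    # Hypothetical *default* function: serialize sets as sorted lists and
    # reject everything else, as the description of *default* requires.
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(repr(obj) + " is not JSON serializable")


buf = StringIO()
# The tuple key is not a basic type; skipkeys=True drops it instead of
# raising TypeError. The set value is handed to encode_set().
json.dump({"tags": set(["b", "a"]), ("x", "y"): 1}, buf,
          skipkeys=True, default=encode_set)
written = buf.getvalue()
```

Without ``skipkeys=True`` the same call would raise :exc:`TypeError` because of the tuple key.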
| 157 | |
| 158 | |
| 159 | .. function:: dumps(obj[, skipkeys[, ensure_ascii[, check_circular[, allow_nan[, cls[, indent[, separators[, encoding[, default[, **kw]]]]]]]]]]) |
| 160 | |
| 161 | Serialize *obj* to a JSON formatted ``str``. |
| 162 | |
| 163 | If *skipkeys* is ``True`` (it is ``False`` by default), then ``dict`` keys |
| 164 | that are not basic types (``str``, ``unicode``, ``int``, ``long``, |
| 165 | ``float``, ``bool``, ``None``) will be skipped instead of raising a |
| 166 | :exc:`TypeError`. |
| 167 | |
| 168 | If *ensure_ascii* is ``False``, then the return value will be a ``unicode`` |
| 169 | instance subject to normal Python ``str`` to ``unicode`` coercion rules |
| 170 | instead of being escaped to an ASCII ``str``. |
| 171 | |
| 172 | If *check_circular* is ``False``, then the circular reference check for |
| 173 | container types will be skipped and a circular reference will result in an |
| 174 | :exc:`OverflowError` (or worse). |
| 175 | |
| 176 | If *allow_nan* is ``False``, then it will be a :exc:`ValueError` to |
| 177 | serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in |
| 178 | strict compliance with the JSON specification, instead of using the JavaScript |
| 179 | equivalents (``NaN``, ``Infinity``, ``-Infinity``). |
| 180 | |
| 181 | If *indent* is a non-negative integer, then JSON array elements and object |
| 182 | members will be pretty-printed with that indent level. An indent level of 0 |
| 183 | will only insert newlines. ``None`` is the most compact representation. |
| 184 | |
| 185 | If *separators* is an ``(item_separator, dict_separator)`` tuple then it |
| 186 | will be used instead of the default ``(', ', ': ')`` separators. ``(',', |
| 187 | ':')`` is the most compact JSON representation. |
| 188 | |
| 189 | *encoding* is the character encoding for ``str`` instances; the default is UTF-8. |
| 190 | |
| 191 | *default(obj)* is a function that should return a serializable version of |
| 192 | obj or raise :exc:`TypeError`. The default simply raises :exc:`TypeError`. |
| 193 | |
| 194 | To use a custom :class:`JSONEncoder` subclass (e.g. one that overrides the |
| 195 | ``.default()`` method to serialize additional types), specify it with the |
| 196 | *cls* kwarg. |
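
As a rough illustration of how *allow_nan*, *sort_keys* and *separators* interact, the sketch below serializes the same kind of data with different settings. The variable names are made up for this example.

```python
import json

data = {"b": [1, 2], "a": float("inf")}

# With the default allow_nan=True the JavaScript constant is emitted,
# which is outside the JSON specification.
relaxed = json.dumps(data, sort_keys=True)

# With allow_nan=False the same value is rejected with ValueError.
try:
    json.dumps(data, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# (',', ':') removes the whitespace after each separator.
compact = json.dumps({"b": [1, 2], "a": None}, sort_keys=True,
                     separators=(",", ":"))
```

Here ``relaxed`` contains ``Infinity``, ``strict_error`` holds the raised exception, and ``compact`` has no spaces at all.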
| 197 | |
| 198 | |
| 199 | .. function:: loads(s[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, **kw]]]]]]]) |
| 200 | |
| 201 | Deserialize *s* (a ``str`` or ``unicode`` instance containing a JSON |
| 202 | document) to a Python object. |
| 203 | |
| 204 | If *s* is a ``str`` instance and is encoded with an ASCII based encoding |
| 205 | other than utf-8 (e.g. latin-1) then an appropriate ``encoding`` name must be |
| 206 | specified. Encodings that are not ASCII based (such as UCS-2) are not allowed |
| 207 | and should be decoded to ``unicode`` first. |
| 208 | |
| 209 | *object_hook* is an optional function that will be called with the result of |
| 210 | any object literal decode (a ``dict``). The return value of ``object_hook`` |
| 211 | will be used instead of the ``dict``. This feature can be used to implement |
| 212 | custom decoders (e.g. JSON-RPC class hinting). |
| 213 | |
| 214 | *parse_float*, if specified, will be called with the string of every JSON |
| 215 | float to be decoded. By default, this is equivalent to |
| 216 | ``float(num_str)``. This can be used to use another datatype or parser for |
| 217 | JSON floats (e.g. decimal.Decimal). |
| 218 | |
| 219 | *parse_int*, if specified, will be called with the string of every JSON int |
| 220 | to be decoded. By default this is equivalent to int(num_str). This can be |
| 221 | used to use another datatype or parser for JSON integers (e.g. float). |
| 222 | |
| 223 | *parse_constant*, if specified, will be called with one of the following |
| 224 | strings: -Infinity, Infinity, NaN, null, true, false. This can be used to |
| 225 | raise an exception if invalid JSON numbers are encountered. |
| 226 | |
| 227 | To use a custom :class:`JSONDecoder` subclass, specify it with the ``cls`` |
| 228 | kwarg. Additional keyword arguments will be passed to the constructor of the |
| 229 | class. |
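
The decoding hooks can be used together. The sketch below is only an illustration: the ``__point__`` class hint and the ``as_point`` function are hypothetical, chosen to mirror the ``as_complex`` example at the top of this document.

```python
import decimal
import json

doc = '{"price": 1.10, "qty": 3, "point": {"__point__": true, "x": 1, "y": 2}}'


def as_point(dct):
    # Hypothetical class hint: objects tagged with "__point__" become tuples.
    if dct.get("__point__"):
        return (dct["x"], dct["y"])
    return dct


result = json.loads(doc,
                    object_hook=as_point,
                    parse_float=decimal.Decimal,  # keep "1.10" exact
                    parse_int=float)              # read every JSON int as float
```

The hook runs on the innermost object first, so the outer ``dict`` already contains the converted tuple when ``as_point`` sees it.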
| 230 | |
| 231 | |
| 232 | .. function:: load(fp[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, **kw]]]]]]]) |
| 233 | |
| 234 | Deserialize *fp* (a ``.read()``-supporting file-like object containing a JSON |
| 235 | document) to a Python object. |
| 236 | |
| 237 | If the contents of *fp* are encoded with an ASCII based encoding other than |
| 238 | utf-8 (e.g. latin-1), then an appropriate ``encoding`` name must be |
| 239 | specified. Encodings that are not ASCII based (such as UCS-2) are not |
| 240 | allowed, and should be wrapped with ``codecs.getreader(encoding)(fp)``, |
| 241 | or simply decoded to a ``unicode`` object and passed to :func:`loads`. |
| 242 | |
| 243 | *object_hook* is an optional function that will be called with the result of |
| 244 | any object literal decode (a ``dict``). The return value of *object_hook* |
| 245 | will be used instead of the ``dict``. This feature can be used to implement |
| 246 | custom decoders (e.g. JSON-RPC class hinting). |
| 247 | |
| 248 | To use a custom :class:`JSONDecoder` subclass, specify it with the ``cls`` |
| 249 | kwarg. Additional keyword arguments will be passed to the constructor of the |
| 250 | class. |
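
For a stream in an encoding that is not ASCII based, wrapping it in a codec reader might look like the sketch below. The in-memory ``io.BytesIO`` object stands in for a real file and is an assumption of this example.

```python
import codecs
import io
import json

# UTF-16 is not ASCII based, so the bytes cannot be handed to load() as-is.
raw = io.BytesIO(u'["caf\u00e9"]'.encode("utf-16"))

# codecs.getreader() returns a StreamReader class for the encoding; calling
# it with the byte stream gives a file-like object whose read() returns
# decoded text.
reader = codecs.getreader("utf-16")(raw)
value = json.load(reader)
```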
| 251 | |
| 252 | |
| 253 | Encoders and decoders |
| 254 | --------------------- |
| 255 | |
| 256 | .. class:: JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict]]]]]]) |
| 257 | |
| 258 | Simple JSON decoder |
| 259 | |
| 260 | Performs the following translations in decoding by default: |
| 261 | |
| 262 | +---------------+-------------------+ |
| 263 | | JSON | Python | |
| 264 | +===============+===================+ |
| 265 | | object | dict | |
| 266 | +---------------+-------------------+ |
| 267 | | array | list | |
| 268 | +---------------+-------------------+ |
| 269 | | string | unicode | |
| 270 | +---------------+-------------------+ |
| 271 | | number (int) | int, long | |
| 272 | +---------------+-------------------+ |
| 273 | | number (real) | float | |
| 274 | +---------------+-------------------+ |
| 275 | | true | True | |
| 276 | +---------------+-------------------+ |
| 277 | | false | False | |
| 278 | +---------------+-------------------+ |
| 279 | | null | None | |
| 280 | +---------------+-------------------+ |
| 281 | |
| 282 | It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their |
| 283 | corresponding ``float`` values, which is outside the JSON spec. |
| 284 | |
| 285 | *encoding* determines the encoding used to interpret any ``str`` objects |
| 286 | decoded by this instance (utf-8 by default). It has no effect when decoding |
| 287 | ``unicode`` objects. |
| 288 | |
| 289 | Note that currently only encodings that are a superset of ASCII work; |
| 290 | strings of other encodings should be passed in as ``unicode``. |
| 291 | |
| 292 | *object_hook*, if specified, will be called with the result of every JSON |
| 293 | object decoded and its return value will be used in place of the given |
| 294 | ``dict``. This can be used to provide custom deserializations (e.g. to |
| 295 | support JSON-RPC class hinting). |
| 296 | |
| 297 | *parse_float*, if specified, will be called with the string of every JSON |
| 298 | float to be decoded. By default this is equivalent to float(num_str). This |
| 299 | can be used to use another datatype or parser for JSON floats |
| 300 | (e.g. decimal.Decimal). |
| 301 | |
| 302 | *parse_int*, if specified, will be called with the string of every JSON int |
| 303 | to be decoded. By default this is equivalent to int(num_str). This can be |
| 304 | used to use another datatype or parser for JSON integers (e.g. float). |
| 305 | |
| 306 | *parse_constant*, if specified, will be called with one of the following |
| 307 | strings: -Infinity, Infinity, NaN, null, true, false. This can be used to |
| 308 | raise an exception if invalid JSON numbers are encountered. |
| 309 | |
| 310 | |
| 311 | .. method:: decode(s) |
| 312 | |
| 313 | Return the Python representation of *s* (a ``str`` or ``unicode`` instance |
| 314 | containing a JSON document). |
| 315 | |
| 316 | .. method:: raw_decode(s) |
| 317 | |
| 318 | Decode a JSON document from *s* (a ``str`` or ``unicode`` beginning with a |
| 319 | JSON document) and return a 2-tuple of the Python representation and the |
| 320 | index in *s* where the document ended. |
| 321 | |
| 322 | This can be used to decode a JSON document from a string that may have |
| 323 | extraneous data at the end. |
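
A short sketch of the difference between the two methods, using a made-up input string with trailing text:

```python
import json

decoder = json.JSONDecoder()
text = '{"a": 1} trailing text'

# raw_decode() stops at the end of the first document and reports where.
obj, end = decoder.raw_decode(text)
rest = text[end:]

# decode() insists on a single document and rejects the same input.
try:
    decoder.decode(text)
    decode_error = None
except ValueError as exc:
    decode_error = exc
```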
| 324 | |
| 325 | |
| 326 | .. class:: JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, encoding[, default]]]]]]]]]) |
| 327 | |
| 328 | Extensible JSON <http://json.org> encoder for Python data structures. |
| 329 | |
| 330 | Supports the following objects and types by default: |
| 331 | |
| 332 | +-------------------+---------------+ |
| 333 | | Python | JSON | |
| 334 | +===================+===============+ |
| 335 | | dict | object | |
| 336 | +-------------------+---------------+ |
| 337 | | list, tuple | array | |
| 338 | +-------------------+---------------+ |
| 339 | | str, unicode | string | |
| 340 | +-------------------+---------------+ |
| 341 | | int, long, float | number | |
| 342 | +-------------------+---------------+ |
| 343 | | True | true | |
| 344 | +-------------------+---------------+ |
| 345 | | False | false | |
| 346 | +-------------------+---------------+ |
| 347 | | None | null | |
| 348 | +-------------------+---------------+ |
| 349 | |
| 350 | To extend this to recognize other objects, subclass and implement a |
| 351 | ``.default()`` method that returns a serializable object for ``o`` if |
| 352 | possible; otherwise it should call the superclass implementation |
| 353 | (to raise :exc:`TypeError`). |
| 354 | |
| 355 | If *skipkeys* is ``False`` (the default), then it is a :exc:`TypeError` to |
| 356 | attempt encoding of keys that are not str, int, long, float or None. If |
| 357 | *skipkeys* is ``True``, such items are simply skipped. |
| 358 | |
| 359 | If *ensure_ascii* is ``True``, the output is guaranteed to be ``str`` objects |
| 360 | with all incoming unicode characters escaped. If *ensure_ascii* is |
| 361 | ``False``, the output will be a ``unicode`` object. |
| 362 | |
| 363 | If *check_circular* is ``True`` (the default), then lists, dicts, and custom |
| 364 | encoded objects will be checked for circular references during encoding to |
| 365 | prevent an infinite recursion (which would cause an :exc:`OverflowError`). |
| 366 | Otherwise, no such check takes place. |
| 367 | |
| 368 | If *allow_nan* is ``True`` (the default), then ``NaN``, ``Infinity``, and ``-Infinity`` |
| 369 | will be encoded as such. This behavior is not JSON specification compliant, |
| 370 | but is consistent with most JavaScript based encoders and decoders. |
| 371 | Otherwise, it will be a :exc:`ValueError` to encode such floats. |
| 372 | |
| 373 | If *sort_keys* is ``True`` (it is ``False`` by default), then the output of dictionaries |
| 374 | will be sorted by key; this is useful for regression tests to ensure that |
| 375 | JSON serializations can be compared on a day-to-day basis. |
| 376 | |
| 377 | If *indent* is a non-negative integer (it is ``None`` by default), then JSON |
| 378 | array elements and object members will be pretty-printed with that indent |
| 379 | level. An indent level of 0 will only insert newlines. ``None`` is the most |
| 380 | compact representation. |
| 381 | |
| 382 | If specified, *separators* should be a (item_separator, key_separator) tuple. |
| 383 | The default is ``(', ', ': ')``. To get the most compact JSON |
| 384 | representation, you should specify ``(',', ':')`` to eliminate whitespace. |
| 385 | |
| 386 | If specified, *default* is a function that gets called for objects that can't |
| 387 | otherwise be serialized. It should return a JSON encodable version of the |
| 388 | object or raise a :exc:`TypeError`. |
| 389 | |
| 390 | If *encoding* is not ``None``, then all input strings will be transformed |
| 391 | into unicode using that encoding prior to JSON-encoding. The default is |
| 392 | UTF-8. |
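
The constructor options can be exercised directly on an encoder instance. The following sketch uses made-up data. It passes *separators* explicitly so that the indented output carries no trailing whitespace whatever the default item separator is.

```python
import json

# A list that contains itself is the simplest circular reference.
cycle = []
cycle.append(cycle)

# With the default check_circular=True the cycle is detected up front.
try:
    json.JSONEncoder().encode(cycle)
    circular_error = None
except ValueError as exc:
    circular_error = exc

# sort_keys and indent together give stable, readable output.
encoder = json.JSONEncoder(sort_keys=True, indent=2, separators=(",", ": "))
pretty = encoder.encode({"b": 1, "a": [True, None]})
```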
| 393 | |
| 394 | |
| 395 | .. method:: default(o) |
| 396 | |
| 397 | Implement this method in a subclass such that it returns a serializable |
| 398 | object for *o*, or calls the base implementation (to raise a |
| 399 | :exc:`TypeError`). |
| 400 | |
| 401 | For example, to support arbitrary iterators, you could implement default |
| 402 | like this:: |
| 403 | |
| 404 | def default(self, o): |
| 405 | try: |
| 406 | iterable = iter(o) |
| 407 | except TypeError: |
| 408 | pass |
| 409 | else: |
| 410 | return list(iterable) |
| 411 | return JSONEncoder.default(self, o) |
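
Placed in a subclass, the method above can be tried out as follows. ``IterableEncoder`` is a hypothetical name used only for this sketch.

```python
import json


class IterableEncoder(json.JSONEncoder):
    def default(self, o):
        # Called only for objects the encoder cannot serialize itself.
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Not iterable either: let the base class raise TypeError.
        return json.JSONEncoder.default(self, o)


encoded = IterableEncoder().encode({"squares": (n * n for n in range(4))})

try:
    IterableEncoder().encode(object())
    unsupported_error = None
except TypeError as exc:
    unsupported_error = exc
```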
| 412 | |
| 413 | |
| 414 | .. method:: encode(o) |
| 415 | |
| 416 | Return a JSON string representation of a Python data structure, *o*. For |
| 417 | example:: |
| 418 | |
| 419 | >>> JSONEncoder().encode({"foo": ["bar", "baz"]}) |
| 420 | '{"foo": ["bar", "baz"]}' |
| 421 | |
| 422 | |
| 423 | .. method:: iterencode(o) |
| 424 | |
| 425 | Encode the given object, *o*, and yield each string representation as |
| 426 | available. |
| 427 | |
| 428 | For example:: |
| 429 | |
| 430 | for chunk in JSONEncoder().iterencode(bigobject): |
| 431 | mysocket.write(chunk) |
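
The chunks yielded by ``iterencode()`` concatenate to the same text that ``encode()`` returns. A small sketch, collecting the chunks in a list instead of writing them to a socket:

```python
import json

encoder = json.JSONEncoder()
payload = {"items": [1, 2, 3]}

# Collect the chunks; a real consumer would write each one as it arrives.
chunks = list(encoder.iterencode(payload))
joined = "".join(chunks)
whole = encoder.encode(payload)
```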