:mod:`json` --- JSON encoder and decoder
========================================

.. module:: json
   :synopsis: Encode and decode the JSON format.
.. moduleauthor:: Bob Ippolito <bob@redivi.com>
.. sectionauthor:: Bob Ippolito <bob@redivi.com>
.. versionadded:: 2.6

`JSON (JavaScript Object Notation) <http://json.org>`_, specified by
:rfc:`4627`, is a lightweight data interchange format based on a subset of
`JavaScript <http://en.wikipedia.org/wiki/JavaScript>`_ syntax (`ECMA-262 3rd
edition <http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdf>`_).

:mod:`json` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules.

Encoding basic Python object hierarchies::

    >>> import json
    >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
    '["foo", {"bar": ["baz", null, 1.0, 2]}]'
    >>> print json.dumps("\"foo\bar")
    "\"foo\bar"
    >>> print json.dumps(u'\u1234')
    "\u1234"
    >>> print json.dumps('\\')
    "\\"
    >>> print json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)
    {"a": 0, "b": 0, "c": 0}
    >>> from StringIO import StringIO
    >>> io = StringIO()
    >>> json.dump(['streaming API'], io)
    >>> io.getvalue()
    '["streaming API"]'

Compact encoding::

    >>> import json
    >>> json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))
    '[1,2,3,{"4":5,"6":7}]'

Pretty printing::

    >>> import json
    >>> print json.dumps({'4': 5, '6': 7}, sort_keys=True,
    ...                  indent=4, separators=(',', ': '))
    {
        "4": 5,
        "6": 7
    }

Decoding JSON::

    >>> import json
    >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
    [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
    >>> json.loads('"\\"foo\\bar"')
    u'"foo\x08ar'
    >>> from StringIO import StringIO
    >>> io = StringIO('["streaming API"]')
    >>> json.load(io)
    [u'streaming API']

Specializing JSON object decoding::

    >>> import json
    >>> def as_complex(dct):
    ...     if '__complex__' in dct:
    ...         return complex(dct['real'], dct['imag'])
    ...     return dct
    ...
    >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
    ...     object_hook=as_complex)
    (1+2j)
    >>> import decimal
    >>> json.loads('1.1', parse_float=decimal.Decimal)
    Decimal('1.1')

Extending :class:`JSONEncoder`::

    >>> import json
    >>> class ComplexEncoder(json.JSONEncoder):
    ...     def default(self, obj):
    ...         if isinstance(obj, complex):
    ...             return [obj.real, obj.imag]
    ...         # Let the base class default method raise the TypeError
    ...         return json.JSONEncoder.default(self, obj)
    ...
    >>> json.dumps(2 + 1j, cls=ComplexEncoder)
    '[2.0, 1.0]'
    >>> ComplexEncoder().encode(2 + 1j)
    '[2.0, 1.0]'
    >>> list(ComplexEncoder().iterencode(2 + 1j))
    ['[', '2.0', ', ', '1.0', ']']

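The encoder and decoder hooks shown above can be paired so that a value survives a round trip. The following is only a sketch: ``TaggedComplexEncoder`` and the ``__complex__`` marker key are illustrative names chosen here, not something the module defines. It is written to avoid version-specific syntax.

```python
import json


class TaggedComplexEncoder(json.JSONEncoder):
    """Hypothetical encoder: writes complex numbers as a tagged object."""

    def default(self, obj):
        if isinstance(obj, complex):
            return {"__complex__": True, "real": obj.real, "imag": obj.imag}
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, obj)


def as_complex(dct):
    """object_hook: turn tagged objects back into complex numbers."""
    if "__complex__" in dct:
        return complex(dct["real"], dct["imag"])
    return dct


encoded = json.dumps({"z": 2 + 1j}, cls=TaggedComplexEncoder)
decoded = json.loads(encoded, object_hook=as_complex)
```

Because *object_hook* is called for every decoded object, innermost first, the tagged object is converted before the enclosing dictionary is handed to the hook.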

.. highlight:: none

Using json.tool from the shell to validate and pretty-print::

    $ echo '{"json":"obj"}' | python -mjson.tool
    {
        "json": "obj"
    }
    $ echo '{1.2:3.4}' | python -mjson.tool
    Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

.. highlight:: python

.. note::

   JSON is a subset of `YAML <http://yaml.org/>`_ 1.2. The JSON produced by
   this module's default settings (in particular, the default *separators*
   value) is also a subset of YAML 1.0 and 1.1. This module can thus also be
   used as a YAML serializer.


Basic Usage
-----------

.. function:: dump(obj, fp, skipkeys=False, ensure_ascii=True, \
                   check_circular=True, allow_nan=True, cls=None, \
                   indent=None, separators=None, encoding="utf-8", \
                   default=None, sort_keys=False, **kw)

   Serialize *obj* as a JSON formatted stream to *fp* (a ``.write()``-supporting
   :term:`file-like object`) using this :ref:`conversion table
   <py-to-json-table>`.

   If *skipkeys* is ``True`` (default: ``False``), then dict keys that are not
   of a basic type (:class:`str`, :class:`unicode`, :class:`int`, :class:`long`,
   :class:`float`, :class:`bool`, ``None``) will be skipped instead of raising a
   :exc:`TypeError`.

   If *ensure_ascii* is ``True`` (the default), all non-ASCII characters in the
   output are escaped with ``\uXXXX`` sequences, and the result is a
   :class:`str` instance consisting of ASCII characters only. If
   *ensure_ascii* is ``False``, some chunks written to *fp* may be
   :class:`unicode` instances. This usually happens because the input contains
   unicode strings or the *encoding* parameter is used. Unless ``fp.write()``
   explicitly understands :class:`unicode` (as in :func:`codecs.getwriter`)
   this is likely to cause an error.

   If *check_circular* is ``False`` (default: ``True``), then the circular
   reference check for container types will be skipped and a circular reference
   will result in an :exc:`OverflowError` (or worse).

   If *allow_nan* is ``False`` (default: ``True``), then it will be a
   :exc:`ValueError` to serialize out of range :class:`float` values (``nan``,
   ``inf``, ``-inf``) in strict compliance with the JSON specification, instead
   of using the JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).

   If *indent* is a non-negative integer, then JSON array elements and object
   members will be pretty-printed with that indent level. An indent level of 0,
   or negative, will only insert newlines. ``None`` (the default) selects the
   most compact representation.

   .. note::

      Since the default item separator is ``', '``, the output might include
      trailing whitespace when *indent* is specified. You can use
      ``separators=(',', ': ')`` to avoid this.

   If *separators* is an ``(item_separator, key_separator)`` tuple, then it
   will be used instead of the default ``(', ', ': ')`` separators. ``(',',
   ':')`` is the most compact JSON representation.

   *encoding* is the character encoding for :class:`str` instances; the default
   is UTF-8.

   *default(obj)* is a function that should return a serializable version of
   *obj* or raise :exc:`TypeError`. The default simply raises :exc:`TypeError`.

   If *sort_keys* is ``True`` (default: ``False``), then the output of
   dictionaries will be sorted by key.

   To use a custom :class:`JSONEncoder` subclass (e.g. one that overrides the
   :meth:`default` method to serialize additional types), specify it with the
   *cls* kwarg; otherwise :class:`JSONEncoder` is used.

   .. note::

      Unlike :mod:`pickle` and :mod:`marshal`, JSON is not a framed protocol so
      trying to serialize more objects with repeated calls to :func:`dump` and
      the same *fp* will result in an invalid JSON file.

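A minimal sketch of how a few of the keyword arguments described for :func:`dump` behave. It uses :func:`dumps`, which takes the same arguments, so that no file object is needed. ``encode_set`` is a hypothetical helper written for this illustration, not part of the module.

```python
import json

# skipkeys=True silently drops keys that are not of a basic type
# (here a tuple) instead of raising TypeError.
skipped = json.dumps({"a": 1, ("x", "y"): 2}, skipkeys=True)

# allow_nan=False turns out-of-range floats into a ValueError.
try:
    json.dumps(float("inf"), allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True


# Hypothetical helper passed as *default*: called only for objects the
# encoder cannot serialize on its own.
def encode_set(obj):
    if isinstance(obj, (set, frozenset)):
        return sorted(obj)
    raise TypeError(repr(obj) + " is not JSON serializable")


compact = json.dumps({"tags": set(["b", "a"])}, default=encode_set,
                     sort_keys=True, separators=(",", ":"))
```

Here ``compact`` contains no whitespace at all because both separators were overridden.
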

.. function:: dumps(obj, skipkeys=False, ensure_ascii=True, \
                    check_circular=True, allow_nan=True, cls=None, \
                    indent=None, separators=None, encoding="utf-8", \
                    default=None, sort_keys=False, **kw)

   Serialize *obj* to a JSON formatted :class:`str` using this :ref:`conversion
   table <py-to-json-table>`. If *ensure_ascii* is ``False``, the result may
   contain non-ASCII characters and the return value may be a :class:`unicode`
   instance.

   The arguments have the same meaning as in :func:`dump`.

   .. note::

      Keys in key/value pairs of JSON are always of the type :class:`str`. When
      a dictionary is converted into JSON, all the keys of the dictionary are
      coerced to strings. As a result, if a dictionary is converted into JSON
      and then back into a dictionary, the dictionary may not equal the
      original one. That is, ``loads(dumps(x)) != x`` if *x* has non-string
      keys.

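The key coercion described in the note can be seen in a short round trip. This is only an illustration; the dictionary contents are made up.

```python
import json

original = {1: "one", 2.5: "two and a half", None: "nothing"}

# Every key is written as a JSON string ...
round_tripped = json.loads(json.dumps(original))

# ... so the decoded dictionary has string keys and no longer
# compares equal to the original.
keys_after = sorted(round_tripped)
```

If the original key types matter, they have to be restored by the application, for example in an *object_hook*.
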

.. function:: load(fp[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])

   Deserialize *fp* (a ``.read()``-supporting :term:`file-like object`
   containing a JSON document) to a Python object using this :ref:`conversion
   table <json-to-py-table>`.

   If the contents of *fp* are encoded with an ASCII based encoding other than
   UTF-8 (e.g. latin-1), then an appropriate *encoding* name must be specified.
   Encodings that are not ASCII based (such as UCS-2) are not allowed; in that
   case *fp* should be wrapped with ``codecs.getreader(encoding)(fp)``, or its
   contents simply decoded to a :class:`unicode` object and passed to
   :func:`loads`.

   *object_hook* is an optional function that will be called with the result of
   any object literal decoded (a :class:`dict`). The return value of
   *object_hook* will be used instead of the :class:`dict`. This feature can be
   used to implement custom decoders (e.g. `JSON-RPC <http://www.jsonrpc.org>`_
   class hinting).

   *object_pairs_hook* is an optional function that will be called with the
   result of any object literal decoded with an ordered list of pairs. The
   return value of *object_pairs_hook* will be used instead of the
   :class:`dict`. This feature can be used to implement custom decoders that
   rely on the order in which the key and value pairs are decoded (for example,
   :class:`collections.OrderedDict` will remember the order of insertion). If
   *object_hook* is also defined, the *object_pairs_hook* takes priority.

   .. versionchanged:: 2.7
      Added support for *object_pairs_hook*.

   *parse_float*, if specified, will be called with the string of every JSON
   float to be decoded. By default, this is equivalent to ``float(num_str)``.
   This can be used to substitute another datatype or parser for JSON floats
   (e.g. :class:`decimal.Decimal`).

   *parse_int*, if specified, will be called with the string of every JSON int
   to be decoded. By default, this is equivalent to ``int(num_str)``. This can
   be used to substitute another datatype or parser for JSON integers
   (e.g. :class:`float`).

   *parse_constant*, if specified, will be called with one of the following
   strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
   This can be used to raise an exception if invalid JSON numbers
   are encountered.

   .. versionchanged:: 2.7
      *parse_constant* doesn't get called on 'null', 'true', 'false' anymore.

   To use a custom :class:`JSONDecoder` subclass, specify it with the ``cls``
   kwarg; otherwise :class:`JSONDecoder` is used. Additional keyword arguments
   will be passed to the constructor of the class.

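As a sketch of the *object_pairs_hook* argument of :func:`load`, the following keeps the key order of the document and, separately, refuses objects with repeated keys. ``reject_duplicates`` is a hypothetical helper written for this example; the in-memory ``io.StringIO`` object merely stands in for a real file.

```python
import io
import json
from collections import OrderedDict

fp = io.StringIO(u'{"b": 1, "a": 2, "c": {"z": 0, "y": 1}}')

# Every decoded object, including the nested one, is built by
# OrderedDict from the ordered list of (key, value) pairs.
ordered = json.load(fp, object_pairs_hook=OrderedDict)


def reject_duplicates(pairs):
    """Hypothetical hook: build a dict, failing on a repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError("duplicate key: %r" % (key,))
        result[key] = value
    return result
```

A plain :class:`dict` would silently keep only the last value for a repeated key, which is why the pairs form is needed to detect duplicates.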

.. function:: loads(s[, encoding[, cls[, object_hook[, parse_float[, parse_int[, parse_constant[, object_pairs_hook[, **kw]]]]]]]])

   Deserialize *s* (a :class:`str` or :class:`unicode` instance containing a JSON
   document) to a Python object using this :ref:`conversion table
   <json-to-py-table>`.

   If *s* is a :class:`str` instance and is encoded with an ASCII based encoding
   other than UTF-8 (e.g. latin-1), then an appropriate *encoding* name must be
   specified. Encodings that are not ASCII based (such as UCS-2) are not
   allowed and should be decoded to :class:`unicode` first.

   The other arguments have the same meaning as in :func:`load`.

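The number-related hooks can be illustrated with :func:`loads` as follows. This is a sketch under the assumption that exact decimal values are wanted and that the non-standard constants should be refused; ``reject_constant`` is a hypothetical helper.

```python
import json
from decimal import Decimal

# parse_float receives the literal text of each JSON float, so Decimal
# sees '1.10' exactly as written rather than a binary float.
prices = json.loads('{"price": 1.10, "qty": 3}', parse_float=Decimal)

# parse_int works the same way for integer literals.
as_floats = json.loads('[1, 2]', parse_int=float)


def reject_constant(name):
    """Hypothetical hook: refuse NaN, Infinity and -Infinity."""
    raise ValueError("non-standard JSON constant: " + name)


try:
    json.loads('[NaN]', parse_constant=reject_constant)
    constant_rejected = False
except ValueError:
    constant_rejected = True
```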

Encoders and Decoders
---------------------

.. class:: JSONDecoder([encoding[, object_hook[, parse_float[, parse_int[, parse_constant[, strict[, object_pairs_hook]]]]]]])

   Simple JSON decoder.

   Performs the following translations in decoding by default:

   .. _json-to-py-table:

   +---------------+-------------------+
   | JSON          | Python            |
   +===============+===================+
   | object        | dict              |
   +---------------+-------------------+
   | array         | list              |
   +---------------+-------------------+
   | string        | unicode           |
   +---------------+-------------------+
   | number (int)  | int, long         |
   +---------------+-------------------+
   | number (real) | float             |
   +---------------+-------------------+
   | true          | True              |
   +---------------+-------------------+
   | false         | False             |
   +---------------+-------------------+
   | null          | None              |
   +---------------+-------------------+

   It also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their
   corresponding ``float`` values, which is outside the JSON spec.

   *encoding* determines the encoding used to interpret any :class:`str` objects
   decoded by this instance (UTF-8 by default). It has no effect when decoding
   :class:`unicode` objects.

   Note that currently only encodings that are a superset of ASCII work; strings
   of other encodings should be passed in as :class:`unicode`.

   *object_hook*, if specified, will be called with the result of every JSON
   object decoded and its return value will be used in place of the given
   :class:`dict`. This can be used to provide custom deserializations (e.g. to
   support JSON-RPC class hinting).

   *object_pairs_hook*, if specified, will be called with the result of every
   JSON object decoded with an ordered list of pairs. The return value of
   *object_pairs_hook* will be used instead of the :class:`dict`. This
   feature can be used to implement custom decoders that rely on the order
   in which the key and value pairs are decoded (for example,
   :class:`collections.OrderedDict` will remember the order of insertion). If
   *object_hook* is also defined, the *object_pairs_hook* takes priority.

   .. versionchanged:: 2.7
      Added support for *object_pairs_hook*.

   *parse_float*, if specified, will be called with the string of every JSON
   float to be decoded. By default, this is equivalent to ``float(num_str)``.
   This can be used to substitute another datatype or parser for JSON floats
   (e.g. :class:`decimal.Decimal`).

   *parse_int*, if specified, will be called with the string of every JSON int
   to be decoded. By default, this is equivalent to ``int(num_str)``. This can
   be used to substitute another datatype or parser for JSON integers
   (e.g. :class:`float`).

   *parse_constant*, if specified, will be called with one of the following
   strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. This can be used to
   raise an exception if invalid JSON numbers are encountered.

   If *strict* is ``False`` (``True`` is the default), then control characters
   will be allowed inside strings. Control characters in this context are
   those with character codes in the 0-31 range, including ``'\t'`` (tab),
   ``'\n'``, ``'\r'`` and ``'\0'``.

   If the data being deserialized is not a valid JSON document, a
   :exc:`ValueError` will be raised.

   .. method:: decode(s)

      Return the Python representation of *s* (a :class:`str` or
      :class:`unicode` instance containing a JSON document).

   .. method:: raw_decode(s)

      Decode a JSON document from *s* (a :class:`str` or :class:`unicode`
      beginning with a JSON document) and return a 2-tuple of the Python
      representation and the index in *s* where the document ended.

      This can be used to decode a JSON document from a string that may have
      extraneous data at the end.

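Because ``raw_decode`` reports where a document ended, it can be used to walk through a string holding several documents back to back, which :func:`loads` rejects. The generator below is a sketch of that idea; ``iter_documents`` is a hypothetical name, and the whitespace between documents is stripped by hand on the assumption that ``raw_decode`` expects the document to start at the beginning of the string.

```python
import json


def iter_documents(text):
    """Hypothetical helper: yield each JSON document found in *text*."""
    decoder = json.JSONDecoder()
    text = text.lstrip()
    while text:
        # obj is the decoded value, end the index just past the document
        obj, end = decoder.raw_decode(text)
        yield obj
        text = text[end:].lstrip()


stream = '{"a": 1} [1, 2] "x"'
documents = list(iter_documents(stream))
```

Slicing copies the remainder of the string on each step, so this form is meant for short inputs rather than large streams.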

.. class:: JSONEncoder([skipkeys[, ensure_ascii[, check_circular[, allow_nan[, sort_keys[, indent[, separators[, encoding[, default]]]]]]]]])

   Extensible JSON encoder for Python data structures.

   Supports the following objects and types by default:

   .. _py-to-json-table:

   +-------------------+---------------+
   | Python            | JSON          |
   +===================+===============+
   | dict              | object        |
   +-------------------+---------------+
   | list, tuple       | array         |
   +-------------------+---------------+
   | str, unicode      | string        |
   +-------------------+---------------+
   | int, long, float  | number        |
   +-------------------+---------------+
   | True              | true          |
   +-------------------+---------------+
   | False             | false         |
   +-------------------+---------------+
   | None              | null          |
   +-------------------+---------------+

   To extend this to recognize other objects, subclass and override the
   :meth:`default` method with one that returns a serializable object for
   ``o`` if possible; otherwise it should call the superclass implementation
   (to raise :exc:`TypeError`).

   If *skipkeys* is ``False`` (the default), then it is a :exc:`TypeError` to
   attempt encoding of keys that are not str, int, long, float or None. If
   *skipkeys* is ``True``, such items are simply skipped.

   If *ensure_ascii* is ``True`` (the default), all non-ASCII characters in the
   output are escaped with ``\uXXXX`` sequences, and the results are
   :class:`str` instances consisting of ASCII characters only. If
   *ensure_ascii* is ``False``, a result may be a :class:`unicode`
   instance. This usually happens if the input contains unicode strings or the
   *encoding* parameter is used.

   If *check_circular* is ``True`` (the default), then lists, dicts, and custom
   encoded objects will be checked for circular references during encoding to
   prevent an infinite recursion (which would cause an :exc:`OverflowError`).
   Otherwise, no such check takes place.

   If *allow_nan* is ``True`` (the default), then ``NaN``, ``Infinity``, and
   ``-Infinity`` will be encoded as such. This behavior is not JSON
   specification compliant, but is consistent with most JavaScript based
   encoders and decoders. Otherwise, it will be a :exc:`ValueError` to encode
   such floats.

   If *sort_keys* is ``True`` (default ``False``), then the output of dictionaries
   will be sorted by key; this is useful for regression tests to ensure that
   JSON serializations can be compared on a day-to-day basis.

   If *indent* is a non-negative integer (it is ``None`` by default), then JSON
   array elements and object members will be pretty-printed with that indent
   level. An indent level of 0 will only insert newlines. ``None`` is the most
   compact representation.

   .. note::

      Since the default item separator is ``', '``, the output might include
      trailing whitespace when *indent* is specified. You can use
      ``separators=(',', ': ')`` to avoid this.

   If specified, *separators* should be an ``(item_separator, key_separator)``
   tuple. The default is ``(', ', ': ')``. To get the most compact JSON
   representation, you should specify ``(',', ':')`` to eliminate whitespace.

   If specified, *default* is a function that gets called for objects that can't
   otherwise be serialized. It should return a JSON encodable version of the
   object or raise a :exc:`TypeError`.

   If *encoding* is not ``None``, then all input strings will be transformed
   into unicode using that encoding prior to JSON-encoding. The default is
   UTF-8.


.. method:: default(o)

   Implement this method in a subclass such that it returns a serializable
   object for *o*, or calls the base implementation (to raise a
   :exc:`TypeError`).

   For example, to support arbitrary iterators, you could implement default
   like this::

      def default(self, o):
          try:
              iterable = iter(o)
          except TypeError:
              pass
          else:
              return list(iterable)
          # Let the base class default method raise the TypeError
          return JSONEncoder.default(self, o)


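To show where such a method lives, the sketch below places the same logic in
a complete subclass and passes it to :func:`dumps` through the *cls*
argument. ``IterableEncoder`` and ``squares`` are illustrative names only.

```python
import json

class IterableEncoder(json.JSONEncoder):
    """Illustrative subclass that serializes any iterable as a JSON array."""

    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return json.JSONEncoder.default(self, o)

# A generator is not serializable by the stock encoder, so default() is
# called for it and its items are written as an array.
squares = (n * n for n in range(4))
text = json.dumps({"squares": squares}, cls=IterableEncoder)
```
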
.. method:: encode(o)

   Return a JSON string representation of a Python data structure, *o*. For
   example::

      >>> JSONEncoder().encode({"foo": ["bar", "baz"]})
      '{"foo": ["bar", "baz"]}'


.. method:: iterencode(o)

   Encode the given object, *o*, and yield each string representation as
   available. For example::

      for chunk in JSONEncoder().iterencode(bigobject):
          mysocket.write(chunk)


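A self-contained variant of the loop above is sketched here. It collects the
chunks in a list instead of writing them to a socket. Where the chunk
boundaries fall is an implementation detail and should not be relied upon.

```python
import json

encoder = json.JSONEncoder()
data = {"foo": ["bar", "baz"]}

# iterencode() yields the output piece by piece; joined together, the
# pieces form the same text that encode() returns in one call.
chunks = list(encoder.iterencode(data))
joined = "".join(chunks)
```
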
Standard Compliance
-------------------

The JSON format is specified by :rfc:`4627`. This section details this
module's level of compliance with the RFC. For simplicity,
:class:`JSONEncoder` and :class:`JSONDecoder` subclasses, and parameters other
than those explicitly mentioned, are not considered.

This module does not comply with the RFC in a strict fashion, implementing some
extensions that are valid JavaScript but not valid JSON. In particular:

- Top-level non-object, non-array values are accepted and output;
- Infinite and NaN number values are accepted and output;
- Repeated names within an object are accepted, and only the value of the last
  name-value pair is used.

Since the RFC permits RFC-compliant parsers to accept input texts that are not
RFC-compliant, this module's deserializer is technically RFC-compliant under
default settings.

Character Encodings
^^^^^^^^^^^^^^^^^^^

The RFC recommends that JSON be represented using either UTF-8, UTF-16, or
UTF-32, with UTF-8 being the default. Accordingly, this module uses UTF-8 as
the default for its *encoding* parameter.

This module's deserializer only directly works with ASCII-compatible encodings;
UTF-16, UTF-32, and other ASCII-incompatible encodings require the use of
workarounds described in the documentation for the deserializer's *encoding*
parameter.

The RFC also non-normatively describes a limited encoding detection technique
for JSON texts; this module's deserializer does not implement this or any other
kind of encoding detection.

As permitted, though not required, by the RFC, this module's serializer sets
*ensure_ascii=True* by default, thus escaping the output so that the resulting
strings only contain ASCII characters.


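The effect of *ensure_ascii* can be seen in the short sketch below. The
sample string is arbitrary; only the handling of its one non-ASCII character
matters.

```python
import json

text = u"caf\u00e9"

# Default (ensure_ascii=True): the non-ASCII character is written as a
# \uXXXX escape, so the result contains only ASCII characters.
escaped = json.dumps(text)

# ensure_ascii=False leaves the character unescaped in the output.
raw = json.dumps(text, ensure_ascii=False)
```

Both results decode back to the original string.
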
Top-level Non-Object, Non-Array Values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RFC specifies that the top-level value of a JSON text must be either a
JSON object or array (Python :class:`dict` or :class:`list`). This module's
deserializer also accepts input texts consisting solely of a
JSON null, boolean, number, or string value::

   >>> just_a_json_string = '"spam and eggs"'  # Not by itself a valid JSON text
   >>> json.loads(just_a_json_string)
   u'spam and eggs'

This module itself does not include a way to request that such input texts be
regarded as illegal. Likewise, this module's serializer also accepts single
Python :data:`None`, :class:`bool`, numeric, and :class:`str`
values as input and will generate output texts consisting solely of a top-level
JSON null, boolean, number, or string value without raising an exception::

   >>> neither_a_list_nor_a_dict = u"spam and eggs"
   >>> json.dumps(neither_a_list_nor_a_dict)  # The result is not a valid JSON text
   '"spam and eggs"'

This module's serializer does not itself include a way to enforce the
aforementioned constraint.


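Callers that need the RFC's constraint can check the type themselves. The
wrappers below are a sketch of one way to do that. ``loads_container`` and
``dumps_container`` are hypothetical helpers and not part of this module.

```python
import json

def loads_container(text):
    """Hypothetical helper: parse text, rejecting scalar top-level values."""
    value = json.loads(text)
    if not isinstance(value, (dict, list)):
        raise ValueError("top-level JSON value must be an object or array")
    return value

def dumps_container(obj):
    """Hypothetical helper: serialize only dict, list or tuple values."""
    if not isinstance(obj, (dict, list, tuple)):
        raise ValueError("top-level value must be a dict, list or tuple")
    return json.dumps(obj)
```

The checks assume default settings. A custom *object_hook* or *default*
function could change which Python types appear at the top level.
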
Infinite and NaN Number Values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RFC does not permit the representation of infinite or NaN number values.
Despite that, by default, this module accepts and outputs ``Infinity``,
``-Infinity``, and ``NaN`` as if they were valid JSON number literal values::

   >>> # Neither of these calls raises an exception, but the results are not valid JSON
   >>> json.dumps(float('-inf'))
   '-Infinity'
   >>> json.dumps(float('nan'))
   'NaN'
   >>> # Same when deserializing
   >>> json.loads('-Infinity')
   -inf
   >>> json.loads('NaN')
   nan

To alter this behavior, use the *allow_nan* parameter in the serializer and
the *parse_constant* parameter in the deserializer.


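As a sketch of a stricter configuration, the example below makes both
directions raise :exc:`ValueError` for these values. ``reject_constant`` is a
hypothetical hook name; the two flag variables exist only to record the
outcome.

```python
import json

def reject_constant(name):
    """Hypothetical hook; name is 'NaN', 'Infinity' or '-Infinity'."""
    raise ValueError("invalid JSON constant: %s" % name)

# Serializer: allow_nan=False turns non-finite floats into an error.
try:
    json.dumps(float('nan'), allow_nan=False)
except ValueError:
    dump_rejected = True
else:
    dump_rejected = False

# Deserializer: parse_constant is called for each such literal, so a hook
# that raises makes the whole input invalid.
try:
    json.loads('[1.0, NaN]', parse_constant=reject_constant)
except ValueError:
    load_rejected = True
else:
    load_rejected = False
```
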
Repeated Names Within an Object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RFC specifies that the names within a JSON object should be unique, but
does not specify how repeated names in JSON objects should be handled. By
default, this module does not raise an exception; instead, it ignores all but
the last name-value pair for a given name::

   >>> weird_json = '{"x": 1, "x": 2, "x": 3}'
   >>> json.loads(weird_json)
   {u'x': 3}

The *object_pairs_hook* parameter can be used to alter this behavior.
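
One possible use of the hook is to reject repeated names outright, as
sketched below. ``unique_pairs`` is a hypothetical function written for this
example; raising :exc:`ValueError` is a choice made here, not something the
module prescribes.

```python
import json

def unique_pairs(pairs):
    """Hypothetical hook: build a dict, refusing repeated names."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError("repeated name in JSON object: %r" % (name,))
        result[name] = value
    return result

# Objects with unique names decode as usual.
ok = json.loads('{"x": 1, "y": 2}', object_pairs_hook=unique_pairs)

# The hook receives every name-value pair in order, so repeats are visible.
try:
    json.loads('{"x": 1, "x": 2, "x": 3}', object_pairs_hook=unique_pairs)
except ValueError:
    duplicate_rejected = True
else:
    duplicate_rejected = False
```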