.. highlightlang:: c

.. _stringobjects:

String/Bytes Objects
--------------------

These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.

.. note::

   These functions have been renamed to PyBytes_* in Python 3.x. Unless
   otherwise noted, the PyBytes functions available in 3.x are aliased to their
   PyString_* equivalents to help porting.

.. index:: object: string


.. c:type:: PyStringObject

   This subtype of :c:type:`PyObject` represents a Python string object.


.. c:var:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :c:type:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. c:function:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.

   .. versionchanged:: 2.2
      Allowed subtypes to be accepted.


.. c:function:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.

   .. versionadded:: 2.2


.. c:function:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure. The parameter *v* must not be *NULL*; it will not be
   checked.


.. c:function:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure. If *v* is *NULL*, the contents of the
   string are uninitialized.

   .. versionchanged:: 2.5
      This function previously used an :c:type:`int` type for *len*; it now
      uses :c:type:`Py_ssize_t`. This might require changes in your code for
      properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :c:func:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it. The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string. The
   following format characters are allowed:

   .. % This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
   .. % Similar comments apply to the %ll width modifier and
   .. % PY_FORMAT_LONG_LONG.
   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lld`      | long long     | Exactly equivalent to          |
   |                   |               | ``printf("%lld")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%llu`      | unsigned      | Exactly equivalent to          |
   |                   | long long     | ``printf("%llu")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments discarded.

   .. note::

      The ``"%lld"`` and ``"%llu"`` format specifiers are only available
      when :const:`HAVE_LONG_LONG` is defined.

   .. versionchanged:: 2.7
      Support for ``"%lld"`` and ``"%llu"`` added.


.. c:function:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :c:func:`PyString_FromFormat` except that it takes exactly two
   arguments: the format string and a ``va_list`` holding the values to format.


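The relationship between a variadic formatting function and its ``va_list``
counterpart, and the measure-then-fill approach described for
:c:func:`PyString_FromFormat`, can be illustrated without the Python API. The
following is a minimal sketch in plain C, not CPython's implementation: the
names ``format_alloc`` and ``format_alloc_v`` are hypothetical, the result is a
``malloc``'ed C string rather than a Python string object, and the standard
``vsnprintf`` stands in for the restricted set of format characters that the
Python functions accept.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* va_list variant (hypothetical name): measures the output first, then
   formats into a buffer of exactly that size.  Returns a malloc'ed string
   that the caller must free, or NULL on failure. */
char *
format_alloc_v(const char *format, va_list vargs)
{
    va_list measure;
    char *result;
    int needed;

    /* vsnprintf consumes its va_list, so the measuring pass uses a copy. */
    va_copy(measure, vargs);
    needed = vsnprintf(NULL, 0, format, measure);
    va_end(measure);
    if (needed < 0)
        return NULL;

    result = malloc((size_t)needed + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: fill the buffer that was sized by the first pass. */
    vsnprintf(result, (size_t)needed + 1, format, vargs);
    return result;
}

/* Variadic variant (hypothetical name): only packages its arguments into a
   va_list and delegates to the function above. */
char *
format_alloc(const char *format, ...)
{
    va_list vargs;
    char *result;

    va_start(vargs, format);
    result = format_alloc_v(format, vargs);
    va_end(vargs);
    return result;
}
```

Keeping the real work in the ``va_list`` variant lets other variadic wrappers
reuse it, which is the usual reason an API offers both forms.
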
.. c:function:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.

   .. versionchanged:: 2.5
      This function previously returned an :c:type:`int` type; it now returns
      :c:type:`Py_ssize_t`. This might require changes in your code for
      properly supporting 64-bit systems.


.. c:function:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :c:func:`PyString_Size` but without error checking.

   .. versionchanged:: 2.5
      This macro previously returned an :c:type:`int` type; it now returns
      :c:type:`Py_ssize_t`. This might require changes in your code for
      properly supporting 64-bit systems.


.. c:function:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*. The pointer
   refers to the internal buffer of *string*, not a copy. The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that. If *string* is not a string object at all,
   :c:func:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. c:function:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :c:func:`PyString_AsString` but without error checking. Only
   string objects are supported; no Unicode objects should be passed.


.. c:function:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input. For Unicode
   objects it returns the default encoded version of the object. If *length* is
   *NULL*, the resulting buffer must not contain embedded NUL characters; if it
   does, the function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy. The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that. If *obj* is not a string object at all,
   :c:func:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.

   .. versionchanged:: 2.5
      This function previously used an :c:type:`int *` type for *length*; it
      now uses a pointer to :c:type:`Py_ssize_t`. This might require changes in
      your code for properly supporting 64-bit systems.


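The rule for a *NULL* *length* follows from how C strings work: without an
explicit length, the caller can only find the end of the data by scanning for
the first NUL. The sketch below shows that check in plain C. It is an
illustration under stated assumptions, not the CPython source:
``view_as_c_string`` is a hypothetical helper that operates on a raw buffer, and
it reports failure only through its return value instead of raising
:exc:`TypeError`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: hands out a pointer to `data`, which holds `size`
   bytes followed by a terminating NUL.  When `length` is NULL the caller
   cannot learn the real size, so data containing an embedded NUL is
   rejected with -1.  Returns 0 on success. */
int
view_as_c_string(const char *data, size_t size,
                 const char **buffer, size_t *length)
{
    if (length == NULL) {
        /* Any NUL inside the first `size` bytes would truncate the view. */
        if (memchr(data, '\0', size) != NULL)
            return -1;
    }
    else {
        *length = size;
    }
    *buffer = data;   /* a view into the caller's data, not a copy */
    return 0;
}
```

Passing a non-*NULL* *length* is therefore the safer choice whenever the data
may be binary.
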
.. c:function:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference. The reference to
   the old value of *string* will be stolen. If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. c:function:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*. This version decrements the reference count of *newpart*.


.. c:function:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable". Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code. It is an error to call this function if the
   refcount on the input string object is not one. Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value. If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, a memory exception is set, and ``-1`` is returned.

   .. versionchanged:: 2.5
      This function previously used an :c:type:`int` type for *newsize*; it now
      uses :c:type:`Py_ssize_t`. This might require changes in your code for
      properly supporting 64-bit systems.

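The ownership contract of :c:func:`_PyString_Resize` (the pointer may move on
success, and the object is released and the pointer cleared on failure)
resembles a thin wrapper around the C library's ``realloc``. The following plain
C sketch mirrors that contract for a raw heap buffer; ``resize_buffer`` is a
hypothetical name, and nothing here reflects how CPython implements the
function.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes the heap buffer at *buf to newsize bytes.
   On success *buf holds the (possibly moved) buffer and 0 is returned.
   On failure the old buffer is freed, *buf is set to NULL and -1 is
   returned, so the caller is never left holding a stale pointer. */
int
resize_buffer(char **buf, size_t newsize)
{
    /* realloc with a size of 0 is implementation-defined, so this sketch
       always asks for at least one byte. */
    char *moved = realloc(*buf, newsize ? newsize : 1);

    if (moved == NULL) {
        free(*buf);
        *buf = NULL;
        return -1;
    }
    *buf = moved;
    return 0;
}
```

As with the Python function, the caller must pass the address of the only
pointer to the buffer, because any other copy of the old address may be invalid
after the call.
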
.. c:function:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*. Analogous to ``format %
   args``. The *args* argument must be a tuple or dict.


.. c:function:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place. The argument must be the address of a
   pointer variable pointing to a Python string object. If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count). (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_InternFromString(const char *v)

   A combination of :c:func:`PyString_FromString` and
   :c:func:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*. *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry. Return
   *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function previously used an :c:type:`int` type for *size*; it now
      uses :c:type:`Py_ssize_t`. This might require changes in your code for
      properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object. *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry. Return *NULL*
   if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Encode the :c:type:`char` buffer of the given size by passing it to the codec
   registered for *encoding* and return a Python object. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the string
   :meth:`encode` method. The codec to be used is looked up using the Python codec
   registry. Return *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function previously used an :c:type:`int` type for *size*; it now
      uses :c:type:`Py_ssize_t`. This might require changes in your code for
      properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object. *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method. The codec to be
   used is looked up using the Python codec registry. Return *NULL* if an exception
   was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.