.. highlightlang:: c

.. _stringobjects:

String/Bytes Objects
--------------------

These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.

.. note::

   These functions have been renamed to PyBytes_* in Python 3.x. Unless
   otherwise noted, the PyBytes functions available in 3.x are aliased to their
   PyString_* equivalents to help porting.

.. index:: object: string


.. ctype:: PyStringObject

   This subtype of :ctype:`PyObject` represents a Python string object.


.. cvar:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :ctype:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. cfunction:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.

   .. versionchanged:: 2.2
      Allowed subtypes to be accepted.


.. cfunction:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.

   .. versionadded:: 2.2


.. cfunction:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure. The parameter *v* must not be *NULL*; it will not be
   checked.


.. cfunction:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure. If *v* is *NULL*, the contents of the
   string are uninitialized.
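
   As a minimal sketch of the *NULL* case, assuming a hypothetical helper named
   ``repeat_byte`` (it is not part of the API) and the Python 2 headers, an
   uninitialized string can be allocated and filled in before any other code
   sees it::

      #include <Python.h>
      #include <string.h>

      /* Hypothetical helper: return a new string made of n copies of c. */
      static PyObject *
      repeat_byte(char c, Py_ssize_t n)
      {
          PyObject *result;

          if (n < 0) {
              PyErr_SetString(PyExc_ValueError, "n must be non-negative");
              return NULL;
          }
          /* v == NULL: allocate n bytes whose contents are uninitialized. */
          result = PyString_FromStringAndSize(NULL, n);
          if (result == NULL)
              return NULL;
          /* Writing is acceptable only because the string was just created. */
          memset(PyString_AS_STRING(result), c, (size_t)n);
          return result;
      }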

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *len*. This might require
      changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :cfunc:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it. The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string. The
   following format characters are allowed:

   .. % This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments discarded.
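
   The following sketch combines a few of the format characters above. The
   helper name ``describe_container`` and its parameters are made up for
   illustration; each argument is passed with exactly the C type listed in the
   table::

      #include <Python.h>

      /* Hypothetical helper: build a short description string. */
      static PyObject *
      describe_container(const char *name, Py_ssize_t count, int code)
      {
          /* %s takes a char*, %zd a Py_ssize_t and %d an int. */
          return PyString_FromFormat("%s has %zd items (code %d)",
                                     name, count, code);
      }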


.. cfunction:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :cfunc:`PyString_FromFormat` except that it takes exactly two
   arguments.
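
   The ``va_list`` form is mainly useful when writing your own variadic
   wrapper. A minimal sketch, assuming a hypothetical wrapper named
   ``make_message``::

      #include <Python.h>
      #include <stdarg.h>

      /* Hypothetical wrapper: forward its variable arguments unchanged. */
      static PyObject *
      make_message(const char *format, ...)
      {
          va_list vargs;
          PyObject *result;

          va_start(vargs, format);
          result = PyString_FromFormatV(format, vargs);
          va_end(vargs);
          return result;   /* NULL on failure */
      }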


.. cfunction:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.

   .. versionchanged:: 2.5
      This function returned an :ctype:`int` type. This might require changes
      in your code for properly supporting 64-bit systems.


.. cfunction:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :cfunc:`PyString_Size` but without error checking.

   .. versionchanged:: 2.5
      This macro returned an :ctype:`int` type. This might require changes in
      your code for properly supporting 64-bit systems.


.. cfunction:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*. The pointer
   refers to the internal buffer of *string*, not a copy. The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that. If *string* is not a string object at all,
   :cfunc:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. cfunction:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :cfunc:`PyString_AsString` but without error checking. Only
   string objects are supported; no Unicode objects should be passed.


.. cfunction:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input. For Unicode
   objects it returns the default encoded version of the object. If *length* is
   *NULL*, the resulting buffer must not contain NUL characters; if it does, the
   function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy. The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that. If *obj* is not a string object at all,
   :cfunc:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.
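
   Passing a non-*NULL* *length* is the way to handle data with embedded NUL
   bytes. A minimal sketch, assuming a hypothetical helper named
   ``count_nul_bytes``, that reads the internal buffer without modifying it::

      #include <Python.h>

      /* Hypothetical helper: count the NUL bytes stored in obj.
         Returns -1 with an exception set if obj is not a string. */
      static Py_ssize_t
      count_nul_bytes(PyObject *obj)
      {
          char *buffer;
          Py_ssize_t length, i, count = 0;

          if (PyString_AsStringAndSize(obj, &buffer, &length) < 0)
              return -1;
          /* buffer points into obj: read it, never write or free it. */
          for (i = 0; i < length; i++) {
              if (buffer[i] == '\0')
                  count++;
          }
          return count;
      }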

   .. versionchanged:: 2.5
      This function used an :ctype:`int *` type for *length*. This might
      require changes in your code for properly supporting 64-bit systems.


.. cfunction:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference. The reference to
   the old value of *string* will be stolen. If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. cfunction:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*. This version decrements the reference count of *newpart*.
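
   A minimal sketch of the reference handling, assuming a hypothetical helper
   named ``join_pair``. Because the call releases *newpart*, the temporary
   string needs no separate :cfunc:`Py_DECREF`::

      #include <Python.h>

      /* Hypothetical helper: return left + right as a new string. */
      static PyObject *
      join_pair(const char *left, const char *right)
      {
          PyObject *result, *tail;

          result = PyString_FromString(left);
          if (result == NULL)
              return NULL;
          tail = PyString_FromString(right);
          if (tail == NULL) {
              Py_DECREF(result);
              return NULL;
          }
          /* Replaces result with the concatenation and releases tail. */
          PyString_ConcatAndDel(&result, tail);
          return result;   /* NULL, with an exception set, on failure */
      }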


.. cfunction:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable". Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code. It is an error to call this function if the
   refcount on the input string object is not one. Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value. If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, a memory exception is set, and ``-1`` is returned.
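
   A common pattern is to allocate the largest size that could be needed, fill
   in the data, and then shrink the string. A minimal sketch, assuming a
   hypothetical helper named ``copy_first_word``::

      #include <Python.h>
      #include <string.h>

      /* Hypothetical helper: copy src up to (not including) the first space. */
      static PyObject *
      copy_first_word(const char *src)
      {
          Py_ssize_t capacity = (Py_ssize_t)strlen(src);
          Py_ssize_t used = 0;
          PyObject *result;
          char *out;

          result = PyString_FromStringAndSize(NULL, capacity);
          if (result == NULL)
              return NULL;
          out = PyString_AS_STRING(result);
          while (src[used] != '\0' && src[used] != ' ') {
              out[used] = src[used];
              used++;
          }
          /* Shrink only when needed; on failure result is already NULL. */
          if (used < capacity && _PyString_Resize(&result, used) < 0)
              return NULL;
          return result;
      }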

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *newsize*. This might
      require changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*. Analogous to ``format %
   args``. The *args* argument must be a tuple.
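
   A minimal sketch, assuming a hypothetical helper named ``format_point``; it
   is the C counterpart of the Python expression ``"(%d, %d)" % (x, y)``::

      #include <Python.h>

      /* Hypothetical helper: format two integers as "(x, y)". */
      static PyObject *
      format_point(long x, long y)
      {
          PyObject *format, *args, *result;

          format = PyString_FromString("(%d, %d)");
          if (format == NULL)
              return NULL;
          args = Py_BuildValue("(ll)", x, y);   /* args must be a tuple */
          if (args == NULL) {
              Py_DECREF(format);
              return NULL;
          }
          result = PyString_Format(format, args);
          Py_DECREF(format);
          Py_DECREF(args);
          return result;
      }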


.. cfunction:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place. The argument must be the address of a
   pointer variable pointing to a Python string object. If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count). (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. cfunction:: PyObject* PyString_InternFromString(const char *v)

   A combination of :cfunc:`PyString_FromString` and
   :cfunc:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.
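
   Because interned strings with equal values are the same object, they can be
   compared by pointer. A minimal sketch, assuming a hypothetical helper named
   ``same_interned_object``::

      #include <Python.h>

      /* Hypothetical helper: return 1 if interning a and b yields the same
         object, 0 if not, and -1 (with an exception set) on failure. */
      static int
      same_interned_object(const char *a, const char *b)
      {
          PyObject *x, *y;
          int same;

          x = PyString_InternFromString(a);
          if (x == NULL)
              return -1;
          y = PyString_InternFromString(b);
          if (y == NULL) {
              Py_DECREF(x);
              return -1;
          }
          same = (x == y);   /* identity test, not a byte comparison */
          Py_DECREF(x);      /* both references are owned by the caller */
          Py_DECREF(y);
          return same;
      }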

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. cfunction:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*. *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry. Return
   *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object. *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry. Return *NULL*
   if an exception was raised by the codec.
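
   A minimal sketch, assuming a hypothetical helper named ``decode_utf8`` and
   that the standard ``utf-8`` codec is available in the codec registry; with
   that codec the result is a Unicode object::

      #include <Python.h>

      /* Hypothetical helper: decode the UTF-8 bytes held in a string object. */
      static PyObject *
      decode_utf8(PyObject *str)
      {
          /* Returns a new reference, or NULL if the codec raised. */
          return PyString_AsDecodedObject(str, "utf-8", "strict");
      }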

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. cfunction:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Encode the :ctype:`char` buffer of the given size by passing it to the codec
   registered for *encoding* and return a Python object. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the string
   :meth:`encode` method. The codec to be used is looked up using the Python codec
   registry. Return *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object. *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method. The codec to be
   used is looked up using the Python codec registry. Return *NULL* if an exception
   was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.