.. highlightlang:: c

.. _stringobjects:

String/Bytes Objects
--------------------

These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.

.. note::
   These functions have been renamed to PyBytes_* in Python 3.x. The PyBytes
   names are also available in 2.6.

.. index:: object: string


.. ctype:: PyStringObject

   This subtype of :ctype:`PyObject` represents a Python string object.


.. cvar:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :ctype:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. cfunction:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.

   .. versionchanged:: 2.2
      Allowed subtypes to be accepted.


.. cfunction:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.

   .. versionadded:: 2.2


.. cfunction:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure. The parameter *v* must not be *NULL*; it will not be
   checked.


.. cfunction:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure. If *v* is *NULL*, the contents of the
   string are uninitialized.

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *len*. This might require
      changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :cfunc:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it. The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string. The
   following format characters are allowed:

   .. % This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments discarded.


.. cfunction:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :cfunc:`PyString_FromFormat` except that the variable arguments
   are passed as a single :ctype:`va_list`, so it takes exactly two arguments.


.. cfunction:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.

   .. versionchanged:: 2.5
      This function returned an :ctype:`int` type. This might require changes
      in your code for properly supporting 64-bit systems.


.. cfunction:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :cfunc:`PyString_Size` but without error checking.

   .. versionchanged:: 2.5
      This macro returned an :ctype:`int` type. This might require changes in
      your code for properly supporting 64-bit systems.


.. cfunction:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*. The pointer
   refers to the internal buffer of *string*, not a copy. The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that. If *string* is not a string object at all,
   :cfunc:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. cfunction:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :cfunc:`PyString_AsString` but without error checking. Only
   string objects are supported; no Unicode objects should be passed.


.. cfunction:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input. For Unicode
   objects it returns the default encoded version of the object. If *length* is
   *NULL*, the resulting buffer may not contain NUL characters; if it does, the
   function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy. The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated. If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that. If *obj* is not a string object at all,
   :cfunc:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.

   .. versionchanged:: 2.5
      This function used an :ctype:`int *` type for *length*. This might
      require changes in your code for properly supporting 64-bit systems.


.. cfunction:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference. The reference to
   the old value of *string* will be stolen. If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. cfunction:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*. This version decrements the reference count of *newpart*.


.. cfunction:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable". Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code. It is an error to call this function if the
   refcount on the input string object is not one. Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value. If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, a memory exception is set, and ``-1`` is returned.

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *newsize*. This might
      require changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*. Analogous to ``format %
   args``. The *args* argument must be a tuple.


.. cfunction:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place. The argument must be the address of a
   pointer variable pointing to a Python string object. If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count). (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)


.. cfunction:: PyObject* PyString_InternFromString(const char *v)

   A combination of :cfunc:`PyString_FromString` and
   :cfunc:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.


.. cfunction:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*. *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry. Return
   *NULL* if an exception was raised by the codec.

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object. *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry. Return *NULL*
   if an exception was raised by the codec.


.. cfunction:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Encode the :ctype:`char` buffer of the given size by passing it to the codec
   registered for *encoding* and return a Python object. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the string
   :meth:`encode` method. The codec to be used is looked up using the Python codec
   registry. Return *NULL* if an exception was raised by the codec.

   .. versionchanged:: 2.5
      This function used an :ctype:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. cfunction:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object. *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method. The codec to be
   used is looked up using the Python codec registry. Return *NULL* if an exception
   was raised by the codec.