Georg Brandl | 54a3faa | 2008-01-20 09:30:57 +0000 | [diff] [blame] | 1 | .. highlightlang:: c |

.. _stringobjects:

String Objects
--------------

These functions raise :exc:`TypeError` when expecting a string parameter and are
called with a non-string parameter.

.. index:: object: string


.. ctype:: PyStringObject

   This subtype of :ctype:`PyObject` represents a Python string object.


.. cvar:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :ctype:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. cfunction:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.


.. cfunction:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.


.. cfunction:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure.  The parameter *v* must not be *NULL*; it will not be
   checked.


.. cfunction:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure.  If *v* is *NULL*, the contents of the
   string are uninitialized.


.. cfunction:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :cfunc:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it.  The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string.  The
   following format characters are allowed:

   .. % XXX: This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % XXX: The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments are discarded.
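
The ``0x`` guarantee listed for ``%p`` can be pictured in plain C. The sketch
below is only an illustration of that kind of normalization, not CPython's
code: ``format_pointer`` is a hypothetical helper, and the caller is assumed to
supply a buffer with room for the platform's ``%p`` output plus two extra
bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any Python API): write ptr into buf
   so that the text always starts with the literal "0x". */
static void format_pointer(char *buf, size_t bufsize, const void *ptr)
{
    assert(buf != NULL && bufsize >= 8);

    /* Reserve two bytes in case a "0x" prefix has to be inserted. */
    snprintf(buf, bufsize - 2, "%p", (void *)ptr);

    if (buf[0] == '0' && buf[1] == 'X') {
        /* Some platforms print an upper-case prefix; normalize it. */
        buf[1] = 'x';
    }
    else if (!(buf[0] == '0' && buf[1] == 'x')) {
        /* No prefix at all: shift the text right and prepend "0x". */
        memmove(buf + 2, buf, strlen(buf) + 1);
        buf[0] = '0';
        buf[1] = 'x';
    }
}
```

Called as ``format_pointer(buf, sizeof buf, &obj)`` with a 64-byte ``buf``, the
result begins with ``0x`` whatever the C library's own ``%p`` convention is.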


.. cfunction:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :cfunc:`PyString_FromFormat` except that it takes exactly two
   arguments.
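
The relationship between a variadic function and its ``va_list`` counterpart,
and the "calculate the size, then format" approach, can be sketched in plain C
without the Python API. This is an analogy under stated assumptions, not how
CPython implements these functions: ``format_alloc`` and ``format_alloc_v`` are
hypothetical names, and the sketch leans on the C99 ``vsnprintf`` to measure
the output.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical va_list variant: takes exactly two arguments. */
static char *format_alloc_v(const char *format, va_list vargs)
{
    va_list copy;
    int needed;
    char *buf;

    assert(format != NULL);

    /* Pass 1: measure.  With a size of 0, vsnprintf writes nothing and
       returns the length the output would have (without the NUL). */
    va_copy(copy, vargs);
    needed = vsnprintf(NULL, 0, format, copy);
    va_end(copy);
    if (needed < 0)
        return NULL;

    buf = malloc((size_t)needed + 1);
    if (buf == NULL)
        return NULL;

    /* Pass 2: format into the exactly sized buffer. */
    vsnprintf(buf, (size_t)needed + 1, format, vargs);
    return buf;
}

/* Hypothetical variadic front end: packs its arguments into a va_list
   and forwards them.  The caller frees the result with free(). */
static char *format_alloc(const char *format, ...)
{
    va_list vargs;
    char *result;

    va_start(vargs, format);
    result = format_alloc_v(format, vargs);
    va_end(vargs);
    return result;
}
```

The ``va_list`` form is the one to call from inside another variadic function,
which cannot forward its ``...`` arguments directly.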


.. cfunction:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.


.. cfunction:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :cfunc:`PyString_Size` but without error checking.


.. cfunction:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*.  The pointer
   refers to the internal buffer of *string*, not a copy.  The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``.  It must not be deallocated.  If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that.  If *string* is not a string object at all,
   :cfunc:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. cfunction:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :cfunc:`PyString_AsString` but without error checking.  Only
   string objects are supported; no Unicode objects should be passed.


.. cfunction:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input.  For Unicode
   objects it returns the default encoded version of the object.  If *length* is
   *NULL*, the resulting buffer may not contain NUL characters; if it does, the
   function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy.  The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``.  It must not be deallocated.  If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that.  If *obj* is not a string object at all,
   :cfunc:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.
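
The restriction on embedded NUL characters when *length* is *NULL* exists
because a caller holding only a ``char *`` has to rely on the terminator to
find the end of the data. A plain-C sketch of that kind of check follows;
``has_embedded_nul`` is a hypothetical helper, not a Python API function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if the first len bytes of buf contain
   a NUL byte, 0 otherwise.  len excludes the terminating NUL. */
static int has_embedded_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}
```

For the five-byte buffer ``"ab\0cd"`` the helper reports an embedded NUL, and
``strlen`` on the same buffer stops after two bytes, silently dropping the
rest. Passing a non-*NULL* *length* avoids the problem because the caller then
knows the true size.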


.. cfunction:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference.  The reference to
   the old value of *string* will be stolen.  If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. cfunction:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*.  This version decrements the reference count of *newpart*.


.. cfunction:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable".  Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code.  It is an error to call this function if the
   refcount on the input string object is not one.  Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value.  If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, :exc:`MemoryError` is set, and ``-1`` is returned.
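
The calling convention described above (pass the address of the pointer, expect
it to change, and find it set to *NULL* on failure) mirrors a common pattern
around ``realloc``. The following plain-C analogy is a sketch only:
``resize_buffer`` is a hypothetical helper operating on a ``malloc``\ -ed
``char`` buffer, not on Python objects.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogy: resize *buf to hold newsize bytes plus a NUL.
   Returns 0 on success.  On failure the old buffer is freed, *buf is
   set to NULL and -1 is returned. */
static int resize_buffer(char **buf, size_t newsize)
{
    char *grown;

    assert(buf != NULL && *buf != NULL);

    grown = realloc(*buf, newsize + 1);
    if (grown == NULL) {
        free(*buf);      /* realloc left the old block untouched */
        *buf = NULL;     /* the caller must not use the old pointer */
        return -1;
    }
    grown[newsize] = '\0';
    *buf = grown;        /* may differ from the address passed in */
    return 0;
}
```

Because the address can move, any other pointer into the old buffer becomes
invalid after a successful call, which is the same reason the string must not
be known elsewhere in the code.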


.. cfunction:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*.  Analogous to
   ``format % args``.  The *args* argument must be a tuple.


.. cfunction:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place.  The argument must be the address of a
   pointer variable pointing to a Python string object.  If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count).  (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)


.. cfunction:: PyObject* PyString_InternFromString(const char *v)

   A combination of :cfunc:`PyString_FromString` and
   :cfunc:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.
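
Interning means keeping one canonical copy of each distinct string value, so
that equal strings can be recognized by comparing pointers. The idea can be
sketched in plain C with a small table. Everything here is an assumption made
for illustration (the name ``intern``, the fixed capacity, the linear search)
and says nothing about how CPython stores interned strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INTERN_CAPACITY 64   /* arbitrary limit chosen for the sketch */

static char *intern_table[INTERN_CAPACITY];
static size_t intern_count = 0;

/* Hypothetical helper: return the canonical copy of s, adding one to
   the table if no equal string is present yet.  Returns NULL when the
   table is full or memory runs out. */
static const char *intern(const char *s)
{
    size_t i, n;
    char *copy;

    assert(s != NULL);

    /* An equal string already interned?  Hand back that copy. */
    for (i = 0; i < intern_count; i++) {
        if (strcmp(intern_table[i], s) == 0)
            return intern_table[i];
    }
    if (intern_count == INTERN_CAPACITY)
        return NULL;

    /* Otherwise store a private copy and make it the canonical one. */
    n = strlen(s) + 1;
    copy = malloc(n);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, n);
    intern_table[intern_count++] = copy;
    return copy;
}
```

Two separate buffers holding ``"spam"`` yield the same pointer from ``intern``,
while a different value such as ``"eggs"`` yields a different one.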


.. cfunction:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*.  *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry.  Return
   *NULL* if an exception was raised by the codec.


.. cfunction:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object.  *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry.  Return *NULL*
   if an exception was raised by the codec.


.. cfunction:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object.  *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method.  The codec to be
   used is looked up using the Python codec registry.  Return *NULL* if an exception
   was raised by the codec.