\section{\module{string} ---
         Common string operations}

\declaremodule{standard}{string}
\modulesynopsis{Common string operations.}

The \module{string} module contains a number of useful constants and classes,
as well as some deprecated legacy functions that are also available as methods
on strings. See the module \refmodule{re}\refstmodindex{re} for string
functions based on regular expressions.

\subsection{String constants}

The constants defined in this module are:

\begin{datadesc}{ascii_letters}
The concatenation of the \constant{ascii_lowercase} and
\constant{ascii_uppercase} constants described below. This value is
not locale-dependent.
\end{datadesc}

\begin{datadesc}{ascii_lowercase}
The lowercase letters \code{'abcdefghijklmnopqrstuvwxyz'}. This
value is not locale-dependent and will not change.
\end{datadesc}

\begin{datadesc}{ascii_uppercase}
The uppercase letters \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}. This
value is not locale-dependent and will not change.
\end{datadesc}

\begin{datadesc}{digits}
The string \code{'0123456789'}.
\end{datadesc}

\begin{datadesc}{hexdigits}
The string \code{'0123456789abcdefABCDEF'}.
\end{datadesc}

\begin{datadesc}{letters}
The concatenation of the strings \constant{lowercase} and
\constant{uppercase} described below. The specific value is
locale-dependent, and will be updated when
\function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{lowercase}
A string containing all the characters that are considered lowercase
letters. On most systems this is the string
\code{'abcdefghijklmnopqrstuvwxyz'}. Do not change its definition ---
the effect on the routines \function{upper()} and
\function{swapcase()} is undefined. The specific value is
locale-dependent, and will be updated when
\function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{octdigits}
The string \code{'01234567'}.
\end{datadesc}

\begin{datadesc}{punctuation}
String of \ASCII{} characters which are considered punctuation
characters in the \samp{C} locale.
\end{datadesc}

\begin{datadesc}{printable}
String of characters which are considered printable. This is a
combination of \constant{digits}, \constant{letters},
\constant{punctuation}, and \constant{whitespace}.
\end{datadesc}

\begin{datadesc}{uppercase}
A string containing all the characters that are considered uppercase
letters. On most systems this is the string
\code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}. Do not change its definition ---
the effect on the routines \function{lower()} and
\function{swapcase()} is undefined. The specific value is
locale-dependent, and will be updated when
\function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{whitespace}
A string containing all characters that are considered whitespace.
On most systems this includes the characters space, tab, linefeed,
return, formfeed, and vertical tab. Do not change its definition ---
the effect on the routines \function{strip()} and \function{split()}
is undefined.
\end{datadesc}
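
As a small illustration of the locale-independent constants, the following
sketch checks the documented relationship between \constant{ascii_letters},
\constant{ascii_lowercase} and \constant{ascii_uppercase}, and uses
\constant{hexdigits} for a membership test. The helper \function{is_hex()} is
hypothetical and is not part of the module.

```python
import string

# ascii_letters is documented as the concatenation of the two ASCII constants.
combined = string.ascii_lowercase + string.ascii_uppercase
letters_match = (string.ascii_letters == combined)


def is_hex(text):
    """Hypothetical helper: true if text is non-empty and all hex digits."""
    if not text:
        return False
    for ch in text:
        if ch not in string.hexdigits:
            return False
    return True


print(letters_match)
print(is_hex('DeadBeef'))
print(is_hex('xyz'))
```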

\subsection{Template strings}

Templates provide simpler string substitutions as described in \pep{292}.
Instead of the normal \samp{\%}-based substitutions, Templates support
\samp{\$}-based substitutions, using the following rules:

\begin{itemize}
\item \samp{\$\$} is an escape; it is replaced with a single \samp{\$}.

\item \samp{\$identifier} names a substitution placeholder matching a mapping
key of ``identifier''. By default, ``identifier'' must spell a Python
identifier. The first non-identifier character after the \samp{\$}
character terminates this placeholder specification.

\item \samp{\$\{identifier\}} is equivalent to \samp{\$identifier}. It is
required when valid identifier characters follow the placeholder but are
not part of the placeholder, such as \samp{\$\{noun\}ification}.
\end{itemize}

Any other appearance of \samp{\$} in the string will result in a
\exception{ValueError} being raised.

\versionadded{2.4}
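
The three rules above can be sketched with the \class{Template} class
described in this section. The template strings and the value \code{'class'}
are made up for illustration only.

```python
from string import Template

# '$$' is an escape that yields a single '$'.
escaped = Template('Cost: $$5').substitute()

# Braces delimit the placeholder when identifier characters follow it.
braced = Template('${noun}ification').substitute(noun='class')

# '$noun' ends at the first non-identifier character (here, the space).
plain = Template('$noun ification').substitute(noun='class')

print(escaped)  # Cost: $5
print(braced)   # classification
print(plain)    # class ification
```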

The \module{string} module provides a \class{Template} class that implements
these rules. The methods of \class{Template} are:

\begin{classdesc}{Template}{template}
The constructor takes a single argument which is the template string.
\end{classdesc}

\begin{methoddesc}[Template]{substitute}{mapping\optional{, **kws}}
Performs the template substitution, returning a new string. \var{mapping} is
any dictionary-like object with keys that match the placeholders in the
template. Alternatively, you can provide keyword arguments, where the
keywords are the placeholders. When both \var{mapping} and \var{kws} are
given and there are duplicates, the placeholders from \var{kws} take
precedence.
\end{methoddesc}
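
A minimal sketch of the precedence rule: when the same placeholder appears in
both \var{mapping} and the keyword arguments, the keyword value is used. The
dictionary name \code{defaults} and its contents are illustrative.

```python
from string import Template

t = Template('$who likes $what')
defaults = {'who': 'tim', 'what': 'spam'}

# Only the mapping is given.
from_mapping = t.substitute(defaults)

# 'what' is given twice; the keyword argument wins.
overridden = t.substitute(defaults, what='kung pao')

print(from_mapping)  # tim likes spam
print(overridden)    # tim likes kung pao
```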

\begin{methoddesc}[Template]{safe_substitute}{mapping\optional{, **kws}}
Like \method{substitute()}, except that if placeholders are missing from
\var{mapping} and \var{kws}, instead of raising a \exception{KeyError}
exception, the original placeholder will appear in the resulting string
intact. Also, unlike with \method{substitute()}, any other appearance of
\samp{\$} is simply returned as \samp{\$} instead of raising
\exception{ValueError}.

While other exceptions may still occur, this method is called ``safe'' because
it always tries to return a usable string instead of raising an
exception. In another sense, \method{safe_substitute()} may be anything other
than safe, since it will silently ignore malformed templates containing
dangling delimiters, unmatched braces, or placeholders that are not valid
Python identifiers.
\end{methoddesc}
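
The difference between the two methods can be sketched with a template that
contains both a dangling delimiter (\samp{\$100}) and a placeholder that has no
value. The template text is made up for illustration.

```python
from string import Template

t = Template('Give $who $100 to $whom')
d = {'who': 'tim'}

# substitute() rejects the dangling '$' in '$100'.
try:
    t.substitute(d)
    strict_failed = False
except ValueError:
    strict_failed = True

# safe_substitute() leaves both the dangling '$' and the missing
# placeholder in the result unchanged.
lenient = t.safe_substitute(d)

print(strict_failed)
print(lenient)  # Give tim $100 to $whom
```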

\class{Template} instances also provide one public data attribute:

\begin{memberdesc}[string]{template}
This is the object passed to the constructor's \var{template} argument. In
general, you shouldn't change it, but read-only access is not enforced.
\end{memberdesc}

Here is an example of how to use a Template:

\begin{verbatim}
>>> from string import Template
>>> s = Template('$who likes $what')
>>> s.substitute(who='tim', what='kung pao')
'tim likes kung pao'
>>> d = dict(who='tim')
>>> Template('Give $who $100').substitute(d)
Traceback (most recent call last):
[...]
ValueError: Invalid placeholder in string: line 1, col 10
>>> Template('$who likes $what').substitute(d)
Traceback (most recent call last):
[...]
KeyError: 'what'
>>> Template('$who likes $what').safe_substitute(d)
'tim likes $what'
\end{verbatim}

Advanced usage: you can derive subclasses of \class{Template} to customize the
placeholder syntax, delimiter character, or the entire regular expression used
to parse template strings. To do this, you can override these class
attributes:

\begin{itemize}
\item \var{delimiter} -- This is the literal string describing a
placeholder-introducing delimiter. The default value is \samp{\$}. Note
that this should \emph{not} be a regular expression, as the implementation
will call \method{re.escape()} on this string as needed.
\item \var{idpattern} -- This is the regular expression describing the pattern
for non-braced placeholders (the braces will be added automatically as
appropriate). The default value is the regular expression
\samp{[_a-z][_a-z0-9]*}.
\end{itemize}
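
A minimal sketch of overriding these two class attributes. The subclass names
\class{PercentTemplate} and \class{DottedTemplate}, and the dotted-name
pattern, are hypothetical and chosen only for illustration.

```python
from string import Template


class PercentTemplate(Template):
    # Hypothetical subclass: '%' introduces placeholders instead of '$'.
    delimiter = '%'


class DottedTemplate(Template):
    # Hypothetical subclass: allow dotted names such as 'user.name'.
    idpattern = r'[_a-z][_a-z0-9]*(?:\.[_a-z][_a-z0-9]*)*'


# With '%' as the delimiter, '$' is an ordinary character.
percent = PercentTemplate('%who owes $5').substitute(who='tim')

# The dotted name is looked up as a single mapping key.
dotted = DottedTemplate('Hello, $user.name!').substitute({'user.name': 'tim'})

print(percent)  # tim owes $5
print(dotted)   # Hello, tim!
```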

Alternatively, you can provide the entire regular expression pattern by
overriding the class attribute \var{pattern}. If you do this, the value must
be a regular expression object with four named capturing groups. The
capturing groups correspond to the rules given above, along with the invalid
placeholder rule:

\begin{itemize}
\item \var{escaped} -- This group matches the escape sequence,
e.g. \samp{\$\$}, in the default pattern.
\item \var{named} -- This group matches the unbraced placeholder name; it
should not include the delimiter in the capturing group.
\item \var{braced} -- This group matches the brace-enclosed placeholder name;
it should not include either the delimiter or braces in the capturing
group.
\item \var{invalid} -- This group matches any other delimiter pattern (usually
a single delimiter), and it should appear last in the regular
expression.
\end{itemize}

\subsection{String functions}

The following functions are available to operate on string and Unicode
objects. They are not available as string methods.

\begin{funcdesc}{capwords}{s}
Split the argument into words using \function{split()}, capitalize
each word using \function{capitalize()}, and join the capitalized
words using \function{join()}. Note that this replaces runs of
whitespace characters by a single space, and removes leading and
trailing whitespace.
\end{funcdesc}
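
A short sketch of the whitespace handling described for
\function{capwords()}; the input strings are made up.

```python
import string

# Runs of whitespace collapse to one space; the ends are trimmed.
title = string.capwords('  the   quick brown\tfox ')

# Each word is passed through capitalize(), which lowercases the rest.
mixed = string.capwords('hELLO wORLD')

print(title)  # The Quick Brown Fox
print(mixed)  # Hello World
```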

\begin{funcdesc}{maketrans}{from, to}
Return a translation table suitable for passing to
\function{translate()}, that will map
each character in \var{from} into the character at the same position
in \var{to}; \var{from} and \var{to} must have the same length.

\warning{Don't use strings derived from \constant{lowercase}
and \constant{uppercase} as arguments; in some locales, these don't have
the same length. For case conversions, always use
\function{lower()} and \function{upper()}.}
\end{funcdesc}
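
A hedged sketch of building and applying a translation table. It assumes
that, on interpreters where \function{string.maketrans()} is not available,
\method{str.maketrans()} can stand in for it; that fallback is an assumption
of this example and is not described in this section.

```python
import string

# Assumption: fall back to str.maketrans where string.maketrans is absent.
maketrans = getattr(string, 'maketrans', None) or str.maketrans

# Map 'a' -> 'x', 'b' -> 'y', 'c' -> 'z'; other characters are unchanged.
table = maketrans('abc', 'xyz')
translated = 'aabbcc cab'.translate(table)

# Arguments of different lengths are rejected.
try:
    maketrans('abc', 'xy')
    length_checked = False
except ValueError:
    length_checked = True

print(translated)  # xxyyzz zxy
print(length_checked)
```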

\subsection{Deprecated string functions}

The following functions are also defined as methods of string and
Unicode objects; see ``String Methods'' (section
\ref{string-methods}) for more information on those. You should consider
these functions as deprecated, although they will not be removed until Python
3.0. The functions defined in this module are:

\begin{funcdesc}{atof}{s}
\deprecated{2.0}{Use the \function{float()} built-in function.}
Convert a string to a floating point number. The string must have
the standard syntax for a floating point literal in Python,
optionally preceded by a sign (\samp{+} or \samp{-}). Note that
this behaves identically to the built-in function
\function{float()}\bifuncindex{float} when passed a string.

\note{When passing in a string, values for NaN\index{NaN}
and Infinity\index{Infinity} may be returned, depending on the
underlying C library. The specific set of strings accepted which
cause these values to be returned depends entirely on the C library
and is known to vary.}
\end{funcdesc}

\begin{funcdesc}{atoi}{s\optional{, base}}
\deprecated{2.0}{Use the \function{int()} built-in function.}
Convert string \var{s} to an integer in the given \var{base}. The
string must consist of one or more digits, optionally preceded by a
sign (\samp{+} or \samp{-}). The \var{base} defaults to 10. If it
is 0, a default base is chosen depending on the leading characters
of the string (after stripping the sign): \samp{0x} or \samp{0X}
means 16, \samp{0} means 8, anything else means 10. If \var{base}
is 16, a leading \samp{0x} or \samp{0X} is always accepted, though
not required. This behaves identically to the built-in function
\function{int()} when passed a string. (Also note: for a more
flexible interpretation of numeric literals, use the built-in
function \function{eval()}\bifuncindex{eval}.)
\end{funcdesc}
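
Since \function{atoi()} is documented to behave identically to
\function{int()} when passed a string, the base handling can be sketched with
the recommended built-in. The input strings are illustrative; the octal
\samp{0} prefix is left out of this sketch on purpose.

```python
# Base defaults to 10; a leading sign is accepted.
decimal = int('-42')

# With base 16 the '0x' / '0X' prefix is accepted but not required.
hex_plain = int('ff', 16)
hex_prefixed = int('0XFF', 16)

# Base 0 picks the base from the prefix of the string.
auto_base = int('0x1f', 0)

print(decimal)       # -42
print(hex_plain)     # 255
print(hex_prefixed)  # 255
print(auto_base)     # 31
```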

\begin{funcdesc}{atol}{s\optional{, base}}
\deprecated{2.0}{Use the \function{long()} built-in function.}
Convert string \var{s} to a long integer in the given \var{base}.
The string must consist of one or more digits, optionally preceded
by a sign (\samp{+} or \samp{-}). The \var{base} argument has the
same meaning as for \function{atoi()}. A trailing \samp{l} or
\samp{L} is not allowed, except if the base is 0. Note that when
invoked without \var{base} or with \var{base} set to 10, this
behaves identically to the built-in function
\function{long()}\bifuncindex{long} when passed a string.
\end{funcdesc}

\begin{funcdesc}{capitalize}{word}
Return a copy of \var{word} with only its first character capitalized.
\end{funcdesc}

\begin{funcdesc}{expandtabs}{s\optional{, tabsize}}
Expand tabs in a string, replacing each of them with one or more spaces,
depending on the current column and the given tab size. The column
number is reset to zero after each newline occurring in the string.
This doesn't understand other non-printing characters or escape
sequences. The tab size defaults to 8.
\end{funcdesc}
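
A brief sketch of the column-based expansion, written with the equivalent
\method{expandtabs()} string method (assumed here to behave like the function
described above). The sample strings are made up.

```python
# Default tab size 8: 'a' is in column 0, so the tab adds 7 spaces.
default_tabs = 'a\tb'.expandtabs()

# Tab size 4: 'ab' fills columns 0-1, so the tab adds 2 spaces.
narrow_tabs = 'ab\tc'.expandtabs(4)

# The column count restarts after a newline.
after_newline = 'x\n\ty'.expandtabs(4)

print(repr(default_tabs))
print(repr(narrow_tabs))
print(repr(after_newline))
```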

\begin{funcdesc}{find}{s, sub\optional{, start\optional{, end}}}
Return the lowest index in \var{s} where the substring \var{sub} is
found such that \var{sub} is wholly contained in
\code{\var{s}[\var{start}:\var{end}]}. Return \code{-1} on failure.
Defaults for \var{start} and \var{end} and interpretation of
negative values are the same as for slices.
\end{funcdesc}

\begin{funcdesc}{rfind}{s, sub\optional{, start\optional{, end}}}
Like \function{find()} but find the highest index.
\end{funcdesc}

\begin{funcdesc}{index}{s, sub\optional{, start\optional{, end}}}
Like \function{find()} but raise \exception{ValueError} when the
substring is not found.
\end{funcdesc}

\begin{funcdesc}{rindex}{s, sub\optional{, start\optional{, end}}}
Like \function{rfind()} but raise \exception{ValueError} when the
substring is not found.
\end{funcdesc}
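
The difference between the \function{find()} family and the
\function{index()} family can be sketched with the equivalent string methods
(assumed to match the functions above). The sample string is made up.

```python
s = 'spam and spam'

first = s.find('spam')       # lowest index
last = s.rfind('spam')       # highest index
missing = s.find('eggs')     # -1 signals failure
bounded = s.find('spam', 1)  # search within s[1:]

# index() raises ValueError instead of returning -1.
try:
    s.index('eggs')
    raised = False
except ValueError:
    raised = True

print(first)    # 0
print(last)     # 9
print(missing)  # -1
print(bounded)  # 9
print(raised)
```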

\begin{funcdesc}{count}{s, sub\optional{, start\optional{, end}}}
Return the number of (non-overlapping) occurrences of substring
\var{sub} in string \code{\var{s}[\var{start}:\var{end}]}.
Defaults for \var{start} and \var{end} and interpretation of
negative values are the same as for slices.
\end{funcdesc}
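
A small sketch of what ``non-overlapping'' means here, using the equivalent
\method{count()} string method (assumed to match the function above).

```python
# 'aa' fits at positions 0 and 2; the overlapping match at 1 is not counted.
non_overlapping = 'aaaa'.count('aa')

# Counting within the slice 'banana'[2:], which is 'nana'.
sliced = 'banana'.count('an', 2)

print(non_overlapping)  # 2
print(sliced)           # 1
```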

\begin{funcdesc}{lower}{s}
Return a copy of \var{s}, but with upper case letters converted to
lower case.
\end{funcdesc}

\begin{funcdesc}{split}{s\optional{, sep\optional{, maxsplit}}}
Return a list of the words of the string \var{s}. If the optional
second argument \var{sep} is absent or \code{None}, the words are
separated by arbitrary strings of whitespace characters (space, tab,
newline, return, formfeed). If the second argument \var{sep} is
present and not \code{None}, it specifies a string to be used as the
word separator. The returned list will then have one more item
than the number of non-overlapping occurrences of the separator in
the string. The optional third argument \var{maxsplit} defaults to
0. If it is nonzero, at most \var{maxsplit} splits occur,
and the remainder of the string is returned as the final element of
the list (thus, the list will have at most \code{\var{maxsplit}+1}
elements).

The behavior of split on an empty string depends on the value of \var{sep}.
If \var{sep} is not specified, or specified as \code{None}, the result will
be an empty list. If \var{sep} is specified as any string, the result will
be a list containing one element which is an empty string.
\end{funcdesc}
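
The separator and empty-string cases can be sketched with the equivalent
\method{split()} string method (assumed to match the function above for the
arguments shown). The sample strings are made up.

```python
# No separator: split on runs of whitespace, ignoring the ends.
words = '  one  two\tthree '.split()

# Explicit separator: adjacent separators produce empty strings.
fields = 'a,b,,c'.split(',')

# At most one split; the remainder becomes the final element.
limited = 'a b c d'.split(None, 1)

# Empty input: the result depends on whether sep is given.
empty_default = ''.split()
empty_with_sep = ''.split(',')

print(words)           # ['one', 'two', 'three']
print(fields)          # ['a', 'b', '', 'c']
print(limited)         # ['a', 'b c d']
print(empty_default)   # []
print(empty_with_sep)  # ['']
```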

\begin{funcdesc}{rsplit}{s\optional{, sep\optional{, maxsplit}}}
Return a list of the words of the string \var{s}, scanning \var{s}
from the end. To all intents and purposes, the resulting list of
words is the same as returned by \function{split()}, except when the
optional third argument \var{maxsplit} is explicitly specified and
nonzero. When \var{maxsplit} is nonzero, at most \var{maxsplit}
splits -- the \emph{rightmost} ones -- occur, and the remainder
of the string is returned as the first element of the list (thus, the
list will have at most \code{\var{maxsplit}+1} elements).
\versionadded{2.4}
\end{funcdesc}
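
A sketch contrasting the two directions, using the equivalent string methods
(assumed to match the functions above). The sample string is made up.

```python
s = 'a b c d'

# One split from the left: the remainder is the last element.
left = s.split(None, 1)

# One split from the right: the remainder is the first element.
right = s.rsplit(None, 1)

# Without a split limit both give the same list.
same = (s.split() == s.rsplit())

print(left)   # ['a', 'b c d']
print(right)  # ['a b c', 'd']
print(same)
```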

\begin{funcdesc}{splitfields}{s\optional{, sep\optional{, maxsplit}}}
This function behaves identically to \function{split()}. (In the
past, \function{split()} was only used with one argument, while
\function{splitfields()} was only used with two arguments.)
\end{funcdesc}

\begin{funcdesc}{join}{words\optional{, sep}}
Concatenate a list or tuple of words with intervening occurrences of
\var{sep}. The default value for \var{sep} is a single space
character. It is always true that
\samp{string.join(string.split(\var{s}, \var{sep}), \var{sep})}
equals \var{s}.
\end{funcdesc}
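
The round-trip property can be sketched with the equivalent string methods
(assumed to match the functions above), using an explicit separator. The
sample values are made up.

```python
s = 'a,b,,c'
sep = ','

# Splitting on sep and joining with the same sep restores the string.
round_trip = sep.join(s.split(sep))

# Joining a list of words with a single space.
spaced = ' '.join(['one', 'two'])

print(round_trip == s)
print(spaced)  # one two
```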

\begin{funcdesc}{joinfields}{words\optional{, sep}}
This function behaves identically to \function{join()}. (In the past,
\function{join()} was only used with one argument, while
\function{joinfields()} was only used with two arguments.)
Note that there is no \method{joinfields()} method on string
objects; use the \method{join()} method instead.
\end{funcdesc}

\begin{funcdesc}{lstrip}{s\optional{, chars}}
Return a copy of the string with leading characters removed. If
\var{chars} is omitted or \code{None}, whitespace characters are
removed. If given and not \code{None}, \var{chars} must be a string;
the characters in the string will be stripped from the beginning of
\var{s}.
\versionchanged[The \var{chars} parameter was added. The \var{chars}
parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}

\begin{funcdesc}{rstrip}{s\optional{, chars}}
Return a copy of the string with trailing characters removed. If
\var{chars} is omitted or \code{None}, whitespace characters are
removed. If given and not \code{None}, \var{chars} must be a string;
the characters in the string will be stripped from the end of
\var{s}.
\versionchanged[The \var{chars} parameter was added. The \var{chars}
parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}

\begin{funcdesc}{strip}{s\optional{, chars}}
Return a copy of the string with leading and trailing characters
removed. If \var{chars} is omitted or \code{None}, whitespace
characters are removed. If given and not \code{None}, \var{chars}
must be a string; the characters in the string will be stripped from
both ends of \var{s}.
\versionchanged[The \var{chars} parameter was added. The \var{chars}
parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}
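
A sketch of the \var{chars} argument, using the equivalent string methods
(assumed to match the functions above). Note that \var{chars} is treated as a
set of characters to remove, not as a prefix or suffix. The sample strings are
made up.

```python
# Without chars, whitespace is removed from both ends.
trimmed = '  hi \n'.strip()

s = 'xyxhixy'

# Any leading or trailing run of 'x' and 'y' characters is removed.
left = s.lstrip('xy')
right = s.rstrip('xy')
both = s.strip('xy')

print(repr(trimmed))  # 'hi'
print(left)           # hixy
print(right)          # xyxhi
print(both)           # hi
```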

\begin{funcdesc}{swapcase}{s}
Return a copy of \var{s}, but with lower case letters
converted to upper case and vice versa.
\end{funcdesc}

\begin{funcdesc}{translate}{s, table\optional{, deletechars}}
Delete all characters from \var{s} that are in \var{deletechars} (if
present), and then translate the characters using \var{table}, which
must be a 256-character string giving the translation for each
character value, indexed by its ordinal.
\end{funcdesc}

\begin{funcdesc}{upper}{s}
Return a copy of \var{s}, but with lower case letters converted to
upper case.
\end{funcdesc}

\begin{funcdesc}{ljust}{s, width}
\funcline{rjust}{s, width}
\funcline{center}{s, width}
These functions respectively left-justify, right-justify and center
a string in a field of given width. They return a string that is at
least \var{width} characters wide, created by padding the string
\var{s} with spaces until the given width on the right, left or both
sides. The string is never truncated.
\end{funcdesc}
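
The padding behaviour can be sketched with the equivalent string methods
(assumed to match the functions above). The sample strings are made up.

```python
left = 'ab'.ljust(5)     # padded on the right
right = 'ab'.rjust(5)    # padded on the left
middle = 'ab'.center(6)  # padded on both sides

# A string longer than the width is returned unchanged, never truncated.
untouched = 'abcdef'.ljust(3)

print(repr(left))    # 'ab   '
print(repr(right))   # '   ab'
print(repr(middle))  # '  ab  '
print(untouched)     # abcdef
```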

\begin{funcdesc}{zfill}{s, width}
Pad a numeric string on the left with zero digits until the given
width is reached. Strings starting with a sign are handled
correctly.
\end{funcdesc}
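
A sketch of the sign handling, using the equivalent \method{zfill()} string
method (assumed to match the function above).

```python
plain = '42'.zfill(5)

# The zeros are inserted after the sign, not before it.
signed = '-42'.zfill(6)

# A string already at least as wide as the width is left alone.
wide = '12345'.zfill(3)

print(plain)   # 00042
print(signed)  # -00042
print(wide)    # 12345
```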

\begin{funcdesc}{replace}{str, old, new\optional{, maxreplace}}
Return a copy of string \var{str} with all occurrences of substring
\var{old} replaced by \var{new}. If the optional argument
\var{maxreplace} is given, the first \var{maxreplace} occurrences are
replaced.
\end{funcdesc}
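
A sketch of the \var{maxreplace} argument, using the equivalent
\method{replace()} string method (assumed to match the function above). The
sample string is made up.

```python
s = 'a-b-c-d'

# Every occurrence is replaced by default.
all_replaced = s.replace('-', '+')

# Only the first two occurrences are replaced.
first_two = s.replace('-', '+', 2)

print(all_replaced)  # a+b+c+d
print(first_two)     # a+b+c-d
```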