\section{\module{string} ---
         Common string operations}

\declaremodule{standard}{string}
\modulesynopsis{Common string operations.}

The \module{string} package contains a number of useful constants and classes,
as well as some deprecated legacy functions that are also available as methods
on strings.  See the module \refmodule{re}\refstmodindex{re} for string
functions based on regular expressions.

In general, all of these objects are exposed directly in the \module{string}
package so users need only import the \module{string} package to begin using
these constants, classes, and functions.

\begin{notice}
Starting with Python 2.4, the traditional \module{string} module was turned
into a package; however, backward compatibility with existing code has been
retained.  Code using the \module{string} module that worked prior to Python
2.4 should continue to work unchanged.
\end{notice}

\subsection{String constants}

The constants defined in this module are:

\begin{datadesc}{ascii_letters}
  The concatenation of the \constant{ascii_lowercase} and
  \constant{ascii_uppercase} constants described below.  This value is
  not locale-dependent.
\end{datadesc}

\begin{datadesc}{ascii_lowercase}
  The lowercase letters \code{'abcdefghijklmnopqrstuvwxyz'}.  This
  value is not locale-dependent and will not change.
\end{datadesc}

\begin{datadesc}{ascii_uppercase}
  The uppercase letters \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}.  This
  value is not locale-dependent and will not change.
\end{datadesc}

\begin{datadesc}{digits}
  The string \code{'0123456789'}.
\end{datadesc}

\begin{datadesc}{hexdigits}
  The string \code{'0123456789abcdefABCDEF'}.
\end{datadesc}

\begin{datadesc}{letters}
  The concatenation of the strings \constant{lowercase} and
  \constant{uppercase} described below.  The specific value is
  locale-dependent, and will be updated when
  \function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{lowercase}
  A string containing all the characters that are considered lowercase
  letters.  On most systems this is the string
  \code{'abcdefghijklmnopqrstuvwxyz'}.  Do not change its definition;
  the effect on the routines \function{upper()} and
  \function{swapcase()} is undefined.  The specific value is
  locale-dependent, and will be updated when
  \function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{octdigits}
  The string \code{'01234567'}.
\end{datadesc}

\begin{datadesc}{punctuation}
  String of \ASCII{} characters which are considered punctuation
  characters in the \samp{C} locale.
\end{datadesc}

\begin{datadesc}{printable}
  String of characters which are considered printable.  This is a
  combination of \constant{digits}, \constant{letters},
  \constant{punctuation}, and \constant{whitespace}.
\end{datadesc}

\begin{datadesc}{uppercase}
  A string containing all the characters that are considered uppercase
  letters.  On most systems this is the string
  \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}.  Do not change its definition;
  the effect on the routines \function{lower()} and
  \function{swapcase()} is undefined.  The specific value is
  locale-dependent, and will be updated when
  \function{locale.setlocale()} is called.
\end{datadesc}

\begin{datadesc}{whitespace}
  A string containing all characters that are considered whitespace.
  On most systems this includes the characters space, tab, linefeed,
  return, formfeed, and vertical tab.  Do not change its definition;
  the effect on the routines \function{strip()} and \function{split()}
  is undefined.
\end{datadesc}
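As a rough illustration of how the locale-independent constants relate to
one another, the sketch below checks two of the identities stated above and
shows a typical membership test.  It deliberately avoids \constant{letters},
\constant{lowercase}, and \constant{uppercase}, whose values depend on the
locale; the helper name \code{is_hex_string} is made up for this example.

```python
import string

# ascii_letters is documented as ascii_lowercase followed by ascii_uppercase.
letters_ok = (string.ascii_letters ==
              string.ascii_lowercase + string.ascii_uppercase)

# The digit constants nest: the octal digits are a prefix of the decimal
# digits, which are in turn a prefix of the hexadecimal digits.
digits_nest = (string.digits.startswith(string.octdigits)
               and string.hexdigits.startswith(string.digits))


def is_hex_string(s):
    """Hypothetical helper: true if s is non-empty and all hex digits."""
    return len(s) > 0 and all(c in string.hexdigits for c in s)
```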

\subsection{Template strings}

Templates are Unicode strings that can be used to provide string substitutions
as described in \pep{292}.  There is a \class{Template} class that is a
subclass of \class{unicode}, overriding the default \method{__mod__()} method.
Instead of the normal \samp{\%}-based substitutions, Template strings support
\samp{\$}-based substitutions, using the following rules:

\begin{itemize}
\item \samp{\$\$} is an escape; it is replaced with a single \samp{\$}.

\item \samp{\$identifier} names a substitution placeholder matching a mapping
      key of ``identifier''.  By default, ``identifier'' must spell a Python
      identifier.  The first non-identifier character after the \samp{\$}
      character terminates this placeholder specification.

\item \samp{\$\{identifier\}} is equivalent to \samp{\$identifier}.  It is
      required when valid identifier characters follow the placeholder but are
      not part of the placeholder, e.g. \samp{\$\{noun\}ification}.
\end{itemize}

Any other appearance of \samp{\$} in the string will result in a
\exception{ValueError} being raised.

Template strings are used just like normal strings, in that the modulus
operator is used to interpolate a dictionary of values into a Template string,
e.g.:

\begin{verbatim}
>>> from string import Template
>>> s = Template('$who likes $what')
>>> print s % dict(who='tim', what='kung pao')
tim likes kung pao
>>> Template('Give $who $100') % dict(who='tim')
Traceback (most recent call last):
[...]
ValueError: Invalid placeholder at index 10
\end{verbatim}

There is also a \class{SafeTemplate} class, derived from \class{Template},
which acts the same as \class{Template}, except that if placeholders are
missing in the interpolation dictionary, no \exception{KeyError} will be
raised.  Instead the original placeholder (with or without the braces, as
appropriate) will be used:

\begin{verbatim}
>>> from string import SafeTemplate
>>> s = SafeTemplate('$who likes $what for ${meal}')
>>> print s % dict(who='tim')
tim likes $what for ${meal}
\end{verbatim}

The values in the mapping will automatically be converted to Unicode strings,
using the built-in \function{unicode()} function, which will be called without
optional arguments \var{encoding} or \var{errors}.

Advanced usage: you can derive subclasses of \class{Template} or
\class{SafeTemplate} to use application-specific placeholder rules.  To do
this, you override the class attribute \member{pattern}; the value must be a
compiled regular expression object with four named capturing groups.  The
capturing groups correspond to the rules given above, along with the invalid
placeholder rule:

\begin{itemize}
\item \var{escaped} -- This group matches the escape sequence, i.e. \samp{\$\$}
      in the default pattern.
\item \var{named} -- This group matches the unbraced placeholder name; it
      should not include the \samp{\$} in the capturing group.
\item \var{braced} -- This group matches the brace-delimited placeholder name;
      it should not include either the \samp{\$} or braces in the capturing
      group.
\item \var{bogus} -- This group matches any other \samp{\$}.  It usually just
      matches a single \samp{\$} and should appear last.
\end{itemize}
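To make the four-group layout more concrete, here is a minimal sketch written
against the \module{re} module alone.  It is not the implementation used by
\class{Template}; the names \code{PATTERN} and \code{substitute} are
hypothetical, and the pattern is only one plausible way to express the rules
listed above.

```python
import re

# Hypothetical pattern with the four named groups described above.
# The leading "\$" is shared, so no group captures the dollar sign itself.
PATTERN = re.compile(r"""
    \$(?:
        (?P<escaped>\$)                         |  # second dollar of the escape
        (?P<named>[_a-zA-Z][_a-zA-Z0-9]*)       |  # unbraced placeholder name
        \{(?P<braced>[_a-zA-Z][_a-zA-Z0-9]*)\}  |  # braced placeholder name
        (?P<bogus>)                                # any other dollar; listed last
    )
""", re.VERBOSE)


def substitute(template, mapping):
    """Interpolate mapping into template following the $-rules above."""
    def replace(match):
        if match.group("escaped") is not None:
            return "$"
        name = match.group("named") or match.group("braced")
        if name is not None:
            return str(mapping[name])
        # Only the bogus group matched: report where the stray "$" sits.
        raise ValueError("Invalid placeholder at index %d" % match.start())
    return PATTERN.sub(replace, template)
```

Because \var{bogus} is tried last, it only matches when none of the other
three alternatives apply, which is why its position in the pattern matters.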

\subsection{String functions}

The following functions are available to operate on string and Unicode
objects.  They are not available as string methods.

\begin{funcdesc}{capwords}{s}
  Split the argument into words using \function{split()}, capitalize
  each word using \function{capitalize()}, and join the capitalized
  words using \function{join()}.  Note that this replaces runs of
  whitespace characters by a single space, and removes leading and
  trailing whitespace.
\end{funcdesc}
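The split, capitalize, and join steps can be sketched directly with string
methods.  The function name \code{capwords_sketch} is made up for this
example; it is meant to mirror the description above, not to replace
\function{capwords()}.

```python
def capwords_sketch(s):
    """Split on whitespace, capitalize each word, join with single spaces."""
    return " ".join(word.capitalize() for word in s.split())


# Runs of whitespace collapse and the ends are trimmed, as noted above.
result = capwords_sketch("  hello   wide\tworld ")
```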

\begin{funcdesc}{maketrans}{from, to}
  Return a translation table suitable for passing to
  \function{translate()} or \function{regex.compile()}, that will map
  each character in \var{from} into the character at the same position
  in \var{to}; \var{from} and \var{to} must have the same length.

  \warning{Don't use strings derived from \constant{lowercase}
  and \constant{uppercase} as arguments; in some locales, these don't have
  the same length.  For case conversions, always use
  \function{lower()} and \function{upper()}.}
\end{funcdesc}
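One way to picture the table is as a 256-character string that starts out as
the identity mapping and has selected positions overwritten.  The sketch
below builds such a table by hand; \code{maketrans_sketch} is a hypothetical
name, and the code assumes every character has an ordinal below 256.

```python
def maketrans_sketch(frm, to):
    """Build a 256-character table mapping frm[i] to to[i]."""
    if len(frm) != len(to):
        raise ValueError("both arguments must have the same length")
    table = [chr(i) for i in range(256)]   # identity mapping to start with
    for src, dst in zip(frm, to):
        table[ord(src)] = dst              # overwrite the mapped positions
    return "".join(table)


table = maketrans_sketch("abc", "xyz")
```

Looking up \code{table[ord(c)]} then gives the translation of a character
\code{c}; unmapped characters translate to themselves.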

\subsection{Deprecated string functions}

The following functions are also defined as methods of string and
Unicode objects; see ``String Methods'' (section
\ref{string-methods}) for more information on those.  You should consider
these functions deprecated, although they will not be removed until Python
3.0.  The functions defined in this module are:

\begin{funcdesc}{atof}{s}
  \deprecated{2.0}{Use the \function{float()} built-in function.}
  Convert a string to a floating point number.  The string must have
  the standard syntax for a floating point literal in Python,
  optionally preceded by a sign (\samp{+} or \samp{-}).  Note that
  this behaves identically to the built-in function
  \function{float()}\bifuncindex{float} when passed a string.

  \note{When passing in a string, values for NaN\index{NaN}
  and Infinity\index{Infinity} may be returned, depending on the
  underlying C library.  The specific set of strings accepted which
  cause these values to be returned depends entirely on the C library
  and is known to vary.}
\end{funcdesc}

\begin{funcdesc}{atoi}{s\optional{, base}}
  \deprecated{2.0}{Use the \function{int()} built-in function.}
  Convert string \var{s} to an integer in the given \var{base}.  The
  string must consist of one or more digits, optionally preceded by a
  sign (\samp{+} or \samp{-}).  The \var{base} defaults to 10.  If it
  is 0, a default base is chosen depending on the leading characters
  of the string (after stripping the sign): \samp{0x} or \samp{0X}
  means 16, \samp{0} means 8, anything else means 10.  If \var{base}
  is 16, a leading \samp{0x} or \samp{0X} is always accepted, though
  not required.  This behaves identically to the built-in function
  \function{int()} when passed a string.  (Also note: for a more
  flexible interpretation of numeric literals, use the built-in
  function \function{eval()}\bifuncindex{eval}.)
\end{funcdesc}
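Since \function{atoi()} is deprecated in favour of \function{int()}, the base
rules are easiest to try with the built-in.  The values below are a small
sketch of the cases described above; the leading-zero octal form is left out
on purpose because its handling is not the same in all Python versions.

```python
# Default base 10, with an optional sign.
decimal = int("-42")

# Base 16 accepts the digits with or without a 0x / 0X prefix.
hex_plain = int("ff", 16)
hex_prefixed = int("0XFF", 16)

# Base 0 picks the base from the leading characters of the string.
auto_hex = int("0x1f", 0)
auto_decimal = int("123", 0)
```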

\begin{funcdesc}{atol}{s\optional{, base}}
  \deprecated{2.0}{Use the \function{long()} built-in function.}
  Convert string \var{s} to a long integer in the given \var{base}.
  The string must consist of one or more digits, optionally preceded
  by a sign (\samp{+} or \samp{-}).  The \var{base} argument has the
  same meaning as for \function{atoi()}.  A trailing \samp{l} or
  \samp{L} is not allowed, except if the base is 0.  Note that when
  invoked without \var{base} or with \var{base} set to 10, this
  behaves identically to the built-in function
  \function{long()}\bifuncindex{long} when passed a string.
\end{funcdesc}

\begin{funcdesc}{capitalize}{word}
  Return a copy of \var{word} with only its first character capitalized.
\end{funcdesc}

\begin{funcdesc}{expandtabs}{s\optional{, tabsize}}
  Expand tabs in a string, i.e.\ replace them by one or more spaces,
  depending on the current column and the given tab size.  The column
  number is reset to zero after each newline occurring in the string.
  This doesn't understand other non-printing characters or escape
  sequences.  The tab size defaults to 8.
\end{funcdesc}
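The column bookkeeping described for \function{expandtabs()} can be sketched
in a few lines.  The function \code{expandtabs\_sketch} is a hypothetical
re-implementation for illustration only; it assumes a positive tab size and
treats only \code{'\e n'} as a line break.

```python
def expandtabs_sketch(s, tabsize=8):
    """Replace each tab by spaces up to the next multiple of tabsize."""
    out = []
    column = 0
    for ch in s:
        if ch == "\t":
            pad = tabsize - column % tabsize   # distance to the next tab stop
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            # A newline starts a fresh line, so the column count restarts.
            column = 0 if ch == "\n" else column + 1
    return "".join(out)


expanded = expandtabs_sketch("a\tbc\td", 4)
```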

\begin{funcdesc}{find}{s, sub\optional{, start\optional{, end}}}
  Return the lowest index in \var{s} where the substring \var{sub} is
  found such that \var{sub} is wholly contained in
  \code{\var{s}[\var{start}:\var{end}]}.  Return \code{-1} on failure.
  Defaults for \var{start} and \var{end} and interpretation of
  negative values are the same as for slices.
\end{funcdesc}

\begin{funcdesc}{rfind}{s, sub\optional{, start\optional{, end}}}
  Like \function{find()} but find the highest index.
\end{funcdesc}

\begin{funcdesc}{index}{s, sub\optional{, start\optional{, end}}}
  Like \function{find()} but raise \exception{ValueError} when the
  substring is not found.
\end{funcdesc}

\begin{funcdesc}{rindex}{s, sub\optional{, start\optional{, end}}}
  Like \function{rfind()} but raise \exception{ValueError} when the
  substring is not found.
\end{funcdesc}

\begin{funcdesc}{count}{s, sub\optional{, start\optional{, end}}}
  Return the number of (non-overlapping) occurrences of substring
  \var{sub} in string \code{\var{s}[\var{start}:\var{end}]}.
  Defaults for \var{start} and \var{end} and interpretation of
  negative values are the same as for slices.
\end{funcdesc}
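The difference between the searching functions shows up mainly in how failure
is reported.  The sketch below uses the equivalent string methods, since the
module-level functions are deprecated; the sample strings are arbitrary.

```python
s = "hello world"

first_o = s.find("o")         # lowest index of "o"
last_o = s.rfind("o")         # highest index of "o"
missing = s.find("z")         # find() signals failure with -1
from_five = s.find("o", 5)    # search restricted to s[5:]
from_end = s.find("o", -4)    # negative start counts from the end, as in slices

# index() reports the same failure by raising ValueError instead.
try:
    s.index("z")
    index_raised = False
except ValueError:
    index_raised = True

# Occurrences are counted without overlap: "aaaa" holds two "aa", not three.
pairs = "aaaa".count("aa")
```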

\begin{funcdesc}{lower}{s}
  Return a copy of \var{s}, but with upper case letters converted to
  lower case.
\end{funcdesc}

\begin{funcdesc}{split}{s\optional{, sep\optional{, maxsplit}}}
  Return a list of the words of the string \var{s}.  If the optional
  second argument \var{sep} is absent or \code{None}, the words are
  separated by arbitrary strings of whitespace characters (space, tab,
  newline, return, formfeed).  If the second argument \var{sep} is
  present and not \code{None}, it specifies a string to be used as the
  word separator.  The returned list will then have one more item
  than the number of non-overlapping occurrences of the separator in
  the string.  The optional third argument \var{maxsplit} defaults to
  0.  If it is nonzero, at most \var{maxsplit} splits occur,
  and the remainder of the string is returned as the final element of
  the list (thus, the list will have at most \code{\var{maxsplit}+1}
  elements).

  The behavior of split on an empty string depends on the value of \var{sep}.
  If \var{sep} is not specified, or specified as \code{None}, the result will
  be an empty list.  If \var{sep} is specified as any string, the result will
  be a list containing one element which is an empty string.
\end{funcdesc}
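The cases above are easy to confuse, so a short sketch may help.  It uses the
\method{split()} string method, which takes the same \var{sep} and
\var{maxsplit} arguments after the string itself; the inputs are arbitrary.

```python
# No separator: runs of whitespace act as a single delimiter.
words = "a  b\tc\n".split()

# Explicit separator: every occurrence counts, so empty fields are kept
# and the list has one more item than there are separators.
fields = "a,b,,c".split(",")

# At most maxsplit splits; the rest of the string is the final element.
limited = "a,b,c".split(",", 1)

# The empty string behaves differently depending on sep.
empty_default = "".split()
empty_with_sep = "".split(",")
```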

\begin{funcdesc}{rsplit}{s\optional{, sep\optional{, maxsplit}}}
  Return a list of the words of the string \var{s}, scanning \var{s}
  from the end.  To all intents and purposes, the resulting list of
  words is the same as returned by \function{split()}, except when the
  optional third argument \var{maxsplit} is explicitly specified and
  nonzero.  When \var{maxsplit} is nonzero, at most \var{maxsplit}
  splits -- the \emph{rightmost} ones -- occur, and the remainder
  of the string is returned as the first element of the list (thus, the
  list will have at most \code{\var{maxsplit}+1} elements).
  \versionadded{2.4}
\end{funcdesc}
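A typical use of the rightmost split is peeling the last component off a
delimited string.  The sketch below contrasts the two directions using the
string methods; the path-like value is only an example.

```python
path = "usr/local/lib"

# split() consumes separators from the left ...
head_first = path.split("/", 1)

# ... while rsplit() consumes them from the right, so the unsplit
# remainder ends up as the first element.
tail_first = path.rsplit("/", 1)
```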

\begin{funcdesc}{splitfields}{s\optional{, sep\optional{, maxsplit}}}
  This function behaves identically to \function{split()}.  (In the
  past, \function{split()} was only used with one argument, while
  \function{splitfields()} was only used with two arguments.)
\end{funcdesc}

\begin{funcdesc}{join}{words\optional{, sep}}
  Concatenate a list or tuple of words with intervening occurrences of
  \var{sep}.  The default value for \var{sep} is a single space
  character.  It is always true that
  \samp{string.join(string.split(\var{s}, \var{sep}), \var{sep})}
  equals \var{s}.
\end{funcdesc}
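The round-trip property can be checked with the string methods, where the
separator is the object the \method{join()} method is called on.  Note that
the sketch passes \var{sep} explicitly to both calls, as the identity above
does; splitting on default whitespace collapses runs and so does not round-trip.

```python
s = "a,b,,c"
sep = ","

# Splitting and re-joining with the same explicit separator restores s.
round_trip = sep.join(s.split(sep))

# With default whitespace splitting, runs of spaces are not preserved.
spaced = "a  b"
collapsed = " ".join(spaced.split())
```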

\begin{funcdesc}{joinfields}{words\optional{, sep}}
  This function behaves identically to \function{join()}.  (In the past,
  \function{join()} was only used with one argument, while
  \function{joinfields()} was only used with two arguments.)
  Note that there is no \method{joinfields()} method on string
  objects; use the \method{join()} method instead.
\end{funcdesc}

\begin{funcdesc}{lstrip}{s\optional{, chars}}
  Return a copy of the string with leading characters removed.  If
  \var{chars} is omitted or \code{None}, whitespace characters are
  removed.  If given and not \code{None}, \var{chars} must be a string;
  the characters in the string will be stripped from the beginning of
  \var{s}.
  \versionchanged[The \var{chars} parameter was added.  The \var{chars}
  parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}

\begin{funcdesc}{rstrip}{s\optional{, chars}}
  Return a copy of the string with trailing characters removed.  If
  \var{chars} is omitted or \code{None}, whitespace characters are
  removed.  If given and not \code{None}, \var{chars} must be a string;
  the characters in the string will be stripped from the end of
  \var{s}.
  \versionchanged[The \var{chars} parameter was added.  The \var{chars}
  parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}

\begin{funcdesc}{strip}{s\optional{, chars}}
  Return a copy of the string with leading and trailing characters
  removed.  If \var{chars} is omitted or \code{None}, whitespace
  characters are removed.  If given and not \code{None}, \var{chars}
  must be a string; the characters in the string will be stripped from
  both ends of \var{s}.
  \versionchanged[The \var{chars} parameter was added.  The \var{chars}
  parameter cannot be passed in earlier 2.2 versions]{2.2.3}
\end{funcdesc}
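A point that is easy to miss is that \var{chars} is treated as a set of
characters to remove, not as a prefix or suffix to match.  The sketch below
shows this with the string methods; the sample values are arbitrary.

```python
# Without chars, whitespace is removed from the chosen end(s).
trimmed = "  padded\t\n".strip()

# With chars, only the named characters are removed, from one end or both.
left_only = "xxhixx".lstrip("x")
right_only = "xxhixx".rstrip("x")

# chars is a set: any mix of its characters is stripped, in any order,
# until a character outside the set is reached.
as_set = "www.example.com".strip("w.moc")
```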

\begin{funcdesc}{swapcase}{s}
  Return a copy of \var{s}, but with lower case letters
  converted to upper case and vice versa.
\end{funcdesc}

\begin{funcdesc}{translate}{s, table\optional{, deletechars}}
  Delete all characters from \var{s} that are in \var{deletechars} (if
  present), and then translate the characters using \var{table}, which
  must be a 256-character string giving the translation for each
  character value, indexed by its ordinal.
\end{funcdesc}
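The two-stage behaviour (delete first, then look up each remaining character
in the table) can be sketched as follows.  \code{translate\_sketch} is a
hypothetical stand-in written for illustration; it assumes all characters of
\var{s} have ordinals below 256.

```python
def translate_sketch(s, table, deletechars=""):
    """Drop characters in deletechars, then map the rest through table."""
    if len(table) != 256:
        raise ValueError("translation table must be 256 characters long")
    kept = [c for c in s if c not in deletechars]
    return "".join(table[ord(c)] for c in kept)


# Example table: identity, except that lowercase vowels map to "*".
vowel_table = "".join("*" if chr(i) in "aeiou" else chr(i)
                      for i in range(256))

masked = translate_sketch("hello, world", vowel_table, ",")
```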

\begin{funcdesc}{upper}{s}
  Return a copy of \var{s}, but with lower case letters converted to
  upper case.
\end{funcdesc}

\begin{funcdesc}{ljust}{s, width}
\funcline{rjust}{s, width}
\funcline{center}{s, width}
  These functions respectively left-justify, right-justify and center
  a string in a field of given width.  They return a string that is at
  least \var{width} characters wide, created by padding the string
  \var{s} with spaces until the given width on the right, left or both
  sides.  The string is never truncated.
\end{funcdesc}
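A brief sketch of the padding behaviour, using the corresponding string
methods.  The widths are chosen so that centering pads evenly on both sides.

```python
left = "ab".ljust(5)       # padding goes on the right
right = "ab".rjust(5)      # padding goes on the left
middle = "ab".center(6)    # padding is shared between both sides

# A string longer than the width is returned unchanged, never truncated.
untouched = "abcdef".rjust(3)
```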

\begin{funcdesc}{zfill}{s, width}
  Pad a numeric string on the left with zero digits until the given
  width is reached.  Strings starting with a sign are handled
  correctly.
\end{funcdesc}
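``Handled correctly'' here means that the zeros are inserted after a leading
sign rather than before it.  A short sketch with the \method{zfill()} string
method:

```python
plain = "42".zfill(5)

# The sign stays in front and counts towards the total width.
negative = "-42".zfill(5)
positive = "+7".zfill(4)

# Strings already at least width characters long are left alone.
long_enough = "12345".zfill(3)
```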

\begin{funcdesc}{replace}{str, old, new\optional{, maxreplace}}
  Return a copy of string \var{str} with all occurrences of substring
  \var{old} replaced by \var{new}.  If the optional argument
  \var{maxreplace} is given, the first \var{maxreplace} occurrences are
  replaced.
\end{funcdesc}
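The effect of \var{maxreplace} is easiest to see side by side with the
unlimited form.  The sketch uses the \method{replace()} string method with an
arbitrary sample string.

```python
text = "spam spam spam"

# Without a limit, every occurrence is replaced.
all_replaced = text.replace("spam", "eggs")

# With a limit, only the first maxreplace occurrences are replaced.
first_two = text.replace("spam", "eggs", 2)
```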