|  | \section{\module{string} --- | 
|  | Common string operations} | 
|  |  | 
|  | \declaremodule{standard}{string} | 
|  | \modulesynopsis{Common string operations.} | 
|  |  | 
|  |  | 
|  | This module defines some constants useful for checking character | 
|  | classes and some useful string functions.  See the module | 
|  | \refmodule{re}\refstmodindex{re} for string functions based on regular | 
|  | expressions. | 
|  |  | 
|  | The constants defined in this module are: | 
|  |  | 
|  | \begin{datadesc}{ascii_letters} | 
|  | The concatenation of the \constant{ascii_lowercase} and | 
|  | \constant{ascii_uppercase} constants described below.  This value is | 
|  | not locale-dependent. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{ascii_lowercase} | 
|  | The lowercase letters \code{'abcdefghijklmnopqrstuvwxyz'}.  This | 
|  | value is not locale-dependent and will not change. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{ascii_uppercase} | 
|  | The uppercase letters \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}.  This | 
|  | value is not locale-dependent and will not change. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{digits} | 
|  | The string \code{'0123456789'}. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{hexdigits} | 
|  | The string \code{'0123456789abcdefABCDEF'}. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{letters} | 
|  | The concatenation of the strings \constant{lowercase} and | 
|  | \constant{uppercase} described below.  The specific value is | 
|  | locale-dependent, and will be updated when | 
|  | \function{locale.setlocale()} is called. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{lowercase} | 
|  | A string containing all the characters that are considered lowercase | 
|  | letters.  On most systems this is the string | 
|  | \code{'abcdefghijklmnopqrstuvwxyz'}.  Do not change its definition --- | 
|  | the effect on the routines \function{upper()} and | 
|  | \function{swapcase()} is undefined.  The specific value is | 
|  | locale-dependent, and will be updated when | 
|  | \function{locale.setlocale()} is called. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{octdigits} | 
|  | The string \code{'01234567'}. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{punctuation} | 
|  | String of \ASCII{} characters which are considered punctuation | 
|  | characters in the \samp{C} locale. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{printable} | 
|  | String of characters which are considered printable.  This is a | 
|  | combination of \constant{digits}, \constant{letters}, | 
|  | \constant{punctuation}, and \constant{whitespace}. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{uppercase} | 
|  | A string containing all the characters that are considered uppercase | 
|  | letters.  On most systems this is the string | 
|  | \code{'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}.  Do not change its definition --- | 
|  | the effect on the routines \function{lower()} and | 
|  | \function{swapcase()} is undefined.  The specific value is | 
|  | locale-dependent, and will be updated when | 
|  | \function{locale.setlocale()} is called. | 
|  | \end{datadesc} | 
|  |  | 
|  | \begin{datadesc}{whitespace} | 
|  | A string containing all characters that are considered whitespace. | 
|  | On most systems this includes the characters space, tab, linefeed, | 
|  | return, formfeed, and vertical tab.  Do not change its definition --- | 
|  | the effect on the routines \function{strip()} and \function{split()} | 
|  | is undefined. | 
|  | \end{datadesc} | 
|  |  | 
|  |  | 
|  | Many of the functions provided by this module are also defined as | 
|  | methods of string and Unicode objects; see ``String Methods'' (section | 
|  | \ref{string-methods}) for more information on those. | 
|  | The functions defined in this module are: | 
|  |  | 
|  | \begin{funcdesc}{atof}{s} | 
|  | \deprecated{2.0}{Use the \function{float()} built-in function.} | 
|  | Convert a string to a floating point number.  The string must have | 
|  | the standard syntax for a floating point literal in Python, | 
|  | optionally preceded by a sign (\samp{+} or \samp{-}).  Note that | 
|  | this behaves identical to the built-in function | 
|  | \function{float()}\bifuncindex{float} when passed a string. | 
|  |  | 
|  | \note{When passing in a string, values for NaN\index{NaN} | 
|  | and Infinity\index{Infinity} may be returned, depending on the | 
|  | underlying C library.  The specific set of strings accepted which | 
|  | cause these values to be returned depends entirely on the C library | 
|  | and is known to vary.} | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{atoi}{s\optional{, base}} | 
|  | \deprecated{2.0}{Use the \function{int()} built-in function.} | 
|  | Convert string \var{s} to an integer in the given \var{base}.  The | 
|  | string must consist of one or more digits, optionally preceded by a | 
|  | sign (\samp{+} or \samp{-}).  The \var{base} defaults to 10.  If it | 
|  | is 0, a default base is chosen depending on the leading characters | 
|  | of the string (after stripping the sign): \samp{0x} or \samp{0X} | 
|  | means 16, \samp{0} means 8, anything else means 10.  If \var{base} | 
|  | is 16, a leading \samp{0x} or \samp{0X} is always accepted, though | 
|  | not required.  This behaves identically to the built-in function | 
|  | \function{int()} when passed a string.  (Also note: for a more | 
|  | flexible interpretation of numeric literals, use the built-in | 
|  | function \function{eval()}\bifuncindex{eval}.) | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{atol}{s\optional{, base}} | 
|  | \deprecated{2.0}{Use the \function{long()} built-in function.} | 
|  | Convert string \var{s} to a long integer in the given \var{base}. | 
|  | The string must consist of one or more digits, optionally preceded | 
|  | by a sign (\samp{+} or \samp{-}).  The \var{base} argument has the | 
|  | same meaning as for \function{atoi()}.  A trailing \samp{l} or | 
|  | \samp{L} is not allowed, except if the base is 0.  Note that when | 
|  | invoked without \var{base} or with \var{base} set to 10, this | 
|  | behaves identical to the built-in function | 
|  | \function{long()}\bifuncindex{long} when passed a string. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{capitalize}{word} | 
|  | Return a copy of \var{word} with only its first character capitalized. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{capwords}{s} | 
|  | Split the argument into words using \function{split()}, capitalize | 
|  | each word using \function{capitalize()}, and join the capitalized | 
|  | words using \function{join()}.  Note that this replaces runs of | 
|  | whitespace characters by a single space, and removes leading and | 
|  | trailing whitespace. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{expandtabs}{s\optional{, tabsize}} | 
|  | Expand tabs in a string, i.e.\ replace them by one or more spaces, | 
|  | depending on the current column and the given tab size.  The column | 
|  | number is reset to zero after each newline occurring in the string. | 
|  | This doesn't understand other non-printing characters or escape | 
|  | sequences.  The tab size defaults to 8. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{find}{s, sub\optional{, start\optional{,end}}} | 
|  | Return the lowest index in \var{s} where the substring \var{sub} is | 
|  | found such that \var{sub} is wholly contained in | 
|  | \code{\var{s}[\var{start}:\var{end}]}.  Return \code{-1} on failure. | 
|  | Defaults for \var{start} and \var{end} and interpretation of | 
|  | negative values is the same as for slices. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{rfind}{s, sub\optional{, start\optional{, end}}} | 
|  | Like \function{find()} but find the highest index. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{index}{s, sub\optional{, start\optional{, end}}} | 
|  | Like \function{find()} but raise \exception{ValueError} when the | 
|  | substring is not found. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{rindex}{s, sub\optional{, start\optional{, end}}} | 
|  | Like \function{rfind()} but raise \exception{ValueError} when the | 
|  | substring is not found. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{count}{s, sub\optional{, start\optional{, end}}} | 
|  | Return the number of (non-overlapping) occurrences of substring | 
|  | \var{sub} in string \code{\var{s}[\var{start}:\var{end}]}. | 
|  | Defaults for \var{start} and \var{end} and interpretation of | 
|  | negative values are the same as for slices. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{lower}{s} | 
|  | Return a copy of \var{s}, but with upper case letters converted to | 
|  | lower case. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{maketrans}{from, to} | 
|  | Return a translation table suitable for passing to | 
|  | \function{translate()} or \function{regex.compile()}, that will map | 
|  | each character in \var{from} into the character at the same position | 
|  | in \var{to}; \var{from} and \var{to} must have the same length. | 
|  |  | 
|  | \warning{Don't use strings derived from \constant{lowercase} | 
|  | and \constant{uppercase} as arguments; in some locales, these don't have | 
|  | the same length.  For case conversions, always use | 
|  | \function{lower()} and \function{upper()}.} | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{split}{s\optional{, sep\optional{, maxsplit}}} | 
|  | Return a list of the words of the string \var{s}.  If the optional | 
|  | second argument \var{sep} is absent or \code{None}, the words are | 
|  | separated by arbitrary strings of whitespace characters (space, tab, | 
|  | newline, return, formfeed).  If the second argument \var{sep} is | 
|  | present and not \code{None}, it specifies a string to be used as the | 
|  | word separator.  The returned list will then have one more item | 
|  | than the number of non-overlapping occurrences of the separator in | 
|  | the string.  The optional third argument \var{maxsplit} defaults to | 
|  | 0.  If it is nonzero, at most \var{maxsplit} number of splits occur, | 
|  | and the remainder of the string is returned as the final element of | 
|  | the list (thus, the list will have at most \code{\var{maxsplit}+1} | 
|  | elements). | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{splitfields}{s\optional{, sep\optional{, maxsplit}}} | 
|  | This function behaves identically to \function{split()}.  (In the | 
|  | past, \function{split()} was only used with one argument, while | 
|  | \function{splitfields()} was only used with two arguments.) | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{join}{words\optional{, sep}} | 
|  | Concatenate a list or tuple of words with intervening occurrences of | 
|  | \var{sep}.  The default value for \var{sep} is a single space | 
|  | character.  It is always true that | 
|  | \samp{string.join(string.split(\var{s}, \var{sep}), \var{sep})} | 
|  | equals \var{s}. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{joinfields}{words\optional{, sep}} | 
|  | This function behaves identically to \function{join()}.  (In the past, | 
|  | \function{join()} was only used with one argument, while | 
|  | \function{joinfields()} was only used with two arguments.) | 
|  | Note that there is no \method{joinfields()} method on string | 
|  | objects; use the \method{join()} method instead. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{lstrip}{s\optional{, chars}} | 
|  | Return a copy of the string with leading characters removed.  If | 
|  | \var{chars} is omitted or \code{None}, whitespace characters are | 
|  | removed.  If given and not \code{None}, \var{chars} must be a string; | 
|  | the characters in the string will be stripped from the beginning of | 
|  | the string this method is called on. | 
|  | \versionchanged[The \var{chars} parameter was added.  The \var{chars} | 
|  | parameter cannot be passed in earlier 2.2 versions]{2.2.3} | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{rstrip}{s\optional{, chars}} | 
|  | Return a copy of the string with trailing characters removed.  If | 
|  | \var{chars} is omitted or \code{None}, whitespace characters are | 
|  | removed.  If given and not \code{None}, \var{chars} must be a string; | 
|  | the characters in the string will be stripped from the end of the | 
|  | string this method is called on. | 
|  | \versionchanged[The \var{chars} parameter was added.  The \var{chars} | 
|  | parameter cannot be passed in 2.2 versions]{2.2.3} | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{strip}{s\optional{, chars}} | 
|  | Return a copy of the string with leading and trailing characters | 
|  | removed.  If \var{chars} is omitted or \code{None}, whitespace | 
|  | characters are removed.  If given and not \code{None}, \var{chars} | 
|  | must be a string; the characters in the string will be stripped from | 
|  | the both ends of the string this method is called on. | 
|  | \versionchanged[The \var{chars} parameter was added.  The \var{chars} | 
|  | parameter cannot be passed in earlier 2.2 versions]{2.2.3} | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{swapcase}{s} | 
|  | Return a copy of \var{s}, but with lower case letters | 
|  | converted to upper case and vice versa. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{translate}{s, table\optional{, deletechars}} | 
|  | Delete all characters from \var{s} that are in \var{deletechars} (if | 
|  | present), and then translate the characters using \var{table}, which | 
|  | must be a 256-character string giving the translation for each | 
|  | character value, indexed by its ordinal. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{upper}{s} | 
|  | Return a copy of \var{s}, but with lower case letters converted to | 
|  | upper case. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{ljust}{s, width} | 
|  | \funcline{rjust}{s, width} | 
|  | \funcline{center}{s, width} | 
|  | These functions respectively left-justify, right-justify and center | 
|  | a string in a field of given width.  They return a string that is at | 
|  | least \var{width} characters wide, created by padding the string | 
|  | \var{s} with spaces until the given width on the right, left or both | 
|  | sides.  The string is never truncated. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{zfill}{s, width} | 
|  | Pad a numeric string on the left with zero digits until the given | 
|  | width is reached.  Strings starting with a sign are handled | 
|  | correctly. | 
|  | \end{funcdesc} | 
|  |  | 
|  | \begin{funcdesc}{replace}{str, old, new\optional{, maxsplit}} | 
|  | Return a copy of string \var{str} with all occurrences of substring | 
|  | \var{old} replaced by \var{new}.  If the optional argument | 
|  | \var{maxsplit} is given, the first \var{maxsplit} occurrences are | 
|  | replaced. | 
|  | \end{funcdesc} |