| \section{\module{regsub} --- |
| String operations using regular expressions} |
| |
| \declaremodule{standard}{regsub} |
| \modulesynopsis{Substitution and splitting operations that use |
| regular expressions. \strong{Obsolete!}} |
| |
| |
| This module defines a number of functions useful for working with |
| regular expressions (see built-in module \refmodule{regex}). |
| |
| \strong{Warning:} These functions are not thread-safe. |
| |
| \strong{Obsolescence note:} |
| This module is obsolete as of Python version 1.5; it is still being |
| maintained because much existing code still uses it. All new code in |
| need of regular expressions should use the new \refmodule{re} module, which |
| supports the more powerful and regular Perl-style regular expressions. |
| Existing code should be converted. The standard library module |
| \module{reconvert} helps in converting \refmodule{regex} style regular |
| expressions to \refmodule{re} style regular expressions. (For more |
| conversion help, see Andrew Kuchling's\index{Kuchling, Andrew} |
| ``regex-to-re HOWTO'' at |
| \url{http://www.python.org/doc/howto/regex-to-re/}.) |
| |
| |
| \begin{funcdesc}{sub}{pat, repl, str} |
| Replace the first occurrence of pattern \var{pat} in string |
| \var{str} with replacement \var{repl}. If the pattern isn't found, |
| the string is returned unchanged. The pattern may be a string or an |
| already compiled pattern. The replacement may contain references to |
| subpatterns, written \samp{\e \var{digit}}, as well as escaped backslashes. |
| \end{funcdesc} |
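
For code being converted, the following is a minimal sketch of the same "replace only the first match" behaviour using the \refmodule{re} module. The helper name \code{sub_first} is hypothetical, and the pattern uses \refmodule{re} syntax (plain parentheses for groups), not \refmodule{regex} syntax.

```python
import re

def sub_first(pat, repl, text):
    """Replace only the first match of pat in text (hypothetical helper).

    pat may be a string or an already compiled pattern; repl may use
    \\1, \\2, ... to refer to groups in the pattern.
    """
    return re.sub(pat, repl, text, count=1)

# Only the first address is rewritten; the second is left alone.
swapped = sub_first(r"(\w+)@(\w+)", r"\2 at \1", "alice@example bob@example")

# A string without a match comes back unchanged.
unchanged = sub_first(r"x+", "y", "abc")
```

Passing \code{count=1} is what limits the substitution to the first match; without it, \function{re.sub()} replaces every match.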
| |
| \begin{funcdesc}{gsub}{pat, repl, str} |
| Replace all (non-overlapping) occurrences of pattern \var{pat} in |
| string \var{str} with replacement \var{repl}. The same rules as for |
| \function{sub()} apply. Empty matches for the pattern are replaced only |
| when not adjacent to a previous match, so, for example, |
| \code{gsub('', '-', 'abc')} returns \code{'-a-b-c-'}. |
| \end{funcdesc} |
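
As a rough illustration of the replace-all behaviour, this sketch uses \function{re.sub()} from the \refmodule{re} module, which replaces every non-overlapping match when no count is given. It is not a drop-in replacement: recent Python versions also replace an empty match that directly follows a non-empty one, so patterns that can match the empty string should be checked when converting.

```python
import re

# Collapse every run of whitespace into a single space.
collapsed = re.sub(r"\s+", " ", "a  b \t c")

# An always-empty pattern matches once between each pair of characters
# and at both ends of the string.
dashed = re.sub("", "-", "abc")
```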
| |
| \begin{funcdesc}{split}{str, pat\optional{, maxsplit}} |
| Split the string \var{str} into fields separated by delimiters matching |
| the pattern \var{pat}, and return a list containing the fields. Only |
| non-empty matches for the pattern are considered, so, for example, |
| \code{split('a:b', ':*')} returns \code{['a', 'b']} and |
| \code{split('abc', '')} returns \code{['abc']}. The \var{maxsplit} |
| argument defaults to 0. If it is nonzero, at most \var{maxsplit} splits |
| occur, and the remainder of the string is returned as the final |
| element of the list. |
| \end{funcdesc} |
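
The rule that only non-empty matches count as delimiters can be sketched on top of the \refmodule{re} module as follows. The helper \code{split_nonempty} is hypothetical and only approximates the description above; it is not the module's actual implementation, and it takes \refmodule{re}-style patterns.

```python
import re

def split_nonempty(text, pat, maxsplit=0):
    """Split text on non-empty matches of pat (hypothetical helper)."""
    fields = []
    start = 0      # index where the current field begins
    splits = 0     # number of splits performed so far
    for match in re.finditer(pat, text):
        if match.start() == match.end():
            continue  # empty matches are not treated as delimiters
        fields.append(text[start:match.start()])
        start = match.end()
        splits += 1
        if maxsplit and splits >= maxsplit:
            break
    # Whatever is left over becomes the final field.
    fields.append(text[start:])
    return fields

fields = split_nonempty("a:b", ":*")
whole = split_nonempty("abc", "")
limited = split_nonempty("a:b:c", ":", 1)
```

Note that the argument order here follows \function{split()} above (string first), whereas \function{re.split()} takes the pattern first.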
| |
| \begin{funcdesc}{splitx}{str, pat\optional{, maxsplit}} |
| Split the string \var{str} into fields separated by delimiters matching |
| the pattern \var{pat}, and return a list containing the fields as well |
| as the separators. For example, \code{splitx('a:::b', ':*')} returns |
| \code{['a', ':::', 'b']}. Otherwise, this function behaves the same |
| as \function{split()}. |
| \end{funcdesc} |
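
In the \refmodule{re} module, a comparable result can be obtained by wrapping the delimiter pattern in a capturing group, which makes \function{re.split()} include the matched separators in its result. The sketch below uses \code{:+} instead of \code{:*} because \function{re.split()} in recent Python versions also splits on empty matches.

```python
import re

# The parentheses make re.split() keep each separator in the output.
parts = re.split(r"(:+)", "a:::b")

# Fields sit at even indexes, separators at odd indexes.
fields = parts[0::2]
separators = parts[1::2]
```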
| |
| \begin{funcdesc}{capwords}{s\optional{, pat}} |
| Capitalize the words in string \var{s}, using the optional pattern |
| \var{pat} to match the delimiters between words. The default |
| pattern treats any characters except letters, digits and underscores as |
| word delimiters. Capitalization is done by changing the first |
| character of each word to upper case. |
| \end{funcdesc} |
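
The description above can be sketched with the \refmodule{re} module as follows. The name \code{capwords_sketch} and the default pattern \code{[^a-zA-Z0-9_]+} are assumptions chosen to match the wording above; the sketch also assumes \var{pat} contains no capturing groups of its own and leaves the rest of each word untouched.

```python
import re

def capwords_sketch(s, pat=r"[^a-zA-Z0-9_]+"):
    """Upper-case the first character of each word in s (hypothetical helper)."""
    # Splitting on a capturing group keeps the delimiters, so words land
    # at even indexes and delimiters at odd indexes.
    pieces = re.split("(" + pat + ")", s)
    for i in range(0, len(pieces), 2):
        word = pieces[i]
        pieces[i] = word[:1].upper() + word[1:]
    return "".join(pieces)

titled = capwords_sketch("hello, wide_world 42")
dashed = capwords_sketch("one-two three", pat=r"-")
```

Because underscores count as word characters under the default pattern, \code{wide_world} is treated as a single word.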
| |
| \begin{funcdesc}{clear_cache}{} |
| The \module{regsub} module maintains a cache of compiled regular expressions, |
| keyed on the regular expression string and the syntax of the \refmodule{regex} |
| module at the time the expression was compiled. This function clears |
| that cache. |
| \end{funcdesc} |