\section{Standard Module \module{regsub}}
\declaremodule{standard}{regsub}

\modulesynopsis{Substitution and splitting operations that use regular expressions.}


This module defines a number of functions useful for working with
regular expressions (see built-in module \module{regex}).

\strong{Warning:} These functions are not thread-safe.

\strong{Obsolescence note:}
This module is obsolete as of Python version 1.5; it is still being
maintained because much existing code still uses it. All new code in
need of regular expressions should use the new \module{re} module, which
supports the more powerful and regular Perl-style regular expressions.
Existing code should be converted. The standard library module
\module{reconvert} helps in converting \module{regex}-style regular
expressions to \module{re}-style regular expressions. (For more
conversion help, see Andrew Kuchling's\index{Kuchling, Andrew}
``regex-to-re HOWTO'' at
\url{http://www.python.org/doc/howto/regex-to-re/}.)


\begin{funcdesc}{sub}{pat, repl, str}
Replace the first occurrence of pattern \var{pat} in string
\var{str} by replacement \var{repl}. If the pattern isn't found,
the string is returned unchanged. The pattern may be a string or an
already compiled pattern. The replacement may contain references
\samp{\e \var{digit}} to subpatterns and escaped backslashes.
\end{funcdesc}
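
For code being converted, the first-occurrence behavior described above
can be approximated with the newer \module{re} module. The following is
a minimal sketch and not the \module{regsub} implementation: the helper
name \code{sub\_first} is hypothetical, and the patterns use the
Perl-style syntax of \module{re}, which differs from the \module{regex}
syntax that \code{sub()} itself accepts.

```python
import re


def sub_first(pat, repl, s):
    """Replace only the first match of pat in s (hypothetical helper).

    pat may be a pattern string or a compiled pattern; repl may refer
    to groups with backslash-digit references such as \\1.
    """
    return re.sub(pat, repl, s, count=1)


# Only the first "number-number" pair is swapped.
swapped = sub_first(r"(\d+)-(\d+)", r"\2-\1", "10-20 and 30-40")
print(swapped)

# A pattern that does not match leaves the string unchanged.
unchanged = sub_first("x", "y", "abc")
print(unchanged)
```

Passing \code{count=1} is what limits \code{re.sub()} to a single
replacement; without it, every match would be replaced.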

\begin{funcdesc}{gsub}{pat, repl, str}
Replace all (non-overlapping) occurrences of pattern \var{pat} in
string \var{str} by replacement \var{repl}. The same rules as for
\code{sub()} apply. Empty matches for the pattern are replaced only
when not adjacent to a previous match; for example,
\code{gsub('', '-', 'abc')} returns \code{'-a-b-c-'}.
\end{funcdesc}
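
The global replacement described above corresponds roughly to calling
\code{re.sub()} without a count. This is a sketch under that
assumption, using the \module{re} module rather than \module{regsub};
the helper name \code{sub\_all} is hypothetical. Note that the rule for
empty matches adjacent to a previous match may not be the same in
\module{re}, so only the simple empty-pattern case from the text is
shown.

```python
import re


def sub_all(pat, repl, s):
    """Replace every non-overlapping match of pat in s (hypothetical helper)."""
    return re.sub(pat, repl, s)


# Matches do not overlap: "aaaaa" holds two "aa" matches plus a leftover "a".
non_overlapping = sub_all("aa", "b", "aaaaa")
print(non_overlapping)

# The empty pattern matches before each character and at the end.
dashed = sub_all("", "-", "abc")
print(dashed)
```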

\begin{funcdesc}{split}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields. Only
non-empty matches for the pattern are considered; for example,
\code{split('a:b', ':*')} returns \code{['a', 'b']} and
\code{split('abc', '')} returns \code{['abc']}. The \var{maxsplit}
argument defaults to 0. If it is nonzero, at most \var{maxsplit} splits
occur, and the remainder of the string is returned as the final
element of the list.
\end{funcdesc}
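
The splitting rules above (ignore empty matches, stop after
\var{maxsplit} splits) can be sketched with the \module{re} module as
follows. This is an illustration only, not the \module{regsub} source:
\code{split\_fields} is a hypothetical name, it keeps the
string-first argument order of \code{split()}, and its handling of
cases the text does not mention (such as a delimiter at the very start
of the string) is an assumption.

```python
import re


def split_fields(s, pat, maxsplit=0):
    """Split s on non-empty matches of pat (hypothetical helper).

    If maxsplit is nonzero, stop after that many splits and return the
    rest of the string as the last element.
    """
    fields = []
    start = 0
    for match in re.finditer(pat, s):
        if match.start() == match.end():
            continue  # empty matches are never treated as delimiters
        fields.append(s[start:match.start()])
        start = match.end()
        if maxsplit and len(fields) >= maxsplit:
            break
    fields.append(s[start:])  # remainder after the last delimiter
    return fields


print(split_fields("a:b", ":*"))
print(split_fields("abc", ""))
print(split_fields("a:b:c:d", ":", 2))
```

The explicit check for empty matches is used instead of a direct call to
\code{re.split()}, because \code{re.split()} may also split on
empty matches.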

\begin{funcdesc}{splitx}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields as well
as the separators. For example, \code{splitx('a:::b', ':*')} returns
\code{['a', ':::', 'b']}. Otherwise, this function behaves the same
as \code{split()}.
\end{funcdesc}
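
In the \module{re} module, a similar result (fields interleaved with
their separators) can be obtained by putting the delimiter pattern in a
capturing group. The sketch below assumes that \var{pat} contains no
capturing groups of its own and never matches the empty string; the
helper name \code{split\_keep\_separators} is hypothetical.

```python
import re


def split_keep_separators(s, pat, maxsplit=0):
    """Split s on pat and keep the separators in the result.

    Hypothetical helper; assumes pat has no capturing groups and does
    not match the empty string.
    """
    # Wrapping the pattern in a group makes re.split() return the
    # matched separators between the fields.
    return re.split("(" + pat + ")", s, maxsplit=maxsplit)


print(split_keep_separators("a:::b", ":+"))
print(split_keep_separators("a:b:c", ":", 1))
```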

\begin{funcdesc}{capwords}{s\optional{, pat}}
Capitalize the words in string \var{s}, where words are separated by
the optional pattern \var{pat}. The default pattern uses any characters
except letters, digits and underscores as word delimiters.
Capitalization is done by changing the first character of each word to
upper case.
\end{funcdesc}
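
The capitalization rule above can be illustrated with the \module{re}
module. This is a sketch, not the \module{regsub} code: the name
\code{capwords\_sketch} and the exact default pattern are assumptions
chosen to match the description (anything other than ASCII letters,
digits and underscores is a delimiter), and \var{pat} is assumed to
contain no capturing groups.

```python
import re


def capwords_sketch(s, pat=r"[^a-zA-Z0-9_]+"):
    """Upper-case the first character of each word in s.

    Words are the pieces of s between matches of the delimiter pattern
    pat (hypothetical helper; the default pattern is an assumption).
    """
    # Splitting on a capturing group keeps the delimiters, so the
    # string can be reassembled unchanged apart from the capitals.
    pieces = re.split("(" + pat + ")", s)
    result = []
    for index, piece in enumerate(pieces):
        if index % 2 == 0:  # even positions are words
            piece = piece[:1].upper() + piece[1:]
        result.append(piece)  # odd positions are delimiters
    return "".join(result)


print(capwords_sketch("hello, big_world 2nd"))
print(capwords_sketch("one-two three", "-"))
```

Only the first character of each word is changed; the rest of the word,
including any later capitals, is left as it is.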

\begin{funcdesc}{clear_cache}{}
The \module{regsub} module maintains a cache of compiled regular
expressions, keyed on the regular expression string and the syntax of
the \module{regex} module at the time the expression was compiled. This
function clears that cache.
\end{funcdesc}