\section{\module{regsub} ---
         Substitution and splitting operations that use regular expressions.}
\declaremodule{standard}{regsub}

\modulesynopsis{Substitution and splitting operations that use regular expressions.}


This module defines a number of functions useful for working with
regular expressions (see built-in module \code{regex}).

Warning: these functions are not thread-safe.

\strong{Obsolescence note:}
This module is obsolete as of Python version 1.5; it is still being
maintained because much existing code uses it.  All new code in
need of regular expressions should use the new \module{re} module, which
supports the more powerful and more consistent Perl-style regular
expressions.  Existing code should be converted.  The standard library
module \module{reconvert} helps in converting \code{regex}-style regular
expressions to \module{re}-style regular expressions.  (For more
conversion help, see Andrew Kuchling's\index{Kuchling, Andrew}
``regex-to-re HOWTO'' at
\url{http://www.python.org/doc/howto/regex-to-re/}.)


\begin{funcdesc}{sub}{pat, repl, str}
Replace the first occurrence of pattern \var{pat} in string
\var{str} by replacement \var{repl}.  If the pattern isn't found,
the string is returned unchanged.  The pattern may be a string or an
already compiled pattern.  The replacement may contain references
\samp{\e \var{digit}} to subpatterns and escaped backslashes.
\end{funcdesc}
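
Because \module{regsub} is obsolete, a first-match replacement is
normally written with \module{re} today.  The following is a minimal
sketch of that equivalent, not of \module{regsub} itself; it assumes
\module{re}'s \code{sub()} with \code{count=1}, uses Perl-style pattern
syntax rather than \code{regex} syntax, and the sample strings are
hypothetical.

```python
import re

# Replace only the first match, as sub() is described above.
# re.sub() replaces every match unless count is given.
first_only = re.sub(r'(\d+)', r'<\1>', 'a1b22', count=1)

# When the pattern is not found, the string comes back unchanged.
unchanged = re.sub(r'\d+', 'X', 'abc', count=1)

# An already compiled pattern works too; the replacement refers to
# subpatterns by digit.
compiled = re.compile(r'(\w+)@(\w+)')
swapped = compiled.sub(r'\2@\1', 'user@host', count=1)
```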

\begin{funcdesc}{gsub}{pat, repl, str}
Replace all (non-overlapping) occurrences of pattern \var{pat} in
string \var{str} by replacement \var{repl}.  The same rules as for
\code{sub()} apply.  Empty matches for the pattern are replaced only
when not adjacent to a previous match; for example,
\code{gsub('', '-', 'abc')} returns \code{'-a-b-c-'}.
\end{funcdesc}
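
For comparison, a global replacement with the newer \module{re} module
might look like the sketch below.  It assumes \module{re}'s
\code{sub()}, which replaces all matches by default; the sample strings
are hypothetical.  How empty matches next to a non-empty match are
treated can differ from the rule stated above, so only the
empty-pattern case from the text is shown.

```python
import re

# Replace every non-overlapping match (the default for re.sub()).
all_digits = re.sub(r'\d', '#', 'a1b22c333')

# The empty-pattern example from the text, expressed with re.
dashed = re.sub('', '-', 'abc')
```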

\begin{funcdesc}{split}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields.  Only
non-empty matches for the pattern are considered; for example,
\code{split('a:b', ':*')} returns \code{['a', 'b']} and
\code{split('abc', '')} returns \code{['abc']}.  The \var{maxsplit}
argument defaults to 0.  If it is nonzero, at most \var{maxsplit} splits
occur, and the remainder of the string is returned as the final
element of the list.
\end{funcdesc}
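
The rule that only non-empty matches act as delimiters can be sketched
with the \module{re} module as follows.  The helper name
\code{split_nonempty} is hypothetical and this is not the
\module{regsub} implementation; it only mirrors the behaviour described
above, with the same argument order (string first, then pattern).

```python
import re

def split_nonempty(s, pat, maxsplit=0):
    """Split s on non-empty matches of pat; 0 means no limit."""
    fields = []
    start = 0        # index where the current field begins
    splits = 0       # number of splits performed so far
    for m in re.finditer(pat, s):
        if m.end() == m.start():
            continue                 # ignore empty matches
        if maxsplit and splits >= maxsplit:
            break                    # limit reached; keep the rest whole
        fields.append(s[start:m.start()])
        start = m.end()
        splits += 1
    fields.append(s[start:])         # remainder is the final element
    return fields
```

With this helper, \code{split_nonempty('a:b', ':*')} gives
\code{['a', 'b']} and \code{split_nonempty('abc', '')} gives
\code{['abc']}, matching the examples in the text.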

\begin{funcdesc}{splitx}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields as well
as the separators.  For example, \code{splitx('a:::b', ':*')} returns
\code{['a', ':::', 'b']}.  Otherwise, this function behaves the same
as \code{split()}.
\end{funcdesc}
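
In the \module{re} module, a comparable result is obtained by wrapping
the delimiter pattern in a capturing group, which makes \code{split()}
keep the separators.  This is a sketch under that assumption; the
pattern \code{:+} is used instead of \code{:*} so that it can never
match the empty string, and the inputs are hypothetical.

```python
import re

# A capturing group around the delimiter keeps the separators
# in the result list.
with_separators = re.split(r'(:+)', 'a:::b')

# maxsplit limits the number of splits; the rest stays in one piece.
limited = re.split(r'(:+)', 'a:b:c', maxsplit=1)
```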

\begin{funcdesc}{capwords}{s\optional{, pat}}
Capitalize the words in \var{s}, where words are separated by the
optional pattern \var{pat}.  The default pattern treats any characters
except letters, digits and underscores as word delimiters.
Capitalization is done by changing the first character of each word to
upper case.
\end{funcdesc}
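
One way to picture this is to split the string while keeping the
delimiters, capitalize every field, and join the pieces again.  The
sketch below does that with the \module{re} module.  It is an
illustration and not the \module{regsub} source: the default pattern is
an assumed Perl-style spelling of ``anything except letters, digits and
underscores'', the rest of each word is assumed to be left unchanged,
and \var{pat} must not contain capturing groups of its own.

```python
import re

def capwords(s, pat=r'[^a-zA-Z0-9_]+'):
    # Splitting on a captured delimiter puts words at even indices
    # and delimiters at odd indices.
    parts = re.split('(' + pat + ')', s)
    for i in range(0, len(parts), 2):
        # Upper-case only the first character of each word.
        parts[i] = parts[i][:1].upper() + parts[i][1:]
    return ''.join(parts)
```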

\begin{funcdesc}{clear_cache}{}
The \module{regsub} module maintains a cache of compiled regular
expressions, keyed on the regular expression string and the syntax of
the \module{regex} module at the time the expression was compiled.
This function clears that cache.
\end{funcdesc}