\section{Standard Module \sectcode{regsub}}
\label{module-regsub}

\stmodindex{regsub}
This module defines a number of functions useful for working with
regular expressions (see built-in module \code{regex}).

Warning: these functions are not thread-safe.

\strong{Obsolescence note:}
This module is obsolete as of Python version 1.5; it is still being
maintained because much existing code still uses it.  All new code in
need of regular expressions should use the new \module{re} module, which
supports the more powerful and regular Perl-style regular expressions.
Existing code should be converted.  The standard library module
\module{reconvert} helps in converting \code{regex} style regular
expressions to \module{re} style regular expressions.  (For more
conversion help, see the URL
\url{http://starship.skyport.net/crew/amk/howto/regex-to-re.html}.)

\setindexsubitem{(in module regsub)}

\begin{funcdesc}{sub}{pat, repl, str}
Replace the first occurrence of pattern \var{pat} in string
\var{str} by replacement \var{repl}.  If the pattern isn't found,
the string is returned unchanged.  The pattern may be a string or an
already compiled pattern.  The replacement may contain references
\samp{\e \var{digit}} to subpatterns and escaped backslashes.
\end{funcdesc}

\begin{funcdesc}{gsub}{pat, repl, str}
Replace all (non-overlapping) occurrences of pattern \var{pat} in
string \var{str} by replacement \var{repl}.  The same rules as for
\code{sub()} apply.  Empty matches for the pattern are replaced only
when not adjacent to a previous match, so e.g.
\code{gsub('', '-', 'abc')} returns \code{'-a-b-c-'}.
\end{funcdesc}
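Because new code is advised to use \module{re}, the following minimal
sketch shows how the documented behaviour of \code{sub()} and
\code{gsub()} might be approximated there.  The helper names
\code{sub\_first} and \code{sub\_all} are hypothetical, and the pattern
syntax accepted by \module{re} differs from that of \code{regex}, so
patterns may need converting first.

```python
import re


def sub_first(pat, repl, s):
    """Replace only the first match of pat in s (hypothetical helper)."""
    return re.sub(pat, repl, s, count=1)


def sub_all(pat, repl, s):
    """Replace every non-overlapping match of pat in s (hypothetical helper)."""
    return re.sub(pat, repl, s)


# The replacement may refer to a subpattern by digit.
print(sub_first(r"(\d+)", r"<\1>", "a1b22"))
print(sub_all(r"(\d+)", r"<\1>", "a1b22"))
# An empty pattern matches between every pair of characters.
print(sub_all("", "-", "abc"))
```

The last call gives the same result as the \code{gsub('', '-', 'abc')}
example above.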

\begin{funcdesc}{split}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields.  Only
non-empty matches for the pattern are considered, so e.g.
\code{split('a:b', ':*')} returns \code{['a', 'b']} and
\code{split('abc', '')} returns \code{['abc']}.  The \var{maxsplit}
argument defaults to 0.  If it is nonzero, at most \var{maxsplit} splits
occur, and the remainder of the string is returned as the final
element of the list.
\end{funcdesc}
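The splitting rule described above (skip empty matches, stop after
\var{maxsplit} splits) can be sketched with the \module{re} module as
follows.  This is an illustration of the documented behaviour under that
assumption, not the \module{regsub} implementation; the name
\code{split\_nonempty} is hypothetical.

```python
import re


def split_nonempty(s, pat, maxsplit=0):
    """Split s on non-empty matches of pat (hypothetical helper).

    A maxsplit of 0 means no limit on the number of splits.
    """
    fields = []
    start = 0   # start of the field currently being collected
    splits = 0  # number of splits performed so far
    for m in re.finditer(pat, s):
        if m.end() == m.start():
            continue  # empty matches never act as delimiters
        if maxsplit and splits >= maxsplit:
            break
        fields.append(s[start:m.start()])
        start = m.end()
        splits += 1
    fields.append(s[start:])  # remainder of the string is the last field
    return fields


print(split_nonempty("a:b", ":*"))
print(split_nonempty("abc", ""))
print(split_nonempty("a:b:c", ":", 1))
```

Note that the argument order follows this module (string first, then
pattern), which is the reverse of \code{re.split()}.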

\begin{funcdesc}{splitx}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields as well
as the separators.  For example, \code{splitx('a:::b', ':*')} returns
\code{['a', ':::', 'b']}.  Otherwise, this function behaves the same
as \code{split()}.
\end{funcdesc}
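With the \module{re} module, a comparable result can be obtained by
wrapping the delimiter pattern in a capturing group, which makes
\code{re.split()} keep the separators.  This is a sketch only: the
helper name \code{split\_keep} is hypothetical, and the pattern is
written as \code{:+} rather than \code{:*} because current versions of
\code{re.split()} also split on empty matches.

```python
import re


def split_keep(s, pat, maxsplit=0):
    """Split s on pat and keep the separators (hypothetical helper).

    Assumes pat contains no capturing groups of its own.
    """
    return re.split("(" + pat + ")", s, maxsplit=maxsplit)


print(split_keep("a:::b", ":+"))
print(split_keep("a:b:c", ":+", 1))
```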

\begin{funcdesc}{capwords}{s\optional{, pat}}
Capitalize words separated by optional pattern \var{pat}.  The default
pattern uses any characters except letters, digits and underscores as
word delimiters.  Capitalization is done by changing the first
character of each word to upper case.
\end{funcdesc}
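The capitalization rule can be sketched with the \module{re} module as
shown here.  The default delimiter pattern is an assumption derived from
the description above, not copied from the module source, and the helper
name \code{cap\_words} is hypothetical.

```python
import re

# Assumed default: runs of characters that are not letters, digits or
# underscores act as word delimiters.
DEFAULT_DELIMITERS = r"[^a-zA-Z0-9_]+"


def cap_words(s, pat=DEFAULT_DELIMITERS):
    """Upper-case the first character of each word (hypothetical helper).

    Assumes pat contains no capturing groups of its own.
    """
    # The capturing group keeps the delimiters, so words sit at even
    # indices and delimiters at odd indices of the resulting list.
    parts = re.split("(" + pat + ")", s)
    return "".join(
        part[:1].upper() + part[1:] if i % 2 == 0 else part
        for i, part in enumerate(parts)
    )


print(cap_words("hello world_foo, bar"))
print(cap_words("a-b c", "-"))
```

Only the first character of each word is changed; the rest of the word
and all delimiters are copied through as they are.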

\begin{funcdesc}{clear_cache}{}
The regsub module maintains a cache of compiled regular expressions,
keyed on the regular expression string and the syntax of the regex
module at the time the expression was compiled.  This function clears
that cache.
\end{funcdesc}