\section{\module{regsub} ---
Substitution and splitting operations that use regular expressions.}
\declaremodule{standard}{regsub}

\modulesynopsis{Substitution and splitting operations that use regular expressions.}


This module defines a number of functions useful for working with
regular expressions (see built-in module \code{regex}).

\strong{Warning:} These functions are not thread-safe.

\strong{Obsolescence note:}
This module is obsolete as of Python version 1.5; it is still being
maintained because much existing code still uses it.  All new code in
need of regular expressions should use the new \module{re} module, which
supports the more powerful and regular Perl-style regular expressions.
Existing code should be converted.  The standard library module
\module{reconvert} helps in converting \code{regex} style regular
expressions to \module{re} style regular expressions.  (For more
conversion help, see Andrew Kuchling's\index{Kuchling, Andrew}
``regex-to-re HOWTO'' at
\url{http://www.python.org/doc/howto/regex-to-re/}.)


\begin{funcdesc}{sub}{pat, repl, str}
Replace the first occurrence of pattern \var{pat} in string
\var{str} by replacement \var{repl}.  If the pattern isn't found,
the string is returned unchanged.  The pattern may be a string or an
already compiled pattern.  The replacement may contain references
\samp{\e \var{digit}} to subpatterns and escaped backslashes.
\end{funcdesc}
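
As a rough illustration, the same kind of first-match replacement can be
written with the \module{re} module that the obsolescence note recommends.
This is only a sketch: it uses \module{re} pattern syntax, which may differ
from \code{regex} syntax, and the sample strings are made up for the example.

```python
import re

text = 'user@host and x@y'

# count=1 limits the substitution to the first match, like regsub.sub().
# \1 and \2 in the replacement refer to the parenthesized subpatterns.
first_only = re.sub(r'(\w+)@(\w+)', r'\2 at \1', text, count=1)
print(first_only)  # host at user and x@y

# When the pattern is not found, the string comes back unchanged.
unchanged = re.sub(r'\d+', '#', 'no digits here', count=1)
print(unchanged)  # no digits here
```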

\begin{funcdesc}{gsub}{pat, repl, str}
Replace all (non-overlapping) occurrences of pattern \var{pat} in
string \var{str} by replacement \var{repl}.  The same rules as for
\code{sub()} apply.  Empty matches for the pattern are replaced only
when not adjacent to a previous match, so e.g.
\code{gsub('', '-', 'abc')} returns \code{'-a-b-c-'}.
\end{funcdesc}
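
A comparable global replacement in the \module{re} module might look like
the sketch below.  It assumes \module{re} semantics.  How \module{re}
treats an empty match next to a non-empty one can differ from the rule
stated above and between Python versions, so only the simple cases are
shown.

```python
import re

# Without a count, re.sub() replaces every non-overlapping match.
collapsed = re.sub(r'\s+', ' ', 'a  b \t c')
print(collapsed)  # a b c

# An empty pattern matches at every position between characters.
dashed = re.sub('', '-', 'abc')
print(dashed)  # -a-b-c-
```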

\begin{funcdesc}{split}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields.  Only
non-empty matches for the pattern are considered, so e.g.
\code{split('a:b', ':*')} returns \code{['a', 'b']} and
\code{split('abc', '')} returns \code{['abc']}.  The \var{maxsplit}
argument defaults to 0.  If it is nonzero, at most \var{maxsplit} splits
occur, and the remainder of the string is returned as the final
element of the list.
\end{funcdesc}
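
The ``non-empty matches only'' rule and the \var{maxsplit} behaviour can be
sketched on top of the \module{re} module as follows.  The helper name
\code{split_nonempty()} is hypothetical and is not part of either module;
it is an approximation of the behaviour described above, not the
\module{regsub} implementation.

```python
import re

def split_nonempty(s, pat, maxsplit=0):
    """Split s on non-empty matches of pat (hypothetical helper)."""
    fields = []
    start = 0
    splits = 0
    for m in re.finditer(pat, s):
        if m.end() == m.start():
            continue  # ignore empty matches, as described above
        fields.append(s[start:m.start()])
        start = m.end()
        splits += 1
        if maxsplit and splits >= maxsplit:
            break  # the rest of the string becomes the final element
    fields.append(s[start:])
    return fields

print(split_nonempty('a:b', ':*'))        # ['a', 'b']
print(split_nonempty('abc', ''))          # ['abc']
print(split_nonempty('a:b:c:d', ':', 2))  # ['a', 'b', 'c:d']
```

Note that the argument order follows \code{split()} above (string first),
whereas \code{re.split()} takes the pattern first.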

\begin{funcdesc}{splitx}{str, pat\optional{, maxsplit}}
Split the string \var{str} into fields separated by delimiters matching
the pattern \var{pat}, and return a list containing the fields as well
as the separators.  For example, \code{splitx('a:::b', ':*')} returns
\code{['a', ':::', 'b']}.  Otherwise, this function behaves the same
as \code{split()}.
\end{funcdesc}
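
In the \module{re} module, a similar result can be obtained by wrapping the
delimiter pattern in a capturing group, which makes \code{re.split()} keep
the separators in the result.  The sketch below uses \samp{:+} rather than
\samp{:*} so that only non-empty delimiters are involved; the input strings
are illustrative.

```python
import re

# A capturing group around the delimiter keeps the separators in the list.
with_separators = re.split('(:+)', 'a:::b')
print(with_separators)  # ['a', ':::', 'b']

# maxsplit limits the number of splits; the remainder stays in one piece.
limited = re.split('(:)', 'a:b:c', maxsplit=1)
print(limited)  # ['a', ':', 'b:c']
```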

\begin{funcdesc}{capwords}{s\optional{, pat}}
Capitalize words separated by optional pattern \var{pat}.  The default
pattern uses any characters except letters, digits and underscores as
word delimiters.  Capitalization is done by changing the first
character of each word to upper case.
\end{funcdesc}
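
One way to picture this behaviour is the following sketch built on the
\module{re} module.  The function name \code{capwords_sketch()} and its
default pattern are assumptions made for illustration; the code is not the
\module{regsub} implementation, and it expects a delimiter pattern that
contains no groups of its own.

```python
import re

def capwords_sketch(s, pat=r'[^a-zA-Z0-9_]+'):
    """Upper-case the first character of each word (hypothetical helper)."""
    # Splitting on a capturing group keeps the delimiters, so the pieces
    # alternate: word, delimiter, word, delimiter, ...
    pieces = re.split('(' + pat + ')', s)
    result = []
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            # Only the first character changes; the rest is left alone.
            piece = piece[:1].upper() + piece[1:]
        result.append(piece)
    return ''.join(result)

print(capwords_sketch('hello, wide_world 2nd'))  # Hello, Wide_world 2nd
print(capwords_sketch('foo-bar baz', '-'))       # Foo-Bar baz
```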

\begin{funcdesc}{clear_cache}{}
The regsub module maintains a cache of compiled regular expressions,
keyed on the regular expression string and the syntax of the regex
module at the time the expression was compiled.  This function clears
that cache.
\end{funcdesc}