| \section{\module{reconvert} --- |
| Convert regular expressions from regex to re form} |
| \declaremodule{standard}{reconvert} |
| \moduleauthor{Andrew M. Kuchling}{amk@amk.ca} |
| \sectionauthor{Skip Montanaro}{skip@pobox.com} |
| |
| |
| \modulesynopsis{Convert regex-, emacs- or sed-style regular expressions |
| to re-style syntax.} |
| |
| |
| This module converts regular expressions from the syntax used by the |
| deprecated \module{regex} module to the syntax used by the newer |
| \module{re} module. Because the regular expression syntax of |
| \code{sed(1)} and \code{emacs(1)} is similar to that of the |
| \module{regex} module, it can also convert patterns written for |
| those tools to \module{re} patterns. |
| |
| When run as a script, the module reads a Python string literal (or any |
| other expression evaluating to a string) from stdin and writes the |
| translated expression to stdout as a string literal. No trailing newline |
| is written unless stdout is a tty, so that the script can be used with |
| Emacs \code{C-U M-|} (shell-command-on-region), which filters the region |
| through the shell command. |
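| |
| The filter behaviour described above can be sketched roughly as follows. |
| This is an illustration, not the module's own code: \code{run_filter} and |
| \code{strip_paren_escapes} are hypothetical names, and the sketch uses |
| \code{ast.literal_eval}, which accepts only literals rather than arbitrary |
| string expressions. |
| |

```python
import ast
import io


def run_filter(inp, out, translate):
    """Read one string literal from inp, write the translated literal to out."""
    # literal_eval accepts only literals, unlike the arbitrary string
    # expressions the text above allows; that narrowing is deliberate here.
    pattern = ast.literal_eval(inp.read().strip())
    out.write(repr(translate(pattern)))
    # Add the newline only for a terminal, so an editor that replaces a
    # region with the output gets nothing extra.
    if out.isatty():
        out.write('\n')


def strip_paren_escapes(pattern):
    """Hypothetical stand-in translator: turn \\( and \\) into ( and )."""
    return pattern.replace('\\(', '(').replace('\\)', ')')


# In-memory streams stand in for stdin and stdout.
source = io.StringIO('r"\\(a\\)"\n')
sink = io.StringIO()
run_filter(source, sink, strip_paren_escapes)
result = sink.getvalue()
```

| |
| Because an in-memory stream is not a tty, \code{result} carries no |
| trailing newline, which is the case an editor filter relies on. |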
| |
| \begin{seealso} |
| \seetitle{Mastering Regular Expressions}{Book on regular expressions |
| by Jeffrey Friedl, published by O'Reilly. The second |
| edition of the book no longer covers Python at all, |
| but the first edition covered writing good regular expression |
| patterns in great detail.} |
| \end{seealso} |
| |
| \subsection{Module Contents} |
| \nodename{Contents of Module reconvert} |
| |
| The module defines two functions and a handful of constants. |
| |
| \begin{funcdesc}{convert}{pattern\optional{, syntax=None}} |
| Convert a \var{pattern} representing a \module{regex}-style regular |
| expression into a \module{re}-style regular expression. The optional |
| \var{syntax} parameter is a bitwise-or'd set of flags that control what |
| constructs are converted. See below for a description of the various |
| constants. |
| \end{funcdesc} |
| |
| \begin{funcdesc}{quote}{s\optional{, quote=None}} |
| Convert a string object to a quoted string literal. |
| |
| This is similar to \function{repr} but will return a ``raw'' string |
| (\code{r'...'} or \code{r"..."}) when the string contains backslashes, |
| instead of doubling all backslashes. The resulting string does not always |
| evaluate to the same string as the original; however, it behaves as |
| intended when passed to \function{re.compile()}. |
| |
| The optional second argument forces the string quote; it must be a single |
| character which is a valid Python string quote. Note that prior to Python |
| 2.5 this would not accept triple-quoted string delimiters. |
| \end{funcdesc} |
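| |
| A minimal sketch of this kind of quoting is shown below. The name |
| \code{quote_sketch} is hypothetical, the code assumes ASCII input, and it |
| is not presented as the module's implementation. It illustrates why the |
| literal may evaluate to a different string and still compile to the same |
| pattern. |
| |

```python
import re


def quote_sketch(s, quote=None):
    """Return s as a string literal, in raw form if it needs backslashes."""
    if quote is None:
        # Prefer a quote character that does not occur in s.
        quote = '"' if "'" in s and '"' not in s else "'"
    body = []
    for ch in s:
        if ch == quote:
            body.append('\\' + ch)           # escape the delimiter
        elif ch < ' ' or ch > '~':
            body.append('\\%03o' % ord(ch))  # octal escape, decoded by re
        else:
            body.append(ch)
    literal = quote + ''.join(body) + quote
    # Any backslash at all switches the literal to raw form.
    return 'r' + literal if '\\' in literal else literal


# A tab becomes the four characters \011 inside a raw literal.
tab_literal = quote_sketch('a\tb')
# The evaluated literal differs from the original string, yet as a
# pattern it still matches the original text.
still_matches = re.match(eval(tab_literal), 'a\tb') is not None
```

| |
| The raw literal holds the escape sequence as text, so it is the regular |
| expression compiler, not the Python parser, that decodes it. |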
| |
| \begin{datadesc}{RE_NO_BK_PARENS} |
| Suppress paren conversion. This should be omitted when converting |
| \code{sed}-style or \code{emacs}-style regular expressions. |
| \end{datadesc} |
| |
| \begin{datadesc}{RE_NO_BK_VBAR} |
| Suppress vertical bar conversion. This should be omitted when converting |
| \code{sed}-style or \code{emacs}-style regular expressions. |
| \end{datadesc} |
| |
| \begin{datadesc}{RE_BK_PLUS_QM} |
| Enable conversion of \code{+} and \code{?} characters. Add this flag |
| to the \var{syntax} argument of \function{convert} when converting |
| \code{sed}-style regular expressions, and omit it when converting |
| \code{emacs}-style regular expressions. |
| \end{datadesc} |
| |
| \begin{datadesc}{RE_NEWLINE_OR} |
| When set, newline characters are replaced by \code{|}. |
| \end{datadesc} |
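| |
| The effect of these flags can be illustrated with a small sketch. The |
| function \code{convert_sketch} and the numeric flag values are assumptions |
| made for this example, not the module's implementation; the sketch also |
| ignores bracket expressions, which a real converter must leave untouched. |
| |

```python
import re

# Hypothetical flag values chosen for this sketch; the real constants
# come from the module and may differ.
RE_NO_BK_PARENS = 1
RE_NO_BK_VBAR = 2
RE_BK_PLUS_QM = 4
RE_NEWLINE_OR = 16


def convert_sketch(pattern, syntax=0):
    """Swap escaped and unescaped operators according to the flags."""
    table = {}
    if not syntax & RE_NO_BK_PARENS:
        # \( \) group in the old syntax; bare parentheses are literals.
        table.update({r'\(': '(', r'\)': ')', '(': r'\(', ')': r'\)'})
    if not syntax & RE_NO_BK_VBAR:
        table.update({r'\|': '|', '|': r'\|'})
    if syntax & RE_BK_PLUS_QM:
        table.update({r'\+': '+', r'\?': '?', '+': r'\+', '?': r'\?'})
    if syntax & RE_NEWLINE_OR:
        table['\n'] = '|'

    out = []
    i = 0
    while i < len(pattern):
        # A backslash and the character after it form one token.
        width = 2 if pattern[i] == '\\' and i + 1 < len(pattern) else 1
        token = pattern[i:i + width]
        out.append(table.get(token, token))
        i += width
    return ''.join(out)


# emacs-style input: no flags set.
emacs_result = convert_sketch(r'\(a\|b\)*c')
# sed-style input: RE_BK_PLUS_QM added, the two RE_NO_BK flags omitted.
sed_result = convert_sketch(r'\(ab\)\+c?', RE_BK_PLUS_QM)
# Flags combine with bitwise or.
combined = convert_sketch('(a)\n(b)', RE_NO_BK_PARENS | RE_NEWLINE_OR)
```

| |
| Building a lookup table from the flags keeps each constant's effect |
| independent of the others, which is why they can be combined freely. |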