\section{\module{textwrap} ---
         Text wrapping and filling}

\declaremodule{standard}{textwrap}
\modulesynopsis{Text wrapping and filling}
\moduleauthor{Greg Ward}{gward@python.net}
\sectionauthor{Greg Ward}{gward@python.net}

\versionadded{2.3}

The \module{textwrap} module provides two convenience functions,
\function{wrap()} and \function{fill()}, as well as
\class{TextWrapper}, the class that does all the work, and a utility function
\function{dedent()}. If you're just wrapping or filling one or two
text strings, the convenience functions should be good enough; otherwise,
you should use an instance of \class{TextWrapper} for efficiency.

\begin{funcdesc}{wrap}{text\optional{, width\optional{, \moreargs}}}
Wraps the single paragraph in \var{text} (a string) so every line is at
most \var{width} characters long. Returns a list of output lines,
without final newlines.

Optional keyword arguments correspond to the instance attributes of
\class{TextWrapper}, documented below. \var{width} defaults to
\code{70}.
\end{funcdesc}

\begin{funcdesc}{fill}{text\optional{, width\optional{, \moreargs}}}
Wraps the single paragraph in \var{text}, and returns a single string
containing the wrapped paragraph. \function{fill()} is shorthand for
\begin{verbatim}
"\n".join(wrap(text, ...))
\end{verbatim}

In particular, \function{fill()} accepts exactly the same keyword
arguments as \function{wrap()}.
\end{funcdesc}
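
As a minimal sketch of how the two convenience functions relate (the
sample text and the \code{width=30} value are arbitrary choices made for
illustration):

```python
import textwrap

# Sample paragraph chosen only for illustration.
text = ("The textwrap module can wrap a long paragraph into "
        "a list of short lines or into one newline-joined string.")

lines = textwrap.wrap(text, width=30)    # list of lines, no final newlines
filled = textwrap.fill(text, width=30)   # one string with embedded newlines
```

Here \code{filled} should be the same string as \code{"\e n".join(lines)}.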

Both \function{wrap()} and \function{fill()} work by creating a
\class{TextWrapper} instance and calling a single method on it. That
instance is not reused, so for applications that wrap/fill many text
strings, it will be more efficient for you to create your own
\class{TextWrapper} object.
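
The reuse pattern described above might look like the following sketch.
The list \code{paragraphs} and the \code{width=40} setting are
hypothetical and serve only as an illustration:

```python
import textwrap

# Hypothetical input: several independent paragraphs to be filled.
paragraphs = [
    "First paragraph with enough words to need more than one line.",
    "Second paragraph, also long enough to be wrapped at least once.",
]

# Create the wrapper once, then call its fill() method for each paragraph
# instead of letting textwrap.fill() build a new instance every time.
wrapper = textwrap.TextWrapper(width=40)
filled = [wrapper.fill(p) for p in paragraphs]
```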

An additional utility function, \function{dedent()}, is provided to
remove indentation from strings that have unwanted whitespace to the
left of the text.

\begin{funcdesc}{dedent}{text}
Remove any common leading whitespace from every line in \var{text}.

This can be used to make triple-quoted strings line up with the left
edge of the display, while still presenting them in the source code
in indented form.

Note that tabs and spaces are both treated as whitespace, but they are
not equal: the lines \code{" {} hello"} and \code{"\textbackslash{}thello"}
are considered to have no common leading whitespace. (This behaviour is
new in Python 2.5; older versions of this module incorrectly expanded
tabs before searching for common leading whitespace.)

For example:
\begin{verbatim}
from textwrap import dedent

def test():
    # end first line with \ to avoid the empty line!
    s = '''\
    hello
      world
    '''
    print repr(s)          # prints '    hello\n      world\n    '
    print repr(dedent(s))  # prints 'hello\n  world\n'
\end{verbatim}
\end{funcdesc}
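
The tab-versus-space rule for \function{dedent()} can be sketched as
follows. The input strings are made up for illustration, and the
behaviour shown assumes Python 2.5 or later, as noted above:

```python
from textwrap import dedent

# Both lines share four leading spaces, so those four are removed.
spaces = dedent("    hello\n      world\n")

# One line starts with spaces, the other with a tab: there is no common
# leading whitespace, so the text comes back unchanged.
mixed_text = "  hello\n\thello\n"
mixed = dedent(mixed_text)
```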

\begin{classdesc}{TextWrapper}{...}
The \class{TextWrapper} constructor accepts a number of optional
keyword arguments. Each argument corresponds to one instance attribute,
so for example
\begin{verbatim}
wrapper = TextWrapper(initial_indent="* ")
\end{verbatim}
is the same as
\begin{verbatim}
wrapper = TextWrapper()
wrapper.initial_indent = "* "
\end{verbatim}

You can re-use the same \class{TextWrapper} object many times, and you
can change any of its options through direct assignment to instance
attributes between uses.
\end{classdesc}

The \class{TextWrapper} instance attributes (and keyword arguments to
the constructor) are as follows:

\begin{memberdesc}{width}
(default: \code{70}) The maximum length of wrapped lines. As long as
there are no individual words in the input text longer than
\member{width}, \class{TextWrapper} guarantees that no output line
will be longer than \member{width} characters.
\end{memberdesc}

\begin{memberdesc}{expand_tabs}
(default: \code{True}) If true, then all tab characters in \var{text}
will be expanded to spaces using the \method{expandtabs()} method of
\var{text}.
\end{memberdesc}

\begin{memberdesc}{replace_whitespace}
(default: \code{True}) If true, each whitespace character (as defined
by \code{string.whitespace}) remaining after tab expansion will be
replaced by a single space. \note{If \member{expand_tabs} is false
and \member{replace_whitespace} is true, each tab character will be
replaced by a single space, which is \emph{not} the same as tab
expansion.}
\end{memberdesc}
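
The difference between tab expansion and whitespace replacement can be
seen in a small sketch. The input \code{"a\e tb"} is an arbitrary
example, and the sketch relies on \method{expandtabs()} using its default
tab stops of eight columns:

```python
from textwrap import TextWrapper

text = "a\tb"

# Default settings: the tab is first expanded to the next tab stop
# (column 8), leaving seven spaces between the two letters.
expanded = TextWrapper().wrap(text)

# expand_tabs off, replace_whitespace still on: the tab is simply
# replaced by one space.
replaced = TextWrapper(expand_tabs=False).wrap(text)
```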

\begin{memberdesc}{initial_indent}
(default: \code{''}) String that will be prepended to the first line
of wrapped output. Counts towards the length of the first line.
\end{memberdesc}

\begin{memberdesc}{subsequent_indent}
(default: \code{''}) String that will be prepended to all lines of
wrapped output except the first. Counts towards the length of each
line except the first.
\end{memberdesc}
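
A common use of the two indent attributes is a hanging indent for list
items. The following is only a sketch; the bullet string, the width and
the sample text are illustrative assumptions:

```python
from textwrap import TextWrapper

# The first line gets the bullet; later lines are indented to match.
wrapper = TextWrapper(width=30, initial_indent="* ", subsequent_indent="  ")
lines = wrapper.wrap("An item in a bulleted list whose text is long "
                     "enough to continue on the following lines.")
```

Because both indent strings count towards the line length, every entry
in \code{lines} should still be at most 30 characters long.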

\begin{memberdesc}{fix_sentence_endings}
(default: \code{False}) If true, \class{TextWrapper} attempts to detect
sentence endings and ensure that sentences are always separated by
exactly two spaces. This is generally desired for text in a monospaced
font. However, the sentence detection algorithm is imperfect: it
assumes that a sentence ending consists of a lowercase letter followed
by one of \character{.},
\character{!}, or \character{?}, possibly followed by one of
\character{"} or \character{'}, followed by a space. One problem
with this algorithm is that it is unable to detect the difference
between ``Dr.'' in

\begin{verbatim}
[...] Dr. Frankenstein's monster [...]
\end{verbatim}

and ``Spot.'' in

\begin{verbatim}
[...] See Spot. See Spot run [...]
\end{verbatim}

\member{fix_sentence_endings} is false by default.

Since the sentence detection algorithm relies on
\code{string.lowercase} for the definition of ``lowercase letter,''
and a convention of using two spaces after a period to separate
sentences on the same line, it is specific to English-language texts.
\end{memberdesc}
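
The limitation described for \member{fix_sentence_endings} can be
reproduced with the two phrases used above. This is a sketch of the
observable behaviour, not of the detection code itself:

```python
from textwrap import TextWrapper

wrapper = TextWrapper(fix_sentence_endings=True)

# A real sentence boundary is widened to exactly two spaces ...
sentence = wrapper.fill("See Spot. See Spot run")

# ... but so is the gap after the abbreviation, because "r." looks like
# a lowercase letter followed by a period.
abbreviation = wrapper.fill("Dr. Frankenstein's monster")
```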

\begin{memberdesc}{break_long_words}
(default: \code{True}) If true, then words longer than
\member{width} will be broken in order to ensure that no lines are
longer than \member{width}. If it is false, long words will not be
broken, and some lines may be longer than \member{width}. (Long words
will be put on a line by themselves, in order to minimize the amount
by which \member{width} is exceeded.)
\end{memberdesc}
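
The effect of \member{break_long_words} is easiest to see with a single
word that is longer than \member{width}. In this sketch the 25-character
word and \code{width=10} are arbitrary illustrative values:

```python
from textwrap import TextWrapper

long_word = "x" * 25  # a single "word" longer than the width used below

# Default: the long word is split so that no line exceeds width.
broken = TextWrapper(width=10).wrap(long_word)

# break_long_words off: the word stays whole and the line overflows.
unbroken = TextWrapper(width=10, break_long_words=False).wrap(long_word)
```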

\class{TextWrapper} also provides two public methods, analogous to the
module-level convenience functions:

\begin{methoddesc}{wrap}{text}
Wraps the single paragraph in \var{text} (a string) so every line is
at most \member{width} characters long. All wrapping options are
taken from instance attributes of the \class{TextWrapper} instance.
Returns a list of output lines, without final newlines.
\end{methoddesc}

\begin{methoddesc}{fill}{text}
Wraps the single paragraph in \var{text}, and returns a single string
containing the wrapped paragraph.
\end{methoddesc}
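
Since both methods take their options from the instance, changing an
attribute affects the next call. A minimal sketch, with sample text and
widths chosen only for illustration:

```python
from textwrap import TextWrapper

text = "The quick brown fox jumps over the lazy dog near the river bank."

wrapper = TextWrapper(width=20)
narrow = wrapper.wrap(text)   # list of lines, each at most 20 characters

wrapper.width = 40            # options can be changed between uses
wide = wrapper.fill(text)     # single string, lines joined by "\n"
```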