|  | .\"	$OpenBSD: re_format.7,v 1.14 2007/05/31 19:19:30 jmc Exp $ | 
|  | .\" | 
|  | .\" Copyright (c) 1997, Phillip F Knaack. All rights reserved. | 
|  | .\" | 
|  | .\" Copyright (c) 1992, 1993, 1994 Henry Spencer. | 
|  | .\" Copyright (c) 1992, 1993, 1994 | 
|  | .\"	The Regents of the University of California.  All rights reserved. | 
|  | .\" | 
|  | .\" This code is derived from software contributed to Berkeley by | 
|  | .\" Henry Spencer. | 
|  | .\" | 
|  | .\" Redistribution and use in source and binary forms, with or without | 
|  | .\" modification, are permitted provided that the following conditions | 
|  | .\" are met: | 
|  | .\" 1. Redistributions of source code must retain the above copyright | 
|  | .\"    notice, this list of conditions and the following disclaimer. | 
|  | .\" 2. Redistributions in binary form must reproduce the above copyright | 
|  | .\"    notice, this list of conditions and the following disclaimer in the | 
|  | .\"    documentation and/or other materials provided with the distribution. | 
|  | .\" 3. Neither the name of the University nor the names of its contributors | 
|  | .\"    may be used to endorse or promote products derived from this software | 
|  | .\"    without specific prior written permission. | 
|  | .\" | 
|  | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND | 
|  | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | 
|  | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE | 
|  | .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE | 
|  | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | 
|  | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS | 
|  | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) | 
|  | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | 
|  | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY | 
|  | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF | 
|  | .\" SUCH DAMAGE. | 
|  | .\" | 
|  | .\"	@(#)re_format.7	8.3 (Berkeley) 3/20/94 | 
|  | .\" | 
|  | .Dd $Mdocdate: May 31 2007 $ | 
|  | .Dt RE_FORMAT 7 | 
|  | .Os | 
|  | .Sh NAME | 
|  | .Nm re_format | 
|  | .Nd POSIX regular expressions | 
|  | .Sh DESCRIPTION | 
|  | Regular expressions (REs), | 
|  | as defined in | 
|  | .St -p1003.1-2004 , | 
|  | come in two forms: | 
|  | basic regular expressions | 
|  | (BREs) | 
|  | and extended regular expressions | 
|  | (EREs). | 
|  | Both forms of regular expressions are supported | 
|  | by the interfaces described in | 
|  | .Xr regex 3 . | 
|  | Applications dealing with regular expressions | 
|  | may use one or the other form | 
|  | (or indeed both). | 
|  | For example, | 
|  | .Xr ed 1 | 
|  | uses BREs, | 
|  | whilst | 
|  | .Xr egrep 1 | 
|  | uses EREs. | 
|  | Consult the manual page for the specific application to find out which | 
|  | it uses. | 
|  | .Pp | 
|  | POSIX leaves some aspects of RE syntax and semantics open; | 
|  | .Sq ** | 
|  | marks decisions on these aspects that | 
|  | may not be fully portable to other POSIX implementations. | 
|  | .Pp | 
|  | This manual page first describes regular expressions in general, | 
|  | specifically extended regular expressions, | 
|  | and then discusses differences between them and basic regular expressions. | 
|  | .Sh EXTENDED REGULAR EXPRESSIONS | 
|  | An ERE is one** or more non-empty** | 
|  | .Em branches , | 
|  | separated by | 
|  | .Sq \*(Ba . | 
|  | It matches anything that matches one of the branches. | 
|  | .Pp | 
|  | A branch is one** or more | 
|  | .Em pieces , | 
|  | concatenated. | 
|  | It matches a match for the first piece, followed by a match for the second, and so on. | 
|  | .Pp | 
|  | A piece is an | 
|  | .Em atom | 
|  | possibly followed by a single** | 
|  | .Sq * , | 
|  | .Sq + , | 
|  | .Sq ?\& , | 
|  | or | 
|  | .Em bound . | 
|  | An atom followed by | 
|  | .Sq * | 
|  | matches a sequence of 0 or more matches of the atom. | 
|  | An atom followed by | 
|  | .Sq + | 
|  | matches a sequence of 1 or more matches of the atom. | 
|  | An atom followed by | 
|  | .Sq ?\& | 
|  | matches a sequence of 0 or 1 matches of the atom. | 
|  | .Pp | 
|  | A bound is | 
|  | .Sq { | 
|  | followed by an unsigned decimal integer, | 
|  | possibly followed by | 
|  | .Sq ,\& | 
|  | possibly followed by another unsigned decimal integer, | 
|  | always followed by | 
|  | .Sq } . | 
|  | The integers must lie between 0 and | 
|  | .Dv RE_DUP_MAX | 
|  | (255**) inclusive, | 
|  | and if there are two of them, the first may not exceed the second. | 
|  | An atom followed by a bound containing one integer | 
|  | .Ar i | 
|  | and no comma matches | 
|  | a sequence of exactly | 
|  | .Ar i | 
|  | matches of the atom. | 
|  | An atom followed by a bound | 
|  | containing one integer | 
|  | .Ar i | 
|  | and a comma matches | 
|  | a sequence of | 
|  | .Ar i | 
|  | or more matches of the atom. | 
|  | An atom followed by a bound | 
|  | containing two integers | 
|  | .Ar i | 
|  | and | 
|  | .Ar j | 
|  | matches a sequence of | 
|  | .Ar i | 
|  | through | 
|  | .Ar j | 
|  | (inclusive) matches of the atom. | 
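The bound rules above can be checked mechanically. The following Python sketch is illustrative only: the name parse_bound is hypothetical, and the constant RE_DUP_MAX = 255 simply repeats the value quoted above, which may differ on other systems. The function validates the text of a bound and performs no matching.

```python
RE_DUP_MAX = 255  # value quoted in the text above; other systems may differ


def parse_bound(text):
    """Parse an ERE bound such as '{3}', '{3,}' or '{3,5}'.

    Returns (minimum, maximum); maximum is None when the bound is open-ended.
    Raises ValueError if the text breaks one of the rules described above.
    """
    if len(text) < 3 or text[0] != "{" or text[-1] != "}":
        raise ValueError("a bound is enclosed in '{' and '}'")
    first, comma, second = text[1:-1].partition(",")
    if not (first.isascii() and first.isdigit()):
        raise ValueError("a bound starts with an unsigned decimal integer")
    low = int(first)
    if not comma:
        high = low            # {i}: exactly i matches
    elif second == "":
        high = None           # {i,}: i or more matches
    elif second.isascii() and second.isdigit():
        high = int(second)    # {i,j}: i through j matches
    else:
        raise ValueError("the second integer must be unsigned decimal")
    if low > RE_DUP_MAX or (high is not None and high > RE_DUP_MAX):
        raise ValueError("integers must not exceed RE_DUP_MAX")
    if high is not None and low > high:
        raise ValueError("the first integer may not exceed the second")
    return low, high
```

For instance, parse_bound("{3,}") yields a minimum of 3 with no maximum, while "{5,3}" is rejected because the first integer exceeds the second.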
|  | .Pp | 
|  | An atom is a regular expression enclosed in | 
|  | .Sq () | 
|  | (matching whatever the enclosed regular expression matches), | 
|  | an empty set of | 
|  | .Sq () | 
|  | (matching the null string)**, | 
|  | a | 
|  | .Em bracket expression | 
|  | (see below), | 
|  | .Sq .\& | 
|  | (matching any single character), | 
|  | .Sq ^ | 
|  | (matching the null string at the beginning of a line), | 
|  | .Sq $ | 
|  | (matching the null string at the end of a line), | 
|  | a | 
|  | .Sq \e | 
|  | followed by one of the characters | 
|  | .Sq ^.[$()|*+?{\e | 
|  | (matching that character taken as an ordinary character), | 
|  | a | 
|  | .Sq \e | 
|  | followed by any other character** | 
|  | (matching that character taken as an ordinary character, | 
|  | as if the | 
|  | .Sq \e | 
|  | had not been present**), | 
|  | or a single character with no other significance (matching that character). | 
|  | A | 
|  | .Sq { | 
|  | followed by a character other than a digit is an ordinary character, | 
|  | not the beginning of a bound**. | 
|  | It is illegal to end an RE with | 
|  | .Sq \e . | 
|  | .Pp | 
|  | A bracket expression is a list of characters enclosed in | 
|  | .Sq [] . | 
|  | It normally matches any single character from the list (but see below). | 
|  | If the list begins with | 
|  | .Sq ^ , | 
|  | it matches any single character | 
|  | .Em not | 
|  | from the rest of the list | 
|  | (but see below). | 
|  | If two characters in the list are separated by | 
|  | .Sq - , | 
|  | this is shorthand for the full | 
|  | .Em range | 
|  | of characters between those two (inclusive) in the | 
|  | collating sequence, e.g.\& | 
|  | .Sq [0-9] | 
|  | in ASCII matches any decimal digit. | 
|  | It is illegal** for two ranges to share an endpoint, e.g.\& | 
|  | .Sq a-c-e . | 
|  | Ranges are very collating-sequence-dependent, | 
|  | and portable programs should avoid relying on them. | 
|  | .Pp | 
|  | To include a literal | 
|  | .Sq ]\& | 
|  | in the list, make it the first character | 
|  | (following a possible | 
|  | .Sq ^ ) . | 
|  | To include a literal | 
|  | .Sq - , | 
|  | make it the first or last character, | 
|  | or the second endpoint of a range. | 
|  | To use a literal | 
|  | .Sq - | 
|  | as the first endpoint of a range, | 
|  | enclose it in | 
|  | .Sq [. | 
|  | and | 
|  | .Sq .] | 
|  | to make it a collating element (see below). | 
|  | With the exception of these and some combinations using | 
|  | .Sq [ | 
|  | (see next paragraphs), | 
|  | all other special characters, including | 
|  | .Sq \e , | 
|  | lose their special significance within a bracket expression. | 
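The placement rules for a literal ']' and '-' can be tried out with a short Python sketch. Python's re module is not a POSIX matcher and is used here only as a convenient stand-in, on the assumption that it follows the same placement rules for these two characters. One difference to keep in mind: in Python a backslash remains an escape character inside brackets, unlike the behaviour described above, so the sketch avoids backslashes entirely. The helper name matches is hypothetical.

```python
import re

# ']' placed first (after an optional '^') is a literal member of the list.
close_or_a = re.compile(r"[]a]")
not_close_or_a = re.compile(r"[^]a]")

# '-' placed first or last is a literal member rather than a range operator.
dash_first = re.compile(r"[-ab]")
dash_last = re.compile(r"[ab-]")


def matches(pattern, ch):
    """Return True if the compiled bracket expression matches the character."""
    return pattern.fullmatch(ch) is not None
```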
|  | .Pp | 
|  | Within a bracket expression, a collating element | 
|  | (a character, | 
|  | a multi-character sequence that collates as if it were a single character, | 
|  | or a collating-sequence name for either) | 
|  | enclosed in | 
|  | .Sq [. | 
|  | and | 
|  | .Sq .] | 
|  | stands for the sequence of characters of that collating element. | 
|  | The sequence is a single element of the bracket expression's list. | 
|  | A bracket expression containing a multi-character collating element | 
|  | can thus match more than one character, | 
|  | e.g. if the collating sequence includes a | 
|  | .Sq ch | 
|  | collating element, | 
|  | then the RE | 
|  | .Sq [[.ch.]]*c | 
|  | matches the first five characters of | 
|  | .Sq chchcc . | 
|  | .Pp | 
|  | Within a bracket expression, a collating element enclosed in | 
|  | .Sq [= | 
|  | and | 
|  | .Sq =] | 
|  | is an equivalence class, standing for the sequences of characters | 
|  | of all collating elements equivalent to that one, including itself. | 
|  | (If there are no other equivalent collating elements, | 
|  | the treatment is as if the enclosing delimiters were | 
|  | .Sq [. | 
|  | and | 
|  | .Sq .] . ) | 
|  | For example, if | 
|  | .Sq x | 
|  | and | 
|  | .Sq y | 
|  | are the members of an equivalence class, | 
|  | then | 
|  | .Sq [[=x=]] , | 
|  | .Sq [[=y=]] , | 
|  | and | 
|  | .Sq [xy] | 
|  | are all synonymous. | 
|  | An equivalence class may not** be an endpoint of a range. | 
|  | .Pp | 
|  | Within a bracket expression, the name of a | 
|  | .Em character class | 
|  | enclosed | 
|  | in | 
|  | .Sq [: | 
|  | and | 
|  | .Sq :] | 
|  | stands for the list of all characters belonging to that class. | 
|  | Standard character class names are: | 
|  | .Bd -literal -offset indent | 
|  | alnum	digit	punct | 
|  | alpha	graph	space | 
|  | blank	lower	upper | 
|  | cntrl	print	xdigit | 
|  | .Ed | 
|  | .Pp | 
|  | These stand for the character classes defined in | 
|  | .Xr ctype 3 . | 
|  | A locale may provide others. | 
|  | A character class may not be used as an endpoint of a range. | 
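As a rough illustration of what the standard class names stand for, the sketch below tabulates their members in Python. It assumes the C/POSIX locale and the ASCII character set only; other locales may assign different members. The names CHAR_CLASSES and in_class are hypothetical and exist only for this example.

```python
import string


def _chars(lo, hi):
    """Return the set of characters with codes lo through hi inclusive."""
    return frozenset(chr(code) for code in range(lo, hi + 1))


# Members of each standard class, assuming the C/POSIX locale and ASCII.
CHAR_CLASSES = {
    "alnum": frozenset(string.ascii_letters + string.digits),
    "alpha": frozenset(string.ascii_letters),
    "blank": frozenset(" \t"),
    "cntrl": _chars(0, 31) | {chr(127)},
    "digit": frozenset(string.digits),
    "graph": _chars(33, 126),
    "lower": frozenset(string.ascii_lowercase),
    "print": _chars(32, 126),
    "punct": frozenset(string.punctuation),
    "space": frozenset(" \t\n\r\v\f"),
    "upper": frozenset(string.ascii_uppercase),
    "xdigit": frozenset(string.hexdigits),
}


def in_class(name, ch):
    """Return True if the single character ch belongs to the named class."""
    return ch in CHAR_CLASSES[name]
```

Under these assumptions the underscore belongs to punct rather than alnum, which is why the definition of a word character elsewhere in this page has to mention it separately.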
|  | .Pp | 
|  | There are two special cases** of bracket expressions: | 
|  | the bracket expressions | 
|  | .Sq [[:<:]] | 
|  | and | 
|  | .Sq [[:>:]] | 
|  | match the null string at the beginning and end of a word, respectively. | 
|  | A word is defined as a sequence of | 
|  | characters starting and ending with a word character | 
|  | which is neither preceded nor followed by | 
|  | word characters. | 
|  | A word character is an | 
|  | .Em alnum | 
|  | character (as defined by | 
|  | .Xr ctype 3 ) | 
|  | or an underscore. | 
|  | This is an extension, | 
|  | compatible with but not specified by POSIX, | 
|  | and should be used with | 
|  | caution in software intended to be portable to other systems. | 
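The definition of a word can be restated as a small Python sketch that lists the offsets at which the two word-boundary expressions would match the null string. It does not call any regular expression library. The function names are hypothetical, and Python's str.isalnum is Unicode-aware, so it only approximates the locale-dependent alnum class from ctype(3).

```python
def is_word_char(ch):
    """A word character is an alphanumeric character or an underscore."""
    return ch.isalnum() or ch == "_"


def word_boundaries(text):
    """Return (starts, ends): offsets where [[:<:]] and [[:>:]] would match."""
    starts, ends = [], []
    for i in range(len(text) + 1):
        before = i > 0 and is_word_char(text[i - 1])
        after = i < len(text) and is_word_char(text[i])
        if after and not before:
            starts.append(i)   # null string at the beginning of a word
        if before and not after:
            ends.append(i)     # null string at the end of a word
    return starts, ends
```

For the text "foo bar_1!" this reports word starts at offsets 0 and 4 and word ends at offsets 3 and 9.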
|  | .Pp | 
|  | In the event that an RE could match more than one substring of a given | 
|  | string, | 
|  | the RE matches the one starting earliest in the string. | 
|  | If the RE could match more than one substring starting at that point, | 
|  | it matches the longest. | 
|  | Subexpressions also match the longest possible substrings, subject to | 
|  | the constraint that the whole match be as long as possible, | 
|  | with subexpressions starting earlier in the RE taking priority over | 
|  | ones starting later. | 
|  | Note that higher-level subexpressions thus take priority over | 
|  | their lower-level component subexpressions. | 
|  | .Pp | 
|  | Match lengths are measured in characters, not collating elements. | 
|  | A null string is considered longer than no match at all. | 
|  | For example, | 
|  | .Sq bb* | 
|  | matches the three middle characters of | 
|  | .Sq abbbc ; | 
|  | .Sq (wee|week)(knights|nights) | 
|  | matches all ten characters of | 
|  | .Sq weeknights ; | 
|  | when | 
|  | .Sq (.*).* | 
|  | is matched against | 
|  | .Sq abc , | 
|  | the parenthesized subexpression matches all three characters; | 
|  | and when | 
|  | .Sq (a*)* | 
|  | is matched against | 
|  | .Sq bc , | 
|  | both the whole RE and the parenthesized subexpression match the null string. | 
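The four examples above can be reproduced with the following Python sketch. Python's re module is a backtracking matcher rather than a POSIX leftmost-longest one, so it is only a stand-in: for these particular patterns the overall matches are assumed to coincide with the description above, but that is not true of patterns in general, and the way 'weeknights' is divided between the two subexpressions is not guaranteed to follow the priority rule stated above.

```python
import re

# 'bb*' finds the earliest start, then extends as far as possible.
m1 = re.search(r"bb*", "abbbc")

# The overall match covers all ten characters of 'weeknights'.
m2 = re.search(r"(wee|week)(knights|nights)", "weeknights")

# The parenthesized subexpression takes all three characters.
m3 = re.search(r"(.*).*", "abc")

# A null match at offset 0 is preferred over no match at all.
m4 = re.search(r"(a*)*", "bc")

print(m1.span(), m2.group(0), m3.group(1), repr(m4.group(0)))
```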
|  | .Pp | 
|  | If case-independent matching is specified, | 
|  | the effect is much as if all case distinctions had vanished from the | 
|  | alphabet. | 
|  | When an alphabetic character that exists in multiple cases appears as an | 
|  | ordinary character outside a bracket expression, it is effectively | 
|  | transformed into a bracket expression containing both cases, | 
|  | e.g.\& | 
|  | .Sq x | 
|  | becomes | 
|  | .Sq [xX] . | 
|  | When it appears inside a bracket expression, | 
|  | all case counterparts of it are added to the bracket expression, | 
|  | so that, for example, | 
|  | .Sq [x] | 
|  | becomes | 
|  | .Sq [xX] | 
|  | and | 
|  | .Sq [^x] | 
|  | becomes | 
|  | .Sq [^xX] . | 
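The rewriting described above can be sketched in Python. The helper expand_case is hypothetical and handles only single characters whose upper and lower case forms are themselves single characters (plain ASCII letters, for example). The re.IGNORECASE flag of Python's re module, which is not a POSIX matcher, is used as a stand-in for case-independent matching.

```python
import re


def expand_case(ch):
    """Rewrite an ordinary alphabetic character as a bracket expression.

    Characters without distinct cases are returned unchanged.
    """
    lower, upper = ch.lower(), ch.upper()
    if lower == upper:
        return ch
    return "[" + lower + upper + "]"


# 'x' matched case-independently behaves like the bracket expression '[xX]'.
assert re.fullmatch("x", "X", re.IGNORECASE)
assert re.fullmatch(expand_case("x"), "X")

# '[^x]' behaves like '[^xX]', so it rejects both 'x' and 'X'.
assert re.fullmatch("[^x]", "X", re.IGNORECASE) is None
```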
|  | .Pp | 
|  | No particular limit is imposed on the length of REs**. | 
|  | Programs intended to be portable should not employ REs longer | 
|  | than 256 bytes, | 
|  | as an implementation can refuse to accept such REs and remain | 
|  | POSIX-compliant. | 
|  | .Pp | 
|  | The following is a summary of extended regular expression syntax: | 
|  | .Bl -tag -width Ds | 
|  | .It Ar c | 
|  | Any character | 
|  | .Ar c | 
|  | not listed below matches itself. | 
|  | .It \e Ns Ar c | 
|  | Any backslash-escaped character | 
|  | .Ar c | 
|  | matches itself. | 
|  | .It \&. | 
|  | Matches any single character that is not a newline | 
|  | .Pq Sq \en . | 
|  | .It Bq Ar char-class | 
|  | Matches any single character in | 
|  | .Ar char-class . | 
|  | To include a | 
|  | .Ql \&] | 
|  | in | 
|  | .Ar char-class , | 
|  | it must be the first character. | 
|  | A range of characters may be specified by separating the end characters | 
|  | of the range with a | 
|  | .Ql - ; | 
|  | e.g.\& | 
|  | .Ar a-z | 
|  | specifies the lower case characters. | 
|  | The following literal expressions can also be used in | 
|  | .Ar char-class | 
|  | to specify sets of characters: | 
|  | .Bd -unfilled -offset indent | 
|  | [:alnum:] [:cntrl:] [:lower:] [:space:] | 
|  | [:alpha:] [:digit:] [:print:] [:upper:] | 
|  | [:blank:] [:graph:] [:punct:] [:xdigit:] | 
|  | .Ed | 
|  | .Pp | 
|  | If | 
|  | .Ql - | 
|  | appears as the first or last character of | 
|  | .Ar char-class , | 
|  | then it matches itself. | 
|  | All other characters in | 
|  | .Ar char-class | 
|  | match themselves. | 
|  | .Pp | 
|  | Patterns in | 
|  | .Ar char-class | 
|  | of the form | 
|  | .Eo [. | 
|  | .Ar col-elm | 
|  | .Ec .]\& | 
|  | or | 
|  | .Eo [= | 
|  | .Ar col-elm | 
|  | .Ec =]\& , | 
|  | where | 
|  | .Ar col-elm | 
|  | is a collating element, are interpreted according to | 
|  | .Xr setlocale 3 | 
|  | .Pq not currently supported . | 
|  | .It Bq ^ Ns Ar char-class | 
|  | Matches any single character, other than newline, not in | 
|  | .Ar char-class . | 
|  | .Ar char-class | 
|  | is defined as above. | 
|  | .It ^ | 
|  | If | 
|  | .Sq ^ | 
|  | is the first character of a regular expression, then it | 
|  | anchors the regular expression to the beginning of a line. | 
|  | Otherwise, it matches itself. | 
|  | .It $ | 
|  | If | 
|  | .Sq $ | 
|  | is the last character of a regular expression, | 
|  | it anchors the regular expression to the end of a line. | 
|  | Otherwise, it matches itself. | 
|  | .It [[:<:]] | 
|  | Anchors the single character regular expression or subexpression | 
|  | immediately following it to the beginning of a word. | 
|  | .It [[:>:]] | 
|  | Anchors the single character regular expression or subexpression | 
|  | immediately preceding it to the end of a word. | 
|  | .It Pq Ar re | 
|  | Defines a subexpression | 
|  | .Ar re . | 
|  | Any set of characters enclosed in parentheses | 
|  | matches whatever the set of characters without parentheses matches | 
|  | (in other words, the constructs | 
|  | .Sq (re) | 
|  | and | 
|  | .Sq re | 
|  | match identically). | 
|  | .It * | 
|  | Matches the single character regular expression or subexpression | 
|  | immediately preceding it zero or more times. | 
|  | If | 
|  | .Sq * | 
|  | is the first character of a regular expression or subexpression, | 
|  | then it matches itself. | 
|  | The | 
|  | .Sq * | 
|  | operator sometimes yields unexpected results. | 
|  | For example, the regular expression | 
|  | .Ar b* | 
|  | matches the beginning of the string | 
|  | .Qq abbb | 
|  | (as opposed to the substring | 
|  | .Qq bbb ) , | 
|  | since a null match is the only leftmost match. | 
|  | .It + | 
|  | Matches the single character regular expression | 
|  | or subexpression immediately preceding it | 
|  | one or more times. | 
|  | .It ? | 
|  | Matches the single character regular expression | 
|  | or subexpression immediately preceding it | 
|  | zero or one times. | 
|  | .Sm off | 
|  | .It Xo | 
|  | .Pf { Ar n , m No }\ \& | 
|  | .Pf { Ar n , No }\ \& | 
|  | .Pf { Ar n No } | 
|  | .Xc | 
|  | .Sm on | 
|  | Matches the single character regular expression or subexpression | 
|  | immediately preceding it at least | 
|  | .Ar n | 
|  | and at most | 
|  | .Ar m | 
|  | times. | 
|  | If | 
|  | .Ar m | 
|  | is omitted, then it matches at least | 
|  | .Ar n | 
|  | times. | 
|  | If the comma is also omitted, then it matches exactly | 
|  | .Ar n | 
|  | times. | 
|  | .It \*(Ba | 
|  | Used to separate patterns. | 
|  | For example, | 
|  | the pattern | 
|  | .Sq cat\*(Badog | 
|  | matches either | 
|  | .Sq cat | 
|  | or | 
|  | .Sq dog . | 
|  | .El | 
|  | .Sh BASIC REGULAR EXPRESSIONS | 
|  | Basic regular expressions differ in several respects: | 
|  | .Bl -bullet -offset 3n | 
|  | .It | 
|  | .Sq \*(Ba , | 
|  | .Sq + , | 
|  | and | 
|  | .Sq ?\& | 
|  | are ordinary characters and there is no equivalent | 
|  | for their functionality. | 
|  | .It | 
|  | The delimiters for bounds are | 
|  | .Sq \e{ | 
|  | and | 
|  | .Sq \e} , | 
|  | with | 
|  | .Sq { | 
|  | and | 
|  | .Sq } | 
|  | by themselves ordinary characters. | 
|  | .It | 
|  | The parentheses for nested subexpressions are | 
|  | .Sq \e( | 
|  | and | 
|  | .Sq \e) , | 
|  | with | 
|  | .Sq ( | 
|  | and | 
|  | .Sq )\& | 
|  | by themselves ordinary characters. | 
|  | .It | 
|  | .Sq ^ | 
|  | is an ordinary character except at the beginning of the | 
|  | RE or** the beginning of a parenthesized subexpression. | 
|  | .It | 
|  | .Sq $ | 
|  | is an ordinary character except at the end of the | 
|  | RE or** the end of a parenthesized subexpression. | 
|  | .It | 
|  | .Sq * | 
|  | is an ordinary character if it appears at the beginning of the | 
|  | RE or the beginning of a parenthesized subexpression | 
|  | (after a possible leading | 
|  | .Sq ^ ) . | 
|  | .It | 
|  | Finally, there is one new type of atom, a | 
|  | .Em back-reference : | 
|  | .Sq \e | 
|  | followed by a non-zero decimal digit | 
|  | .Ar d | 
|  | matches the same sequence of characters matched by the | 
|  | .Ar d Ns th | 
|  | parenthesized subexpression | 
|  | (numbering subexpressions by the positions of their opening parentheses, | 
|  | left to right), | 
|  | so that, for example, | 
|  | .Sq \e([bc]\e)\e1 | 
|  | matches | 
|  | .Sq bb\& | 
|  | or | 
|  | .Sq cc | 
|  | but not | 
|  | .Sq bc . | 
|  | .El | 
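The syntactic differences listed above can be summarised as a small translation routine. The Python sketch below is illustrative only and is not part of any library: bre_to_ere and _bracket_end are hypothetical names. It swaps the escaping of parentheses and braces, escapes the characters that are ordinary in a BRE but special in an ERE, and copies bracket expressions unchanged. It does not handle the context-dependent treatment of '^', '$' and a leading '*', and it passes back-references through even though they are described above as a BRE-only atom.

```python
def _bracket_end(s, start):
    """Return the offset just past the bracket expression starting at start."""
    i = start + 1
    if i < len(s) and s[i] == "^":
        i += 1
    if i < len(s) and s[i] == "]":
        i += 1                                # a leading ']' is literal
    while i < len(s):
        if s[i] == "[" and i + 1 < len(s) and s[i + 1] in ".=:":
            close = s.find(s[i + 1] + "]", i + 2)
            if close == -1:
                raise ValueError("unterminated bracket sub-expression")
            i = close + 2                     # skip [. .], [= =] or [: :]
        elif s[i] == "]":
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated bracket expression")


def bre_to_ere(bre):
    """Translate a simple BRE into ERE syntax (sketch; see limitations above)."""
    out = []
    i = 0
    while i < len(bre):
        ch = bre[i]
        if ch == "\\" and i + 1 < len(bre):
            nxt = bre[i + 1]
            # \( \) \{ \} become bare operators; other escapes pass through.
            out.append(nxt if nxt in "(){}" else ch + nxt)
            i += 2
        elif ch == "[":
            end = _bracket_end(bre, i)
            out.append(bre[i:end])            # bracket expressions are shared
            i = end
        elif ch in "|+?(){}":
            out.append("\\" + ch)             # ordinary in a BRE, special in an ERE
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Under these assumptions the BRE for a bound, written with escaped braces, comes out with bare braces, and a literal '+' in a BRE gains a backslash in the ERE.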
|  | .Pp | 
|  | The following is a summary of basic regular expression syntax: | 
|  | .Bl -tag -width Ds | 
|  | .It Ar c | 
|  | Any character | 
|  | .Ar c | 
|  | not listed below matches itself. | 
|  | .It \e Ns Ar c | 
|  | Any backslash-escaped character | 
|  | .Ar c , | 
|  | except for | 
|  | .Sq { , | 
|  | .Sq } , | 
|  | .Sq \&( , | 
|  | .Sq \&) , | 
|  | and a non-zero decimal digit | 
|  | (which forms a back-reference), | 
|  | matches itself. | 
|  | .It \&. | 
|  | Matches any single character that is not a newline | 
|  | .Pq Sq \en . | 
|  | .It Bq Ar char-class | 
|  | Matches any single character in | 
|  | .Ar char-class . | 
|  | To include a | 
|  | .Ql \&] | 
|  | in | 
|  | .Ar char-class , | 
|  | it must be the first character. | 
|  | A range of characters may be specified by separating the end characters | 
|  | of the range with a | 
|  | .Ql - ; | 
|  | e.g.\& | 
|  | .Ar a-z | 
|  | specifies the lower case characters. | 
|  | The following literal expressions can also be used in | 
|  | .Ar char-class | 
|  | to specify sets of characters: | 
|  | .Bd -unfilled -offset indent | 
|  | [:alnum:] [:cntrl:] [:lower:] [:space:] | 
|  | [:alpha:] [:digit:] [:print:] [:upper:] | 
|  | [:blank:] [:graph:] [:punct:] [:xdigit:] | 
|  | .Ed | 
|  | .Pp | 
|  | If | 
|  | .Ql - | 
|  | appears as the first or last character of | 
|  | .Ar char-class , | 
|  | then it matches itself. | 
|  | All other characters in | 
|  | .Ar char-class | 
|  | match themselves. | 
|  | .Pp | 
|  | Patterns in | 
|  | .Ar char-class | 
|  | of the form | 
|  | .Eo [. | 
|  | .Ar col-elm | 
|  | .Ec .]\& | 
|  | or | 
|  | .Eo [= | 
|  | .Ar col-elm | 
|  | .Ec =]\& , | 
|  | where | 
|  | .Ar col-elm | 
|  | is a collating element, are interpreted according to | 
|  | .Xr setlocale 3 | 
|  | .Pq not currently supported . | 
|  | .It Bq ^ Ns Ar char-class | 
|  | Matches any single character, other than newline, not in | 
|  | .Ar char-class . | 
|  | .Ar char-class | 
|  | is defined as above. | 
|  | .It ^ | 
|  | If | 
|  | .Sq ^ | 
|  | is the first character of a regular expression, then it | 
|  | anchors the regular expression to the beginning of a line. | 
|  | Otherwise, it matches itself. | 
|  | .It $ | 
|  | If | 
|  | .Sq $ | 
|  | is the last character of a regular expression, | 
|  | it anchors the regular expression to the end of a line. | 
|  | Otherwise, it matches itself. | 
|  | .It [[:<:]] | 
|  | Anchors the single character regular expression or subexpression | 
|  | immediately following it to the beginning of a word. | 
|  | .It [[:>:]] | 
|  | Anchors the single character regular expression or subexpression | 
|  | immediately preceding it to the end of a word. | 
|  | .It \e( Ns Ar re Ns \e) | 
|  | Defines a subexpression | 
|  | .Ar re . | 
|  | Subexpressions may be nested. | 
|  | A subsequent backreference of the form | 
|  | .Pf \e Ns Ar n , | 
|  | where | 
|  | .Ar n | 
|  | is a number in the range [1,9], expands to the text matched by the | 
|  | .Ar n Ns th | 
|  | subexpression. | 
|  | For example, the regular expression | 
|  | .Ar \e(.*\e)\e1 | 
|  | matches any string consisting of identical adjacent substrings. | 
|  | Subexpressions are ordered relative to their left delimiter. | 
|  | .It * | 
|  | Matches the single character regular expression or subexpression | 
|  | immediately preceding it zero or more times. | 
|  | If | 
|  | .Sq * | 
|  | is the first character of a regular expression or subexpression, | 
|  | then it matches itself. | 
|  | The | 
|  | .Sq * | 
|  | operator sometimes yields unexpected results. | 
|  | For example, the regular expression | 
|  | .Ar b* | 
|  | matches the beginning of the string | 
|  | .Qq abbb | 
|  | (as opposed to the substring | 
|  | .Qq bbb ) , | 
|  | since a null match is the only leftmost match. | 
|  | .Sm off | 
|  | .It Xo | 
|  | .Pf \e{ Ar n , m No \e}\ \& | 
|  | .Pf \e{ Ar n , No \e}\ \& | 
|  | .Pf \e{ Ar n No \e} | 
|  | .Xc | 
|  | .Sm on | 
|  | Matches the single character regular expression or subexpression | 
|  | immediately preceding it at least | 
|  | .Ar n | 
|  | and at most | 
|  | .Ar m | 
|  | times. | 
|  | If | 
|  | .Ar m | 
|  | is omitted, then it matches at least | 
|  | .Ar n | 
|  | times. | 
|  | If the comma is also omitted, then it matches exactly | 
|  | .Ar n | 
|  | times. | 
|  | .El | 
|  | .Sh SEE ALSO | 
|  | .Xr ctype 3 , | 
|  | .Xr regex 3 | 
|  | .Sh STANDARDS | 
|  | .St -p1003.1-2004 : | 
|  | Base Definitions, Chapter 9 (Regular Expressions). | 
|  | .Sh BUGS | 
|  | Having two kinds of REs is a botch. | 
|  | .Pp | 
|  | The current POSIX spec says that | 
|  | .Sq )\& | 
|  | is an ordinary character in the absence of an unmatched | 
|  | .Sq ( ; | 
|  | this was an unintentional result of a wording error, | 
|  | and change is likely. | 
|  | Avoid relying on it. | 
|  | .Pp | 
|  | Back-references are a dreadful botch, | 
|  | posing major problems for efficient implementations. | 
|  | They are also somewhat vaguely defined | 
|  | (does | 
|  | .Sq a\e(\e(b\e)*\e2\e)*d | 
|  | match | 
|  | .Sq abbbd ? ) . | 
|  | Avoid using them. | 
|  | .Pp | 
|  | POSIX's specification of case-independent matching is vague. | 
|  | The | 
|  | .Dq one case implies all cases | 
|  | definition given above | 
|  | is the current consensus among implementors as to the right interpretation. | 
|  | .Pp | 
|  | The syntax for word boundaries is incredibly ugly. |