.TH PCRE2_COMPILE 3 "23 May 2019" "PCRE2 10.34"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH SYNOPSIS
.rs
.sp
.B #include <pcre2.h>
.PP
.nf
.B pcre2_code *pcre2_compile(PCRE2_SPTR \fIpattern\fP, PCRE2_SIZE \fIlength\fP,
.B "  uint32_t \fIoptions\fP, int *\fIerrorcode\fP, PCRE2_SIZE *\fIerroroffset,\fP"
.B "  pcre2_compile_context *\fIccontext\fP);"
.fi
.
.SH DESCRIPTION
.rs
.sp
This function compiles a regular expression pattern into an internal form. Its
arguments are:
.sp
  \fIpattern\fP      A string containing the expression to be compiled
  \fIlength\fP       The length of the string or PCRE2_ZERO_TERMINATED
  \fIoptions\fP      Option bits
  \fIerrorcode\fP    Where to put an error code
  \fIerroroffset\fP  Where to put an error offset
  \fIccontext\fP     Pointer to a compile context or NULL
.sp
The length of the pattern and any error offset that is returned are in code
units, not characters. A compile context is needed only if you want to provide
custom memory allocation functions, or to provide an external function for
system stack size checking, or to change one or more of these parameters:
.sp
  What \eR matches (Unicode newlines, or CR, LF, CRLF only);
  PCRE2's character tables;
  The newline character sequence;
  The compile-time nested parentheses limit;
  The maximum pattern length (in code units) that is allowed;
  The additional option bits (see pcre2_set_compile_extra_options()).
.sp
The option bits are:
.sp
  PCRE2_ANCHORED             Force pattern anchoring
  PCRE2_ALLOW_EMPTY_CLASS    Allow empty classes
  PCRE2_ALT_BSUX             Alternative handling of \eu, \eU, and \ex
  PCRE2_ALT_CIRCUMFLEX       Alternative handling of ^ in multiline mode
  PCRE2_ALT_VERBNAMES        Process backslashes in verb names
  PCRE2_AUTO_CALLOUT         Compile automatic callouts
  PCRE2_CASELESS             Do caseless matching
  PCRE2_DOLLAR_ENDONLY       $ does not match before a final newline
  PCRE2_DOTALL               . matches anything including newlines
  PCRE2_DUPNAMES             Allow duplicate names for subpatterns
  PCRE2_ENDANCHORED          Pattern can match only at end of subject
  PCRE2_EXTENDED             Ignore white space and # comments
  PCRE2_FIRSTLINE            Force matching to be before newline
  PCRE2_LITERAL              Pattern characters are all literal
  PCRE2_MATCH_INVALID_UTF    Enable support for matching invalid UTF
  PCRE2_MATCH_UNSET_BACKREF  Match unset backreferences
  PCRE2_MULTILINE            ^ and $ match newlines within data
  PCRE2_NEVER_BACKSLASH_C    Lock out the use of \eC in patterns
  PCRE2_NEVER_UCP            Lock out PCRE2_UCP, e.g. via (*UCP)
  PCRE2_NEVER_UTF            Lock out PCRE2_UTF, e.g. via (*UTF)
  PCRE2_NO_AUTO_CAPTURE      Disable numbered capturing parentheses
                               (named ones available)
  PCRE2_NO_AUTO_POSSESS      Disable auto-possessification
  PCRE2_NO_DOTSTAR_ANCHOR    Disable automatic anchoring for .*
  PCRE2_NO_START_OPTIMIZE    Disable match-time start optimizations
  PCRE2_NO_UTF_CHECK         Do not check the pattern for UTF validity
                               (only relevant if PCRE2_UTF is set)
  PCRE2_UCP                  Use Unicode properties for \ed, \ew, etc.
  PCRE2_UNGREEDY             Invert greediness of quantifiers
  PCRE2_USE_OFFSET_LIMIT     Enable offset limit for unanchored matching
  PCRE2_UTF                  Treat pattern and subjects as UTF strings
.sp
PCRE2 must be built with Unicode support (the default) in order to use
PCRE2_UTF, PCRE2_UCP and related options.
.P
Additional options may be set in the compile context via the
.\" HREF
\fBpcre2_set_compile_extra_options\fP
.\"
function.
.P
The yield of this function is a pointer to a private data structure that
contains the compiled pattern, or NULL if an error was detected.
.P
There is a complete description of the PCRE2 native API, with more detail on
each option, in the
.\" HREF
\fBpcre2api\fP
.\"
page, and a description of the POSIX API in the
.\" HREF
\fBpcre2posix\fP
.\"
page.