.TH PCRE2LIMITS 3 "05 November 2015" "PCRE2 10.21"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "SIZE AND OTHER LIMITATIONS"
.rs
.sp
There are some size limitations in PCRE2 but it is hoped that they will never
in practice be relevant.
.P
The maximum size of a compiled pattern is approximately 64K code units for the
8-bit and 16-bit libraries if PCRE2 is compiled with the default internal
linkage size, which is 2 bytes for these libraries. If you want to process
regular expressions that are truly enormous, you can compile PCRE2 with an
internal linkage size of 3 or 4 (when building the 16-bit library, 3 is rounded
up to 4). See the \fBREADME\fP file in the source distribution and the
.\" HREF
\fBpcre2build\fP
.\"
documentation for details. In these cases the limit is substantially larger.
However, the speed of execution is slower. In the 32-bit library, the internal
linkage size is always 4.
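.P
As a rough illustration of why the default limit is about 64K, the following C
sketch counts the distinct values that an internal link of a given byte width
can hold. The function name \fBlink_value_count()\fP is hypothetical and is not
part of PCRE2. It shows only the bit-width arithmetic; the limit actually
enforced for larger linkage sizes may be lower than this bound.
.sp
.nf
```c
#include <stdint.h>

/* Hypothetical helper, not a PCRE2 function: the number of distinct
   values that fit in an internal link of link_bytes bytes (1 to 4). */
uint64_t link_value_count(unsigned link_bytes)
{
    return (uint64_t)1 << (8 * link_bytes);
}
```
.fi
.P
Under these assumptions, link_value_count(2) is 65536, which matches the 64K
figure for the default linkage size, and link_value_count(3) is 16777216.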
.P
The maximum length of a source pattern string is essentially unlimited; it is
the largest number a PCRE2_SIZE variable can hold. However, the program that
calls \fBpcre2_compile()\fP can specify a smaller limit.
.P
The maximum length (in code units) of a subject string is one less than the
largest number a PCRE2_SIZE variable can hold. PCRE2_SIZE is an unsigned
integer type, usually defined as size_t. Its maximum value (that is
~(PCRE2_SIZE)0) is reserved as a special indicator for zero-terminated strings
and unset offsets.
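.P
The reserved maximum value can be sketched in C as follows. The sketch assumes
the usual case in which PCRE2_SIZE is defined as size_t. The names
\fBsize_type\fP, \fBreserved_value()\fP, and \fBmax_subject_length()\fP are
hypothetical stand-ins for illustration and are not PCRE2 identifiers.
.sp
.nf
```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stands in for PCRE2_SIZE where that is defined as size_t. */
typedef size_t size_type;

/* The all-ones value, reserved as an indicator and never a valid length. */
size_type reserved_value(void)
{
    return ~(size_type)0;
}

/* Longest subject, in code units: one less than the reserved value. */
size_type max_subject_length(void)
{
    return reserved_value() - 1;
}
```
.fi
.P
When PCRE2_SIZE is size_t, the reserved value equals SIZE_MAX. In the PCRE2
headers the same expression is what the PCRE2_ZERO_TERMINATED and PCRE2_UNSET
macros expand to.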
.P
Note that when using the traditional matching function, PCRE2 uses recursion to
handle subpatterns and indefinite repetition. This means that the available
stack space may limit the size of a subject string that can be processed by
certain patterns. For a discussion of stack issues, see the
.\" HREF
\fBpcre2stack\fP
.\"
documentation.
.P
All values in repeating quantifiers must be less than 65536.
.P
The maximum length of a lookbehind assertion is 65535 characters.
.P
There is no limit to the number of parenthesized subpatterns, but there can be
no more than 65535 capturing subpatterns. There is, however, a limit to the
depth of nesting of parenthesized subpatterns of all kinds. This is imposed in
order to limit the amount of system stack used at compile time. The limit can
be specified when PCRE2 is built; the default is 250.
.P
There is a limit to the number of forward references to subsequent subpatterns
of around 200,000. Repeated forward references with fixed upper limits, for
example, (?2){0,100} when subpattern number 2 is to the right, are included in
the count. There is no limit to the number of backward references.
.P
The maximum length of a name for a named subpattern is 32 code units, and the
maximum number of named subpatterns is 10000.
.P
The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
is 255 for the 8-bit library and 65535 for the 16-bit and 32-bit libraries.
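.P
A program that generates patterns could check verb names against these limits
before compiling. The following is a minimal sketch only. The helper
\fBmax_verb_name_length()\fP is hypothetical, is not a PCRE2 function, and
simply encodes the two figures stated above, keyed by the code unit width in
bits.
.sp
.nf
```c
/* Hypothetical helper, not a PCRE2 function: the longest name allowed in
   a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb, per the limits stated
   above, for a code unit width of 8, 16, or 32 bits. Returns 0 for any
   other width. */
unsigned long max_verb_name_length(int code_unit_width)
{
    switch (code_unit_width) {
    case 8:
        return 255;
    case 16:
    case 32:
        return 65535;
    default:
        return 0;
    }
}
```
.fi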
.
.
.SH AUTHOR
.rs
.sp
.nf
Philip Hazel
University Computing Service
Cambridge, England.
.fi
.
.
.SH REVISION
.rs
.sp
.nf
Last updated: 05 November 2015
Copyright (c) 1997-2015 University of Cambridge.
.fi