News about PCRE2 releases
-------------------------


Version 10.39 29-October-2021
-----------------------------

This release is happening soon after 10.38 because the bug fix is important.

1. Fix incorrect detection of alternatives in first character search in JIT.

2. Update to Unicode 14.0.0.

3. Some code cleanups (see ChangeLog).


Version 10.38 01-October-2021
-----------------------------

As well as some bug fixes and tidies (as always, see ChangeLog for details),
the documentation is updated to list the new URLs, following the move of the
source repository to GitHub and the mailing list to Google Groups.

* The CMake build system can now build both static and shared libraries in one
go.

* Following Perl's lead, \K is now locked out in lookaround assertions by
default, but an option is provided to re-enable the previous behaviour.


Version 10.37 26-May-2021
-------------------------

A few more bug fixes and tidies. The only change of real note is the removal of
the actual POSIX names regcomp etc. from the POSIX wrapper library because
these have caused issues for some applications (see 10.33 #2 below).


Version 10.36 04-December-2020
------------------------------

Again, mainly bug fixes and tidies. The only enhancements are the addition of
GNU grep's -m (aka --max-count) option to pcre2grep, and also unifying the
handling of substitution strings for both -O and callouts in pcre2grep, with
the addition of $x{...} and $o{...} to allow for characters whose code points
are greater than 255 in Unicode mode.
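
The early-stop behaviour that a max-count option gives can be sketched in a
few lines. This is only an illustration in Python using the standard re
module, not pcre2grep's implementation; the function name grep_max_count and
the sample data are hypothetical.

```python
import re

def grep_max_count(lines, pattern, max_count):
    """Return at most max_count matching lines, stopping the scan early."""
    regex = re.compile(pattern)
    matched = []
    for line in lines:
        if len(matched) >= max_count:
            break  # like -m: stop reading once the limit has been reached
        if regex.search(line):
            matched.append(line)
    return matched

# Hypothetical sample input: three lines match, but only two are wanted.
sample = ["alpha 1", "beta", "alpha 2", "alpha 3"]
first_two = grep_max_count(sample, r"alpha", 2)
```

Here first_two holds the first two matching lines, and the last line is never
examined.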

NOTE: there is an outstanding issue with JIT support for MacOS on arm64
hardware. For details, please see Bugzilla issue #2618.


Version 10.35 15-April-2020
---------------------------

Bugfixes, tidies, and a few new enhancements.

1. Capturing groups that contain recursive backreferences to themselves are no
longer automatically atomic, because the restriction is no longer necessary
as a result of the 10.30 restructuring.

2. Several new options for pcre2_substitute().

3. When Unicode is supported and PCRE2_UCP is set without PCRE2_UTF, Unicode
character properties are used for upper/lower case computations on characters
whose code points are greater than 127.
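
The difference between ASCII-only and property-based case handling for code
points above 127 can be seen by analogy in Python's standard re module. This
is not PCRE2 code: bytes patterns stand in for matching without Unicode
properties, and str patterns for matching with them. The variable names are
illustrative only.

```python
import re

# Bytes patterns are ASCII-only for case purposes, so 0xC9 and 0xE9
# (E-acute and e-acute in Latin-1) are treated as unrelated values.
ascii_only = re.fullmatch(b"\xc9", b"\xe9", re.IGNORECASE)

# Str patterns consult Unicode case properties, so the same two code
# points match caselessly.
unicode_aware = re.fullmatch("\u00c9", "\u00e9", re.IGNORECASE)
```

In this sketch ascii_only is None while unicode_aware is a match object.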

4. The character tables (for low-valued characters) can now more easily be
saved and restored in binary.

5. Updated to Unicode 13.0.0.


Version 10.34 21-November-2019
------------------------------

Another release with a few enhancements as well as bugfixes and tidies. The
main new features are:

1. There is now some support for matching in invalid UTF strings.

2. Non-atomic positive lookarounds are implemented in the pcre2_match()
interpreter, but not in JIT.

3. Added two new functions: pcre2_get_match_data_size() and
pcre2_maketables_free().

4. Upgraded to Unicode 12.1.0.


Version 10.33 16-April-2019
---------------------------

Yet more bugfixes, tidies, and a few enhancements, summarized here (see
ChangeLog for the full list):

1. Callouts from pcre2_substitute() are now available.

2. The POSIX functions are now all called pcre2_regcomp() etc., with wrapper
functions that use the standard POSIX names. However, in pcre2posix.h the POSIX
names are defined as macros. This should help avoid linking with the wrong
library in some environments, while still exporting the POSIX names for
pre-existing programs that use them.

3. Some new options:

   (a) PCRE2_EXTRA_ESCAPED_CR_IS_LF makes \r behave as \n.

   (b) PCRE2_EXTRA_ALT_BSUX enables support for ECMAScript 6's \u{hh...}
       construct.

   (c) PCRE2_COPY_MATCHED_SUBJECT causes a copy of a matched subject to be
       made, instead of just remembering a pointer.

4. Some new Perl features:

   (a) Perl 5.28's experimental alphabetic names for atomic groups and
       lookaround assertions, for example, (*pla:...) and (*atomic:...).

   (b) The new Perl "script run" features (*script_run:...) and
       (*atomic_script_run:...) aka (*sr:...) and (*asr:...).

   (c) When PCRE2_UTF is set, allow non-ASCII letters and decimal digits in
       capture group names.

5. --disable-percent-zt disables the use of %zu and %td in formatting strings
in pcre2test. They were already automatically disabled for VC and older C
compilers.

6. Some changes related to callouts in pcre2grep:

   (a) Support for running an external program under VMS has been added, in
       addition to Windows and fork() support.

   (b) --disable-pcre2grep-callout-fork restricts the callout support in
       pcre2grep to the inbuilt echo facility.


Version 10.32 10-September-2018
-------------------------------

This is another mainly bugfix and tidying release with a few minor
enhancements. These are the main ones:

1. pcre2grep now supports the inclusion of binary zeros in patterns that are
read from files via the -f option.

2. ./configure now supports --enable-jit=auto, which automatically enables JIT
if the hardware supports it.

3. In pcre2_dfa_match(), internal recursive calls no longer use the stack for
local workspace and local ovectors. Instead, an initial block of stack is
reserved, but if this is insufficient, heap memory is used. The heap limit
parameter now applies to pcre2_dfa_match().

4. Updated to Unicode version 11.0.0.

5. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.

6. Added support for \N{U+dddd}, but only in Unicode mode.

7. Added support for (?^) to unset all imnsx options.


Version 10.31 12-February-2018
------------------------------

This is mainly a bugfix and tidying release (see ChangeLog for full details).
However, there are some minor enhancements.

1. New pcre2_config() options: PCRE2_CONFIG_NEVER_BACKSLASH_C and
PCRE2_CONFIG_COMPILED_WIDTHS.

2. New pcre2_pattern_info() option PCRE2_INFO_EXTRAOPTIONS to retrieve the
extra compile time options.

3. There are now public names for all the pcre2_compile() error numbers.

4. Added PCRE2_CALLOUT_STARTMATCH and PCRE2_CALLOUT_BACKTRACK bits to a new
field callout_flags in callout blocks.


Version 10.30 14-August-2017
----------------------------

The full list of changes that includes bugfixes and tidies is, as always, in
ChangeLog. These are the most important new features:

1. The main interpreter, pcre2_match(), has been refactored into a new version
that does not use recursive function calls (and therefore the system stack) for
remembering backtracking positions. This makes --disable-stack-for-recursion a
NOOP. The new implementation allows backtracking into recursive group calls in
patterns, making it more compatible with Perl, and also fixes some other
previously hard-to-do issues. For patterns that have a lot of backtracking, the
heap is now used, and there is an explicit limit on the amount, settable by
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). The "recursion limit" is retained,
but is renamed as "depth limit" (though the old names remain for
compatibility).

There is also a change in the way callouts from pcre2_match() are handled. The
offset_vector field in the callout block is no longer a pointer to the
actual ovector that was passed to the matching function in the match data
block. Instead it points to an internal ovector of a size large enough to hold
all possible captured substrings in the pattern.

2. The new option PCRE2_ENDANCHORED insists that a pattern match must end at
the end of the subject.
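
The effect of requiring a match to end at the end of the subject can be
illustrated by analogy. The sketch below is Python with the standard re
module, which has no such option, so the behaviour is emulated by appending
\Z to the pattern. The helper name match_end_anchored is hypothetical, and
the wrapping is assumed to be adequate only for simple patterns without
leading inline flags.

```python
import re

def match_end_anchored(pattern, subject):
    """Search for pattern, accepting only matches that end at the subject's end."""
    # The non-capturing group keeps top-level alternations intact; \Z
    # asserts the very end of the string in Python's re syntax.
    return re.search(r"(?:" + pattern + r")\Z", subject)

plain = re.search(r"\d+", "abc123def456")                # first run of digits
anchored = match_end_anchored(r"\d+", "abc123def456")    # must reach the end
rejected = match_end_anchored(r"\d+", "abc123def")       # digits not at the end
```

An ordinary search finds the first run of digits, the end-anchored search
skips it in favour of the run that finishes the subject, and a subject that
does not end in digits yields no match at all.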

3. The new option PCRE2_EXTENDED_MORE implements Perl's /xx feature, and
pcre2test is upgraded to support it. Setting within the pattern by (?xx) is
also supported.

4. (?n) can be used to set PCRE2_NO_AUTO_CAPTURE, because Perl now has this.

5. Additional compile options in the compile context are now available, and the
first two are: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

6. The newline type PCRE2_NEWLINE_NUL is now available.

7. The match limit value now also applies to pcre2_dfa_match() as there are
patterns that can use up a lot of resources without necessarily recursing very
deeply.

8. The option REG_PEND (a GNU extension) is now available for the POSIX
wrapper. Also there is a new option PCRE2_LITERAL which is used to support
REG_NOSPEC.

9. PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD are implemented for the
benefit of pcre2grep, and pcre2grep's -F, -w, and -x options are re-implemented
using PCRE2_LITERAL, PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This
is tidier and also fixes some bugs.

10. The Unicode tables are upgraded from Unicode 8.0.0 to Unicode 10.0.0.

11. There are some experimental functions for converting foreign patterns
(globs and POSIX patterns) into PCRE2 patterns.


Version 10.23 14-February-2017
------------------------------

1. ChangeLog has the details of a lot of bug fixes and tidies.

2. There has been a major re-factoring of the pcre2_compile.c file. Most syntax
checking is now done in the pre-pass that identifies capturing groups. This has
reduced the amount of duplication and made the code tidier. While doing this,
some minor bugs and Perl incompatibilities were fixed (see ChangeLog for
details).

3. Back references are now permitted in lookbehind assertions when there are
no duplicated group numbers (that is, (?| has not been used), and, if the
reference is by name, there is only one group of that name. The referenced
group must, of course, be of fixed length.

4. \g{+<number>} (e.g. \g{+2} ) is now supported. It is a "forward back
reference" and can be useful in repetitions (compare \g{-<number>} ). Perl does
not recognize this syntax.
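
The arithmetic behind relative references can be written out as a small
worked example. This Python sketch is not taken from the PCRE2 sources; the
function name and the groups_opened parameter are hypothetical, and it
assumes the usual rule that \g{-1} is the most recently opened capture group
before the reference and \g{+1} is the next one to be opened after it.

```python
def resolve_relative_reference(offset, groups_opened):
    r"""Turn a relative reference \g{offset} into an absolute group number.

    groups_opened is the number of capture groups whose opening
    parenthesis appears before the reference in the pattern.
    """
    if offset == 0:
        raise ValueError("relative offset must be non-zero")
    if offset > 0:
        # \g{+1} is the next group to be opened, \g{+2} the one after that.
        return groups_opened + offset
    # \g{-1} is the most recently opened group, \g{-2} the one before it.
    absolute = groups_opened + offset + 1
    if absolute < 1:
        raise ValueError("reference points before the first group")
    return absolute

# With three groups already opened before the reference:
forward = resolve_relative_reference(+2, 3)
backward = resolve_relative_reference(-1, 3)
```

With three groups opened, \g{+2} resolves to group 5 and \g{-1} to group 3.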

5. pcre2grep now automatically expands its buffer up to a maximum set by
--max-buffer-size.

6. The -t option (grand total) has been added to pcre2grep.

7. A new function called pcre2_code_copy_with_tables() exists to copy a
compiled pattern along with a private copy of the character tables that it
uses.

8. A user supplied a number of patches to upgrade pcre2grep under Windows and
tidy the code.

9. Several updates have been made to pcre2test and test scripts (see
ChangeLog).


Version 10.22 29-July-2016
--------------------------

1. ChangeLog has the details of a number of bug fixes.

2. The POSIX wrapper function regcomp() previously did not support back
references and subroutine calls if called with the REG_NOSUB option. It now
does.

3. A new function, pcre2_code_copy(), is added, to make a copy of a compiled
pattern.

4. Support for string callouts is added to pcre2grep.

5. Added the PCRE2_NO_JIT option to pcre2_match().

6. The pcre2_get_error_message() function now returns with a negative error
code if the error number it is given is unknown.

7. Several updates have been made to pcre2test and test scripts (see
ChangeLog).


Version 10.21 12-January-2016
-----------------------------

1. Many bugs have been fixed. A large number of them were provoked only by very
strange pattern input, and were discovered by fuzzers. Some others were
discovered by code auditing. See ChangeLog for details.

2. The Unicode tables have been updated to Unicode version 8.0.0.

3. For Perl compatibility in EBCDIC environments, ranges such as a-z in a
class, where both values are literal letters in the same case, omit the
non-letter EBCDIC code points within the range.

4. There have been a number of enhancements to the pcre2_substitute() function,
giving more flexibility to replacement facilities. It is now also possible to
cause the function to return the needed buffer size if the one given is too
small.

5. The PCRE2_ALT_VERBNAMES option causes the "name" parts of special verbs such
as (*THEN:name) to be processed for backslashes and to take note of
PCRE2_EXTENDED.

6. PCRE2_INFO_HASBACKSLASHC makes it possible for a client to find out if a
pattern uses \C, and --never-backslash-C makes it possible to compile a version
of PCRE2 in which the use of \C is always forbidden.

7. A limit to the length of pattern that can be handled can now be set by
calling pcre2_set_max_pattern_length().

8. When matching an unanchored pattern, a match can be required to begin within
a given number of code units after the start of the subject by calling
pcre2_set_offset_limit().
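
What an offset limit means for an unanchored search can be sketched by
analogy. The code below is Python with the standard re module, not the PCRE2
API: it tries an anchored match at each starting index up to the limit and
gives up after that. The helper name search_with_offset_limit is
hypothetical, and Python indexes str by code point rather than by code unit,
so the sketch only approximates the behaviour for simple patterns.

```python
import re

def search_with_offset_limit(pattern, subject, offset_limit):
    """Return the first match starting at an index <= offset_limit, else None."""
    regex = re.compile(pattern)
    last_start = min(offset_limit, len(subject))
    for start in range(last_start + 1):
        found = regex.match(subject, start)  # anchored attempt at this index
        if found:
            return found
    return None

# The digits begin at index 4, so a limit of 2 rejects them and 4 accepts.
too_late = search_with_offset_limit(r"\d+", "xxxx42", 2)
in_range = search_with_offset_limit(r"\d+", "xxxx42", 4)
```

The first call returns None because no match starts within the limit; the
second finds the digits starting at index 4.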

9. The pcre2test program has been extended to test new facilities, and it can
now run the tests when LF on its own is not a valid newline sequence.

10. The RunTest script has also been updated to enable more tests to be run.

11. There have been some minor performance enhancements.


Version 10.20 30-June-2015
--------------------------

1. Callouts with string arguments and the pcre2_callout_enumerate() function
have been implemented.

2. The PCRE2_NEVER_BACKSLASH_C option, which locks out the use of \C, is added.

3. The PCRE2_ALT_CIRCUMFLEX option lets ^ match after a newline at the end of a
subject in multiline mode.

4. The way named subpatterns are handled has been refactored. The previous
approach had several bugs.

5. The handling of \c in EBCDIC environments has been changed to conform to the
perlebcdic document. This is an incompatible change.

6. Bugs have been mended, many of them discovered by fuzzers.


Version 10.10 06-March-2015
---------------------------

1. Serialization and de-serialization functions have been added to the API,
making it possible to save and restore sets of compiled patterns, though
restoration must be done in the same environment that was used for compilation.

2. The (*NO_JIT) feature has been added; this makes it possible for a pattern
creator to specify that JIT is not to be used.

3. A number of bugs have been fixed. In particular, bugs that caused building
on Windows using CMake to fail have been mended.


Version 10.00 05-January-2015
-----------------------------

Version 10.00 is the first release of PCRE2, a revised API for the PCRE
library. Changes prior to 10.00 are logged in the ChangeLog file for the old
API, up to item 20 for release 8.36. New programs are recommended to use the
new library. Programs that use the original (PCRE1) API will need changing
before linking with the new library.

****