News about PCRE2 releases
-------------------------


Version 10.40 15-April-2022
---------------------------

This is mostly a bug-fixing and code-tidying release. However, there are some
extensions to Unicode property handling:

* Added support for Bidi_Class and a number of binary Unicode properties,
including Bidi_Control.

* A number of changes to script matching for \p and \P:

  (a) Script extensions for a character are now coded as a bitmap instead of
      a list of script numbers, which should be faster and does not need a
      loop.

  (b) Added the syntax \p{script:xxx} and \p{script_extensions:xxx} (synonyms
      sc and scx).

  (c) Changed \p{scriptname} from being the same as \p{sc:scriptname} to being
      the same as \p{scx:scriptname} because this change happened in Perl at
      release 5.26.

  (d) The standard Unicode 4-letter abbreviations for script names are now
      recognized.

  (e) In accordance with Unicode and Perl's "loose matching" rules, spaces,
      hyphens, and underscores are ignored in property names, which are then
      matched independent of case.
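
The loose matching rule in (e) can be sketched as follows. This is only an
illustration in Python, not the code that PCRE2 uses. The function names
loosely_normalize and same_property_name are hypothetical, and the sketch
covers only what the note above states: spaces, hyphens, and underscores are
dropped, and the remaining characters are compared caselessly.

```python
def loosely_normalize(name: str) -> str:
    """Drop spaces, hyphens, and underscores, then lower-case the rest."""
    return "".join(ch for ch in name if ch not in " -_").lower()


def same_property_name(a: str, b: str) -> bool:
    """Return True if two property names are equal under loose matching."""
    return loosely_normalize(a) == loosely_normalize(b)
```

Under this sketch, "Script_Extensions", "script extensions", and
"SCRIPT-EXTENSIONS" all normalize to the same string and so compare equal.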

As always, see ChangeLog for a list of all changes (also the Git log).


Version 10.39 29-October-2021
-----------------------------

This release is happening soon after 10.38 because the bug fix is important.

1. Fix incorrect detection of alternatives in first character search in JIT.

2. Update to Unicode 14.0.0.

3. Some code cleanups (see ChangeLog).


Version 10.38 01-October-2021
-----------------------------

As well as some bug fixes and tidies (as always, see ChangeLog for details),
the documentation is updated to list the new URLs, following the move of the
source repository to GitHub and the mailing list to Google Groups.

* The CMake build system can now build both static and shared libraries in one
go.

* Following Perl's lead, \K is now locked out in lookaround assertions by
default, but an option is provided to re-enable the previous behaviour.


Version 10.37 26-May-2021
-------------------------

A few more bug fixes and tidies. The only change of real note is the removal of
the actual POSIX names regcomp etc. from the POSIX wrapper library because
these have caused issues for some applications (see 10.33 #2 below).


Version 10.36 04-December-2020
------------------------------

Again, mainly bug fixes and tidies. The only enhancements are the addition of
GNU grep's -m (aka --max-count) option to pcre2grep, and also unifying the
handling of substitution strings for both -O and callouts in pcre2grep, with
the addition of $x{...} and $o{...} to allow for characters whose code points
are greater than 255 in Unicode mode.

NOTE: there is an outstanding issue with JIT support for MacOS on arm64
hardware. For details, please see Bugzilla issue #2618.


Version 10.35 15-April-2020
---------------------------

Bugfixes, tidies, and a few new enhancements.

1. Capturing groups that contain recursive backreferences to themselves are no
longer automatically atomic, because the restriction is no longer necessary
as a result of the 10.30 restructuring.

2. Several new options for pcre2_substitute().

3. When Unicode is supported and PCRE2_UCP is set without PCRE2_UTF, Unicode
character properties are used for upper/lower case computations on characters
whose code points are greater than 127.

4. The character tables (for low-valued characters) can now more easily be
saved and restored in binary.

5. Updated to Unicode 13.0.0.


Version 10.34 21-November-2019
------------------------------

Another release with a few enhancements as well as bugfixes and tidies. The
main new features are:

1. There is now some support for matching in invalid UTF strings.

2. Non-atomic positive lookarounds are implemented in the pcre2_match()
interpreter, but not in JIT.

3. Added two new functions: pcre2_get_match_data_size() and
pcre2_maketables_free().

4. Upgraded to Unicode 12.1.0.


Version 10.33 16-April-2019
---------------------------

Yet more bugfixes, tidies, and a few enhancements, summarized here (see
ChangeLog for the full list):

1. Callouts from pcre2_substitute() are now available.

2. The POSIX functions are now all called pcre2_regcomp() etc., with wrapper
functions that use the standard POSIX names. However, in pcre2posix.h the POSIX
names are defined as macros. This should help avoid linking with the wrong
library in some environments, while still exporting the POSIX names for
pre-existing programs that use them.

3. Some new options:

  (a) PCRE2_EXTRA_ESCAPED_CR_IS_LF makes \r behave as \n.

  (b) PCRE2_EXTRA_ALT_BSUX enables support for ECMAScript 6's \u{hh...}
      construct.

  (c) PCRE2_COPY_MATCHED_SUBJECT causes a copy of a matched subject to be
      made, instead of just remembering a pointer.

4. Some new Perl features:

  (a) Perl 5.28's experimental alphabetic names for atomic groups and
      lookaround assertions, for example, (*pla:...) and (*atomic:...).

  (b) The new Perl "script run" features (*script_run:...) and
      (*atomic_script_run:...) aka (*sr:...) and (*asr:...).

  (c) When PCRE2_UTF is set, allow non-ASCII letters and decimal digits in
      capture group names.

5. --disable-percent-zt disables the use of %zu and %td in formatting strings
in pcre2test. They were already automatically disabled for VC and older C
compilers.

6. Some changes related to callouts in pcre2grep:

  (a) Support for running an external program under VMS has been added, in
      addition to Windows and fork() support.

  (b) --disable-pcre2grep-callout-fork restricts the callout support in
      pcre2grep to the inbuilt echo facility.


Version 10.32 10-September-2018
-------------------------------

This is another mainly bugfix and tidying release with a few minor
enhancements. These are the main ones:

1. pcre2grep now supports the inclusion of binary zeros in patterns that are
read from files via the -f option.

2. ./configure now supports --enable-jit=auto, which automatically enables JIT
if the hardware supports it.

3. In pcre2_dfa_match(), internal recursive calls no longer use the stack for
local workspace and local ovectors. Instead, an initial block of stack is
reserved, but if this is insufficient, heap memory is used. The heap limit
parameter now applies to pcre2_dfa_match().

4. Updated to Unicode version 11.0.0.

5. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.

6. Added support for \N{U+dddd}, but only in Unicode mode.

7. Added support for (?^) to unset all imnsx options.


Version 10.31 12-February-2018
------------------------------

This is mainly a bugfix and tidying release (see ChangeLog for full details).
However, there are some minor enhancements.

1. New pcre2_config() options: PCRE2_CONFIG_NEVER_BACKSLASH_C and
PCRE2_CONFIG_COMPILED_WIDTHS.

2. New pcre2_pattern_info() option PCRE2_INFO_EXTRAOPTIONS to retrieve the
extra compile time options.

3. There are now public names for all the pcre2_compile() error numbers.

4. Added PCRE2_CALLOUT_STARTMATCH and PCRE2_CALLOUT_BACKTRACK bits to a new
field callout_flags in callout blocks.


Version 10.30 14-August-2017
----------------------------

The full list of changes that includes bugfixes and tidies is, as always, in
ChangeLog. These are the most important new features:

1. The main interpreter, pcre2_match(), has been refactored into a new version
that does not use recursive function calls (and therefore the system stack) for
remembering backtracking positions. This makes --disable-stack-for-recursion a
NOOP. The new implementation allows backtracking into recursive group calls in
patterns, making it more compatible with Perl, and also fixes some other
previously hard-to-do issues. For patterns that have a lot of backtracking, the
heap is now used, and there is an explicit limit on the amount, settable by
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). The "recursion limit" is retained,
but is renamed as "depth limit" (though the old names remain for
compatibility).

There is also a change in the way callouts from pcre2_match() are handled. The
offset_vector field in the callout block is no longer a pointer to the
actual ovector that was passed to the matching function in the match data
block. Instead it points to an internal ovector of a size large enough to hold
all possible captured substrings in the pattern.

2. The new option PCRE2_ENDANCHORED insists that a pattern match must end at
the end of the subject.

3. The new option PCRE2_EXTENDED_MORE implements Perl's /xx feature, and
pcre2test is upgraded to support it. Setting within the pattern by (?xx) is
also supported.

4. (?n) can be used to set PCRE2_NO_AUTO_CAPTURE, because Perl now has this.

5. Additional compile options in the compile context are now available, and the
first two are: PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES and
PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

6. The newline type PCRE2_NEWLINE_NUL is now available.

7. The match limit value now also applies to pcre2_dfa_match() as there are
patterns that can use up a lot of resources without necessarily recursing very
deeply.

8. The option REG_PEND (a GNU extension) is now available for the POSIX
wrapper. Also there is a new option PCRE2_LITERAL which is used to support
REG_NOSPEC.

9. PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD are implemented for the
benefit of pcre2grep, and pcre2grep's -F, -w, and -x options are re-implemented
using PCRE2_LITERAL, PCRE2_EXTRA_MATCH_WORD, and PCRE2_EXTRA_MATCH_LINE. This
is tidier and also fixes some bugs.

10. The Unicode tables are upgraded from Unicode 8.0.0 to Unicode 10.0.0.

11. There are some experimental functions for converting foreign patterns
(globs and POSIX patterns) into PCRE2 patterns.
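
Item 1 above replaces recursive function calls with explicitly managed
backtracking state whose size is capped by a limit. The toy matcher below
sketches that general idea in Python. It is not PCRE2's implementation: the
tiny pattern language (literal characters, '.', and a postfix '*'), the
function name match_prefix, and the exception HeapLimitError are all
hypothetical, and the limit here simply counts saved states, which is a
simplification.

```python
class HeapLimitError(Exception):
    """Raised when the saved backtracking states exceed the limit."""


def match_prefix(pattern: str, subject: str, heap_limit: int = 1000) -> bool:
    """Match pattern at the start of subject without using recursion.

    Backtracking positions are kept in an explicit list (standing in for
    heap memory) instead of on the call stack.
    """
    stack = [(0, 0)]  # pending (pattern index, subject index) states
    while stack:
        if len(stack) > heap_limit:
            raise HeapLimitError("too many saved backtracking states")
        pi, si = stack.pop()
        if pi == len(pattern):
            return True  # whole pattern consumed
        ch = pattern[pi]
        starred = pi + 1 < len(pattern) and pattern[pi + 1] == "*"
        ok = si < len(subject) and (ch == "." or ch == subject[si])
        if starred:
            stack.append((pi + 2, si))  # fallback: stop repeating here
            if ok:
                stack.append((pi, si + 1))  # tried first: greedy repeat
        elif ok:
            stack.append((pi + 1, si + 1))
    return False
```

With this sketch, matching "a*b" against a long run of "a" characters saves
one fallback state per character, so a small heap_limit makes the match fail
with an error instead of consuming unbounded memory.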


Version 10.23 14-February-2017
------------------------------

1. ChangeLog has the details of a lot of bug fixes and tidies.

2. There has been a major re-factoring of the pcre2_compile.c file. Most syntax
checking is now done in the pre-pass that identifies capturing groups. This has
reduced the amount of duplication and made the code tidier. While doing this,
some minor bugs and Perl incompatibilities were fixed (see ChangeLog for
details).

3. Back references are now permitted in lookbehind assertions when there are
no duplicated group numbers (that is, (?| has not been used), and, if the
reference is by name, there is only one group of that name. The referenced
group must, of course, be of fixed length.

4. \g{+<number>} (e.g. \g{+2} ) is now supported. It is a "forward back
reference" and can be useful in repetitions (compare \g{-<number>} ). Perl does
not recognize this syntax.

5. pcre2grep now automatically expands its buffer up to a maximum set by
--max-buffer-size.

6. The -t option (grand total) has been added to pcre2grep.

7. A new function called pcre2_code_copy_with_tables() exists to copy a
compiled pattern along with a private copy of the character tables that it
uses.

8. A user supplied a number of patches to upgrade pcre2grep under Windows and
tidy the code.

9. Several updates have been made to pcre2test and test scripts (see
ChangeLog).


Version 10.22 29-July-2016
--------------------------

1. ChangeLog has the details of a number of bug fixes.

2. The POSIX wrapper function regcomp() did not previously support back
references and subroutine calls if called with the REG_NOSUB option. It now
does.

3. A new function, pcre2_code_copy(), is added, to make a copy of a compiled
pattern.

4. Support for string callouts is added to pcre2grep.

5. Added the PCRE2_NO_JIT option to pcre2_match().

6. The pcre2_get_error_message() function now returns with a negative error
code if the error number it is given is unknown.

7. Several updates have been made to pcre2test and test scripts (see
ChangeLog).


Version 10.21 12-January-2016
-----------------------------

1. Many bugs have been fixed. A large number of them were provoked only by very
strange pattern input, and were discovered by fuzzers. Some others were
discovered by code auditing. See ChangeLog for details.

2. The Unicode tables have been updated to Unicode version 8.0.0.

3. For Perl compatibility in EBCDIC environments, ranges such as a-z in a
class, where both values are literal letters in the same case, omit the
non-letter EBCDIC code points within the range.

4. There have been a number of enhancements to the pcre2_substitute() function,
giving more flexibility to replacement facilities. It is now also possible to
cause the function to return the needed buffer size if the one given is too
small.

5. The PCRE2_ALT_VERBNAMES option causes the "name" parts of special verbs such
as (*THEN:name) to be processed for backslashes and to take note of
PCRE2_EXTENDED.

6. PCRE2_INFO_HASBACKSLASHC makes it possible for a client to find out if a
pattern uses \C, and --never-backslash-C makes it possible to compile a version
of PCRE2 in which the use of \C is always forbidden.

7. A limit to the length of pattern that can be handled can now be set by
calling pcre2_set_max_pattern_length().

8. When matching an unanchored pattern, a match can be required to begin within
a given number of code units after the start of the subject by calling
pcre2_set_offset_limit().

9. The pcre2test program has been extended to test new facilities, and it can
now run the tests when LF on its own is not a valid newline sequence.

10. The RunTest script has also been updated to enable more tests to be run.

11. There have been some minor performance enhancements.
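
The reason for item 3 above is that the letters are not contiguous in EBCDIC,
so a plain numeric range from a to z also spans non-letter code points. The
Python sketch below illustrates this using the cp037 codec, one common EBCDIC
code page. The choice of cp037 and the function name ebcdic_letter_range are
assumptions made for illustration only; this is not PCRE2 code.

```python
def ebcdic_letter_range(first: str, last: str, codec: str = "cp037") -> list:
    """Return the EBCDIC code points from first to last, letters only.

    Both arguments are assumed to be single ASCII letters in the same case.
    Code points inside the numeric range that are not letters of that case
    are omitted.
    """
    lo = first.encode(codec)[0]
    hi = last.encode(codec)[0]
    kept = []
    for code in range(lo, hi + 1):
        ch = bytes([code]).decode(codec)
        same_case = ch.islower() == first.islower()
        if ch.isascii() and ch.isalpha() and same_case:
            kept.append(code)
    return kept
```

In cp037 the lower case letters occupy 0x81-0x89, 0x91-0x99, and 0xA2-0xA9, so
ebcdic_letter_range("a", "z") keeps 26 of the 41 code points in the numeric
range 0x81-0xA9.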


Version 10.20 30-June-2015
--------------------------

1. Callouts with string arguments and the pcre2_callout_enumerate() function
have been implemented.

2. The PCRE2_NEVER_BACKSLASH_C option, which locks out the use of \C, is added.

3. The PCRE2_ALT_CIRCUMFLEX option lets ^ match after a newline at the end of a
subject in multiline mode.

4. The way named subpatterns are handled has been refactored. The previous
approach had several bugs.

5. The handling of \c in EBCDIC environments has been changed to conform to the
perlebcdic document. This is an incompatible change.

6. Bugs have been mended, many of them discovered by fuzzers.


Version 10.10 06-March-2015
---------------------------

1. Serialization and de-serialization functions have been added to the API,
making it possible to save and restore sets of compiled patterns, though
restoration must be done in the same environment that was used for compilation.

2. The (*NO_JIT) feature has been added; this makes it possible for a pattern
creator to specify that JIT is not to be used.

3. A number of bugs have been fixed. In particular, bugs that caused building
on Windows using CMake to fail have been mended.


Version 10.00 05-January-2015
-----------------------------

Version 10.00 is the first release of PCRE2, a revised API for the PCRE
library. Changes prior to 10.00 are logged in the ChangeLog file for the old
API, up to item 20 for release 8.36. New programs are recommended to use the
new library. Programs that use the original (PCRE1) API will need changing
before linking with the new library.

****