RE2 regular expression syntax reference
---------------------------------------

Single characters:
. any character, possibly including newline (s=true)
[xyz] character class
[^xyz] negated character class
\d Perl character class
\D negated Perl character class
[:alpha:] ASCII character class
[:^alpha:] negated ASCII character class
\pN Unicode character class (one-letter name)
\p{Greek} Unicode character class
\PN negated Unicode character class (one-letter name)
\P{Greek} negated Unicode character class

Composites:
xy «x» followed by «y»
x|y «x» or «y» (prefer «x»)
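
The "prefer «x»" rule for alternation can be sketched as follows. This is only an illustration under an assumption: it uses Python's standard `re` module, which is a different engine from RE2 but agrees with it for these two composites. The helper name `first_match` is hypothetical.

```python
import re

def first_match(pattern: str, text: str) -> str:
    """Return the leftmost match of pattern in text, or '' if there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else ""

# Both branches can match at position 0, so the left branch is taken.
print(first_match(r"ab|abcd", "abcd"))  # ab
print(first_match(r"abcd|ab", "abcd"))  # abcd

# Concatenation: "x" must be immediately followed by "y".
print(first_match(r"xy", "x-y xy"))     # xy
```

Swapping the order of the branches changes the result even though the input is the same, which is what "prefer «x»" means in practice.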

Repetitions:
x* zero or more «x», prefer more
x+ one or more «x», prefer more
x? zero or one «x», prefer one
x{n,m} «n» or «n»+1 or ... or «m» «x», prefer more
x{n,} «n» or more «x», prefer more
x{n} exactly «n» «x»
x*? zero or more «x», prefer fewer
x+? one or more «x», prefer fewer
x?? zero or one «x», prefer zero
x{n,m}? «n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}? «n» or more «x», prefer fewer
x{n}? exactly «n» «x»
x{} (== x*) NOT SUPPORTED vim
x{-} (== x*?) NOT SUPPORTED vim
x{-n} (== x{n}?) NOT SUPPORTED vim
x= (== x?) NOT SUPPORTED vim
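
A short sketch of "prefer more" versus "prefer fewer" may help here. It assumes Python's standard `re` module as a stand-in for RE2; the two engines differ internally, but the supported repetition operators above give the same results in both. `match_text` is a hypothetical helper.

```python
import re

def match_text(pattern: str, text: str) -> str:
    """Return the leftmost match of pattern in text ('' if none or empty)."""
    m = re.search(pattern, text)
    return m.group(0) if m else ""

print(match_text(r"<.+>", "<a><b>"))    # <a><b>  (prefer more)
print(match_text(r"<.+?>", "<a><b>"))   # <a>     (prefer fewer)
print(match_text(r"a{2,4}", "aaaaa"))   # aaaa
print(match_text(r"a{2,4}?", "aaaaa"))  # aa
print(repr(match_text(r"a?", "a")))     # 'a'     (prefer one)
print(repr(match_text(r"a??", "a")))    # ''      (prefer zero)
```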

Possessive repetitions:
x*+ zero or more «x», possessive NOT SUPPORTED
x++ one or more «x», possessive NOT SUPPORTED
x?+ zero or one «x», possessive NOT SUPPORTED
x{n,m}+ «n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+ «n» or more «x», possessive NOT SUPPORTED
x{n}+ exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re) numbered capturing group
(?P<name>re) named & numbered capturing group
(?<name>re) named & numbered capturing group NOT SUPPORTED
(?'name're) named & numbered capturing group NOT SUPPORTED
(?:re) non-capturing group
(?flags) set flags within current group; non-capturing
(?flags:re) set flags during re; non-capturing
(?#text) comment NOT SUPPORTED
(?|x|y|z) branch numbering reset NOT SUPPORTED
(?>re) possessive match of «re» NOT SUPPORTED
re@> possessive match of «re» NOT SUPPORTED vim
%(re) non-capturing group NOT SUPPORTED vim
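
The three supported group forms can be tried out with a small sketch. It assumes Python's standard `re` module, which accepts the same `(re)`, `(?P<name>re)` and `(?:re)` spellings as RE2. The pattern `DATE` and the helper `parse_date` are hypothetical names made up for this illustration.

```python
import re

# Groups 1 and 2 are named and numbered; the (?:...) wrapper takes no number,
# so the plain (\d{2}) inside it becomes group 3.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})(?:-(\d{2}))?")

def parse_date(text: str):
    """Return (year, month, day) as strings; day is None when it is absent."""
    m = DATE.fullmatch(text)
    if m is None:
        return None
    return m.group("year"), m.group("month"), m.group(3)

print(DATE.groups)               # 3
print(parse_date("2012-05-14"))  # ('2012', '05', '14')
print(parse_date("2013-02"))     # ('2013', '02', None)
```

A named group is still reachable by its number, so `m.group(1)` and `m.group("year")` return the same text.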

Flags:
i case-insensitive (default false)
m multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s let «.» match «\n» (default false)
U ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)
Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).
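
The set and clear forms of the flag syntax can be sketched as below. This assumes Python's standard `re` module rather than RE2 itself: Python accepts `(?i)`, `(?s)`, `(?flags:re)` and `(?-flags:re)`, but it has no «U» flag and only allows a bare `(?flags)` at the very start of a pattern, so the sketch stays within that common subset. `matches` is a hypothetical helper.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"hello", "HeLLo"))         # False
print(matches(r"(?i)hello", "HeLLo"))     # True: i set for the whole pattern
print(matches(r"a.b", "a\nb"))            # False: "." skips newline by default
print(matches(r"(?s)a.b", "a\nb"))        # True: s lets "." match newline
print(matches(r"(?i:foo)bar", "FOObar"))  # True: i applies to "foo" only
print(matches(r"(?i:foo)bar", "FOOBAR"))  # False
print(matches(r"(?i:a(?-i:b)c)", "AbC"))  # True: i cleared around "b"
print(matches(r"(?i:a(?-i:b)c)", "ABC"))  # False
```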

Empty strings:
^ at beginning of text or line («m»=true)
$ at end of text (like «\z» not «\Z») or line («m»=true)
\A at beginning of text
\b at word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B not a word boundary
\G at beginning of subtext being searched NOT SUPPORTED pcre
\G at end of last match NOT SUPPORTED perl
\Z at end of text, or before newline at end of text NOT SUPPORTED
\z at end of text
(?=re) before text matching «re» NOT SUPPORTED
(?!re) before text not matching «re» NOT SUPPORTED
(?<=re) after text matching «re» NOT SUPPORTED
(?<!re) after text not matching «re» NOT SUPPORTED
re& before text matching «re» NOT SUPPORTED vim
re@= before text matching «re» NOT SUPPORTED vim
re@! before text not matching «re» NOT SUPPORTED vim
re@<= after text matching «re» NOT SUPPORTED vim
re@<! after text not matching «re» NOT SUPPORTED vim
\zs sets start of match (= \K) NOT SUPPORTED vim
\ze sets end of match NOT SUPPORTED vim
\%^ beginning of file NOT SUPPORTED vim
\%$ end of file NOT SUPPORTED vim
\%V on screen NOT SUPPORTED vim
\%# cursor position NOT SUPPORTED vim
\%'m mark «m» position NOT SUPPORTED vim
\%23l in line 23 NOT SUPPORTED vim
\%23c in column 23 NOT SUPPORTED vim
\%23v in virtual column 23 NOT SUPPORTED vim
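
The supported empty-string assertions «\b», «\B», «\A» and «^» can be illustrated with a small sketch. It assumes Python's standard `re` module, which behaves the same way for these four. «$» is deliberately left out: without «m», Python's «$» also matches just before a trailing newline, which is not the «\z»-like behaviour described above. `spans` is a hypothetical helper.

```python
import re

def spans(pattern: str, text: str):
    """Return the (start, end) offsets of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

text = "cat concat cat."
print(spans(r"\bcat\b", text))  # [(0, 3), (11, 14)]
print(spans(r"\Bcat", text))    # [(7, 10)]: the "cat" inside "concat"

lines = "one\ntwo"
print(re.findall(r"^\w+", lines))       # ['one']: beginning of text only
print(re.findall(r"(?m)^\w+", lines))   # ['one', 'two']: every line start
print(re.findall(r"(?m)\A\w+", lines))  # ['one']: \A ignores the m flag
```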

Escape sequences:
\a bell (== \007)
\f form feed (== \014)
\t horizontal tab (== \011)
\n newline (== \012)
\r carriage return (== \015)
\v vertical tab character (== \013)
\* literal «*», for any punctuation character «*»
\123 octal character code (up to three digits)
\x7F hex character code (exactly two digits)
\x{10FFFF} hex character code
\C match a single byte even in UTF-8 mode
\Q...\E literal text «...» even if «...» has punctuation
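
The octal equivalents listed above can be checked with a short sketch. It assumes Python's standard `re` module, which accepts the same single-letter, three-digit octal and two-digit hex escapes. It does not accept «\x{10FFFF}», «\C» or «\Q...\E», so those are not shown. `is_exact` is a hypothetical helper.

```python
import re

def is_exact(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Hex 41 is "A", octal 102 is "B", and \* is a literal asterisk.
print(is_exact(r"\x41\102\*", "AB*"))  # True

# Each single-letter escape next to its octal equivalent from the table.
pairs = [(r"\a", r"\007"), (r"\f", r"\014"), (r"\t", r"\011"),
         (r"\n", r"\012"), (r"\r", r"\015"), (r"\v", r"\013")]
for letter, octal in pairs:
    char = re.fullmatch(letter, chr(int(octal[1:], 8)))
    print(letter, octal, char is not None)  # True on every row
```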

\1 backreference NOT SUPPORTED
\b backspace NOT SUPPORTED (use «\010»)
\cK control char ^K NOT SUPPORTED (use «\001» etc)
\e escape NOT SUPPORTED (use «\033»)
\g1 backreference NOT SUPPORTED
\g{1} backreference NOT SUPPORTED
\g{+1} backreference NOT SUPPORTED
\g{-1} backreference NOT SUPPORTED
\g{name} named backreference NOT SUPPORTED
\g<name> subroutine call NOT SUPPORTED
\g'name' subroutine call NOT SUPPORTED
\k<name> named backreference NOT SUPPORTED
\k'name' named backreference NOT SUPPORTED
\lX lowercase «X» NOT SUPPORTED
\ux uppercase «x» NOT SUPPORTED
\L...\E lowercase text «...» NOT SUPPORTED
\K reset beginning of «$0» NOT SUPPORTED
\N{name} named Unicode character NOT SUPPORTED
\R line break NOT SUPPORTED
\U...\E upper case text «...» NOT SUPPORTED
\X extended Unicode sequence NOT SUPPORTED

\%d123 decimal character 123 NOT SUPPORTED vim
\%xFF hex character FF NOT SUPPORTED vim
\%o123 octal character 123 NOT SUPPORTED vim
\%u1234 Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678 Unicode character 0x12345678 NOT SUPPORTED vim

Character class elements:
x single character
A-Z character range (inclusive)
\d Perl character class
[:foo:] ASCII character class «foo»
\p{Foo} Unicode character class «Foo»
\pF Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d] digits (== \d)
[^\d] not digits (== \D)
[\D] not digits (== \D)
[^\D] not not digits (== \d)
[[:name:]] named ASCII class inside character class (== [:name:])
[^[:name:]] named ASCII class inside negated character class (== [:^name:])
[\p{Name}] named Unicode property inside character class (== \p{Name})
[^\p{Name}] named Unicode property inside negated character class (== \P{Name})
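
The equivalences for «\d» and «\D» inside brackets can be checked with a small sketch. It assumes Python's standard `re` module with the `re.ASCII` flag, so that «\d» means [0-9] as it does in RE2. Python has no «[:name:]» or «\p{Name}» classes, so only the Perl-class rows are covered. `runs` is a hypothetical helper.

```python
import re

def runs(pattern: str, text: str):
    """Return every match of pattern in text, with ASCII-only \\d and \\D."""
    return re.findall(pattern, text, flags=re.ASCII)

text = "ab12cd345"
print(runs(r"\d+", text))     # ['12', '345']
print(runs(r"[\d]+", text))   # ['12', '345']  (== \d)
print(runs(r"[^\D]+", text))  # ['12', '345']  (not not digits)
print(runs(r"[^\d]+", text))  # ['ab', 'cd']   (== \D)
print(runs(r"[\D]+", text))   # ['ab', 'cd']   (== \D)
```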

Perl character classes:
\d digits (== [0-9])
\D not digits (== [^0-9])
\s whitespace (== [\t\n\f\r ])
\S not whitespace (== [^\t\n\f\r ])
\w word characters (== [0-9A-Za-z_])
\W not word characters (== [^0-9A-Za-z_])
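
The ASCII-only definitions of «\d» and «\w» can be illustrated as below. The sketch assumes Python's standard `re` module, where these classes are Unicode-aware by default and need the `re.ASCII` flag to behave as listed here. «\s» is left out because Python's version also matches «\v», unlike the definition above. `ascii_findall` is a hypothetical helper.

```python
import re

def ascii_findall(pattern: str, text: str):
    """findall with \\d and \\w restricted to ASCII, as in the table above."""
    return re.findall(pattern, text, flags=re.ASCII)

# "i with diaeresis" (U+00EF) and ARABIC-INDIC DIGIT THREE (U+0663) are not ASCII.
text = "na\u00efve_1 x\u0663"

print(ascii_findall(r"\w+", text))            # ['na', 've_1', 'x']
print(ascii_findall(r"[0-9A-Za-z_]+", text))  # ['na', 've_1', 'x']
print(ascii_findall(r"\d", text))             # ['1']
print(ascii_findall(r"[0-9]", text))          # ['1']
```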

\h horizontal space NOT SUPPORTED
\H not horizontal space NOT SUPPORTED
\v vertical space NOT SUPPORTED
\V not vertical space NOT SUPPORTED

ASCII character classes:
[:alnum:] alphanumeric (== [0-9A-Za-z])
[:alpha:] alphabetic (== [A-Za-z])
[:ascii:] ASCII (== [\x00-\x7F])
[:blank:] blank (== [\t ])
[:cntrl:] control (== [\x00-\x1F\x7F])
[:digit:] digits (== [0-9])
[:graph:] graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[:lower:] lower case (== [a-z])
[:print:] printable (== [ -~] == [ [:graph:]])
[:punct:] punctuation (== [!-/:-@[-`{-~])
[:space:] whitespace (== [\t\n\v\f\r ])
[:upper:] upper case (== [A-Z])
[:word:] word characters (== [0-9A-Za-z_])
[:xdigit:] hex digit (== [0-9A-Fa-f])

Unicode character class names--general category:
C other
Cc control
Cf format
Cn unassigned code points NOT SUPPORTED
Co private use
Cs surrogate
L letter
LC cased letter NOT SUPPORTED
L& cased letter NOT SUPPORTED
Ll lowercase letter
Lm modifier letter
Lo other letter
Lt titlecase letter
Lu uppercase letter
M mark
Mc spacing mark
Me enclosing mark
Mn non-spacing mark
N number
Nd decimal number
Nl letter number
No other number
P punctuation
Pc connector punctuation
Pd dash punctuation
Pe close punctuation
Pf final punctuation
Pi initial punctuation
Po other punctuation
Ps open punctuation
S symbol
Sc currency symbol
Sk modifier symbol
Sm math symbol
So other symbol
Z separator
Zl line separator
Zp paragraph separator
Zs space separator
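
To see which general category a given character falls in, a lookup sketch like the one below may be useful. It does not use a regular expression engine at all: Python's standard `re` module has no «\p» classes, so the sketch assumes the standard `unicodedata` module as a way to inspect the same two-letter category names. `in_category` is a hypothetical helper; a one-letter name such as «L» is treated as a prefix of the two-letter names.

```python
import unicodedata

def in_category(ch: str, name: str) -> bool:
    """True if ch's general category is name, or starts with a one-letter name."""
    return unicodedata.category(ch).startswith(name)

print(unicodedata.category("A"))  # Lu
print(in_category("A", "Lu"))     # True: uppercase letter
print(in_category("A", "L"))      # True: any letter
print(in_category("a", "Lu"))     # False: "a" is Ll
print(in_category("5", "Nd"))     # True: decimal number
print(in_category("$", "Sc"))     # True: currency symbol
print(in_category(" ", "Zs"))     # True: space separator
```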

Unicode character class names--scripts:
Arabic Arabic
Armenian Armenian
Balinese Balinese
Bengali Bengali
Bopomofo Bopomofo
Braille Braille
Buginese Buginese
Buhid Buhid
Canadian_Aboriginal Canadian Aboriginal
Carian Carian
Cham Cham
Cherokee Cherokee
Common characters not specific to one script
Coptic Coptic
Cuneiform Cuneiform
Cypriot Cypriot
Cyrillic Cyrillic
Deseret Deseret
Devanagari Devanagari
Ethiopic Ethiopic
Georgian Georgian
Glagolitic Glagolitic
Gothic Gothic
Greek Greek
Gujarati Gujarati
Gurmukhi Gurmukhi
Han Han
Hangul Hangul
Hanunoo Hanunoo
Hebrew Hebrew
Hiragana Hiragana
Inherited inherit script from previous character
Kannada Kannada
Katakana Katakana
Kayah_Li Kayah Li
Kharoshthi Kharoshthi
Khmer Khmer
Lao Lao
Latin Latin
Lepcha Lepcha
Limbu Limbu
Linear_B Linear B
Lycian Lycian
Lydian Lydian
Malayalam Malayalam
Mongolian Mongolian
Myanmar Myanmar
New_Tai_Lue New Tai Lue (aka Simplified Tai Lue)
Nko Nko
Ogham Ogham
Ol_Chiki Ol Chiki
Old_Italic Old Italic
Old_Persian Old Persian
Oriya Oriya
Osmanya Osmanya
Phags_Pa 'Phags Pa
Phoenician Phoenician
Rejang Rejang
Runic Runic
Saurashtra Saurashtra
Shavian Shavian
Sinhala Sinhala
Sundanese Sundanese
Syloti_Nagri Syloti Nagri
Syriac Syriac
Tagalog Tagalog
Tagbanwa Tagbanwa
Tai_Le Tai Le
Tamil Tamil
Telugu Telugu
Thaana Thaana
Thai Thai
Tibetan Tibetan
Tifinagh Tifinagh
Ugaritic Ugaritic
Vai Vai
Yi Yi

Vim character classes:
\i identifier character NOT SUPPORTED vim
\I «\i» except digits NOT SUPPORTED vim
\k keyword character NOT SUPPORTED vim
\K «\k» except digits NOT SUPPORTED vim
\f file name character NOT SUPPORTED vim
\F «\f» except digits NOT SUPPORTED vim
\p printable character NOT SUPPORTED vim
\P «\p» except digits NOT SUPPORTED vim
\s whitespace character (== [ \t]) NOT SUPPORTED vim
\S non-white space character (== [^ \t]) NOT SUPPORTED vim
\d digits (== [0-9]) vim
\D not «\d» vim
\x hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X not «\x» NOT SUPPORTED vim
\o octal digits (== [0-7]) NOT SUPPORTED vim
\O not «\o» NOT SUPPORTED vim
\w word character vim
\W not «\w» vim
\h head of word character NOT SUPPORTED vim
\H not «\h» NOT SUPPORTED vim
\a alphabetic NOT SUPPORTED vim
\A not «\a» NOT SUPPORTED vim
\l lowercase NOT SUPPORTED vim
\L not lowercase NOT SUPPORTED vim
\u uppercase NOT SUPPORTED vim
\U not uppercase NOT SUPPORTED vim
\_x «\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c ignore case NOT SUPPORTED vim
\C match case NOT SUPPORTED vim
\m magic NOT SUPPORTED vim
\M nomagic NOT SUPPORTED vim
\v verymagic NOT SUPPORTED vim
\V verynomagic NOT SUPPORTED vim
\Z ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code}) arbitrary Perl code NOT SUPPORTED perl
(??{code}) postponed arbitrary Perl code NOT SUPPORTED perl
(?n) recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n) recursive call to relative group «+n» NOT SUPPORTED
(?-n) recursive call to relative group «-n» NOT SUPPORTED
(?C) PCRE callout NOT SUPPORTED pcre
(?R) recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name) recursive call to named group NOT SUPPORTED
(?P=name) named backreference NOT SUPPORTED
(?P>name) recursive call to named group NOT SUPPORTED
(?(cond)true|false) conditional branch NOT SUPPORTED
(?(cond)true) conditional branch NOT SUPPORTED
(*ACCEPT) make regexps more like Prolog NOT SUPPORTED
(*COMMIT) NOT SUPPORTED
(*F) NOT SUPPORTED
(*FAIL) NOT SUPPORTED
(*MARK) NOT SUPPORTED
(*PRUNE) NOT SUPPORTED
(*SKIP) NOT SUPPORTED
(*THEN) NOT SUPPORTED
(*ANY) set newline convention NOT SUPPORTED
(*ANYCRLF) NOT SUPPORTED
(*CR) NOT SUPPORTED
(*CRLF) NOT SUPPORTED
(*LF) NOT SUPPORTED
(*BSR_ANYCRLF) set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE) NOT SUPPORTED pcre