Regex Pattern Features

This page describes the specRegex regex dialect and how it compares to some mainstream regex engines. In most cases, your regex will “just work” with specRegex.

The following comparison chart is essentially the one from the reference material provided by regular-expressions.info, with specRegex added. The tutorial provided on that website is probably the best possible way to learn how to write regular expressions.

Special and Non-Printable Characters

Basic Features

Character Classes

Unless otherwise noted, the syntax in this section is valid only inside a character class. In effect, this is describing the language used to define character classes in each regex dialect.

Shorthand Character Classes

Anchors

Word Boundaries

Quantifiers

Unicode

Named Groups and Backreferences

Special Groups

Mode Modifiers

Special and Non-Printable Characters

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Literal Character	Any character except `[\^$.|?*+()`	All non-special characters match a single instance of themselves	`a` matches `a`	✔	✔	✔	✔	✔
Literal curly braces	`{` and `}`	`{` and `}` are literals unless they’re part of a token such as `a{3}`		✔	✔	✔	✔	✔
Backslash escapes a metacharacter	`\` followed by any of `[\^$.|?*+(){}`	A backslash escapes special characters to suppress their special meaning	`\*` matches `*`	✔	✔	✔	✔	✔
Escape sequence	`\Q...\E`	Matches all characters in `...` literally	`\Q.+\E` matches `.+`	✔	❌	❌	✔	✔
Hexadecimal escape	`\xFF` where FF are 2 hexadecimal digits	Matches the single-byte character `FF`		✔	✔	✔	✔	✔
Character escape	`\n`, `\r` and `\t`	Match an LF character, CR character and a tab character respectively	`\r\n` matches a CRLF line break	✔	✔	✔	✔	✔
Line break	`\R`	Any line break: CRLF, CR, LF, `\f`, `\v`, or any Unicode line break		❌	❌	❌	✔	❌
Line break	`\R`	Matches the next line control character `U+0085`		❌	❌	❌	✔	❌
Line break	`\R`	CRLF line breaks are indivisible		❌	❌	❌	✔	❌
Line break	A literal line break	Matches any line break, regardless of the line break style used.		❌	❌	❌	❌	❌
Character escape	`\a`	Match the “alert” or “bell” control character (ASCII 0x07)		✔	✔	✔	✔	✔
Character escape	`\b`	Match the “backspace” control character (ASCII 0x08)		✔	❌	✔	❌	❌
Character escape	`\B`	Match a backslash	(Use `\\` instead)	❌	❌	❌	❌	❌
Character escape	`\e`	Match the “escape” control character (ASCII 0x1B)		✔	✔	❌	✔	❌
Character escape	`\f`	Match the “form feed” control character (ASCII 0x0C)		✔	✔	✔	✔	✔
Character escape	`\v`	Match the “vertical tab” control character (ASCII 0x0B)		✔	✔	✔	❌	✔
Control character escape	`\cA` through `\cZ`	ASCII character `^A` through `^Z`, equivalent to `\x01` through `\x1A`	`\cM\cJ` matches a CRLF line break	✔	✔	✔	✔	❌
Control character escape	`\ca` through `\cz`	ASCII character `^A` through `^Z`, equivalent to `\x01` through `\x1A`	`\cm\cj` matches a CRLF line break	✔	✔	✔	✔	❌
NULL escape	`\0`	Matches the null character		❌	✔	✔	✔	✔
Octal escape	`\o{7777}` for any octal number	Matches the character with the specified number	`\o{20254}` matches `€`	✔	❌	❌	✔	❌
Octal escape	`\1` through `\7`	Matches the character at the specified position in the ASCII table	`\7` matches the “bell” character	✔	✔	✔ (Opt)	❌	❌
Octal escape	`\10` through `\77`	Matches the character at the specified position in the ASCII table	`\77` matches `?`	✔	✔	✔ (Opt)	✔	✔
Octal escape	`\100` through `\177`	Matches the character at the specified position in the ASCII table	`\100` matches `@`	✔	✔	✔ (Opt)	✔	✔
Octal escape	`\200` through `\377`	Matches the character at the specified position in the active code page	`\377` matches `ÿ`	✔	✔	✔ (Opt)	✔	✔
Octal escape	`\400` through `\777`	Matches the character at the specified position in the active code page	`\777` matches `ǿ`	✔	✔	✔ (Opt)	✔	✔
Octal escape	`\01` through `\07`	Matches the character at the specified position in the ASCII table	`\07` matches the “bell” character	✔	✔	✔ (Opt)	❌	✔
Octal escape	`\010` through `\077`	Matches the character at the specified position in the ASCII table	`\077` matches `?`	✔	✔	✔ (Opt)	✔	✔
Octal escape	`\0100` through `\0177`	Matches the character at the specified position in the ASCII table	`\0100` matches `@`	✔	❌	❌	❌	❌
Octal escape	`\0200` through `\0377`	Matches the character at the specified position in the active code page	`\0377` matches `ÿ`	✔	❌	❌	❌	❌
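
A few of the escapes above can be tried out quickly. The sketch below uses Python's `re` module purely as a stand-in engine (an assumption for illustration; it is not specRegex and does not appear in the chart), and only covers escapes that `re` happens to accept. The name `ESCAPE_CASES` is hypothetical.

```python
import re

# Python's `re` is only a stand-in here; it is not specRegex.
# Each pair is (pattern, subject the pattern should match in full).
ESCAPE_CASES = [
    (r"\x41", "A"),      # two-digit hexadecimal escape
    (r"\r\n", "\r\n"),   # CR followed by LF
    (r"\a", "\x07"),     # alert / bell
    (r"\f", "\x0c"),     # form feed
    (r"\v", "\x0b"),     # vertical tab
    (r"\100", "@"),      # three-digit octal escape
    (r"\077", "?"),      # octal escape with a leading zero
    (r"\*", "*"),        # backslash-escaped metacharacter
]

results = {p: re.fullmatch(p, s) is not None for p, s in ESCAPE_CASES}
for pattern, ok in results.items():
    print(f"{pattern!r}: {'match' if ok else 'no match'}")
```

Escapes such as `\cM`, `\e` or `\Q...\E` are left out because Python's `re` does not support them, which is exactly the kind of difference the chart records.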

Basic Features

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Dot	`.`	Matches any character except line breaks, depending on options.		✔	✔	✔	✔	✔
Not a line break	`\N`	Matches any character except line break, regardless of options.		❌	❌	❌	✔	❌
Alternation	`|`	Match either of several possible patterns.	`ab|de|xy` matches `ab`, `de` or `xy`	✔	✔	✔	✔	✔
Line feed is alternation	A literal line break	A literal line break character functions as alternation.		❌	❌	✔	❌	❌
Alternation is eager	`|`	Alternation takes the first option from the left that matches	`a|ab` matches `a` in `ab`	✔	✔	✔ (Opt)	✔	✔
Alternation is greedy	`|`	Alternation takes the longest possible match from all options.	`a|ab` matches `ab` in `ab`	❌	❌	✔ (Opt)	❌	❌
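
The difference between eager and greedy alternation is easiest to see by running the `a|ab` example. The sketch below uses Python's `re` as a stand-in (an assumption; it is not specRegex), which, like the engines marked as eager above, takes the first alternative that matches.

```python
import re

# Eager alternation: the leftmost alternative that matches wins,
# even though a longer alternative would also match.
eager = re.search(r"a|ab", "ab").group()

# The dot skips line breaks by default and matches them with DOTALL.
dot_default = re.findall(r".", "a\nb")
dot_all = re.findall(r".", "a\nb", flags=re.DOTALL)

print(eager)        # a
print(dot_default)  # ['a', 'b']
print(dot_all)      # ['a', '\n', 'b']
```

An engine with greedy (longest-match) alternation would report `ab` for the first case instead.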

Character Classes

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Character classes	`[`	`[` begins a character class. Inside a character class, different rules apply		✔	✔	✔	✔	✔
Literal character	Any character except `^-]\`	Non-special characters add themselves to the character class	`[abc]` matches `a`, `b` or `c`	✔	✔	✔	✔	✔
Backslash escapes metacharacter	`\` followed by any of `^-]\`	A backslash escapes special characters to suppress their special meaning.	`[\^\]]` matches `^` or `]`	✔	✔	✔	✔	✔
Range	Hyphen between two characters	Adds a range of characters to the character class.	`[a-zA-Z0-9]` matches any ASCII letter or digit	✔	✔	✔	✔	✔
Ranges with escapes	Ranges support character escapes	Adds a range of characters to the character class.	`[\0-z]` matches characters between NULL and `z`	❌	✔	✔	✔	✔
Negated character class	`[^`	Negates the character class, so it matches any character not in the set.	`[^a-d]` matches any character except `a`, `b`, `c` or `d`	✔	✔	✔	✔	✔
Literal opening bracket	`[`	An `[` inside a character class adds `[` to the class.	`[ab[cd]ef]` matches `aef]`, `bef]`, `[ef]`, `cef]`, and `def]`	✔	✔	✔	✔	✔
Nested character class	`[`	An `[` inside a character class starts a nested character class.	`[ab[cd]ef]` is the same as `[abcdef]`	❌	❌	❌	❌	❌
Character class subtraction	`[base-[subtract]]`	Removes all characters in the “subtract” class from the “base” class.	`[a-z-[aeiuo]]` matches a single letter that is not a vowel.	❌	✔ (Opt)	❌	❌	❌
Character class intersection	`[base&&[intersect]]`	Reduces the character class to the characters present in both “base” and “intersect”.	`[a-z&&[^aeiuo]]` matches a single letter that is not a vowel.	❌	❌	❌	❌	❌
Character class intersection	`[base&&intersect]`	Reduces the character class to the characters present in both “base” and “intersect”.	`[a-z&&[^aeiuo]]` matches a single letter that is not a vowel.	❌	❌	❌	❌	❌
Character escape	`\n`, `\r` and `\t`	Add an LF character, a CR character, or a tab character to the character class.	`[\n\r\t]` matches a line feed, a carriage return, or a tab.	✔	✔	✔	✔	✔
Character escape	`\a`	Add the “alert” or “bell” control character (ASCII 0x07) to the character class.	`[\a\t]` matches a bell or a tab character.	✔	✔	✔	✔	✔
Character escape	`\b`	Add the “backspace” control character (ASCII 0x08) to the character class.	`[\b\t]` matches a backspace or a tab character.	✔	✔	✔	✔	❌
Character escape	`\B`	Add a backslash to the character class.		❌	❌	❌	❌	❌
Character escape	`\e`	Add the “escape” control character (ASCII 0x1B) to the character class.	`[\e\t]` matches an escape or a tab character.	✔	✔	❌	✔	❌
Character escape	`\f`	Add the “form feed” control character (ASCII 0x0C) to the character class.	`[\f\t]` matches a form feed or a tab character.	✔	✔	✔	✔	✔
Character escape	`\v`	Add the “vertical tab” control character (ASCII 0x0B) to the character class.	`[\v\t]` matches a vertical tab or a tab character.	✔	✔	✔	❌	✔
POSIX class	`[:alpha:]`	Adds the members of the named POSIX character class.	`[[:digit:][:lower:]]` is the same as `[0-9a-z]`	❌	❌	❌	✔	✔
Negated POSIX class	`[:^alpha:]`	Adds everything except the members of the named POSIX character class.	`[5[:^digit:]]` matches `5` or any non-digit.	❌	❌	❌	✔	✔
POSIX shorthand class	`[:d:]`, `[:s:]`, `[:w:]`	Shorthands for the “digit”, “space”, and “word” classes.	`[[:s:][:d:]]` matches space, tab, line break, or `0-9`	❌	❌	❌	❌	❌
POSIX shorthand class	`[:l:]`, `[:u:]`	Shorthands for the “lower” and “upper” classes.	`[[:u:]][[:l:]]` matches `Aa` but not `aA`.	❌	❌	❌	❌	❌
POSIX shorthand class	`[:h:]`	Shorthand for the “blank” class.	`[[:h:]]` matches a space.	❌	❌	❌	❌	❌
POSIX shorthand class	`[:v:]`	Shorthand for the “vertical space” class.	`[[:v:]]` matches any single vertical whitespace character.	❌	❌	❌	❌	❌
POSIX class	Any supported `\p{...}` syntax	`\p{...}` syntax can be used inside character classes.	`[\p{Digit}\p{Lower}]` matches one of `0-9` or `a-z`	❌	❌	❌	❌	❌
`\p` to identify POSIX classes	`\p{SomePosixClass}`	Matches a single character from POSIX class “SomePosixClass”. May be outside a char class.	`\p{Digit}` matches any single digit.	❌	❌	❌	❌	❌
`\p` to identify POSIX classes	`\p{IsSomePosixClass}`	Matches a single character from POSIX class “SomePosixClass”. May be outside a char class.	`\p{IsDigit}` matches any single digit.	❌	❌	❌	❌	❌
POSIX collation sequence	`[.span-ll.]`	Matches a POSIX collation sequence.	`[[.span-ll.]]` matches `ll` in the Spanish locale	❌	❌	❌	❌	❌
POSIX character equivalence	`[=x=]`	Matches a POSIX character equivalence.	`[[=e=]]` matches `e`, `é`, `è` and `ê` in the French locale	❌	❌	❌	❌	❌
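
The core character class syntax (literals, ranges, negation, escapes) behaves the same in most engines. A minimal sketch, again using Python's `re` as a stand-in rather than specRegex:

```python
import re

# Negated class: everything except a, b, c, d.
negated = re.findall(r"[^a-d]", "abcdef")

# Escaped metacharacters inside a class are added as literals.
escaped = re.findall(r"[\^\]]", "a^b]c")

# Several ranges can be combined in one class.
ranged = re.findall(r"[a-zA-Z0-9]", "x-7_Q")

# Character escapes add control characters to the class.
control = re.findall(r"[\n\r\t]", "a\tb\nc")

print(negated)  # ['e', 'f']
print(escaped)  # ['^', ']']
print(ranged)   # ['x', '7', 'Q']
print(control)  # ['\t', '\n']
```

POSIX classes, class subtraction and intersection are not shown because Python's `re` does not support them.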

Shorthand Character Classes

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Shorthand	Any shorthand outside character classes	Shorthands can be used outside character classes.	`\w` matches a single word character	✔	✔	✔ (Opt)	✔	✔
Shorthand	Any shorthand inside character classes	Shorthands can be used inside character classes.	`[\w]` matches a single word character	✔	✔	✔ (Opt)	✔	✔
Shorthand	Negated shorthand inside character classes.	Negated shorthands can be used inside character classes.	`[\W]` matches a single non-word character	✔	✔	✔ (Opt)	✔	✔
Shorthand	`\d`	Adds all digits to the class, or matches a single digit.	`[\d]` or `\d` match a single digit	✔	✔	✔	✔	✔
Shorthand	`\w`	Adds all word characters to the class, or matches a single word character.	`[\w]`/`\w` match a single word character	✔	✔	✔	✔	✔
Shorthand	`\s`	Adds all whitespace to the class, or matches a single whitespace character.	`[\s]`/`\s` match a single whitespace character	✔	✔	✔	✔	✔
Shorthand	`\l`/`\u`	Adds all lowercase/uppercase characters to the class, or matches one such character.	`\u\l` matches `Aa` but not `aA`.	✔	❌	❌	❌	❌
Shorthand	`\v`	Adds all vertical whitespace characters to the class, or matches one such character.	`[\v]`/`\v` match a single vertical whitespace character	❌	❌	❌	✔ (Opt)	❌
Shorthand	`\h`	Adds all horizontal whitespace characters to the class, or matches one such character.	`[\h]`/`\h` match a single horizontal whitespace character	✔	❌	❌	✔ (Opt)	❌
Shorthand	`\h`	Adds all hex digit characters to the class, or matches one such character.	`[\h]`/`\h` match a single hex digit	❌	❌	❌	✔ (Opt)	❌
Shorthand	`\i`	Adds all characters allowed at the start of an XML identifier (or matches one).	`\i\c*` matches an XML name	❌	❌	❌	❌	❌
Shorthand	`\c`	Adds all characters allowed in an XML identifier after the first.	`\i\c*` matches an XML name	❌	❌	❌	❌	❌
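
The widely supported shorthands (`\d`, `\w`, `\s`) work both on their own and inside a character class. The sketch below shows this with Python's `re` as a stand-in engine (an assumption; `\l`, `\u` and `\h` are not available there).

```python
import re

text = "id 42\tok"

# Shorthand outside a character class.
digits = re.findall(r"\d", text)

# Shorthands combined inside a character class.
digits_or_space = re.findall(r"[\d\s]", text)

# Negated shorthand inside a character class.
non_word = re.findall(r"[\W]", text)

print(digits)           # ['4', '2']
print(digits_or_space)  # [' ', '4', '2', '\t']
print(non_word)         # [' ', '\t']
```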

Anchors

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
String anchor	`^`	Matches at the start of the string the regex pattern is applied to.	`^.` matches `a` in `abc\ndef`	✔	✔	✔	✔	✔
String anchor	`$`	Matches at the end of the string the regex pattern is applied to.	`.$` matches `f` in `abc\ndef`	✔	✔	✔	✔	✔
String anchor	`$`	Matches before the final line break, if any, as well as the end of the string.	`.$` matches `f` in `abc\ndef\n`	❌	✔	❌	✔	❌
Line anchor	`^`	Matches after each line break as well as the start of the string.	`^.` matches `a` and `d` in `abc\ndef`	❌	✔ (Opt)	✔	✔ (Opt)	✔ (Opt)
Line anchor	`$`	Matches before each line break as well as the end of the string.	`.$` matches `c` and `f` in `abc\ndef`	❌	✔ (Opt)	✔	✔ (Opt)	✔ (Opt)
String anchor	`\A`	Matches at the start of the string the regex pattern is applied to.	`\A\w` matches only `a` in `abc`	❌	✔	❌	✔	❌
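
The string-anchor and line-anchor rows differ only in whether `^` and `$` also match at line breaks. The sketch below reproduces the examples from the table with Python's `re` as a stand-in (an assumption; in Python the line-anchor behavior is opt-in via `re.MULTILINE`, and `$` also matches before a final line break).

```python
import re

subject = "abc\ndef"

# String anchors: only the very start of the subject.
string_start = re.findall(r"^.", subject)

# Line anchors: after and before every line break as well.
line_start = re.findall(r"^.", subject, flags=re.MULTILINE)
line_end = re.findall(r".$", subject, flags=re.MULTILINE)

# `$` matching before a final line break.
before_final_break = re.findall(r".$", "abc\ndef\n")

print(string_start)        # ['a']
print(line_start)          # ['a', 'd']
print(line_end)            # ['c', 'f']
print(before_final_break)  # ['f']
```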

Word Boundaries

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Word boundary	`\b`	Matches a position that is followed by a word character but not preceded by one, or that is preceded by a word character but not followed by one.	`\b.` matches `a`, the space, and `d` in `abc def`	✔	✔ (Opt)	✔ (Opt)	✔ (Opt)	✔
Word boundary	`\B`	Matches at a position that is preceded and followed by a word character, or that is not preceded and not followed by a word character.	`\B.` matches `b`, `c`, `e`, and `f` in `abc def`	✔	✔ (Opt)	✔ (Opt)	✔ (Opt)	✔
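
The two examples above can be checked directly. A minimal sketch with Python's `re` as a stand-in engine (an assumption; it is not specRegex):

```python
import re

subject = "abc def"

# \b matches between a word character and a non-word character
# (or the start/end of the subject), so `\b.` picks up the character
# that follows each boundary.
at_boundary = re.findall(r"\b.", subject)

# \B matches everywhere \b does not.
not_at_boundary = re.findall(r"\B.", subject)

print(at_boundary)      # ['a', ' ', 'd']
print(not_at_boundary)  # ['b', 'c', 'e', 'f']
```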

Quantifiers

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Greedy quantifier	`?`	Makes the preceding item optional. Greedy, so prefers to match if possible.	`abc?` matches `abc` or `ab`	✔	✔	✔	✔	✔
Lazy quantifier	`??`	Makes the preceding item optional. Lazy, so prefers not to match.	`abc??` matches `ab` or `abc`	✔	✔	✔ (Opt)	✔	✔
Possessive quantifier¹	`?+`	Makes the preceding item optional. Possessive, so will never relinquish match.	`abc?+c` matches `abcc` but not `abc`	❌	❌	❌	✔	❌
Greedy quantifier	`*`	Match preceding item 0 or more times. Greedy, so prefers to match if possible.	`".*"` matches `"def" "ghi"` in `abc "def" "ghi" jkl`	✔	✔	✔	✔	✔
Lazy quantifier	`*?`	Match preceding item 0 or more times. Lazy, so prefers not to match.	`".*?"` matches `"def"` and `"ghi"` in `abc "def" "ghi" jkl`	✔	✔	✔ (Opt)	✔	✔
Possessive quantifier²	`*+`	Match preceding item 0 or more times. Possessive, so will never relinquish match.	`".*+"` can never match anything	❌	❌	❌	✔	❌
Greedy quantifier	`+`	Match preceding item 1 or more times. Greedy, so prefers to match if possible.	`".+"` matches `"def" "ghi"` in `abc "def" "ghi" jkl`	✔	✔	✔	✔	✔
Lazy quantifier	`+?`	Match preceding item 1 or more times. Lazy, so prefers not to match.	`".+?"` matches `"def"` and `"ghi"` in `abc "def" "ghi" jkl`	✔	✔	✔ (Opt)	✔	✔
Possessive quantifier³	`++`	Match preceding item 1 or more times. Possessive, so will never relinquish match.	`".++"` can never match anything	❌	❌	❌	✔	❌
Fixed quantifier	`{N}` for integer `N >= 0`	Match preceding item `N` times.	`a{3}` matches `aaa`	✔	✔	✔	✔	✔
Greedy ranged quantifier	`{N,M}` for integers `N,M >= 0, M>=N`	Match preceding item between `N` and `M` times, preferring more repetitions.	`a{2,3}` matches `aa` and `aaa`.	✔	✔	✔	✔	✔
Lazy ranged quantifier	`{N,M}?` for integers `N,M >= 0, M>=N`	Match preceding item between `N` and `M` times, preferring fewer repetitions.	`a{2,4}?` matches `aa`, `aaa` or `aaaa`	✔	✔	✔	✔	✔
Possessive ranged quantifier⁴	`{N,M}+` for integers `N,M >= 0, M>=N`	Match preceding item between `N` and `M` times, possessive.	`a{2,4}+a` matches `aaaaa` but not `aaaa`	❌	❌	❌	✔	❌
Greedy variable quantifier	`{N,}` for integer `N >= 0`	Match preceding item at least `N` times, preferring more repetitions.	`a{2,}` matches `aa`, `aaa`, `aaaa`, etc.	✔	✔	✔	✔	✔
Lazy variable quantifier	`{N,}?` for integer `N >= 0`	Match preceding item at least `N` times, preferring fewer repetitions.	`a{2,}?` matches `aa` in `aaaaa`	✔	✔	✔	✔	✔
Possessive variable quantifier⁵	`{N,}+` for integer `N >= 0`	Match preceding item at least `N` times, possessive.	`a{2,}+a` never matches anything	❌	❌	❌	✔	❌
Greedy variable quantifier	`{,N}` for integer `N >= 0`	Match preceding item no more than `N` times, preferring more repetitions.	`a{,4}` matches `aaaa`, `aaa`, `aa`, `a`, or the empty string	✔	❌	❌	❌	❌
Lazy variable quantifier	`{,N}?` for integer `N >= 0`	Match preceding item no more than `N` times, preferring fewer repetitions.	`a{,4}?` matches the empty string, `a`, `aa`, `aaa` or `aaaa`	✔	❌	❌	❌	❌
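
Greedy and lazy quantifiers accept the same strings but prefer different match lengths. The sketch below runs the quoted-string examples from the table with Python's `re` as a stand-in (an assumption; possessive quantifiers and the `{,N}` form are left out because their support varies).

```python
import re

subject = 'abc "def" "ghi" jkl'

# Greedy: runs to the last closing quote it can reach.
greedy = re.findall(r'".*"', subject)

# Lazy: stops at the first closing quote.
lazy = re.findall(r'".*?"', subject)

# The same preference applies to ranged quantifiers.
greedy_range = re.search(r"a{2,}", "aaaaa").group()
lazy_range = re.search(r"a{2,}?", "aaaaa").group()

print(greedy)        # ['"def" "ghi"']
print(lazy)          # ['"def"', '"ghi"']
print(greedy_range)  # aaaaa
print(lazy_range)    # aa
```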

Unicode

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Grapheme	`\X`	Matches a single Unicode grapheme, whether encoded as a single code point or multiple code points using combining marks.	`\X` matches `à` encoded as `U+0061 U+0300`, `à` encoded as `U+00E0`, `©`, etc.	❌	❌	❌	✔	❌
Code point	`\uFFFF` where `FFFF` are 4 hex digits	Matches a specific Unicode code point.	`\u00E0` matches `à` encoded as `U+00E0` only. `\u00A9` matches `©`	✔	✔	✔ (Opt)	❌	❌
Code point	`\u{FFFF}` where `FFFF` are 1-4 hex digits	Matches a specific Unicode code point.	`\u{E0}` matches `à` encoded as `U+00E0` only. `\u{A9}` matches `©`	✔	❌	❌	❌	❌
Code point	`\xFFFF` where `FFFF` are 4 hex digits	Matches a specific Unicode code point.	`\x00E0` matches `à` encoded as `U+00E0` only. `\x00A9` matches `©`	✔	❌	✔ (Opt)	❌	❌
Code point	`\x{FFFF}` where `FFFF` are 1-4 hex digits	Matches a specific Unicode code point.	`\x{E0}` matches `à` encoded as `U+00E0` only. `\x{A9}` matches `©`	✔	❌	❌	✔	✔
Unicode category	`\pL` where `L` is a Unicode category	Matches a single Unicode code point in the specified Unicode category.	`\pL` matches `à` encoded as `U+00E0`; `\pS` matches `©`	❌	❌	❌	✔	✔
Unicode category	`\p{L}` where `L` is a Unicode category	Matches a single Unicode code point in the specified Unicode category.	`\p{L}` matches `à` encoded as `U+00E0`; `\p{S}` matches `©`	❌	✔	❌	✔	✔
Unicode category	`\p{IsL}` where `L` is a Unicode category	Matches a single Unicode code point in the specified Unicode category.	`\p{IsL}` matches `à` encoded as `U+00E0`; `\p{IsS}` matches `©`	❌	❌	❌	❌	❌
Negated unicode category	`\PL` where `L` is a Unicode category	Matches a single Unicode code point not in the specified Unicode category.	`\PS` matches `à` encoded as `U+00E0`; `\PL` matches `©`	❌	❌	❌	✔	✔
Longhand category	`\p{Category}`	Matches a single Unicode code point in the specified Unicode category.	`\p{Letter}` matches `à` encoded as `U+00E0`; `\p{Symbol}` matches `©`	❌	❌	❌	❌	❌
Longhand category	`\p{IsCategory}`	Matches a single Unicode code point in the specified Unicode category.	`\p{IsLetter}` matches `à` encoded as `U+00E0`; `\p{IsSymbol}` matches `©`	❌	❌	❌	❌	❌
Unicode script	`\p{Script}`	Matches a single Unicode code point in the specified Unicode script.	`\p{Greek}` matches `Ω`	❌	❌	❌	✔	✔
Unicode script	`\p{IsScript}`	Matches a single Unicode code point in the specified Unicode script.	`\p{IsGreek}` matches `Ω`	❌	❌	❌	❌	❌
Unicode block	`\p{Block}`	Matches a single Unicode code point in the specified Unicode block.	`\p{Arrows}` matches any of the code points from `U+2190` until `U+21FF` (`←` until `⇿`)	❌	❌	❌	❌	❌
Unicode block	`\p{InBlock}`	Matches a single Unicode code point in the specified Unicode block.	`\p{InArrows}` matches any of the code points from `U+2190` until `U+21FF` (`←` until `⇿`)	❌	❌	❌	❌	❌
Unicode block	`\p{IsBlock}`	Matches a single Unicode code point in the specified Unicode block.	`\p{IsArrows}` matches any of the code points from `U+2190` until `U+21FF` (`←` until `⇿`)	❌	✔	❌	❌	❌
Negated unicode property	`\P{Property}`	Matches a single code point that lacks the named property (block/script/category).	`\P{L}` matches `©`	❌	✔	❌	✔	✔
Negated unicode property	`\p{^Property}`	Matches a single code point that lacks the named property (block/script/category).	`\p{^L}` matches `©`	❌	❌	❌	✔	✔
Unicode property	`\P{^Property}`	Matches a single code point that has the named property (block/script/category); the two negations cancel out.	`\P{^L}` matches `q`	❌	❌	❌	✔	✔
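
The remark that a code point escape matches `à` "encoded as `U+00E0` only" refers to precomposed versus decomposed text. The sketch below illustrates this with Python's `re` and `unicodedata` as stand-ins (an assumption; Python's `re` has no `\p{...}` or `\X`, so only the `\uFFFF` form is shown).

```python
import re
import unicodedata

precomposed = "\u00e0"                                  # à as one code point
decomposed = unicodedata.normalize("NFD", precomposed)  # a + U+0300

# A code point escape matches exactly that code point.
matches_precomposed = re.fullmatch(r"\u00E0", precomposed) is not None
matches_decomposed = re.fullmatch(r"\u00E0", decomposed) is not None
matches_copyright = re.fullmatch(r"\u00A9", "©") is not None

print(len(decomposed))      # 2
print(matches_precomposed)  # True
print(matches_decomposed)   # False
print(matches_copyright)    # True
```

Matching both encodings with one token is what the grapheme escape `\X` is for in engines that support it.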

Named Groups and Backreferences

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Named capture group	`(?<name>regex)`	A capture group named `name`. `name` must start with a letter	`(?<x>abc){3}` matches `abcabcabc`. The group `x` matches `abc`	✔	✔	❌	✔	❌
Named capture group	`(?'name'regex)`	A capture group named `name`. `name` must start with a letter	`(?'x'abc){3}` matches `abcabcabc`. The group `x` matches `abc`	✔	✔	❌	✔	❌
Named capture group	`(?P<name>regex)`	A capture group named `name`. `name` must start with a letter	`(?P<x>abc){3}` matches `abcabcabc`. The group `x` matches `abc`	✔	❌	❌	✔	✔
Duplicate named groups	Any named group	Two named groups can share the same name.	`(?<x>a)|(?<x>b)` matches `a` or `b`.	✔	✔	❌	✔	✔
Duplicate named groups	Any named group	Named groups that share the same name are treated as one and the same group.		✔	✔	❌	❌	❌
Duplicate named groups	Any named group	Backreferences refer to the leftmost participating group with the given name.		❌	❌	❌	✔	❌
Named backreference	`\k<name>`	Refers to the text matched by group `name`	`(?<x>abc|def)=\k<x>` matches `abc=abc` or `def=def`, but not `abc=def` or `def=abc`.	❌	✔	❌	✔	❌
Named backreference	`\k'name'`	Refers to the text matched by group `name`	`(?'x'abc|def)=\k'x'` matches `abc=abc` or `def=def`, but not `abc=def` or `def=abc`.	❌	✔	❌	✔	❌
Named backreference	`\k{name}`	Refers to the text matched by group `name`	`(?'x'abc|def)=\k{x}` matches `abc=abc` or `def=def`, but not `abc=def` or `def=abc`.	❌	❌	❌	✔	❌
Named backreference	`\g{name}`	Refers to the text matched by group `name`	`(?'x'abc|def)=\g{x}` matches `abc=abc` or `def=def`, but not `abc=def` or `def=abc`.	❌	❌	❌	✔	❌
Named backreference	`(?P=name)`	Refers to the text matched by group `name`	`(?P<x>abc|def)=(?P=x)` matches `abc=abc` or `def=def`, but not `abc=def` or `def=abc`.	❌	❌	❌	✔	❌
Failed backreference	Any supported backreference	Backreferences to groups that did not participate in the match attempt fail to match	`(?<x>a)?\k<x>` matches `aa` but fails to match `b`.	❌	✔	❌	✔	❌
Nested backreference	Any supported backreference	Backreferences can be used inside the group they reference.	`(?<x>a\k<x>?){3}` matches `aaaaaa`.	❌	✔	❌	✔	❌
Forward backreference	Any supported backreference	Backreferences can be used before the group they reference.	`(\k<x>?(?<x>a)){3}` matches `aaaaaa`.	❌	✔	❌	✔	❌
Named capturing group	Any supported named capture group	A number is a valid name for a capturing group.	`(?<17>abc){3}` matches `abcabcabc`. The group named “17” matches `abc`.	✔	✔	❌	❌	✔
Named capturing group	Any capture group named with a number	If the name of the group is a number, that becomes the group’s name and the group’s number.	`(?<17>abc|def)=\17` matches `abc=abc` or `def=def`, but not `abc=def` or `def=abc`.	❌	✔	❌	❌	❌
Named capturing group	Any supported named capture group	A negative number is a valid name for a capturing group.	`(?<-17>abc){3}` matches `abcabcabc`. The group named “-17” matches `abc`.	✔	❌	❌	❌	❌
Named backreference	Any supported backreference	A negative number can be used in a named backreference to refer to a negatively-named group		❌	❌	❌	❌	❌
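
The named group and backreference examples can be tried with the `(?P<name>...)` and `(?P=name)` spellings. The sketch below uses Python's `re` as a stand-in (an assumption; note that the table marks named backreferences as unsupported in specRegex, so the second half illustrates the other engines' behavior).

```python
import re

# A repeated named group keeps the text of its last iteration.
repeated = re.fullmatch(r"(?P<x>abc){3}", "abcabcabc")
group_x = repeated.group("x")

# A named backreference must repeat exactly what the group matched.
pair = re.compile(r"(?P<x>abc|def)=(?P=x)")
accepted = [s for s in ("abc=abc", "def=def", "abc=def", "def=abc")
            if pair.fullmatch(s)]

print(group_x)   # abc
print(accepted)  # ['abc=abc', 'def=def']
```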

Special Groups

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Comment	`(?#comment)`	Ignored by the regex engine.	`a(?#foobar)b` matches `ab`	✔	✔	❌	✔	❌
Branch reset group	`(?|foo|bar|baz)`	Capture group numbering starts from the same offset in each branch.	`(?|(a)|(b))` has only group `1`	✔	❌	❌	✔	❌
Atomic group⁶	`(?>regex)`	Prevent backtracking into the group after it matches.	`a(?>bc|b)c` matches `abcc` but not `abc`	❌	✔	❌	✔	❌
Positive lookahead	`(?=regex)`	Assert that `regex` matches immediately after this position.	`t(?=s)` matches the second `t` in `streets`	❌	✔	✔ (Opt)	✔	❌
Negative lookahead	`(?!regex)`	Assert that `regex` doesn’t match immediately after this position.	`t(?!s)` matches the first `t` in `streets`	❌	✔	✔ (Opt)	✔	❌
Positive lookbehind	`(?<=regex)`	Assert that `regex` matches immediately before this position.	`(?<=s)t` matches the first `t` in `streets`	❌	✔	❌	✔	❌
Negative lookbehind	`(?<!regex)`	Assert that `regex` doesn’t match immediately before this position.	`(?<!s)t` matches the second `t` in `streets`.	❌	✔	❌	✔	❌
Lookbehind	`(?<=regex|longer regex)`	Alternatives inside lookbehind can differ in length.		❌	✔	❌	✔	❌
Lookbehind	`(?<=x{n,m})`	Quantifiers with maximum repetition count can be used inside lookbehind.	`(?<=s\w{1,7})t` matches the fourth `t` in `twisty streets`.	❌	✔	❌	❌	❌
Lookbehind	`(?<=regex)`	The full regular expression syntax can be used inside lookbehind.	`(?<=s\w+)t` matches only the fourth `t` in `twisty streets`.	❌	✔	❌	❌	❌
Lookbehind	`(group)(?<=\1)`	Backreferences can be used inside lookbehind.	`(\w).+(?<=\1)` matches `twisty street` in `twisty streets`.	❌	✔	❌	✔	❌
Exclude text from match	`\K`	Text left of `\K` is omitted from overall match, but groups are unaffected.	`s\Kt` matches only the first `t` in `streets`.	✔	❌	❌	✔	❌
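
The four lookaround examples on `streets` each single out one of the two `t` characters. The sketch below confirms which one, using Python's `re` as a stand-in (an assumption; the table marks lookaround as unsupported in specRegex, so this shows the other engines' behavior).

```python
import re

subject = "streets"  # the two t's sit at indexes 1 and 5

lookahead_pos = re.search(r"t(?=s)", subject).start()       # t followed by s
neg_lookahead_pos = re.search(r"t(?!s)", subject).start()   # t not followed by s
lookbehind_pos = re.search(r"(?<=s)t", subject).start()     # t preceded by s
neg_lookbehind_pos = re.search(r"(?<!s)t", subject).start() # t not preceded by s

# Inline comments are skipped by the engine.
comment_ignored = re.fullmatch(r"a(?#foobar)b", "ab") is not None

print(lookahead_pos, neg_lookahead_pos)    # 5 1
print(lookbehind_pos, neg_lookbehind_pos)  # 1 5
print(comment_ignored)                     # True
```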

Mode Modifiers

Feature	Syntax	Description	Example	specRegex	.NET	std::regex	PCRE2	RE2
Mode modifier	`(?letters)` at the start of the regex	A mode modifier at the start of the regex affects the whole regex and overrides any options set outside the regex.	`(?i)a` matches `a` and `A`.	✔	✔	❌	✔	✔
Mode modifier	`(?letters)` in the middle of the regex	A mode modifier affects regex tokens to the right of it, until overridden by a contradictory mode modifier.	`te(?i)st` matches `test` and `teST` but not `TEst` or `TEST`	✔	✔	❌	✔	✔
Mode modifier group	`(?letters:regex)`	Non-capturing group with modifiers that affect only the part of the regex inside the group.	`te(?i:st)` matches `test` and `teST` but not `TEst` or `TEST`	✔	✔	❌	✔	✔
Negative modifier	`(?on-off)` and `(?on-off:regex)`	Modifier letters (if any) before the hyphen are turned on, while modifier letters after the hyphen are turned off.	`(?i)te(?-i)st` matches `test` and `TEst` but not `teST` or `TEST`	✔	✔	❌	✔	✔
Reset modifiers	`(?^)`	Turn off all options. The caret can be followed by modifier letters to turn some options back on.	`(?i)te(?^)st` matches `test` and `TEst` but not `teST` or `TEST`	✔	❌	❌	❌	❌
Case insensitive	`(?i)`	Turn on case insensitivity.	`(?i)a` matches `a` and `A`	✔	✔	❌	✔	✔
Free spacing	`(?x)`	Turn on free-spacing mode to ignore whitespace between regex tokens and allow `#` comments.	`(?x)a#b` matches `a`	✔	✔	❌	✔	❌
Freer spacing	`(?xx)`	Like `(?x)`, but also allows free spacing inside character classes.	`(?xx)[ a]` matches `a` but not a space	❌	❌	❌	✔	❌
Tight spacing	`(?t)`	Disables free spacing mode.	`(?t)a#b` matches `a#b`	✔	❌	❌	❌	❌
Single line	`(?s)`	Make the dot match all characters including line break characters.	`(?s).*` matches `ab\n\ndef` in `ab\n\ndef`	✔	✔	❌	✔	✔
Multi line	`(?m)`	Make ^ and $ match at the start and end of each line.	`(?m)^.` matches `a` and `d` in `ab\n\ndef`	❌	✔	❌	✔	✔
Explicit capture	`(?n)`	Plain parentheses are non-capturing groups instead of numbered capturing groups. Only named capturing groups actually capture.	`(?n)(a|b)c` is the same as `(?:a|b)c`	✔	✔	❌	✔	❌
Duplicate named groups	`(?J)`	Allow multiple named capturing groups to share the same name.	`(?J)(?:(?'x'a)|(?'x'b))\k'x'` matches `aa` or `bb`	✔	❌	❌	✔	❌
Ungreedy quantifiers	`(?U)`	Switches the syntax for greedy and lazy quantifiers.	`(?U)a*` is lazy and `(?U)a*?` is greedy	✔	❌	❌	✔	✔
UNIX lines	`(?d)`	When anchors match at line breaks, or dot does not match line breaks, only consider the line feed character as a line break.	`(?dm)^.` matches `a` and `c` in `a\rb\nc`	❌	❌	❌	❌	❌
Literal	`(?q)`	Interpret the regular expression as a literal string (excluding the modifier)	`(?q)[a\]+` matches `[a\]+` literally	✔	❌	❌	❌	❌
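
Several of the modifier examples above can be reproduced as written. The sketch below uses Python's `re` as a stand-in (an assumption; it is not specRegex). Because newer Python versions only accept a bare `(?letters)` at the very start of a pattern, the scoped group forms are used for the mid-pattern cases.

```python
import re

words = ("test", "teST", "TEst", "TEST")

# Modifier at the start: applies to the whole pattern.
whole_pattern = re.fullmatch(r"(?i)a", "A") is not None

# Modifier group: case-insensitive only inside the group.
scoped_on = [w for w in words if re.fullmatch(r"te(?i:st)", w)]

# Negative modifier group: case-sensitive only inside the group.
scoped_off = [w for w in words if re.fullmatch(r"(?i)te(?-i:st)", w)]

# Single-line, multi-line and free-spacing modes.
dot_matches_breaks = re.fullmatch(r"(?s).*", "ab\n\ndef") is not None
line_starts = re.findall(r"(?m)^.", "ab\n\ndef")
free_spacing = re.fullmatch(r"(?x)a#b", "a") is not None

print(whole_pattern)       # True
print(scoped_on)           # ['test', 'teST']
print(scoped_off)          # ['test', 'TEst']
print(dot_matches_breaks)  # True
print(line_starts)         # ['a', 'd']
print(free_spacing)        # True
```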

Footnotes 1 to 6: The main application for atomic groups and possessive quantifiers is to avoid catastrophic backtracking. Since specRegex is not a backtracking-based regex engine, it is immune to this problem by design. One side effect of this design is that atomic groups and possessive quantifiers are difficult to implement, but it also makes them largely unnecessary.
