Regular expressions - character classes, alternation, metasequences
Jeffrey Friedl's book Mastering Regular Expressions has a handy table that I want to quote in this post.
* Features marked with an asterisk in the table of metacharacters are not supported by all versions of egrep.
Regular expressions open up broad possibilities for searching and replacing in any text. With them you can process text documents flexibly and simply. One of the simplest applications is text search: many text editors can search by regular expression patterns. Regular expressions have several kinds of metacharacters that perform different functions; let's look at them briefly:
- `^` and `$` "anchor" the match of the rest of the regular expression to the beginning and end of the line, respectively. For example, `^cat` matches the lines “cat” and “caterpillar”, and `dog$` matches the lines “bulldog” and “hotdog”.
- `[…]`, character classes, let you list the characters that may appear at a given position in the text. For example, `gr[ea]y` matches the lines “grey” and “gray”.
- `[^…]`, negated character classes, let you list the characters that may not appear at a given position in the text. For example, `g[^ae]rdy` does not match the lines "gardy" and "gerdy", but it matches the lines "gurdy", "g3rdy" and "girdy".
- `(…|…)`, alternation, is a choice among several options. Note that each alternative is a full-fledged regular expression. For example, `Jeff(re|er)y` matches the strings Jeffrey and Jeffery.
- `\<` and `\>` are metasequences analogous to `^` and `$`, but at the level of words. For example, `\<cat\>` matches the single word cat.
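The examples from the list above can be checked with a short script. This is only a sketch in Python, which is an assumption of mine, since the text itself talks about egrep. The helper `matching` and the sample list `lines` are made up for illustration. Python's `re` module does not support `\<` and `\>`, so the last pattern uses `\b` (a word boundary at either end of a word) as the closest substitute.

```python
import re

# Sample "lines" of text taken from the examples in the list above,
# plus "concatenate" as a word that merely contains "cat".
lines = ["cat", "caterpillar", "bulldog", "hotdog", "grey", "gray",
         "gardy", "gurdy", "g3rdy", "Jeffrey", "Jeffery", "concatenate"]


def matching(pattern, candidates=lines):
    """Return the candidates in which the pattern matches somewhere."""
    return [s for s in candidates if re.search(pattern, s)]


anchored_start = matching(r"^cat")        # "cat" at the beginning of the line
anchored_end = matching(r"dog$")          # "dog" at the end of the line
char_class = matching(r"gr[ea]y")         # "e" or "a" in the third position
negated_class = matching(r"g[^ae]rdy")    # anything except "a" or "e" after "g"
alternation = matching(r"Jeff(re|er)y")   # either "re" or "er" between "Jeff" and "y"

# Assumption: \b stands in for \< and \>, which Python's re does not provide.
whole_word = matching(r"\bcat\b",
                      ["cat", "the cat sat", "concatenate", "caterpillar"])

print(anchored_start, anchored_end, char_class)
print(negated_class, alternation, whole_word)
```

With egrep implementations that support them, the pattern `\<cat\>` would play the role that `\bcat\b` plays here.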
Regular expression metacharacters

Symbol | Name | Meaning |
---|---|---|
Single-character items | | |
`.` | dot | any one character |
`[…]` | character class | any one of the listed characters |
`[^…]` | negated character class | any one character not listed in the class |
`\char` | escape | when "\" precedes a metacharacter, the character is interpreted as the corresponding literal |
Quantifiers | | |
`?` | question mark | one instance allowed (none required) |
`*` | star | any number of instances allowed (none required) |
`+` | plus | one instance required; any number of additional instances allowed |
`{min,max}` | interval quantifier * | at least min instances required, at most max allowed |
Positional metacharacters | | |
`^` | caret | position at the beginning of the line |
`$` | dollar | position at the end of the line |
`\<` | word boundary * | position at the beginning of a word |
`\>` | word boundary * | position at the end of a word |
Other metacharacters | | |
`\|` | alternation | either of the expressions it separates |
`(…)` | parentheses | limits the scope of alternation, groups items for quantifiers, and captures text for backreferences |
`\1`, `\2`, … | backreference | text previously matched by the first, second, etc. pair of parentheses |
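The quantifiers and backreferences in the table can be tried out the same way. The following is a minimal sketch in Python rather than egrep, so treat it as an illustration under that assumption. The helper `full` and the test strings are invented for the example. Note that Python expects the interval quantifier without a space, as in `{2,3}`.

```python
import re


def full(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# ?  : the "u" may appear once or not at all
optional = [s for s in ["color", "colour", "colouur"] if full(r"colou?r", s)]

# *  : any number of "b" characters, including none
star = [s for s in ["ac", "abc", "abbbc"] if full(r"ab*c", s)]

# +  : at least one "b" is required
plus = [s for s in ["ac", "abc", "abbbc"] if full(r"ab+c", s)]

# {min,max} : between two and three "a" characters
interval = [s for s in ["a", "aa", "aaa", "aaaa"] if full(r"a{2,3}", s)]

# \1 : must repeat exactly the text captured by the first pair of parentheses,
# which makes it possible to find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")

print(optional, star, plus, interval)
print(doubled.group(0), "|", doubled.group(1))
```

The leading `\b` in the last pattern keeps the search from starting inside "this", so the match found is the doubled word itself.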