Regular Expressions/Basic Regular Expressions

From Wikibooks, open books for an open world

Basic Regular Expressions: Implementations of regular expressions differ in how they interpret a backslash in front of certain metacharacters. For example, egrep and perl treat unbackslashed parentheses and vertical bars as metacharacters and reserve the backslashed versions for the literal characters themselves. Old versions of grep did not support the alternation operator (the vertical bar, or pipe).
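
The egrep and perl convention described above can be tried out with Python's `re` module, which follows the same rule for parentheses and vertical bars. This is only an illustrative sketch: Python's syntax is not POSIX Basic Regular Expression syntax, and the sample strings are made up for the demonstration.

```python
import re

# Unbackslashed parentheses and bar act as metacharacters:
# a group containing the alternatives "hat" or "cat".
grouped = re.compile(r"(hat|cat)")

# Backslashed versions stand for the literal characters "(", "|" and ")".
literal = re.compile(r"\(hat\|cat\)")

sample_word = "a cat sat"
sample_text = "the pattern (hat|cat) is shown here"

grouped_hit = grouped.search(sample_word)       # finds "cat"
literal_miss = literal.search(sample_word)      # no literal "(hat|cat)" present
literal_hit = literal.search(sample_text)       # finds the literal text

print(grouped_hit.group())
print(literal_miss)
print(literal_hit.group())
```

In a tool that uses the basic syntax, the roles of the backslashed and unbackslashed forms are typically reversed, which is why the same pattern text can behave differently from one program to another.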

Operators
Operator Effect
. The dot operator matches any single character.
[ ] A box (bracket expression) matches a single character against a character list or character range.
[^ ] A complement box matches a single character that is not within the character list or character range.
* An asterisk matches zero or more occurrences of the preceding element.
^ The caret anchor matches the beginning of the line.
$ The dollar anchor matches the end of the line.
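
The asterisk applies to the element immediately before it, which a short experiment can make concrete. The sketch below uses Python's `re` module as a stand-in (an assumption: for the operators in this table, Python behaves the same way as the basic syntax); the word lists are invented for illustration.

```python
import re

# "b*" means zero or more "b" characters, so "ab*c" matches
# "ac", "abc", "abbbc", and so on, but not "adc".
b_star = re.compile(r"ab*c")
b_star_matches = [s for s in ["ac", "abc", "abbbc", "adc"] if b_star.fullmatch(s)]

# Combined with the dot, ".*" matches any run of characters, including none.
dot_star = re.compile(r"h.*t")
dot_star_matches = [s for s in ["ht", "hat", "heat", "hats"] if dot_star.fullmatch(s)]

print(b_star_matches)
print(dot_star_matches)
```

`fullmatch` is used here so that the whole string must fit the pattern; a search-style match, as grep performs, would also accept strings that merely contain a matching part.
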
Examples:
Example Match
".at" any three-character string ending in "at", such as hat, cat or bat
"[hc]at" hat and cat
"[^b]at" every string matched by the regex ".at" except bat
"^[hc]at" hat and cat, but only at the beginning of a line
"[hc]at$" hat and cat, but only at the end of a line
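
The examples in the table can be checked with a few lines of Python. This is a minimal sketch, assuming Python's `re` module as the engine (it agrees with the basic syntax for these particular operators) and treating each candidate string as one line; the helper name `matches` and the candidate list are hypothetical.

```python
import re

def matches(pattern, candidates):
    """Return the candidates in which the pattern matches somewhere."""
    return [text for text in candidates if re.search(pattern, text)]

# Each string plays the role of one input line.
candidates = ["hat", "cat", "bat", "the cat", "cats"]

any_at = matches(r".at", candidates)        # any character followed by "at"
hc_at = matches(r"[hc]at", candidates)      # "hat" or "cat" anywhere in the line
not_b = matches(r"[^b]at", candidates)      # "at" preceded by anything except "b"
at_start = matches(r"^[hc]at", candidates)  # "hat" or "cat" at the start of the line
at_end = matches(r"[hc]at$", candidates)    # "hat" or "cat" at the end of the line

for name, result in [("any_at", any_at), ("hc_at", hc_at), ("not_b", not_b),
                     ("at_start", at_start), ("at_end", at_end)]:
    print(name, result)
```

Because the search looks for a match anywhere in the line, "the cat" and "cats" are accepted by the unanchored patterns; only the caret and dollar anchors tie the match to a line boundary.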

The meaning of many character ranges depends on the chosen locale setting. For example, in some locales letters are ordered as abc..yzABC..YZ, while in others they are ordered as aAbBcC..yYzZ, so the same range can cover different characters.
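
The effect of the two orderings on a range such as a to z can be simulated directly. The sketch below does not query any real locale, and Python's own `re` module compares characters by code point rather than by locale; the two order strings and the helper `range_members` are assumptions made only to illustrate the idea.

```python
import string

# The two hypothetical letter orders mentioned in the paragraph above.
CASE_SEPARATED = string.ascii_lowercase + string.ascii_uppercase  # abc..zABC..Z
CASE_INTERLEAVED = "".join(
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
)  # aAbB..zZ

def range_members(order, first, last):
    """Return the characters from first to last (inclusive) in the given order."""
    start, end = order.index(first), order.index(last)
    return order[start:end + 1]

separated = range_members(CASE_SEPARATED, "a", "z")
interleaved = range_members(CASE_INTERLEAVED, "a", "z")

print(separated)
print(interleaved)
```

Under the first ordering the range holds only the lowercase letters. Under the second it also takes in the uppercase letters A through Y, since they sort between a and z, while Z falls outside the range.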

The POSIX Basic Regular Expressions syntax provides extensions for consistency between utility programs such as grep, sed and awk. These extensions are not supported by some traditional implementations of Unix tools.

Use in Tools

Tools and languages that utilize this regular expression syntax include: TBD