Regular Expressions/Simple Regular Expressions
The Simple Regular Expression syntax is widely used on Unix-based systems for backward compatibility. Most regular-expression-aware Unix utilities, such as grep and sed, use it by default and support extended regular expressions through command-line arguments (see below). This syntax is deprecated on POSIX-compliant systems and should not be used by new utilities.
When simple regular expression syntax is used, most characters are treated as literals and match only themselves (for example, "a" matches "a", "(bc" matches "(bc", etc.). The exceptions are the metacharacters listed in the table below.
Operator | Effect |
---|---|
. | The dot operator matches any single character. |
[ ] | Boxes enable a single character to be matched against character lists or character ranges. |
[^ ] | A complement box enables a single character not within a character list or character range to be matched. |
^ | A caret anchor matches the start of the line (or of any line, when applied in multiline mode). |
$ | A dollar anchor matches the end of the line (or of any line, when applied in multiline mode). |
\( \) | Backslash-escaped parentheses define a marked subexpression. The matched text can be recalled later with \n. Unescaped parentheses are literal characters. |
\n | Where n is a digit from 1 to 9; matches what the nth marked subexpression matched. This irregular construct has not been adopted in the extended regular expression syntax. |
* | A single character expression followed by "*" matches zero or more copies of the expression. For example, "ab*c" matches "ac", "abc", "abbbc" etc. "[xyz]*" matches "", "x", "y", "zx", "zyx", and so on. |
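
The operators in the table can be tried out with a short script. The sketch below is only an approximation: it uses Python's `re` module, which implements a Perl-style syntax rather than the simple syntax described here. The patterns were chosen because they mean the same thing in both, with one exception noted in the comments: Python writes a marked subexpression with plain parentheses. The helper name `matches` is an assumption made for this illustration.

```python
import re


def matches(pattern, text):
    """Return True when pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None


# "." stands for exactly one character.
dot = [matches(r"h.t", s) for s in ["hat", "hot", "ht"]]

# "*" allows zero or more copies of the single-character expression before it.
star = [matches(r"ab*c", s) for s in ["ac", "abc", "abbbc", "abd"]]

# A box followed by "*" accepts any run of the listed characters, even an empty one.
box_star = [matches(r"[xyz]*", s) for s in ["", "x", "zyx", "zyxw"]]

# A complement box accepts one character that is not in the list.
complement = [matches(r"[^hc]at", s) for s in ["bat", "hat", "cat"]]

# \1 must repeat the exact text matched by the first marked subexpression.
# Python spells the group ([hc]); grep and sed in default mode spell it \([hc]\).
backref = [matches(r"([hc])at \1at", s) for s in ["hat hat", "cat cat", "hat cat"]]

for name, result in [("dot", dot), ("star", star), ("box_star", box_star),
                     ("complement", complement), ("backref", backref)]:
    print(name, result)
```

The backreference case shows that `\1` recalls the matched text, not the subexpression: "hat cat" is rejected even though both words fit `[hc]at`.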
Examples
Examples:
- "^[hc]at"
- Matches hat and cat but only at the beginning of a line.
- "[hc]at$"
- Matches hat and cat but only at the end of a line.
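
The two anchored patterns above can be checked against a few sample lines. This is a minimal sketch that again uses Python's `re` module as a stand-in, since both patterns are written identically in the simple syntax. The sample lines are invented for the illustration.

```python
import re

# Hypothetical input: each string stands for one line of text.
lines = ["hat stand", "the cat", "cathedral", "that", "a chat"]

# "^[hc]at": the line must begin with "hat" or "cat".
at_start = [line for line in lines if re.search(r"^[hc]at", line)]

# "[hc]at$": the line must end with "hat" or "cat".
at_end = [line for line in lines if re.search(r"[hc]at$", line)]

print(at_start)
print(at_end)
```

The anchors constrain the position within the line, not word boundaries, so "cathedral" is accepted by the first pattern and "that" and "a chat" by the second.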
Use in Tools
Tools and languages that utilize this regular expression syntax include: