
Perl Programming/Regular expressions reference

From Wikibooks, open books for an open world
Previous: Regular expression operators Index Next: Code reuse (modules)

Regular expressions with Perl examples

Metacharacter Description Example
Note that the condition of every if statement below evaluates to true.
. Matches an arbitrary character, but not a newline.
$string1 = "Hello World\n";
if ($string1 =~ m/...../) {
print "$string1 has length >= 5\n";
}
( ) Groups a series of pattern elements into a single element. When a pattern within parentheses matches, you can later use $1, $2, and so on (one variable per group, numbered by opening parenthesis) to refer to the text that each group matched.

Program:

$string1 = "Hello World\n";
if ($string1 =~ m/(H..).(o..)/) {
print "We matched '$1' and '$2'\n";
}

Output:

We matched 'Hel' and 'o W'
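
As a complement to $1 and $2, a match evaluated in list context returns its captured groups directly, which avoids reading the numbered variables afterwards. The following is a minimal sketch; the helper name capture_pair is hypothetical and not part of the original example.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns the two
# captured groups, or an empty list when the pattern does not match.
sub capture_pair {
    my ($text) = @_;
    # In list context, a successful match returns its captures in order.
    return $text =~ m/(H..).(o..)/;
}

my ($first, $second) = capture_pair("Hello World\n");
print "We matched '$first' and '$second'\n";   # We matched 'Hel' and 'o W'
```

Copying the captures into named variables right away also protects them from being overwritten by a later successful match.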
+ Matches the preceding pattern element one or more times.
$string1 = "Hello World\n";
if ($string1 =~ m/l+/) {
print "There are one or more consecutive l's in $string1\n";
}
? Matches the preceding pattern element zero or one time.
$string1 = "Hello World\n";
if ($string1 =~ m/H.?e/) {
print "There is an 'H' and an 'e' separated by ";
print "0-1 characters (Ex: He Hoe)\n";
}
? Placed directly after a quantifier (*, +, ? or {M,N}), makes that quantifier match as few times as possible.
$string1 = "Hello World\n";
if ($string1 =~ m/(l+?)/) {
print "The non-greedy match with one or more 'l' ";
print "is '$1', that is 'l', not 'll'.\n";
}
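
The difference between greedy and non-greedy quantifiers is easiest to see side by side. The sketch below assumes a hypothetical helper, first_capture, that returns the first captured group. Note that a non-greedy quantifier does not change where a match starts: the regex engine still takes the leftmost possible match, so a pattern such as l+?o captures 'llo' in "Hello", not 'lo'.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns the first
# capture of $pattern in $text, or undef when the pattern does not match.
sub first_capture {
    my ($text, $pattern) = @_;
    return $text =~ $pattern ? $1 : undef;
}

my $string1 = "Hello World\n";

my @cases = (
    [ 'l+',    qr/(l+)/    ],   # greedy:     'll'
    [ 'l+?',   qr/(l+?)/   ],   # non-greedy: 'l'
    [ 'l.+o',  qr/(l.+o)/  ],   # greedy:     'llo Wo'
    [ 'l.+?o', qr/(l.+?o)/ ],   # non-greedy: 'llo'
    [ 'l+?o',  qr/(l+?o)/  ],   # leftmost match still wins: 'llo'
);

for my $case (@cases) {
    my ($label, $pattern) = @$case;
    my $found = first_capture($string1, $pattern);
    print "$label captured '$found'\n";
}
```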
* Matches the preceding pattern element zero or more times.
$string1 = "Hello World\n";
if ($string1 =~ m/el*o/) {
print "There is an 'e' followed by zero to many ";
print "'l' followed by 'o' (eo, elo, ello, elllo)\n";
}
{M,N} Denotes the minimum M and the maximum N match count.
$string1 = "Hello World\n";
if ($string1 =~ m/l{1,2}/) {
print "There exists a substring with at least one ";
print "and at most two l's in $string1\n";
}
[...] Denotes a set of possible matches.
$string1 = "Hello World\n";
if ($string1 =~ m/[aeiou]+/) {
print "$string1 contains one or more ";
print "vowels\n";
}
[^...] Matches any single character not listed in the square brackets.
$string = "Sky.";
if ($string =~ /[^aeiou]/) {
print "$string contains at least one character that is not a lowercase vowel\n";
}
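
An unanchored [^aeiou] only shows that some character is not a vowel. To test that a string contains no vowels at all, the negated class has to cover the whole string. A minimal sketch, assuming a hypothetical helper named has_no_vowels:

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns 1 when
# $text contains no vowel in either case, 0 otherwise.
sub has_no_vowels {
    my ($text) = @_;
    # \A and \z anchor the whole string, so every character must come
    # from the negated class; /i also excludes A, E, I, O and U.
    return $text =~ m/\A[^aeiou]*\z/i ? 1 : 0;
}

for my $word ("Sky.", "Hello") {
    my $unanchored = $word =~ /[^aeiou]/ ? 1 : 0;
    printf "%-6s unanchored: %d, anchored: %d\n",
        $word, $unanchored, has_no_vowels($word);
}
```

Both words pass the unanchored test, because 'S' and 'H' are not vowels; only "Sky." passes the anchored one.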
| Matches either the left or the right operand.
$string1 = "Hello World\n";
if ($string1 =~ m/(Hello|Hi)/) {
print "Hello or Hi is ";
print "contained in $string1";
}
\b Matches a word boundary.
$string1 = "Hello World\n";
if ($string1 =~ m/ello\b/) {
print "There is a word that ends with";
print " 'ello'\n";
} else {
print "There are no words that end with ";
print "'ello'\n";
}
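
Placing \b on both sides of a pattern restricts it to whole words. The sketch below assumes a hypothetical helper, has_word; \Q...\E is used so that any metacharacters in the word being searched for are treated literally.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns 1 when
# $word occurs in $text as a whole word, 0 otherwise.
sub has_word {
    my ($text, $word) = @_;
    # \b on each side requires a word boundary before and after $word.
    return $text =~ m/\b\Q$word\E\b/ ? 1 : 0;
}

my $string1 = "Hello World\n";
print has_word($string1, "World"), "\n";   # 1: a complete word
print has_word($string1, "Wor"),   "\n";   # 0: followed by 'l', so no boundary
print has_word($string1, "ello"),  "\n";   # 0: preceded by 'H', so no boundary
```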
\w Matches a word character: a letter, a digit, or "_".
$string1 = "Hello World\n";
if ($string1 =~ m/\w/) {
print "There is at least one alpha-";
print "numeric char in $string1 (A-Z, a-z, 0-9, _)\n";
}
\W Matches any character that \w does not match, that is, anything other than a letter, a digit, or "_".
$string1 = "Hello World\n";
if ($string1 =~ m/\W/) {
print "The space between Hello and ";
print "World is not alphanumeric\n";
}
\s Matches a whitespace character (space, tab, newline, form feed).
$string1 = "Hello World\n";
if ($string1 =~ m/\s.*\s/) {
print "There are TWO whitespace ";
print "characters separated by other characters in $string1";
}
\S Matches anything but a whitespace.
$string1 = "Hello World\n";
if ($string1 =~ m/\S.*\S/) {
print "There are TWO non-whitespace ";
print "characters separated by other characters in $string1";
}
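\d Matches a digit. For ASCII text this is the same as [0-9]; on Unicode strings \d can also match digits from other scripts unless the /a modifier is used.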
$string1 = "99 bottles of beer on the wall.";
if ($string1 =~ m/(\d+)/) {
print "$1 is the first number in '$string1'\n";
}

Output:

99 is the first number in '99 bottles of beer on the wall.'
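
The example above stops at the first number. To collect every number in a string, the same pattern can be combined with the /g modifier in list context. This is a sketch only; the helper name all_numbers and the sample sentence are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns every run
# of digits found in $text, in order of appearance.
sub all_numbers {
    my ($text) = @_;
    # With /g in list context, the match returns all captures at once.
    return $text =~ m/(\d+)/g;
}

my @numbers = all_numbers("99 bottles of beer on the wall, 98 after you take 1 down.");
print "Numbers found: @numbers\n";   # Numbers found: 99 98 1
```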
\D Matches a non-digit.
$string1 = "Hello World\n";
if ($string1 =~ m/\D/) {
print "There is at least one character in $string1";
print "that is not a digit.\n";
}
^ Matches the beginning of a string; with the /m modifier, it also matches the beginning of each line.
$string1 = "Hello World\n";
if ($string1 =~ m/^He/) {
print "$string1 starts with the characters 'He'\n";
}
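$ Matches the end of a string (or the position just before its trailing newline); with the /m modifier, it also matches the end of each line.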
$string1 = "Hello World\n";
if ($string1 =~ m/rld$/) {
print "$string1 is a line or string ";
print "that ends with 'rld'\n";
}
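
Whether ^ and $ match at every line or only at the ends of the whole string depends on the /m modifier. The following sketch counts matches in a two-line string with and without /m; the helper count_matches is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): returns the number
# of non-overlapping matches of $pattern in $text.
sub count_matches {
    my ($text, $pattern) = @_;
    my @matches = $text =~ /$pattern/g;
    return scalar @matches;
}

my $text = "Hello World\nHello World\n";

print count_matches($text, qr/^He/),   "\n";   # 1: ^ matches only at the start of the string
print count_matches($text, qr/^He/m),  "\n";   # 2: with /m, ^ also matches after each newline
print count_matches($text, qr/rld$/),  "\n";   # 1: $ matches just before the final newline
print count_matches($text, qr/rld$/m), "\n";   # 2: with /m, $ matches before every newline
```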