PHP Programming/Regular expressions
Syntax
Character | Type | Explanation |
---|---|---|
. | Dot | any single character (except a newline by default) |
[...] | Brackets | character class: any one of the enumerated characters |
[^...] | Brackets and circumflex | complemented class: any one character except the enumerated ones |
^ | Circumflex | string or line start |
$ | Dollar | string or line end |
| | Pipe | alternation: either the expression before it or the one after it |
(...) | Parentheses | capture group: also used to limit the scope of an alternation |
* | Asterisk | zero or more occurrences |
+ | Plus | one or more occurrences |
? | Question mark | zero or one occurrence |
Class | Meaning |
---|---|
[[:alpha:]] | any letter |
[[:digit:]] | any digit |
[[:xdigit:]] | hexadecimal characters |
[[:alnum:]] | any letter or digit |
[[:space:]] | any white space |
[[:punct:]] | any punctuation character |
[[:lower:]] | any lowercase letter |
[[:upper:]] | any uppercase letter |
[[:blank:]] | space or tab |
[[:graph:]] | visible characters (printable, excluding the space) |
[[:cntrl:]] | control characters |
[[:print:]] | printable characters, including the space (no control characters) |
Expression | Meaning |
---|---|
\A | String start |
\b | Word boundary (start or end of a word) |
\d | Digit |
\D | Non-digit |
\s | Whitespace character |
\S | Non-whitespace character |
\w | Letter, digit or underscore |
\W | Any character other than a letter, digit or underscore |
\X | Unicode extended grapheme cluster (one user-perceived character) |
\z | String end |
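As a rough illustration of the classes and escape sequences listed above, the following sketch runs a few of them through preg_match() and preg_replace(). The sample strings and variable names are invented for the example.

```php
<?php
// POSIX class: the whole string consists of letters only.
$onlyLetters = preg_match('/^[[:alpha:]]+$/', 'Wikibooks');

// \A and \z anchor the string start and end; \d+ requires digits only.
$onlyDigits = preg_match('/\A\d+\z/', '2024');

// \b marks a word boundary, so "regex" must appear as a whole word.
$wholeWord  = preg_match('/\bregex\b/', 'PHP regex test');
$partOfWord = preg_match('/\bregex\b/', 'PHP regexes test');

// \s+ matches any run of whitespace (spaces, tabs, newlines).
$collapsed = preg_replace('/\s+/', ' ', "PHP \t regex\ntest");

var_dump($onlyLetters, $onlyDigits, $wholeWord, $partOfWord, $collapsed);
```

Only the "regexes" test is expected to fail to match, because the letter that follows "regex" prevents a word boundary there.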
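?:
: non-capturing group: the group is skipped when capture groups are numbered. Example: ((?:ignored_substring|other).)
?!
: negative lookahead: matches only if the text that follows is not the given substring. Example: ((?!excluded_substring).)
$1
: result of the first capture group.
Attention: to search for a literal dollar sign, "\$" does not work, because inside double quotes PHP turns \$ into a plain $, which the regex engine then reads as the end-of-string anchor. Use single quotes instead of double quotes: '\$'.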
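The three constructs above and the dollar-sign pitfall can be sketched as follows. The patterns and sample strings are made up for illustration; only preg_match() itself comes from the text.

```php
<?php
// (?:...) groups the alternatives without creating a capture group,
// so the name ends up in capture group 1.
preg_match('/(?:Mr|Ms) (\w+)/', 'Ms Lovelace', $m);
// $m[0] holds the whole match, $m[1] the only captured group.

// (?!...) is a negative lookahead: the match fails if "test_" comes next.
$isAllowed  = preg_match('/^(?!test_)\w+$/', 'alice');
$isRejected = preg_match('/^(?!test_)\w+$/', 'test_alice');

// Double quotes: PHP turns \$ into $, so the pattern becomes /$/
// (end of string) and "matches" text that contains no dollar sign.
$falsePositive = preg_match("/\$/", 'no dollar here');

// Single quotes keep the backslash, so the pattern is a literal dollar.
$noDollar  = preg_match('/\$/', 'no dollar here');
$hasDollar = preg_match('/\$/', 'price: $5');

var_dump($m, $isAllowed, $isRejected, $falsePositive, $noDollar, $hasDollar);
```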
In PHP, a regex pattern must always be enclosed in delimiters. This page generally uses the grave accent (`), but / and # are also common.
In addition, options (modifiers) can be appended after the closing delimiter:
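Option | Meaning |
---|---|
i | case-insensitive matching |
m | multiline: ^ and $ also match at the start and end of each line |
s | the "." also matches newlines |
x | ignore whitespace in the pattern |
u | treat the pattern and the subject as UTF-8 (multi-byte characters) |
PHP has no o option: to stop at the first match, use preg_match() rather than preg_match_all().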
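A small sketch of how pattern modifiers change matching is given below. It assumes PHP 7.0 or later for the "\u{e9}" escape, and the sample strings are invented. It covers i, m, x and u, plus the s modifier, which is the one that lets "." match a newline.

```php
<?php
// i: case-insensitive matching.
$i = preg_match('/wikibooks/i', 'WIKIBOOKS');

// m: ^ and $ match at every line, so both lines are counted.
$m = preg_match_all('/^\w+$/m', "first\nsecond");

// s: without it "." stops at a newline; with it "." matches the newline.
$withoutS = preg_match('/a.b/', "a\nb");
$withS    = preg_match('/a.b/s', "a\nb");

// x: whitespace inside the pattern is ignored, so it can be spaced out.
$x = preg_match('/ \d+ \s \w+ /x', '42 apples');

// u: "." matches one UTF-8 character instead of one byte.
$eAcute   = "\u{e9}"; // e with acute accent, two bytes in UTF-8
$withoutU = preg_match('/^.$/', $eAcute);
$withU    = preg_match('/^.$/u', $eAcute);

var_dump($i, $m, $withoutS, $withS, $x, $withoutU, $withU);
```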
Searching
The function ereg(), which performed regex searches, has been deprecated since PHP 5.3 (and removed in PHP 7.0) in favour of preg_match().
preg_match()
The function preg_match[3] is the main regex search function[4]. It returns 1 if the pattern matches, 0 if it does not, and false on error. It takes two mandatory parameters: the regex pattern and the string to scan.
The optional third parameter is a variable that receives the array of results.
Finally, the fourth accepts a PHP flag that modifies the function's default behaviour.
- Minimal example:
<?php
$string = 'PHP regex test for the English Wikibooks.';
// preg_match() returns 1 when the pattern is found
if (preg_match('`.*Wikibooks.*`', $string)) {
    print('This text talks about Wikibooks');
} else {
    print('This text doesn\'t talk about Wikibooks');
}
?>
- Advanced example:
<?php
$string = 'PHP regex test for the English Wikibooks.';
// $results receives the matches; the flag also records their offsets
$flag = PREG_OFFSET_CAPTURE;
if (preg_match('`.*Wikibooks.*`', $string, $results, $flag)) {
    var_dump($results);
} else {
    print('This text doesn\'t talk about Wikibooks');
}
?>
Flag examples:[5]
- PREG_OFFSET_CAPTURE: also returns the position (byte offset) of each matched substring in the string.
- PREG_GREP_INVERT: used with preg_grep(), returns the elements that do not match the pattern.
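The effect of PREG_OFFSET_CAPTURE on the results array can be sketched like this. It reuses the sample string from the examples above; the variable names are illustrative.

```php
<?php
$string = 'PHP regex test for the English Wikibooks.';

// preg_match() returns 1 on a match, 0 on no match and false on error.
$found = preg_match('/Wikibooks/', $string, $results, PREG_OFFSET_CAPTURE);

if ($found === 1) {
    // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
    list($text, $offset) = $results[0];
    echo "Found '$text' at offset $offset\n";
}
```

Comparing the return value with === 1 rather than relying on truthiness keeps the "no match" case (0) distinct from the error case (false).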
preg_grep()
This function searches an array and returns the elements that match the pattern[6].
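A minimal sketch of preg_grep(), with and without PREG_GREP_INVERT, is shown below. The file names are hypothetical.

```php
<?php
$files = ['index.php', 'style.css', 'form.php'];

// Keep the entries that end in ".php" (the original keys are preserved).
$phpFiles = preg_grep('/\.php$/', $files);

// PREG_GREP_INVERT keeps the entries that do NOT match instead.
$otherFiles = preg_grep('/\.php$/', $files, PREG_GREP_INVERT);

print_r($phpFiles);
print_r($otherFiles);
```

Because the keys are preserved, the result may have gaps in its numbering; array_values() can reindex it if needed.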
preg_match_all()
To collect every match in one array, replace preg_match with preg_match_all[7], and print the result with print_r.
Example that extracts every parenthesised substring from a file:
// $filename must hold the path of the file to read
$regex = '/\(([^)]*)\)/';
preg_match_all($regex, file_get_contents($filename), $matches);
print_r($matches);
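To see the shape of the array that preg_match_all() fills, here is the same pattern applied to an in-memory string instead of a file. The sample text is invented.

```php
<?php
$text = 'f(x) and g(y, z)';

// Returns the number of full matches and fills $matches.
$count = preg_match_all('/\(([^)]*)\)/', $text, $matches);

// By default $matches[0] lists the full matches and
// $matches[1] lists what capture group 1 caught in each of them.
print_r($matches);
```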
Replacement
preg_replace()
The function preg_replace accepts three mandatory parameters: the pattern to search for, the replacement, and the string to process.
<?php
// Replace spaces with underscores
$string = "PHP regex test for the English Wikibooks.";
$underscoredString = preg_replace('`( )`', '_', $string);
echo $underscoredString;
?>
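The replacement string can also refer to capture groups with $1, $2 and so on. A small sketch, using an invented date string:

```php
<?php
$date = '2024-03-15';

// Three capture groups, reordered in the replacement via $3, $2 and $1.
// Single quotes stop PHP from treating $1 as a variable.
$reordered = preg_replace('/(\d+)-(\d+)-(\d+)/', '$3/$2/$1', $date);

echo $reordered, "\n";
```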
preg_filter()
Works like preg_replace(), but returns only the subjects in which a replacement was made.
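The difference from preg_replace() shows when the subject is an array. The sample data below is illustrative.

```php
<?php
$items = ['a1', 'b', 'c22'];

// preg_replace() returns every element, changed or not.
$replaced = preg_replace('/\d+/', '#', $items);

// preg_filter() returns only the elements where the pattern matched,
// keeping their original keys.
$filtered = preg_filter('/\d+/', '#', $items);

print_r($replaced);
print_r($filtered);
```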
preg_split()
Splits a string into an array, using a regex as the separator.
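A brief sketch of preg_split(), splitting on runs of whitespace or commas. The input strings are invented; PREG_SPLIT_NO_EMPTY is a standard flag that drops empty pieces.

```php
<?php
// Split on any run of whitespace and/or commas.
$parts = preg_split('/[\s,]+/', 'PHP, regex  test,Wikibooks');

// Two adjacent separators would yield an empty piece;
// PREG_SPLIT_NO_EMPTY removes such pieces (-1 means "no limit").
$nonEmpty = preg_split('/,/', 'a,,b', -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
print_r($nonEmpty);
```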
References
- [1] https://www.regular-expressions.info/posixbrackets.html
- [2] http://www.regular-expressions.info/unicode.html
- [3] http://php.net/manual/en/function.preg-match.php
- [4] http://php.net/manual/en/ref.pcre.php
- [5] http://php.net/manual/en/pcre.constants.php
- [6] http://php.net/manual/fr/function.preg-grep.php
- [7] http://www.expreg.com/pregmatchall.php