Regular Expressions Using Perl: Perl is the language best known for its use of regular expressions, and for good reason.
We use the =~ operator to bind a string to a match or a substitution, depending on the operator on its right-hand side. The !~ operator reverses the sense of the match.
There are basically two regex operators in Perl:
- Matching: m//
- Substitution: s///
The purpose of the // is to enclose the regex. However, other delimiters such as {}, "", etc. can be used instead.
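As a small sketch of alternate delimiters (the sample path and the variable names are assumptions made for this illustration), braces avoid having to escape slashes when the pattern itself contains /:

```perl
use strict;
use warnings;

# Hypothetical sample string; any text containing slashes would do.
my $path = "/usr/local/bin/perl";

# With the default / delimiter, every slash in the pattern must be escaped.
my $with_slashes = ($path =~ m/^\/usr\/local\//) ? 1 : 0;

# With braces as delimiters, the slashes need no escaping.
my $with_braces = ($path =~ m{^/usr/local/}) ? 1 : 0;

# The substitution operator accepts alternate delimiters too.
(my $short = $path) =~ s{^/usr/local}{};

print "$with_slashes $with_braces $short\n";   # prints "1 1 /bin/perl"
```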
Matching
To use the matching operator, we bind the string on the left to the m// pattern on the right with the =~ operator.
The following sets $true to 1 if and only if $foo matches the regular expression foo:
$true = ($foo =~ m/foo/);
It is not difficult to see that just the opposite is achieved with !~:
$false = ($foo !~ m/foo/);
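The two operators can be seen side by side in the following sketch. The word list and the array names are hypothetical and chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample inputs.
my @words = ("food", "bar", "tofoo");

my (@matched, @unmatched);
for my $word (@words) {
    if ($word =~ m/foo/) {      # true when "foo" appears anywhere in $word
        push @matched, $word;
    }
    if ($word !~ m/foo/) {      # true when "foo" does not appear in $word
        push @unmatched, $word;
    }
}

print "matched: @matched\n";      # prints "matched: food tofoo"
print "unmatched: @unmatched\n";  # prints "unmatched: bar"
```

Note that the pattern is not anchored, so it matches "foo" anywhere inside the string, not only the exact string "foo".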
Capturing
As promised, () can be used to capture parts of a regex. When the pattern inside a pair of parentheses matches, the matched text goes into the special variables $1, $2, etc., in that order.
Example:
Here’s how one would extract the hours, minutes, seconds from a time string:
if ($time =~ /(\d\d):(\d\d):(\d\d)/) { # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;
}
In list context, a successful match returns the list ($1, $2, $3, ...):
my ($hours, $minutes, $seconds) = ($time =~ m/(\d+):(\d+):(\d+)/);
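A failed match in list context returns an empty list, so the assignment itself can be tested. The sketch below wraps this in a hypothetical helper named parse_time (not part of Perl or of the original text) and anchors the pattern with ^ and $, which is an added assumption about how strict the check should be:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (hours, minutes, seconds), or an empty list
# when the string is not in hh:mm:ss form.
sub parse_time {
    my ($time) = @_;
    # Anchoring with ^ and $ rejects extra text around the time.
    my @parts = $time =~ m/^(\d\d):(\d\d):(\d\d)$/;
    return @parts;
}

for my $input ("12:34:56", "not a time") {
    # The list assignment is true only when the match returned captures.
    if (my ($h, $m, $s) = parse_time($input)) {
        print "hours=$h minutes=$m seconds=$s\n";
    } else {
        print "no match: $input\n";
    }
}
```

Testing the assignment this way avoids reading $1, $2, $3 after a failed match, when they would still hold values from an earlier successful match.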
Substitution
This is our favorite search and replace feature. Almost the same syntax rules apply here, except that there is an extra clause between the second and third delimiters that tells us what to replace the match with.
$x = "Time to feed the cat!";
$x =~ s/cat/hacker/; # $x contains "Time to feed the hacker!"
if ($x =~ s/^(Time.*hacker)!$/$1 now!/) {
    $more_insistent = 1;
}
$y = "'quoted words'";
$y =~ s/^'(.*)'$/$1/; # strip single quotes,
# $y contains "quoted words"
Modifiers
Modifiers can be appended to the end of a regex operator to change its matching behavior.
Here is a list of some important modifiers:
| Modifier | Description |
| i | Case-insensitive matching |
| s | Allows . to match newlines |
| x | Allows whitespace and comments in the regex for clarity |
| g | Globally finds all matches |
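The i, s, and x modifiers can be sketched as follows. The sample strings and variable names are assumptions made for this illustration:

```perl
use strict;
use warnings;

# i: ignore case while matching.
my $case_insensitive = ("HELLO world" =~ m/hello/i) ? 1 : 0;

# s: let . match a newline, so .* can span lines.
my $text = "start\nend";
my $without_s = ($text =~ m/start.*end/)  ? 1 : 0;
my $with_s    = ($text =~ m/start.*end/s) ? 1 : 0;

# x: whitespace and comments inside the pattern are ignored.
my ($h, $m, $s) = "07:45:09" =~ m/
    (\d\d) :    # hours
    (\d\d) :    # minutes
    (\d\d)      # seconds
/x;

print "$case_insensitive $without_s $with_s $h-$m-$s\n";  # prints "1 0 1 07-45-09"
```

Under x, a literal space in the pattern has to be written explicitly, for example as \s or as an escaped space.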
Here’s how one might want to use the g modifier:
$x = "I batted 4 for 4";
$x =~ s/4/four/; # doesn't do it all:
# $x contains "I batted four for 4"
$x = "I batted 4 for 4";
$x =~ s/4/four/g; # does it all:
# $x contains "I batted four for four"
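The g modifier is also useful with the matching operator. As a brief sketch (the variable names @digits and $count are assumptions), m//g in list context returns every match, and s///g returns the number of replacements it made:

```perl
use strict;
use warnings;

my $x = "I batted 4 for 4";

# In list context, m//g returns every match of the pattern.
my @digits = ($x =~ m/\d/g);

# s///g returns the number of substitutions it made.
my $count = ($x =~ s/4/four/g);

print scalar(@digits), " $count $x\n";  # prints "2 2 I batted four for four"
```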
