12.2. Line Boundaries
Always use the /m flag.
In addition to always using the /x flag, always use the /m flag. In every regular expression you ever write.
The normal behaviour of the ^ and $ metacharacters is unintuitive to most programmers, especially if they're coming from a Unix background. Almost all of the Unix utilities that feature regular expressions (e.g., sed, grep, awk) are intrinsically line-oriented. So in those utilities, ^ and $ naturally mean "match at the start of any line" and "match at the end of any line", respectively.
But they don't mean that in Perl.
In Perl, ^ and $ mean "match at the start of the entire string" and "match at the end of the entire string". That's a crucial difference, and one that leads to a very common type of mistake:
# Find the end of a Perl program...
$text =~ m{ [^\0]*?     # match the minimal number of non-null chars
            ^__END__$   # until a line containing only an end-marker
          }x;
In fact, what that code really does is:
$text =~ m{ [^\0]*?     # match the minimal number of non-null chars
            ^           # until the start of the string
            __END__     # then match the end-marker
            $           # then match the end of the string
          }x;
The minimal number of characters until the start of the string is, of course, zero[*]. Then the regex has to match '__END__'. And then it has to be at the end of the string. So the only strings that this pattern matches are those that consist of '__END__'. That is clearly not what was intended.
[*] "What part of 'the start' don't you understand???"
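To see that failure in action, here is a minimal sketch that runs the no-/m pattern against two strings. The sample strings and variable names are invented for illustration; only the pattern comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical sample data: a tiny program followed by an end-marker line,
# and a string that contains nothing but the end-marker.
my $program     = "print 'hello';\n__END__\nignored data\n";
my $marker_only = "__END__";

# The pattern from the text, without /m:
# ^ and $ anchor to the whole string, not to individual lines.
my $no_m = qr{ [^\0]*?  ^__END__$ }x;

my $program_matches = $program     =~ $no_m ? 1 : 0;
my $marker_matches  = $marker_only =~ $no_m ? 1 : 0;

print "program:     $program_matches\n";   # prints "program:     0"
print "marker only: $marker_matches\n";    # prints "marker only: 1"
```

The realistic input fails to match at all, while the degenerate marker-only string succeeds. (Without /m, $ is also permitted to match just before a single trailing newline, so "__END__\n" would succeed too; that doesn't make the pattern any more useful.)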
The /m mode makes ^ and $ work "naturally"[†].
[†] That is, it makes them work in the unnatural way in which most programmers think they work.
The previous example could be fixed by making those two metacharacters actually mean what the original developer thought they meant, simply by adding a /m:
# Find the end of a Perl program...
$text =~ m{ [^\0]*?     # any non-nulls
            ^__END__$   # until an end-marker line
          }xm;
Which now really means:
$text =~ m{ [^\0]*?     # match the minimal number of chars
            ^           # until the start of any line (/m mode)
            __END__     # then match the end-marker
            $           # then match the end of a line (/m mode)
          }xm;
Consistently using the /m on every regex makes Perl's behaviour consistently conform to your unreasonable expectations. So you don't have to unreasonably change your expectations to conform to Perl's behaviour[*].
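As a quick check of the /m version, the sketch below applies it to a small multi-line string. The sample text and variable names are hypothetical, and the capturing parentheses are an addition for illustration only, so that the code preceding the marker can be inspected.

```perl
use strict;
use warnings;

# Hypothetical sample data: a tiny program followed by an end-marker line.
my $program = "print 'hello';\n__END__\nignored data\n";

# Same pattern as in the text, with /m added.
# The parentheses (not in the original) capture everything before the marker.
my $with_m = qr{ ( [^\0]*? )  ^__END__$ }xm;

# In list context a successful match returns the captures;
# on failure $before stays undefined.
my ($before) = $program =~ $with_m;

if (defined $before) {
    print "code before end-marker: $before";   # prints: code before end-marker: print 'hello';
}
else {
    print "no end-marker line found\n";
}
```

Under /m, ^ is satisfied immediately after the first newline and $ immediately before the second, so the marker line is found even though it sits in the middle of the string.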
[*] In Maxims for Revolutionists (1903), George Bernard Shaw observed: "The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man." That is an equally deep and powerful approach to programming.