Mastering Regular Expressions (2nd Edition) [Electronic resources] (text version)

Jeffrey E. F. Friedl


2.1 About the Examples


This chapter takes a few sample problems (validating user input, working with
email headers, converting plain text to HTML) and wanders through the regular
expression landscape with them. As I develop them, I'll "think out loud" to offer a
few insights into the thought processes that go into crafting a regex. During our
journey, we'll see some constructs and features that egrep doesn't have, and we'll
take plenty of side trips to look at other important concepts as well.

Toward the end of this chapter, and in subsequent chapters, I'll show examples in
a variety of languages including Java and Visual Basic .NET, but the examples
throughout most of this chapter are in Perl. Any of these languages, and most others
for that matter, allow you to employ regular expressions in much more complex
ways than egrep, so using any of them for the examples would allow us to
see interesting things. I choose to start with Perl primarily because it has the most
ingrained, easily accessible regex support among the popular languages. Also, Perl
provides many other concise data-handling constructs that alleviate much of the
"dirty work" of our example tasks, letting us concentrate on regular expressions.

Just to quickly demonstrate some of these powers, recall the file-check example
from Section 1.1, where I needed to ensure that each file contained 'ResetSize'
exactly as many times as 'SetSize'. The utility I used was Perl, and the command
was:

% perl -0ne 'print "$ARGV\n" if s/ResetSize//ig != s/SetSize//ig' *

(I don't expect that you understand this yet; I hope merely that you'll be
impressed with the brevity of the solution.)
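For readers who want a preview, the one-liner can be spelled out roughly as follows. This is only a sketch: the subroutine name counts_differ and the sample strings are hypothetical, and the use of my for local variables is an addition that the chapter's own examples omit. It relies on the fact that a s/…//g substitution used as a value yields the number of replacements it made.

```perl
use warnings;

# Hypothetical helper (not from the book): returns true when the text
# holds a different number of 'ResetSize' and 'SetSize' occurrences.
sub counts_differ {
    my ($text) = @_;

    # Used as a value, s///g gives the number of substitutions made,
    # or a false value when nothing matched (hence the "|| 0").
    # Every 'ResetSize' is removed first so that the 'SetSize' hidden
    # inside it is not counted a second time by the next substitution.
    my $resets = ($text =~ s/ResetSize//ig) || 0;
    my $sets   = ($text =~ s/SetSize//ig)   || 0;

    return $resets != $sets;
}

print counts_differ("SetSize(1); ResetSize();")
    ? "mismatch\n" : "balanced\n";
print counts_differ("SetSize(1); SetSize(2); ResetSize();")
    ? "mismatch\n" : "balanced\n";
```

The first call reports "balanced" and the second "mismatch". In the command-line version, -n wraps the code in a loop over the input, -0 makes each file arrive as a single chunk (as long as it contains no null bytes), and $ARGV holds the name of the file currently being read.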

I like Perl, but it's important not to get too caught up in its trappings here.
Remember, this chapter concentrates on regular expressions. As an analogy, consider
the words of a computer science professor in a first-year course: "You're
going to learn computer-science concepts here, but we'll use Pascal to show you."[1]

[1] Pascal is a traditional programming language originally designed for teaching. Thanks to William F.
Maton, and his professor, for the analogy.


Since this chapter doesn't assume that you know Perl, I'll be sure to introduce
enough to make the examples understandable. (Chapter 7, which looks at all the
nitty-gritty details of Perl, does assume some basic knowledge.) Even if you have
experience with a variety of programming languages, normal Perl may seem quite
odd at first glance because its syntax is very compact and its semantics thick. In
the interest of clarity, I won't take advantage of much that Perl has to offer, instead
presenting programs in a more generic, almost pseudo-code style. While not "bad,"
the examples are not the best models of The Perl Way of programming. But, we
will see some great uses of regular expressions.


2.1.1 A Short Introduction to Perl


Perl is a powerful scripting language first developed in the late 1980s, drawing
ideas from many other programming languages and tools. Many of its concepts of
text handling and regular expressions are derived from two specialized languages
called awk and sed, both of which are quite different from a "traditional" language
such as C or Pascal.

Perl is available for many platforms, including DOS/Windows, MacOS, OS/2, VMS,
and Unix. It has a powerful bent toward text handling, and is a particularly common
tool used for Web-related processing. See www.perl.com for information on
how to get a copy of Perl for your system.

This book addresses the Perl language as of Version 5.8, but the examples in this
chapter are written to work with versions as early as Version 5.005.

Let's look at a simple example:


$celsius = 30;
$fahrenheit = ($celsius * 9 / 5) + 32; # calculate Fahrenheit
print "$celsius C is $fahrenheit F.\n"; # report both temperatures

When executed, this produces:


30 C is 86 F.

Simple variables, such as $fahrenheit and $celsius, always begin with a dollar
sign, and can hold a number or any amount of text. (In this example, only numbers
are used.) Comments begin with # and continue for the rest of the line.

If you're used to languages such as C, C#, Java, or VB.NET, perhaps most surprising
is that in Perl, variables can appear within a double-quoted string. With the
string "$celsius C is $fahrenheit F.\n", each variable is replaced by its
value. In this case, the resulting string is then printed. (The \n represents a
newline.)
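A small sketch may make the substitution easier to see. It contrasts the double-quoted string with a single-quoted one, which Perl leaves untouched; the variable names $interpolated and $literal are hypothetical, chosen only for this illustration.

```perl
$celsius    = 30;
$fahrenheit = ($celsius * 9 / 5) + 32;

$interpolated = "$celsius C is $fahrenheit F.";  # double quotes: variables replaced by their values
$literal      = '$celsius C is $fahrenheit F.';  # single quotes: text kept exactly as written

print "$interpolated\n";  # prints: 30 C is 86 F.
print "$literal\n";       # prints: $celsius C is $fahrenheit F.
```

The single-quoted form is handy when a string should contain a literal dollar sign.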

Perl offers control structures similar to other popular languages:


$celsius = 20;
while ($celsius <= 45)
{
    $fahrenheit = ($celsius * 9 / 5) + 32; # calculate Fahrenheit
    print "$celsius C is $fahrenheit F.\n";
    $celsius = $celsius + 5;
}

The body of the code controlled by the while loop is executed repeatedly so long
as the condition (the $celsius <= 45 in this case) is true. Putting this into a file,
say temps, we can run it directly from the command line.

Here's how a run looks:


% perl -w temps
20 C is 68 F.
25 C is 77 F.
30 C is 86 F.
35 C is 95 F.
40 C is 104 F.
45 C is 113 F.

The -w option is not necessary, and has nothing directly to do with regular
expressions. It tells Perl to check your program more carefully and issue warnings
about items it thinks to be dubious, such as using uninitialized variables and the
like. (Variables do not normally need to be predeclared in Perl.) I use it here
merely because it is good practice to always do so.
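To see the kind of complaint this checking produces, consider the sketch below. It makes several assumptions that are not part of this chapter: it turns warnings on from inside the program with use warnings (closely related to -w, though not identical in scope), it declares variables with my, and it captures the warning text through Perl's $SIG{__WARN__} hook so the message can be printed alongside the result. The name $warning_text is hypothetical.

```perl
use warnings;  # in-program relative of the -w command-line option

# Collect warning messages instead of letting them go to standard error.
my $warning_text = '';
$SIG{__WARN__} = sub { $warning_text .= $_[0] };

my $celsius;                               # deliberately never given a value
my $fahrenheit = ($celsius * 9 / 5) + 32;  # the undefined value is treated as 0

print "Result: $fahrenheit\n";
print "Perl warned: $warning_text";
```

The calculation still goes through, yielding 32, but Perl reports that an uninitialized value was used. Without warnings enabled, the same slip would pass silently.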

Well, that's it for the general introduction to Perl. We'll move on now to see how
Perl allows us to use regular expressions.

