Perl CD Bookshelf [Electronic resources] - text version


7.14. Writing a Unix-Style Filter Program


7.14.1. Problem


You want to write a program that takes a list of filenames on the
command line and reads from STDIN if no filenames were given. You'd
like the user to be able to give the file "-" to indicate STDIN or
"someprogram |" to indicate the output of another program. You
might want your program to modify the files in place or to produce
output based on its input.

7.14.2. Solution


Read lines with <>:

while (<>) {
    # do something with the line
}

7.14.3. Discussion


When you say:

while (<>) {
    # ...
}

Perl translates this into:[14]

[14] Except that the code written here won't work, because ARGV has
internal magic.

unshift(@ARGV, "-") unless @ARGV;
while ($ARGV = shift @ARGV) {
    unless (open(ARGV, $ARGV)) {
        warn "Can't open $ARGV: $!\n";
        next;
    }
    while (defined($_ = <ARGV>)) {
        # ...
    }
}

You can access ARGV and $ARGV inside the loop to read more from the
filehandle or to find the filename currently being processed. Let's
look at how this works.

7.14.3.1. Behavior


If the user supplies no arguments, Perl sets @ARGV to a single
string, "-". This is shorthand for STDIN when opened for reading and
STDOUT when opened for writing. It's also what lets the user of your
program specify "-" as a filename on the command line to read from
STDIN.

Next, the file-processing loop removes one argument at a time from
@ARGV and copies the filename into the global
variable $ARGV. If the file cannot be opened, Perl
goes on to the next one. Otherwise, it processes a line at a time.
When the file runs out, the loop goes back and opens the next one,
repeating the process until @ARGV is exhausted.
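The loop described above can be watched in action with a small sketch. The two scratch files are an assumption made so the example runs on its own; they stand in for filenames given on the command line, and the names @names and @seen are illustrative only.

```perl
#!/usr/bin/perl
# Sketch: watch <> step through @ARGV one file at a time.
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two scratch files (hypothetical stand-ins for real arguments).
my @names;
for my $text ("alpha\nbeta\n", "gamma\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Can't close $name: $!";
    push @names, $name;
}
@ARGV = @names;

my @seen;                           # pairs of [filename, line]
while (my $line = <>) {
    chomp $line;
    push @seen, [$ARGV, $line];     # $ARGV names the file being read
}

printf "%d lines from %d files; \@ARGV now holds %d names\n",
    scalar @seen, scalar @names, scalar @ARGV;
```

Each name is shifted off @ARGV as its file is opened, so the array is empty once the loop finishes.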

The open statement didn't say open(ARGV, "<", $ARGV). There's no
extra less-than sign supplied. This allows for interesting effects,
like passing the string "gzip -dc file.gz |" as an argument, to make
your program read the output of the command "gzip -dc file.gz". See
Recipe 16.6 for more about this use of magic open.
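A minimal sketch of the pipe form follows. It assumes a Unix-like system with an echo command on the PATH, and it sets @ARGV in the script only so that the example is self-contained; normally the string would arrive as a command-line argument.

```perl
#!/usr/bin/perl
# Sketch: an argument ending in "|" is run as a command by <>.
use strict;
use warnings;

# Assumption: "echo" exists on this system. The trailing "|" makes the
# implicit two-argument open read the command's output instead of
# opening a file with that name.
@ARGV = ('echo one two |');

my @lines;
while (my $line = <>) {
    push @lines, $line;
}
print "read from $ARGV: @lines";
```

The same mechanism means any argument your program receives can trigger a command, so filters that take untrusted filenames may want to validate @ARGV first.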

You can change @ARGV before or inside the loop. Let's say you don't
want the default behavior of reading from STDIN if there aren't any
arguments; you want it to default to all C or C++ source and header
files. Insert this line before you start processing <ARGV>:

@ARGV = glob("*.[Cch]") unless @ARGV;

Process options before the loop, either with one of the Getopt
libraries described in Chapter 15 or manually:

# arg demo 1: Process optional -c flag
if (@ARGV && $ARGV[0] eq "-c") {
    $chop_first++;
    shift;
}
# arg demo 2: Process optional -NUMBER flag
if (@ARGV && $ARGV[0] =~ /^-(\d+)$/) {
    $columns = $1;
    shift;
}
# arg demo 3: Process clustering -a, -i, -n, or -u flags
while (@ARGV && $ARGV[0] =~ /^-(.+)/ && (shift, ($_ = $1), 1)) {
    next if /^$/;
    s/a// && (++$append,      redo);
    s/i// && (++$ignore_ints, redo);
    s/n// && (++$nostdout,    redo);
    s/u// && (++$unbuffer,    redo);
    die "usage: $0 [-ainu] [filenames] ...\n";
}
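For comparison with the manual parsing, here is a minimal sketch that uses the core Getopt::Std module for the same clustered -a, -i, -n, and -u flags. The sample command line assigned to @ARGV and the hash name %opt are assumptions made so the example runs on its own.

```perl
#!/usr/bin/perl
# Sketch: parse clustered single-letter flags with Getopt::Std.
use strict;
use warnings;
use Getopt::Std;

# Hypothetical command line, set here only for illustration.
@ARGV = qw(-ai -n input.txt);

my %opt;
getopts('ainu', \%opt)              # each letter is a boolean flag
    or die "usage: $0 [-ainu] [filenames] ...\n";

print "flags set: ", join(",", sort keys %opt), "\n";
print "files left: @ARGV\n";
```

getopts removes the options it recognizes from @ARGV, so whatever remains can be handed straight to a while (<>) loop.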

Other than its implicit looping over command-line arguments, <> is
not special. The special variables controlling I/O still apply; see
Chapter 8 for more on them. You can set $/ to set the line
terminator, and $. contains the current line (record) number. If you
undefine $/, you don't get the concatenated contents of all files at
once; you get one complete file each time:

undef $/;
while (<>) {
    # $_ now has the complete contents of
    # the file whose name is in $ARGV
}

If you localize $/, the old value is automatically
restored when the enclosing block exits:

{   # create block for local
    local $/;   # record separator now undef
    while (<>) {
        # do something; called functions still have
        # undeffed version of $/
    }
}   # $/ restored here
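The per-file slurping and the automatic restoration of $/ can be checked with a short sketch. The scratch files and the names @names and %length_of are assumptions introduced for the example.

```perl
#!/usr/bin/perl
# Sketch: with $/ undefined, each read from <> returns one whole file.
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two scratch files stand in for command-line arguments.
my @names;
for my $text ("alpha\nbeta\n", "gamma\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Can't close $name: $!";
    push @names, $name;
}
@ARGV = @names;

my %length_of;                      # filename => characters read
{
    local $/;                       # undef inside this block only
    while (defined(my $contents = <>)) {
        $length_of{$ARGV} = length $contents;
    }
}                                   # $/ is "\n" again from here on

printf "%s: %d characters\n", $_, $length_of{$_} for @names;
```

The loop body runs once per file rather than once per line, and $ARGV identifies which file each chunk came from.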

Because processing <ARGV> never explicitly closes filehandles, the
record number in $. is not reset. If you don't like that, you can
explicitly close the file yourself to reset $.:

while (<>) {
    print "$ARGV:$.:$_";
    close ARGV if eof;
}

The eof function defaults to checking the end-of-file status of the
last file read. Since the last handle read was ARGV, eof reports
whether we're at the end of the current file. If so, we close it and
reset the $. variable. On the other hand, the special notation eof()
with parentheses but no argument checks if we've reached the end of
all files in the <ARGV> processing.
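The difference between eof and eof() can be seen side by side in the sketch below. The scratch files and the names @marked and $files_done are assumptions made for illustration.

```perl
#!/usr/bin/perl
# Sketch: plain eof fires at the end of every file, eof() only at the
# end of the last one.
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two scratch files stand in for command-line arguments.
my @names;
for my $text ("alpha\nbeta\n", "gamma\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $text;
    close $fh or die "Can't close $name: $!";
    push @names, $name;
}
@ARGV = @names;

my @marked;
my $files_done = 0;
while (my $line = <>) {
    # Plain eof is tested first so that it inspects the file the
    # line just came from.
    $files_done++ if eof;

    # eof() with empty parentheses looks across all remaining files.
    my $flag = eof() ? "last" : "more";
    push @marked, "$flag: $line";
}
print @marked;
print "files finished: $files_done\n";
```

With these two files, only the gamma line should be tagged "last", while the file counter should reach 2.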

7.14.3.2. Command-line options

Perl has command-line options, -n, -p, -a, and -i, to make writing
filters and one-liners easier.

The -n option adds the while (<>) loop around your program text.
It's normally used for filters like grep or programs that summarize
the data they read. Example 7-2 shows such a filter written with an
explicit loop.

Example 7-2. findlogin1


#!/usr/bin/perl
# findlogin1 - print all lines containing the string "login"
while (<>) {    # loop over files on command line
    print if /login/;
}

The program in Example 7-2 could be written as shown
in Example 7-3.

Example 7-3. findlogin2


#!/usr/bin/perl -n
# findlogin2 - print all lines containing the string "login"
print if /login/;


You can combine the
-n and -e options to run Perl code from the command
line:

% perl -ne 'print if /login/'

The -p option is like -n but adds a print right before the end of
the loop. It's normally used for programs that translate their
input, such as the program shown in Example 7-4.

Example 7-4. lowercase1


#!/usr/bin/perl
# lowercase1 - turn all lines into lowercase
while (<>) {                    # loop over lines of files on command line
    s/(\p{Letter})/\l$1/g;      # change all letters to lowercase
    print;
}

The program in Example 7-4 could be written as shown
in Example 7-5.

Example 7-5. lowercase2


#!/usr/bin/perl -p
# lowercase2 - turn all lines into lowercase
s/(\p{Letter})/\l$1/g;          # change all letters to lowercase

Or it could be written from the command line as:

% perl -pe 's/(\p{Letter})/\l$1/g'

While using -n or -p for implicit input looping, the special label
LINE: is silently created for the whole input loop. That means that
from an inner loop, you can skip to the following input record by
using next LINE (which is like awk's next statement), or go on to
the next file by closing ARGV (which is like awk's nextfile
statement). This is shown in Example 7-6.

Example 7-6. countchunks


#!/usr/bin/perl -n
# countchunks - count how many words are used.
# skip comments, and bail on file if __END__
# or __DATA__ seen.
for (split ' ') {               # split on whitespace so a leading "#" survives
    next LINE if /^#/;
    close ARGV if /__(DATA|END)__/;
    $chunks++;
}
END { print "Found $chunks chunks\n" }

The tcsh shell keeps a .history file in a format such that every
other line contains a commented-out timestamp in Epoch seconds:

#+0894382237
less /etc/motd
#+0894382239
vi ~/.exrc
#+0894382242
date
#+0894382242
who
#+0894382288
telnet home

A simple one-liner can render that legible:

% perl -pe 's/^#\+(\d+)\n/localtime($1) . " "/e'
Tue May 5 09:30:37 1998 less /etc/motd
Tue May 5 09:30:39 1998 vi ~/.exrc
Tue May 5 09:30:42 1998 date
Tue May 5 09:30:42 1998 who
Tue May 5 09:31:28 1998 telnet home

The -i option changes each file on the command line. It is described
in Recipe 7.16, and is normally used in conjunction with -p.

7.14.4. See Also


perlrun(1), and the "Switches" section of
Chapter 19 of Programming Perl; Recipe 7.16; Recipe 16.6
