Chapter 24. Common Practices - Perl Cd Bookshelf [Electronic resources], text version


Chapter 24. Common Practices


Contents:


Efficiency

Programming with Style

Fluent Perl

Program Generation


Ask almost any Perl programmer, and they'll be glad to give you reams
of advice on how to program. We're no different (in case you hadn't
noticed). In this chapter, rather than trying to tell you about
specific features of Perl, we'll go at it from the other direction and
use a more scattergun approach to describe idiomatic Perl. Our hope is
that, by putting together various bits of things that seemingly aren't
related, you can soak up some of the feeling of what it's like to
actually "think Perl". After all, when you're programming, you don't
write a bunch of expressions, then a bunch of subroutines, then a bunch
of objects. You have to go at everything all at once, more or less.
So this chapter is a bit like that.

There is, however, a rudimentary organization to the chapter, in that
we'll start with the negative advice and work our way towards the positive
advice. We don't know if that will make you feel any better, but it makes
us feel better.

24.1. Common Goofs for Novices



The biggest goof of all is forgetting to use warnings, which identifies
many errors. The second biggest
goof is forgetting to use strict when it's
appropriate. These two pragmas can save you hours of head-banging
when your program starts getting bigger. (And it will.) Yet another
faux pas is to forget to consult the online FAQ. Suppose you want to
find out if Perl has a round function. You might
try searching the FAQ first:


% perldoc -q round


Apart from those "metagoofs", there are several kinds of programming
traps. Some traps almost everyone falls into, and other traps you'll
fall into only if you come from a particular culture that does things
differently. We've separated these out in the following sections.

24.1.1. Universal Blunders





  • Putting a comma after the filehandle in a print
    statement. Although it looks extremely regular and pretty to say:


    print STDOUT, "goodbye", $adj, "world!\n";    # WRONG


    this is nonetheless incorrect, because of that first comma. What you
    want instead is the indirect object syntax:

    print STDOUT "goodbye", $adj, "world!\n";     # ok


    The syntax works this way so that you can say:

    print $filehandle "goodbye", $adj, "world!\n";


    where $filehandle is a scalar holding the name of a
    filehandle at run time. This is distinct from:

    print $notafilehandle, "goodbye", $adj, "world!\n";


    where $notafilehandle is simply a string that is
    part of the list of things to be printed. See "indirect object" in
    the Glossary.




  • Using == instead of eq and
    != instead of ne. The
    == and != operators are
    numeric tests. The other two are
    string tests. The strings
    "123" and "123.00" are equal as
    numbers, but not equal as strings. Also, any nonnumeric string is
    numerically equal to zero. Unless you are dealing with numbers, you
    almost always want the string comparison operators instead.




  • Forgetting the trailing semicolon. Every statement in Perl is
    terminated by a semicolon or the end of a block. Newlines aren't
    statement terminators as they are in awk, Python,
    or FORTRAN. Remember that Perl is like C.


    A statement containing a here document is particularly prone to losing
    its semicolon. It ought to look like this:


    print <<'FINIS';
    A foolish consistency is the hobgoblin of little minds,
    adored by little statesmen and philosophers and divines.
    --Ralph Waldo Emerson
    FINIS





  • Forgetting that a BLOCK requires braces. Naked statements are not
    BLOCKs. If you are creating a control structure such as a while
    or an if that requires one or more BLOCKs, you must use braces
    around each BLOCK. Remember that Perl is not like C.




  • Not saving $1, $2, and so on, across regular
    expressions. Remember that every new m/atch/ or s/ubsti/tution/
    will set (or clear, or mangle) your $1, $2...variables, as well
    as $`, $&, and $'. One way to save them right away is to
    evaluate the match within a list context, as in:


    my ($one, $two) = /(\w+) (\w+)/;





  • Not realizing that a local also changes the
    variable's value as seen by other subroutines called within the scope
    of the local. It's easy to forget that local is a
    run-time statement that does dynamic scoping, because there's no
    equivalent in languages like C. See the section "Scoped
    Declarations" in
    Chapter 4, "Statements and Declarations". Usually you want a
    my anyway.



  • Losing track of brace pairings. A good text editor will help you find
    the pairs. Get one. (Or two.)



  • Using loop control
    statements in do {} while. Although the braces in
    this control structure look suspiciously like part of a loop
    BLOCK, they aren't.



  • Saying @foo[1] when you mean
    $foo[1]. The @foo[1] reference
    is an array slice, meaning an array consisting of
    the single element $foo[1]. Sometimes this doesn't
    make any difference, as in:


    print "the answer is @foo[1]\n";


    but it makes a big difference for things like:

    @foo[1] = <STDIN>;


    which will slurp up all the rest of STDIN, assign
    the first line to $foo[1], and
    discard everything else. This is probably not what you intended. Get
    into the habit of thinking that $ means a single
    value, while @ means a list of values, and you'll
    do okay.




  • Forgetting the parentheses of a list operator like my:


    my  $x, $y  = (4, 8);     # WRONG
    my ($x, $y) = (4, 8);     # ok





  • Forgetting to select the right filehandle before setting
    $^, $~, or
    $|. These variables depend on the currently
    selected filehandle, as determined by
    select(FILEHANDLE). The
    initial filehandle so selected is STDOUT. You
    should really be using the filehandle methods from the
    FileHandle module instead. See Chapter 28, "Special Names".
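
Several of the blunders above can be seen side by side in a short script. This is only a minimal sketch; the variable names and sample strings are illustrative assumptions, not part of the original text.

```perl
use strict;
use warnings;

# == compares numerically, eq compares as strings.
my $num_equal = ("123" == "123.00") ? 1 : 0;   # both are the number 123
my $str_equal = ("123" eq "123.00") ? 1 : 0;   # the strings differ

# Capture groups are overwritten by the next successful match, so copy
# them out immediately by matching in list context.
my ($one, $two) = "hello world" =~ /(\w+) (\w+)/;

# my needs parentheses to declare (and assign to) several variables.
my ($x, $y) = (4, 8);

# $foo[1] is a single element; @foo[1] would be a one-element slice.
my @foo = ("zero", "one", "two");
my $element = $foo[1];

print "numeric: $num_equal, string: $str_equal\n";
print "captures: $one, $two\n";
print "x=$x y=$y element=$element\n";
```

Running with both pragmas enabled is deliberate: the WRONG form of the my declaration shown earlier would be rejected or warned about under them.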



24.1.2. Frequently Ignored Advice


Practicing Perl Programmers should take note of the following:




  • Remember that many operations behave differently in a list context than
    they do in a scalar one. For instance:


    ($x) = (4, 5, 6);    # List context; $x is set to 4
    $x = (4, 5, 6);      # Scalar context; $x is set to 6
    @a = (4, 5, 6);
    $x = @a;             # Scalar context; $x is set to 3 (the number of elements in @a)





  • Avoid barewords if you can, especially all lowercase ones. You can't
    tell just by looking at it whether a word is a function or a bareword
    string. By using quotes on strings and parentheses around function
    call arguments, you won't ever get them confused. In fact, the pragma
    use strict at the beginning of your program makes
    barewords a compile-time error--probably a good thing.



  • You can't tell just by looking
    which built-in functions are unary operators (like
    chop and chdir), which are list
    operators (like print and
    unlink), and which are argumentless (like
    time). You'll want to learn them by reading Chapter 29, "Functions". As always, use parentheses if you
    aren't sure--or even if you aren't sure you're sure. Note also that
    user-defined subroutines are by default list operators, but they can
    be declared as unary operators with a prototype of
    ($) or argumentless with a prototype of
    ().



  • People
    have a hard time remembering that some functions default to
    $_, or @ARGV, or
    whatever, while others do not. Take the time to learn which are
    which, or avoid default arguments.



  • <FH> is not the name of a filehandle, but an
    angle operator that does a line-input operation on the handle. This
    confusion usually manifests itself when people try to
    print to the angle operator:


    print <FH> "hi";    # WRONG, omit angles




  • Remember also that data read by the angle operator is assigned to
    $_ only when the file read is the sole condition in
    a while loop:


    while (<FH>) { }   # Data assigned to $_.
    <FH>;              # Data read and discarded!




  • Don't use = when you need =~;
    the two constructs are quite different:


    $x =  /foo/;  # Searches $_ for "foo", puts result in $x
    $x =~ /foo/;  # Searches $x for "foo", discards result





  • Use my for local variables whenever you can get
    away with it. Using local merely gives a temporary
    value to a global variable, which leaves you open to unforeseen side
    effects of dynamic scoping.



  • Don't use local on a module's exported variables.
    If you localize an exported variable, its exported value will not
    change. The local name becomes an alias to a new value but the
    external name is still an alias for the original.
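
The points about context, the binding operator, and local versus my can be checked with a small experiment. The following is a sketch under assumed names ($level, show_level, with_local, and with_my are all hypothetical), not code from the original chapter.

```perl
use strict;
use warnings;

# The same array yields different values depending on context.
my @a = (4, 5, 6);
my ($first) = @a;    # list context: the first element
my $count   = @a;    # scalar context: the number of elements

# local gives a package variable a temporary value that called
# subroutines can see (dynamic scoping); my creates a new lexical
# variable that they cannot see.
our $level = "global";
sub show_level { return $level }

sub with_local { local $level = "temporary"; return show_level() }
sub with_my    { my    $level = "private";   return show_level() }

my $seen_local = with_local();   # show_level sees "temporary"
my $seen_my    = with_my();      # show_level still sees "global"

# = versus =~
$_ = "some food";
my $matched = /foo/;                        # matches against $_
my $target  = "bar";
my $bound   = ($target =~ /foo/) ? 1 : 0;   # matches against $target

print "$first $count $seen_local $seen_my $matched $bound\n";
```

The with_local case is the side effect the advice warns about: a subroutine called from inside the dynamic scope quietly sees the temporary value.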



24.1.3. C Traps



Cerebral C programmers should take note of the following:



  • Curlies are required for if and while blocks.




  • You must use elsif rather than "else if" or
    "elif". Syntax like this:


    if (expression) {
        block;
    }
    else if (another_expression) {    # WRONG
        another_block;
    }


    is illegal. The else part is always a block, and a
    naked if is not a block. You mustn't expect Perl to
    be exactly the same as C. What you want instead is:

    if (expression) {
        block;
    }
    elsif (another_expression) {
        another_block;
    }


    Note also that "elif" is "file" spelled backward. Only
    Algol-ers would want a keyword that was the same as another word
    spelled backward.




  • The break and continue keywords
    from C become in Perl last and
    next, respectively. Unlike in C, these do
    not work within a do {} while
    construct.




  • There's no switch statement. (But it's easy to build one on the fly;
    see "Bare Blocks" and "Case Structures" in Chapter 4, "Statements and Declarations".)




  • Variables begin with $, @, or % in Perl.




  • Comments begin with #, not /*.




  • You can't take the address of anything, although a similar operator in
    Perl is the backslash, which creates a reference.



  • ARGV must be
    capitalized. $ARGV[0] is C's
    argv[1], and C's argv[0] ends up
    in $0.




  • Syscalls such as link, unlink,
    and rename return true for success, not
    0.




  • The signal handlers in %SIG deal with signal names,
    not numbers.
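
For readers arriving from C, a few of these differences might look as follows in practice. This is a rough sketch; the classify subroutine and the file path are hypothetical, and the path is assumed not to exist.

```perl
use strict;
use warnings;

# elsif, not "else if"; braces are required around every block.
sub classify {
    my ($n) = @_;
    if    ($n < 0)  { return "negative" }
    elsif ($n == 0) { return "zero" }
    else            { return "positive" }
}

# next and last play the roles of C's continue and break.
my @kept;
for my $n (1 .. 10) {
    next if $n % 2;      # skip odd numbers (C: continue)
    last if $n > 6;      # leave the loop   (C: break)
    push @kept, $n;
}

# Syscall wrappers return true on success, so test them directly.
# The path below is hypothetical and assumed not to exist.
my $removed = unlink("/nonexistent/hypothetical-file");
my $status  = $removed ? "removed" : "not removed";

# %SIG is keyed by signal name, not by signal number.
$SIG{INT} = sub { die "interrupted\n" };

print classify(-3), " @kept $status\n";
```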



24.1.4. Shell Traps



Sharp shell programmers should take note of the following:




  • Variables are prefixed with $,
    @, or % on the left side of the
    assignment as well as the right. A shellish assignment like:


    camel='dromedary';      # WRONG


    won't be parsed the way you expect. You need:

    $camel='dromedary';     # ok





  • The loop variable of a foreach also requires a $. Although csh
    likes:


    foreach hump (one two)
        stuff_it $hump
    end


    in Perl, this is written as:

    foreach $hump ("one", "two") {
        stuff_it($hump);
    }





  • The backtick operator does variable interpolation without regard to the
    presence of single quotes in the command.



  • The backtick operator does no translation of the return value. In Perl,
    you have to trim the newline explicitly, like this:


    chomp($thishost = `hostname`);





  • Shells (especially csh) do several levels of substitution on each
    command line. Perl does interpolation only within certain constructs
    such as double quotes, backticks, angle brackets, and search patterns.



  • Shells tend to interpret scripts a little bit at a time. Perl compiles
    the entire program before executing it (except for BEGIN blocks,
    which execute before the compilation is done).




  • Program arguments are available via @ARGV, not
    $1, $2, and so on.



  • The environment is not automatically made available as individual scalar
    variables. Use the standard Env module if you want that to happen.
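
The shell-related points above can be tried out in a few lines. This sketch assumes a shell with an echo command is available to the backtick operator; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Assignment needs the $ on the left side as well.
my $camel = 'dromedary';

# The foreach loop variable also carries a $.
my @humps;
foreach my $hump ("one", "two") {
    push @humps, "hump $hump";
}

# Backticks interpolate $camel before the command runs and keep the
# trailing newline of the output; chomp removes it.
# Assumes an echo command is available to the shell.
chomp(my $word = `echo $camel`);

# Program arguments live in @ARGV, and the environment in %ENV.
my $argc = @ARGV;
my $path = exists $ENV{PATH} ? "set" : "unset";

print "$word: @humps; $argc argument(s); PATH is $path\n";
```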



24.1.5. Previous Perl Traps



Penitent Perl 4 (and Prior) Programmers should take note of the
following changes between release 4 and release 5 that might affect old
scripts:



  • @ now always interpolates an array in double-quotish strings. Some
    programs may now need to use backslashes to protect any @ that
    shouldn't interpolate.




  • Barewords that used to look like strings to Perl will now look like
    subroutine calls if a subroutine by that name is defined before the
    compiler sees them. For example:


    sub SeeYa { die "Hasta la vista, baby!" }
    $SIG{'QUIT'} = SeeYa;


    In prior versions of Perl, that code would set the signal handler. Now,
    it actually calls the function! You may use the -w switch to find
    such risky usage or use strict to outlaw it.




  • Identifiers starting with "_" are no longer forced
    into package main, except for the bare underscore
    itself (as in $_, @_, and so on).




  • A double colon is now a valid package separator in an identifier. Thus,
    the statement:


    print "$a::$b::$c\n";


    now parses $a:: as the variable reference, where in
    prior versions only the $a was considered to be the
    variable reference. Similarly:

    print "$var::abc::xyz\n";


    is now interpreted as a single variable
    $var::abc::xyz, whereas in prior versions, the
    variable $var would have been followed by the
    constant text ::abc::xyz.



  • s'$pattern'replacement' now performs no interpolation on
    $pattern. (The $ would be
    interpreted as an end-of-line assertion.) This behavior occurs only
    when using single quotes as the substitution delimiter; in other
    substitutions, $pattern is always interpolated.




  • The second and third arguments of splice are now
    evaluated in scalar context rather than in list context.




  • These are now semantic errors because of precedence:


    shift @list + 20;    # Now parses like shift(@list + 20), illegal!
    $n = keys %map + 20; # Now parses like keys(%map + 20), illegal!


    Because if those were to work, then this couldn't:

    sleep $dormancy + 20;





  • The precedence of assignment operators is now the same as the precedence
    of assignment. Previous versions of Perl mistakenly gave them the
    precedence of the associated operator. So you now must parenthesize
    them in expressions like:


    /foo/ ? ($a += 2) : ($a -= 2);


    Otherwise:

    /foo/ ? $a += 2 : $a -= 2;


    would be erroneously parsed as:

    (/foo/ ? $a += 2 : $a) -= 2;


    On the other hand:

    $a += /foo/ ? 1 : 2;


    now works as a C programmer would expect.



  • open FOO || die is incorrect. You need parentheses around the
    filehandle, because open has the precedence of a list operator.




  • The elements of argument lists for formats are now evaluated in list
    context. This means you can interpolate list values now.



  • You can't do a goto into a block that is optimized away. Darn.




  • It is no longer legal to use whitespace as the name of a variable or
    as a delimiter for any kind of quote construct. Double darn.




  • The caller function now returns a false value in scalar context
    if there is no caller. This lets modules determine whether they're
    being required or run directly.



  • m//g now attaches its state to the searched string rather than the
    regular expression. See Chapter 5, "Pattern Matching", for further
    details.



  • reverse is no longer allowed as the name of a sort subroutine.



  • taintperl is no longer a separate executable. There is now a -T
    switch to turn on tainting when it isn't turned on automatically.




  • Double-quoted strings may no longer end with an unescaped $ or @.



  • The archaic if BLOCK BLOCK syntax is no longer supported.



  • Negative array subscripts now count from the end of the array.



  • The comma operator in a scalar context is now guaranteed to give a
    scalar context to its arguments.



  • The ** operator now binds more tightly than unary minus.



  • Setting $#array lower now discards array elements immediately.



  • delete is not guaranteed to return the deleted value for tied
    arrays, since this capability may be onerous for some modules to
    implement.



  • The construct "this is $$x", which used to interpolate the process
    ID at that point, now tries to dereference $x. $$ by itself
    still works fine, however.




  • The behavior of foreach when it iterates over a
    list that is not an array has changed slightly. It used to assign the
    list to a temporary array but now, for efficiency, no longer does so.
    This means that you'll now be iterating over the actual values, not
    copies of the values. Modifications to the loop variable can change
    the original values, even after the grep!
    For instance:


    % perl4 -e '@a = (1,2,3); for (grep(/./, @a)) { $_++ }; print "@a\n"'
    1 2 3
    % perl5 -e '@a = (1,2,3); for (grep(/./, @a)) { $_++ }; print "@a\n"'
    2 3 4


    To retain prior Perl semantics, you'd need to explicitly assign your list to a temporary array and then iterate over that. For
    example, you might need to change:

    foreach $var (grep /x/, @list) { ... }


    to:

    foreach $var (my @tmp = grep /x/, @list) { ... }


    Otherwise changing $var will clobber the values of @list. (This
    most often happens when you use $_ for the loop variable and call
    subroutines in the loop that don't properly localize $_.)



  • Some error messages and warnings will be different.



  • Some bugs may have been inadvertently removed.
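
A handful of the Perl 5 behaviors listed above are easy to confirm on a current interpreter. The following is a minimal sketch with made-up array contents; it exercises only the Perl 5 side, since the Perl 4 behavior cannot be reproduced there.

```perl
use strict;
use warnings;

# foreach aliases the values that grep returns, so @a itself changes.
my @a = (1, 2, 3);
for (grep { /./ } @a) { $_++ }

# Copying into a temporary array first leaves the original untouched.
my @b = (1, 2, 3);
for my $var (my @tmp = grep { /./ } @b) { $var++ }

# Negative subscripts count from the end of the array.
my $last = $a[-1];

# ** binds more tightly than unary minus: -2 ** 2 is -(2 ** 2).
my $power = -2 ** 2;

# Setting $#array lower discards the trailing elements.
my @c = (10, 20, 30, 40);
$#c = 1;

print "@a | @b | $last | $power | @c\n";
```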




