A few days ago, someone asked what things people disliked about Perl, and
no one answered.  Well, I've worked, played, and fought with Perl quite
enough to have my list of personal pet perl peeves.  Some are inherent to
the language and beyond redemption, others could in theory be fixed, and
others already have been.  In fact, I've omitted some minor ones that are
fixed in Perl5.  Yes, this will become a FAQ; note the long expiration
date.  All but the last question/issue are new.


   1.  Scalar vs Array
   2.  Strings and Numbers
   3.  Barewords
   4.  Unary-Op vs List-ops
   5.  Filehandles not real objects
   6.  Filehandles need funny vars for ops
   7.  Sometimes need scalar, other times tolerate expr
   8.  Remembering Defaults
   9.  Do {} isn't a controllable loop
   10. Variable suicide



1.  Scalar vs Array

   The absolutely toughest thing for me to teach people is the scalar
   versus array context stuff.  I mean, I can get them to get this

       $count = @list;
       @nlist = @olist;

   But the hidden things mystify them.   In some languages, the world
   acts differently if you call things with the wrong type.  Perl is
   largely typeless, but calling things in the wrong *context* is
   what really confuses you.  No one can grasp this quickly.

   For example, these are all different:

       $x = /pat/;
       ($x) = /pat/;
       $x = /(pat)/;
       ($x) = /(pat)/;

       $x = $var =~ /pat/;
       ($x) = $var =~ /pat/;
       $x = $var =~ /(pat)/;
       ($x) = $var =~ /(pat)/;

   Or what the difference is between all these:

       $x = @x;
       $x = %x;
       @x = %x;
       @x = $x;
       %x = @x;
       %x = $x;

       $x = `cat`;
       @x = `cat`;
       %x = `cat`;

       ($x) = `cat`;
       $x[3] = `cat`;
       @x[3] = `cat`;

   I just can't stop people from using @x[3] to mean $x[3].  Perl 5's
   -w switch should help that.

   Or why all this has to be done to reverse lines and characters:

       print reverse `cat`;
       print scalar reverse scalar `cat`;

   Because it's not the argument types that make these overloaded, but
   rather that they overload on return type, which is set as a side effect
   of things few people can anticipate even with training.  I used to
   like it, but that was before I had to teach it.

   I cannot offer any solution to this difficulty.  It's just
   too deeply set as part of the language by now.
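
   For anyone who wants to watch the context rules happen, here is a
   minimal sketch.  The variable names and sample data are made up for
   illustration; nothing here is special beyond plain Perl 5.

```perl
use strict;
use warnings;

my @list    = ("a", "b", "c");
my $count   = @list;                # scalar context: number of elements
my ($first) = @list;                # list context: first element

my $var = "key=value";
my $s_plain   = $var =~ /=/;        # scalar context: just true or false
my ($l_plain) = $var =~ /=/;        # list context, no parens in pattern: (1)
my $s_paren   = $var =~ /=(\w+)/;   # scalar context: still just true or false
my ($l_paren) = $var =~ /=(\w+)/;   # list context: the captured text

my @rev_lines = reverse("one\n", "two\n");   # list context: reverses the order
my $rev_chars = scalar reverse("abc");       # scalar context: reverses characters

print "count=$count first=$first\n";
print "plain: scalar=$s_plain list=$l_plain\n";
print "paren: scalar=$s_paren list=$l_paren\n";
print "chars=$rev_chars lines=", @rev_lines;
```

   Only the last assignment in each pair sees a list on the right-hand
   side, and that is the whole difference.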


2.  Strings and Numbers

   No one ever remembers to use "eq" instead of "==".

       "foo" < "bar"
       $a["foo"]

   In Perl 5, all possible numbers that can be misconverted get
   flagged as warnings with -w, but before that, it's quite silent.
   Sometimes I'm not sure that it wouldn't be an evil thing to just
   punt and assume the other thing, but that would mean automatic
   conversion to this:

       "foo" lt "bar"
       $a{"foo"}

   which, while I might tolerate it in the first case, I would not
   in the second one.
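
   A small sketch of why the numeric operators bite on strings.  The
   variable names are invented for the example, and the numeric warning
   is switched off only so the silent behavior shows through:

```perl
use strict;
use warnings;

my ($left, $right) = ("foo", "bar");
my ($num_eq, $num_lt);
{
    no warnings 'numeric';                     # hide the warning -w would give
    $num_eq = ($left == $right) ? 1 : 0;       # both strings convert to 0
    $num_lt = ($right < $left)  ? 1 : 0;       # 0 < 0 is false
}
my $str_eq = ($left eq $right) ? 1 : 0;        # compared as strings: different
my $str_lt = ($right lt $left) ? 1 : 0;        # "bar" sorts before "foo"

print "==: $num_eq  eq: $str_eq\n";
print "< : $num_lt  lt: $str_lt\n";
```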


3.  Barewords

   The use of barewords is a real source of trouble.  That's because
   in something like this:

       1a.  $x{ "date" }       = 42;
       1b.  $x{ "time" }       = 24;

       2a.  $x { date }        = 42;
       2b.  $x { time }        = 24;

       3a.  $x{ date() }       = 42;
       3b.  $x{ time() }       = 24;

   You can't tell which type of thing the class 2 cases are.
   There's no way to look at a bareword and know what it's going
   to do.  In this case, 2a is like 1a, but 2b is like 3b.
   No one can know that time() is a built-in, or that it hasn't
   been declared a list-op kind of sub.  And there's nothing to
   stop them from saying

       sub DATE;

   So even the capitalized ones aren't safe.

   So always use quotes and parens and you'll steer clear of this?
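
   The quotes-and-parens habit can be sketched like this.  It only
   shows the two unambiguous spellings (cases 1 and 3 above); the hash
   name is the same made-up %x:

```perl
use strict;
use warnings;

my %x;
$x{ "time" } = 24;     # quoted: always the literal string key "time"
$x{ time() } = 42;     # parens: always a call to the built-in time()

# One key is the word "time", the other is whatever the clock said.
my @keys = sort keys %x;
print "keys: @keys\n";
```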


4.  Unary-Op vs List-ops

   There are some built-in functions that take one argument only,
   like chop() and chdir(), and others, like print() and unlink(),
   that take lists.  People using || die get burned:

           chdir "/tmp" || die;
           unlink $file || die;

   Are *NOT* consistent.  These are really:

           chdir("/tmp") || die;
           unlink( $file || die );

   Which isn't what you want, but rather:

           unlink ($file) || die;

   Of course, they get bitten by print as well:

           print (4+5)*8;

   Which does at least elicit a warning with -w these days.

   The real problem is you CANNOT predict which does which!
   And while you can make your own listops in Perl v5.0, you
   can't make your own unary ops.

   So the only solution is to always use parens, which makes
   things harder to read.
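
   Here is a harmless way to see the two parses side by side without
   deleting any files.  The sub show() is a hypothetical list operator
   invented for the demonstration; ref() stands in for the unary
   built-ins:

```perl
use strict;
use warnings;

# A user-defined list operator; defining it first lets us call it
# without parentheses.
sub show { return "got[@_]" }

my $plain = "not a reference";

# Named unary op: binds tighter than ||, so this is ref($plain) || "default".
my $unary = ref $plain || "default";

# List op: swallows everything to its right, so this is show(0 || "fallback").
my $list  = show 0 || "fallback";

# With parens the || applies to the result of the call instead.
my $paren = show(0) || "fallback";

print "unary=$unary list=$list paren=$paren\n";
```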

5.  Filehandles not real objects

   Filehandles are strange creatures.  They appear to be
   barewords.  People expect them to be variables as they
   are in other languages.

       open(FILE, $path);
       print FILE "stuff\n";
       $line = <FILE>;
       close(FILE);

   But folks would more expect something like this:

       open($fh, $path);
       print $fh "stuff\n";
       $line = <$fh>;
       close($fh);

   That's especially true because then you don't become mystified
   by how to pass things to subroutines.   While you are officially
   supposed to do this

       &flush("FILE");

   because of barewords, this works:

       &flush(FILE);

   but down in the function you have

       sub flush {
           local($file) = shift;
       }

   and now you use it normally.  I think the best solution here is
   never to use bareword filehandles.   Right now, you have to
   remember to fill them in:

       $fh = $path;
       open($fh, $path);

   But it's too bad that this doesn't happen automatically.
   In fact, in Perl 5.0, existing scripts that people tried to
   use this way, but without filling in the variable, are currently
   going to *BREAK*.  It would be far better to fill in the $fh, I think.

   Another difficulty with filehandles is that if you pass one
   to a subroutine in a different package, you can't use it without
   name-munging, which isn't something folks expect to have to
   do.  You find yourself doing this:

       # force unqualified filehandles into caller's package
       local($fh) = shift;
       local($package) = caller;
       $fh =~ s/^[^']+$/$package'$&/;

   I believe that file handles, directory handles, and formats
   should always be used with $foo indirect objects.  Of course,
   you have to declare the format with a BAREWORD though.

   Furthermore, there's no way shy of heavy-handed and potentially
   perilous type-globbing to declare a local filehandle.  But if
   all you did were this:

       local($fh);
       open ($fh, "> $path");

   and it got filled in, this wouldn't be a problem.  You just have
   to be able to pass it on to subs in other packages.

   Then there's the issue of directory handles not acting like
   filehandles:

       $line  = <FH>;
       $fname = <DH>;  # doesn't work.
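
   For what it's worth, here is a sketch of the $fh style wished for
   above, assuming a Perl new enough (5.6 or later) to fill in an
   undefined scalar handed to open() or opendir().  The temporary
   directory, file name, and write_line() helper are all made up for
   the example; directory entries are read with readdir() rather
   than angle brackets.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.txt";

# A handle held in a scalar passes to a sub like any other value.
sub write_line {
    my ($fh, $text) = @_;
    print $fh "$text\n";
}

open(my $out, '>', $path) or die "cannot write $path: $!";
write_line($out, "stuff");
close($out) or die "cannot close $path: $!";

open(my $in, '<', $path) or die "cannot read $path: $!";
my $line = <$in>;
close($in);

# Directory handles live in scalars too, but use readdir().
opendir(my $dh, $dir) or die "cannot open directory $dir: $!";
my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

print "line: $line";
print "names: @names\n";
```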


6.  Filehandles need funny vars for ops

   Another filehandle-related problem is that you have to do this
   crufty select business:

       $ofh = select(WISH);
       $| = 1;
       select($ofh);

   or

       $ofh = select($wfh);
       $| = 1;
       select($ofh);

   Rather than saying

       WISH->flush();
       $wfh->flush();

   But I think that we've pretty much scoped out how to fix
   that in Perl v5.  Filehandles will be objects with methods.

   The non-fh funny variables will themselves have some sort of
   mnemonic alias, although whether this is $SYS'ERRNO or whatnot
   hasn't been precisely worked out yet.
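
   As a sketch of the method style, the standard IO::Handle module
   provides autoflush() and flush() on handles.  The temporary file is
   only there to give the example something to write to:

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

my ($wfh, $path) = tempfile(UNLINK => 1);

# Same effect as: $ofh = select($wfh); $| = 1; select($ofh);
$wfh->autoflush(1);

print $wfh "stuff\n";

# Because of autoflush, the data is already in the file before close.
my $size_before_close = -s $path;

STDOUT->flush();        # explicit flush, method style
close($wfh) or die "cannot close $path: $!";

print "bytes on disk before close: $size_before_close\n";
```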


7.  Sometimes need scalar, other times tolerate expr

   As it is, people have a great deal of difficulty making lists
   or tables of things.  That's because there are things in the
   grammar which absolutely require a literal scalar variable, (or
   sometimes a list or table), like

       print $fh "stuff\n";
       &$funcptr();
       push(@foo, "bar");

   but others that tolerate expressions:

       open(foo.bar, "file");

   Some of this is fixed in Perl 5.0:

       &{$functable{$keystroke}} ( $args );
       push( @$foo, "bar" );

   but much is not.  I would like everything that takes a scalar
   thing to also take an expression that evaluates to one of those.

       open($fh[1], "fname");
       print $fh[1] "stuff\n";
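
   One way to get close to this in Perl 5, as a sketch: print accepts
   a block that yields the filehandle, so an array element works if
   you wrap it in braces.  The @fh array, the temporary file, and the
   %functable entry are made up for illustration, and this assumes a
   Perl that fills in an undefined scalar passed to open().

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/fname";

my @fh;
open($fh[1], '>', $path) or die "cannot write $path: $!";
print { $fh[1] } "stuff\n";      # braces let an expression name the handle
close($fh[1]) or die "cannot close $path: $!";

# Calling through a table of code references.
my %functable = (a => sub { return "pressed @_" });
my $keystroke = 'a';
my $result    = &{ $functable{$keystroke} }("args");

open(my $in, '<', $path) or die "cannot read $path: $!";
my $content = <$in>;
close($in);

print "file: $content";
print "call: $result\n";
```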


8.  Remembering Defaults

   People have a hard time remembering that some functions
   default to $_, or @ARGV, or whatever, but that others which
   you might expect to do not.  For example, the split works on
   $_, but the unpack doesn't even compile:

       @x = split ( /\s+/ );
       @x = unpack( "A10" x 10 );

   Of course, a solution to this is never to use defaults.
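
   Spelled out, that looks something like the sketch below.  The
   sample strings and the shortened unpack template are invented for
   the example:

```perl
use strict;
use warnings;

$_ = "alpha beta gamma";
my @implicit = split;                        # relies on the $_ default
my @explicit = split(/\s+/, $_);             # says what it means

my $record = "alphabeta";
my @fields = unpack("A5 A4", $record);       # always name the data to unpack

print "implicit: @implicit\n";
print "explicit: @explicit\n";
print "fields:   @fields\n";
```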


9.  Do {} isn't a controllable loop

   Because you can write this:

       $x = 7 + do {
           local($i);
           for(@a) {$i += $_}
           $i;
       } * 8;

   you can't get out of a do{} early.  This is VERY counterintuitive
   for C programmers.  You can get out of an eval{} early in Perl 5
   with a "return", but better not use that with do.  Somehow, the
   normal next, last, and redo should be made to work.  Perhaps even
   warnings could be issued until then on loop control to non-named
   (first enclosing) loops.
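
   Until then, one workaround is to wrap the do{} in a labeled bare
   block, since a bare block counts as a loop that runs once.  A
   sketch, with the label LOOP and the counter made up for the
   example:

```perl
use strict;
use warnings;

my @seen;
my $i = 0;

LOOP: {
    do {
        $i++;
        last LOOP if $i == 3;     # leaves the bare block, and the do{} with it
        push @seen, $i;
    } while ($i < 10);
}

print "seen: @seen (stopped at i=$i)\n";
```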


10.  Variable Suicide

   Variable suicide is a nasty side-effect of dynamic scoping and
   the way variables are passed by reference.  If you say

       $x = 17;
       &munge($x);
       sub munge {
           local($x);
           local($myvar) = $_[0];
           ...
       }

   Then you have just clobbered $_[0]!  Why this is occurring
   is pretty heavy wizardry: the reference to $x stored in
   $_[0] was temporarily occluded by the previous local($x)
   statement (which, you'll recall, occurs at run-time, not
   compile-time).  The workaround is simple, however: declare
   your formal parameters first:

       sub munge {
           local($myvar) = $_[0];
           local($x);
           ...
       }

   That doesn't help you if you're going to be trying to access
   @_ directly after the local()s.  In this case, careful use
   of the package facility is your only recourse.

   Another manifestation of this problem occurs due to the
   magical nature of the index variable in a foreach() loop.

       @num = 0 .. 4;
       print "num begin  @num\n";
       foreach $m (@num) { &ug }
       print "num finish @num\n";
       sub ug {
           local($m) = 42;
           print "m=$m  $num[0],$num[1],$num[2],$num[3]\n";
       }

   Which prints out the mysterious:

       num begin  0 1 2 3 4
       m=42  42,1,2,3
       m=42  0,42,2,3
       m=42  0,1,42,3
       m=42  0,1,2,42
       m=42  0,1,2,3
       num finish 0 1 2 3 4

   What's happening here is that $m is an alias for each
   element of @num.  Inside &ug, you temporarily change
   $m.  Well, that means that you've also temporarily
   changed whatever $m is an alias to!!  The only workaround
   is to be careful with global variables, to use packages,
   and/or just to be aware of this potential in foreach() loops.

   The Perl 5 statically scoped autos available via "my" will not have
   this problem, and the loop index on foreach() in Perl 5 will
   now be statically, not dynamically scoped.
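
   A sketch of both examples redone with "my", reusing the names from
   above.  Since the variables are lexical, neither the caller's $x
   nor the elements of @num get touched:

```perl
use strict;
use warnings;

my @num = 0 .. 4;

sub ug {
    my $m = 42;          # private to ug; not the loop's $m
    return $m;
}

foreach my $m (@num) { ug() }

sub munge {
    my $x;               # declared first on purpose: harmless with my
    my ($myvar) = @_;
    return $myvar;
}

my $x   = 17;
my $got = munge($x);

print "num finish @num\n";
print "munge saw $got\n";
```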