A few days ago, someone asked what things people disliked about Perl, and
no one answered. Well, I've worked, played, and fought with Perl quite
enough to have my list of personal pet Perl peeves. Some are inherent to
the language and beyond redemption, others could in theory be fixed, and
others already have been. In fact, I've omitted some minor ones that are
fixed in Perl5. Yes, this will become a FAQ; note the long expiration
date. All but the last question/issue are new.
1. Scalar vs Array
2. Strings and Numbers
3. Barewords
4. Unary-Op vs List-ops
5. Filehandles not real objects
6. Filehandles need funny vars for ops
7. Sometimes need scalar, other times tolerate expr
8. Remembering Defaults
9. Do {} isn't a controllable loop
10. Variable suicide
1. Scalar vs Array
The absolutely toughest thing for me to teach people is the scalar
versus array context stuff. I mean, I can get them to understand this:
$count = @list;
@nlist = @olist;
But the hidden things mystify them. In some languages, functions
act differently if you call them with the wrong type. Perl is
largely typeless, but calling things in the wrong *context* is
what really confuses you. No one can grasp this quickly,
because it's not the argument types that make these functions overloaded.
Rather, they overload on return type, which is set as a side effect
of things few people can anticipate even with training. I used to
like it, but that was before I had to teach it.
I cannot offer any solution to this difficulty. It's just
too deeply set as part of the language by now.
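Here is a minimal sketch of what overloading on return type looks like in practice, assuming Perl 5. The context() helper is hypothetical and exists only to report how it was called; reverse() is a real built-in that behaves differently in each context.

```perl
use strict;
use warnings;

# Hypothetical helper: reports the context it was called in.
sub context {
    return wantarray ? "list" : defined(wantarray) ? "scalar" : "void";
}

my @list = (10, 20, 30);

my $count   = @list;     # scalar context: the number of elements
my ($first) = @list;     # list context: the first element

my $in_scalar = context();   # scalar assignment gives scalar context
my ($in_list) = context();   # parenthesized left side gives list context

# reverse() overloads on context in the same way.
my @rev_list   = reverse("ab", "cd");   # reverses the list
my $rev_scalar = reverse("ab", "cd");   # joins the arguments, then reverses the string

print "count=$count first=$first\n";
print "context: $in_scalar, $in_list\n";
print "reverse: (@rev_list) vs $rev_scalar\n";
```

Nothing about the arguments changes between the paired calls; only the left-hand side of the assignment does.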
2. Strings and Numbers
No one ever remembers to use "eq" instead of "==".
"foo" < "bar"
$a["foo"]
In Perl 5, any string that can't be cleanly converted to a number
gets flagged with a warning under -w, but before that, it's quite silent.
Sometimes I'm not sure that it would be an evil thing to just
punt and assume the other thing was meant, but that would mean
automatic conversion to this:
"foo" lt "bar"
$a{"foo"}
which I might tolerate in the first case, but would not in the
second one.
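To make the difference concrete, here is a small sketch, assuming Perl 5 with warnings enabled. The variable names are made up for illustration. The numeric warning is switched off in one block only so the wrong comparison can be shown quietly.

```perl
use strict;
use warnings;

my ($left, $right) = ("foo", "bar");

my $numeric_equal;
{
    # Without this, -w reports: Argument "foo" isn't numeric ...
    no warnings 'numeric';
    $numeric_equal = ($left == $right) ? 1 : 0;   # both numify to 0: "equal"
}
my $string_equal = ($left eq $right) ? 1 : 0;     # compared as strings: not equal

# The same split applies to sorting: <=> is numeric, cmp is string.
my @values    = (10, 9, 100);
my $by_number = join ",", sort { $a <=> $b } @values;
my $by_string = join ",", sort { $a cmp $b } @values;

print "== says $numeric_equal, eq says $string_equal\n";
print "numeric sort: $by_number; string sort: $by_string\n";
```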
3. Barewords
The use of barewords is a real source of trouble. That's because
in something like this:
1a. $x{ "date" } = 42;
1b. $x{ "time" } = 24;
2a. $x { date } = 42;
2b. $x { time } = 24;
3a. $x{ date() } = 42;
3b. $x{ time() } = 24;
you can't tell which type of thing the class 2 cases are.
There's no way to look at a bareword and know what it's going
to do. In this case, 2a is like 1a, but 2b is like 3b.
No one can be expected to know that time() is a built-in, or that
it hasn't been declared a list-op kind of sub. And there's nothing to
stop them from saying
sub DATE;
So even the capitalized ones aren't safe.
The way to steer clear of this is to always use quotes and parens.
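A short sketch of the quotes-and-parens advice, assuming Perl 5. One caveat on the third line: as far as I know, Perl 5 forces a lone identifier inside hash braces to be a string, so there the bare form behaves like the quoted one rather than like the function call.

```perl
use strict;
use warnings;

my %x;
$x{"time"}  = 24;   # key is the four-character string "time"
$x{time()}  = 24;   # key is whatever time() returns right now
$x{time}    = 25;   # perl 5: a lone identifier in braces is quoted,
                    # so this overwrites the "time" entry

my @keys = sort keys %x;
print "keys: @keys\n";
print "x{time} = $x{time}\n";
```

Spelling out the quotes or the parens leaves nothing for the reader to guess, whichever version of perl ends up running the script.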
4. Unary-Op vs List-ops
There are some built-in functions that take one argument only,
like chop() and chdir(), but others, like print() and unlink(),
take lists. People using || die get burned:
chdir "/tmp" || die;
unlink $file || die;
These are *NOT* parsed consistently. They are really:
chdir("/tmp") || die;
unlink( $file || die );
The second isn't what you want. You want this instead:
unlink ($file) || die;
Of course, they get bitten by print as well:
print (4+5)*8;
That one does at least elicit a warning with -w these days.
The real problem is you CANNOT predict which does which!
And while you can make your own listops in Perl v5.0, you
can't make your own unary ops.
So the only solution is to always use parens, which makes
things harder to read.
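The following sketch shows the two parses side by side, assuming Perl 5. It uses the core File::Temp module to get throwaway files, and the low-precedence "or" operator that Perl 5 provides as an alternative to piling on parens. The variable names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Named unary operator: binds tighter than ||, so this is
# chdir(".") || die ..., and the die is a real error check.
chdir "." || die "cannot chdir: $!";

# List operator: everything to its right is the argument list, so this is
# unlink($empty || die ...). The die fires before unlink is ever called.
my $empty = "";
my $ok = eval { unlink $empty || die "die ran first\n"; 1 };
my $died_early = $ok ? 0 : 1;

# Two ways to get the intended check.
my ($fh1, $path1) = tempfile();
my ($fh2, $path2) = tempfile();
close $fh1;
close $fh2;

unlink($path1) || die "cannot unlink $path1: $!";   # explicit parens
unlink $path2 or die "cannot unlink $path2: $!";    # low-precedence "or"

my $both_gone = (!-e $path1 && !-e $path2) ? 1 : 0;
print "died_early=$died_early both_gone=$both_gone\n";
```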
5. Filehandles not real objects
Filehandles are strange creatures. They appear to be
barewords. People expect them to be variables as they
are in other languages.
Variables would be especially welcome because then you wouldn't be
mystified by how to pass filehandles to subroutines. While you are
officially supposed to do this
&flush("FILE");
because of barewords, this works:
&flush(FILE);
but down in the function you have
sub flush {
local($file) = shift;
}
and now you use it normally. I think the best solution here is
never to use bareword filehandles, but scalar variables instead.
Right now, you have to remember to fill the variable in yourself:
$fh = $path;
open($fh, $path);
But it's too bad that this doesn't happen automatically.
In fact, in Perl 5.0, existing scripts that people tried to
use this way, but without filling in the variable, are currently
going to *BREAK*. It would be far better to fill in the $fh, I think.
Another difficulty with filehandles is that if you pass one
to a subroutine in a different package, you can't use it without
name-munging, which isn't something folks expect to have to
do. You find yourself doing this:
# force unqualified filehandles into caller's package
local($fh) = shift;
local($package) = caller;
$fh =~ s/^[^']+$/$package'$&/;
I believe that file handles, directory handles, and formats
should always be used with $foo indirect objects. Of course,
you have to declare the format with a BAREWORD though.
Furthermore, there's no way shy of heavy-handed and potentially
perilous type-globbing to declare a local filehandle. But if
all you did were this:
local($fh);
open ($fh, "> $path");
and it got filled in, this wouldn't be a problem. You just have
to be able to pass it on to subs in other packages.
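For what it's worth, later Perl 5 releases (5.6 and up, as far as I know) do fill in an undefined scalar handed to open(). A sketch under that assumption follows. The Logger package is hypothetical and is there only to cross a package boundary.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

package Logger;   # hypothetical package, used only to cross a package boundary

sub note {
    my ($fh, $text) = @_;
    print $fh "$text\n";   # no name-munging: $fh is a reference, not a name
}

package main;

my ($tmp_fh, $path) = tempfile();
close $tmp_fh;

my $fh;                                          # undefined scalar
open($fh, ">", $path) or die "open $path: $!";   # open fills it in
Logger::note($fh, "hello from main");
close($fh) or die "close $path: $!";

# Read the line back to confirm that the other package wrote to our handle.
open($fh, "<", $path) or die "reopen $path: $!";
my $line = <$fh>;
close($fh);
unlink $path;

print $line;
```

Because the handle lives in an ordinary scalar, it is scoped like any other variable and needs no type-globbing to stay local.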
Then there's the issue of directory handles not acting like
filehandles:
$line = <FH>;
$fname = <DH>; # doesn't work.
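Directory handles have their own set of functions instead. A minimal sketch, assuming Perl 5.6 or later so that opendir() can fill in a scalar; the file name example.txt is made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Create one file so the directory has something to list.
open(my $fh, ">", "$dir/example.txt") or die "create: $!";
close($fh);

opendir(my $dh, $dir) or die "opendir $dir: $!";
# readdir() in list context returns every entry, including . and ..
my @names = grep { $_ ne "." && $_ ne ".." } readdir($dh);
closedir($dh);

print "entries: @names\n";
```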
6. Filehandles need funny vars for ops
Another filehandle-related problem is that you have to do this
crufty select business:
$ofh = select(WISH);
$| = 1;
select($ofh);
or
$ofh = select($wfh);
$| = 1;
select($ofh);
Rather than saying
WISH->flush();
$wfh->flush();
But I think that we've pretty much scoped out how to fix
that in Perl v5. Filehandles will be objects with methods.
The non-fh funny variables will themselves have some sort of
mnemonic alias, although whether this is $SYS'ERRNO or whatnot
hasn't been precisely worked out yet.
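As a sketch of how this looks in released Perl 5, assuming the standard IO::Handle and English modules are available: IO::Handle supplies the method calls, and English supplies long names for the punctuation variables. This is one way it can be written, not necessarily the design that was being discussed at the time.

```perl
use strict;
use warnings;
use IO::Handle;                    # supplies methods on filehandles
use English qw(-no_match_vars);    # long names for the punctuation variables

# The select() dance, for comparison.
my $ofh = select(STDERR);
$| = 1;
select($ofh);

# Method-call equivalents.
STDOUT->autoflush(1);   # same effect as setting $| while STDOUT is selected
STDOUT->flush();        # push out anything already buffered

# $OUTPUT_AUTOFLUSH is the English alias for $|; STDOUT is selected here.
my $stdout_autoflush = $OUTPUT_AUTOFLUSH ? 1 : 0;
print "autoflush on STDOUT: $stdout_autoflush\n";
```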
7. Sometimes need scalar, other times tolerate expr
As it is, people have a great deal of difficulty making lists
or tables of things. That's because there are things in the
grammar which absolutely require a literal scalar variable, (or
sometimes a list or table), like
but much is now. I would like everything that takes a scalar
thing to also take an expression that evaluates to one of those.
open($fh[1], "fname");
print $fh[1] "stuff\n";
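In Perl 5 the usual way to write the print case is to wrap the expression in a block. Here is a sketch, assuming Perl 5.8 or later so that open() can write to an in-memory scalar; the $buffer variable is there only to make the result easy to inspect.

```perl
use strict;
use warnings;

my @fh;
my $buffer = "";

# An array element works as the handle argument to open ...
open($fh[1], ">", \$buffer) or die "open: $!";

# ... but print needs the expression wrapped in a block.
print { $fh[1] } "stuff\n";
close($fh[1]) or die "close: $!";

print "buffer holds: $buffer";
```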
8. Remembering Defaults
People have a hard time remembering that some functions
default to $_, or @ARGV, or whatever, but that others which
you might expect to, do not. For example, split works on
$_, but the unpack doesn't even compile:
@x = split ( /\s+/ );
@x = unpack( "A10" x 10 );
Of course, a solution to this is never to use defaults.
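A small sketch of the never-use-defaults style next to the implicit form, assuming Perl 5. The sample strings are made up.

```perl
use strict;
use warnings;

my $line = "alpha beta  gamma";

my @implicit;
for ($line) {                  # aliases $_ to $line
    @implicit = split /\s+/;   # split falls back to $_
}

my @explicit = split /\s+/, $line;              # same call, nothing left implicit
my @fields   = unpack("A5 A5", "helloworld");   # data argument spelled out

print "implicit: @implicit\n";
print "explicit: @explicit\n";
print "fields:   @fields\n";
```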
9. Do {} isn't a controllable loop
You can't get out of a do{} early. This is VERY counterintuitive
for C programmers. You can get out of an eval{} early in Perl 5
with a "return", but you had better not use that with do. Somehow, the
normal next, last, and redo should be made to work. Perhaps even
warnings could be issued until then on loop control to non-named
(first enclosing) loops.
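Until that happens, the workaround I know of is to lean on the fact that a bare block counts as a loop that runs once. A sketch, assuming Perl 5; the LOOP label and the counters are arbitrary.

```perl
use strict;
use warnings;

# "last": wrap the whole do {} while in a labeled bare block.
my @seen;
my $i = 0;
LOOP: {
    do {
        $i++;
        last LOOP if $i > 5;   # leaves the enclosing bare block
        push @seen, $i;
    } while ($i < 10);
}

# "next": double the braces so next leaves only the inner bare block,
# after which the while condition is tested as usual.
my @odd;
my $j = 0;
do {{
    $j++;
    next if $j % 2 == 0;
    push @odd, $j;
}} while ($j < 6);

print "seen: @seen\n";
print "odd:  @odd\n";
```

It works, but nobody coming from C is going to guess it.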
10. Variable Suicide
Variable suicide is a nasty side-effect of dynamic scoping and
the way variables are passed by reference. If you say
&munge($x);
sub munge {
local($x);
local($myvar) = $_[0];
...
}
Then you have just clobbered $_[0]! Why this is occurring
is pretty heavy wizardry: the reference to $x stored in
$_[0] was temporarily occluded by the previous local($x)
statement (which, you'll recall, occurs at run-time, not
compile-time). The workaround is simple, however: declare
your formal parameters first:
sub munge {
local($myvar) = $_[0];
local($x);
...
}
That doesn't help you if you're going to be trying to access
@_ directly after the local()s. In this case, careful use
of the package facility is your only recourse.
Another manifestation of this problem occurs due to the
magical nature of the index variable in a foreach() loop.
@num = 0 .. 4;
print "num begin @num\n";
foreach $m (@num) { &ug }
print "num finish @num\n";
sub ug {
local($m) = 42;
print "m=$m $num[0],$num[1],$num[2],$num[3]\n";
}
Which prints out the mysterious:
num begin 0 1 2 3 4
m=42 42,1,2,3
m=42 0,42,2,3
m=42 0,1,42,3
m=42 0,1,2,42
m=42 0,1,2,3
num finish 0 1 2 3 4
What's happening here is that $m is an alias for each
element of @num. Inside &ug, you temporarily change
$m. Well, that means that you've also temporarily
changed whatever $m is an alias to!! The only workaround
is to be careful with global variables, use packages,
and/or just be aware of this potential in foreach() loops.
The Perl 5 statically scoped autos available via "my" will not have
this problem, and the loop index on foreach() in Perl 5 will
now be statically, not dynamically scoped.
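Here is a sketch of the same loop rewritten with "my", assuming Perl 5. The sub keeps the name ug from the example above, but here it localizes the package variable $main::m, which the lexical loop variable has nothing to do with. The @seen array is added only to collect the output.

```perl
use strict;
use warnings;

my @num = (0 .. 4);
my @seen;

# Same shape as &ug above, but it can only reach the package variable.
sub ug {
    local $main::m = 42;
    push @seen, "m=$main::m $num[0],$num[1],$num[2],$num[3]";
}

foreach my $m (@num) {   # lexical loop variable, invisible to ug()
    ug();
}

print "$_\n" for @seen;
print "num finish @num\n";
```

Every pass reports the same untouched elements, since nothing inside ug() can get at the lexical $m.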