The Seven Deadly Sins of Perl

                           by Tom Christiansen
                            advocatus diaboli

----------------------------------------------------------------------------

Back in the Perl4 days I made a list of the greatest ``gotchas'' in Perl.
Almost all of those have been subsequently fixed in current incarnations of
Perl, some to my deep and abiding amazement. In that same spirit, here's of
my current list of what's--um... let's be charitable and just say
``suboptimal'' in Perl from a reasonably serious programming languages
design point of view. I believe these are real gotchas, and not always
obvious. A few of these are fixable through programming rigor; a few of them
are rumored to be fixed in Larry's own copy of 5.002 :-); but a few are
simply inherent design decisions that quite possibly cannot be solved
without breaking what the language is, much as csh's design flaws cannot be
solved.

Of course, for many many kinds of apps, there's also so very much more
that's right with Perl to make it not only a reasonable but often even the
the best choice from what's in the field today. I'm just trying to provide
perspective here; think of it as eventual updates for the Perltrap podpage.

1. Implicit Behaviours and Hidden Context Dependencies

Functions overload only on return type rather than parameter type, which is
always implicit and while inferrible by the language. This is often a
shocking and terrible surprise to the programmer who doesn't have their
fingers in Perl code all day every day. Type conversions (of non-reference
types) are silent and deadly, especially between aggregates and scalars.
They are hard for many to predict. The presence of subobvious default
behaviours of various functions, and the inability to turn this off is too
surprising, and more than somewhat dangerous.

2. To Paren || !To Paren?

That adding or not adding parens should have the strong potential for
semantic changes instead of merely grouping is hard to fathom. Sometimes
you're damned if you do, damned if you don't. By allowing but not requiring
parens in almost all situations, people are confused by whether they should
put them in, and deeply disturbed when doing or not can radically alter
their program's behaviour. This is especially annoying in trying to figure
out how to get regexps to return what they match.

3. Global Variables

There's no mandatory enforcement of declaration or detection of fully global
variables, this can cause very difficult to detect program errors. Implicit
use $_ is one of the classics, causing functions way up the stack to
mysterious fail. There's no use strict globals or some such to force
declaration even of exportable module-level globals. There's no way to have
lexically-scoped pre-defined file-handles or built-in variables (like $_,
$?, etc), and the dynamically-scoped versions are confusing to programmers
of traditional languages.

4. References vs Non-References

Although introducing references in v5 was a critical step, by keeping
backwards compatibility with older v4 code, the legacy code and basic system
still uses too many types and ensuing confusions. That means people are
still confused about $ vs. @ vs. %. In particular, they expect things that
work on arrays or hashes to transparently work on references to the same, or
vice versa. This shows up when folks try to work out complex data
structures.

5. No Prototypes

Not having prototypes (function signatures) makes it impossible to create
one's own functions that exactly duplicate builtins, as well as making
static analysis of errors difficult. Even if you introduce prototypes for
normal functions, how does this extend to user-defined object classes and
methods? How do you prototype return values?

6. No Compiler Support for I/O or Regexp Objects

The I/O system's use of barewords is unclean and unpleasant, as there isn't
really good compiler-aware support for i/o handles. The open() interface and
friends must be entirely redone, preferably into an o-o paradigm, but
without breakin old code. The regexp system is likewise archaic: since
there's no real compiler support for compiled regexps, you either get very
poor performance or else opaque hacks to work around it.

7. Haphazard Exception Model

There's no standard model or guidelines for exception handling in libraries,
modules, or classes, which means you don't know what to trap and what not to
trap. Does a library throw an exception or does it just return false? Even
if it does, there is no standard nomenclature for exceptions, so it's hard
to know how, for example, to catch all numeric exceptions, all i/o
exceptions, etc. People mistakenly use eval $str for both code-generation
and exception handling, thus not only delaying errors until run-time but
also standing a good chance of losing them.