30-Nov-86 16:01:00-PST,16724;000000000000
Mail-From: NEUMANN created at 30-Nov-86 15:58:57
Date: Sun 30 Nov 86 15:58:57-PST
From: RISKS FORUM    (Peter G. Neumann -- Coordinator) <[email protected]>
Subject: RISKS DIGEST 4.21
Sender: [email protected]
To: [email protected]

RISKS-LIST: RISKS-FORUM Digest,  Sunday, 30 November 1986  Volume 4 : Issue 21

          FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS
  ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
 Risks of Computer Modeling and Related Subjects (Mike Williams--LONG MESSAGE)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in good
taste, objective, coherent, concise, nonrepetitious.  Diversity is welcome.
(Contributions to [email protected], Requests to [email protected])
 (Back issues Vol i Issue j available in CSL.SRI.COM:<RISKS>RISKS-i.j.  MAXj:
 Summary Contents Vol 1: RISKS-1.46; Vol 2: RISKS-2.57; Vol 3: RISKS-3.92.)

----------------------------------------------------------------------

Date: Fri, 28 Nov 86 13:02 EST
From: "John Michael (Mike) Williams" <[email protected]>
To: [email protected]
Subject: Risks of Computer Modeling and Related Subjects (LONG MESSAGE)

 Taking the meretricious "con" out of econometrics and computer modeling:
                 "Con"juring the Witch of Endor
               John Michael Williams, Bethesda MD

Quite a few years ago, the Club of Rome perpetrated its "Limits to Growth"
public relations exercise.  Although not my field, I instinctively found it
bordering on Aquarian numerology to assign a quantity, scalar or otherwise,
to "Quality of Life," and a gross abuse of both scientific method and
scientific responsibility to the culture at large.  Well after the initial
report's firestorm, I heard that a researcher at McGill had proved that the
model was not even internally consistent, that it contained serious
typographical/syntactical errors producing at least an order-of-magnitude
error, and that when those errors were corrected, the model actually predicted
an improving, not a declining, "Quality of Life."  I called the publisher of
"Limits to Growth," by then into its umpteenth edition, and asked if they
intended to publish a correction or
retraction.  They were not enthusiastic, what with Jerry Brown, as Governor
and candidate for Presidential nomination, providing so much lucrative
publicity.  Jimmy Carter's "malaise" and other speeches suggest that these
dangerously flawed theses also affected, and not for the better, both his
campaign and administration.

This shaman-esque misuse of computers embarrassed the computing
community, but with no observable effect.

On 31 October 1986, Science ran a depressing article entitled:  "Asking
Impossible Questions About the Economy and Getting Impossible Answers"
(Gina Kolata, Research News, Vol.  234, Issue 4776, pp.  545-546).  The
subtitle and the sidebar insert are informative:

 Some economists say that large-scale computer models of the economy are no
 better at forecasting than economists who simply use their best judgment...
 "People are overly impressed by answers that come out of a computer"...

Additional pertinent citations (cited with permission):

  "There are two things you would be better not seeing in the making--
  sausages and econometric estimates," says Edward Leamer, an economist at
  [UCLA].  These estimates are used by policymakers to decide, for example,
  how the new tax law will affect the economy or what would happen if a new
  oil import tax were imposed.  They are also used by businesses to decide
  whether there is a demand for a new product.  Yet the computer models that
  generate these estimates, say knowledgeable critics, have so many flaws
  that, in Leamer's words, it is time to take the "con out of econometrics."

  ...[E]ven the defenders of the models... [such as e]conomists Kenneth
  Arrow of Stanford and Stephen McNees of the Federal Reserve Bank of Boston
  say they believe the models can be useful but also say that one reason the
  models are made and their predictions so avidly purchased is that people
  want answers to impossible questions and are overly impressed by answers
  that come out of a computer...

  The problem, says statistician David Freedman of the University of
  California at Berkeley, is that "there is no economic theory that tells you
  exactly what the equations should look like."  Some model builders do not
  even try to use economic theory...: most end up curve-fitting--a risky
  business since there are an infinite number of equations that will fit any
  particular data set...
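
The curve-fitting hazard Freedman describes is easy to make concrete.  Here is
a minimal sketch (the figures and the code are invented for illustration;
nothing in it comes from the article): fit polynomials of several degrees to
the same short series, and compare how well they "explain" the past with what
they say about the next period.

    # A minimal, invented illustration: many different equations fit the
    # same short history about equally well, but their extrapolations
    # need not agree.
    import numpy as np

    years = np.array([0, 1, 2, 3, 4, 5], dtype=float)              # six observations
    index = np.array([100, 112, 109, 121, 130, 128], dtype=float)  # invented values

    for degree in (1, 2, 3, 4, 5):
        coeffs = np.polyfit(years, index, degree)       # least-squares polynomial fit
        fitted = np.polyval(coeffs, years)
        rmse = np.sqrt(np.mean((fitted - index) ** 2))  # how well the past is matched
        forecast = np.polyval(coeffs, 6.0)              # extrapolate one period ahead
        print("degree %d: in-sample RMSE %5.2f, next-period forecast %6.1f"
              % (degree, rmse, forecast))

Every added degree matches the past a little better (the degree-5 curve
reproduces it exactly), yet the next-period forecasts spread over a wide
range, and nothing in the data alone says which curve to believe.  That is
the risk the quotation describes.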

  "What you really have," says William Ascher of Duke University, "is a man-
  model system."  And this system, say the critics, is hardly scientific.
  Wassily Leontief of New York University remarks, "I'm very much in favor of
  mathematics, but you can do silly things with mathematics as well as with
  anything else."

  Defenders of the models point out that economists are just making the best
  of an impossible situation.  Their theory is inadequate and it is
  impossible to write down a set of equations to describe the economy in any
  event... But the critics of the models say that none of these defenses
  makes up for the fact that the models are, as Leontief says, "hot air."
  Very few of the models predict accurately, the economic theory behind the
  models is extremely weak if it exists at all, in many cases the data used to
  build the models are of such poor quality as to be essentially useless, and
  the model builders, with their subjective adjustments, produce what is,
  according to Leamer, "an uncertain mixture of data and judgment."

When David Stockman made "subjective adjustments," he was reviled for
cooking the numbers.  It seems they may have been hash to begin with.

  [Douglas Hale, director of quality assurance at the (Federal) Energy
  Information Administration], whose agency is one of the few that regularly
  assess models to see how they are doing, reports that, "in many cases, the
  models are oversold.  The scholarship is very poor, the degree of testing
  and peer review is far from adequate by any scientific measure, and there
  is very little you can point to where one piece of work is a building block
  for the next."

  For example, the Energy Information Administration looked at the accuracy
  of short-term forecasts for the cost of crude oil...  At first glance, it
  looks as if they did not do too badly...  But, says Hale, "what we are
  really interested in is how much does the price change over time.  The
  error in predicting change is 91%."

This is about the same error, to the hour, as that of a stopped clock.
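
Hale's distinction between error in the level of a series and error in its
change is worth making concrete.  A minimal sketch, with invented prices and
forecasts (the 91% figure is the article's; none of these numbers are):

    # Invented prices and forecasts, only to show how a forecast can track
    # the level of a series closely while missing most of the change.
    import numpy as np

    actual   = np.array([28.0, 29.0, 31.5, 27.0, 26.0])   # invented "crude prices"
    forecast = np.array([28.0, 28.3, 29.2, 30.5, 27.5])   # invented forecasts

    level_error = np.abs(forecast - actual) / np.abs(actual)

    actual_change   = np.diff(actual)                      # period-to-period movement
    forecast_change = np.diff(forecast)
    change_error = np.abs(forecast_change - actual_change) / np.abs(actual_change)

    print("mean error in levels:  %3.0f%%" % (100 * level_error.mean()))
    print("mean error in changes: %3.0f%%" % (100 * change_error.mean()))

On these invented numbers the error in levels is a few percent while the error
in changes exceeds 100%: the forecast looks respectable until one asks the
question that actually matters.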

In the Washington Post for 23 November 1986, pg K1 et seq., in an
interview entitled "In Defense of Public Choice," Assar Lindbeck,
chairman of the Swedish Royal Academy's committee for selecting the
Nobel Prize in economics, explains the committee's choice of Professor
James M. Buchanan, and is asked by reporter Jane Seaberry:

  It seems the economics profession has come into some disrepute.  Economists
  forecast economic growth and forecasts are wrong.  The Reagan administration
  has really downplayed advice from economists.  What do you think about the
  economics profession today?

Chairman Lindbeck replies:

  Well, there's something in what you say in the following sense, I think,
  that in the 1960s, it was a kind of hubris development in the economic
  profession ... in the sense that it was an overestimation of what research
  and scientific knowledge can provide about the possibilities of
  understanding the complex economic system.  And also an overestimation
  about the abilities of economists to give good advice and an overestimation
  of the abilities of politicians and public administrators to pursue public
  policy according to that advice.

  The idea about fine tuning the economy was based on an oversimplified
  vision of the economy.  So from that point of view, for instance,
  economists engaged in forecasting--they are, in my opinion, very much
  overestimating the possibilities of making forecasts because the economic
  system is too complex to forecast.  Buchanan has never been engaged in
  forecasting.  He does not even give policy advice because he thinks it's
  quite meaningless...

What econometric computer model is not "an oversimplified vision of the
economy?" When is forecasting an "economic system ...  too complex to
forecast" not fortune-telling?

To return to Kolata's article:

  [Victor Zarnowitz of the University of Chicago] finds that "when you
  combine the forecasts from the large models, and take an average, they are
  no better than the average of forecasts from people who just use their best
  judgment and do not use a model."

I cannot resist noting that when a President used his own judgment, and
pursued an economic policy that created the greatest Federal deficit in
history but the lowest interest rates in more than a decade, the high
priests of the dismal science called it "voodoo economics." It takes one
to know one, I guess.

  Ascher finds that "econometric models do a little bit worse than judgment.
  And for all the elaboration over the years they haven't gotten any better.
  Refining the models hasn't helped."  Ascher says he finds it "somewhat
  surprising that the models perform worse than judgment since judgment is
  actually part of the models; it is incorporated in when modelers readjust
  their data to conform to their judgment."

Fascinating!  Assuming the same persons are rendering the "judgments,"
perhaps at different times, this implies that the elaboration and mathematical
sophistry of the models actually cloud that judgment when it is expressed
through the models:  they appear to have lost sight of the real forest for
the papier-mache trees.

  Another way of assessing models is to ask whether you would be better off
  using them, or just predicting that next year will be like this year.  This
  is the approach taken by McNees...  "I would argue that, if you average
  over all the periods [1974-1982] you would make smaller errors with the
  models [on GNP and inflation rates] than you would by simply assuming that
  next year will be just like this year," he says.  "But the errors would not
  be tremendously smaller.  We're talking about relatively small orders of
  improvement."

I seem to recall that this is the secret of the Farmer's Almanac's success
in predicting weather, and that one will be wrong only 15% of the time
if one predicts that tomorrow's weather will be exactly like today's.
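
McNees's comparison has a standard form: score the model against the naive
rule that next year will be just like this year.  A minimal sketch, with
invented series standing in for the ones he actually used:

    # Invented annual figures, only to show the mechanics of the test:
    # compare a model's one-year-ahead errors with those of the naive
    # "no-change" (persistence) forecast.
    import numpy as np

    actual = np.array([5.8, 6.5, 7.6, 11.3, 13.5, 10.3, 6.2])   # invented outcomes
    model  = np.array([6.0, 7.5, 9.8,  9.0, 11.0, 12.5, 7.8])   # model's forecast for each year

    truth        = actual[1:]                    # outcomes in years 2..n
    model_errors = np.abs(model[1:] - truth)     # model forecast vs. outcome
    naive_errors = np.abs(actual[:-1] - truth)   # "next year like this year"

    print("model mean absolute error: %.2f" % model_errors.mean())
    print("naive mean absolute error: %.2f" % naive_errors.mean())

On these invented figures the model beats the naive rule, but not by much,
which is the shape of the result McNees reports for 1974-1982.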

  Other investigators are asking whether the models' results are
  reproducible...  Surprisingly, the answer seems to be no.  "There is a real
  problem with scholarship in the profession," says Hale of the Energy
  Information Administration.  "Models are rarely documented well enough so
  that someone else can get the same result..."

  [In one study, about two-thirds of the] 62 authors whose papers were
  published in the [J]ournal [of Money, Credit and Banking]... were unwilling
  to supply their data in enough detail for replication.  In those cases
  where the data and equations were available, [the researchers] succeeded in
  replicating the original results only about half the time...

What a sorry testament!  What has become of scientific method and peer review?

  "Even if you think the models are complete garbage, until there is an
  obviously superior alternative, people will continue to use them," [McNees]
  says.

Saul, failing to receive a sign from Jehovah, consulted a fortune-teller on the
eve of a major battle.  The Witch of Endor's "model" was the wraith of Samuel,
and it wasn't terribly good for the body politic either.  I keep a sprig of
laurel on my CRT, a "model" I gathered from the tree at Delphi, used to send
the Oracle into trance, to speak Apollo's "truth." I do it as amusement and
memento, not as talisman for public policy.  History and literature are filled
with the mischief that superstition and fortune-telling have wrought, yet
some economic and computer scientists, the latter apparently as inept as the
Sorcerer's Apprentice, are perpetuating these ancient evils.  Are Dynamo and
descendants serving as late-twentieth-century substitutes for I Ching sticks?

Is the problem restricted to econometrics, or is the abuse of computer
modeling widespread?  Who reproduces the results of weather models, for
instance?  Who regularly assesses and reports on, and culls the unworthy
models?  Weather models are interesting because they may be among the
most easily "validated," yet there remains the institutional question:
when the Washington Redskins buy a weather service, for example, to
predict the next game's weather, how can they objectively predetermine
that they are buying acceptable, "validated" modeling rather than snake
oil?  After all, even snake oil can be objectively graded SAE 10W-40, or
not.  A posteriori "invalidation" by losing while playing in the "wrong"
weather is no answer, any more than invalidation by catastrophic engine
failure would be in motor oils.  The Society of Automotive Engineers at
least has promulgated a viscosity standard:  what have we done?

Where is scientific method at work in computer modeling?  When peer review
is necessarily limited by classification, in such applications as missile
engagement modeling and war gaming, what body of standards may the closed
community use to detect and eliminate profitable, or deadly, hokum?  Is this
just one more instance of falsified data and experiments in science
generally, of the sort reported on the front page of the Washington Post as
or before it hits the journals?  (See:  "Harvard Researchers Retract
Published Medical 'Discovery;'" Boyce Rensberger, Washington Post, 22
November 1986 pg 1 et seq.; and Science, Letters, 28 November 1986.)

Several reforms (based on the "publish or perish" practice that is
itself in need of reform) immediately suggest themselves.  I offer them
both as a basis for discussion and as a call to action, lest we
experience another aspect of Limits to Growth--widespread rejection of
the contributions of computer science as a suspect specialty:

  o Refusal to supply data to a peer for purposes of replication might
result in the journal immediately disclaiming the article, and temporary
or permanent prohibition from publication in the journal in question.

  o Discovery of falsified data in one publication might result in
restriction from publication (except for replies, clarifications, or
retractions) in all publications of the affiliated societies.  In
computer science, this might mean all IEEE publications at the first
level, then AFIPS, IFIP, and so on.

  o Widespread and continuing publication in those same journals of the
identities of the offending authors and, in cases of multiple infractions,
of their sponsoring institutions, as a databank of refuseniks and frauds.

  o Prohibition of the use in public policymaking (as in sworn testimony
before Congress) of computer models that have not been certified, or audited,
much as financial statements of publicly traded companies must now be audited.

  o Licensing by the state of the sale and conveyance of computer models of
general economic or social significance, the class perhaps defined and
maintained by the National Academy of Sciences.

The last is extreme, of course, implying enormous bureaucracy and
infrastructure to accomplish, and probably itself inevitably subject to
abuse.  The reforms are all distasteful in a free society.  But if we do
nothing to put our house in order, much worse is likely to come from the
pen or word-processor of a technically naive legislator.

In exchange for a profession's privileged status, society demands it be
self-policing.  Doctors, lawyers, CPAs and the like are expected to
discipline their membership and reform their methods when (preferably
before) there are gross abuses.  Although some of them have failed to do
so in recent years, is that an excuse for us not to?

Finally, how can we ensure that McNees' prediction, that people will
continue to re-engineer our society on models no better than garbage,
will prove as false as the models he has described?

------------------------------

End of RISKS-FORUM Digest
************************
-------