From: [email protected]
Newsgroups: comp.lang.perl.announce,comp.lang.perl.misc
Subject: ANNOUNCE: jdb, flat-ascii database functions for shell scripting
Followup-To: comp.lang.perl.misc
Date: 27 Feb 1998 16:07:52 GMT
Organization: USC Information Sciences Institute
Lines: 61
Sender: [email protected]
Approved: [email protected] (comp.lang.perl.announce)
Message-ID: <[email protected]>
NNTP-Posting-Host: gadget.cscaper.com
X-Disclaimer: The "Approved" header verifies header information for article transmission and does not imply approval of content.
JDB is a package of commands for manipulating flat-ASCII databases from
shell scripts. JDB is useful for processing medium amounts of data (with
very little data you'd do it by hand; with megabytes you might want a
real database). JDB is very good at doing things like:
- extracting measurements from experimental output
- re-examining data to address different hypotheses
- joining data from different experiments
- eliminating/detecting outliers
- computing statistics on data (mean, confidence intervals, histograms)
- reformatting data for graphing programs
Rather than hand-coding a script for each special case, you can use the
higher-level functions JDB provides. Although it's often easy to throw
together a custom script to do any single task, I believe that there are
several advantages to using this library:
- these programs provide a higher-level interface than plain Perl
    => dbrow '_size == 1024' | dbstats bw
       rather than:
           while (<>) { @F = split; $sum += $F[2]; $ss += $F[2]**2; $n++; }
           $mean = $sum / $n; $std_dev = ...
       etc., in dozens of places
- the library uses names for columns
    => no more $F[2], use _bw
    => new or differently ordered columns? no changes to your scripts!
- the library is self-documenting (each program records what it did)
    => no more wondering what hacks were used to compute the
       final data; just look at the comments at the end
       of the output
- unusual cases and error checking are already handled
    => custom scripts often skimp on error checking
(The disadvantage is that you need to learn what functions JDB provides.)
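To make the named-column point concrete, here is a minimal sketch in Python
of what a filter-then-summarize step involves. It is not JDB code and does
not use JDB's file format: the plain header line of column names, the
functions read_table and column_stats, and the sample data are all
assumptions made up for illustration.

```python
import math

def read_table(lines):
    """Parse whitespace-separated rows; the first non-comment line names the columns.

    This header convention is an assumption for the sketch, not JDB's format.
    """
    rows = [ln.split() for ln in lines if ln.strip() and not ln.startswith("#")]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]

def column_stats(records, column):
    """Return (count, mean, sample standard deviation) for a named column."""
    values = [float(r[column]) for r in records]
    n = len(values)
    if n == 0:
        raise ValueError("no rows to summarize")
    mean = sum(values) / n
    # Sample variance (divide by n - 1); defined as 0.0 for a single row.
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    return n, mean, math.sqrt(var)

# Hypothetical measurements: transfer size and bandwidth.
SAMPLE = """\
size bw
1024 10.0
1024 14.0
2048 30.0
""".splitlines()

records = read_table(SAMPLE)
# Select rows by column name, much as a row filter would, then summarize.
selected = [r for r in records if int(r["size"]) == 1024]
n, mean, std_dev = column_stats(selected, "bw")
print(n, mean, round(std_dev, 3))
```

Because rows are looked up by name, swapping the order of the size and bw
columns in the input changes nothing in the code that selects and
summarizes them.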
JDB is built on flat-ASCII databases. By storing data in simple text
files and processing it with pipelines, it is easy to experiment (in
the shell) and look at the output. The original implementation of
this idea was /rdb, a commercial product described in the book ``UNIX
relational database management: application development in the UNIX
environment'' by Rod Manis, Evan Schaffer, and Robert Jorgensen (and
also at the web page <http://www.rdb.com/>). JDB is an incompatible
re-implementation of their idea without any accelerated indexing or
forms support. (But it's free!)
Installation instructions follow at the end of this document. JDB
requires Perl 5.003 to run. There are no man pages currently, but
each command has a complete description in its usage string. All
commands are backed by an automated test suite.
The most recent version of JDB is available on the web at
<http://www.isi.edu/~johnh/SOFTWARE/JDB/index.html>.
-John Heidemann
11-Feb-98