Go4zgate Gateway: Using
                            ZDist and Isite

                           Neophytos Iacovou

    Last year we implemented a Gopher to Z39.50 gateway using CNIDR's
        fw101 package. Since then CNIDR has switched its focus away from
        fw101 and towards Zdist and Isite. We have updated Go4zgate to
        reflect this move. At the same time we added some features of our
        own. This paper describes Go4zgate - a gopher to Z39.50 gateway.

                                  June 9, 1995


1.0     Introduction

At the moment a student can access the University of Min-
nesota's library system by either: walking up to any one of
the public access terminals found in every library on cam-
pus; or they can open a telnet/TN3270 session to the
libraries's mainframes. In either case the result is the
same, the student is greeted with a what can only be
described as "clunky" user-interface.

The University of Minnesota library system has embraced
Z39.50 and is beginning to move away from its current
mainframe/terminal architecture to a client/server archi-
tecture. As this transition occurs the opportunity to give
the students a better user-interface exists.

Z39.50 is a NISO standard; its official name is "Informa-
tion Retrieval Application Service Definition and Protocol
Specification". In April of 1995 a new version of the pro-
tocol was agreed on.

A Gopher to Z39.50 version 2 gateway (Go4zgate) was
developed in order to allow students to access the various
Z39.50 databases currently available. At the present time
these databases are generally on-line library catalogs, but
as more and more people become familiar with Z39.50 the
expected number of available Z39.50 servers will
increase.

The rest of this paper describes Go4zgate and how it can
be used to access on-line library catalogs.




2.0     Why Use Go4zgate?

The benefits of accessing Z39.50 library catalogs via
Go4zgate are:

·      Users don't have to learn a new IR system, they are
already familiar with Gopher.

·      The interface is kept simple and does not assume the
user to be a "library catalog power user".

·      As the user moves from catalog to catalog (potentially
from site to site) she is not forced to learn a new com-
mand set, the Go4zgate interface remains the same.




3.0     An Example Session

The following is a walk through of what a typical Z39.50
session might look like:

Selecting the desired library catalog to search the user is
greeted with a screen asking her to create a query by pro-
viding the appropriate data for each of the 3 search fields.
At least one of the search fields must be filled in. To keep
the user-interface simple all 3 search fields are tied
together by a boolean "AND" operation. Go4zgate can
also handle an "OR" operation across all fields as well.

The interface to this Z39.50 database creates queries based
on the Title, Author, and Subject fields. Had the user not
been searching a library catalog but rather a database on
cars, the fields from which the query was to be created
could just as easily have been Make, Model, and Year. The
MaxRecs field prompts the user for the maximum number
of records to return from the search.

Being a William S. Burroughs fan this user does a search
on Interzone.

FIGURE 1.       Creating a Query

The query returned 5 records. The records are displayed in
their "short" form. Selecting any one of the records will
display the record in its "long" form.

FIGURE 2.       Results of the Query

Selecting any one of the returned records will display the
record in its "long" format. One thing to notice about this
format is that information concerning the book's availabil-
ity and location within the library are not shown. The rea-
son for this is that holdings information is not a part of the
Z39.50 standard.

Another thing to note is the "Other authors" field. Z39.50
queries involving the Author are also checked with the
"Other authors" field. This may confuse people are first,
records will be returned even though the specified author
is not the principle author. However, this feature also
accounts for anthologies in which the work of the speci-
fied author may appear.

FIGURE 3.       A Record's Long Format

The first record returned is actually a pseudo record and is
created by Go4zgate. By selecting this record the user is
allowed to view the long format of all the records returned.
This comes in handy particularly when the result of a
query produces more than a handful of records. Without
this feature the user would be forced to manually open
each and every record in order to gain the same informa-
tion. Users will also want to save/download this record.

FIGURE 4.       All Records Returned

This record also supplies the user with some information
regarding her request. In this example case the user is told
that her query produced 5 records, of which 5 were
returned. This feature is useful because often a query may
produce more records than were requested by MaxRecs. In
such a case the user can then decide if she wants to rerun
the query with a higher value for the maximum number of
records to be returned.




4.0     An Example Of A More Advanced Session

With Go4zgate once the results of a search have been
returned the user is told how many total hits her query pro-
duced and how many of those hits were returned to her.
The later number is set by the MaxRecs field supplied by
the user. If the query produced more hits than were
returned the user is faced with three options:

·      Ignore the other hits

·      Resubmit her query with a larger MaxRecs value

·      Resubmit her query specifying that the result set start
with a higher hit record.

The first option may not be all bad, especially if she found
what she was looking for. The second option can be dan-
gerous because the Z39.50 server may place a cap on the
size of the result set. In such a case a MaxRecs value
higher than such a cap would prove futile.

One of the features in this version of Go4zgate is to allow
users the ability to ask for a result set starting with a spe-
cific record. The following two figures show this feature at
work.

FIGURE 5.       Starting A Search With Record 1

FIGURE 6.       The Result

FIGURE 7.       The Same Search Starting With Record 3

FIGURE 8.       The Result

You can see that by using the StartRec field user's can
really save themselves a lot of time by not re-running the
query with a larger MaxRecs size Using the StartRec field
also reduces the amount of resources consumed by 1 user.



Another feature we think users will find useful is the abil-
ity to search across multiple servers and catalogs for the
same query. We've already searched Melvyl for Naked
Lunch (see Figures 5 - 8), let us try the same query on
Penn State.

FIGURE 9.       Searching For Naked Lunch

FIGURE 10.      The Result

Now, we will attempt the same search on both catalogs at
the same time.

FIGURE 11.      Using a Screen That Prompts For More Info

FIGURE 12.      The Set Of Both Queries

As you can see, the result set is the addition of each of the
independent queries. Also, you may have noticed that the
screen looks different in this example than it did in previ-
ous examples.

We are still using the same gateway (Go4zgate) all that has
changed is the ASK block. This particular ASK block
prompts the user for a host, catalog, and a port number.
When searching multiple catalogs the user must provide
the same number of Hosts, Catalogs, and Ports; otherwise,
an error will occur.




5.0     Exactly What Information Is Returned?

Currently the following fields are displayed to the user:
Author, Title, Publisher, Pages, Series, Notes, Subjects,
Other authors, and Call Numbers.

The Call Numbers field can be a bit misleading. Most
libraries stopped populating the Call Number field ever
since they started to store information about a book in its
Holdings Record. Because of this the information in the
Call Number field may not always be valid. In the best
case there is no information available in the field, or any
information present is accurate. Go4zgate can be config-
ured to display call numbers or to not display them via the
command line. The trick is to know which databases still
support valid call numbers.




6.0     Future plans

·      We plan on making Go4zgate understand HTML.

·      We plan on supporting the complete list of USE
Attributes. Currently we only support "author", "title",
and "subject".

·      Explore some other Z39.50 packages and see how they
compare against ZDist.

·      We also plan on following the development of Z39.50
and implement changes in the standard in order to keep
the gateway as useful as can be.




7.0     Technical Details


7.1     Internet Gopher and the Gopher Protocol

Internet Gopher is an information system used to publish
and organize information on servers distributed across the
Internet. Initially developed at the University of Minne-
sota in early 1991, it has spread to over 4800 sites world-
wide as of December 1993.

The Gopher system is a client-server system that can be
used to build a Campus Wide Information System
(CWIS). Clients, which browse and search information are
available for most major platforms (Macintosh, DOS,
Windows, Unix, VMS, MVS, VM/CMS, OS/2). Servers,
which translate and publish information, are also available
for all of the platforms mentioned above.

This client-server architecture uses the Internet Gopher
Protocol. The Gopher protocol has been described as "bru-
tally simple." It is based on a web/tree metaphor of files
and directories. Its basic primitives are a list directory
transaction, a retrieve file transaction and a search for
directory entries transaction.


7.2     The Z39.50 Architecture

The Z39.50 server may have one or more on-line catalogs
available. The name of the catalog to be searched must be
provided along with the query sent to the server.

Because Z39.50 is a client/server protocol the user's client
must make a connection to the Z39.50 server, make a
request, and await for the output. Go4zgate was developed
because Gopher clients aren't force to speak Z39.50ish;
therefore, they require a translator who understands both
Gopher+ and Z39.50 version 2. Go4zgate resides between
the Gopher+ client and the Z39.50 server.

The interface between the user and Go4zgate is a Gopher+
ASK form. This makes creating new interfaces fast and
simple.

The Gopher client, Go4zgate, and the Z39.50 server are all
connected together via the Information Super Highway.


7.3     What Does the Ask Form Look Like?

<image>

As you can see a rather simple form is used. Note that
because MaxRecs must be supplied a value a default num-
ber of 10 is supplied for the user. This of course can be
changed. If no value is provided by the user the query will
not be attempted and the user will be informed that invalid
input has been provided.

This interface can be easily expanded as well. If you recall
we had previously mentioned that all three of the search
fields are tied together by a boolean "AND" operation, and
that an "OR" operation was also valid. Because some
users may want to decide between the two choices it is
possible to add the following line to the above script in
order to provide this service:

Choose: Operation:       AND OR

In the ASK form shown above this is not an option
because it adds another variable which may confuse users.
This is not to say that it is never going to be presented to
users as an option to control. Below is an example of what
a form with more user-specified options might look like:

<image>


7.4     What Goes on After the User Has Completed the Form?

<image>

The above code is a perl script which processes the infor-
mation entered into the form. As you can see Operator is
preset to an AND operation. Included in this script are pre-
set values for Host (which is set to melvyl.ucop.edu for
this particular form), the Catalog, and the Z39.50 port
(Port). On thing to note, most servers will be listening to
Z39.50 requests on port 210, but this may not always be
so.

The only other variable which is interesting to look at is
"longdir". This variable tells Go4zgate if it should return
information using the long dir. format or not. As an exam-
ple, the cursors version of the Gopher client will request
for long dir. listings, TurboGopher (the Macintosh Gopher
client) will not.

One final word, as you make changes to the ASK form
(i.e., by adding options such as allowing the user to select
by which boolean operator the search fields are bound)
you will also have to make changes to this script.


7.5     A Few Words on Go4zgate itself

You may have noticed that the above script does not apply
any error checks on the input provided by the user. All of
that is done within the gateway software itself. Rest
assured that invalid options are checked for and an error
message is returned to the user. Invalid input includes: a
value of 0, or less, or NULL for MaxRecs; and NULL data
for each of the search fields.

The smaller the catalog the server is searching through and
the easier the query the user requested is, the faster the
gateway is. The bottleneck in terms of time is, and this
shouldn't come as a big surprise, the time required for the
server to execute the query. In order to make life easier for
the user, Go4zgate caches the records returned by the
server. This means that once the request has been fulfilled
the user will be able to select and view any of the returned
records relatively fast. If the user tries to retrieve one of
the records returned and, for any reason, the cache has
been removed the gateway will go back to the server and
request that the original query be fulfilled - the original
cache will be recreated.




8.0     The Gateway Software

Parts of the gateway software were written by: Ray Larson
at the School of Library and Information Studies, UC Ber-
keley, who wrote most of the Z39.50 query engine; Clear-
inghouse for Networked Information Discovery and
Retrieval (CNIDR), who took the query engine and made
a WWW gateway from it; and the University of Minnesota
who took the software from CNIDR and made a Gopher+
gateway, in addition to adding more functions provided for
by the query engine into the gateway.

The Go4zgate gateway allows the Gopher Client to:

·      Request Z39.50 based queries from a Z39.50 server.

·      Retrieve and display the records which were returned
by the server.

The gateway is available via anonymous ftp from

boombox.micro.umn.edu

via anonymous ftp from the directory

/pub/gopher/Unix/gopher-gateways/go4zgate

Included in this distribution are some sample ASK blocks
that define the form users are presented with. Also
included is this document.