From: [email protected] (Ulrich Pfeifer)
Newsgroups: comp.lang.perl.announce,comp.lang.perl.misc
Subject: Module Wais 2.1 available
Followup-To: comp.lang.perl.misc
Date: 13 Dec 1995 15:50:17 GMT
Organization: University of Dortmund, Germany
Lines: 319
Approved: [email protected] (comp.lang.perl.announce)
Message-ID: <[email protected]>
Reply-To: [email protected]

For all WWW-WAIS gateway implementors:

       The Perl module Wais 2.1 is available real soon now at your
       favourite CPAN site.

I append the documentation for convenience.

       ... Yes there is documentation now ;-)

Randal: Would you as the author of chat2 replace the 'do' by a '&'? It
       would make the tests look prettier:


t/basic.............Use of "do" to call subroutines is deprecated at /usr/local/ls6/perl5.001n/lib/perl5/chat2.pl line 265.
ok
t/dict..............Use of "do" to call subroutines is deprecated at /usr/local/ls6/perl5.001n/lib/perl5/chat2.pl line 265.
ok
t/parallel..........Use of "do" to call subroutines is deprecated at /usr/local/ls6/perl5.001n/lib/perl5/chat2.pl line 265.
ok
All tests successful.
Files=3,  Tests=17, 34 secs ( 1.83 cusr  0.68 csys =  2.52 cpu)


Currently I also do this in Wais.pm to avoid warnings:-(

       # make strict happy

       @we_know = ($chat::name, $chat::debug, $chat::aliases,
                   $chat::family, $chat::nfound, $chat::thisbuf,
                   $chat::thishost, $chat::timeleft);
       @we_know = ();

The module is tested with 5.001n and 5.002b1f.

--
@J = split //,"J!k Phau^eHeens%rarrot&\ncl t ";
for(0..24){print $J[$_*7%($#J+1)]}
------------------------------------------------------------------------
NAME
      Wais - access to freeWAIS-sf libraries

SYNOPSIS
      use Wais;

DESCRIPTION
      The interface is divided into four major parts.

      SFgate 4.0
                For backward compatibility the functions used in
                SFgate up to version 4 are still present. Their
                use is deprecated and they are not documented
                here. These functions may not be supported in
                future versions of this module.

      Protocol  XS functions that provide low-level access to
                the WAIS protocol. E.g. generate_search_apdu()
                constructs a request message.

      SFgate 5.0
                Perl functions that implement high-level access
                to WAIS servers. E.g. parallel searching is
                supported.

      dictionary
                A bunch of XS functions useful for inspecting
                local databases.

      We will start with the SFgate 5.0 functions.

USAGE
      The main high-level interface consists of the functions
      Wais::Search and Wais::Retrieve. Both return a reference
      to an object of the class Wais::Result.

      Wais::Search

      Arguments of Wais::Search are hash references, one for
      each database to search. The keys of the hashes should be:

      query     The query to submit.

      database  The database which should be searched.

      host      host is optional. It defaults to 'localhost'.

      port      port is optional. It defaults to 210.

      tag       A tag by which individual results can be
                associated with a database/host/port triple. If
                omitted, it defaults to the database name.

      relevant  If present, must be a reference to an array
                containing alternating document id's and types.
                Document id's must be of type Wais::Docid.

      Here is a complete example:

           $result = Wais::Search({'query'    => 'pfeifer',
                                   'database' => $db1,
                                   'host'     => 'ls6',
                                   'relevant' => [$id, 'TEXT']},
                                  {'query'    => 'pfeifer',
                                   'database' => $db2});

      If host is 'localhost' and database.src exists, a local
      search is performed instead of connecting to a server.

      Wais::Search opens at most $Wais::maxnumfd connections in
      parallel.
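
      The defaults listed above can be illustrated without a
      server. This is a minimal sketch: with_defaults is a
      hypothetical helper, not a function of Wais.pm, and the
      database names are made up. It only shows which values a
      request hash ends up with when host, port and tag are
      omitted.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Wais.pm: fill in the documented
# defaults for one Wais::Search request hash.
sub with_defaults {
    my ($req) = @_;
    # Explicit keys in %$req override the defaults listed first.
    my %full = (host => 'localhost', port => 210, %$req);
    # The tag defaults to the database name when omitted.
    $full{tag} = $full{database} unless defined $full{tag};
    return \%full;
}

# Made-up requests in the shape Wais::Search expects.
my @requests = map { with_defaults($_) } (
    { query => 'pfeifer', database => 'db1', host => 'ls6' },
    { query => 'pfeifer', database => 'db2' },
);

for my $r (@requests) {
    print "$r->{tag}: $r->{database} at $r->{host}:$r->{port}\n";
}
```

      The second request names only query and database, so it
      ends up with host 'localhost', port 210 and the tag 'db2'.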

      Wais::Retrieve

      Wais::Retrieve should be called with named parameters
      (i.e. a hash).  Valid parameters are database, host, port,
      docid, and type.

              $result = Wais::Retrieve('database' => $db,
                                       'docid'    => $id,
                                       'host'     => 'ls6',
                                       'type'     => 'TEXT');

      Defaults are the same as for Wais::Search. In addition
      type defaults to 'TEXT'.

      Wais::Result

      The functions Wais::Search and Wais::Retrieve return
      references to objects blessed into Wais::Result. The
      following methods are available:

      diagnostics
                Returns an array of diagnostic messages. Each
                element (if any) is a reference to an array
                consisting of

           tag       The tag of the corresponding search request
                     or 'document' if the request was a retrieve
                     request.

           code      The WAIS diagnostic code.

           message   A textual diagnostic message.

      header    Returns an array of WAIS document headers. Each
                element (if any) is a reference to an array
                consisting of

           tag       The tag of the corresponding search request
                     or 'document' if the request was a retrieve
                     request.

           score

           lines     Length of the corresponding document in
                     lines.

           length    Length of the corresponding document in
                     bytes.

           headline

           types     A reference to an array of types valid for
                     docid.

           docid     A reference to the WAIS identifier blessed
                     into Wais::Docid.

      text      Returns the text fetched by Wais::Retrieve.
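
      As an illustration of the header record layout, here is a
      small sketch that ranks hits by score. The records are
      made-up sample data standing in for what the header method
      is documented to return; a real docid would be a
      Wais::Docid object rather than a string.

```perl
use strict;
use warnings;

# Made-up sample data in the documented layout:
# [tag, score, lines, length, headline, types, docid].
# Plain strings stand in for Wais::Docid objects here.
my @headers = (
    [ 'db1', 420, 12,  640, 'First hit',  ['TEXT'],         'id-a' ],
    [ 'db2', 980, 80, 4100, 'Second hit', ['TEXT', 'HTML'], 'id-b' ],
);

# Sort by descending score and print one line per hit.
my @ranked = sort { $b->[1] <=> $a->[1] } @headers;
for my $h (@ranked) {
    my ($tag, $score, $lines, $length, $headline, $types, $docid) = @$h;
    printf "%-4s %4d %5d bytes  %s (%s)\n",
        $tag, $score, $length, $headline, join(',', @$types);
}
```

      Because each record carries its tag, hits from several
      databases can be merged into one ranked list and still be
      traced back to the request that produced them.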

Dictionary
      There are a couple of functions to inspect local
      databases. See the inspect script in the distribution. You
      need the Curses module to run it. Also adapt the directory
      settings in the top part.

      Wais::dictionary

             %frequency = Wais::dictionary($database);
             %frequency = Wais::dictionary($database, $field);
             %frequency = Wais::dictionary($database, 'foo*');
             %frequency = Wais::dictionary($database,  $field, 'foo*');

      In list context the function returns a list of alternating
      words and frequencies: each word from the global or field
      dictionary that matches the prefix (if given), followed by
      its frequency. In scalar context, the number of matching
      words is returned.
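
      Since the result alternates words and frequencies, it can
      be assigned straight to a hash. A minimal sketch with
      made-up data standing in for the return value of
      Wais::dictionary($database, 'foo*'):

```perl
use strict;
use warnings;

# Made-up stand-in for the documented return list:
# word, frequency, word, frequency, ...
my @pairs = ('foo' => 17, 'food' => 3, 'fool' => 5);

my %frequency = @pairs;            # word => frequency
my $count     = keys %frequency;   # number of matching words

# Order the words by descending frequency.
my @by_freq = sort { $frequency{$b} <=> $frequency{$a} } keys %frequency;
print "$count words, most frequent: $by_freq[0] ($frequency{$by_freq[0]})\n";
```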

      Wais::list_offset

      The function takes the same arguments as Wais::dictionary.
      It returns the same list (or, in scalar context, the same
      word count), with the word frequencies replaced by the
      offset of the posting list in the inverted file.

      Wais::postings

             %postings = Wais::postings($database, 'foo');
             %postings = Wais::postings($database, $field, 'foo');

      Returns a list of alternating numeric document id's and
      references to arrays. The first element of each array is
      the internal weight of the word with respect to the
      document. The other elements are the word or character
      positions of the occurrences of the word in the document.
      If freeWAIS-sf is compiled with -DPROXIMITY, word
      positions are returned, otherwise character positions.

      In scalar context the number of occurrences of the word
      is returned.
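
      The nested structure can be unpacked as follows. This is a
      sketch on made-up data; the weights and positions are
      invented and only mirror the documented layout.

```perl
use strict;
use warnings;

# Made-up stand-in for the documented return value:
# document id => [weight, position, position, ...]
my %postings = (
    3 => [ 0.75, 10, 42 ],
    7 => [ 0.25, 5 ],
);

my $total = 0;
for my $doc (sort { $a <=> $b } keys %postings) {
    # The first element is the weight, the rest are positions.
    my ($weight, @positions) = @{ $postings{$doc} };
    $total += @positions;
    printf "doc %d: weight %.2f, positions %s\n",
        $doc, $weight, join(' ', @positions);
}
print "$total occurrences in total\n";
```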

      Wais::headline

             $headline = Wais::headline($database, $docid);

      The function retrieves the headline (only the text!) of
      the document numbered $docid.

Protocol
      Wais::generate_search_apdu

             $apdu = Wais::generate_search_apdu($query,$database);
             $relevant = [$id1, 'TEXT', $id2, 'HTML'];
             $apdu = Wais::generate_search_apdu($query,$database,$relevant);

      Document id's must be of type Wais::Docid as returned by
      Wais::Result::header or Wais::Search::header.
      $Wais::maxdoc may be set to modify the number of documents
      to retrieve.

      Wais::generate_retrieval_apdu

             $apdu = Wais::generate_retrieval_apdu($database, $docid, $type);
             $apdu = Wais::generate_retrieval_apdu($database, $docid,
                                                   $type, $chunk);

      Builds a request for chunk number $chunk of the document
      whose id is $docid (must be of type Wais::Docid). $chunk
      defaults to 0.  $Wais::CHARS_PER_PAGE may be set to
      influence the chunk size.

      Wais::local_answer

             $answer = Wais::local_answer($apdu);

      Answer the request by local search/retrieval. The message
      header is stripped from the result for convenience (see
      the code of Wais::Search or the documentation of
      Wais::Search::new below).

      Wais::Search::new

             $result = Wais::Search::new($message);

      Turn the result message into an object of type
      Wais::Search. The following methods are available:
      diagnostics, header, and text. Their results are much the
      same as for Wais::Result; only the tags are missing.

      diagnostics
                Return an array of references to [$code,
                $message]

      header    Return an array of references to [$score,
                $lines, $length, $headline, $types, $docid].

      text      Returns the chunk of the document requested. For
                documents larger than $Wais::CHARS_PER_PAGE more
                than one request must be sent.
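
      How many requests a document needs is a ceiling division
      of its length by the chunk size. The sketch below assumes
      that chunks are numbered from 0 and that each covers
      exactly one page of bytes; neither detail is confirmed by
      this documentation, and the document length is a made-up
      value.

```perl
use strict;
use warnings;

# Assumed values: 4096 is the documented default of
# $Wais::CHARS_PER_PAGE; the length would come from a header record.
my $chars_per_page = 4096;
my $length         = 10_000;

# Ceiling division: number of retrieval requests needed.
my $requests = int(($length + $chars_per_page - 1) / $chars_per_page);

for my $chunk (0 .. $requests - 1) {
    my $from = $chunk * $chars_per_page;
    my $to   = $from + $chars_per_page - 1;
    $to = $length - 1 if $to > $length - 1;   # last chunk may be short
    print "chunk $chunk: bytes $from-$to\n";
}
```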

      Wais::Search::DESTROY

      The objects will be destroyed by Perl.

VARIABLES
      $Wais::version
                Generated by: sprintf(buf, "Wais %3.1f%d",
                VERSION, PATCHLEVEL);

      $Wais::errmsg
                Set to a verbose error message if something
                went wrong. Most functions return undef on
                failure after setting $Wais::errmsg.

      $Wais::maxdoc
                Maximum number of hits to return when searching.
                Defaults to 40.

      $Wais::CHARS_PER_PAGE
                Maximum number of bytes to retrieve in a single
                retrieve request.  Wais::Retrieve sends multiple
                requests if necessary to retrieve a document.
                CHARS_PER_PAGE defaults to 4096.

      $Wais::timeout
                Number of seconds to wait for an answer from
                remote servers. Defaults to 120.

      $Wais::maxnumfd
                Maximum number of file descriptors to use
                simultaneously in Wais::Search.

BUGS
      Wais::Search currently splits the request into groups of
      $Wais::maxnumfd requests. Since some requests of a group
      might be local and/or some might refer to the same
      host/port, groups may not use all $Wais::maxnumfd possible
      file descriptors. Therefore some performance may be lost
      when more than $Wais::maxnumfd requests are processed.
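
      The effect can be seen in a small simulation. This is a
      sketch, not the code of Wais::Search: the value of
      $maxnumfd and the request list are made up, and it assumes
      that local requests need no connection and that requests
      to the same host/port share one.

```perl
use strict;
use warnings;

my $maxnumfd = 3;   # assumed value; no default is documented

# Made-up requests; 'localhost' stands for a local search.
my @requests = map { +{ host => $_, port => 210 } }
    qw(ls6 localhost ls6 other localhost ls6 other);

# Split into groups of at most $maxnumfd requests.
my @groups;
my @pending = @requests;
push @groups, [ splice @pending, 0, $maxnumfd ] while @pending;

my @connections;
for my $i (0 .. $#groups) {
    # Count the distinct remote host/port pairs in this group.
    my %remote;
    for my $r (@{ $groups[$i] }) {
        next if $r->{host} eq 'localhost';
        $remote{"$r->{host}:$r->{port}"} = 1;
    }
    $connections[$i] = keys %remote;
    printf "group %d: %d requests, %d connections\n",
        $i, scalar @{ $groups[$i] }, $connections[$i];
}
```

      With these requests the first group holds three requests
      but needs only one connection, although three descriptors
      would have been allowed.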

AUTHOR
      Ulrich Pfeifer <[email protected]>

--
Ulrich  UNIVERSITAET-DORTMUND     telefax:  49 231 755 2405        /////
Pfeifer Lehrstuhl Informatik VI   voice:    49 231 755 3032  ____UNI DO
@RR     D-44221 Dortmund          postbox:  50 05 00         \\*\\////
http://ls6-www.informatik.uni-dortmund.de/WhoIsWhoAtLS6.html  \\\\\//