From: [email protected] (Steffen Beyer)
Newsgroups: comp.lang.perl.announce,comp.lang.perl.modules
Subject: ANNOUNCE: Set-IntegerFast 3.0
Followup-To: comp.lang.perl.modules
Date: 13 Jan 1997 16:13:22 GMT
Organization: sd&m GmbH & Co. KG Munich, Germany

I am glad and proud to be able to present

=========================================
 Package "Set-IntegerFast" Version 3.0
=========================================

to the Perl community.


Contents of this message:
-------------------------

   -   Legal stuff
   -   Features
   -   Requirements
   -   Most important differences between versions 1.x and 2.0
   -   What does it do
   -   Version history
   -   Plans for the future
   -   Credits
   -   Where to find
   -   Final note


Legal stuff:
------------

Copyright (c) 1995, 1996, 1997 by Steffen Beyer. All rights reserved.
This package is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.


Features:
---------

*   "lib_set.c":

       +   efficient (fast) handling of bit vectors and set operations,
           auto-configuring to use the machine word as basic storage unit
           (most efficient!)
           (C library, completely independent of Perl)

*   "Set::IntegerFast":

       +   efficient (fast) object-oriented methods for handling sets
           of integers (intervals from zero to some positive integer)
           (Perl XSUBs in C, uses "lib_set.c")

*   "Set::IntegerRange":

       +   object-oriented methods for handling sets of integers
           (arbitrary intervals)
           (in Perl, uses "Set::IntegerFast")

       +   overloaded arithmetic and relational operators
           for still more ease of use
           (in Perl, uses first part of "Set::IntegerRange")

*   "Math::MatrixBool":

       +   object-oriented methods for handling matrices of booleans
           (Boolean Algebra)
           (in Perl, uses "Set::IntegerFast")

       +   overloaded arithmetic and relational operators
           for still more ease of use
           (in Perl, uses first part of "Math::MatrixBool")

       +   computes reflexive transitive closure using Kleene's
           algorithm (essential for solving path-problem in graphs)

       +   this is mainly an example application to help you build
           your own (using "Set::IntegerFast")

*   "Math::MatrixReal":

       +   object-oriented methods for handling matrices of reals
           (for demonstration purposes only)
           (in Perl, independent stand-alone module)

       +   overloaded arithmetic and relational operators
           allow you to use this data type (almost) like
           any other built-in Perl data type

       +   features another implementation of Kleene's algorithm to
           compute the minimal costs for all paths in a graph with
           weighted edges (the "weights" being the costs associated
           with each edge)

       +   allows you to solve systems of linear equations using an
           efficient algorithm known as "L-R-decomposition" and several
           approximative (iterative) methods

       +   allows you to convert a matrix into a string (in a nice,
           human-readable format) and to read it back in later (for
           instance from a file!), or using the shell-like "here-
           document" syntax (among other possibilities):

           $matrix = Math::MatrixReal->new_from_string(<<"MATRIX");
           [   3    2    0   ]
           [   0    3    2   ]
           [  $c1  $c2  $c3  ]
           MATRIX

*   "DFA::Kleene":

       +   still another implementation of Kleene's algorithm to compute
           the language accepted by a Deterministic Finite Automaton
           (for demonstration purposes only)
           (in Perl, independent stand-alone module)

*   "Graph::Kruskal":

       +   implementation of Kruskal's efficient algorithm for Minimal
           Spanning Trees in graphs in O( n * ld(n) )
           (for demonstration purposes only)
           (in Perl, independent stand-alone module)

       +   example of an algorithm relying heavily on sets which uses
           a different (fascinating!) representation for sets than the
           "Set::IntegerFast" module (see the Graph::Kruskal(3) man page!)

*   "Kleene.pod":

       +   a short introduction to the theory behind Kleene's algorithm
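
To give an idea of what "reflexive transitive closure" means in practice,
here is a minimal sketch in C. It is not taken from "Math::MatrixBool";
it uses Warshall's algorithm, which is the special case of Kleene's
algorithm for boolean matrices. All names (reflexive_transitive_closure,
closure_reaches, CLOSURE_MAX_N) are hypothetical, and the sketch assumes
at most 32 nodes so that each matrix row fits into one 32-bit word.

```c
#include <assert.h>
#include <stdint.h>

#define CLOSURE_MAX_N 32   /* assumption: one 32-bit word per matrix row */

/* Replace the adjacency matrix rows[0..n-1] by its reflexive transitive
   closure. Bit j of rows[i] is set when there is an edge from i to j. */
void reflexive_transitive_closure(uint32_t rows[], unsigned n)
{
    unsigned i, k;

    assert(n <= CLOSURE_MAX_N);

    /* reflexive part: every node reaches itself */
    for (i = 0; i < n; i++)
        rows[i] |= (uint32_t)1 << i;

    /* Warshall step: if i reaches k, then i reaches whatever k reaches */
    for (k = 0; k < n; k++)
        for (i = 0; i < n; i++)
            if (rows[i] & ((uint32_t)1 << k))
                rows[i] |= rows[k];
}

/* Returns 1 if "to" can be reached from "from" in the closed matrix. */
int closure_reaches(const uint32_t rows[], unsigned from, unsigned to)
{
    return (int)((rows[from] >> to) & 1u);
}
```

Because a whole row is merged with a single "or", the inner work per pair
(i, k) is one word operation, which is the kind of gain bit vectors are
meant to provide.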


Requirements:
-------------

Perl version 5.000 or higher, a C compiler capable of the ANSI C standard (!)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


What are the most important differences between versions 1.x and 2.0:
---------------------------------------------------------------------

1) The standard calling convention for the object constructor method
  is supported now, i.e.

           $set = Set::IntegerFast::Create($elements);

  is _gone_ and is replaced by

           $set = new Set::IntegerFast($elements);

  (as well as all the other ways of calling a class method that Perl
  offers)

2) The object destructor method has also changed its name:

           $set->Destroy();

  is also _gone_ and replaced by

           $set->DESTROY();

  Note however that you don't need to call this method explicitly
  anymore (!) - Perl will do it automatically for you when the last
  reference to your set is deleted, for instance through assigning
  a different value to the Perl variable containing the reference
  to your set, like in "$set = 0;".

3) The man page is no longer a separate file; it is now included in
  the file "IntegerFast.pm" in POD format, where it will automatically
  be found and installed in your "man" directory by "make install".

4) A wrapper module named "Set::IntegerRange" has been added to this
  package which allows you to use sets of integers in an *arbitrary*
  interval (instead of from zero to some positive integer).


What does it do:
----------------

The base module of this package, "Set::IntegerFast", allows you to create
sets of arbitrary size (limited only by the size of a machine word and the
available memory on your system) over an interval of non-negative integers
starting with zero, to dynamically change the size of such sets, and to
perform all the basic set operations on them, such as

- adding or removing elements,

- testing for the presence of a certain element,

- computing the union, intersection, difference, symmetric difference or
 complement of sets,

- copying sets,

- testing two sets for equality or inclusion, and

- computing the minimum, the maximum and the norm (number of elements) of
 a set.
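
As an illustration of why these operations are cheap on bit vectors, here
is a minimal sketch in C. It is not the code of "lib_set.c"; the function
names (set_union, set_intersection, set_difference, set_is_subset,
set_norm) are hypothetical, and the sketch assumes that all sets involved
are arrays of the same number of "unsigned" words.

```c
#include <assert.h>
#include <stddef.h>

/* x = y union z, one word at a time */
void set_union(unsigned *x, const unsigned *y, const unsigned *z, size_t words)
{
    size_t i;
    assert(x != NULL && y != NULL && z != NULL);
    for (i = 0; i < words; i++) x[i] = y[i] | z[i];
}

/* x = y intersected with z */
void set_intersection(unsigned *x, const unsigned *y, const unsigned *z, size_t words)
{
    size_t i;
    assert(x != NULL && y != NULL && z != NULL);
    for (i = 0; i < words; i++) x[i] = y[i] & z[i];
}

/* x = y without the elements of z */
void set_difference(unsigned *x, const unsigned *y, const unsigned *z, size_t words)
{
    size_t i;
    assert(x != NULL && y != NULL && z != NULL);
    for (i = 0; i < words; i++) x[i] = y[i] & ~z[i];
}

/* Returns 1 if every element of y is also in z (inclusion test). */
int set_is_subset(const unsigned *y, const unsigned *z, size_t words)
{
    size_t i;
    assert(y != NULL && z != NULL);
    for (i = 0; i < words; i++)
        if (y[i] & ~z[i]) return 0;
    return 1;
}

/* Norm: number of elements, i.e. number of set bits. */
size_t set_norm(const unsigned *y, size_t words)
{
    size_t i, count = 0;
    assert(y != NULL);
    for (i = 0; i < words; i++) {
        unsigned w = y[i];
        while (w != 0) {      /* clear the lowest set bit each round */
            w &= w - 1;
            count++;
        }
    }
    return count;
}
```

Each loop touches every word exactly once, so the cost grows with the
number of words and not with the number of elements.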

Note that it is extremely easy to implement sets of arbitrary intervals
of integers using this module (negative indices are no obstacle), despite
the fact that only intervals of non-negative integers (from zero to some
positive integer) are supported directly.

Please refer to the Set::IntegerFast(3) man page and the "Set::IntegerRange"
module to see how this can be done!

The module is mainly intended for mathematical or algorithmic computations.
There are also a number of efficient algorithms that rely on sets.

An example of such an efficient algorithm (which uses a different
representation for sets than this module, however) is Kruskal's algorithm
for minimal spanning trees in graphs. (That algorithm is included in this
distribution as a Perl module for those interested. Please refer to the
Graph::Kruskal(3) man page for more details!)
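
For readers who have not met Kruskal's algorithm, here is a minimal sketch
in C. It is not a translation of "Graph::Kruskal". It assumes a
disjoint-set forest ("union-find") as the set representation, which is one
common choice; whether the Perl module uses exactly this is not stated
here. The names Edge, find_root, compare_cost and kruskal_total_cost are
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int u, v; int cost; } Edge;

/* Follow parent links to the representative of x's set,
   shortening the path on the way (path halving). */
static int find_root(int parent[], int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* qsort callback: order edges by ascending cost */
static int compare_cost(const void *a, const void *b)
{
    const Edge *ea = (const Edge *)a;
    const Edge *eb = (const Edge *)b;
    return (ea->cost > eb->cost) - (ea->cost < eb->cost);
}

/* Sorts the edges in place and returns the total cost of a minimal
   spanning forest, or -1 if memory could not be allocated. */
long kruskal_total_cost(Edge edges[], int edge_count, int vertex_count)
{
    int *parent;
    long total = 0;
    int i;

    assert(vertex_count > 0 && edge_count >= 0);

    parent = (int *)malloc((size_t)vertex_count * sizeof *parent);
    if (parent == NULL) return -1;
    for (i = 0; i < vertex_count; i++) parent[i] = i;  /* singleton sets */

    qsort(edges, (size_t)edge_count, sizeof edges[0], compare_cost);

    for (i = 0; i < edge_count; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru != rv) {            /* edge joins two different trees */
            parent[ru] = rv;       /* merge the two sets */
            total += edges[i].cost;
        }
    }
    free(parent);
    return total;
}
```

The sets here are never stored as bit vectors at all; each set is known
only through its representative, which is what makes the merge step so
cheap.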

Another famous algorithm using sets is the "Sieve of Eratosthenes" for
calculating prime numbers, which is included here as a demo program
(see "Set/primes.pl").
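
The idea of the sieve can be sketched in a few lines of C. This is not the
demo program "Set/primes.pl", just an independent illustration on a plain
bit array; the names count_primes and SIEVE_WORD_BITS are hypothetical,
and the sketch assumes that the limit is at most UINT_MAX / 2.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SIEVE_WORD_BITS (sizeof(unsigned) * CHAR_BIT)

/* Returns the number of primes <= limit, or -1 if out of memory. */
long count_primes(unsigned limit)
{
    size_t words = (size_t)limit / SIEVE_WORD_BITS + 1;  /* bits 0..limit */
    unsigned *composite;
    unsigned i, j;
    long count = 0;

    assert(limit <= UINT_MAX / 2);   /* keeps "j += i" from overflowing */

    composite = (unsigned *)calloc(words, sizeof *composite);
    if (composite == NULL) return -1;

    for (i = 2; i <= limit; i++) {
        if (composite[i / SIEVE_WORD_BITS] & (1u << (i % SIEVE_WORD_BITS)))
            continue;                /* already struck out */
        count++;                     /* i is prime */
        if (i > limit / i)
            continue;                /* i * i would exceed the limit */
        for (j = i * i; j <= limit; j += i)
            composite[j / SIEVE_WORD_BITS] |= 1u << (j % SIEVE_WORD_BITS);
    }
    free(composite);
    return count;
}
```

The set of "struck out" numbers is exactly the kind of dense set of small
integers for which one bit per element pays off.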

An important field of application is the computation of "first", "follow"
and "look-ahead" character sets for the construction of LL, SLR, LR and LALR
parsers for compilers (or a compiler-compiler, like "yacc", for instance).

(That's what the C library in this package was initially written for)

(See Aho, Hopcroft, Ullman, "The Design and Analysis of Computer Algorithms"
for an excellent book on efficient algorithms and the famous "Dragon Book"
on how to build compilers by Aho, Sethi, Ullman)

Therefore, this module is primarily designed for efficiency and not for a
comfortable user interface (the latter can be added by additional modules,
as shown by the "Set::IntegerRange" and "Math::MatrixBool" modules).

It only offers a basic functionality and leaves it up to your application
to add whatever special handling it needs (for example, negative indices
can be realized by biasing the whole range with an offset).

(Please refer to the "Set::IntegerRange" module in this package to see how!)
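
The biasing trick itself can be sketched in C as follows. This is not the
code of "Set::IntegerRange" (which is written in Perl); the names
RangeSet, range_set_init, range_set_insert, range_set_contains and
range_set_free are hypothetical, and the sketch assumes that
"upper - lower" does not overflow a long.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define RANGE_WORD_BITS (sizeof(unsigned) * CHAR_BIT)

typedef struct {
    long lower, upper;   /* inclusive bounds of the interval */
    unsigned *bits;      /* bit 0 stands for element "lower" */
} RangeSet;

/* Returns 1 on success, 0 if memory could not be allocated. */
int range_set_init(RangeSet *s, long lower, long upper)
{
    size_t count;

    assert(lower <= upper);
    count = (size_t)(upper - lower) + 1;
    s->lower = lower;
    s->upper = upper;
    s->bits = (unsigned *)calloc(
        (count + RANGE_WORD_BITS - 1) / RANGE_WORD_BITS, sizeof *s->bits);
    return s->bits != NULL;
}

/* Bias the element by the lower bound to get a zero-based index. */
static size_t range_index(const RangeSet *s, long element)
{
    assert(element >= s->lower && element <= s->upper);
    return (size_t)(element - s->lower);
}

void range_set_insert(RangeSet *s, long element)
{
    size_t i = range_index(s, element);
    s->bits[i / RANGE_WORD_BITS] |= 1u << (i % RANGE_WORD_BITS);
}

int range_set_contains(const RangeSet *s, long element)
{
    size_t i = range_index(s, element);
    return (s->bits[i / RANGE_WORD_BITS] >> (i % RANGE_WORD_BITS)) & 1u;
}

void range_set_free(RangeSet *s)
{
    free(s->bits);
    s->bits = NULL;
}
```

With an interval of -5..40, for instance, element -5 lands on bit 0 and
element 40 on bit 45; the underlying bit vector never sees a negative
index.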

Sets in this module are implemented as bit vectors, and elements are
non-negative integers from zero to the maximum number of elements (which
you specify when creating the set) minus one.

Each element (i.e., number or "index") thus corresponds to one bit in the
bit array. Element number 0 corresponds to bit number 0 of word number 0,
element number 1 corresponds to bit number 1 of word number 0, and so on.
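
In C, this mapping from element number to word and bit might look like
the following minimal sketch. It is not copied from "lib_set.c"; the names
(BITS_PER_WORD, word_of, mask_of, bit_insert, bit_delete, bit_contains)
are hypothetical, and the bounds check via assert() is an addition of this
sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum { BITS_PER_WORD = sizeof(unsigned) * CHAR_BIT };

/* Index of the word that holds the bit for this element. */
size_t word_of(size_t element)
{
    return element / BITS_PER_WORD;
}

/* Mask with only this element's bit set within its word. */
unsigned mask_of(size_t element)
{
    return 1u << (element % BITS_PER_WORD);
}

/* "size" is the number of elements the set was created for. */
void bit_insert(unsigned *set, size_t size, size_t element)
{
    assert(element < size);
    set[word_of(element)] |= mask_of(element);
}

void bit_delete(unsigned *set, size_t size, size_t element)
{
    assert(element < size);
    set[word_of(element)] &= ~mask_of(element);
}

int bit_contains(const unsigned *set, size_t size, size_t element)
{
    assert(element < size);
    return (set[word_of(element)] & mask_of(element)) != 0;
}
```

Inserting elements 0 and 1 sets the two lowest bits of word 0, and the
first element that falls into word 1 has the number BITS_PER_WORD.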

The module doesn't use bytes as its basic storage unit; it uses machine
words instead, assuming that a machine word is the most efficiently handled
size of all scalar types on any machine (that's what the C standard
suggests and assumes anyway).

In order to achieve this, it automatically determines the number of bits
in a machine word on your system and then adjusts its internal constants
accordingly.
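
How such a self-configuration could be done is sketched below in C. The
actual procedure used by "lib_set.c" is not described in this text, so
this is only one plausible way under stated assumptions: the word is taken
to be an "unsigned", its width is assumed to be a power of two, and the
names count_word_bits, WordConfig and configure_words are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Count the bits of an unsigned by shifting an all-ones word to zero. */
unsigned count_word_bits(void)
{
    unsigned sample = ~0u;
    unsigned bits = 0;

    while (sample != 0) {
        sample >>= 1;
        bits++;
    }
    return bits;
}

typedef struct {
    unsigned bits;       /* number of bits in a word            */
    unsigned log2_bits;  /* shift count replacing "/ bits"      */
    unsigned modmask;    /* mask replacing "% bits"             */
} WordConfig;

/* Derive the constants needed for word/bit index arithmetic. */
WordConfig configure_words(void)
{
    WordConfig c;

    c.bits = count_word_bits();
    c.log2_bits = 0;
    while ((1u << c.log2_bits) < c.bits)
        c.log2_bits++;
    /* this sketch only works for power-of-two word widths */
    assert((1u << c.log2_bits) == c.bits);
    c.modmask = c.bits - 1;
    return c;
}
```

Once these constants are known, "element >> log2_bits" gives the word
number and "element & modmask" the bit number, without any division.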

The greater the size of this basic storage unit, the better the complexity
of the methods in this module (but also the greater the average waste of
unused bits in the last word).

See the section on COMPLEXITY in the Set::IntegerFast(3) man page for an
overview of the complexity of each method!

Note that the C library in this package ("lib_set.c") is designed in such
a way that it may be used independently from Perl and this Perl extension
module. (!)

For this, you can use the file "lib_set.o" exactly as it is produced when
building this module! It contains no references to Perl, and it doesn't need
any Perl header files in order to compile. (It only needs "lib_defs.h" and
some system header files)

Note however that this C library does not perform any bounds checking
whatsoever! (This is left to your application!)

(See the corresponding explanation in the file "lib_set.c" for more details
and the file "IntegerFast.xs" for an example of how this can be done!)

In this module, all bounds and type checking (which should be absolutely
fool-proof, by the way!) is done in the XS subroutines.

For more details on the modules in this package, please refer to their
respective man pages!


Version history:
----------------

Version 1.0 was the initial release.

Version 1.1 offered a new "Resize" method which allows you to change the
size of an existing set while keeping the information it contains (as much
of it as will fit into the new set). It also resolved a discrepancy with the
documentation: the methods Create, Empty, Fill and Copy had complexity n/8
and not n/b as documented, so their implementation was changed (they now
have complexity n/b).

The interface of the C routines was made more consistent (the pointer to
the set is now always the first argument) and a few more paragraphs were
added to the documentation.

The method "ExclusiveOr" (which calculates the symmetric difference X =
(Y + Z) \ (Y * Z) of two sets Y and Z) was also added in this version.
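
On bit vectors, the formula above reduces to a word-wise "exclusive or".
The following minimal sketch in C shows both forms side by side; it is not
the library's code, and the names exclusive_or_by_definition and
exclusive_or_direct are hypothetical. Both sets are assumed to have the
same number of words.

```c
#include <assert.h>
#include <stddef.h>

/* X = (Y + Z) \ (Y * Z): union minus intersection, word by word */
void exclusive_or_by_definition(unsigned *x, const unsigned *y,
                                const unsigned *z, size_t words)
{
    size_t i;
    assert(x != NULL && y != NULL && z != NULL);
    for (i = 0; i < words; i++)
        x[i] = (y[i] | z[i]) & ~(y[i] & z[i]);
}

/* The same result with one operation per word */
void exclusive_or_direct(unsigned *x, const unsigned *y,
                         const unsigned *z, size_t words)
{
    size_t i;
    assert(x != NULL && y != NULL && z != NULL);
    for (i = 0; i < words; i++)
        x[i] = y[i] ^ z[i];
}
```

For Y = {0, 2} and Z = {1, 2} (bit patterns 101 and 110), both functions
yield X = {0, 1} (bit pattern 011).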

Version 1.1 broke with the next new release of Perl, version 5.002 (problems
with the "Destroy" method and "bad free() ignored" warnings that caused some
of the tests in "make test" to fail).

Version 2.0 fixed the problem that appeared with Perl 5.002 in version 1.1.

As a matter of fact, version 2.0 was a complete rewrite of the XSUB part
of this package. The C library of the package ("lib_set.c") has also been
slightly changed; the functions "lexorder" and "Compare" are handled more
efficiently now (complexity 1..n/b instead of 1..n/8). Parameter types have
been adjusted to reflect their nature as those integers that the sets of this
package are all about. The documentation has been completely rewritten and
ported to POD format.

A new module (in fact a "wrapper" module for the "Set::IntegerFast" module)
named "Set::IntegerRange" (with version number 1.0) has been added in version
2.0 of this package which allows you to use sets of integers in an *arbitrary*
interval (instead of from zero to some positive integer).

Version 2.1 of the "Set::IntegerFast" module (and version 2.0 of the
"Set::IntegerRange" module) introduces a new method, "flip", which flips
an element and returns its new state. It also fixes the "known bug"
mentioned in the README file and man page of "Set::IntegerFast" version 2.0.

Version 2.0 of the "Set::IntegerRange" module also introduces the possibility
to use overloaded operators (instead of explicit method calls).

Starting with the version 2.1 of the "Set::IntegerFast" module, the
"Set-IntegerFast" distribution has a version number independent of the
"Set::IntegerFast" module, beginning with version number 3.0.

This is because in this version (version 3.0 of the "Set-IntegerFast"
distribution and version 2.1 of the "Set::IntegerFast" module) a couple
of companion modules have been added:

"Math::MatrixBool", "Math::MatrixReal" and "DFA::Kleene"; and the former
"kruskal" demo program has been upgraded to a Perl module ("Graph::Kruskal")
as well.

A short essay about the theory behind Kleene's algorithm has also been added.

The directory structure of the distribution had to be adjusted accordingly,
and separate Makefiles had to be provided instead of a single one in order
to assure correct building and installation of the modules without changing
the standard procedure: "perl Makefile.PL ; make ; make test ; make install".

There is an alternative way of doing this which doesn't need the many
Makefile.PL's: create a "lib" subdirectory in the root directory of this
distribution and move the subdirectories "DFA", "Graph" and "Math" into
it. It's just that I like the first way better...


Plans for the future:
---------------------

See if using "confess" instead of "croak" everywhere (especially in the
XSUBs) would give the user a hint as to where in his program an error
detected at the bottom of a calling hierarchy of modules really comes from,
as I imagine it would. See if this really is a problem that can occur.
(Till now, I could only provoke this using dirty tricks!)

Define test cases for "Math::MatrixBool" and "Math::MatrixReal" (which is
not trivial for the latter since results will depend on the local
implementation of floating point arithmetic on a given machine!).


Credits:
--------

Many thanks to Andreas Koenig <[email protected]> for his
efforts as upload-manager for the CPAN, his patience, and lots of good
advice and suggestions! Thank you for doing such a tremendous (and time-
consuming) job!!

Also many thanks to David Jenkins <[email protected]> for reviewing the
first version of this README file and the man page.

Many thanks to Jarkko Hietaniemi <[email protected]> for his
suggestions while I was developing the first release of this package!

Many thanks also to the people of the perl5-porters <[email protected]>
mailing list, specifically:

Andreas Koenig <[email protected]>
Tim Bunce <[email protected]>
Jarkko Hietaniemi <[email protected]>
Felix Gallo <[email protected]>
Mark A Biggar <[email protected]>
Nick Ing-Simmons <[email protected]>
John Macdonald <[email protected]>

for discussing and clarifying the naming and other issues of this package!

Also many thanks to David Thompson <[email protected]> for reporting a
problem he encountered concerning the inclusion of the Perl distribution
("Unable to find include file ...") and for suggesting a solution for this
problem. (That's the most pleasant kind of problem report, of course! ;-) )

Many thanks to Rob Johnson <[email protected]> for an improved algorithm
for computing binomials with always integer intermediate results (and
numbers never getting too big)!

Thanks to Banchong Harangsri <[email protected]> for reporting the
problem of the version 1.1 of this module with Perl 5.002!

Special thanks to Dean Roehrich <[email protected]> for his assistance
in trying to find the cause of and a remedy for the above problem!

Many thanks to Andreas Koenig for notifying me of the alternative for the
directory structure using the "lib" subdirectory and a way to use "confess"
in an XSUB via "perl_eval_sv".


Where to find:
--------------

At the usual ftp sites for Perl (CPAN = "Comprehensive Perl Archive Network"):

The file

       Set-IntegerFast-3.0.tar.gz

can be found in directory

   .../CPAN/authors/id/STBEY/

or

   .../CPAN/modules/by-category/06_Data_Type_Utilities/Set/

or

   .../CPAN/modules/by-module/Set/

(See "The Perl 5 Module List" by Tim Bunce and Andreas Koenig
in news:comp.lang.perl.modules for a list of CPAN ftp servers)


Final note:
-----------

If you need any assistance or have any comments, problems, suggestions,
findings, complaints, questions, insights, compliments or donations to give ;-)
then please don't hesitate to send me an e-mail:

[email protected] (Steffen Beyer)

In fact I'd be glad if you could drop me an e-mail when you are using this
package, so I can see how much interest exists in it and how much time is
reasonable to spend on its further development.

Therefore, I would also be glad to know what you liked and what you disliked
about this package!

And I would also be very interested to know in which application you
found this package to be useful, just to get an idea of what can be done
with it and in which direction it should be developed further.

Many thanks in advance!!

With kind regards,
--
   |s  |d &|m  |    Steffen Beyer <[email protected]> (+49 89) 63812-244 fax -150
   |   |   |   |    software design & management GmbH & Co. KG
   |   |   |   |    Thomas-Dehler-Str. 27, 81737 Munich, Germany.