NAME
   Set::Files - routines to work with files, each defining a single set

SYNOPSIS
     use Set::Files;
     $Version = $Set::Files::VERSION;

     $obj     = new Set::Files(OPT => VAL, OPT => VAL, ...);

     @set     = $obj->list_sets( [TYPE] );

     @uid     = $obj->owner;
     $uid     = $obj->owner(SET);

     @set     = $obj->owned_by(UID [,TYPE]);

     @ele     = $obj->members(SET);

     $flag    = $obj->is_member(SET, ELE);

     @type    = $obj->list_types( [SET] );

     @dir     = $obj->dir;
     $dir     = $obj->dir(SET);

     %opts    = $obj->opts(SET);
     $val     = $obj->opts(SET,VAR);

     $obj->cache;

     $num     = $obj->add   (SET, FORCE, COMMIT, ELE1,ELE2,...);
     $num     = $obj->remove(SET, FORCE, COMMIT, ELE1,ELE2,...);

     $obj->commit(SET1,SET2,...);

     $obj->delete(SET);
     $obj->delete(SET,1);

DESCRIPTION
   This is a module for working with simple sets of elements where each set
   is defined in a separate file (one file for each set to be defined).

   The advantages of putting each set in a separate file are:

   Set management can be delegated
       If all sets are defined in a single file, management of all sets
       must be done by a single user, or by using a suid program. By
       putting each set in a separate file, different files can be owned by
       different users so management of different sets can be delegated.

   Set files are a simple format
       Because a file consists of a single set only, there is no need to
       have a complex file format which has to be parsed to get information
       about the set. As a result, set files can easily be autogenerated or
       edited with any simple text editor, and errors are less likely to be
       introduced into the file.

   The disadvantages are:

   Permissions problems
       Some applications may need to read all of the data, but since the
       different set files may be owned by different people, permissions
       may get set such that not all set files are readable.

       Applications which actually gather all of the data will need to be
       run as root in order to be reliable. Alternately, some means of
       enforcing the appropriate permissions needs to be in place.

   No central data location
       Usually, when you want to define sets, the data ultimately needs to
       be stored in one central location (which might be a single file or
       database).

       To get around this, a wrapper must be written using this module to
       copy the data to the central location.

   Simple elements only
       Many types of sets have elements which have attributes (for example,
       a ranking within the set or some other attribute). When you start
       adding attributes, you need a more complex file structure in order
       to store this information, so that type of set is not addressed with
       this module. The only attribute that an element has is membership in
       the set.

   Slow data access
       Because the data is spread out over several files, each of which
       must be parsed, and any error checking done, accessing the data can
       be significantly slower than if the data were stored in a central
       location.

   Features of this module include:

   Data caching
       This module provides routines for caching the information from all
       the set files. This can be used to avoid the permissions problems
       (allowing user-run applications access to all cached data) and
       decrease access time (no parsing is needed, and error checking can
       be done prior to caching the information).

       This still requires that a privileged user or suid script be used to
       update the cache.

   Multiple types of sets
       Often, it is convenient to define different types of sets using a
       single set of files as there may be considerable overlap between the
       sets of different types.

       For example, it might be useful to create files containing sets of
       users who belong to different committees in a department. Also,
       there might be sets of users who belong to various departmental
       mailing lists. One solution is to have two different directories,
       one with set files with lists of users on the various committees;
       one with set files with lists of users on each mailing list. Since
       there might be overlap between these groups, it might be nice to
       have the two sets of files overlap. For example, some committees may
       want to have a mailing list associated with the group, others don't
       want a mailing list, and there may be mailing lists not associated
       with a committee.

       This allows you to have a single file for each set of users, but
       some sets will have mailing lists, some will be committees, and some
       will be both.

   Set ownership
       Since the different files may be owned by different people,
       operations based on set ownership can be done.

METHODS
   The following methods are available:

   VERSION
         use Set::Files;
         $Version=$Set::Files::VERSION;

       Check the module version.

   new
         $obj = new Set::Files(OPT => VAL, OPT => VAL, ...);

       This creates a new Set::Files object which reads the appropriate set
       files (or a cache of the information in set files). The
       initialization options available are described below.

   list_sets
         @set     = $obj->list_sets( [TYPE] );

       Returns a list of all defined sets or the sets of the specified
       type.

   owner
         @uid     = $obj->owner;
         $uid     = $obj->owner(SET);

       Lists all UIDs who own a set, or the owner of the specified set.

   owned_by
         @set     = $obj->owned_by(UID [,TYPE]);

       Lists all sets owned by the specified UID (or those of a specific
       type).

   members
         @ele     = $obj->members(SET);

       Lists all elements in the specified set.

   is_member
         $flag    = $obj->is_member(SET, ELE);

       Returns 1 if ELE is a member of SET.

   list_types
         @type    = $obj->list_types( [SET] );

       Returns a list of all defined types, or the types that the
       specified set belongs to.

   dir
         @dir     = $obj->dir;
         $dir     = $obj->dir(SET);

       All directories containing set files, or the directory containing
       the file of the specified set.

   opts
         %opts    = $obj->opts(SET);
         $val     = $obj->opts(SET,VAR);

       Returns a hash of all options set for a set, or the value of a
       specific option. If the specific option is not set, 0 is returned.

   delete
         $obj->delete($set);
         $obj->delete($set,1);

       This removes the specified set file. By default, the set file is
       renamed to .set_files.$set (such backup files are ignored when
       reading in set data). If the optional second argument is passed in,
       no backup is made (i.e. the set file is deleted completely).

       This method is only available to those who have write access to the
       directory containing the set file.

   cache
         $obj->cache;

       This dumps the current set information to a cache file. This method
       is only valid if the data was read in from files. If it was read in
       from the cache, this method will fail.

   add, remove
         $num = $obj->add   (SET, FORCE, COMMIT, ELE1,ELE2,...);
         $num = $obj->remove(SET, FORCE, COMMIT, ELE1,ELE2,...);

       These functions add/remove the specified elements to/from the set.

       When adding elements to a set, it is first checked to see if the
       element is already in the set, and if so, whether it is explicitly
       included in the set file, or comes from some other set file via an
       INCLUDE tag.

       If the element is not in the set, it is added. If the FORCE flag is
       true, the element will be added to the set file explicitly if it is
       already in the set, but only via an INCLUDE tag. In either case,
       any OMIT tag which removes this element will be removed from the
       list.

       When removing elements from a set, a similar set of tests is done.
       If the element is in the set, it is removed from the file (if it
       appears in the file) AND an OMIT tag is included. If the element
       does NOT appear in the set, the file is unmodified unless the FORCE
       flag is true, in which case an OMIT tag is added.

       The COMMIT flag is used to determine whether the file should be
       written out over the existing file. The file can only be written out
       if data was read from the files. If it was read in from the cache,
       this will fail.

       The return value is the number of changes made to the set.

   commit
         $obj->commit(SET1,SET2,...);

       Any changes that have been made with the add and remove methods can
       be written out to the set file(s) with this method. This method is
       only valid if the data was read in from files. If it was read in
       from the cache, this method will fail.

INIT OPTIONS
   The following options can be passed in to the new method:

   path
         path => DIR1:DIR2:...
         path => [ DIR1, DIR2, ... ]

       The set files may be stored in one or more different directories. By
       default, set files are assumed to be in the current directory, but
       using this option, the directory (or directories) can be explicitly
       set.

       Note that if multiple directories are used, and a file of the same
       name exists in more than one of the directories, the first one found
       (in the order that the directories are included in the list) is
       used. A warning will be issued for files of the same name in other
       directories, but they will be ignored.

       Warnings will be issued for unreadable directories, or unreadable
       files within a directory.
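
       The "first one found" rule can be sketched in plain Perl. This is
       an illustration of the documented behavior only, not the module's
       code: first_found is a hypothetical helper, and the directory and
       file names are made up for the demo.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper, not part of Set::Files: map each file name to the
# first directory in @dirs that contains it, and collect shadowed copies.
sub first_found {
    my @dirs = @_;
    my (%where, @shadowed);
    for my $dir (@dirs) {
        my $dh;
        unless (opendir($dh, $dir)) {
            warn "cannot read directory $dir: $!\n";
            next;
        }
        for my $file (sort readdir $dh) {
            my $path = File::Spec->catfile($dir, $file);
            next unless -f $path;             # skip ".", ".." and subdirectories
            if (exists $where{$file}) {
                push @shadowed, $path;        # same name seen earlier: ignored
            } else {
                $where{$file} = $dir;
            }
        }
        closedir $dh;
    }
    return (\%where, \@shadowed);
}

# Demo with two throwaway directories that both contain a "staff" file.
my ($dir1, $dir2) = (tempdir(CLEANUP => 1), tempdir(CLEANUP => 1));
for my $spec ([$dir1, 'staff'], [$dir2, 'staff'], [$dir2, 'admins']) {
    my $path = File::Spec->catfile(@$spec);
    open(my $fh, '>', $path) or die "cannot write $path: $!";
    close $fh;
}
my ($where, $shadowed) = first_found($dir1, $dir2);
print "$_ => $where->{$_}\n" for sort keys %$where;
print "ignored: $_\n" for @$shadowed;
```

       Here "staff" resolves to the first directory, and the copy in the
       second directory is reported as ignored.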

   valid_file
         valid_file => REGEXP
         valid_file => !REGEXP
         valid_file => \&FUNCTION

       By default, all files in the directories are used. With this option,
       filenames are tested and only those that pass will be used. Others
       will be silently ignored.

       REGEXP is a regular expression. Only filenames which match the
       REGEXP will pass (or if !REGEXP is used, only filenames which do NOT
       match REGEXP will pass).

       If a reference to a function is passed in, the function
       &FUNCTION(dir,file) will be evaluated for each file. If it returns
       0, the file will be silently ignored. Otherwise it will be used.
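
       The three forms of the test can be sketched as follows. This is a
       minimal illustration of the rules described above, not the module's
       implementation: file_passes is a hypothetical helper, and it assumes
       that the !REGEXP form is passed as a string with a leading "!".

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the three documented forms of valid_file.
sub file_passes {
    my ($test, $dir, $file) = @_;
    return 1 unless defined $test;                 # no test: every file is used
    return $test->($dir, $file) ? 1 : 0            # \&FUNCTION: called as (dir, file)
        if ref($test) eq 'CODE';
    if ($test =~ /^!(.*)$/s) {                     # !REGEXP: name must NOT match
        my $re = $1;
        return $file =~ /$re/ ? 0 : 1;
    }
    return $file =~ /$test/ ? 1 : 0;               # REGEXP: name must match
}

my $no_readme = sub { my ($dir, $file) = @_; return $file ne 'README' };

for my $file (qw(staff staff~ README)) {
    printf "%-7s regexp=%d negated=%d function=%d\n", $file,
        file_passes('^[a-z]+$', '.', $file),
        file_passes('!~$',      '.', $file),
        file_passes($no_readme, '.', $file);
}
```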

   invalid_quiet
         invalid_quiet => 1

       By default, when a file is ignored due to failing a valid_file test,
       or when an element is ignored due to failing a valid_ele test, a
       warning is issued. With this option, no warning is issued.

   cache
         cache => DIR

       Data from the set files may be cached in order to speed up data
       access. If this option is used, you must specify the directory where
       the data will be cached. The directory may be the same as one of the
       directories containing the set files.

       The cache directory defaults to the first directory given in the
       path option (or the current directory if no path option is given).

   read
         read => "cache"
         read => "files"
         read => "file"

       When an application wants to use data from the set files, it can
       either read the data from the set files or from the cache.

       If the cache option was used, the default is to read from the cache
       if it exists, read from the files otherwise. If no cache option was
       used, the default is to read from the files. When data is read in
       from the cache, the commit and cache methods are disabled.

       If the file option is used, it reads a single set from a single file
       along with all dependency sets (i.e. sets that are included or
       excluded via the appropriate tags). This allows someone to make
       changes to a single set file that they own even if permissions are
       set so that they cannot read other set files. The commit method is
       available, but the cache method is disabled. The file option
       requires that the set option also be present.

       With the files option, all set files are read. Both the commit and
       cache methods are enabled.

   set
         set => SET

       This defines which set to read when the read => "file" option is
       used. This option is required when read => "file" is given and is
       ignored for any other value of read.

   types
         types => TYPE
         types => [ TYPE1, TYPE2, ... ]

       Sets can be of one or more types (or they can belong to no type and
       be used solely in building other sets using the INCLUDE or EXCLUDE
       tags described in the FILE FORMAT section below).

       This option can be used to specify the names of the different types
       of sets defined by these files.

       If this option is not given, then there is only one type and by
       default, all sets belong to it.

   default_types
         default_types => [ TYPEa, TYPEb, ... ]
         default_types => "all"
         default_types => "none"
         default_types => TYPE

       Some types of sets may be more common than others, and you may or
       may not want to have to explicitly define which types a set belongs
       to.

       If a list of types is passed in, every type must be defined in the
       types option (warnings will be issued if any are not). If a value
       of "all" is passed in, sets belong to all types by default. If a
       value of "none" is passed in, sets don't belong to any type by
       default.

       By default, sets belong to all types available.
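
       How the defaults combine with the TYPE and NOTYPE tags (see the
       FILE FORMAT section) can be sketched as below. effective_types is a
       hypothetical helper written for this illustration, and applying
       TYPE before NOTYPE is an assumption: the order is not specified
       here for a set that names the same type in both tags.

```perl
use strict;
use warnings;

# Hypothetical helper: work out which types one set belongs to.
#   types         => all known types (the "types" option)
#   default_types => "all", "none", one type, or an array ref of types
#   type, notype  => types named in the set's @TYPE / @NOTYPE tags
sub effective_types {
    my (%arg) = @_;
    my @all     = @{ $arg{types} };
    my $default = exists $arg{default_types} ? $arg{default_types} : 'all';

    my %in;
    if    (ref($default) eq 'ARRAY') { %in = map { $_ => 1 } @$default }
    elsif ($default eq 'all')        { %in = map { $_ => 1 } @all }
    elsif ($default eq 'none')       { }                 # start with no types
    else                             { $in{$default} = 1 }

    $in{$_} = 1   for @{ $arg{type}   || [] };           # @TYPE adds (assumed first)
    delete $in{$_} for @{ $arg{notype} || [] };          # @NOTYPE removes
    return grep { $in{$_} } @all;                        # keep the declared order
}

my @known = qw(committee mail);
my @a = effective_types(types => \@known, default_types => 'committee', type => ['mail']);
my @b = effective_types(types => \@known, notype => ['mail']);
my @c = effective_types(types => \@known, default_types => 'none');
print "a: @a\nb: @b\nc: @c\n";
```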

   comment
         comment => REGEXP

       This defines a regular expression used to recognize (and strip out)
       comments from a set file. The default expression is "#.*" which
       means that all characters from a pound sign to the end of the line
       are removed.

       If REGEXP is passed in as an empty string, there are no comments.
       All lines are either empty or contain an element.
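
       Comment stripping as described above can be sketched with a single
       substitution. strip_comment is a hypothetical helper, not the
       module's code. The guard on an empty pattern matters in Perl,
       because an empty pattern in a substitution reuses the last
       successfully matched pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: remove whatever the comment pattern matches,
# then trim the surrounding whitespace.
sub strip_comment {
    my ($line, $comment_re) = @_;
    $comment_re = '#.*' unless defined $comment_re;   # documented default
    $line =~ s/$comment_re// if length $comment_re;   # empty string: no comments
    $line =~ s/^\s+|\s+$//g;
    return $line;
}

print '[', strip_comment('alice   # team lead'), "]\n";
print '[', strip_comment('# a whole-line comment'), "]\n";
print '[', strip_comment('a#b', ''), "]\n";           # comments disabled
print '[', strip_comment('bob ; note', ';.*'), "]\n"; # custom comment pattern
```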

   tagchars
         tagchars => STRING

       This defines a character (or a string) which marks a line of the set
       file as containing a tag. The default value is "@".

   valid_ele
         valid_ele => REGEXP
         valid_ele => !REGEXP
         valid_ele => \&FUNCTION

       By default, every non-blank line (after comments have been stripped
       out) is treated as an element. If this option is used, elements are
       tested, and only those that pass the test are treated as valid.
       Others are invalid and produce a warning.

       If a reference to a function is passed in, the function
       &FUNCTION(set,ele) will be evaluated for each element. If it returns
       0, the element will be silently ignored. Otherwise it will be
       included in the set.

   scratch
         scratch => DIR

       When automatically updating a set file, the directory where the
       files live may or may not be writable by a user who owns a set file.

       If the directory is writable by the user, there is no problem. In
       this case, when a new set file is written, the old one is backed up
       and the new one written in its place.

       If the directory is NOT writable by the user, the old copy is backed
       up to the scratch directory. This directory must be writable by the
       user. It defaults to /tmp.

FILE FORMAT
   A set file has a very simple format. It consists of blank lines, tags,
   and elements. Comments may be included as whole lines or part of one of
   the above lines.

   Each line is checked for comments and they are removed before any other
   processing is done. A comment is anything that matches a regular
   expression which can be set using the comment Init option. The default
   regular expression is "#.*" which means that comments start with a pound
   sign anywhere on the line and go to the end of the line.

   Tags are lines which begin with a special string (which can be set
   with the tagchars Init option). The default string is "@". Tag lines
   are of one of the formats:

     @TAG
     @TAG VAL1,VAL2,...

   All other lines are elements. Elements are any string (one per line).

   Leading/trailing spaces are ignored in all cases.

   The set name is the name of the set file.
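
   The line rules above (strip comments, trim spaces, then decide between
   blank line, tag, and element) can be sketched as a small classifier.
   parse_line is a hypothetical helper for illustration, not the module's
   parser. Treating a line that holds only the tag string as an element
   is an assumption, since that case is not covered here.

```perl
use strict;
use warnings;

# Hypothetical classifier for one line of a set file.
sub parse_line {
    my ($line, %opt) = @_;
    my $comment  = exists $opt{comment}  ? $opt{comment}  : '#.*';
    my $tagchars = exists $opt{tagchars} ? $opt{tagchars} : '@';

    $line =~ s/$comment// if length $comment;      # comments go first
    $line =~ s/^\s+|\s+$//g;                       # leading/trailing spaces ignored
    return { kind => 'blank' } if $line eq '';

    if (index($line, $tagchars) == 0) {            # line starts with the tag string
        my $rest = substr($line, length $tagchars);
        my ($tag, $val) = $rest =~ /^\s*(\S+)\s*(.*)$/
            or return { kind => 'element', value => $line };   # assumption
        # The value is kept as raw text: OMIT takes one element that may
        # contain commas, so splitting is left to the code handling each tag.
        return { kind => 'tag', tag => uc $tag, value => $val };
    }
    return { kind => 'element', value => $line };
}

for my $text ('  @include foo,bar   # shared lists', 'alice', '   # note', '@OMIT bob') {
    my $p = parse_line($text);
    print join(' ', map { "$_=" . $p->{$_} } sort keys %$p), "\n";
}
```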

   The following TAGs are known:

   INCLUDE SET1,SET2,...
       This includes all members of one or more other sets in the current
       set.

   EXCLUDE SET1,SET2,...
       This excludes all members of one or more other sets from the current
       set. This overrides any members included from other sets, but does
       NOT exclude members explicitly included in the set file.

   OMIT ELE
       This excludes a specific element from the current set. This
       overrides any elements included via an INCLUDE tag, or any elements
       explicitly included in the set file.

       Each element must be specified separately since there is no
       guarantee that elements do not contain commas.

   TYPE TYPE1,TYPE2,...
       The default types that this set belongs to are determined by the
       types and default_types Init options.

       This tag explicitly puts this set in the specified types, even if
       it is not in those types by default.

   NOTYPE TYPE1,TYPE2,...
       Similar to the TYPE tag, but this tag explicitly removes the set
       from the specified types, even if it is in them by default.

   OPTION VARIABLE [= VALUE]
       Although there is no support for element specific attributes, there
       IS support for attributes which apply to the entire set (and which
       can be made available to applications using these sets).

       Each set may have a hash associated with it containing key/value
       pairs (if no value is included, it defaults to 1). These attributes
       are available using the opts method.
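
       The VARIABLE [= VALUE] form and its default of 1 can be sketched as
       below, together with a lookup that returns 0 for an unset option as
       described for the opts method. parse_option and opt_value are
       hypothetical helpers, and the option names are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: split the text after the OPTION tag into a pair.
sub parse_option {
    my ($text) = @_;
    my ($var, $val) = split /\s*=\s*/, $text, 2;
    $val = 1 unless defined $val;          # no value given: defaults to 1
    return ($var, $val);
}

# Hypothetical lookup: an option that was never set yields 0.
sub opt_value {
    my ($opts, $var) = @_;
    return exists $opts->{$var} ? $opts->{$var} : 0;
}

my %opts;
for my $text ('moderated', 'maxsize = 50') {
    my ($var, $val) = parse_option($text);
    $opts{$var} = $val;
}
print "$_ => ", opt_value(\%opts, $_), "\n" for qw(moderated maxsize archive);
```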

   All tag lines can be repeated any number of times, so:

     @INCLUDE foo,bar

   is equivalent to

     @INCLUDE foo
     @INCLUDE bar

   All tags are case insensitive.

   When determining the members of a set which includes and excludes other
   sets, or omits specific elements from the set, all inclusions are
   evaluated first, followed by all exclusions (i.e. all exclusions
   override all inclusions). If there is a cyclic dependency (i.e. A
   depends on B depends on A where a dependency can either be an INCLUDE or
   EXCLUDE), an error is reported and the cyclic dependency is ignored.
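
   One way to detect such a cycle is to follow the dependencies of a set
   and check whether the walk ever returns to it. This is a sketch only:
   in_cycle is a hypothetical helper, the dependency table is made up, and
   it says nothing about how the module itself detects or ignores cycles.

```perl
use strict;
use warnings;

# Hypothetical dependency table: set name => sets it INCLUDEs or EXCLUDEs.
my %deps = (A => ['B'], B => ['A'], C => ['A']);

# Returns 1 if following dependencies from $start leads back to $start.
sub in_cycle {
    my ($deps, $start) = @_;
    my %seen;
    my @todo = @{ $deps->{$start} || [] };
    while (defined(my $set = shift @todo)) {
        return 1 if $set eq $start;        # came back to where we started
        next if $seen{$set}++;             # already explored this set
        push @todo, @{ $deps->{$set} || [] };
    }
    return 0;
}

print "$_: ", (in_cycle(\%deps, $_) ? 'cyclic' : 'ok'), "\n" for sort keys %deps;
```

   C depends on a cycle without being part of one, so only A and B are
   reported as cyclic.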

   A few examples illustrate the use of INCLUDE, EXCLUDE, and OMIT tags. In
   the examples, the set file A contains the elements: E1, E2, E3. The set
   file B contains the elements: E3, E4, E5. The set file containing the
   following lines:

     @INCLUDE A
     @EXCLUDE B
     E5
     E6

   defines a set containing the elements: E1, E2, E5, E6. The first line
   includes E1, E2, E3. The second line excludes E3. It does NOT exclude E5
   since the EXCLUDE tag does not override elements explicitly included in
   the set file. Finally, the E5 and E6 elements are added.

   The set file containing the following lines:

     @INCLUDE A
     @EXCLUDE B
     @OMIT    E2
     @OMIT    E6
     E5
     E6

   defines a set containing the elements: E1, E5. This is similar to the
   above example, except that the OMIT tags override elements included via
   the INCLUDE tag AND elements explicitly included in the set file.
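
   Both examples can be reproduced with a few lines of Perl that apply the
   rules in the order described: inclusions, then exclusions, then the
   elements listed in the file, then OMIT tags. This is a sketch of the
   documented semantics, not the module's code. resolve is a hypothetical
   helper and it only handles flat sets (no nested INCLUDEs).

```perl
use strict;
use warnings;

# The two sets from the examples.
my %sets = (A => [qw(E1 E2 E3)], B => [qw(E3 E4 E5)]);

# Hypothetical helper: compute the members of one set definition.
sub resolve {
    my (%def) = @_;
    my %member;
    # 1. INCLUDE: start with every member of the included sets.
    $member{$_} = 1 for map { @{ $sets{$_} } } @{ $def{include} || [] };
    # 2. EXCLUDE: drop members of the excluded sets.
    delete $member{$_} for map { @{ $sets{$_} } } @{ $def{exclude} || [] };
    # 3. Elements listed in the file are not affected by EXCLUDE.
    $member{$_} = 1 for @{ $def{elements} || [] };
    # 4. OMIT overrides both included and explicitly listed elements.
    delete $member{$_} for @{ $def{omit} || [] };
    return sort keys %member;
}

my @first  = resolve(include => ['A'], exclude => ['B'],
                     elements => [qw(E5 E6)]);
my @second = resolve(include => ['A'], exclude => ['B'],
                     elements => [qw(E5 E6)], omit => [qw(E2 E6)]);
print "first:  @first\n";
print "second: @second\n";
```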

FILES
   Several files are used by the Set::Files module. They all live in the
   directory set by the cache Init Option except for set specific files
   which live in the same directory as the set file. Files are:

   .set_files.SET
       A backup of the given set. When a set file is updated, the original
       file is stored in this file. The file is stored either in the same
       directory as the set file (if it is writable) or in the directory
       specified by the scratch Init Option.

   .set_files.SET.new
       A temporary file where a new set file (or the update to an old one)
       is written. Once completed, this file is moved into place as the new
       set file. This file lives in the same directory as the set file or
       in the scratch directory.

   .set_files.cache
       The file containing the cache. This is created using the cache
       method.

   .set_files.template
       When creating a new set file (or updating an existing one), this
       file is used (if it exists) as a starting point and then all the
       data is appended to it. This is a good place to store comments
       describing how to edit the set files, etc., that set file maintainers
       can read for help.

KNOWN PROBLEMS
   None at this point.

LICENSE
   This script is free software; you can redistribute it and/or modify it
   under the same terms as Perl itself.

AUTHOR
   Sullivan Beck ([email protected])