NAME
   Win32-ProcFarm - system for parallelization of code under Win32

OVERVIEW
 What is Win32::ProcFarm?

   "Win32::ProcFarm" is the code I wrote to speed up tasks that are limited
   by network latency, but not by network bandwidth or local computer
   power. For instance, say you want to ping every address on a subnet. The
   simple approach (excluding pinging the broadcast address) is to
   sequentially ping every address on the subnet. If only 30% of the
   addresses are in use and you wait 1 second before deciding an address is
   not in use, it will take roughly 3 minutes to ping a class C subnet. The
   limitation here is obviously not the local CPU or even network
   bandwidth, but rather latency. One solution would be to break up the
   task. Unfortunately, the thread support in Perl doesn't work with
   ActivePerl, and in any event the support is currently experimental.
   Another approach would be to spin off 10 processes, have each take 25
   addresses, and funnel the information back into a single process for
   reporting.

   This is the approach "Win32::ProcFarm" takes, but it is somewhat more
   sophisticated. A "pool" of processes is created that communicate with
   the parent process using TCP sockets. The parent process communicates
   with the child processes using a "RPC" style library to assign tasks to
   the child processes and to retrieve the return data from those tasks.

   Each child process is comprised of a library file that includes the
   communications routines, as well as whatever subroutines pertain to the
   problem at hand. The parent process spins off the child process, which
   then connects back to the parent process through a TCP port. The parent
   process uses "Data::Dumper" to package up the desired subroutine name
   along with any associated parameters and ships it off to the child
   process. The child process then executes that subroutine and uses
   "Data::Dumper" to package up the return values and send them back to the
   parent. What makes the library useful is that the child process can
   operate asynchronously from the parent; the parent simply calls
   "execute" to instruct the child process to execute a subroutine. The
   parent process can then periodically call "get_state", which will return
   "wait" while the child process is still executing the subroutine. When
   the child process finishes and ships the return values back up the
   socket, the "get_state" method call on the parent object will return the
   "fin" state. The parent then calls "get_retval" to obtain the returned
   values, and the child process can then be used to execute another task.

   The pool system is based upon this simplistic "RPC" system. To use the
   "Win32::ProcFarm::Pool" object, one simply creates a new pool, passing
   it the number of child processes to start as well as the name of the
   child process and a few other parameters. Once the pool has been
   created, one adds jobs to the waiting pool. This might be a list of IP
   addresses to ping, for instance. Then one tells the
   "Win32::ProcFarm::Pool" object to execute all the jobs. The pool assigns
   a job to each of the child processes until all the child processes are
   busy. It then checks the child processes periodically to see if they
   have finished with the task. If they have, it places the return values
   into a hash, identified by an ID passed when the job was created, and
   sends the child process another job. When all the jobs have finished,
   one simply requests the hash of return values and proceeds on.

 Process Farm Advantages

   Speed
       By farming the work out over a large number of processes (I
       typically use from 5 to 30), large speedup factors can be achieved
       fairly easily.

   Reuse
       The process farm system is designed to be fairly easy to use. Simply
       write the function of use, include it in a child process, and add
       roughly 10 lines of boilerplate code to the parent.

   Efficiency in face of variable length jobs
       Because jobs are assigned one-by-one to the child processes as they
       come free, jobs are allocated as efficiently as possible given the
       constraint that the job execution time cannot be predicted.

   Low probability of child process orphaning
       Because the code to kill the child processes when everything is over
       is implemented in the "DESTROY" for the parent, orphaning is a rare
       event.

 Process Farm Limitations

   The Process Farm code is very useful in certain situations, but it has a
   number of limitations that should be kept in mind.

   Child Process Startup Time
       On a dual Pent-Pro/200 with 128MB of RAM, child process startup time
       is roughly 1/3rd of a second. This means spinning off 30 child
       processes takes 10 seconds. The code already uses asynchronous
       startup, and I believe the major limitation remaining is the time
       necessary to start up a Perl process and create the TCP socket.

   Child Process Memory Utilization
       By keeping an eye on total memory utilization, it appears that each
       bare child process uses roughly 2.3MB of memory. A child process
       that also uses "Net::Ping" to implement a ping function uses roughly
       2.6MB of memory. If you spin off 30 of these processes, that's 75MB
       of RAM. If you start swapping, the thrash of 30 processes running
       simultaneously is going to kill any speed benefit, so keep memory
       utilization in mind when selecting the number of child processes to
       use.

 Real World Results

   Despite the limitations, I have found the Process Farm system to be very
   useful. In the previous example of pinging a range of IP addresses, with
   roughly 10% coverage on a Class C, and 31 child processes, total ping
   time runs roughly 21 seconds, a speed up of a factor of 10 on a problem
   that otherwise takes an obnoxious amount of time.

 Further Information

   Please see the "tutorial" in "Docs/tutorial.pod" for more information,
   as well as the POD contained within the actual Perl modules.