NAME
   Session::Token - Secure, efficient, simple random session token
   generation

SYNOPSIS
 Simple 128-bit session token
       my $token = Session::Token->new->get;
       ## 74da9DABOqgoipxqQDdygw

 Keep generator around
       my $generator = Session::Token->new;

       my $token = $generator->get;
       ## bu4EXqWt5nEeDjTAZcbTKY

       my $token2 = $generator->get;
       ## 4Vez56Zc7el5Ggx4PoXCNL

 Custom minimum entropy in bits
       my $token = Session::Token->new(entropy => 256)->get;
       ## WdLiluxxZVkPUHsoqnfcQ1YpARuj9Z7or3COA4HNNAv

 Custom alphabet and length
       my $token = Session::Token->new(alphabet => 'ACGT', length => 100_000_000)->get;
       ## AGTACTTAGCAATCAGCTGGTTCATGGTTGCCCCCATAG...

DESCRIPTION
   This module provides a secure, efficient, and simple interface for
   creating session tokens, password reset codes, temporary passwords,
   random identifiers, and anything else you can think of.

   When a Session::Token object is created, 1024 bytes are read from
   "/dev/urandom" (Linux, Solaris, most BSDs) or "/dev/arandom" (some older
   BSDs), or obtained from Crypt::Random::Source::Strong::Win32 (Windows).
   These bytes are used to seed the ISAAC-32 pseudo random number
   generator.

   Once a generator is created, you can repeatedly call the "get" method on
   the generator object and it will return new tokens.

   IMPORTANT: Forking duplicates the generator state. If your application
   calls "fork", make sure that any generators are re-created in at least
   one of the processes after the fork. Otherwise the parent and child
   processes will go on to produce identical tokens (as with perl's rand
   after seeding).
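
   The duplicated-state problem can be seen with perl's own rand, which
   also keeps its state in user space. The following is only a sketch:
   "rand_after_fork" is a hypothetical helper name, and the fixed seed is
   there purely to make the demonstration repeatable.

```perl
use strict;
use warnings;
use POSIX ();

# Seed perl's built-in PRNG, fork, and return the next value that the
# parent and the child each draw from their copy of the state.
sub rand_after_fork {
    srand(12345);                      # seed once, before forking

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                   # child: report its next value
        close $reader;
        print {$writer} rand(), "\n";
        close $writer;                 # flushes the pipe
        POSIX::_exit(0);               # leave without touching parent buffers
    }

    close $writer;                     # parent
    chomp(my $child_value = <$reader>);
    close $reader;
    waitpid($pid, 0);

    my $parent_value = rand() . "";    # drawn from the same copied state
    return ($parent_value, $child_value);
}

my ($parent, $child) = rand_after_fork();
print "parent: $parent\nchild:  $child\n";
```

   Both lines print the same number, because neither process re-seeded
   after the fork. A generator object that is shared across a fork behaves
   the same way.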

   After the generator context is created, no system calls are used to
   generate tokens. This is one way that Session::Token helps with
   efficiency. However, this is only important for certain use cases
   (generally not web sessions).

   ISAAC is a cryptographically secure PRNG that improves on the well known
   RC4 algorithm in some important areas. For instance, it doesn't have
   short cycles like RC4 does. A theoretical shortest possible cycle in
   ISAAC is "2**40", although no cycles this short have ever been found
   (and probably don't exist at all). On average, ISAAC cycles are a
   ridiculous "2**8295".

   In a server application, the most important reason to use the "keep
   generator around" mode instead of creating a Session::Token object
   every time you need a token is that generating a token from an existing
   generator cannot fail due to a full file descriptor table. Creating a
   new generator for every token can fail for this reason. Programs that
   re-use the generator are also more efficient and are less likely to
   cause problems in "chroot" environments.

   Aside: Some crappy (usually C) programs that assume opening
   "/dev/urandom" will always succeed can return session tokens based only
   on the contents of nulled or uninitialised memory! Unix really ought to
   provide a system call for random data.
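
   For contrast, here is a sketch of a read that refuses to continue when
   anything goes wrong. "read_random_bytes" is a hypothetical helper, not
   part of this module, and the path and byte count are whatever the
   caller chooses.

```perl
use strict;
use warnings;

# Read exactly $count bytes from $path, dying on any failure instead of
# returning a short or empty buffer.
sub read_random_bytes {
    my ($path, $count) = @_;

    open(my $fh, '<:raw', $path) or die "cannot open $path: $!\n";
    my $bytes = '';
    while (length($bytes) < $count) {
        # Append at the current end of the buffer until it is full.
        my $got = sysread($fh, $bytes, $count - length($bytes), length($bytes));
        die "read from $path failed: $!\n" unless defined $got;
        die "unexpected end of $path\n"    if $got == 0;
    }
    close $fh;
    return $bytes;
}

# Example call (not executed here):
#   my $bytes = read_random_bytes('/dev/urandom', 1024);
```

   The point is only that every step (open, each read, the final length)
   is checked, so a full descriptor table or a missing device in a
   "chroot" produces an error rather than a predictable token.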

CUSTOM ALPHABETS
   Being able to choose exactly which characters appear in your token is
   sometimes useful. This set of characters is called the *alphabet*. The
   default alphabet size is 62 characters: uppercase latin letters,
   lowercase latin letters, and digits ("a-zA-Z0-9").

   For some purposes, base-62 is a sweet spot. It is much more compact than
   hexadecimal encoding, which helps with efficiency because session tokens
   are usually transferred over the network many times during a session
   (often uncompressed in HTTP headers).

   Also, base-62 tokens don't use "wacky" characters like base-64 encodings
   do. These characters sometimes cause encoding/escaping problems (e.g.
   when embedded in URLs) and are annoying because often you can't select
   tokens by double-clicking on them.

   Although the default is base-62, there are all kinds of reasons you
   might like to use another alphabet. One example is if your users are
   reading tokens from a print-out or SMS or whatever, you may choose to
   omit characters like "o", "O", and 0 that can easily be confused.

   To set a custom alphabet, just pass in either a string or an array of
   characters to the "alphabet" parameter of the constructor:

       Session::Token->new(alphabet => '01')->get;
       Session::Token->new(alphabet => ['0', '1'])->get; # same thing
       Session::Token->new(alphabet => ['a'..'z'])->get; # character range
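
   As a sketch of the "omit confusable characters" idea, the alphabet can
   be built by filtering the default-style character list.
   "unambiguous_alphabet" is a hypothetical helper, and the choice of
   which characters count as confusable (here "o", "O", 0, "l", "I" and 1)
   is only an assumption for illustration.

```perl
use strict;
use warnings;

# Start from a-z, A-Z, 0-9 and drop characters that are easy to misread.
sub unambiguous_alphabet {
    my %confusable = map { $_ => 1 } qw(o O 0 l I 1);
    return [ grep { !$confusable{$_} } ('a' .. 'z', 'A' .. 'Z', '0' .. '9') ];
}

my $alphabet = unambiguous_alphabet();
printf "%d characters: %s\n", scalar @$alphabet, join('', @$alphabet);
```

   This leaves 56 characters. The returned array reference can be passed
   as the "alphabet" parameter in the same way as the arrays above.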

ENTROPY
   There are two ways to specify the length of tokens. The first is
   directly in terms of characters:

       print Session::Token->new(length => 5)->get;
       ## -> wpLH4

   The second way is to specify their minimum entropy in terms of bits:

       print Session::Token->new(entropy => 24)->get;
       ## -> Fo5SX

   In the above example, the resulting token is guaranteed to have at least
   24 bits of entropy. Given the default base-62 alphabet, we can compute
   the exact entropy of a 5 character token as follows:

       $ perl -E 'say 5 * log(62)/log(2)'
       29.7709815519344

   So these tokens have about 29.8 bits of entropy. Note that if we removed
   one character from this token, it would bring it below our desired 24
   bits of entropy:

       $ perl -E 'say 4 * log(62)/log(2)'
       23.8167852415475

   The default minimum entropy is 128 bits. Default tokens are 22
   characters long and therefore have about 131 bits of entropy:

       $ perl -E 'say 22 * log(62)/log(2)'
       130.992318828511

   An interesting observation is that in base-64 representation, 128-bit
   minimum tokens also require 22 characters, and these tokens contain
   only about 1 more bit of entropy (132 bits).

   Another Session::Token design criterion is that all tokens should be the
   same length. The default token length is 22 characters and the tokens
   are always exactly 22 characters (no more, no less). This is nice
   because it makes writing matching regular expressions easier, simplifies
   storage (you never have to store length), and causes various log files
   and things to line up neatly on your screen. Instead of tokens that are
   exactly "N" characters, some libraries that use arbitrary precision
   arithmetic end up creating tokens of *at most* "N" characters.

   In summary, the default token length of exactly 22 characters is a
   consequence of other decisions: base-62 representation, 128 bit minimum
   token entropy, and consistent token length.
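
   The relationship between minimum entropy, alphabet size, and token
   length can be sketched as follows. "min_token_length" is a hypothetical
   helper written for this illustration; it is not claimed to be how the
   module computes lengths internally.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Smallest number of characters whose total entropy is at least
# $min_entropy_bits, given $alphabet_size equally likely characters.
# Uses floating point logs, so treat exact power-of-two edge cases with care.
sub min_token_length {
    my ($min_entropy_bits, $alphabet_size) = @_;
    die "alphabet needs at least 2 characters\n" if $alphabet_size < 2;
    my $bits_per_char = log($alphabet_size) / log(2);
    return ceil($min_entropy_bits / $bits_per_char);
}

printf "%d\n", min_token_length(24, 62);     # 5
printf "%d\n", min_token_length(128, 62);    # 22
printf "%d\n", min_token_length(128, 64);    # 22
```

   The three results match the figures worked out above: 5 characters for
   24 bits, and 22 characters for 128 bits in both base-62 and base-64.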

MOD BIAS
   Some token generation libraries that implement custom alphabets generate
   a random value, compute its modulus over the size of an alphabet, and
   then use this modulus to index into the alphabet to determine an output
   character.

   Why is this bad? Consider the alphabet "abc". The ideal output
   probability distribution for each character in the token is:

       P(a) = 1/3
       P(b) = 1/3
       P(c) = 1/3

   Assume we have a uniform random number source that generates values in
   the set "[0,1,2,3]" (most PRNGs provide sequences of bits, in other
   words power-of-2 size sets). If we use the naïve modulus algorithm
   described above, 0 maps to "a", 1 maps to "b", 2 maps to "c", and 3
   *also* maps to "a". Instead of the even distribution above, we have the
   following biased distribution:

       P(a) = 2/4 = 1/2
       P(b) = 1/4
       P(c) = 1/4

   Bias like this is bad because some tokens are more likely than others,
   which gives an attacker a starting point when guessing tokens. Unbiased
   tokens are all equally likely, so there is no such starting point.
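
   The biased distribution can be reproduced by counting exhaustively. In
   this sketch "mod_distribution" is a hypothetical helper that walks over
   every value a source can produce and applies the naive modulus
   algorithm.

```perl
use strict;
use warnings;

# Count how often each alphabet character is selected when every value the
# source can produce (0 .. $source_size - 1) is reduced with the modulus.
sub mod_distribution {
    my ($alphabet, $source_size) = @_;
    my @chars = split //, $alphabet;
    my %count = map { $_ => 0 } @chars;
    $count{ $chars[ $_ % @chars ] }++ for 0 .. $source_size - 1;
    return \%count;
}

my $count = mod_distribution('abc', 4);
print "$_: $count->{$_}/4\n" for sort keys %$count;
```

   This prints 2/4 for "a" and 1/4 for "b" and "c". The same function
   applied to a 256-value byte source and a 62 character alphabet shows 8
   characters selected 5 times and the other 54 selected 4 times.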

   Session::Token provides unbiased tokens regardless of the size of your
   alphabet (though see the next section for a mis-use warning). It does
   this in the same way that you might simulate producing unbiased random
   numbers from 1 to 5 given an unbiased 6-sided die: Re-roll every time a
   6 comes up.

   In the above example, Session::Token eliminates bias by only using
   values of 0, 1, and 2 (the "t/no-mod-bias.t" test contains some more
   notes on this topic).
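
   Here is a minimal sketch of the re-roll idea for byte-sized random
   values. It is an illustration and not a copy of the module's C code:
   "unbiased_char" and the "$next_byte" code ref are hypothetical, and
   masking each byte down to the nearest power-of-two range is an
   assumption about one reasonable way to do it.

```perl
use strict;
use warnings;

# Pick one character from @$alphabet using bytes supplied by $next_byte, a
# caller-provided code ref returning integers in 0..255 (a stand-in for
# the PRNG). Each byte is masked to the smallest power-of-two range that
# covers the alphabet; values past the end of the alphabet are re-rolled.
sub unbiased_char {
    my ($alphabet, $next_byte) = @_;
    my $size = @$alphabet;
    die "alphabet must have 1 to 256 entries\n" if $size < 1 || $size > 256;

    my $mask = 1;
    $mask <<= 1 while $mask < $size;
    $mask -= 1;                        # e.g. size 3 gives mask 0b11

    while (1) {
        my $value = $next_byte->() & $mask;
        return $alphabet->[$value] if $value < $size;
    }
}

my @bytes = (3, 7, 2, 0, 1);           # deterministic stand-in for random bytes
my $next  = sub { shift @bytes };
print unbiased_char([qw(a b c)], $next), "\n";   # 3 and 7 are re-rolled; 2 selects "c"
```

   With the alphabet "abc" the masked values are 0 to 3, and 3 is the
   value that gets thrown away, exactly as in the example above.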

   Of course throwing away a portion of random data is slightly
   inefficient. In the worst case scenario of an alphabet with 129
   characters, for each output byte this module consumes on average 1.9845
   bytes from the random number generator. This inefficiency isn't a
   problem because ISAAC is quite fast.
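
   The 1.9845 figure can be checked with a short calculation. This sketch
   assumes the re-roll scheme masks each byte to the smallest power of two
   that covers the alphabet, so a byte is accepted with probability
   size/range and the expected number of bytes per character is
   range/size. "expected_bytes_per_char" is a hypothetical helper.

```perl
use strict;
use warnings;

# Expected number of source bytes consumed per output character under the
# masking assumption described above.
sub expected_bytes_per_char {
    my ($size) = @_;
    my $range = 1;
    $range <<= 1 while $range < $size;
    return $range / $size;
}

# Find the alphabet size with the highest expected cost.
my ($worst) = sort { expected_bytes_per_char($b) <=> expected_bytes_per_char($a) } 1 .. 256;
printf "worst alphabet size: %d (%.4f bytes per character)\n",
    $worst, expected_bytes_per_char($worst);
```

   Under that assumption the worst case is 129 characters at 256/129, or
   about 1.9845 bytes per character, while any power-of-two alphabet size
   costs exactly 1 byte per character.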

   Note that mod bias can be made arbitrarily small by increasing the
   amount of data consumed from the random number generator for each
   character (providing that arbitrary precision modulus is available).
   Because this module fundamentally avoids mod bias, it can use each of
   the 4 bytes from an ISAAC-32 word for a separate character (excepting
   "re-rolls").

INTRODUCING BIAS
   If your alphabet contains the same character two or more times, this
   character will be more likely to appear than a character that only
   occurs once. You should make sure that your alphabet contains no
   duplicate characters if you are trying to create random session tokens.

   However, if you wish to introduce bias this library doesn't try to stop
   you. (Maybe it should issue a warning?)

       Session::Token->new(alphabet => '0000001', length => 5000)->get; # don't do this
       ## -> 0000000000010000000110000000000000000000000100...

   Due to a limitation discussed below, alphabets larger than 256 aren't
   currently supported so your bias can't get very granular.

   Aside: If you have a constant-biased output stream like the above
   example produces, then you can re-construct an unbiased bit sequence
   with the von Neumann algorithm. This works by comparing pairs of bits.
   If the pair consists of identical bits, it is discarded. Otherwise the
   order of the different bits is used to determine an output bit, i.e. 00
   and 11 are discarded but 01 and 10 are mapped to output bits of 0 and 1
   respectively. This only works if the bias in each bit is constant (like
   all characters in a Session::Token are).
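
   A small sketch of that pairing rule, operating on a string of "0" and
   "1" characters such as the biased token above. "von_neumann" is a
   hypothetical helper name.

```perl
use strict;
use warnings;

# Von Neumann debiasing: read the input in non-overlapping pairs, drop 00
# and 11, and map 01 to 0 and 10 to 1. A trailing unpaired bit is ignored.
sub von_neumann {
    my ($bits) = @_;
    my $out = '';
    while ($bits =~ /\G([01])([01])/g) {
        next if $1 eq $2;              # identical pair: discard
        $out .= $1;                    # 01 gives 0, 10 gives 1 (the first bit)
    }
    return $out;
}

print von_neumann('0001101101'), "\n";   # pairs 00 01 10 11 01 produce 010
```

   Most pairs from a heavily biased stream are discarded, so the output is
   much shorter than the input.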

ALPHABET SIZE LIMITATION
   Due to a limitation in this module's code, alphabets can't be larger
   than 256 characters. Everywhere the above manual says "characters" it
   actually means bytes. This isn't a Unicode limitation per se, just the
   maximum size of the alphabet. Remember you can easily map bytes to
   characters with tr.

       use utf8;
       $z = Session::Token->new(alphabet => '01', length => 10)->get;
       $z =~ tr/01/-λ/;
       ## -> λλ--λλλλ-λ

   Here's an interesting way to generate a uniform random integer between 0
   and 999 inclusive:

       0 + Session::Token->new(alphabet => ['0'..'9'], length => 3)->get

   If you wanted to natively support high code points, there is no point in
   hard-coding a limitation on the size of Unicode or some arbitrary
   machine word. Instead, arbitrary precision "characters" should be
   supported with bigint. Here's an example of something similar in lisp:
   <isaac.lisp>.

   This module is not however designed to be the ultimate random number
   generator and at this time I think changing the design as described
   above would interfere with its goal of being secure, efficient, and
   simple.

SEEDING
   This module is designed to always seed itself from your kernel's secure
   random number source. You should never need to seed it yourself.

   However if you know what you're doing, you can pass in a custom seed as
   a 1024 byte long string. For example, here is how to create a "null
   seeded" generator:

       my $gen = Session::Token->new(seed => "\x00" x 1024);

   This is done in the test-suite to compare against Jenkins' reference
   ISAAC output, but obviously don't do this in regular applications
   because the generated tokens will always be the same.

   One valid reason for seeding is if you have some reason to believe that
   there isn't enough entropy in your kernel's randomness pool and
   therefore you don't trust "/dev/urandom". In this case you should
   acquire your own seed data from somewhere trustworthy (maybe
   "/dev/random" or a previously stored trusted seed).

BUGS
   It might be a good idea if this library could detect forks and re-seed
   in the child process.
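
   Until something like that exists, an application can approximate it
   itself. The following is only a sketch of one possible workaround:
   "fork_safe" is a hypothetical wrapper, it compares process IDs and
   nothing more, and it does nothing for threads.

```perl
use strict;
use warnings;

# Wrap a generator factory so that the generator is re-created whenever
# the current process ID differs from the one that created it.
sub fork_safe {
    my ($factory) = @_;
    my ($generator, $owner_pid);
    return sub {
        if (!defined $owner_pid || $owner_pid != $$) {
            $generator = $factory->();   # first use, or first use after a fork
            $owner_pid = $$;
        }
        return $generator;
    };
}

# Example use (not executed here):
#   my $generator = fork_safe(sub { Session::Token->new });
#   my $token     = $generator->()->get;
```

   Each process then ends up with its own freshly seeded generator the
   first time it asks for one, at the cost of one PID comparison per call.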

   There is currently no way to extract the seed from a Session::Token
   object. Note when implementing this: The saved seed must either store
   the current state of the ISAAC round as well as the 1024 byte "randsl"
   array or else do some kind of minimum fast forwarding in order to
   protect against a partially duplicated keystream bug.

SEE ALSO
   <The Session::Token github repo>

   There are lots of different modules for generating random data.

   Like this module, perl's "rand()" function implements a user-space PRNG
   seeded from "/dev/urandom". However, perl "rand()" is seeded with a mere
   4 bytes from "/dev/urandom" and the perldoc doesn't seem to specify a
   PRNG algorithm.

   Data::Token is the first thing I saw when I looked around on CPAN. It
   has an inflexible and unspecified (?) alphabet. It tries to get its
   source of unpredictability from UUIDs and then hashes these UUIDs with
   SHA1. I think this is bad design because some standard UUID formats
   aren't designed to be unpredictable at all. Knowing a target's MAC
   address and the rough time the token was issued may help you predict a
   reduced area of token-space to concentrate guessing attacks upon. I
   don't know if Data::Token uses these types of UUIDs or the potentially
   secure "version 4" UUIDs, but because this wasn't addressed in the
   documentation and because of an apparent misapplication of hash
   functions (if you really had a good random UUID type, there would be no
   need to hash), I don't feel good about using this module.

   There are several decent random number generators like
   Math::Random::Secure, Crypt::URandom &c, but they usually don't
   implement alphabets and some of them require you to open "/dev/urandom"
   for every chunk of random bytes. Note that Math::Random::Secure does
   prevent mod bias in its random integers and could be used to implement
   alphabets.

   String::Random is a cool module with a neat regexp-like language for
   specifying random tokens which is more flexible than alphabets. However,
   inspecting the code indicates that it uses perl's "rand()". Also, the
   lack of performance, bias, and security discussion in the docs made me
   decide to not use this otherwise very interesting module.

   String::Urandom has alphabets, but it uses the flawed mod algorithm
   described above and opens "/dev/urandom" on every token.

   Data::Random is also a pretty nice looking library but it seems to use
   "rand()" and the docs don't discuss security.

AUTHOR
   Doug Hoyte, "<[email protected]>"

COPYRIGHT & LICENSE
   Copyright 2012 Doug Hoyte.

   This module is licensed under the same terms as perl itself.

   ISAAC code:

       By Bob Jenkins.  My random number generator, ISAAC.  Public Domain