Asri-unix.460
net.chess
utzoo!decvax!ucbvax!menlo70!sri-unix!YODER@USC-ECLB
Thu Jan  7 22:53:28 1982
storing positions
       Something to think about if your concern is saving space in data
files is to use symmetries.  These are least useful in the opening and
become more important as time progresses.  What is noteworthy is that
while the hacking to save the e.p. and castling bits reduces space usage
by about 5.5%, using symmetries will get you a factor of 2, 4, or 16
depending on the situation.  There is only one symmetry which is always
valid:  one can reflect the board so 1st and 8th ranks exchange and reverse
the color of all the pieces.  Once castling is no longer possible, one can
add the reflection about a perpendicular axis (exchanging the rook files).
Finally, when there are no pawns on the board, one has the full dihedral
group of symmetries on the square board in addition to the reflection with
color change (so the full group has 16 elements).
       It should be noted that for openings the alleged factor of 2 is
largely useless, since the extra positions being stored don't generally
come up (for example the reflection of the opening position is the same
position, but with black to move).
       For endings, however, it clearly wins to use this method:  if
you have one pawn on the board, you need only know how to play with the
pawn on the a, b, c, or d files to know the ending.  You need not separately
remember how to play on the e, f, g, or h files (or, for that matter, how
to play if the pawn is a different color).
       Also note that the Huffman scheme is less efficient than simply
storing n squares if the number of pieces, n, is 12 or less -- a board with
only two kings takes 72 bits to encode, whereas 6*n is at most 72.
       Finally it should be observed that it may not be necessary to
use any bits at all to encode a position if the encoding is implicit.
This probably sounds like nonsense, but consider an array of 1000 integers.
Where are the bits which encode the numbers 1 through 1000 which are the
indices into this array?  The same principle can be applied here, in fact
I did use it a few years back when I was working on a program to play pawn
endings.  For a fixed pawn structure (only kings and pawns on the board),
a "set of positions" was represented as an array indexed by king one's
position of sets of king two's position.  (This was Pascal -- In Ada you
would want a two-dimensional array of booleans whose indices are the two
king positions.)  The desired information can be stored as a small number
of such sets:  e.g. those where white wins, those where advancing a pawn
wins etc.  This scheme is almost certainly about as compact as you can get
provided of course that you wish to know the data being stored for a very
large number of positions;  a typical application would be a program which
completely solved an endgame.
       Returning to the Huffman scheme:  if you expect the average number
of bits, k, to encode a "typical" position to be considerably less than 156,
one can store them in such a way that the space used per position is about
(k + log2(k*m)) where m is the number of positions stored.  You use two
files:  one is a stream of bits which is the Huffman codes for all your
positions, concatenated.  The other is a hash table (encoded by the hash
function of your choice on positions) whose contents are indices into the
first file giving the location of the first bit, plus whatever data you
wish to associate with the position.  This will, however, probably incur
a cost for wasted slots in the hash table, since it is a good idea to make
it somewhat larger than the number of positions actually present to avoid
collision overhead.
       I would be interested to hear of any other methods people have for
saving space in encoding positions, in particular whether the methods above
have ever been employed in working chess programs.  The question of "how
many bits per position are needed" would seem to be 156 for openings and
0 for endings with sufficiently few pieces.  With more pieces (up to 12),
one can use 6*n bits where n is the number of pieces.  With the middlegame,
one could try the hash scheme mentioned above, but I'm not sure there is a
use for storing lots of middlegame positions.
-------

-----------------------------------------------------------------
gopher://quux.org/ conversion by John Goerzen <[email protected]>
of http://communication.ucsd.edu/A-News/


This Usenet Oldnews Archive
article may be copied and distributed freely, provided:

1. There is no money collected for the text(s) of the articles.

2. The following notice remains appended to each copy:

The Usenet Oldnews Archive: Compilation Copyright (C) 1981, 1996
Bruce Jones, Henry Spencer, David Wiseman.