% AroundTheBend.tex concatenation of Around The Bend
\begin{filecontents}{bend.ist}
% MakeIndex style file bend.ist for use with AroundTheBend.tex
% @ may be a valid character in the index, use ? instead
actual '?'
\end{filecontents}
%\documentclass[draft,openany]{memoir}
\documentclass[openany]{memoir}
\usepackage{comment}
\usepackage{url}
\ifpdf
\usepackage[pdftex,
plainpages=false,
pdfpagelabels,
bookmarksnumbered
]{hyperref}
\else
\usepackage[%pdf,
plainpages=false,
pdfpagelabels,
bookmarksnumbered
]{hyperref}
\fi
\usepackage{graphicx}
\settrimmedsize{11in}{210mm}{*}% min letterpaper/A4 sizes
\setlength{\trimtop}{0pt}
\setlength{\trimedge}{\stockwidth}
\addtolength{\trimedge}{-\paperwidth}
\settypeblocksize{7.75in}{33pc}{*}
\setulmargins{4cm}{*}{*}
\setlrmargins{1.25in}{*}{*}
\setmarginnotes{17pt}{51pt}{\onelineskip}
\setheadfoot{\onelineskip}{2\onelineskip}
\setheaderspaces{*}{2\onelineskip}{*}
\checkandfixthelayout
%\addtolength{\textwidth}{1in}
%\addtolength{\oddsidemargin}{-0.5in}
%\addtolength{\evensidemargin}{-0.5in}
\newcommand{\ed}[1]{\emph{(Ed: #1)}}
\newcommand*{\oposted}[1]{Originally posted on #1}
\newcommand*{\arch}[1]{Archived as {\normalfont \ttfamily #1}}
\newenvironment{solution}[1]{%
\begin{description}
\item[#1]\mbox{}}%
% {\par\noindent\textbf{End solution}\end{description}}
{\end{description}\vspace{-0.5\onelineskip}\textbf{End solution}}
\newcommand*{\pfile}[1]{\texttt{#1}}% print a file name
\newfixedcaption{\freetabcaption}{table}
\renewcommand*{\chaptername}{QA}
\renewcommand*{\chaptername}{}
% \piif{if...} print and index \if...
\newcommand*{\piif}[1]{\cs{#1}\index{#1?\cs{#1}}}
\makeatletter
\newcommand*{\zeroseps}{%
\topsep\z@
\partopsep\z@
\parskip\z@}
\newlength{\gparindent} \gparindent 0.5\parindent
\newenvironment{lcode}{\zeroseps
\renewcommand{\verbatim@startline}%
{\verbatim@line{\hskip\gparindent}}%
\small\setlength{\baselineskip}{\onelineskip}\verbatim}%
{\endverbatim
\vspace{-\baselineskip}\noindent}
\makeatother
\nouppercaseheads
\headstyles{bringhurst}
%\setlength{\beforechapskip}{2\onelineskip}
\chapterstyle{section}
\setlength{\beforechapskip}{2\onelineskip}
\setlength{\beforechapskip}{0pt}
\setlength{\afterchapskip}{1\onelineskip}
\settocdepth{subsubsection}
\setsecnumdepth{subsubsection}
\makeindex
%\title{Around The Bend}
%\author{Michael Downes \\
%(edited by Peter Wilson)}
%\date{}
\newlength{\drop}
\providecommand*{\wb}[2]{\fontsize{#1}{#2}\usefont{U}{webo}{xl}{n}}
\newcommand*{\titleAB}{\begingroup
\drop=4\baselineskip
\centering
\vspace*{\drop}
{\Huge AROUND THE BEND}\\[\drop]
{\hspace*{1.5em}\scalebox{8}[1]{{\wb{10}{12}4}}}\\[\drop]
{\Large\itshape A Collection of TeX Challenges by}\\[\baselineskip]
{\Large MICHAEL DOWNES}\\[\baselineskip]
{\wb{10}{12}4}\\[\baselineskip]
{\Large\itshape edited by}\\[\baselineskip]
{\Large Peter Wilson}\par
\vfill
{\hspace*{1.5em}\scalebox{8}[1]{{\wb{10}{12}4}}}\\[\drop]
{\large The Herries Press}\\
{July 2008}\par
\vspace*{\drop}
\endgroup}
%% normally \parindent = 1.5em, but 0pt in \titleAB
\begin{document}
\tightlists
\raggedbottom
\frontmatter
%\maketitle
\thispagestyle{empty}
\titleAB
\cleardoublepage
\tableofcontents
\chapter{Preface}
In the early 1990s the late and much missed Michael Downes (1958--2003)
ran a column on the INFO-TeX mailing list
called \emph{Around The Bend}, in which he proposed macro-related problems and
then posted the solutions that readers submitted.
Although it was archived on CTAN in \url{info/aro-bend},
it is not well known, which is a shame as it provides
answers to many problems that keep cropping up. (The archive is now
at \url{info/challenges/aro-bend}.) This is an attempt to
make his work more accessible by providing the collection as a single
document.

As much as possible, what follows is what Michael wrote; I have tried to
limit myself to marking up the original ASCII text emails, but I have not
repeated administrative elements such as email headers.
In some cases the
original TeX code was replete with comments explaining what was going on.
Where the comments were long with respect to the code I have set them in
the regular body type so as to make the actual code more obvious; this has the
side effect of slightly decreasing the amount of paper required to
print the document. If you
want to use the code solutions I suggest that you cut and paste them
from the original archived versions.

I thought that there were eighteen Around the Bends, as that is all that
are archived on CTAN. However, I searched the Google Groups archive of
\url{comp.text.tex}
and found three more, nos.~19, 20 and~21. I have included what I could find
of these, but answers to no.~19 appear to be missing, which is a pity as
I think that I could have put them to use. Perhaps some of you might be
willing to take up the challenge on this, or on any of the others.

{\raggedleft \textsc{PW}\\ July, 2008 \par}
\chapter{Introduction}
\ed{This is Michael's introduction to his scheme, originally posted on
1991/10/10 as the initial portion of exercise~1.}
%%[Exercises 1,2,3 were originally posted together on 10 Oct 91]
\begin{verbatim}
Date: Thu 10 Oct 91 09:51:32-EST
From: Michael Downes <[email protected]>
Subject: Around the bend
To: [email protected]
\end{verbatim}
Proposal for a regular feature:

AROUND THE BEND

With the encouragement of George Greenwade (the INFO-TeX list owner), I
would like to propose a regular department for INFO-TeX, called `Around
the bend'. It will consist of macro-writing challenges on the level of
the dangerous-bend exercises in the \emph{TeXbook}, with interested parties
invited to collaborate and/or compete to find the best solution. My
motivation for doing this is partly selfish: to get more feedback from
other macro writers about some of the interesting macro-writing
problems that I run into.

I originally approached George for advice about setting up a separate
mailing list, but he thought that INFO-TeX and comp.text.tex readers
would be interested. Since INFO-TeX mail is also channeled to
comp.text.tex, readers of the latter should let me know if they don't
want the extra traffic (although I don't expect it to be that much). I
don't currently have access to read comp.text.tex directly, although
George has been investigating the possibility of piping it through the
INFO-TeX mailing list. So if you object by posting to comp.text.tex, I
may not see your objection; send me mail, instead.

The sample below should give a pretty good idea of what `Around the
bend' would be like. Solutions should be sent to me instead of to
INFO-TeX or comp.text.tex, on the premise that people usually won't want
to read others' solutions until they've had a chance to try their own
hand. A summary of the results would then be posted to the INFO-TeX
list after two or three weeks; to those who submit solutions before the
deadline, I could forward without delay solutions submitted by other
people, for comparison.

I will try to keep the difficulty of the exercises down to something
reasonable, let's say, on the level of a homework assignment which a
university student must complete in two weeks, finding time in the
normal way from the usual busy schedule of other homework, class
attendance, sports, and social life. However, be warned that the
challenges will be hard. I'm planning to follow a `hard and fast'
format: one or two hard questions, followed by one or two fast
questions, where if you don't know the answer off the top of your head,
you can either look it up in the \emph{TeXbook} or find it by running a quick
test.
\mainmatter
\chapter{Expansion}
\section{Exercise (hard)}
%%\input{ex001.tex}
% ex001.tex
\begin{comment}
(Originally posted on 1991/10/10)
[Exercises 1,2,3 were originally posted together on 10 Oct 91]
All right, here are the first three.
\end{comment}
%**********************************************************************
%*** Exercise 1 (hard):
\ed{\oposted{1991/10/10}. \arch{exercise.001}.}\\%[0.5\baselineskip]
Given arbitrary \cmd{\b}, \cmd{\c}, \cmd{\d} (macros without arguments), for example
\begin{lcode}
\def\b{\c\c} \def\c{*} \def\d{\b\c}
\end{lcode}
figure out how to define \cmd{\a} so that its replacement text consists
of \cmd{\b} fully expanded plus \cmd{\c} not expanded plus \cmd{\d} expanded
exactly once.
I.e., with the above definitions the replacement text of \cmd{\a}
should be
\begin{lcode}
**\c\b\c
\end{lcode}
You may not use \cmd{\the} or \cmd{\noexpand} in your solution. This is Exercise
20.16 in the \emph{TeXbook}, except that there's an added restriction: your
answer must also not use the \cmd{\halign}\texttt{\ldots}\cmd{\span} method given in the
answer to 20.16. (Yes, that means you can't use \cmd{\valign} either!)

Why would anyone want to do such a hard exercise? Answer: advanced
macro writing requires a thorough knowledge of expansion control
principles.
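
\ed{The kinds of expansion asked for can be told apart with
\cmd{\meaning}. The following is only a sketch of the distinction between
full and one-level expansion, not an answer to the exercise; the names
\cmd{\full} and \cmd{\once} are hypothetical and are not part of the
original problem.}
\begin{lcode}
\def\b{\c\c} \def\c{*} \def\d{\b\c}
% Full expansion: \edef expands until only unexpandable tokens remain.
\edef\full{\d}
\message{\meaning\full}% macro:->***
% One level only: each \expandafter steps over one token, so \d is
% expanded exactly once before \def reads its replacement text.
\expandafter\def\expandafter\once\expandafter{\d}
\message{\meaning\once}% macro:->\b \c
\end{lcode}
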
\begin{comment}
[Exercise 2 moved to exercise.002]
[Exercise 3 moved to exercise.003]
Send answers to:
Michael Downes
[email protected] (Internet)
A summary will be posted Friday, October 25, 1991.
\end{comment}
%%\endinput
\section{Answers}
%%\input{ans001.tex}
% ans001.tex
\ed{\oposted{1991/10/25}. \arch{answer.001}.}\\
\begin{comment}
[Solutions for exercises 1,2,3 were originally posted together on 25 Oct 91]
Date: Fri 25 Oct 91 15:19:44-EST
From: Michael Downes <[email protected]>
Subject: `Around the bend' #1 solutions
To: [email protected]
Solutions to the exercises of `Around the bend' #1.
"*** Exercise 1 (hard):
"Given arbitrary \b, \c, \d (macros without arguments), for example
"
" \def\b{\c\c} \def\c{*} \def\d{\b\c}
"
"figure out how to define \a so that its replacement text consists
"of \b fully expanded plus \c not expanded plus \d expanded exactly once.
"I.e., with the above definitions the replacement text of \a
"should be
"
" **\c\b\c
"
"You may not use \the or \noexpand in your solution. This is Exercise
"20.16 in the TeXbook, except that there's an added restriction: your
"answer must also not use the \halign ... \span method given in the
"answer to 20.16. (Yes, that means you can't use \valign either!)
\end{comment}
The restrictions leave us with (essentially) three expansion-control
commands: \\
\cmd{\expandafter}, \cmd{\edef} and \cmd{\def}.
%\begin{description}
%\item[Solution 1 {[Peter Schmitt]}] \mbox{}
\begin{solution}{Solution 1 (Peter Schmitt)}\index{Schmitt, Peter}
\begin{lcode}
\edef\B{\b}
\def\defA#1{\def\defa##1##2{\def\a{#1##2##1}}}
\expandafter\defA\expandafter{\B}
\expandafter\defa\expandafter{\d}{\c}
\end{lcode}
\end{solution}
%%>>EndSolution
%\item[Solution 2 {[Donald Arseneau]}] \mbox{}
\begin{solution}{Solution 2 (Donald Arseneau)}\index{Arseneau, Donald}
\begin{lcode}
\edef\e{\b}
\expandafter \expandafter \expandafter \def\expandafter \expandafter
\expandafter \a\expandafter \expandafter \expandafter {\expandafter
\e\expandafter \c\d}
\end{lcode}
\end{solution}
%%>>EndSolution
%\item[Solution 3 {[mine]}] \mbox{}
\begin{solution}{Solution 3 (mine)}\index{Downes, Michael}
\begin{lcode}
\edef\a{\b}
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\a
\expandafter\expandafter\expandafter{\expandafter\a\expandafter\c\d}
\end{lcode}
\end{solution}
%%>>EndSolution
%\end{description}
My solution differed from Arseneau's only in using \cmd{\a} rather than \cmd{\e}
in the first step.
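
\ed{It may help to follow Solution~3 pass by pass. The trace below is a
reader's sketch, assuming the example definitions of \cmd{\b}, \cmd{\c}
and \cmd{\d} from the exercise; the final \cmd{\message} line is an added
check and is not part of the submitted solution.}
\begin{lcode}
\def\b{\c\c} \def\c{*} \def\d{\b\c}
\edef\a{\b}% \a is now **
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\a
\expandafter\expandafter\expandafter{\expandafter\a\expandafter\c\d}
% Pass 1: the 1st and 3rd \expandafter of each triple, together with
% the last two \expandafter's, form a chain that reaches \d and
% expands it once, leaving
%   \expandafter\def\expandafter\a\expandafter{\a\c\b\c}
% Pass 2: the remaining chain reaches the \a inside the braces and
% expands it to **, leaving
%   \def\a{**\c\b\c}
\message{\meaning\a}% macro:->**\c \b \c
\end{lcode}
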
\begin{comment}
[Solution for exercise 2 moved to answer.002]
[Solution for exercise 3 moved to answer.003]
Michael Downes
[email protected] (Internet)
\end{comment}
%%\endinput
\chapter{Empty argument}
\section{Exercise (hard)}
%%\input{ex002.tex}
% ex002.tex
\begin{comment}
[Posted to info-tex on 10 Oct 91; see exercise.001]
**********************************************************************
*** Exercise 2 (hard):
\end{comment}
\ed{\oposted{1991/10/10}. \arch{exercise.002}.}\\
Define an `ifempty' macro that takes one argument and resolves
essentially to \piif{iftrue} if the argument is empty, and \piif{iffalse}
otherwise. This is useful for handling arguments given by
users to commands defined in a macro package.

Plain TeX or LaTeX-style solutions are both acceptable, that
is,
\begin{lcode}
\ifempty{...}TRUE CASE\else FALSE CASE\fi
\end{lcode}
or
\begin{lcode}
\ifempty{...}{TRUE CASE}{FALSE CASE}
\end{lcode}
(In the former case you will need to do something to avoid problems
in the situation
\begin{lcode}
\iffalse ... \ifempty{...} ... \fi ... \fi
\end{lcode}
There are several ways of doing this, so I will refrain from
indicating any particular one.)

Use the following test suite to verify the robustness of your
solution:
\begin{lcode}
\long\def\test#1{\begingroup \toks0{[#1]}%
\newlinechar`\/\message{/\the\toks0:
% LaTeX-style solution; modify the following line according
% to the syntax of your solution.
\ifempty{#1}{EMPTY}{NOT empty}%
}\endgroup}
\test{} \test{ }
\test{aabc} \test{-}
\test{$} \test{\empty}
\test{\endinput} \test{\iftrue a\else b\fi}
\test{\else} \test{#}
\test{\par} \halign{#\cr\test{&}\cr}
\test{\relax} \test{\relax\relax\relax}
\expandafter\iffalse\test{x}\fi \test{{}}
\end{lcode}
%$
The two tests on the first line should produce a message `EMPTY' and
the remaining ones, `NOT empty'. The reason for saying that the second
test should return `EMPTY' is that (1) this is the ideal behavior for
the applications I've encountered so far; (2) at least one other person
working independently arrived before me at a solution essentially
identical to mine, including this behavior. The details and credit to
the other guy will be given at solution time.
%%\endinput
\section{Answers}
%%\input{ans002.tex}
% ans002.tex
\begin{comment}
[Posted to info-tex on 25 Oct 91; see answer.001]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
"*** Exercise 2 (hard):
"Define an "ifempty" macro that takes one argument and resolves
"essentially to \iftrue if the argument is empty, and \iffalse
"otherwise. This is useful for handling arguments given by
"users to commands defined in a macro package such as LaTeX.
"
"Plain TeX or LaTeX-style solutions are both acceptable, that
"is,
"
" \ifempty{...}TRUE CASE\else FALSE CASE\fi
"
"or
"
" \ifempty{...}{TRUE CASE}{FALSE CASE}
\end{comment}
\ed{\oposted{1991/10/25}. \arch{answer.002}.}\\
The LaTeX-style solution that I had prepared was, I thought, pretty
good, but Donald Arseneau\index{Arseneau, Donald}
observed that it fails the test
\begin{lcode}
\test{{\iftrue a\else b\fi}}
\end{lcode}
which was not in my list of tests.
%\begin{description}
%\item[Solution 1 {[mine]}] \mbox{}
\begin{solution}{Solution 1 (mine)}\index{Downes, Michael}
\begin{lcode}
\catcode`\@=11
% \@car is actually already defined in latex.tex, but for
% maximum robustness it needs to have the \long prefix:
\long\def\@car#1#2\@nil{#1}
\long\def\@first#1#2{#1}
\long\def\@second#1#2{#2}
\long\def\ifempty#1{\expandafter\ifx\@car#1@\@nil @\@empty
\expandafter\@first\else\expandafter\@second\fi}
\catcode`\@=12
\long\def\test#1{\begingroup \toks0{[#1]}%
\newlinechar`\/\message{/\the\toks0:
\ifempty{#1}{EMPTY}{NOT empty}%
}\endgroup}
\end{lcode}
\end{solution}
%%>>EndSolution
The advantage of using the auxiliary macros \cmd{\@first} and \cmd{\@second},
together with the \cmd{\expandafter}'s, is that it allows the true and/or
false cases to end with arbitrary things, even macros that require
arguments that have not yet been read (any number of arguments, even
delimited arguments).
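
\ed{To see why this matters, consider a caller whose branches must pick up
a later argument. In the sketch below \cmd{\styled}, \cmd{\plainname} and
\cmd{\boldname} are hypothetical names chosen for illustration, and the
definitions of Solution~1 are assumed to be in force.}
\begin{lcode}
\def\plainname#1{#1}
\def\boldname#1{{\bf #1}}
% If #1 is empty the name #2 is set plain, otherwise bold.  The
% selected branch is a macro that still needs an argument; it finds
% {#2} because \ifempty has already disposed of \else and \fi
% before the branch is expanded.
\def\styled#1#2{\ifempty{#1}{\plainname}{\boldname}{#2}}
\styled{}{Downes}%   -> \plainname{Downes}
\styled{x}{Downes}%  -> \boldname{Downes}
\end{lcode}
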
From here it is easy to implement an \piif{ifnotempty} test that has a
null false case. This is often useful in dealing with user-supplied
arguments: `If \#1 is empty, do nothing; otherwise, do the following
with \#1: ...'
\begin{lcode}
\long\def\ifnotempty#1{\ifempty{#1}{}}
\end{lcode}
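
\ed{A typical use might look as follows; \cmd{\subtitle} is a hypothetical
user-level command, not one from the original post.}
\begin{lcode}
% Typeset a subtitle line only when one was actually supplied.
\def\subtitle#1{\ifnotempty{#1}{\par{\it #1}\par}}
\subtitle{}%               typesets nothing
\subtitle{ }%              typesets nothing (a blank counts as empty)
\subtitle{A second look}%  typesets the text in italics
\end{lcode}
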
%\item[Solution 2 {[Donald Arseneau]}]
\begin{solution}{Solution 2 (Donald Arseneau)}\index{Arseneau, Donald}
Don Arseneau came up with a plain TeX style solution, using an
ingenious device with \cmd{\then} to pass the test case
\begin{lcode}
\expandafter\iffalse\test{x}\fi
\end{lcode}
The comments in the solution are his.
\begin{lcode}
% \ifblank{...}\then  Test if a parameter is blank (null or spaces).
% Use the inaccessible "letter" @ to separate parameters. The two cases are:
%     _text_is_not_blank_                _text_is_blank_
%     #1<- whatever                      #1<-@
%     #2<- whatever (possibly null)      #2<-
%     #3<- @                             #3<-.
%     #4<- ..                            #4<-.
%     \if @.. {false}                    \if .. {true}
% In the {false} case, the extra period is skipped so it doesn't hurt.
\catcode`\@=11 % as in plain.tex
\let\then\iftrue
\long\def\ifblank#1\then{\Ifbl@nk#1@@..\then}%
\long\def\Ifbl@nk#1#2@#3#4\then{\if#3#4}
\catcode`\@=12
\long\def\test#1{\begingroup \toks0{[#1]}%
\newlinechar`\/\message{/\the\toks0:
\ifblank{#1}\then EMPTY\else NOT empty\fi%
}\endgroup}
\end{lcode}
\end{solution}
%%>>EndSolution
The good thing about this solution is that it doesn't subject any part
of the user-supplied argument to the \piif{ifx} test. Using @ with category
code of 11 as a delimiter for the user-supplied text is extremely safe
because even in internal code @ doesn't appear by itself, only as part
of control sequence names. In a partial solution,
Peter Schmitt\index{Schmitt, Peter} pushed
the idea a little further by using space with category code 3 as the
delimiter.
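
\ed{The role of \cmd{\then} in Solution~2 is easiest to see when the test
sits inside skipped text. The fragment below is an explanatory sketch,
assuming the definitions of Solution~2; it is not taken from the original
post.}
\begin{lcode}
% While skipping the false branch of \iffalse, TeX does not expand
% macros; it only counts tokens that currently mean \if..., \else
% or \fi.  \ifblank is a macro, so it is not counted, but \then has
% been \let equal to \iftrue and is.  The inner \else and \fi are
% therefore paired with \then, and the outer \fi closes \iffalse.
\iffalse
  \ifblank{x}\then EMPTY\else NOT empty\fi
\fi
% Without \then the inner \else would be taken as the \else of
% \iffalse, and the outer \fi would be left over ("Extra \fi").
\end{lcode}
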
There is another way of handling the problematic \piif{iffalse} test, in a
plain-TeX style solution, by using a suggestion of Donald Knuth that
appeared in TeXhax a while ago, in reply to a query of Stephan von
Bechtolsheim (texhax89, \#38 (post from svb, 17 Apr 89)).
%\item[Solution 3 {[Arseneau/Knuth]}] \mbox{}
\begin{solution}{Solution 3 (Arseneau/Knuth)}\index{Arseneau, Donald}\index{Knuth, Donald}
\begin{lcode}
% Usage: \if\blank{#1}...\else...\fi
\catcode`\@=11 % as in plain.tex
\long\def\blank#1{\bl@nk#1@@..\bl@nk}%
\long\def\bl@nk#1#2@#3#4\bl@nk{#3#4}
\catcode`\@=12
\long\def\test#1{\begingroup \toks0{[#1]}%
\newlinechar`\/\message{/\the\toks0:
\if\blank{#1}EMPTY\else NOT empty\fi%
}\endgroup}
\end{lcode}
\end{solution}
%>>EndSolution
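
\ed{A usage sketch for this form of the test, assuming the definitions of
Solution~3; \cmd{\greet} is a hypothetical macro introduced only for
illustration.}
\begin{lcode}
% \if expands \blank{#1} and compares the first two tokens of the
% result: `..' (equal) for a blank argument, `@.' (unequal) otherwise.
\def\greet#1{Hello\if\blank{#1}\else, #1\fi!}
\greet{}%       Hello!
\greet{ }%      Hello!
\greet{World}%  Hello, World!
\end{lcode}
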
At the end of Exercise 2 I wrote:
\begin{quote}
The two tests on the first line should produce a message `EMPTY' and
the remaining ones, `NOT empty'. The reason for saying that the second
test should return `EMPTY' is that (1) this is the ideal behavior for
the applications I've encountered so far; (2) at least one other person
working independently arrived before me at a solution essentially
identical to mine, including this behavior. The details and credit to
the other guy will be given at solution time.
\end{quote}
The name of the `other guy' is Michael Wester\index{Wester, Michael};
a listing of his macros
was published in the preprints for the July 1991 TUG meeting in Dedham,
Massachusetts (`Form Letter in LaTeX with 3-across Mailing Labels
Capability', joint paper with Jackie Damrau). On rereading the preprint
recently, I find that the presentation differs more from
Exercise 2 and its solutions than I had previously imagined, but the
essential ideas are there. See \cmd{\wcar}, \cmd{\wcdr} and related macros.

By the way, if anyone came up with a fully expandable test (suitable
for use inside a \cmd{\message}) for which \verb?\test{ }? came up
false instead of
true, I would be interested to hear about it. I didn't mean to
eliminate that possibility in my original statement of the problem.
%%\endinput
\chapter{Discretionary}
\section{Exercise (fast)}
%%\input{ex003.tex}
% ex003.tex
\ed{\oposted{1991/10/10}. \arch{exercise.003}.}\\
\begin{comment}
[Posted to info-tex on 10 Oct 91; see exercise.001]
**********************************************************************
*** Exercise 3 (fast):
\end{comment}
What's the most important difference between \cs{-} and
\begin{lcode}
\discretionary{-}{}{} ?
\end{lcode}
%%**********************************************************************
%%\endinput
\section{Answers}
%%\input{ans003.tex}
% ans003.tex
\ed{\oposted{1991/10/25}. \arch{answer.003}.}\\
\begin{comment}
[Posted to info-tex on 25 Oct 91; see answer.001]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
"*** Exercise 3 (fast):
"What's the most important difference between \- and
"\discretionary{-}{}{}?
\end{comment}
The most important difference between \cs{-} and \cmd{\discretionary}\verb?{-}{}{}?
is that the latter always puts in the character from font position 45
(\verb?"2D?, \verb?'55?) of the current font when a word must be broken at the end of
a line; \cs{-} puts in the character from font position \cmd{\hyphenchar} of the
current font, which is NOT NECESSARILY position 45. It would be rather
unusual for \cmd{\hyphenchar} to be something other than 45; in certain
special applications, however (possibly in some foreign languages as
well?) a variant value of \cmd{\hyphenchar} can be useful. I have an idea for
using this in a future exercise\ldots
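
\ed{The difference can be made visible by giving \cmd{\hyphenchar} an
unusual value. The following plain TeX fragment is a sketch only; the
choice of \texttt{+} and the narrow \cmd{\hsize} are assumptions made
to force line breaks at the discretionaries.}
\begin{lcode}
\hsize=1cm \parindent=0pt \tolerance=10000 \hbadness=10000
\hyphenchar\tenrm=`\+ % assignments to \hyphenchar are always global
% A break taken at \- shows `+', the current \hyphenchar ...
man\-u\-script\par
% ... while a break at the explicit discretionary shows `-'.
man\discretionary{-}{}{}u\discretionary{-}{}{}script\par
\hyphenchar\tenrm=`\- % restore the usual value
\bye
\end{lcode}
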
Credit to Donald Arseneau\index{Arseneau, Donald} for a correct answer.
Thanks to Peter Schmitt\index{Schmitt, Peter}
for providing the perfect opening for another point I wanted to make:
\begin{quotation}
The \emph{TeXbook} states explicitly: \\
\cs{-} is equivalent to \verb?\discretionary{-}{}{}? \\
and both are internal.

I do not see what the question is aiming at:
\begin{itemize}
\item control symbol : control sequence
\item no parameters : three parameters
\item two characters : 21 characters to type
\item ???
\end{itemize}
\end{quotation}
Schmitt is quoting from the last page of Chapter 25; the point is
that in newer versions of the \emph{TeXbook} that sentence has been revised.
I'm not sure what the latest printing says, since I don't have a copy,
but I think it simply refers the reader to Appendix H, where the
significance of \cmd{\hyphenchar} is explained. \cmd{\hyphenchar} is a feature that
was added late in the development of TeX82 (\pfile{TeX82.bug} reveals that it
was not added until May 25, 1983). Even if the source files for the
\emph{TeXbook} were immediately updated by Knuth at that time, the changes did
not appear in the published version being sold to the general public
until some time later when the first revised edition was published,
which was no earlier than October 1984, the date of the \emph{TeXbook} copy
that I have on hand, and probably later.

The statement of purpose in `Around the bend' \#1 said something
about finding the `best solution', but conspicuously failed to define
what `best' should mean in this context. It was my intention to address
this question in future exercises; for now, let me just say that I
don't intend to arbitrarily rule out of consideration answers such as
Schmitt's `two characters : 21 characters to type', since depending on
how you look at it, it could be argued that this is much more
significant than dumb old \cmd{\hyphenchar} minutiae. I promised that these
exercises would be challenging; that means, among other things, that
they won't always be well-defined, well-bounded, or well-behaved, and
part of the job of finding the `best solution' will be to decide what
parts of the problem need to be specified further, and to examine the
ramifications of alternatives.
%%\endinput
\chapter{What is `best'?}
\section{Exercise (essay)}
%%\input{ex004}
% ex004.tex
\begin{comment}
[Exercises 4,5,6,7 were originally posted together on 4 Nov 91]
Date: Mon 4 Nov 91 16:42:44-EST
From: Michael Downes <[email protected]>
Subject: Around the bend #2
To: [email protected]
\end{comment}
\ed{\oposted{1991/11/04}. \arch{exercise.004}.}\\
The statement of purpose in `Around the bend' \#1 said something about
finding the `best solution', but failed to define what `best' should
mean when comparing pieces of TeX code. I'll start by throwing out
a few ideas.
\begin{description}
\item[Simplicity] A good solution gets hold of the essential idea of the
problem and attacks it directly, rather than beating around the bush
and resorting to separate clauses to handle troublesome subcases.
\item[Economy] If two solutions compare equal in other respects, then the
better solution is the one that uses less of TeX's resources (main
memory, hash table, string pool, and so forth). Therefore I
(immodestly) say that my solution to Exercise 1 was ever so slightly
better than the other two given, because it avoided introducing any
auxiliary macros that were not included in the original statement of
the problem.
\item[Robustness] If a solution only works under limited friendly
circumstances, and otherwise blows up with an error message, that's not
good. My solution to Exercise 2 was flawed in this respect, since D.A.
found a test case that caused it to go wrong.
\end{description}
%%***********************************************************************
%%*** Exercise 4 (essay):
What should `best' mean when comparing solutions to an `Around the
bend' exercise? What qualities of a good solution are most important?
Why? How can they be objectively measured? (Or can they?) On the
negative side, what qualities indicate an inferior solution?
%%***********************************************************************
\begin{comment}
[Exercise 5 moved to exercise.005]
[Exercise 6 moved to exercise.006]
[Exercise 7 moved to exercise.007]
Send answers to:
Michael Downes
[email protected] (Internet)
A summary will be posted Tuesday, December 4, 1991. However, because of
the difficulty of E7, I will probably procrastinate on posting the
solutions for that exercise until the first or second week of December.
\end{comment}
Table of special characters, to verify accurate transmission:
\begin{lcode}
ASCII 33: ! exclamation point        ASCII  60: < left elbow
ASCII 34: " double quote             ASCII  61: = equals sign
ASCII 35: # number/pound sign        ASCII  62: > right elbow
ASCII 36: $ dollar sign              ASCII  63: ? question mark
ASCII 37: % percent sign             ASCII  64: @ at sign
ASCII 38: & ampersand                ASCII  91: [ left square bracket
ASCII 39: ' right quote/apostrophe   ASCII  92: \ backslash
ASCII 40: ( left parenthesis         ASCII  93: ] right square bracket
ASCII 41: ) right parenthesis        ASCII  94: ^ circumflex/hat/caret
ASCII 42: * star/asterisk            ASCII  95: _ underscore
ASCII 45: - hyphen                   ASCII  96: ` left quote
ASCII 47: / slash                    ASCII 123: { left curly brace
ASCII 58: : colon                    ASCII 124: | vert bar
ASCII 59: ; semicolon                ASCII 125: } right curly brace
                                     ASCII 126: ~ tilde
\end{lcode}
%$
%%\endinput
\section{Answers}
%%\input{ans004}
% ans004.tex
\ed{\oposted{1991/12/10}. \arch{answer.004}.}\\
\begin{comment}
[Solutions for exercises 4,5 were originally posted together on 5 Dec 91]
Date: Thu 5 Dec 91 10:26:58-EST
From: Michael Downes <[email protected]>
Subject: `Around the bend' #2 solutions (4,5)
To: [email protected]
Answers to exercises 4 and 5 of `Around the bend' #2. Discussion of E6
will follow in a separate post because it is rather lengthy. Discussion
of E7 will follow in another couple of weeks (I'm going to be on
vacation next week.)
"***********************************************************************
"*** Exercise 4 (essay):
"
"What should `best' mean when comparing solutions to an `Around the
"bend' exercise? What qualities of a good solution are most important?
"Why? How can they be objectively measured? (Or can they?) On the
"negative side, what qualities indicate an inferior solution?
\end{comment}
Peter Schmitt\index{Schmitt, Peter} writes:
\begin{quotation}
What is to be rated as `best' clearly depends on the function used to
measure quality. And therefore the question makes sense only with
respect to some particular rating function. Seemingly nothing is gained
by this statement: Instead of discussing what qualities are required
for a good solution one has to discuss how the rating system should be
defined. But nevertheless this shifted point of view has
an important advantage. It makes clear that there is no unique answer:
Quality is not an absolute notion but a notion relative to some
(agreed) measure. This measure is not independent of the context ---
under different conditions different rating functions may be used.

One further important point must not be forgotten: If matters of
personal taste are to be excluded then the measuring function has to be
precisely defined --- demanding simplicity, without giving this notion
a precise (formal) meaning, is not sufficient.

Therefore I would like to split the original question into two separate
questions:

(a) What (formal and informal) rating functions are likely to be
useful, and under what circumstances?

(b) With respect to some formal rating function, is there always a best
solution?

Some answers to the first question are the following (no completeness
claimed or even intended):
\begin{description}
\item[(1) the first solution:]
If some special effect is needed for a single application then the
best solution is the first solution (the solution that can be
realized with the least effort). This is, however, a purely
individual criterion that cannot be formalized.
\item[(2) the most economic (in some sense) solution:]
Economic considerations are important if a code is used frequently.
Depending on the nature of the applications, running time, memory
usage, and other factors may be relevant. But the time spent for finding
a good solution still cannot be neglected in a real world
situation. Of course, for theoretical investigations the time spent
for research does not matter.
\item[(3) the more robust solution:]
If some set of macros is used by a large number of people who do not
always know how to use them correctly (or do not even care to know)
then it is certainly an advantage if they are robust, i.e. work in
as many cases (even strange ones) as possible. But again, one has
to decide what price (in terms of resources) is acceptable for this
robustness. (In many cases item (4) below will be more
important.)
\item[(4) ease-of-use:]
If a set of macros is used frequently (by one or more persons) then
ease-of-use is certainly a mark of quality: easy to remember
syntax, short commands, natural and readable embedding into
the surrounding text, and similar criteria decide this.
\item[(5) simplicity:]
Simple solutions certainly have a strong appeal --- but what is a
simple solution? Again this is hard to formalize, since simplicity
basically is an aesthetic value, closely related to the concepts of
elegance and beauty. (This is similar to the situation in
mathematics.) But be careful: Simple is not equivalent to short!
\item[(6) the shortest solution:]
This may seem to be an easy rating function, but is it? Should
length be measured by the number of characters (probably not!), or
by the number of tokens, or by the number of control sequences? Or
by something else?
\end{description}
Most of the measures mentioned are difficult to formalize, or cannot be
formalized at all. Only the resources used (in (2)) and the length of a
code (in (6)) can be precisely defined. Therefore, with respect to one
of these cases two solutions of the same problem can be compared.
Furthermore, in many cases it will be possible to prove that an optimal
solution exists. (For instance, since the length of a code (in any
interpretation) is a positive integer, there must exist one or more
solutions with minimal length, provided there is at least one
solution.) But unfortunately this does not imply that one is able to
construct an optimal solution, or to decide whether a given piece of
code is an optimal solution (or at least near to one). And in some
cases it may happen that no optimal solution exists, e.g. if to every
solution there is a better --- but longer! --- one.
What is the conclusion of all this? That there may be a best solution
relative to some side conditions. But that there is no globally best
solution. This statement is, of course, not very satisfying. One
would rather prefer to have at least some notion (even a tentative one)
of a best solution than none at all. I propose therefore the following
informal definition (often subject to personal taste): If some code is
optimal or near-optimal in more than one category then it is probably
as near to a globally optimal solution as this is possible.
\end{quotation}
My comments:
I propose the following list, based on (1) [my interpretation of]
Knuth's ideas about good macro writing as demonstrated in the \emph{TeXbook}
and plain.tex, (2) various articles in TUGboat, (3) Schmitt's comments,
(4) discussions I've had in the past with other macro writers, and so
forth.
The characteristics of a good solution to an `Around the bend' exercise
are (in order of decreasing importance):
\begin{enumerate}
\item Robustness
\item Brevity (= minimal usage of TeX's main memory)
\item Simplicity
\item Ease of use
\item Suitable commentary
\item Speed
\item Minimal hash table load
\item Minimal save stack load
\item Minimal load in other categories of TeX's memory
\item Comprehensive test suite (when applicable)
\end{enumerate}
Schmitt's\index{Schmitt, Peter} point about `first solution' is well taken
but does not apply
to `Around the bend' exercises, because of the stated goal of finding a
`best' solution, with the presumption that normally more than one
solution will be found.
Measurement of these qualities is not too difficult, I think,
except for 3 and 5. Here's how I see the measurements:
\begin{description}
\item[1. Robustness] A solution is robust if no one who reads it offers a
counterexample that causes it to fail. If two solutions both fail, the
one with more counterexamples is less robust; if two solutions have
different counterexamples, the solution whose counterexample is more
likely to occur in normal use is the less robust solution.
\item[2. Brevity] Of two different solutions, the one that is
briefer/shorter/more compact is the one that uses less of TeX's main
memory as measured by \cmd{\tracingstats}.
\item[3. Simplicity] Of two different solutions, the shorter one (in the
sense of the previous item) is usually the simpler one, but not always.
A solution that condenses all the necessary operations into a dense,
incomprehensible Gordian knot is less simple than a longer solution
that lays out the operations in a series of easily comprehended steps.
A solution that relies on arcane dirty tricks is less simple than a
solution that uses better-known techniques in a straightforward
approach.
\item[4. Ease of use] I believe this will not be extremely hard to measure in
the context of the particular application; it can't sensibly be
discussed out of context.
\item[5. Suitable commentary] The commentary surrounding a solution should
explicitly mention any necessary assumptions. If the code is complex,
the commentary should give an outline or overview of the intended
algorithm. It should explain the operation of any macro if its
operation is not evident from the code. If an unusual construction is
used where a different construction would normally be expected, the
commentary should give the reason.
\item[6. Speed] Of two solutions, the speedier one is the one that runs
faster on common computer systems. If one solution runs faster than
another on some systems and slower on others \ldots well, let's not cross
that bridge unless it turns out to be real.
\item[7,8,9. Minimal hash table load, save stack load, etc.] These can be
measured by \\
\cmd{\tracingstats}.
\item[10. Comprehensive test suite] If two solutions are equal in other
respects, the one whose accompanying test suite covers more distinct
cases than the other's is better by that much.
\end{description}
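As a rough sketch of how such a measurement might be set up (the file name
\pfile{candidate.tex} is hypothetical, and the exact wording of the report
differs between TeX implementations), one can switch on the statistics,
load a single solution, and end the job:
\begin{lcode}
% measure.tex: run once per candidate solution
\tracingstats=2      % >0: report at end of job; >1: also at every \shipout
\input candidate     % hypothetical file containing one solution
\bye
\end{lcode}
The end of the log file should then list the strings, words of memory, and
multiletter control sequences used, together with the stack positions
reached; comparing these figures between runs gives the numbers for
items 2 and 7--9.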
It may be argued that I have not sufficiently answered the question of
subjectivity. For example, who's to decide what's an `arcane dirty
trick' and what's not? What does `suitable' mean in number 5? The
answer is that I will say that something is an `arcane dirty trick' if
I think so, and anyone else can do the same. In most cases I believe
that there will be general agreement on such a question; if not, and an
ensuing discussion fails to reach a clear settlement, then each of the
solutions in question will be decreed `subjectively just as good as the
others'.
Other qualities of a good solution can be expressed in terms of the
ones listed above. For example, self-sufficiency may be considered an
aspect of robustness---if a solution is not entirely self-sufficient,
it can easily be shown to be not robust by giving a counterexample that
exploits the assumption that makes the solution non-self-sufficient.
Elegance? If a solution is simple and easy to use, then I say it is
elegant. A solution doesn't necessarily have to be robust in order to
be elegant, nor even short (although of two solutions that are
otherwise equal, the shorter one is undoubtedly more elegant).
\begin{comment}
[Solution for exercise 5 moved to answer.005]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Table of special characters (ASCII):
33: ! exclamation point; 59: ; semicolon;
34: " double quote; 60: < left elbow;
35: # number/pound sign; 61: = equals sign;
36: $ dollar sign; 62: > right elbow;
37: % percent sign; 63: ? question mark;
38: & ampersand; 64: @ at sign;
39: ' right quote/apostrophe; 91: [ left square bracket;
40: ( left parenthesis; 92: \ backslash;
41: ) right parenthesis; 93: ] right square bracket;
42: * star/asterisk; 94: ^ circumflex/hat/caret;
43: + plus sign; 95: _ underscore;
44: , comma; 96: ` left quote;
45: - hyphen; 123: { left curly brace;
46: . period/dot/point; 124: | vert bar;
47: / slash; 125: } right curly brace;
58: : colon; 126: ~ tilde
%$
Michael Downes
[email protected] (Internet)
\end{comment}
%%\endinput
\chapter{\cs{string} tokens}
\section{Exercise (fast)}
%%\input{ex005}
% ex005.tex
\ed{\oposted{1991/11/04}. \arch{exercise.005}.}
\begin{comment}
[Posted to info-tex on 4 Nov 91; see exercise.004]
***********************************************************************
*** Exercise 5 (fast):
\end{comment}
Assuming a normal value for \cmd{\escapechar},
\begin{lcode}
\string\a
\end{lcode}
produces two character tokens. What is the category code of the second?
Write an experiment (as short as possible) to demonstrate the
correctness of your answer.
%%%**********************************************************************
%%\endinput
\section{Answers}
%%\input{ans005}
% ans005.tex
\ed{\oposted{1991/12/05}. \arch{answer.005}.}
\begin{comment}
[Posted to info-tex on 5 Dec 91; see answer.004]
"***********************************************************************
"*** Exercise 5 (fast):
"
"Assuming a normal value for \escapechar,
"
" \string\a
"
"produces two character tokens. What is the category code of the second?
"Write an experiment (as short as possible) to demonstrate the
"correctness of your answer.
\end{comment}
The category of the `a' token is 12. All tokens produced by \cmd{\string}
have category 12, except for space tokens, which have category 10.
\begin{solution}{Solution 1 (mine)}
\begin{lcode}
\def\answercheck#1#2{\message{#2: \ifcat0#2\else NOT \fi Category 12}}
\expandafter\answercheck\string\a
\answercheck bb
\end{lcode}
This produces on screen the following message:
\begin{lcode}
a: Category 12 b: NOT Category 12
\end{lcode}
\end{solution}
%%>>EndSolution
%%>>Solution 2 [Peter Schmitt]:
\begin{solution}{Solution 2 (Peter Schmitt)}\index{Schmitt, Peter}
\begin{lcode}
\def\test#1#2#3{%
\message{\ifcat#2#3 #2 and #3 have the same category code
\else #2 and #3 have not the same category code
\fi}}
\def\Test#1#2#3{%
\ifcat#2#3 \message{#2 and #3 have the same category code}
\else \message{#2 and #3 have not the same category code}
\fi}
\catcode`\A12
\test 1aA
\Test 1aA
\expandafter\test\string\a A
\expandafter\Test\string\a A
\end{lcode}
Comment: \\
I have given two essentially equivalent Tests --- \cmd{\test} and \cmd{\Test}.
(i) \cmd{\test} is slightly simpler because it contains only one \cmd{\message}
command, but I think that \cmd{\Test} is more appropriate because it avoids
performing the test inside the \cmd{\message} --- there might be some side
effect one is not aware of.
(ii) Neither test is as short as possible --- the \piif{true} and \piif{false}
cases could be much shorter, e.g. a T (for true) and a F (for false)
would suffice --- the result could be checked in the dvi-file. (I
regard this difference as inessential.)
Furthermore, setting the catcode of the model character to 12 could
easily be omitted (use some character that is known to be an `other
character'), but I think it should be included: It makes the test
independent of any assumption on the format running. This makes the
solution more self-contained and self-sufficient, and therefore also simpler and
more elegant (if I may say so).
\end{solution}
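The exception for space tokens can be checked in the same spirit. The
following is only a sketch; \cmd{\spacecheck} is a name introduced here for
illustration. Its second parameter is delimited by a period so that the
space token is not skipped, as it would be for an undelimited parameter.
\begin{lcode}
% #1 takes the escape character, #2 whatever follows it up to the period.
\def\spacecheck#1#2.{\message{Second token:
  \ifcat\space#2category 10\else NOT category 10\fi}}
\expandafter\spacecheck\string\ .
\end{lcode}
Under plain TeX's usual catcodes this should report that the second token
has category 10.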
%%>>EndSolution
%%\endinput
\chapter{Counting arguments}
\section{Exercise (hard)}
%%\input{ex006}
\begin{comment}
[Posted to info-tex on 4 Nov 91; see exercise.004]
**********************************************************************
*** Exercise 6 (hard):
\end{comment}
\ed{\oposted{1991/11/04}. \arch{exercise.006}.}
Define a macro \cmd{\args} that can be used to fill in the proper number
in the following sentence no matter how \cmd{\foo} is defined (except
you may assume it is not \cmd{\outer}).
\begin{lcode}
The macro {\tt\string\foo} has {\args\foo} arguments.
\end{lcode}
Is it possible to solve this if \cmd{\foo} is \cmd{\outer} also? Is it possible
to make \cmd{\args} fully expandable, so that it could be used in a
message:
\begin{lcode}
\message{The macro \noexpand\foo has \args\foo\space arguments.}
\end{lcode}
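To see why full expandability matters here, recall that \cmd{\message}
expands its argument but performs no assignments. A minimal sketch (the
macro name \cmd{\viaassignment} is made up for this illustration):
\begin{lcode}
\count255=7
\def\viaassignment{\count255=3 \the\count255 }
\message{[\viaassignment]}
\end{lcode}
The assignment should merely be printed, not executed, so that \cmd{\the}
reports the old value 7 rather than 3; a definition of \cmd{\args} that
relies on assignments would go wrong in the same way.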
%%**********************************************************************
%%\endinput
\section{Answers}
%%\input{ans006}
% ans006.tex
\begin{comment}
Date: Mon 23 Dec 91 11:46:33-EST
From: Michael Downes <
[email protected]>
Subject: Answers to 'Around the bend' #2 Exercise 6
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
"*** Exercise 6 (hard):
"
"Define a macro \args that can be used to fill in the proper number
"in the following sentence no matter how \foo is defined (except
"you may assume it is not \outer).
"
" The macro {\tt\string\foo} has {\args\foo} arguments.
"
"Is it possible to solve this if \foo is \outer also? Is it possible
"to make \args fully expandable, so that it could be used in a
"message:
"
" \message{The macro \noexpand\foo has \args\foo\space arguments.}
\end{comment}
\ed{\oposted{1991/12/23}. \arch{answer.006}.}
This was a tough one. All who sent in answers to this exercise
(counting myself) used the approach of applying \cmd{\meaning} to \cmd{\foo} and
analyzing the resulting string. There are some drawbacks to this.
(1) In a \cmd{\meaning} string, all characters (other than spaces) have
catcode 12. This means that all occurrences in a \cmd{\meaning} string of
the character \# are indistinguishable, regardless of their true
significance in the parameter text or replacement text of the macro
in question. Consequently, an occurrence of a \# character, not
category 6, followed by a number, in the parameter text of \cmd{\foo} can
potentially make \cmd{\args} report an incorrect number of arguments. For
example, in the following definitions \cmd{\foo} has no arguments, only
delimiter text, in all three cases, but the \cmd{\meaning} string would
appear to show that \cmd{\foo} has one argument:
\begin{lcode}
\def\foo\#1{}
\expandafter\def\expandafter\foo\string #1{}
\catcode`\#=12 \def\foo#1{}
\end{lcode}
(2) The following two examples produce identical \cmd{\meaning} strings:
\begin{lcode}
\def\foo&1{} % no arguments
\catcode`\&=6 \def\foo&1{} % one argument
\end{lcode}
(The string is \verb?"macro:&1->"?.) I.e., characters other than \# can
be used to create parameter markers in a macro definition, and
such a parameter marker cannot be distinguished in a \cmd{\meaning}
string from a normal use of the character in question.
(3) There is no completely general way to isolate the parameter text
of an arbitrary macro from the replacement text. The best you can do
is remove the tail of the \cmd{\meaning} string---everything after the last
occurrence of \verb?->? in the string---and say `This is not part of the
parameter text'. Likewise, anything preceding the first occurrence of
\verb?->? is certainly part of the parameter text. If there are two or more
occurrences of \verb?->? in the string, however, you cannot say for sure
whether anything between the first and last occurrences is parameter
text or replacement text. This raises a slight additional possibility
that pseudo `parameter markers' in the replacement text could cause
\cmd{\args} to give an incorrect result. For example:
\begin{lcode}
\edef\foo #1{\string#2->}
\end{lcode}
defining \cmd{\foo} with one argument, produces a \cmd{\meaning} string of
\begin{lcode}
macro:#1->#2->
\end{lcode}
which is exactly the same as the \cmd{\meaning} string for
\begin{lcode}
\def\foo#1->#2{}
\end{lcode}
where \cmd{\foo} has two arguments.
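This collision can be confirmed mechanically. The sketch below (the names
\cmd{\one} and \cmd{\two} are introduced only for the comparison) stores
both \cmd{\meaning} strings and compares them with \cmd{\ifx}:
\begin{lcode}
\edef\foo #1{\string#2->}  \edef\one{\meaning\foo}% one argument
\def\foo#1->#2{}           \edef\two{\meaning\foo}% two arguments
\message{The two strings are \ifx\one\two identical\else different\fi.}
\end{lcode}
If the analysis above is right, the message reports that the strings are
identical, so no amount of string processing can tell the two macros apart.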
Speaking practically, however, rather than theoretically, using
\cmd{\meaning} to analyze the number of arguments of an arbitrary macro
works fine. Donald Arseneau's solution, below, is admirably
brief and demonstrates an easy way of handling an outer argument
that I had never seen before.
\begin{solution}{Solution 1 (Donald Arseneau)}\index{Arseneau, Donald}
Here is my solution for counting arguments. It is totally expandable,
and relies on the fact that the parameter numbers must be in
increasing order, that they are only single digits, and that there is
no parameter zero. Also important is that \cmd{\meaning} of a macro defined
by \verb?\def\x#{...}? reports a syntax of \verb?{? rather than \#.
\begin{lcode}
{\catcode`\*=6 \catcode`\#=12 % use * for macro parameters while # is "other"
%
\gdef\args{\expandafter\Args\noexpand}% get rid of \outerness
%
\long\gdef\Args*1{\expandafter\countargs \meaning*1:->{}\end}%
% ... \meaning will display the parameter syntax (as "other" characters).
%
\gdef\countargs*1:*2->*3\end{\twoargs#0*2#0}% get just the parameter syntax
% ... in format #0junk#1junk...#njunk#0. \twoargs processes the list to
% ... give "n", the last number before #0.
\end{lcode}
Here's what tests the parameter numbers, two at a time. (Thus, the two
\verb?#0?'s in \cmd{\countargs}, so there are always at least two
\verb?#n?'s detected.)
When the second number of a comparison isn't zero, \cmd{\twoargs} re-executes
itself to test the next pair; when the second \verb?n? is 0, the first
\verb?n? is the
highest parameter number, so it is output.
\begin{lcode}
\gdef\twoargs*1#*2*3#*4{\ifnum0=*4 *2\else % note the space to end the number
\expandafter\twoargs\expandafter#\expandafter*4\fi}
}
\end{lcode}
Here is my test suite. The character ``:'' works in a funny way: it
confuses how \cmd{\countargs} reads its parameter list, and another colon
gets into the supposed syntax. But it works because there are no
parameters. The primitive \cmd{\halign} is reported to have no parameters
because it is not a macro. This could be confusing to someone. The
same confusion could arise with \cmd{\args} itself because it doesn't read
the parameter right away.
\begin{lcode}
\def\test#1#{nothing}
\def\Test[#1]#2:{\##1,#2##}
\def\#{haha}
\show\test \show\Test
\end{lcode}
(I condensed this test suite---MJD)
\begin{lcode}
\long\def\msg#1{\message{The object \string#1 has \args#1 arguments.}}
\msg\mathpalette \msg\mathhexbox \msg\par \msg\halign \msg\args
\msg\relax \msg # \msg\# \msg\test \msg\Test \msg : \msg\: \msg\csname
\msg t \msg ~ \msg $ \msg ^
\end{lcode}
(Outer macros---MJD)
\begin{lcode}
\message{The object \string\bye\space has \args\bye\space arguments.}
\message{The object \string\newhelp\space has \args\newhelp\space
arguments.}
\bye % -- Donald Arseneau
\end{lcode}
\end{solution}
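To make the pairwise scan concrete, here is a hand trace (worked out on
paper, not machine output) for a hypothetical macro defined by
\verb?\def\foo#1.#2{}?, whose \cmd{\meaning} string is
\verb?macro:#1.#2->?:
\begin{lcode}
\args\foo
-> \Args\foo                          (\noexpand protects the token)
-> \countargs macro:#1.#2->:->{}\end
-> \twoargs #0#1.#2#0                 (parameter text between two #0's)
-> \twoargs #1.#2#0                   (pair 0,1: second is not 0, go on)
-> \twoargs #2#0                      (pair 1,2: second is not 0, go on)
-> 2                                  (pair 2,0: output the first number)
\end{lcode}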
%%>>EndSolution
Although the problem statement only mentioned `macros', Arseneau
earned some thoroughness points by including primitives \cmd{\halign},
\cmd{\relax}, and \cmd{\csname}, as well as characters \verb?# : t $ ^?
in his tests.
This is of some interest because of the difference in \cmd{\meaning}
strings between macros and non-macros.
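The difference is easy to inspect directly. A small sketch for plain TeX,
with the strings one would expect shown as comments:
\begin{lcode}
\message{[\meaning\halign]}    % [\halign]
\message{[\meaning\relax]}     % [\relax]
\message{[\meaning t]}         % [the letter t]
\message{[\meaning $]}         % [math shift character $]
\message{[\meaning\smallskip]} % [macro:->\vskip \smallskipamount ]
\end{lcode}
Only the last string contains \verb?macro:? and \verb?->?, so a solution
based on \cmd{\meaning} finds no parameter text at all in the other cases.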
In my solution for this exercise, I amused myself by trying to pack
everything into as few control sequences as possible. Although I got
it down to two, that's really only one less than Arseneau's four,
because one control sequence in his solution is expended to
handle outer macros, something my solution didn't attempt to do.
%>>Solution 2 (mine)
\begin{solution}{Solution 2 (mine)}
\begin{lcode}
% Use & instead of # temporarily.
\catcode`\&=6 \catcode`\#=12
\long\def\args &1{\expandafter\countargs\meaning &1#\args->\countargs 0}
\end{lcode}
Analysis is restricted to the parameter text by chopping off everything
after \verb?->? in the meaning string (this will leave possibly only part
of the parameter text).
Then we look in the parameter text for \# followed by a number
(checking to make sure that the thing after \# is a number handles a
few extra possibilities, such as \verb?\#? followed by non-number in the
parameter text). If we find \# plus a number, we pass the number
onward to the next invocation of \cmd{\countargs}, where it will end up as
the returned value (argument \#5) if the next \cmd{\countargs} determines
that the remaining parameter text contains no more parameter markers.
\begin{lcode}
\def\countargs &1#&2&3->&4\countargs &5{%
\ifx\args&2&5%
\else
\ifodd0&21 % Then &2 is a number, carry forward.
\countargs&3#\args->\countargs&2%
\else % &2 not a number---ignore, carry forward last number instead
\countargs&3#\args->\countargs&5%
\fi
\fi}
\catcode`\#=6
\def\test{\message{The macro \noexpand\foo has \args\foo\space
arguments (\meaning\foo).}}
%\tracingmacros=2 \tracingcommands=2
% Success:
\def\foo{No args}\test
\def\foo#1{One arg}\test
\def\foo#1#2{Two args}\test
\def\foo./{No args, delimited}\test
\def\foo#1#2#3#4#5#6#7#8#9{Nine args}\test
\def\foo//#1#2#3#4#5#6#7#8#9//{Nine args, delimited}\test
\def\foo#{Weird}\test
\def\foo#1#{Weird, one arg}\test
\def\foo#1#2#3#4#5#6#7#8#9#{Weird, nine args}\test
\def\foo#1 {One arg, space delimited}\test
\def\foo#1 #2 #3 #4 #5 #6 #7 #8 #9 {Nine args, space delimited}\test
\def\foo/{\def\foo}
\foo/ #1{Interesting}\test
\edef\foo#1#2{\string #3\string #4}\test
\edef\foo{\string #}\test
\expandafter\edef\expandafter\foo
\csname 0\string #\string #\endcsname#1#2{#1#2}\test
% Failure:
\def\foo->#1->#2->#3->#4->#5->#6->#7->#8->#9->{Nine args, devious
delimiter}\test
\expandafter\edef\expandafter\foo
\csname 0\string #1\string #2\endcsname{...}\test
\let\foo=\bye \test % \outer bomb
\end{lcode}
\end{solution}
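The digit test \verb?\ifodd0&21? deserves a closer look. Seen in isolation,
with the usual \# parameter character and a hypothetical helper name
\cmd{\isdigit}, it might be sketched as:
\begin{lcode}
% If #1 is a digit d, TeX reads the number 0d1, which is odd.
% Otherwise the number read is just 0, which is even, and the
% rest is skipped as part of the false branch.
\def\isdigit#1{\ifodd0#11 yes\else no\fi}
\message{\isdigit 7 \isdigit x}% yes no
\end{lcode}
The leading 0 guarantees that TeX always finds a number to read, and the
trailing 1 makes the number odd exactly when a digit sits in between.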
%%>>EndSolution
When I originally posed this problem, I had seen far enough ahead to
suspect that the drawbacks of \cmd{\meaning} mentioned above would be
impossible to overcome. But \cmd{\meaning} is the only way to analyze a
macro that has a nonsimple parameter text---that is, one containing
delimited arguments. Another possibility I had in mind was restricting
the analysis to macros with simple parameter texts---empty or having
only nondelimited arguments---to see what might be done without
\cmd{\meaning}. The best that I could manage in my experiments along these
lines was a definition of \cmd{\args} with an unacceptably cumbersome call
syntax. But it does have the virtue of correctly identifying any
number of nondelimited arguments, no matter whether \cmd{\foo} was
originally defined using \# (category 6) or some other category 6
character.
%%>>Solution 3 (mine)
\begin{solution}{Solution 3 (mine)}
\begin{lcode}
% This solution is not fully expandable, hence cannot be used
% inside a \message.
\def\args{\expandafter\argscontinue}
\def\argscontinue{\begingroup
\end{lcode}
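Make the digits 0 to 8 have category 2 (= end of group) so that
whichever digit comes next will end the token register assignment
\verb?\global\toks1 ...? (the digit 9 is never needed for this, because
a macro absorbs at most nine of the eleven digits supplied).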
\begin{lcode}
\catcode`\0=2 \catcode`\1=2 \catcode`\2=2 \catcode`\3=2 \catcode`\4=2
\catcode`\5=2 \catcode`\6=2 \catcode`\7=2 \catcode`\8=2
\end{lcode}
We use \cmd{\afterassignment} to put an \cmd{\endgroup} after the
token register assignment, so that numbers will revert to
their ordinary catcodes. And we use \cmd{\aftergroup} to put
a \cmd{\finishup} token after the \cmd{\endgroup}. Thus \cmd{\finishup} can
look ahead to see what numbers are remaining; this information
reveals how many arguments were used up by the \cmd{\foo} macro call.
\begin{lcode}
\aftergroup\finishup \afterassignment\endgroup
\global\toks1\bgroup}
\end{lcode}
\cmd{\finishup} takes the first digit following it and returns it
as the value of \cmd{\args}; any following numbers are discarded
(note that \#2 is delimited by a space).
\begin{lcode}
\def\finishup#1#2 {%\showthe\toks1
#1}
%\tracingmacros=2 \tracingcommands=2 \tracingonline=1
\def\foo{}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.
\def\foo#1{}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.
\edef\foo#1{\string #2\string #3\string #4->\string #4\string #3#1}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.
\def\foo#1#2#3{a#1b#2c#3}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.
\def\foo#1#2#3#4#5#6#7#8#9{#1#2#3#5#8bb#9}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.
\end{lcode}
\end{solution}
%%>>EndSolution
The fourth solution for Exercise 6 is by Peter Schmitt; it gets the
robustness prize for carrying out a diligent analysis of \cmd{\meaning}
strings that enables it to correctly handle a greater variety of
exotic cases than the other solutions. Schmitt's original method of
handling outer macros was effective, but more complicated than
Arseneau's method, incorporated here as noted. Even though my
approach was rather different from Schmitt's, some of the comments in
Schmitt's solution inspired me in turn to improve my solution [2]
from its previous much inferior state.
%%>>Solution 4 (Peter Schmitt)
\begin{solution}{Solution 4 (Peter Schmitt)}\index{Schmitt, Peter}
\begin{lcode}
% \args <token> expands to: - if <token> is not a macro
% 0..9 according to the number of parameters
% if the <token> is a macro
% \args is fully expandable and accepts outer macros as well.
% It assumes, however, that the tested macro has been defined using the
% standard parameter symbol #,
% and that the current value of \escapechar is the standard backslash \.
\end{lcode}
The definition of the macros uses the expansion of
\cmd{\meaning}\verb?\cs?:
It is of the form:
\begin{lcode}
[..] macro: [parameter text] -> [replacement text]
\end{lcode}
and consists of `other characters'.
The macro \cmd{\args} checks:
\begin{enumerate}
\item if the expansion contains `macro': \\
--- if not, then \verb?\cs? is not a macro and \cmd{\args} yields `-'
\item if the expansion contains parameters \#1 etc. \\
--- if \verb?#n? is the first that is not present
then \verb?\cs? takes (n-1) arguments
and \cmd{\args} yields `n-1'
\end{enumerate}
The following special characters are chosen to make the definitions as
readable as possible. Any characters having catcodes different from 12
will serve the same purpose:
\begin{lcode}
\catcode`\:3 \catcode`\/3 % : and / are used as parameter delimiters
\catcode`\^3 % ^ is used to detect empty arguments
\catcode`\?11 % ? is used to make the control sequences private
\end{lcode}
Since the occurrences of \# in the expansion of \cmd{\meaning}\verb?\cs? have to be
detected, \# has to be used as an `other character'.
To avoid confusion it has been replaced not only where necessary but
throughout all the definitions:
\begin{lcode}
\catcode`\#12 \catcode`\*6 % * is parameter character
\end{lcode}
\begin{itemize}
\item \verb|\?macro| is defined to be `macro' consisting of `other characters'
using the expansion of \verb?\meaning\TeX?.
\item \verb|\?DEF| inserts these five characters into some definitions
where they are used as parameter delimiters:
\begin{lcode}
\DEF\cs { <parameter text> } { <replacement text> }
\end{lcode}
where the texts may contain *1 and **1 .. **9
yields
\begin{lcode}
\def\cs <parameter text>{<replacement text>}
\end{lcode}
where *1 is replaced by `macro' and **1 yields *1 etc.
\end{itemize}
\begin{lcode}
\def\?macro *1:*2:{*1} \edef\?macro{\expandafter\?macro\meaning\TeX:}
\def\?DEF *1*2{\def*1**1:{\long\def*1*2}\expandafter*1\?macro:}
\end{lcode}
\begin{itemize}
\item \cmd{\args} passes the \meta{token} unexpanded to \verb|\args?|
\item (taken from the solution by Donald Arseneau)
\verb|\args?| takes one argument, expands its \cmd{\meaning} to TEXT
and passes it to \verb|\macro?| after appending \verb|macro^:|
\item \verb|\macro?| checks the first token after the first occurrence of
`macro':
if this is \verb?^(3)?, then `macro' was not present in TEXT (output: -)
otherwise TEXT is further investigated.
\end{itemize}
\begin{lcode}
\def\args{\expandafter\args?\noexpand}
\?DEF \args? {**1{\expandafter\macro?\meaning **1*1^:}}
\?DEF\macro? {**1*1**2:{\ifx^**2-\else\expandafter\purge? **2:\fi}}
\end{lcode}
The parameters taken by a control sequence all appear (once and in
numerical order) in the parameter text --- and no other occurrence
of a pair \verb?#n? is allowed in it. Moreover, only the same pairs \verb?#n?
may
occur in the replacement text. It is, however, not possible to simply
look for occurrences of these pairs since there are tokens that may ---
if followed by some number --- be (wrongly) interpreted as parameters:
\begin{itemize}
\item the token \verb?##? in the replacement text, and
\item (as pointed out by Michael Downes)
the control symbol \verb?\#? both in the parameter text and the
replacement text.
\end{itemize}
Since \verb?\\#n? has to be distinguished from \verb?\#n? the control
symbol \verb?\\? is also important.
Therefore \verb|\purge?| is used to remove all occurrences of these tokens.
After that the search-macro \verb|\head?| is invoked, appending
the sequence \verb?#n^(n-1)? for every possible parameter \verb?#n?.
Since \verb|\purge?| has to identify the character \verb?\(12)? it is
necessary to change the escape character:
\begin{lcode}
\catcode`\!0 !catcode`!\=12 % ! is used as escape character
\end{lcode}
\verb|\purge?| appends \verb?##?, \verb?\#^? and \verb?\\^? to the TEXT as a means to
stop the search
for these tokens, and : as delimiter:
\begin{enumerate}
\item \verb|\backslash?| looks for the first occurrence of the character pair
\verb?\\? in TEXT (this must be a token \verb?\\?) and replaces it by a
space.
If it is followed by \verb?^(3)? then the search is completed,
otherwise the process is repeated.
\item \verb|\numbersign?| looks for the first occurrence of the character pair
\verb?\#? in the (in the meantime modified) TEXT (since all \verb?\\? have
been removed this must correspond to a token \verb?\#?) and replaces it by
a space.
Again the process is stopped when it is followed by \verb?^(3)?.
\item \verb|\parametersign?| truncates TEXT at the first occurrence of the
character pair \verb?##?. Note that this pair must correspond to a parameter
token \verb?##? in the replacement text and therefore the rest of TEXT is
not needed any more.
\end{enumerate}
\begin{lcode}
!def!purge? *1:{!backslash? *1##\#^\\^:}
% \purge? could be avoided - \macro? could call \backslash? directly
!def!backslash? *1\\*2*3:{!ifx^*2!expandafter!numbersign?
!else !expandafter!backslash?
!fi *1 *2*3:}
!def!numbersign? *1\#*2*3:{!ifx^*2!expandafter!parametersign?
!else !expandafter!numbersign?
!fi *1 *2*3:}
!catcode`!\0 \catcode`\!=12 % return to the normal use of backslash
\def\parametersign? *1##*2:{%
\head? *1^#1^0#2^1#3^2#4^3#5^4#6^5#7^6#8^7#9^8#0^9:}
\end{lcode}
For each n from 0 to 9 \verb|\head?| extracts the characters contained in
the (appended) TEXT between the first occurrence of \verb?#n? and
\verb?#(n+1)? and investigates them with \verb|\used?|.
If \verb?#n? is not present in TEXT, then the first of these characters is
\verb?^(3)?, taken from the appended string: \\
When this happens for the first time \verb|\used?| outputs the second character
(the number of parameters) and calls \verb|\skip?| to hide all the remaining
parts of the appended TEXT, otherwise \verb|\used?| checks the next item.
Since eleven parameters are necessary to handle the ten cases (0..9) this
duty has to be distributed over two macros: \\
The appearance of the character \verb?/(3)? is used to indicate that the second
macro \verb|\tail?| has to be invoked by \verb|\used?|.
\begin{lcode}
\def\head? *1#1*2#2*3#3*4#4*5#5*6:{%
\used? *2..:*3..:*4..:*5..:/.:%
\expandafter\tail? *6://}
\def\tail? *1#6*2#7*3#8*4#9*5#0*6:{\used? *2..:*3..:*4..:*5..:*6:}
\def\used? *1*2*3:{\ifx^*1*2\expandafter\skip?\else\ifx/*1\else
\expandafter\expandafter\expandafter\used?\fi\fi}
\def\skip? *1//{}
%% Finally, catcodes are turned back to normal:
\catcode`\#6 \catcode`\*12 \catcode`\?12
\catcode`\:12 \catcode`\/12 \catcode`\^12
%%%%%%%%%%%%%%%%%%%%%%
\long\def\test#1{
The macro {\tt\string#1} has {\args#1} arguments.
\message{The macro \noexpand#1 has :\args#1:\space arguments.}
}
\def\exc#1\\#2\ #3{\#4\\#1\\\#4\\\\#2two arguments}
\test\exc
\end
\end{lcode}
\end{solution}
%%>>EndSolution
Schmitt's solution assumes the use of my and Arseneau's test suites
as well, because they had been shared between us before Schmitt sent
in the final version of his solution.
\begin{comment}
Answers for Exercise 7 will follow next week.
Michael Downes
[email protected] (Internet)
\end{comment}
%%\endinput
\chapter{Self replication}
\section{Exercise (hard)}
%%\input{ex007}
\begin{comment}
[Posted to info-tex on 4 Nov 91; see exercise.004]
**********************************************************************
*** Exercise 7 (hard):
\end{comment}
\ed{\oposted{1991/11/04}. \arch{exercise.007}.}
In the September 1991 issue of Dr. Dobb's Journal, in an article
`Little Languages, Big Questions' (pp. 16--25), Ray Vald\'es
described a `little language' as a part of a more complex
application that is
\begin{quote}
partitioned into two (or more) nested components: a core module
that provides a primitive set of services for an application area
(the ``engine''), and a surrounding module that provides
programmatic access to these services. The surrounding module is
typically a language interpreter for a simple, easily parsed
computer language--a ``little language''.
\end{quote}
Since TeX seems to fall into this category, I wonder if any Dr. Dobb's
readers who know TeX tried their hand at the challenge given in a
sidebar (`How Strong Is Your Little Language')?
\begin{quote}
[An] informal benchmark of a language's computational power is the
programming exercise that Ken Thompson (coauthor of Unix) used to
pass the time in college. ... The goal is to write the shortest
self-reproducing program: ``More precisely stated ... to write a
source program that, when compiled and executed, will produce as
output an exact copy of its source.''
\end{quote}
When I tried it, it turned out to be a real challenge for me. In the
Unix world, for conventional compiled languages, the problem as
originally stated can assume output on the `standard output' stream;
but TeX already clutters up standard output with some of its built-in
messages. This leaves three alternatives in refining the statement of
the problem to be meaningful for TeX:
\begin{enumerate}
\item Write a TeX program that includes the built-in messages in its
source in such a way that it exactly fulfills the original problem
statement with standard output as the output stream.
\item Pretend the built-in messages don't exist and write a TeX program
that reproduces an exact copy of itself (with no extra garbage)
in the middle of the built-in messages.
\item Write on a different output stream.
\end{enumerate}
Take your pick, any or all of the above, and see what you can come up
with. I have solutions for 2 and 3 but have not gotten around to really
thinking about 1 yet. I believe it will require at least a different
algorithm than the other two, if it is not impossible.
%%%**********************************************************************
%%\endinput
\section{Answers}
%%\input{ans007}
% ans007.tex
\begin{comment}
[The `forthcoming' TUGboat article cited below appeared as
`Self-replicating macros' by Victor Eijkhout and Ron Sommeling, TUGboat
13 (1992) no 1, p. 84]
Date: Tue 7 Jan 92 16:43:29-EST
From: Michael Downes <
[email protected]>
Subject: 'Around the bend' #2, Exercise 7, solutions
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
"*** Exercise 7 (hard):
"
"In the September 1991 issue of Dr. Dobb's Journal, in an article
"`Little Languages, Big Questions' (pp. 16--25), Ray Vald\'es
"described a `little language' as a part of a more complex
"application that is
"
" partitioned into two (or more) nested components: a core module
" that provides a primitive set of services for an application area
" (the ``engine''), and a surrounding module that provides
" programmatic access to these services. The surrounding module is
" typically a language interpreter for a simple, easily parsed
" computer language--a ``little language''.
"
"Since TeX seems to fall into this category, I wonder if any Dr. Dobb's
"readers who know TeX tried their hand at the challenge given in a
"sidebar (`How Strong Is Your Little Language')?
"
" [An] informal benchmark of a language's computational power is the
" programming exercise that Ken Thompson (coauthor of Unix) used to
" pass the time in college. ... The goal is to write the shortest
" self-reproducing program: ``More precisely stated ... to write a
" source program that, when compiled and executed, will produce as
" output an exact copy of its source.''
"
"When I tried it it turned out to be a real challenge for me. In the
"Unix world, for conventional compiled languages, the problem as
"originally stated can assume output on the `standard output' stream;
"but TeX already clutters up standard output with some of its built-in
"messages. This leaves three alternatives in refining the statement of
"the problem to be meaningful for TeX:
"
"1. Write a TeX program that includes the built-in messages in its
"source in such a way that it exactly fulfills the the original problem
"statement with standard output as the output stream.
"
"2. Pretend the built-in messages don't exist and write a TeX program
"that reproduces an exact copy of itself (with no extra garbage)
"in the middle of the built-in messages.
"
"3. Write on a different output stream.
"
"Take your pick, any or all of the above, and see what you can come up
"with. I have solutions for 2 and 3 but have not gotten around to really
"thinking about 1 yet. I believe it will require at least a different
"algorithm than the other 2, if it is not impossible.
\end{comment}
\ed{\oposted{1992/01/07}. \arch{answer.007}.}
Plenty of good answers for this one.
%%>>Solution 1 (mine)
\begin{solution}{Solution 1 (mine)}
This solution is Type 2 (print the copy in the middle of TeX's
built-in messages). It assumes \pfile{plain.tex} or similar has been
loaded to set the catcodes of the left and right curly braces.
The idea is to assign the text to the token register \cmd{\errhelp}
(used merely because it is a convenient pre-existing token
register), and then print out \cmd{\the}\cmd{\errhelp} twice. There is a bit
of shuffling to ensure that \cmd{\errhelp} will swallow the last half of
the file and that the last half of the file is equal to the first
half, which contains all the preparations necessary to prepare
\cmd{\errhelp} for that swallowing and the subsequent message-sending.
A space is left after every control word, because this is easier
than trying to prevent TeX from printing spaces after control
words when the message is eventually printed on screen.
The lines are carefully arranged to break at column 79
(including spaces) since this is the normal value for \verb?max_print_line?,
a constant compiled into TeX which controls the length of screen
output lines. It would be easy to make the lines work out nicely
no matter what the working code required, by varying the length
of the macro name \cmd{\selfcopy} and using, say, \cmd{\everyhbox} or
\cmd{\everyjob} instead of \cmd{\errhelp}.
The total number of tokens in this solution is 54.
\begin{lcode}
{\gdef \selfcopy {\message {{\the \errhelp }}\message {{\the \errhelp }}\end }
\aftergroup \errhelp \afterassignment \selfcopy }
{\gdef \selfcopy {\message {{\the \errhelp }}\message {{\the \errhelp }}\end }
\aftergroup \errhelp \afterassignment \selfcopy }
\end{lcode}
%%>>EndSolution
\end{solution}
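The interplay of \cmd{\aftergroup} and \cmd{\afterassignment} in Solution 1
can be seen in isolation in the following minimal sketch for plain TeX. It is
an illustration only, not part of the original posting, and the macro name
\cmd{\report} is made up for the purpose. The group leaves \cmd{\errhelp}
behind when it closes, the braced text on the next line becomes the value
assigned to \cmd{\errhelp}, and the token saved by \cmd{\afterassignment}
runs once that assignment is complete.
\begin{lcode}
% \report is a hypothetical helper: show the contents of \errhelp.
\def\report{\message{[\the\errhelp]}}
% \aftergroup queues \errhelp for the moment the group closes;
% \afterassignment queues \report for the end of the next assignment.
{\aftergroup\errhelp \afterassignment\report}
{abc \def }
\bye
\end{lcode}
On the terminal this should show \verb?[abc \def ]?, with a space after the
control word. That space is the reason Solution 1 leaves a space after every
control word in its source.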
%%>>Solution 2 (mine)
\begin{solution}{Solution 2 (mine)}
This variation is Type 3, writing the copy to a disk file
instead of to the screen. The total number of tokens in this
solution is 126.
\begin{lcode}
\immediate \openout 0=\jobname .cpy
{\gdef ~#112{\errhelp {#112}\immediate \write 0{\the \errhelp
}\immediate \write 0{\the \errhelp }\immediate \closeout 0 \end}}
\newlinechar 13 \catcode `\#=3 \afterassignment ~\catcode 13=12
\immediate \openout 0=\jobname .cpy
{\gdef ~#112{\errhelp {#112}\immediate \write 0{\the \errhelp
}\immediate \write 0{\the \errhelp }\immediate \closeout 0 \end}}
\newlinechar 13 \catcode `\#=3 \afterassignment ~\catcode 13=12
\end{lcode}
%%>>EndSolution
\end{solution}
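The file-writing half of a Type 3 solution can be tried on its own with the
following plain TeX sketch. It is only an illustration under assumptions of
mine: the stream name \cmd{\copyfile} and the sample text are invented, and
\cmd{\newwrite} and \verb?\toks0? are used where Solution 2 uses stream 0 and
\cmd{\errhelp}.
\begin{lcode}
\newwrite\copyfile % hypothetical stream name
\immediate\openout\copyfile=\jobname.cpy
% Keep the text in a token register: \write expands its argument, but
% tokens delivered by \the are copied without further expansion.
\toks0={\immediate \openout 0=\jobname .cpy}
\immediate\write\copyfile{\the\toks0}
\immediate\closeout\copyfile
\bye
\end{lcode}
The line written to the \pfile{.cpy} file should come out with a space after
each control word and none elsewhere, which is why Solution 2 is spelled the
way it is.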
I learned from Victor Eijkhout that he had submitted a short article
to TUGboat discussing this very problem, well before I asked it here in
`Around the bend'. He kindly sent me a copy of the article, which
contains a good discussion of the underlying ideas, and a couple of
different solutions. To summarize briefly, he gave a Type 2 solution
similar in length to mine, and also a solution that involved
printing out the source file on PAPER! A `Type 4' solution, in other
words. I'm a little embarrassed that I didn't think of this, given that
the whole idea of TeX is to print things on paper.
%%>>Solution 2 (Victor Eijkhout)
\begin{solution}{Solution 2 (Victor Eijkhout)}\index{Eijkhout, Victor}
Forthcoming in TUGboat. It appeared as: \\
`Self-replicating macros' by Victor Eijkhout and Ron Sommeling, TUGboat
13 (1992) no 1, p. 84.
%%>>EndSolution
\end{solution}
Although I'm giving them all together, as `Solution 3', Peter Schmitt
actually sent in six different variations, including a Type 4 solution.
His first solution, \pfile{log-pl.tex}, is Type 2 like my first solution but
comes in at 38 tokens, significantly shorter. His third solution is
comparable to my second solution but once again significantly shorter
(87 tokens).
\begin{solution}{Solution 3 (Peter Schmitt)}\index{Schmitt, Peter}
%%>>Solution 3 (Peter Schmitt)
The principal structure of the solution is the following:
\begin{lcode}
<initial commands>
\def \run { <additional commands>
\write { <the initial commands>
\def \run
{
<the replacement text extracted from \meaning\run>
}
\run
}
<final commands>
}
\run
\end{lcode}
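The extraction step in this outline relies on the fact that \cmd{\meaning}
applied to a macro yields the characters \verb?macro:->? followed by the
replacement text. A minimal plain TeX sketch of that step follows; it is not
Schmitt's code, and the names \cmd{\demo} and \cmd{\strip} are hypothetical.
Where Schmitt's \cmd{\select} picks up the first token after the prefix as a
second argument and puts it back, this variant simply discards everything up
to and including \verb?:->?. It assumes a macro without parameters and the
plain TeX catcodes for the colon, hyphen and greater-than sign.
\begin{lcode}
\def\demo{\hbox {x}}
% \strip is a hypothetical helper: its argument is delimited by :->,
% so it swallows the `macro:->' prefix and leaves the rest untouched.
\def\strip#1:->{}
\message{[\expandafter\strip\meaning\demo]}
\bye
\end{lcode}
This should print \verb?[\hbox {x}]? on the terminal.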
The following TeX-File \pfile{out-ini.tex} when processed by INITeX
produces a file \pfile{out-ini.out} that is identical to \\
\pfile{out-ini.tex} (case (3) below):
(The file consists of a single line; it is broken up here to make comments
possible. Each occurrence of the comment sign \% has to be removed
together with the rest of the line to produce identical output.)
\begin{lcode}
\catcode `\{1 \catcode `\}2 \catcode `\#6 % these \catcodes are required
\def \run {% a macro to be called at the end of the file
\immediate \openout 1=out-ini.out% % opens output
\def \select ##1:->##2{##2}% an auxiliary macro to extract the replacement text
\immediate \write 1{% write the output file
\catcode `\noexpand \{1 \catcode `\noexpand \}2 \catcode `\noexpand \#6 %
% writes the first `line' of the output
\noexpand \def \noexpand \run % writes \def \run
{\expandafter \select \meaning \run }% writes the replacement text of \run
\noexpand \run }% writes the last `line' of the program
\immediate \closeout 1% close output file
\end }% close input
\run % start the macro
\end{lcode}
Comments:
\begin{enumerate}
\item \cmd{\immediate} prevents a dvi file from being produced.
\item The tex-file can be shortened (fewer characters) by using shorter names,
and perhaps also by using a control symbol for \cmd{\noexpand};
neither possibility reduces the number of tokens.
Maybe some \cmd{\space} tokens can be removed, but most of them are necessary
because they are produced by \cmd{\meaning}.
\begin{itemize}
\item \cmd{\immediate} may be omitted (produces dvi-file)
\item at least with my implementation closing the output file is not
necessary
\end{itemize}
\item The TeX-file can be modified to solve variations of the exercise:
\begin{itemize}
\item If the file is to be processed by plain TeX, the \cmd{\catcode}s need not
be set (see (1) below).
\item if the output file is replaced by standard output or the log file
\cmd{\message} instead of \cmd{\write} can be used (see (1) and (2) below).
Note that in this case macro names and spaces have to be adjusted
so that the line breaks produced do not prevent processing
the file (In the log file line breaks may occur even in control
sequence names!)
I have not (not yet?) been able to solve the exercise using more
pleasant (predetermined) linebreaks.
\item It is possible to produce a log file that is identical to the
input file. But since the log file contains the time of processing
this will be the case only at a specific date and time (see (4) below).
(The time is output before the input file is read. Therefore it is
impossible to change this part of output by the input.)
\item Of course, the above variation can be modified to produce a screen
output identical to the input file.
\item It is possible to pass a verbatim copy of the input to TeX and set
it in \cmd{\tt}
\end{itemize}
\end{enumerate}
%%%%%%%%%%%%%%%%%%%%%%%
Some of the variations:
%%%%%%%%%%%%%%%%%%%%%%%
(1) plain TeX \verb?-->? section of log file or standard output terminal
\begin{lcode}
%%% log-pl.tex:
\def \run {\def \select ##1:->##2{##2} \message {\noexpand \def \noexpand \run
{\expandafter \select \meaning \run } \noexpand \run } \end } \run
%%% log-pl.log
This is TeX, Version 3.1(c)sb34 (preloaded format=plain3sm 91.4.28)
24 NOV 1991 02:15
** &plain log-pl
(log-pl.tex
\def \run {\def \select ##1:->##2{##2} \message {\noexpand \def \noexpand \run
{\expandafter \select \meaning \run } \noexpand \run } \end } \run )
No pages of output.
\end{lcode}
(2) INITeX \verb?-->? section of log file or standard output terminal
\begin{lcode}
%%% log-ini.tex
\catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit
##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \}
=2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter
\selectit \meaning \run }\noexpand \run }\end }\run
%%% log-ini.log
This is TeX, Version 3.1(c)sb34 (INITEX)
24 NOV 1991 02:16
** log-ini.tex
(log-ini.tex
\catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit
##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \}
=2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter
\selectit \meaning \run }\noexpand \run }\end }\run )
No pages of output.
\end{lcode}
(3) INITeX \verb?-->? output file
\begin{lcode}
%%% out-ini.tex (Note: A single line broken at the %'s!)
\catcode `\{1 \catcode `\}2 \catcode `\#6 \def \run {\immediate \openout %
1=out-ini.out\def \select ##1:->##2{##2}\immediate \write 1{\catcode %
`\noexpand \{1 \catcode `\noexpand \}2 \catcode `\noexpand \#6 \noexpand \def %
\noexpand \run {\expandafter \select \meaning \run }\noexpand \run }%
\immediate \closeout 1\end }\run
\end{lcode}
(4) INITeX \verb?-->? log file
\begin{lcode}
%%% flog-ini.tex
This is TeX, Version 3.1(c)sb34 (INITEX)
24 NOV 1991 02:17
** flog-ini.tex
(flog-ini.tex
\catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit
##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \}
=2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter
\selectit \meaning \run }\noexpand \run }\end }\run [0] )
Output written on flog-ini.dvi (1 page, 512 bytes).
%%% flog-ini.log
This is TeX, Version 3.1(c)sb34 (INITEX)
24 NOV 1991 02:18
** flog-ini.tex
(flog-ini.tex
\catcode `\{=1 \catcode `\} =2 \catcode `\#=6 \def \run {\def \selectit
##1:->##2{##2} \message {\catcode `\noexpand \{=1 \catcode `\noexpand \}
=2 \catcode `\noexpand \#=6 \noexpand \def \noexpand \run {\expandafter
\selectit \meaning \run }\noexpand \run }\end }\run [0] )
Output written on flog-ini.dvi (1 page, 512 bytes).
\end{lcode}
(5) INI-TeX \verb?-->? log-file (formatted)
\begin{lcode}
%%% fmt-log.tex
This is TeX, Version 3.1(c)sb34 (INITEX)
30 NOV 1991 13:13
** fmt-log
(fmt-log.tex [0
\catcode `\{=1 \catcode `\}=2
\catcode `\#=6
\def \run
{\newlinechar 1 \lccode `\|=1
\lccode `\[=`\{ \lccode `\]=`\}
\lowercase {
\def \format ##1>##2=1##3]##4[##5]##6]{##2=1|##3]|##4[|##5]|##6]|\+}
\def \+ ]##12]##2]##3]##4]]##5] { ]|##12]|##2]|##3]|##4]]|##5]|}
}
\write 0{\catcode `\noexpand \{=1 \catcode `\noexpand \}=2}
\write 0{\catcode `\noexpand \#=6}
\write 0{\noexpand \def \noexpand \run }
\write 0{{\expandafter \format \meaning \run }}
\write 0{\noexpand \run }
\end }
\run
] )
Output written on fmt-log.dvi (1 page, 512 bytes).
\end{lcode}
(6) INITeX \verb?-->? dvi-file
\begin{lcode}
%%% dvi-ini.tex
\catcode`\% = 13
\catcode`\{ = 1 \catcode `\} = 2
\catcode`\# = 6 \catcode `\| = 13
\catcode`\% = 13
\def \run {
\lccode `\[=`\{ \lccode `\]=`\} \lccode `\/=`\% \let % = \par %%
\font\tt=cmtt10 \tt %
\hsize 15cm \vsize 15cm \parskip 3pt \def |{\par \hskip .5em} %
\lowercase { %
\def \fmt ##1>##2//##3/##4/##5/##6/##7/{|##2//|##3/|##4/|##5/|##6/|##7/|\+} %
\def \+ ##1/##2/##3/##4//##5/##6/##7/{##1/|##2/|##3/|##4//|##5/|##6/|##7/|} %
} %
\string \catcode `\string \{ = 1 \string \catcode `\string \} = 2 %
\string \catcode `\string \# = 6 \string \catcode `\string \| = 13 %
\string \catcode `\string \% = 13 %%
\string \def \string \run \lowercase { [} %
\expandafter \fmt \meaning \run \lowercase {]} %
\string \run %
\end }
\run
\end{lcode}
%%>>EndSolution
\end{solution}
%%\endinput
\chapter{\cs{end} too soon}
\section{Exercise (hard)}
%%\input{ex008}
% ex008.tex
\begin{comment}
Date: 21 Jun 1993 09:49:27 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #8
To:
[email protected]
\end{comment}
\ed{\oposted{1993/06/21}. \arch{exercise.008}.}
A few readers of info-tex and comp.text.tex may recall some postings
of mine under the name of `Around the Bend' more than a year ago. This
was intended to be a regular quasi-monthly stream of challenging
questions about TeX macro writing, but after a few appearances it fell
into limbo because of too many other demands on my time. However I
continue to encounter hard, interesting problems in my work so
herewith wish to announce resumption of the `Around the Bend' postings
on an occasional, slightly less ambitious basis.
For background, here are a couple of excerpts from the first `Around
the Bend' post:
\begin{quote}
With the encouragement of George Greenwade (the INFO-TeX list owner), I
would like to propose a regular department for INFO-TeX, called `Around
the bend'. It will consist of macro-writing challenges on the level of
the dangerous-bend exercises in the \emph{TeXbook}, with interested parties
invited to collaborate and/or compete to find the best solution. My
motivation for doing this is partly selfish: to get more feedback from
other macro writers about some of the interesting macro-writing
problems that I run into.
\ldots
Solutions should be sent to me instead of to INFO-TeX or
comp.text.tex, on the premise that people usually won't want to read
others' solutions until they've had a chance to try their own hand. A
summary of the results would then be posted to the INFO-TeX list after
two or three weeks; to those who submit solutions before the deadline,
I could forward without delay solutions submitted by other people, for
comparison.
\end{quote}
And here's number 8.
%%***********************************************************************
%%*** Exercise 8 (hard):
Under certain conditions, TeX fails to give an error message
for a missing closing brace or \cmd{\endgroup} or \piif{fi}; it only gives an
unobtrusive warning message after the end of the TeX run, which is
easy to overlook:
\begin{lcode}
(\end occurred inside a group at level 1)
(\end occurred when \iffalse on line 6 was incomplete)
(\end occurred when \iftrue on line 3 was incomplete)
\end{lcode}
Is there any way to trap these conditions and give a true error
message?---if, let's say, you are programming for a major macro
package like LaTeX and want to make sure these conditions are brought
to the user's attention.
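
For experimenting, the following three-line plain TeX file is one way to
provoke two of these warnings. It is a sketch of mine rather than part of the
original posting, and the exact wording of the messages may differ between
TeX engines and versions.
\begin{lcode}
\begingroup % never closed
\iftrue     % never closed
\end
\end{lcode}
With Knuth's TeX the run ends without an error, and the messages about the
group and the incomplete \verb?\iftrue? appear only in the closing lines of
the terminal output and the log.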
%%%***********************************************************************
\begin{description}
\item[Remark] Off-hand one would think that trapping these conditions is
impossible, since otherwise Knuth\index{Knuth, Donald}
would presumably have built the
trapping into TeX; \piif{iffalse} \ldots \cmd{\end} generates an error message,
and it's
only \piif{iffalse} \ldots \piif{else} \ldots \cmd{\end} or \piif{iftrue} \ldots
\cmd{\end} that leave TeX
mumbling instead of shrieking. But in some cursory experiments, I
found a not-too-bad solution for the missing end of group condition.
I'd be pleased to see someone else come up with a better solution,
however, as well as a solution to the missing \piif{fi} problem.
\end{description}
\begin{comment}
Send answers to:
Michael Downes
[email protected] (Internet)
A summary will be posted circa July 12, 1993.
\end{comment}
%%\endinput
\section{Answers}
%%\input{ans008}
% ans008.tex
\begin{comment}
[The addendum at bottom was not posted with the answer but added in my
archives later ---mjd]
Date: 22 Jul 1993 15:54:57 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #8 answers
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
Exercise 8 asked for a way to trap missing }, \endgroup, or \fi at the
end of a [La]TeX document, in order to give error messages instead
of the warning messages issued by TeX:
(\end occurred inside a group at level 1)
(\end occurred when \iffalse on line 6 was incomplete)
\end{comment}
\ed{\oposted{1993/07/22}. \arch{answer.008}.}
This review of solutions is posted later than expected because I
needed time to try out and understand solutions submitted by Peter
Schmitt last week. For clarity's sake, I have split the solutions
into two parts, one dealing with groups, the other with conditionals.
\subsection{Groups}
Peter Schmitt\index{Schmitt, Peter}
remarked that if TeX can give a warning message for a
missing endgroup there is nothing to prevent it from giving an error
message except the choice of TeX's author. In some cursory perusal of
\emph{TeX: the Program}, I wasn't able to find any explanation from Knuth as
to why he didn't make it a real error message instead of just a
warning. Perhaps someone else can shed some light here?
Now for solutions. The first one was submitted by Peter Schmitt. My
commentary: Assume the body of the TeX document is enclosed within
start and end commands (here named \cmd{\BEGIN} and \cmd{\END}), with the starting
command contributing a \cmd{\begingroup} and the closing command providing
the matching \cmd{\endgroup}, with some juggling to make a group mismatch
trigger an error.
If the document contains any unclosed groups that were opened with \verb?{?
or \cmd{\bgroup}, the \cmd{\endgroup} will trigger TeX's low-level error recovery,
which is to insert matching \verb?}?s ({\ttfamily `Missing \verb?}? inserted'}).
Thus only the
case of an unmatched \cmd{\begingroup} needs to be handled. Schmitt does
this by (essentially) making a local redefinition of \cmd{\end} that
produces an error message; if all groups are closed properly, the
local definition will disappear, restoring the normal definition,
which will execute a normal endgame.
Here now is Schmitt's submitted solution. I have simplified it slightly
by disentangling some other stuff that will be discussed later below.
\begin{solution}{Solution 1 (Peter Schmitt)}\index{Schmitt, Peter}
%>>Solution 1 (Peter Schmitt)
\begin{lcode}
\catcode`_11
\let\standard_end\end % save original meaning of end
% define modified end
\def\unexpected_end{%
{\errorcontextlines=0 % minimize errormessage
\errmessage{Unexpected \string\END\space inside group}% errormessage
}\standard_end % continue with \standard_end
}
\let\End\standard_end
\def\END{\endgroup\End}
\def\BEGIN{\begingroup
\let\End\unexpected_end}
\BEGIN
%%% some tests:
% \bgroup\egroup\end % balanced
\begingroup\end \endgroup % unbalanced
% \bgroup\end % unbalanced
% { \end % unbalanced
% } \begingroup \end % this is reported
% \endgroup \begingroup \end % this is not reported
\end{lcode}
%>>EndSolution
\end{solution}
\begin{solution}{Solution 2 (mine)}
%%>>Solution 2 (mine)
This solution uses a rather dirty trick with \cmd{\batchmode}.
Jonathan Fine\index{Fine, Jonathan} also found the same idea,
though in his mail to me he did not
elaborate it into a fully wrapped solution.
Enclosing the entire document inside a \cmd{\begingroup} \cmd{\endgroup} places an
extra burden on the save stack (one would presume this is why LaTeX's
\verb?\begin{document}? and \verb?\end{document}? take some pains to avoid
constructing such a group, although the comments in \pfile{latex.tex} don't
provide an explicit reason). (Extra credit question: Just how much of
a burden would it place on the save stack in, say, an average LaTeX
document?) So my solution seeks to trap unmatched \verb?{? or \cmd{\begingroup}
without enclosing the document body in a group. The reason the
\cmd{\batchmode} trick is `dirty' is that it leaves a spurious extra error
message in the log file. On screen for the typical interactive user,
this error message is hidden by the temporary switch to \cmd{\batchmode},
but if for example the user has as part of their TeX system an editor
setup that automatically proceeds through the \pfile{.log} file to help the
user take care of all error messages, then the spurious error message
will be somewhat inconvenient.
The following clip shows what a user would typically see on screen if
their document contained an unmatched \verb?{?.
\begin{lcode}
! Missing } added.
\bgrouperr ...ffalse {\fi \string } added}
\enddocument ...rgroup \bgrouperr \egroup
\if \errorstopping \batchmo...
l.50 \enddocument
? h
There appears to be an unmatched opening brace or \bgroup somewhere
in your document.
?
)
No pages of output.
\end{lcode}
Here then is the code for the solution. As it stands, only the most
recent unmatched open-group is dealt with in the error message. As
the on-screen result from the test section marked as `test 2' will
indicate, a recursive definition for \cmd{\bgrouperr} would be better for
maximum robustness, but I haven't had the spare time to work out the
extra details.
\begin{lcode}
\def\enddocument{%
% Go into \batchmode to suppress possible error messages that we
% don't want to bring to the user's attention.
\batchmode
% Set a flag to enable us to handle the \endgroup properly if the
% \egroup pairs up with an unmatched { or \bgroup.
\def\errorstopping{TF}%
% If the following \egroup matches with a preceding unmatched { or
% \bgroup in the user document, then the aftergroup tokens
% \errorstopmode \bgrouperr will be executed. Otherwise they will
% go away into uncharted limbo.
\aftergroup\errorstopmode\aftergroup\bgrouperr
\egroup
% If there was no unmatched { or \bgroup, then the preceding
% \egroup was discarded by TeX. And \errorstopping is still false.
% Otherwise we need to insert some new \aftergroup tokens.
\if\errorstopping
\batchmode \aftergroup\errorstopmode \aftergroup\begingrouperr
\else
\global\let\bgrouperr\begingrouperr
\fi
\endgroup
\errorstopmode
% Call two different versions of \end, just for convenient testing
% with either plain TeX or LaTeX.
\csname\string @\string @end\endcsname
\end}
\def\bgrouperr{%
\def\errorstopping{TT}%
\errhelp{%
There appears to be an unmatched opening brace or \bgroup somewhere^^J%
in your document.}%
\errmessage{Missing \iffalse{\fi\string} added}}
\def\begingrouperr{%
\errhelp{%
There appears to be an unmatched \begingroup somewhere in
your document.}%
\errmessage{Missing \noexpand\endgroup added}}
\newlinechar=`\^^J
% % Test 0: Leave the following three lines commented out.
%{ % Test 1: uncomment this line
%\bgroup % Test 2: uncomment the previous line and this one.
%\begingroup % Test 3: uncomment all three lines.
\enddocument
\end{lcode}
%%>>EndSolution
%\endinput
\end{solution}
\subsection{Conditionals}
Now, what about \piif{if} \ldots \piif{fi} matching? Can a method analogous to
the one
for groups be applied here? Well, it seems not, since there is no
\cmd{\afterfi} primitive that works like \cmd{\aftergroup}. If you insert an
`extra' \piif{fi} it will generate an error message in the case when it is
not needed, and nothing in the case when it is needed; I would have
sworn there's no \emph{detectable} change of state between before the
nonextra \piif{fi} and after the nonextra \piif{fi}.
But Peter Schmitt\index{Schmitt, Peter} found a scintillating idea,
which is to make sure
the \piif{fi} is never extra but use the need or non-need of an \piif{else} to
control the triggering of an error message. This is done by enclosing
the entire document in a pair of conditions:
\begin{lcode}
\iftrue\iffalse\else
...
\fi...\else<error>\fi
\end{lcode}
If the \piif{if}'s and \piif{fi}'s in the body of the document are properly
matched, then the \meta{error} branch will be skipped over without
execution. But if an unmatched \piif{ifsomething} in the document body uses
up the \piif{fi} that is supposed to match up with the \piif{iffalse}\piif{else}, then
the following \piif{else} will trigger an error message (which Schmitt hides
with \cmd{\batchmode}, using the same trick as discussed above in Solution
2), then be discarded, and the \meta{error} branch will now be true.
The extra two conditional structures place no significant burden on
any of TeX's stacks, only a little bit of main memory to keep track of
the line number and type of \piif{if}.
Peter had the group and conditional trapping combined in his original
solution; here is the conditional trapping part as I disentangled it.
\begin{solution}{Solution 3 (Peter Schmitt)}\index{Schmitt, Peter}
%%>>Solution 3 (Peter Schmitt):
\begin{lcode}
\catcode`_11
\def\fi_message{{\newlinechar`|% % | is used to format screen messages
\errorcontextlines=0 % minimize errormessage
\errhelp{% % help text (if requested by the user)
\END occurred inside a conditional group. |%
You probably have forgotten to close some \fi before.
}%
\errmessage{Unexpected \string\END\space inside condition}% errormessage
}}
\def\BEGIN{\def\END{\fi\batchmode\else\errorstopmode\fi_message\fi
\errorstopmode\end}%
\iftrue\iffalse\else}
\BEGIN
%%% some tests:
% \iftrue \fi \END % balanced
\iftrue \END \fi % error message
% \iffalse \else \END \fi % error message
% \iftrue \iffalse \else \END \fi \fi % warning only
% \iftrue \iffalse \else \fi \END \fi % error message
% \iffalse \else \iffalse \else \END \fi \fi % error message
% \iffalse \else \iffalse \else \END \fi \fi % error message
\end{lcode}
%%>>EndSolution
\end{solution}
In closing, I want to point out that missing \piif{fi}'s or \cmd{\endgroup}'s are
more likely to arise from a TeX programmer's error than from ordinary
use of a macro package like LaTeX. So it might be minimally sufficient
to trap only the missing \verb?}? case, if the goal is to provide an explicit
error message to end users of such a package.
%%Michael Downes
PS. Hint for Exercise 10: Run the body of the posting through plain TeX.
\begin{lcode}
ASCII 32--64,65--126:
!"#$%&'()*+,-./0123456789:;<=>?@
ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{lcode}
\subsection{Addendum}
I found this in \texttt{comp.text.tex}. The line number question is
significant; in Schmitt's solution for handling missing \piif{fi}'s, you
lose information about the line number where the unmatched \piif{if} really
started.
\begin{comment}
Archive-Date: Wed, 04 Aug 1993 13:30:24 CST
Sender:
[email protected]
From:
[email protected] (Prabhav Morje)
Reply-To:
[email protected] (Prabhav Morje)
Subject: "end occurs inside a group" error in LaTeX
Date: 3 Aug 1993 22:36:30 -0400
Message-ID: <
[email protected]>
\end{comment}
\begin{lcode}
Archive-Date: Wed, 04 Aug 1993 13:30:24 CST
Sender: [email protected]
From: [email protected] (Prabhav Morje)
Subject: "end occurs inside a group" error in LaTeX
Date: 3 Aug 1993 22:36:30 -0400
To: [email protected]
Hi,
I sometimes get the error "\end occured while inside a group
on level 1" while running LaTeX. I know it means there is an extra
"{" somewhere. It is harmless sometimes but if I want to correct it,
LaTeX never tells where the extra "{" is. Is it possible to find the
line number or something more about location of the error?
Any pointers will be greatly appreciated.
- Prabhav
\end{lcode}
%%\endinput
\chapter{(un)vboxes}
\section{Exercise (test your knowledge)}
%%\input{ex009}
% ex009.tex
\begin{comment}
Date: 28 Jun 1993 14:57:21 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #9
To:
[email protected]
\end{comment}
\ed{\oposted{1993/06/28}. \arch{exercise.009}.}
Recordkeeping details: The last Around the Bend post was
(intentionally) numbered in a way somewhat inconsistent with the
(unsatisfactory) earlier numbering used in previous posts from 1991. I
didn't draw attention to the change since I figured `who cares?' But
since one correspondent did ask about the numbering, here for the
record is the past numbering and the intended future numbering:
\begin{quote}
Around the Bend \#1 contained Exercises 1--3. \\
Around the Bend \#2 contained Exercises 4--7. \\
Around the Bend \#8 contained Exercise 8. \\
Around the Bend \#9 contains Exercise 9. \\
Around the Bend \#10 will contain Exercise 10. \\
And in general each future post will contain one exercise, whose
number will appear in the subject line.
\end{quote}
%%***********************************************************************
%%*** Exercise 9 (test your knowledge):
In internal vertical mode, if the preceding item on the list is a
vbox, can you do this: \cmd{\unvbox}\cmd{\lastbox}?
%%***********************************************************************
\begin{comment}
An answer will be posted circa July 6, 1993.
Michael Downes
[email protected] (Internet)
\end{comment}
%%\endinput
\section{Answers}
%%\input{ans009}
% ans009.tex
\begin{comment}
Date: 07 Jul 1993 12:45:34 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #9, answer
Sender:
[email protected]
To:
[email protected]
Reply-to: Michael Downes <
[email protected]>
Message-id: <
[email protected]>
X-ListName: TeX-Related Network Discussion List <
[email protected]>
"In internal vertical mode, if the preceding item on the list is a
"vbox, can you do this: \unvbox\lastbox?
\end{comment}
\ed{\oposted{1993/07/07}. \arch{answer.009}.}
The answer is no. If you try it, you will see the error
message:
\begin{lcode}
! Missing number, treated as zero.
<to be read again>
\lastbox
l.3 \unvbox\lastbox
? h
A number should have been here; I inserted `0'.
(If you can't figure out why I needed to see a number,
look up `weird error' in the index to The TeXbook.)
\end{lcode}
\cmd{\lastbox} does not return a box register number, which is what \cmd{\unvbox}
requires; instead, \cmd{\lastbox} returns a \meta{box} object in the sense of the
\emph{TeXbook}, chapter~24, p.~278. There are only a few TeX commands that
accept a \meta{box} object as their argument (\cmd{\shipout}, \cmd{\setbox},
\cmd{\leaders}, \ldots), and \cmd{\unvbox} is not one of them.
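\ed{A minimal sketch, not part of the original post.}
One way around the error is to route the box through a register, since
\cmd{\setbox} does accept a \meta{box}. The register numbers 0 and 2 and
the box contents below are arbitrary choices made for illustration,
assuming plain TeX:
\begin{lcode}
\setbox0=\vbox{%
  \vbox{\hbox{A}\hbox{B}}% the preceding item on the list is a vbox
  \setbox2=\lastbox      % \setbox accepts the <box> that \lastbox returns
  \unvbox2               % \unvbox now gets the register number it wants
}
\box0
\bye
\end{lcode}
The two hboxes end up directly in the outer vbox, which is the effect
that the rejected construction was presumably meant to have.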
%%\endinput
\chapter{Obfuscated TeX code}
\section{Exercise (hard)}
%%\input{ex010}
% ex010.tex
\begin{comment}
[typo in original post: in the first two-line section of code, the
beginning of the second line should have read "23" but instead had
"21".]
Date: 07 Jul 1993 16:11:31 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #10
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/07/07}. \arch{exercise.010}.}
\begin{lcode}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\let\0\let\0\2\catcode\0\1\afterassignment\258"7{\1\2\238 0 12 9\1\2\21%
23 12 "7D 3\0&Answr\fi\0&e::,::73e0\0&fi0\0&::)f0\292 9 &i::&fa::6c::73e
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{lcode}
%%%************************************************************************
%%%*** Exercise 10 (hard):
(a) Obfuscated TeX code puzzle. Decipher the purpose of the lines above
and below.

(b) Why colon?
%%%************************************************************************
%%%Send answers to:
\begin{lcode}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
&Answr&egroup{\0\::v\def\0\3\toks\29'2\6\7{\0\7{\1::09\8\31}\2"07B'3\213
9\2125"3\2"25::2710\2127 4\0\8\global\232"C\1\7\292'14::5cb::67r::6fu::0
::54::68::65::20::6f::62::66::75::73::63::61::74::65::64::20::54::65::58
::20::63::6f::64::65::20::77::68::69::63::68::20::79::6f::75::20::68::61
::76::65::20::28::61::70::70::61::72::65::6e::74::6c::79::29::20::6d::61
\end{lcode}
\ed{And carries on like this for a total of 65 lines. All 65 lines are
in the archived version if you need them. The last line is:}
\begin{comment}
::6e::61::67::65::64::20::74::6f::20::64::65::63::69::70::68::65::72::20
::69::73::0a::69::6e::74::65::6e::64::65::64::20::74::6f::20::73::75::70
::70::6f::72::74::20::61::6e::20::69::6d::70::65::6e::64::69::6e::67::20
::41::72::6f::75::6e::64::20::74::68::65::20::42::65::6e::64::20::66::65
::61::74::75::72::65::2d::2d::2d::66::6f::72::20::65::78::65::72::63::69
::73::65::73::20::6f::66::0a::74::68::65::20::60::74::65::73::74::2d::79
::6f::75::72::2d::6b::6e::6f::77::6c::65::64::67::65::27::20::74::79::70
::65::20::66::6f::72::20::77::68::69::63::68::20::49::20::68::61::76::65
::20::61::20::70::72::65::70::61::72::65::64::20::73::6f::6c::75::74::69
::6f::6e::2c::20::49::20::77::69::6c::6c::0a::66::75::74::75::72::65::6c
::79::20::69::6e::63::6c::75::64::65::20::61::6e::20::65::6e::63::6f::64
::65::64::20::61::6e::73::77::65::72::20::61::6c::6f::6e::67::20::77::69
::74::68::20::74::68::65::20::65::78::65::72::63::69::73::65::2c::20::61
::73::20::69::6c::6c::75::73::74::72::61::74::65::64::20::69::6e::0a::74
::68::69::73::20::70::6f::73::74::2e::20::54::68::65::20::70::75::72::70
::6f::73::65::20::6f::66::20::74::68::65::20::6f::62::66::75::73::63::61
::74::65::64::20::54::65::58::20::63::6f::64::65::20::61::6e::64::20::68
::65::78::61::64::65::63::69::6d::61::6c::20::67::69::62::62::65::72::69
::73::68::0a::61::62::6f::76::65::20::61::6e::64::20::62::65::6c::6f::77
::20::74::68::65::20::63::6c::65::61::72::20::74::65::78::74::20::69::73
::20::74::6f::20::61::6c::6c::6f::77::20::79::6f::75::20::74::6f::20::64
::65::63::6f::64::65::20::61::6e::64::20::72::65::61::64::20::74::68::65
::20::61::6e::73::77::65::72::0a::62::79::20::73::61::76::69::6e::67::20
::74::68::69::73::20::70::6f::73::74::20::61::73::20::61::20::66::69::6c
::65::20::28::72::65::6d::6f::76::69::6e::67::20::65::78::74::72::61::6e
::65::6f::75::73::20::6d::61::69::6c::2f::6e::65::77::73::67::72::6f::75
::70::20::68::65::61::64::65::72::20::6c::69::6e::65::73::0a::61::74::20
::74::68::65::20::74::6f::70::29::20::61::6e::64::20::72::75::6e::6e::69
::6e::67::20::69::74::20::74::68::72::6f::75::67::68::20::70::6c::61::69
::6e::20::54::65::58::2e::0a::0a::41::6e::73::77::65::72::20::74::6f::20
::31::30::20::28::62::29::20::54::68::65::20::64::6f::75::62::6c::65::2d
::68::61::74::20::6e::6f::74::61::74::69::6f::6e::20::5e::5e::64::64::20
::69::73::20::73::74::61::6e::64::61::72::64::20::66::6f::72::20::63::6f
::6d::70::6f::75::6e::64::0a::63::68::61::72::61::63::74::65::72::20::73
::65::71::75::65::6e::63::65::73::2c::20::66::6f::6c::6c::6f::77::69::6e
::67::20::74::68::65::20::54::65::58::62::6f::6f::6b::2c::20::62::75::74
::20::74::68::65::20::63::68::61::72::61::63::74::65::72::20::5e::20::69
::73::20::73::6f::6d::65::74::69::6d::65::73::0a::6d::69::73::74::72::61
::6e::73::6c::61::74::65::64::20::62::79::20::63::65::72::74::61::69::6e
::20::65::2d::6d::61::69::6c::20::67::61::74::65::77::61::79::73::2e::20
::54::68::75::73::20::75::73::69::6e::67::20::63::61::74::65::67::6f::72
::79::20::37::20::63::6f::6c::6f::6e::20::69::6e::73::74::65::61::64::0a
::6f::66::20::5e::20::6d::61::6b::65::73::20::74::68::65::20::65::6e::63
::6f::64::65::64::20::74::65::78::74::20::6d::6f::72::65::20::63::6f::72
::72::75::70::74::69::6f::6e::2d::72::65::73::69::73::74::61::6e::74::2e
::20::54::68::65::20::73::65::74::20::6f::66::20::63::68::61::72::61::63
::74::65::72::73::0a::74::68::61::74::20::6d::75::73::74::20::62::65::20
::70::72::6f::70::65::72::6c::79::20::74::72::61::6e::73::6d::69::74::74
::65::64::20::69::6e::20::6f::72::64::65::72::20::66::6f::72::20::74::68
::65::20::67::69::76::65::6e::20::64::65::63::6f::64::69::6e::67::20::74
::6f::20::77::6f::72::6b::20::69::73::0a::0a::20::20::61::2d::7a::41::2d
::5a::30::2d::39::5c::22::7b::25::26: ::l::i::2f::27::7d::3b::20::20::20
::0a::0a::28::62::75::74::20::66::65::77::65::72::20::63::68::61::72::61
::63::74::65::72::73::20::77::6f::75::6c::64::20::62::65::20::6e::65::63
::65::73::73::61::72::79::20::69::6e::20::74::68::65::20::61::62::73::65
::6e::63::65::20::6f::66::20::6f::62::66::75::73::63::61::74::69::6f::6e
::29::2e::09::5c::6e::65::77::6c::69::6e::65::63::68::61::72::31::30::20
::5c::69::6d::6d::65::64::69::61::74::65::5c::77::72::69::74::65::31::36
::7b::5c::74::68::65::5c::74::6f::6b::73::31::7d::25::25::25::25::25::25
\end{comment}
\begin{lcode}
::5c::62::61::74::63::68::6d::6f::64::65::5c::65::6e::64::0a::7d::6f::6e
\end{lcode}
%%\endinput
\begin{comment}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
&Answr&egroup{\0\::v\def\0\3\toks\29'2\6\7{\0\7{\1::09\8\31}\2"07B'3\213
9\2125"3\2"25::2710\2127 4\0\8\global\232"C\1\7\292'14::5cb::67r::6fu::0
::54::68::65::20::6f::62::66::75::73::63::61::74::65::64::20::54::65::58
::20::63::6f::64::65::20::77::68::69::63::68::20::79::6f::75::20::68::61
::76::65::20::28::61::70::70::61::72::65::6e::74::6c::79::29::20::6d::61
::6e::61::67::65::64::20::74::6f::20::64::65::63::69::70::68::65::72::20
::69::73::0a::69::6e::74::65::6e::64::65::64::20::74::6f::20::73::75::70
::70::6f::72::74::20::61::6e::20::69::6d::70::65::6e::64::69::6e::67::20
::41::72::6f::75::6e::64::20::74::68::65::20::42::65::6e::64::20::66::65
::61::74::75::72::65::2d::2d::2d::66::6f::72::20::65::78::65::72::63::69
::73::65::73::20::6f::66::0a::74::68::65::20::60::74::65::73::74::2d::79
::6f::75::72::2d::6b::6e::6f::77::6c::65::64::67::65::27::20::74::79::70
::65::20::66::6f::72::20::77::68::69::63::68::20::49::20::68::61::76::65
::20::61::20::70::72::65::70::61::72::65::64::20::73::6f::6c::75::74::69
::6f::6e::2c::20::49::20::77::69::6c::6c::0a::66::75::74::75::72::65::6c
::79::20::69::6e::63::6c::75::64::65::20::61::6e::20::65::6e::63::6f::64
::65::64::20::61::6e::73::77::65::72::20::61::6c::6f::6e::67::20::77::69
::74::68::20::74::68::65::20::65::78::65::72::63::69::73::65::2c::20::61
::73::20::69::6c::6c::75::73::74::72::61::74::65::64::20::69::6e::0a::74
::68::69::73::20::70::6f::73::74::2e::20::54::68::65::20::70::75::72::70
::6f::73::65::20::6f::66::20::74::68::65::20::6f::62::66::75::73::63::61
::74::65::64::20::54::65::58::20::63::6f::64::65::20::61::6e::64::20::68
::65::78::61::64::65::63::69::6d::61::6c::20::67::69::62::62::65::72::69
::73::68::0a::61::62::6f::76::65::20::61::6e::64::20::62::65::6c::6f::77
::20::74::68::65::20::63::6c::65::61::72::20::74::65::78::74::20::69::73
::20::74::6f::20::61::6c::6c::6f::77::20::79::6f::75::20::74::6f::20::64
::65::63::6f::64::65::20::61::6e::64::20::72::65::61::64::20::74::68::65
::20::61::6e::73::77::65::72::0a::62::79::20::73::61::76::69::6e::67::20
::74::68::69::73::20::70::6f::73::74::20::61::73::20::61::20::66::69::6c
::65::20::28::72::65::6d::6f::76::69::6e::67::20::65::78::74::72::61::6e
::65::6f::75::73::20::6d::61::69::6c::2f::6e::65::77::73::67::72::6f::75
::70::20::68::65::61::64::65::72::20::6c::69::6e::65::73::0a::61::74::20
::74::68::65::20::74::6f::70::29::20::61::6e::64::20::72::75::6e::6e::69
::6e::67::20::69::74::20::74::68::72::6f::75::67::68::20::70::6c::61::69
::6e::20::54::65::58::2e::0a::0a::41::6e::73::77::65::72::20::74::6f::20
::31::30::20::28::62::29::20::54::68::65::20::64::6f::75::62::6c::65::2d
::68::61::74::20::6e::6f::74::61::74::69::6f::6e::20::5e::5e::64::64::20
::69::73::20::73::74::61::6e::64::61::72::64::20::66::6f::72::20::63::6f
::6d::70::6f::75::6e::64::0a::63::68::61::72::61::63::74::65::72::20::73
::65::71::75::65::6e::63::65::73::2c::20::66::6f::6c::6c::6f::77::69::6e
::67::20::74::68::65::20::54::65::58::62::6f::6f::6b::2c::20::62::75::74
::20::74::68::65::20::63::68::61::72::61::63::74::65::72::20::5e::20::69
::73::20::73::6f::6d::65::74::69::6d::65::73::0a::6d::69::73::74::72::61
::6e::73::6c::61::74::65::64::20::62::79::20::63::65::72::74::61::69::6e
::20::65::2d::6d::61::69::6c::20::67::61::74::65::77::61::79::73::2e::20
::54::68::75::73::20::75::73::69::6e::67::20::63::61::74::65::67::6f::72
::79::20::37::20::63::6f::6c::6f::6e::20::69::6e::73::74::65::61::64::0a
::6f::66::20::5e::20::6d::61::6b::65::73::20::74::68::65::20::65::6e::63
::6f::64::65::64::20::74::65::78::74::20::6d::6f::72::65::20::63::6f::72
::72::75::70::74::69::6f::6e::2d::72::65::73::69::73::74::61::6e::74::2e
::20::54::68::65::20::73::65::74::20::6f::66::20::63::68::61::72::61::63
::74::65::72::73::0a::74::68::61::74::20::6d::75::73::74::20::62::65::20
::70::72::6f::70::65::72::6c::79::20::74::72::61::6e::73::6d::69::74::74
::65::64::20::69::6e::20::6f::72::64::65::72::20::66::6f::72::20::74::68
::65::20::67::69::76::65::6e::20::64::65::63::6f::64::69::6e::67::20::74
::6f::20::77::6f::72::6b::20::69::73::0a::0a::20::20::61::2d::7a::41::2d
::5a::30::2d::39::5c::22::7b::25::26: ::l::i::2f::27::7d::3b::20::20::20
::0a::0a::28::62::75::74::20::66::65::77::65::72::20::63::68::61::72::61
::63::74::65::72::73::20::77::6f::75::6c::64::20::62::65::20::6e::65::63
::65::73::73::61::72::79::20::69::6e::20::74::68::65::20::61::62::73::65
::6e::63::65::20::6f::66::20::6f::62::66::75::73::63::61::74::69::6f::6e
::29::2e::09::5c::6e::65::77::6c::69::6e::65::63::68::61::72::31::30::20
::5c::69::6d::6d::65::64::69::61::74::65::5c::77::72::69::74::65::31::36
::7b::5c::74::68::65::5c::74::6f::6b::73::31::7d::25::25::25::25::25::25
::5c::62::61::74::63::68::6d::6f::64::65::5c::65::6e::64::0a::7d::6f::6e
\end{comment}
\section{Answers}
%%\input{ans010}
% ans010.tex
\begin{comment}
Date: 13 Sep 1993 16:28:51 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #10, answer
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/09/13}. \arch{answer.010}.}
Answer to 10(a). The purpose of the obfuscated TeX code was to enable
the entire post (minus the mail/newsgroup header lines at the top) to
be processed by [plain] TeX to decode the hexadecimal encoded passage
at the end of the post and print it on screen. The contents of that
passage were simply the answers to 10(a) and 10(b). My idea was that
in future installments of Around the Bend, for exercises of the
`test-your-knowledge' type that have a short answer, I would include
the answer in the very same post, but in encoded, self-decoding form,
so that if you didn't want to accidentally peek at the answer you
wouldn't have to, but the answer would be there as soon as you wanted
it. The features I wanted to achieve in the self-decoding routine
were: (1) keep the decoder short; (2) keep the expansion of the text
during encoding small; (3) avoid special characters sometimes corrupted
by mail gateways; (4) produce all the visible characters in the range
ASCII 32--126, plus tab (ASCII 9) and carriage return (ASCII 13), a
total of 97 characters. I succeeded pretty well with (4) and (1), as
the decoder handled all the desired characters and its total length
was four lines (white lie); I failed rather dismally with (2), as the
text was bloated fourfold by the hexadecimal encoding, which spends four
characters (\verb?::hh?) on each character of the original. The answer
to 10(b) lies in (3).

Answer to 10(b): The only reason for using the colon instead of the hat
character was to slightly reduce the chances of corruption of the text
during network travel.
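\ed{A minimal sketch, not part of the original post.}
Any character of category 7 can introduce the doubled-character
notation, not only the hat, which is what makes the substitution
possible. Assuming plain TeX (version 3 or later), the following lines
should show \texttt{TeX} on the terminal, 54, 65 and 58 being the
hexadecimal codes of T, e and X:
\begin{lcode}
\catcode`\:=7 % colon now has the same category as ^
\message{::54::65::58}
\bye
\end{lcode}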
Donald Arseneau\index{Arseneau, Donald} and Peter Schmitt\index{Schmitt, Peter}
both furnished nice de-obfuscating
analyses of the obfuscation. Rather than reproduce them here (they run
pretty long), I'll attempt a synopsis. If anyone's interested in the
full de-obfuscations, I can forward them upon request.
Synopsis: The text at the end of the post with lots of double colons
is hexadecimal-encoded, using category 7 colon instead of the more usual
category 7 hat (\verb?^?) for TeX's special character notation. The goals are:
\begin{enumerate}
\item Skip over the clear text part at the top of the post.
\item Take the encoded text at the bottom of the post and write it on
screen.
\end{enumerate}
Since the clear text part could, in general, include arbitrary TeX
code, we skip over it with \piif{iffalse} \ldots \piif{fi} and do some disabling of
backslash, \verb?^^L?, and certain other things. (The closing \piif{fi} is written
with an alternate escape character, \verb?&?, instead of backslash, and a
more unusual name, \verb?&Answr?, is substituted, for reasons too complicated
to go into here.)
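\ed{A minimal sketch, not the code of the original post.}
The skipping idea can be tried out in a few lines of plain TeX. The
sample text and the restoring of the catcodes afterwards are
assumptions made for illustration, and the closing conditional is
simply spelled \verb?&fi? here:
\begin{lcode}
\catcode`\&=0 \catcode`\\=12 &iffalse
Clear text with \fi, \else and \undefined{stuff is skipped,
because backslash no longer starts a control sequence.
&fi
&catcode`&\=0 &catcode`&&=4
\message{back to normal}
\bye
\end{lcode}
With backslash in category 12, the \verb?\fi? and \verb?\else? in the clear text
are only strings of characters, so they cannot end the skipping early.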
Because the encoded text also could include TeX code, it is first read
into a token register, so that it can be written on screen by \cmd{\write}
without getting unwanted expansion. Catcodes of a few special
characters \verb?\ { } % ~? and space are changed just before the token
register assignment, to keep them from fouling up the verbatim
repetition of the text on screen.
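\ed{A minimal sketch, not part of the original post.}
The protective effect of the token register is easy to observe in
plain TeX. Register 0 and the sample text are arbitrary choices. The
\cmd{\write} should show the text as typed, even though \verb?\undefined?
has no definition and \verb?\end? would otherwise finish the run:
\begin{lcode}
\toks0={Neither \undefined nor \end does any harm here.}
\immediate\write16{\the\toks0}
\bye
\end{lcode}
The tokens delivered by \verb?\the\toks0? are not expanded any further
inside the \cmd{\write}, which is the property the decoder relies on.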
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%$
\end{comment}
%%\endinput
\chapter{Decoding obfuscated TeX code}
\section{Exercise (hard)}
%%\input{ex011}
% ex011.tex
\begin{comment}
Date: 15 Sep 1993 16:34:45 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #11
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/09/15}. \arch{exercise.011}.}
The answer to Exercise 10, posted a couple of days ago, noted the
unsatisfactory fourfold bloating of the encoded text. This leads to
Exercise 11, which is rather difficult (double-dangerous bend level).
%%************************************************************************
%%*** Exercise 11 (hard):
Write your own decoder to solve the problem I set for myself in
Exercise 10: Using as few lines of TeX code as possible, set up an
Around the Bend post containing a typical exercise so that it can be
processed by plain TeX to (a) skip over the exercise text and (b)
decode an embedded encoded answer. Come up with a better encoding idea
than my previous one, that doesn't increase the size of the text by
300\% during encoding.
%%************************************************************************
Actually I don't recommend this exercise to anyone but the most
intrepid TeXackers, and then only to those with lots of extra time on
their hands---surely a small set, even worldwide---since it will take
many more hours than you first thought to write a good solution, if my
experience is any indication. Issuing the problem now as an exercise
is more to place it on record, since I'm working on it anyway, than to
instigate serious attempts at a solution by other people.
The answer to Exercise 10 mentioned four design goals: (1) small
decoder (2) minimum expansion of text during encoding (3) avoidance of
special characters that tend to be corrupted by mailers or network
gateways (4) supported character set ASCII 9,13,32--126 in the text to
be encoded.
However, in my ongoing efforts to wrassle with this problem, I have
since decided to drop ASCII 9 [tab] from (4), and to eliminate (3),
because it seems to be an independent issue: If mistranslated
characters are a problem for the reader then they are a problem for
the unencoded exercise text as well, and not just for the encoded
answer. So now I am assuming that the reader has in hand a reliable
copy of the posting with newlines and all visible ASCII 32--126
accurately transmitted, and I am using basically a simple translation
table for the encoding and decoding (beware: oversimplification).
Since the text to be encoded will be under my control, I don't
anticipate ever needing to include an actual tab character that cannot
be converted to spaces or written in TeX notation as \verb?^^I?.
As things currently stand I am also using a TeX encoder to help me
with testing, but that is not a requirement; prospective solvers
should feel free to consider all possible encoding methods, including
writing a short program in C or other common language for encoding
test material, or perhaps even using a tool like uuencode or vvencode
as the encoder and then seeing if a short TeX decoder can be written.
A summary of solutions, or more likely, `the' solution (mine), will be
posted December 31, 1993. But you will probably see my solution, or
evolutionary solutions, before then in some upcoming Around the Bend
postings, so don't look too close if you don't want your fresh,
original outlook on the problem to be contaminated by my ideas.
If any readers do have difficulties with mistranslated characters in
Around the Bend postings, I would like to hear the details. For
checking, I give an ordered list of the ASCII characters 32--126
below.
%%Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%
\begin{lcode}
ASCII 32--54,55--126: !"#$%&'()*+,-./0123456789
:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%$
\end{lcode}
%%\endinput
\section{Answers}
%%\input{ans011}
% ans011.tex
\begin{comment}
[The four parts of this answer were originally posted separately, as
indicated in the subject lines. Addendum 1 is the full text of Donald
Arseneau's solution, which appeared in abridged form in part 3. Also
addendum 2, containing a companion TeX encoder for my decoder, was not
posted.]
Date: 17 Aug 1994 16:24:12 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #11, solutions, part 1 of 4
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1994/08/17} in four parts. \arch{answer.011}.}
\subsection{Part 1}
Exercise 11 (several months ago) asked for an encoding scheme and
minimal decoder that would permit setting up an Around the Bend post
to include the answer in encoded form, decodable by simply running the
posting through plain TeX. Although by now nearly everyone must have
forgotten about this, I've been amusing myself all along with
occasional refinements to my working solution, and having reached a
point now where I am satisfied with the results, I suppose I should
fill the gap in the record by reporting on my solution and a couple of
the solutions submitted by other people.
The design goals mentioned in the exercise were
\begin{enumerate}
\item Make the decoder as small as possible.
\item Make the encoding scheme `compact', i.e., strive to keep the encoded
text not much larger than the unencoded version.
\item Allow ASCII 13,32--126 (at least) in the text to be encoded. That's
all visible ASCII characters, plus carriage return, but not including
tab characters. (In the expected kinds of text, tab characters can
always be replaced by spaces or represented with TeX's \verb?^^I? or
\verb?^^09? notation.)
\end{enumerate}
My solution is demonstrated below. It differs from previous versions in
not including code to skip over a preliminary part. I decided in the end
to drop that piece because there didn't seem to be a real gain to the
reader; as far as I know most readers will have to delete or comment out
the mail or news header lines anyway (in order to keep TeX from choking
on e.g. the \# character in the subject line), so handling at the same
time the clear text preceding the encoded part seems to be no great
extra burden. (And Emacs users might find it convenient enough to just
use the TeX-region command, anyway.)
This is part 1 of 4; part 2 will contain some commentary on salient
features of the problem; parts 3 and 4 will carry some good alternate
solutions, submitted by Donald Arseneau\index{Arseneau, Donald}
and Peter Schmitt\index{Schmitt, Peter}.
\begin{lcode}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%%%% Self-decoding example: run the following text through plain TeX %%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\+\u\uccode \m
13\c\m9\+\p\uppercase\d\i{\a\f7 \ifnum\f>125 \a\f-93 \fi}\d~{\u\f\m \c\m
12 \a\m1 \i \ifnum\m>125 \+~\1\fi~}\d\0#1{\ifnum`#1>"D \if#1 !\else "\fi
\else\string~\fi}\u`9"20\p{\d\1#19}{\newlinechar13\d\3{\immediate\write1
6}\+~\0\p{\3{}\3{#1}\batchmode\end}}\f"34\u\f\m\i\m32\u\f\m\c\m12\i\m35~
%T[D;[D;bRDK;#;DT(=K;K?DK$;?!1=n/K[!M;wn;D[M!#KR=?;p[!?D$;`T[1T;[!1pR8?4
#pp;KT?;1T#=#1K?=D;[!;KT?;DR//(=K?8;D?K244Q[1T#?p;o(`!?D;PPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP4wb8Sw#KT2#wD2(=M;e5!K?=!?Kl;Z
{h55;UN++c\$cc++GNj);~;~BBIPW^elsz$+29@GNU\cj4qx")07>ELSZahov}'.5<CJQX_f
mt{%,3:AHOV]dkry#*18?FMT[bipw!(/6=DKRY`gnu|&-~4 ")07>ELSZahov}'.5<CJQX_f
\end{lcode}
\begin{comment}
Date: 17 Aug 1994 16:34:07 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #11, solutions, part 2 of 4
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
%Discussion of Around the Bend \#11; part 2.
\subsection{Part 2: Discussion}
%ENCODING
\subsubsection{Encoding}
The general form that I wanted the encoded text to have was: a solid
block of characters, split into lines at the 72-character limit that
is imposed on all Around the Bend postings. Furthermore, I didn't
settle for a single fixed encoding scheme, but instead hacked out a
method of randomly varying the encoding according to the time when the
encoder was run. Thus each encoded posting gets a different cipher.
\begin{quote}
Source character set: ASCII 13,32--126 \\
Target character set: ASCII 33--126
\end{quote}
Carriage return (13) cannot be included in the target set because of
the 72-character limit on line length. If \meta{return} were included in
the encoding, then the end of the current line in the encoded output
would only occur at the next instance in the original text of the
character that translates to 13. And depending on what that character
is, who knows how long the encoded line could be? Perhaps as long as
the entire text.
Space (32) is not included in the target set for a subtler reason. If
spaces in the encoded text happen to fall at the end of a line, they
will be dropped by TeX during the decoding process, instead of
decoded. So we either must exclude them from the target set, or make
sure that they never fall at the end of a line.
By excluding space from the target set, we make it possible for the
decoder to use a space as its argument delimiter. If we have only one
space, at the end of the encoded text, it is not so hard to ensure
that it does not fall at the end of a line. But note that the decoder
must make sure to change the catcode of space to something other than
10, so that it will not disappear if it falls at the \emph{beginning} of a
line.
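\ed{A minimal sketch, not the decoder of the original post.}
A space of category 12 behaves like any other visible character, which
is what lets it serve as a delimiter. In the following plain TeX lines
(the macro names \verb?\out? and \verb?\grab? are invented for the
illustration) the argument runs across the line end and stops only at
the typed space after \texttt{def}:
\begin{lcode}
\def\out{\immediate\write16 }
\catcode`\ =12\def\grab#1 {\out{[#1]}}\grab
abc
def \bye
\end{lcode}
The space that TeX supplies at the end of the line \texttt{abc} still has
category 10, so it does not match the delimiter and stays inside the
argument.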
Note that the target set 33--126 is smaller than the source set
13,32--126. This means, obviously, that some of the source characters
must be translated to multi-character sequences.
Given that \verb?~? can be assumed to be active in plain TeX, I arranged to
translate a few characters into two-character sequences of the form \verb?~X?
where potentially X is any character in the target set (including \verb?~?).
Then the decoding process can translate back by giving \verb?~? a suitable
definition. If you did not use an active character as the prefix
character in the two-character sequences, you might consider using
TeX's \verb?^^? notation to handle the extra characters in the source set.
Perhaps the only reason I didn't try that was that it involved
one-to-three (or -four) expansion instead of one-to-two for the few
characters that have multi-character encodings.
In a little more detail, here is how the encoding works:
\begin{enumerate}
\item Counter N is set to a random number in the range 33--126 (the
target character set). Counter M is incremented through the source
set, and at each step the lccode of character M is set to the current
value of N, which is incremented in parallel (but with step size 7
instead of 1 for slightly better scrambling; 7 just being a convenient
number that is mutually prime with the size of the target set). Then
\begin{lcode}
\lowercase{\immediate\write\outfile{...}}
\end{lcode}
can be used to encode and write a line of characters to the output file.
When counter N goes past 125, it is wrapped around to the bottom of the
range (the decoder shown in Part~1 subtracts 93). Character 126
(\verb?~?) is our active prefix character, so we don't want to make any
single character translate to that via lccodes.
\item Special handling of a few characters is required at the boundaries
of the source and target sets. Let I = the initial value of N. Then we
start the encoding by setting lccode13 (return) = I and lccode32
(space) = I + 7, the next value of N. Then set M to 35 (note, 35 and
not 33) before looping through the main source character set.
\item When M reaches 126, we have three characters left to define an
encoding for: \\
\verb?126 ~, 33 !, 34 "?. \\
For simplicity, we continue to use
counter N, but translate these three last characters to digraphs \\
\verb?~[N] ~[N+7] ~[N+14]?, \\
where \verb?[N]? means character N.
\end{enumerate}
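\ed{A minimal sketch, not the encoder of the original post.}
The core of step 1 can be tried in plain TeX as follows. The names
\verb?\src?, \verb?\dst? and \verb?\out?, the fixed starting value 40 (standing in
for the random one) and the sample text are assumptions; return, space
and the three characters that need digraphs are left out, so this is
not the complete scheme:
\begin{lcode}
\newcount\src \newcount\dst
\def\out{\immediate\write16 }% keeps the digits 16 away from \lowercase
\src=35 \dst=40
\loop
  \lccode\src=\dst
  \advance\dst 7 \ifnum\dst>125 \advance\dst -93 \fi
  \advance\src 1
\ifnum\src<126 \repeat
\lowercase{\out{Hello}}
\bye
\end{lcode}
Since 7 has no factor in common with 93, the 91 source characters
35--125 all receive different codes in the range 33--125.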
%DECODING
\subsubsection{Decoding}
Given the method of encoding described above, decoding is pretty simple.
We just have to set up a suitable uccode table, and apply it. For a few
characters we have to make a suitable definition for \verb?~? so that
\verb?~x, ~y, ~z? (where x y z are random) will be translated back to
\verb?~ ! "?. Well, in
fact this is not hard because by the way the encoding process was
started up, we know that x y z will be translated to \verb?^^M?, space, and \#
by the uppercasing, so we merely have to define \verb?~^^M? to produce
\verb?~?,
\verb?~space? to produce \verb?!?, and \verb?~#? to produce \verb?"?.
(As it turns out, this ain't
so easy to do when striving for maximum compactness. My final version
for this cost me no little work.)
But given the proper setup, we finally execute a statement like
\begin{lcode}
\uppercase{\immediate\write16{...ENCODED TEXT...}}\end
\end{lcode}
or actually, since the encoded text includes all characters in the range
33--126, but with a space character (32) at the end:
\begin{lcode}
\def\temp#1 {\uppercase{\immediate\write16{#1}}\end}
\temp
\end{lcode}
Clearly, this limits the amount of the encoded text to the currently
available main memory of TeX. This is no real drawback for the limited
application for which this decoder was written: encrypted answers to
Around the Bend exercises. Donald Arseneau mentions in his solution
(part 3, to follow) the idea of decoding line by line. This would not be
too difficult, but would probably slightly increase the length of the
decoder (maybe making it impossible for me to keep my own version of the
decoder stuffed into the current five lines).
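\ed{A minimal sketch, not the decoder of the original post.}
That a uccode table can undo an lccode table is easily checked with a
round trip in plain TeX. The names \verb?\src?, \verb?\dst?, \verb?\out? and
\verb?\secret?, the starting value 40 and the sample text are assumptions,
and the characters that need special handling (return, space and the
digraphs) are left out:
\begin{lcode}
\newcount\src \newcount\dst
\def\out{\immediate\write16 }% keeps the digits 16 away from \uppercase
\src=35 \dst=40
\loop
  \lccode\src=\dst \uccode\dst=\src % forward and inverse tables
  \advance\dst 7 \ifnum\dst>125 \advance\dst -93 \fi
  \advance\src 1
\ifnum\src<126 \repeat
\lowercase{\def\secret{Hello}}% \secret now holds the encoded text
\out{\secret}% shows the encoded form
\uppercase\expandafter{\expandafter\out\expandafter{\secret}}% decodes it
\bye
\end{lcode}
The second \cmd{\write} should show \texttt{Hello} again. The digits of the
stream number are hidden inside \verb?\out? because \cmd{\uppercase} would
otherwise translate them along with the text.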
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%$
Date: 18 Aug 1994 15:37:41 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #11, solutions, part 3 of 4
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\subsection{Part 3}
Some selections from Donald Arseneau's\index{Arseneau, Donald} solution and commentary. The
entire solution is rather long so I won't post it in full; request it
from Donald or me if you're interested.
%%========================================================================
%%Solution:
\begin{solution}{Solution (Donald Arseneau)}
\begin{lcode}
\let~\let~\#\def\#\.{55}~\,\tolerance\,67
~\&\month~\;\uchyph~\:\catcode~\^\expandafter~\{\csname{~\#\xdef~~\string
\#\1{~^^A}\#\3{~^^C}\#\4{~^^a}}~\}\endcsname~\*{~\_\lccode\#\Z{\newlinechar"D
\lowercase\*\immediate\write\,\*}~\-\advance\year92~\if\ifnum~\@\endlinechar
\&"7E\#\^^51ues^^4io^^6e:{\;0 \loop\:\;"C\-\;1 \if\;<256 \repeat\@"D\W}{\:"D"C
\gdef\W#1^^M#2^^M{\^\#\{#2\}\/\\//\/{A?^^M,Zz\over}\#\X##1^^M{\^\if^^8\{##1\^%
\}\{#2\}\^\Y\else\^\X\fi}\X}}\#\Y{\;35\loop\_\,\;\if\;<\&\-\,\.\-\;1\if\,>\&
\-\,-\year\fi\repeat\:1'0\:3"2\:33'7\_"20`"\_`""20\@-1\Z}
\Question:
***********************************************************************
*** Exercise 11 (hard):
Write your own decoder to solve the problem I set for myself in
Exercise 10: Using as few lines of TeX code as possible, set up an
Around the Bend post containing a typical exercise so that it can be
processed by plain TeX to (a) skip over the exercise text and (b)
decode an embedded encoded answer. Come up with a better encoding idea
than my previous one, that doesn't increase the size of the text by
300% during encoding.
***********************************************************************
U"N5"M5[ZIm~f!!0dU!!0dU")"656"Yk3j"kH"jZ53"I"WZ5~m"I#kf"$Ej"WI34gj
"XmI~~i"3Ij53H5m6x""]kEX!!0dU"$m46"Fk3j54#"FXkYFjm6"Ym"jk"3m46"5j"I
4iWIi"I46"I|k56"jZm"jmYFjIj5k4!!0dU"jk"3Fm46"YkXm"j5Ym"k4"5jx"")"lE
3j"Fk~53Zm6"5j"kHH"jk6Iix!!0dU!!0dU"KZIj")"WkE~6"~5Gm"jk"6k"53,!!0d
U""A"YIGm"jZm"6m[k654#"YI[Xk3"3ZkXjmX"B4kjm"jZIj"54"Yi"HkXYIjf"I~~"
jZm!!0dU""""YI[Xk[k6m"FXm[m6m3"jZm"}Em3j5k4f"WZ5[Z"~kkG3"WkX3m"jZI4
"ikEX"3k~Ej5k4xy!!0dU""A"93m"I[j5|m"[ZIXI[jmX3"XIjZmX"jZI4"J~kWmX[I
[...]
!!0d!!03!!03!!A{end!!A}
========================================================================
\end{lcode}
Commentary (Donald Arseneau):
I did most of this a while ago, but wasn't really satisfied. Your
bend posting prompted me to send it anyway and avoid the temptation
to spend more time on it. I just polished it off today.
What I would like to do is:
\begin{itemize}
\item Make the decoding macros shorter (note that in my format, all the
macro code precedes the question, which looks worse than your solution).
\item Use active characters rather than \cmd{\lowercase} to de-hash the answer,
and do a separate \cmd{\write} for each line. That's to avoid memory
overflow.
\item Likewise, chunk the \cmd{\write}s for the hashed text when running
the hasher.
\item \ldots
\end{itemize}
%===================================================================
This file should be clear! Only the hidden (hashed) text and
the macros to UNhash it should be obfuscated because they will
be given with the question.
\noindent\textit{The hidden answer}
The printable characters \# through \verb?~? (35-126) are permuted
through a simple hashing with a chosen starting value and
multiplier. Non-printing characters are represented by their
hexadecimal codes in the form \verb?!!hh? (where h is a hex digit
[higit?]); the \verb?!? character will act like \verb?^? when the text is
decoded. I don't want spaces in the coded text, but I also
don't want to use \verb?!!20? because there are likely many spaces, so
space is represented by \verb?"? and \verb?"? is represented as \verb?!!20?.
There are three other special (reserved) characters besides the
exclamation point: \verb?^A?, \verb?^B?, \verb?^C? (ascii 1,2,3).
They are used as follows:
\begin{lcode}
%   character   use               coded as
%   ---------   ---------------   -------------
%   !           superscript       \1   ( !!A1 )
%               (for hex codes)
%   "           space             !!20 (trades with space)
%   ^A          escape (\)        \2   ( !!A2 )
%   ^B          opening ({)       \3   ( !!A3 )
%   ^C          closing (})       \4   ( !!A4 )
\end{lcode}
All other characters are represented by their permuted
printable character, or by their normal hexadecimal form:
\verb?!!15?, \verb?!!0a?, \verb?!!a4?, \verb?!!7f? etc.
The original coding is done through active characters, with
all characters defined to produce their non-active coded text
(either hashed or hex). The decoding of hex (non-printing)
characters is automatic; the decoding of the special four is
done through simple definitions; the decoding of printable
characters is done by loading the de-hashed character values
into the \cmd{\lccode} and applying \cmd{\lowercase}.
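The \cmd{\lccode}/\cmd{\lowercase} step can be tried on its own. What
follows is a minimal sketch in plain TeX under stated assumptions, not
Donald's macro: the register names \verb?\hashed? and \verb?\plain? are
made up for illustration, and the seed 67 and multiplier 55 are the
values that appear to be set in the obfuscated header above
(\verb?\,67? and \verb?\.{55}?).
\begin{lcode}
% Sketch only: \hashed and \plain are hypothetical names.
\newcount\hashed \newcount\plain
\begingroup
\hashed=67 % seed: the code that stands for `#' (35)
\plain=35  % first plain-text character that is hashed
\loop
  \lccode\hashed=\plain % hashed code -> plain code
  \ifnum\plain<126
    \advance\plain 1
    \advance\hashed 55 % the multiplier
    \ifnum\hashed>126 \advance\hashed -92 \fi % wrap into 35..126
\repeat
\lccode`\"=`\  % a double quote stands for a space
\lowercase{\message{U"N5}}
\endgroup
\bye
\end{lcode}
If those assumptions hold, the message shown on the terminal should be
\verb?% Hi?, the first four characters of the hidden answer. The sample
avoids \verb?~? on purpose: it is an active character in plain TeX, and
\cmd{\lowercase} changes character codes but never category codes, so a
real decoder has to make all characters ``other'' first.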
Some of the longest bits in the coder macro concern breaking
the coded text into lines of 64-68 characters. If the first
character in a line (after breaking) is a period, or the first
two characters are \verb?--?, the first character is given in hex
representation for fear of maniacal mail gateways. The other
dangerous characters like \verb?^ ` \ ~? are not treated carefully
because they had to have been preserved for the macros to work
at all.
\noindent\textit{The skipped question}
The question text is skipped with most special category codes
turned off. The only functioning input is \verb?^M? due to \cmd{\obeylines}.
The active \verb?^M? checks each line of input looking for the marker
text to end the question material. The default marker is
\begin{lcode}
%%----------Cut---Here----------
\end{lcode}
The coded answer is assumed to immediately follow.
\noindent\textit{The coder}
\verb? [...] the coder routine [...]? \\
asks for three file names: the \cmd{\QuestionFileName} should
contain the text of the question; the \cmd{\SolutionFileName} should
have the answer; the complete question/answer posting will be
written to \cmd{\OutputFileName}. (Run this file through plain TeX.)
\ldots
There are 92 characters that will be hashed (\verb?35=#? to \verb?126=~?).
The hashing multiplier must be mutually prime with $92 = 23 \times 2^2$
and be less than 92. The start value (seed) can be anything
in the range 35-126.
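To see why the multiplier has to be mutually prime with 92, it may help
to write the permutation out. Assume (this is a reading of the
description, not a formula taken from Donald's text) that the character
with code $35+k$, $0 \le k < 92$, is sent to
\[
h(k) = 35 + \bigl((s - 35) + m k\bigr) \bmod 92 ,
\]
where $s$ is the seed and $m$ the multiplier. Two characters $k$ and
$k'$ then collide exactly when $m(k - k')$ is a multiple of 92. If
$\gcd(m, 92) = 1$ this forces $k = k'$, so $h$ permutes the 92 codes;
an even multiplier such as $m = 46$ would already send $k = 0$ and
$k = 2$ to the same code. As a worked instance, with $s = 67$ and
$m = 55$ (the values that seem to be set in the obfuscated header) the
percent sign, $k = 2$, goes to $35 + (32 + 110) \bmod 92 = 35 + 50 = 85$,
the code of \texttt{U}, which is the character the hashed lines start with.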
\ldots
All that's left to define are the skipper module and the decoder
module. They both are written into the posting to be executed
by the receiver. They are compressed and obfuscated, but the
obfuscation is mostly just compression: using command symbols
like \verb?\,? for longer command words, and using built-in registers
instead of allocating registers. Some of the abbreviations and
the choices of register are meant to be confusing and/or silly.
Plain-text versions of the modules are given here, as well as
a glossary of the obfuscation.
Here is the skipper module. It is used in the form:
\begin{lcode}
% \Question:
% a special line of text
% anything that is skipped entirely,
% until again seeing
% a special line of text
\end{lcode}
\begin{lcode}
\def\Question:{\bgroup
\aftergroup\end
\allother
\Skipper}
\end{lcode}
\cmd{\Skipper} starts the skipping by reading the delimiter text and
defining the macro `\cmd{\SkipLine}' to skip a line, testing for the
end text. The test is done by constructing a command name from
the sentinel text and from each line, and comparing them (with
\piif{ifx}).
\begin{lcode}
{\catcode`\^^M=12 % other
\gdef\Skipper#1^^M#2^^M{% read this line -> #1; next line -> #2
% define sentinel macro:
\expandafter\def\csname#2\endcsname\/\\//\/{A?^^M,Zz\over}%
% define macro to read line and compare it with sentinel:
\def\SkipLine##1^^M{\expandafter%
\ifx\csname##1\expandafter\endcsname\csname#2\endcsname%
\expandafter \DecodeAnswer % finished skipping
\else%
\expandafter \SkipLine % keep skipping
\fi}%
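\SkipLine}% start skipping; this brace ends the definition of \Skipper
}% end of the group in which ^^M is `other'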
\end{lcode}
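The comparison trick can be tried in isolation. The following is a
minimal sketch under assumptions, not part of Donald's file:
\verb?\test? and the string \texttt{stop here} are hypothetical. The
point it illustrates is that \verb?\csname? gives an undefined name the
meaning of \verb?\relax?, so the sentinel name presumably needs a
definition of its own; otherwise any two unseen lines would compare
equal under \verb?\ifx?.
\begin{lcode}
% Sketch only: \test and ``stop here'' are hypothetical.
\expandafter\def\csname stop here\endcsname{unlikely meaning}
\def\test#1{%
  \expandafter\ifx\csname#1\expandafter\endcsname
      \csname stop here\endcsname
    \message{[#1: sentinel found]}%
  \else
    \message{[#1: keep skipping]}%
  \fi}
\test{some other line}
\test{stop here}
\bye
\end{lcode}
The first call takes the \verb?\else? branch because its name is merely
\verb?\relax?; the second compares the sentinel name with itself.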
\cmd{\DecodeAnswer} unhashes the answer text and writes it to the
screen. The unprintable characters represented as \verb?!!hh? are left
as they are (i.e., possibly unprintable!). \texttt{Control-M} (\verb?!!0d?) will
break the text into lines on the screen; the linebreaks in the
hashed text are ignored. \cmd{\HS} is set to the seed value before
\cmd{\DecodeAnswer} is invoked.
\end{solution}
\begin{comment}
Date: 18 Aug 1994 15:38:30 -0400 (EDT)
From: Michael Downes <[email protected]>
Subject: Around the Bend #11, solutions, part 4 of 4
To: [email protected]
X-ListName: TeX-Related Network Discussion List <[email protected]>
\end{comment}
Here is Peter Schmitt's solution to Around the Bend \#11.
\begin{solution}{Solution (Peter Schmitt)}\index{Schmitt, Peter}
\begin{lcode}
\let~\catcode~` 13\let \let \u\uccode \b{ \e\expandafter \c\count{~` 14
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{lcode}
Michael:
here is just another version for Exercise 11:
\begin{itemize}
\item using comment space I have managed to pack the code into 1+3 lines of
length 72.
\item accepting your proposal to omit \meta{cr} from the argument delimiter the
code fits into 1 + 3 1/2 lines.
\end{itemize}
Maybe a few characters can still be saved, but I expect that a
major gain can (if at all) only be achieved by a different coding method.
best wishes, Peter
P.S.: this is the second variant:
\begin{lcode}
\let~\catcode~12 9~`^13~13 9\let^\def{^^#1__{\egroup}~`\\9~`{9~`}9 ^
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
text to be skipped
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
__~` 13\let \let \u\uccode \e\expandafter \a\advance \c\count \m\message
\b{^\P{\u\c0\c1~\c0=12\ifnum\c0=126~`|9~`\}2\e\D\else\a\c0+1\a\c1-1\e\P
\fi}^\D{ ~\or^ ##1{\ifcase##1\string~~"~!~{~}{\newlinechar`!\m{!}}\m{~}%
\e\end\fi}\uppercase\b\m\b}\c0`!\c1`}\P
P.P.S.: I was lazy and have not prepared an updated version of the
coded text.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
} \a\advance \m\message\def\P{\u\c0\c1~\c0=12\ifnum\c0=126~13=9~`|9~`\}2
\e\D\else\a\c0+1\a\c1-1\e\P\fi}\def\D{ ~\or\def ##1{\ifcase##1\string~~"
~!~{~}{\newlinechar`!\m{!}}\m{~}\e\end\fi}\uppercase\b\m\b}\c0`!\c1=`}\P
jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy j~~B;=|
*;/:9>B@@Rml j~~#B:98B.,9.=,9+35.#B;=*;/:9>BBml~B;=*;/:9>B#ml~B;=*;/:9>B!ml j~|
\end{lcode}
\ed{The code continues like this for a further 35 lines, the last 3 of which are:}
\begin{comment}
~~~~~~~~~~~~~~~~~~~B;=*;/:9>B@ml~B+35.! j~~B;=*;/:9>B@@QmlB:98B+35.{m@@Q??#B97|
,/).!B.,9.=,9+35. jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyyyyyyy j+35..9:~*9&*~d~+35..507~5+~+*/..9:~<%~*'/~;/0+9;)*5(9~+)<+;,5.* j~|
~~~~~~~~~~~~~~;6=,=;*9,+~=*~*69~<97500507~/8~=~2509~d~?? j90;/:9:~*9&*~d~1)+*~|
90:~/0~/09~,576*~<,=;9~d~! jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyyyyyyyyyyyyyyyy jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyyyyyyyyyyyyyyy jyyy~*69~:9;/:507~1=;,/+ jyyy~*69~=;*)=2~1=;,/+~=,9~+2576*2|
%~1/,9~;/1.25;=*9:~*/~=22/'~+6/,*9,~;/:9 jyyy~*69~*9&*~*/~90;/:9~1)+*~90:~/0~8|
/,1899:~v]K[UU~mlu jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyyyyyyyyyyyyyy jB:98B.,9.=,9#B);;/:9~B;/)0*n~B;/)0*m j~~~~~~~~~~~~~B;=*;/:9|
~B;/)0*n~ml j~~~~~~~~~~~~~B580)1~B;/)0*n~a~mlh j~~~~~~~~~~~~~B;=*;/:9~>B@@Q~e j
~~~~~~~~~~~~~B;=*;/:9~>B"~e j~~~~~~~~~~~~~B;=*;/:9~>B!~l j~~~~~~~~~~~~~~~~~~~~|
B9&.=0:=8*9,B:9;/:9 j~~~~~~~~~~~~~~B92+9~B=:(=0;9~B;/)0*~n~<%~~m j~~~~~~~~~~~~|
~~~~~~~~B=:(=0;9~B;/)0*~m~<%~qm j~~~~~~~~~~~~~~~~~~~~B9&.=0:=8*9,B.,9.=,9 j~~~|
~~~~~~~~~~~~B85! jB:98B:9;/:9#B;=*;/:9>B~B=;*5(9B)..9,;=+9B<7,/).B19++=79B<7,/|
).! jB;/)0*nakl jB;/)0*mamlh jB:98B02##B09'2509;6=,> lB19++=79# l!!B19++=79! j|
B:98 n{m#B58;=+9B+*,507{mB+*,507 nB/, mB/, lB/,#B/,!B02#B/,!B9&.=0:=8*9,B90:B8|
5!y jB.,9.=,9 jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyyy jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyy jyyy~*69~90;/:507~1=;,/+ jyyyyyyyyyyyyyyyyyyyyyyy jB5119:5=*9B/.90/)*naB|
4/<0=19p;:: j~~~~~~~~~~~~B;=*;/:9> mB=;*5(9 j~~~~~~~~~~~~B;=*;/:9> lB=;*5(9 jB|
:98B90;/:9~#B);;/:9>B n~a~B;/)0*n j~~~~~~~~~~~~~B);;/:9>B_~a~B;/)0*m j~~~~~~~~|
~~~~~B)..9,;=+9#B:98 n#B=::_m!B;=*;/:9>_B=;*5(9! j~~~~~~~~~~~~~B580)1~B;/)0*na|
mli j~~~~~~~~~~~~~~~~~~~~B:98~ m#B=::#~1!l! j~~~~~~~~~~~~~~~~~~~~B:98~ l#B=::#|
~2!l! j~~~~~~~~~~~~~~~~~~~~B;=*;/:9>~B=;*5(9 j~~~~~~~~~~~~~~~~~~~~B;/)0*nan~B:|
98B2509#! j~~~~~~~~~~~~~~B92+9~B=:(=0;9B;/)0*n~<%~~m j~~~~~~~~~~~~~~~~~~~~B=:(|
=0;9B;/)0*m~<%~qm j~~~~~~~~~~~~~~~~~~~~B9&.=0:=8*9,B90;/:9 j~~~~~~~~~~~~~~~~B8|
5 j~~~~~~~~~~~~~! jB:98B=::{m{l#B580)1~B;/)0*n~`~gf j~~~~~~~~~~~~~~~~~~~~B5119|
:5=*9B',5*9n#B2509! j~~~~~~~~~~~~~~~~~~~~~B:98B2509#{m!~~~~~~B;/)0*na{l j~~~~~|
~~~~~~~~~B92+9~B9:98B2509#B2509{m!~B=:(=0;9B;/)0*n<%{l j~~~~~~~~~~~~~~~B85 j~~|
~~~~~~~~~~~B580)1~B;/)0*n~a~gf~B=::"m~B85 j~~~~~~~~~~~~! jB:98~~ n#B=::#~0!l! j
B:98@@R#B=::#~5!lB5119:5=*9B',5*9n#B2509!B5119:5=*9B;2/+9/)*nB90:! j~~~~~~~~B;|
\end{comment}
\begin{lcode}
=*;/:9>B@@QB=;*5(9~y jB:98@@Q#B=::#~4!l!~~~~~~~~~~~y jB;/)0*nakl~B;/)0*mamlh~B|
90;/:9 jyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy|
yyyyy j i This is trash: Text not displayed!} More Trash that is not displayed!
\end{lcode}
\end{solution}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%[Addendum 1: Full text of Donald Arseneau's solution. To read the
%commentary you will need to run the text through TeX.]
\subsection{Addendum 1}
Full text of Donald Arseneau's solution. To read the
commentary you will need to run the text through TeX.
\begin{lcode}
Date: 14 Oct 1993 01:52:26 -0800 (PST)
From: Donald Arseneau <[email protected]>
Subject: Around the bends
To: [email protected]
\let~\let~\#\def\#\.{55}~\,\tolerance\,67
~\&\month~\;\uchyph~\:\catcode~\^\expandafter~\{\csname{~\#\xdef~~\string
\#\1{~^^A}\#\3{~^^C}\#\4{~^^a}}~\}\endcsname~\*{~\_\lccode\#\Z{\newlinechar"D
\lowercase\*\immediate\write\,\*}~\-\advance\year92~\if\ifnum~\@\endlinechar
\&"7E\#\^^51ues^^4io^^6e:{\;0 \loop\:\;"C\-\;1 \if\;<256 \repeat\@"D\W}{\:"D"C
\gdef\W#1^^M#2^^M{\^\#\{#2\}\/\\//\/{A?^^M,Zz\over}\#\X##1^^M{\^\if^^8\{##1\^%
\}\{#2\}\^\Y\else\^\X\fi}\X}}\#\Y{\;35\loop\_\,\;\if\;<\&\-\,\.\-\;1\if\,>\&
\-\,-\year\fi\repeat\:1'0\:3"2\:33'7\_"20`"\_`""20\@-1\Z}
\Question:
***********************************************************************
*** Exercise 11 (hard):
Write your own decoder to solve the problem I set for myself in
Exercise 10: Using as few lines of TeX code as possible, set up an
Around the Bend post containing a typical exercise so that it can be
processed by plain TeX to (a) skip over the exercise text and (b)
decode an embedded encoded answer. Come up with a better encoding idea
than my previous one, that doesn't increase the size of the text by
300% during encoding.
***********************************************************************
U"N5"M5[ZIm~f!!0dU!!0dU")"656"Yk3j"kH"jZ53"I"WZ5~m"I#kf"$Ej"WI34gj
"XmI~~i"3Ij53H5m6x""]kEX!!0dU"$m46"Fk3j54#"FXkYFjm6"Ym"jk"3m46"5j"I
4iWIi"I46"I|k56"jZm"jmYFjIj5k4!!0dU"jk"3Fm46"YkXm"j5Ym"k4"5jx"")"lE
3j"Fk~53Zm6"5j"kHH"jk6Iix!!0dU!!0dU"KZIj")"WkE~6"~5Gm"jk"6k"53,!!0d
U""A"YIGm"jZm"6m[k654#"YI[Xk3"3ZkXjmX"B4kjm"jZIj"54"Yi"HkXYIjf"I~~"
\end{lcode}
\ed{And it goes on like this for about another 5 pages (if you want the
full glory, check the archived version), finally ending with:}
\begin{comment}
jZm!!0dU""""YI[Xk[k6m"FXm[m6m3"jZm"}Em3j5k4f"WZ5[Z"~kkG3"WkX3m"jZI4
"ikEX"3k~Ej5k4xy!!0dU""A"93m"I[j5|m"[ZIXI[jmX3"XIjZmX"jZI4"J~kWmX[I
3m"jk"6mAZI3Z"jZm"I43WmXf!!0dU""""I46"6k"3mFIXIjm"JWX5jm"HkX"mI[Z"~
54mx""^ZIjg3"jk"I|k56"YmYkXi"k|mXH~kWx!!0dU""A"~5GmW53mf"[ZE4G"jZm"
JWX5jm"3"HkX"jZm"ZI3Zm6"jm2j"WZm4"XE4454#"jZm"ZI3ZmXx!!0dU!!0dU")"~
5Gm"ikEX"YmjZk6"kH"[kE4j54#"jZm"3Fm[5I~"I[j5|m"[ZIXI[jmX"54"jZm!!0d
U"}Em3j5k4"jm2j!!A4!!A4!!0dU""""AA"*k4I~6!!0dUuuuuuuuuuuuuuuuuuuuuu
uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu!!0dU"^Z53"H5~m"3ZkE~
6"$m"[~mIX!!A4""_4~i"jZm"Z566m4"BZI3Zm6y"jm2j"I46!!0dU"jZm"YI[Xk3"j
k"9(ZI3Z"5j"3ZkE~6"$m"k$HE3[Ijm6"$m[IE3m"jZmi"W5~~!!0dU"$m"#5|m4"W5
jZ"jZm"}Em3j5k4x!!0dU!!0dU"^Zm"Z566m4"I43WmX!!0dU"AAAAAAAAAAAAAAAAA
!!0dU!!0dU"^Zm"FX54jI$~m"[ZIXI[jmX3"C"jZXkE#Z"h"Bw-Ae@dy"IXm"FmXYEj
m6!!0dU"jZXkE#Z"I"35YF~m"ZI3Z54#"W5jZ"I"[Zk3m4"3jIXj54#"|I~Em"I46
!!0dU"YE~j5F~5mXx"(k4AFX54j54#"[ZIXI[jmX3"IXm"XmFXm3m4jm6"$i"jZm5X!!0d
U"Zm2I6m[5YI~"[k6m3"54"jZm"HkXY"!!A4!!A4ZZ"BWZmXm"Z"53"I"Zm2"65#5j
!!0dU"oZ5#5j+%yc"jZm"!!A4"[ZIXI[jmX"W5~~"I[j"~5Gm"\"WZm4"jZm"jm2j"5
3!!0dU"6m[k6m6x"")"6k4gj"WI4j"3FI[m3"54"jZm"[k6m6"jm2jf"$Ej")"I~3k
!!0dU"6k4gj"WI4j"jk"E3m"!!A4!!A4@."$m[IE3m"jZmXm"IXm"~5Gm~i"YI4i"3FI
[m3f"3k!!0dU"3FI[m"53"XmFXm3m4jm6"$i"!!20"I46"!!20"53"XmFXm3m4jm6"I
3"
[email protected]"^ZmXm!!0dU"IXm"jZXmm"kjZmX"3Fm[5I~"BXm3mX|m6y"[ZIXI[j
mX3"$m356m3"jZm!!0dU"m2[~IYIj5k4"Fk54j,"\=f"\tf"\O"BI3[55"ef@fwyx""
^Zmi"IXm"E3m6"I3!!0dU"Hk~~kW3,!!0dU!!0dU"""""[ZIXI[jmX""""""E3m""""
"""""""""""[k6m6"I3!!0dU"""""AAAAAAAAA"""AAAAAAAAAAAAAAA""""AAAAAAA
AAAAAA!!0dU"""""""""!!A4"""""""3EFmX3[X5Fj"""""""""Je""B"!!A4!!A4=e
"y!!0dU"""""""""""""""""BHkX"Zm2"[k6m3y!!0dU"""""""""!!20"""""""3FI
[m"""""""""""""""!!A4!!A4@."BjXI6m3"W5jZ"3FI[my!!0dU""""""""\="""""
""m3[IFm"BJy""""""""""J@""B"!!A4!!A4=@"y!!0dU""""""""\t"""""""kFm45
4#"B{y"""""""""Jw""B"!!A4!!A4=w"y!!0dU""""""""\O"""""""[~k354#"B1y"
""""""""JR""B"!!A4!!A4=R"y!!0dU!!0dU!!0dU"=~~"kjZmX"[ZIXI[jmX3"IXm"
XmFXm3m4jm6"$i"jZm5X"FmXYEjm6!!0dU"FX54jI$~m"[ZIXI[jmXf"kX"$i"jZm5X
"4kXYI~"Zm2I6m[5YI~"HkXY,!!0dU"!!A4!!A4e-f"!!A4!!A4.If"!!A4!!A4IRf"
!!A4!!A4?H"mj[x!!0dU!!0dU"^Zm"kX5#54I~"[k654#"53"6k4m"jZXkE#Z"I[j5|
m"[ZIXI[jmX3f"W5jZ!!0dU"I~~"[ZIXI[jmX3"6mH54m6"jk"FXk6E[m"jZm5X"4k4
AI[j5|m"[k6m6"jm2j!!0dU"Bm5jZmX"ZI3Zm6"kX"Zm2yx""^Zm"6m[k654#"kH"Zm
2"B4k4AFX54j54#y!!0dU"[ZIXI[jmX3"53"IEjkYIj5[c"jZm"6m[k654#"kH"jZm"
3Fm[5I~"HkEX"53!!0dU"6k4m"jZXkE#Z"35YF~m"6mH545j5k43c"jZm"6m[k654#"
kH"FX54jI$~m!!0dU"[ZIXI[jmX3"53"6k4m"$i"~kI654#"jZm"6mAZI3Zm6"[ZIXI
[jmX"|I~Em3!!0dU"54jk"jZm"J~[[k6m"I46"IFF~i54#"J~kWmX[I3mx!!0dU!!0d
U"'kYm"kH"jZm"~k4#m3j"$5j3"54"jZm"[k6mX"YI[Xk"[k4[mX43"$XmIG54#!!0d
U"jZm"[k6m6"jm2j"54jk"~54m3"kH"dRAdv"[ZIXI[jmX3x"")H"jZm"H5X3j!!0dU
"[ZIXI[jmX"54"I"~54m"BIHjmX"$XmIG54#y"53"I"FmX5k6f"kX"jZm"H5X3j!!0d
U"jWk"[ZIXI[jmX3"IXm"AAf"jZm"H5X3j"[ZIXI[jmX"53"#5|m4"54"Zm2!!0dU"X
mFXm3m4jIj5k4"54"HmIX"kH"YI45I[I~"YI5~"#IjmWIi3x""^Zm"kjZmX!!0dU"6I
4#mXkE3"[ZIXI[jmX3"~5Gm"\"n"J"h"IXm"4kj"jXmIjm6"[IXmHE~~i!!0dU"$m[I
E3m"jZmi"ZI6"jk"ZI|m"$mm4"FXm3mX|m6"HkX"jZm"YI[Xk3"jk"WkXG!!0dU"Ij"
I~~x!!0dU!!0dU"^Zm"3G5FFm6"}Em3j5k4!!0dU"AAAAAAAAAAAAAAAAAAAA!!0dU
!!0dU"^Zm"}Em3j5k4"jm2j"53"3G5FFm6"W5jZ"Yk3j"3Fm[5I~"[Ijm#kXi"[k6m3
!!0dU"jEX4m6"kHHx""^Zm"k4~i"HE4j5k454#"54FEj"53"\M"6Em"jk"Jk$mi~54m3
x!!0dU"^Zm"I[j5|m"\M"[Zm[G3"mI[Z"~54m"kH"54FEj"~kkG54#"HkX"jZm"YIXG
mX!!0dU"jm2j"jk"m46"jZm"}Em3j5k4"YIjmX5I~x""^Zm"6mHIE~j"YIXGmX"53
!!0dU"UUAAAAAAAAAAOEjAAANmXmAAAAAAAAAA!!0dU"^Zm"[k6m6"I43WmX"53"I33EY
m6"jk"5YYm65Ijm~i"Hk~~kWx!!0dU!!0dU!!0dU"^Zm"[k6mX!!0dU"AAAAAAAAA
!!0dU!!0dU"NmXm"53"jZm"[k6mX"XkEj54mx"")j"53"3EFFk3m6"jk"$m"[~mIXx"")
j!!0dU"I3G3"HkX"jZXmm"H5~m"4IYm3,""jZm"JqEm3j5k4<5~m(IYm"3ZkE~6!!0d
U"[k4jI54"jZm"jm2j"kH"jZm"}Em3j5k4c""jZm"J'k~Ej5k4<5~m(IYm"3ZkE~6
!!0dU"ZI|m"jZm"I43WmXc""^Zm"[kYF~mjm"}Em3j5k4SI43WmX"Fk3j54#"W5~~"$m
!!0dU"WX5jjm4"jk"J_EjFEj<5~m(IYmx""BLE4"jZ53"H5~m"jZXkE#Z"F~I54"^m&x
y!!0d!!0dJ4mWXmI6Jq<5~m!!0dJ4mWXmI6J'<5~m!!0dJ4mWWX5jmJ_<5~m!!0d!!0d
J4mW~54m[ZIXunT!!0dJYm33I#m{TKZIj"H5~m"[k4jI543"jZm"}Em3j5k4+1!!0d
JXmI6ed"jk"JqEm3j5k4<5~m(IYm!!0dJkFm454Jq<5~muJqEm3j5k4<5~m(IYm!!0d
!!0dJYm33I#m{KZIj"H5~m"[k4jI543"jZm"3k~Ej5k4+1!!0dJXmI6ed"jk"J'k~Ej
5k4<5~m(IYm!!0dJkFm454J'<5~muJ'k~Ej5k4<5~m(IYm!!0d!!0dJYm33I#m{KZIj
"3ZkE~6"jZm"[kYF~mjm"Fk3j54#"$m"WX5jjm4"jk+1!!0dJXmI6ed"jk"J_EjFEj<
5~m(IYm!!0dJ5YYm65IjmJkFm4kEjJ_<5~muJ_EjFEj<5~m(IYm!!0d!!0dJ4mW5HJ5
H_;!!0d!!0dU"^ZmXm"IXm"Q@"[ZIXI[jmX3"jZIj"W5~~"$m"ZI3Zm6"Bw-uC"jk"e
@duhyx!!0dU"^Zm"ZI3Z54#"YE~j5F~5mX"YE3j"$m"YEjEI~~i"FX5Ym"W5jZ"Q@"u
"@w"T"@\@!!0dU"I46"$m"~m33"jZI4"Q@x""^Zm"3jIXj"|I~Em"B3mm6y"[I4"$m"
I4ijZ54#!!0dU"54"jZm"XI4#m"w-Ae@dx!!0d!!0dJ4mW[kE4jJNM!!0dJ4mW[kE4j
JjmYF!!0dJ[ZIX6mHJjkF["nJh"U"Z5#m3j"ZI3Zm6"[ZIXI[jmX"Be@dy!!0dJ[ZIX
6mHJ$kj["nJC"U"~kWm3j"ZI3Zm6"[ZIXI[jmX"Bw-y!!0dJ4mW[kE4jJXI4#m!!0dJ
XI4#muJjkF["JI6|I4[mJXI4#mAJ$kj["JI6|I4[mJXI4#m"e"U"Q@!!0d!!0dJ6mHJ
L{JXmI6ed"jk"JNI3ZME~j5F~5mX"JNMuJNI3ZME~j5F~5mXJXm~I2!!0d""J_;jXEm
!!0d""J5H4EYJNMPJXI4#m"J_;HI~3mJH5!!0d""J5H4EYJNM">w"J_;HI~3mJH5!!0d
""JjmYFuJNM"J65|56mJjmYF"@w"JYE~j5F~iJjmYF"@w!!0d""J5H4EYJjmYFuJNM
"J_;HI~3m"JH5"U"[Zm[G"[kYYk4"HI[jkX"kH"@w!!0d""JjmYFuJNM"J65|56mJjm
YF"@"JYE~j5F~iJjmYF"@!!0d""J5H4EYJjmYFuJNM"J_;HI~3m"JH5"U"[Zm[G"[kY
Yk4"HI[jkX"kH"@!!0d""J5H_;"Jm~3m"U"HI5~m6xxxXmFXkYFj!!0d"""""JYm33I
#m{:~mI3m"m4jmX"I"4EY$mX"54"jZm"XI4#m"w"A"Q@!!0d""""""""jZIj"53"4kj
"I"YE~j5F~m"kH"@"kX"@wx1JL!!0d""JH51!!0dJL!!0d!!0dJ4mW[kE4jJN'!!0dJ
6mHJL{JXmI6ed"jk"JNI3Z'mm6"JN'uJNI3Z'mm6JXm~I2!!0d""J_;jXEm!!0d""J5
H4EYJN'"PJjkF["J_;HI~3mJH5!!0d""J5H4EYJN'">J$kj["J_;HI~3mJH5!!0d""J
5H_;"Jm~3m"U"HI5~m6xxxXmFXkYFj!!0d"""JYm33I#m{:~mI3m"m4jmX"I"4EY$mX
"54"jZm"XI4#m!!0d"""""""""J4EY$mXJ$kj[J3FI[m"A"J4EY$mXJjkF[x1JL!!0d
""JH51!!0dJL!!0d!!0dU"(kW"Wm"W5~~"XmI6"jZm"3mFIXIjkX"jm2j"jXmIj54#"
3Fm[5I~"[ZIXI[jmX3!!0dU"I3"kX654IXi"k4m3x""(mm6"jk"6k"jZm"[kYYI463"
54"YI[Xk3"3k"[Ij[k6m!!0dU"[ZI4#m3"6k4gj"ZEXj"jZm"[kYYI463")"WI4j"jk
"6k!!A4!!0d!!0dJ$m#54#XkEF!!0d""Jm3[IFm[ZIXuAeJ26mHJ'mF!!0d""{J3jX5
4#JUJ3jX54#JUAAAAAAAAAAJ3jX54#JOEjAAAJ3jX54#JNmXmAAAAAAAAAA1!!0d""J
6mHJ6kCe{J[Ij[k6mnCeue@"1!!0d""J6mHJL{{J6k3Fm[5I~3Jm46~54m[ZIXuAe
!!0d""JYm33I#m{^Zm"3mFIXIj5k4"jm2j"53,"nJ'mFgx"1U!!0d""JYm33I#m{a4jmX
"I"XmF~I[mYm4j"kX"lE3j"FXm33"LmjEX4,"T1U!!0d""JXmI6Ae"jk"JjmYF!!0d"
"J5H2JjmYFJmYFji"Jm~3m""J26mHJ'mF{JjmYF1JH511!!0d""JL!!0dJm46#XkEF
!!0d!!0dU"B[Ijm#kX5m3"$I[G"jk"4kXYI~y!!0dU!!0dU"(kW"Wm"IXm"XmI6i"jk"
XmI6"jZm"}Em3j5k4"I46"I43WmXf"I46"WX5jm"jZm!!0dU"kEjFEjx""'54[m"I~~
"jZ53"53"6k4m"W5jZ"I~~"[ZIXI[jmX3"$m54#!!0dU"nkjZmXgf"6mH54m"YI[Xk3
"jk"6k"I~~"jZm"FXk[m3354#"$mHkXm"[ZI4#54#!!0dU"I~~"jZm"[Ij[k6m3x!!0d
!!0dJ4mW[kE4jJON!!0d!!0dU"(kjm,"^Z53"YI[Xk"W5~~"I~3k"$m"WX5jjm4"54
"3ZkXj"HkXY"W5jZ"jZm!!0dU"I43WmX"6m[k6mXx!!0d!!0dJ6mHJI~~kjZmX{JONu
"U"3mj"I~~"[Ij[k6m3"u"nkjZmXg!!0d"J~kkF!!0d"""J[Ij[k6mJONue@!!0dU"
"J~[[k6mJONuJON""U"k4~i"E3m6"HkX"6m[k6mX!!0d"""JI6|I4[mJON"$i"e!!0d
"""J5H4EYJON>@-d!!0d"JXmFmIj!!0d"Jm46~54m[ZIXuew"U"\M!!0d1!!0d!!0dU
"Km"W5~~"4mm6"jk"[kFi"~54m3"HXkY"jZm"}Em3j5k4"H5~m"I46"WX5jm"jZmY
!!0dU"jk"jZm"kEjFEj"H5~m"|mX$Ij5Yx!!0d!!0dJ6mHJOkFiqEm3j5k4{Jm46~54m[
ZIXAe"J4mW~54m[ZIXAe"JOq1!!0d!!0dJ6mHJOq{U"U"jZ53"#5|m3"mXXkX"k4"4E
~~"54FEj"H5~mx"")j"3ZkE~6!!A4!!0d"JXmI6Jq<5~m"jkJ~54m"U"|mX$Ij5Y"3Z
kE~6"$m"k4"Ij"jZm"YkYm4j!!A4!!0d"J5HmkHJq<5~m"J5YYm65IjmJ[~k3m54Jq<
5~m!!0d"Jm~3m"J5YYm65IjmJWX5jm"J_<5~m"{J~54m1Jm2FI46IHjmX"JOq!!0d"J
H51!!0d!!0d!!0dU"^Z53"YI[Xk"YIGm3"I~~"[ZIXI[jmX3"I[j5|mf"I46"6mH54m
3"jZmY"I3"jZm5X!!0dU"Zm2"[k6m3,"!!A4!!A4ZZx!!0d!!0dJ6mHJ=~~=[jNm2{J
6mHJZm2ON{..1U!!0d""J~kkF!!0d""""J[Ij[k6m!!20JZm2ONuJI[j5|m!!0d""""
Jm6mHJZm2[Z{J~kWmX[I3m{Jm6mHJ4km2FI46JZm2[Z{JZm2ON111JZm2[Z!!0d""""
J(EYmX5[I~~iJm6mH{!!20JZm2ON1{!!A4!!A4JZm2[Z1U!!0d""""J5H4EY!!20JZm
2ON>!!20<<!!0d""""""Jm6mHJZm2ON{Jm2FI46IHjmXJ3jmFZm2JZm2ON1U!!0d""J
XmFmIj1!!0d!!0dJ6mHJ(EYmX5[I~~iCeC@{J~[[k6mnJhC@JXm~I2"J~kWmX[I3m{C
eh11!!0d!!0dJ6mHJ3jmFZm2CeC@{J5H[I3m!!20C@"CeeJkX"Ce@JkX"CewJkX"CeR
JkX"Ce-JkX"CedJkX!!0d""Ce?JkX"CevJkX"CeQJkX"Ce=JkX"CetJkX"CeOJkX"Ce
*JkX"CeaJkX"Ce<JkX!!0d"""J5H[I3m!!20Ce"eJkX"@JkX"wJkX"RJkX"-JkX"dJk
X"?JkX"vJkX"QJkX!!0d""""""""=JkX"tJkX"OJkX"*JkX"aJkX"<JkX"e.JH5".JH
51!!0d!!0dU"^Z53"YI[Xk"k|mXX56m3"jZm"!!A4!!A4ZZ"4kjIj5k4"HkX"FX54jI
$~m"[ZIXI[jmX3f!!0dU"I46"6mH54m3"jZmY"I3"jZm5X"ZI3Zm6"[kE4jmXFIXj3x
""JON"53"jZm!!0dU"F~I54Ajm2j"[ZIXI[jmX"4EY$mXf"JjmYF"53"5j3"[k6m6"[
ZIXI[jmXx!!0d!!0dJ6mHJ(kXYNI3Z{JjmYFuJN'"U"3mm6"|I~Em!!0d""JONuJ$kj
[!!0d""J~kkF!!0d""""J~[[k6mnJhuJON"J~[[k6mnJvuJjmYF!!0d""""J~kWmX[I
3m{Jm6mHh{v11U!!0d""""J5H4EYJON>JjkF[!!0d""""""JI6|I4[m"JjmYF"JNM""
U"I66"YE~j5F~5mX"jk"ZI3Z"|I~Emf""E354#xxx!!0d""""""J5H4EY"JjmYFPJjk
F["JI6|I4[mJjmYFAJXI4#m"JH5"U"Yk6E~k"IX5jZYmj5[!!0d""""""JI6|I4[mJO
N"e!!0d""JXmFmIj1!!0d!!0dU"(kWf"Wm"6mH54m"jZm"352"m2[mFj5k4"[ZIXI[j
mX3!!0d!!0dJ6mHJa2[mFj{U!!0d""J(EYmX5[I~~iJ6mH{e1{!!A4!!A4=e1U!!0d"
"J(EYmX5[I~~iJ6mH{@1{!!A4!!A4=@1U!!0d""J(EYmX5[I~~iJ6mH{w1{!!A4!!A4
=w1U!!0d""J(EYmX5[I~~iJ6mH{n!!A41{!!A4!!A4=R1U!!0d""J(EYmX5[I~~iJ6m
H{nJ"1{!!201U!!0d""J(EYmX5[I~~iJ6mH{nJ!!201{
[email protected]!!0d1!!0d
!!0d!!0dU"OkFi"jZm"3k~Ej5k4"HXkY"jZm"3k~Ej5k4"H5~mf"FmXHkXY"jZm"jXI43
HkXYIj5k43!!0dU"BE354#"Jm6mHyf"I46"WX5jm"kEj"54"IFFXk2x"dRA[ZIXI[jm
X"~54m3x""^Zm"WZk~m!!0dU"3k~Ej5k4"YE3j"H5j"54"YmYkXi"$m[IE3m")"6k4g
j"WI4j"jk"GmmF"[kE4j54#"jZm!!0dU"[ZIXI[jmX3"I46"kEjFEjj54#"jZmY"I"H
mW"Ij"I"j5Ymx"")"6k4gj"3I|m"jZm"WZk~m!!0dU"Ym33"54"k4m"YI[Xk"jZkE#Z
f"$m[IE3m"I6654#"jk"I"~k4#"~53j"#mj3"|mXi"3~kWx!!0d!!0dJ6mHJN56m'k~
Ej5k4{J6mHJ=~~{1JjmYFue"J4mW~54m[ZIXuew"Jm46~54m[ZIXew!!0d""J~mjJ
!!A4JXm~I2"JN561!!0d!!0dJ6mHJN56{U"U!!0d"JXmI6J'<5~m"jkJ~54m!!0d"J5Hm
kHJ'<5~m!!0d"""J5YYm65IjmJ[~k3m54J'<5~m"Jm2FI46IHjmX"JKX5jm'F~5j!!0d
"Jm~3m!!0d"""Jm6mHJ=~~{J=~~"J!!A4{J4EY$mXJjmYF11U!!0d"""Jm2FI46IHj
mXJm6mHJ[34IYm"rJ4EY$mXJjmYFJm46[34IYm{J~54m1U!!0d"""JI6|I4[mJjmYF"
eJXm~I2!!0d"""Jm2FI46IHjmX"JN56!!0d"JH51!!0d!!0dU"^Zm"4m2j"YI[Xk3"I
Xm"E3m6"jk"3F~5j"I"~53j"kH"[k6m"[ZIXI[jmX3!!0dU"54jk"I$kEj"dR"[ZIXI
[jmX3,""jZm"H5X3j"hdR"54"I"YI[Xk"BCey"IXm!!0dU"WX5jjm4"jk"jZm"kEjFE
j"H5~m"I46"jZm"XmYI546mX"IXm"~mHj"54"jZm!!0dU"YI[Xkx""^Zm"3F~5j"W5~
~"4kj"54jmXXEFj"I4i"!!A4!!A4ZZ"3m}Em4[m3"BkX!!0dU"jZm"3Fm[5I~"!!A4
!!A4=w"3m}Em4[m3yx!!0d!!0dJ$m#54#XkEF!!0dJ[Ij[k6mewue@""U!!0dJ#6mHJ[
jX~Y{\\M1U!!0dJm46#XkEF!!0d!!0dJ6mHJKX5jm'F~5j{U!!0d""J6mHJ!!A4CCe{
J[34IYm"rCCeJm46[34IYm1U!!0d""Jm6mHJ=~~{J=~~""U"m2FI46"jk"XmI~"[ZIX
I[jmX3!!0d""""!!A4!!A4.w!!A4!!A4.w!!A4!!A4=J3jX54#{m46!!A4!!A4=J3jX
54#11U"I66"jmXY54Ij5k4"[k6m3,!!0d""J6mHJx{1U"""""""""""""""""""""""
""""U""11J[34IYm"m46Jm46[34IYm!!0d""J4mW~54m[ZIXuew"U"\M!!0d""Jm6mH
J=~~{Jm2FI46IHjmXJK'J=~~"JxJxJxJxJxJxJxJxJxJxJm461!!0d""J5YYm65IjmJ
WX5jmJ_<5~m{J=~~1U!!0d""J5YYm65IjmJ[~k3mkEjJ_<5~m!!0d1!!0d!!0dJ6mHJ
K'{JfJfJfJfJfJfJfJfJOEjJXm~I21U"FI33"k|mX"v"T"v"u"dR"[ZIX!!0d!!0dJ6
mHJfCeJXm~I2"C@CwCRC-CdC?CvCQ{C@CwCRC-CdC?CvCQU"FI33"v"[ZIX!!0d""J5
H2CQJxJm2FI46IHjmXJm46m6mHJH5CeJXm~I21!!0d!!0dJ6mHJOEjJXm~I2CeC@Cw{
U")43mXj"~54mHmm6"[ZIXI[jmXf!!0d""J5H2Ce!!A4U"""""""""""U"$Ej"6k4gj
"54jmXXEFj"I4i"!!A4!!A4ZZ!!0d""""J5H2C@!!A4J[jX~Y"CeC@CwJm~3m"CeC@C
wJ[jX~Y"JH5!!0d""Jm~3m!!0d""""J5H2C@!!A4CeJ[jX~YC@CwJm~3m"CeC@J[jX~
YCwJH5!!0d""JH5"JK'1!!0d!!0dJ6mHJm46m6mHCeJm46{1U"m46"kH"jm2jf"3k"m
46"Jm6mH"I46"#k$$~m"XmYI5454#"lE4G!!0d!!0dU"=~~"jZIjg3"~mHj"jk"6mH5
4m"IXm"jZm"3G5FFmX"Yk6E~m"I46"jZm"6m[k6mX!!0dU"Yk6E~mx""^Zmi"$kjZ"I
Xm"WX5jjm4"54jk"jZm"Fk3j54#"jk"$m"m2m[E6m6!!0dU"$i"jZm"Xm[m5|mXx""^
Zmi"IXm"[kYFXm33m6"I46"k$HE3[Ijm6f"$Ej"jZm!!0dU"k$HE3[Ij5k4"53"Yk3j
~i"lE3j"[kYFXm335k4,"E354#"[kYYI46"3iY$k~3!!0dU"~5Gm"Jf"HkX"~k4#mX"
[kYYI46"WkX63f"I46"E354#"$E5~jA54"Xm#53jmX3!!0dU"543jmI6"kH"I~~k[Ij
54#"Xm#53jmX3x""'kYm"kH"jZm"I$$Xm|5Ij5k43"I46!!0dU"jZm"[Zk5[m3"kH"X
m#53jmX"IXm"YmI4j"jk"$m"[k4HE354#"I46SkX"35~~ix!!0dU":~I54Ajm2j"|mX
35k43"kH"jZm"Yk6E~m3"IXm"#5|m4"ZmXmf"I3"Wm~~"I3!!0dU"I"#~k33IXi"kH"
jZm"k$HE3[Ij5k4x!!0dU!!0dU"NmXm"53"jZm"3G5FFmX"Yk6E~mx"")j"53"E3m6"
54"jZm"HkXY,!!0dU"JqEm3j5k4,!!0dU"I"3Fm[5I~"~54m"kH"jm2j!!0dU"I4ijZ
54#"jZIj"53"3G5FFm6"m4j5Xm~if!!0dU"E4j5~"I#I54"3mm54#!!0dU"I"3Fm[5I
~"~54m"kH"jm2j!!0dU!!0dU"J6mHJqEm3j5k4,{J$#XkEF!!0dU"""JIHjmX#XkEFJ
m46!!0dU"""JI~~kjZmX!!0dU"""J'G5FFmX1!!0dU!!0dU"J'G5FFmX"3jIXj3"jZm
"3G5FF54#"$i"XmI654#"jZm"6m~5Y5jmX"jm2j"I46!!0dU"6mH5454#"jZm"YI[Xk
"nJ'G5Fr54mg"jk"3G5F"I"~54mf"jm3j54#"HkX"jZm!!0dU"m46"jm2jx""^Zm"jm
3j"53"6k4m"$i"[k43jXE[j54#"I"[kYYI46"4IYm"HXkY!!0dU"jZm"3m4j54m~"jm
2j"I46"HXkY"mI[Z"~54mf"I46"[kYFIX54#"jZmY"BW5jZ!!0dU"J5H2yx!!0dU!!0d
U"{J[Ij[k6mnJ\\Mue@"U"kjZmX!!0dU"J#6mHJ'G5FFmXCe\\MC@\\M{U"XmI6"jZ
53"~54m"AP"Cec"4m2j"~54m"AP"C@!!0dU"U""6mH54m"3m4j54m~"YI[Xk,!!0dU"
""Jm2FI46IHjmXJ6mHJ[34IYmC@Jm46[34IYmJSJJSSJS{=+\\Mf8DJk|mX1U!!0dU"
U"6mH54m"YI[Xk"jk"XmI6"~54m"I46"[kYFIXm"5j"W5jZ"3m4j54m~,!!0dU"""J6
mHJ'G5Fr54mCCe\\M{Jm2FI46IHjmXU!!0dU"""""J5H2J[34IYmCCeJm2FI46IHjmX
Jm46[34IYmJ[34IYmC@Jm46[34IYmU!!0dU"""""""Jm2FI46IHjmX"J*m[k6m=43Wm
X"U"H5453Zm6"3G5FF54#!!0dU"""""Jm~3mU!!0dU"""""""Jm2FI46IHjmX"J'G5F
r54m"U"GmmF"3G5FF54#!!0dU"""""JH51U!!0dU"1!!0dU!!0dU"J*m[k6m=43WmX"
E4ZI3Zm3"jZm"I43WmX"jm2j"I46"WX5jm3"5j"jk"jZm!!0dU"3[Xmm4x"^Zm"E4FX
54jI$~m"[ZIXI[jmX3"XmFXm3m4jm6"I3"!!A4!!A4ZZ"IXm"~mHj!!0dU"I3"jZmi"
IXm"B5xmxf"Fk335$~i"E4FX54jI$~m!!A4y"Ok4jXk~AM"B!!A4!!A4.6y"W5~~!!0d
U"$XmIG"jZm"jm2j"54jk"~54m3"k4"jZm"3[Xmm4c"jZm"~54m$XmIG3"54"jZm
!!0dU"ZI3Zm6"jm2j"IXm"5#4kXm6x""JN'"53"3mj"jk"jZm"3mm6"|I~Em"$mHkXm
!!0dU"J*m[k6m=43WmX"53"54|kGm6x!!0dU!!0dU"J6mHJ*m[k6m=43WmX{U"B[kYFIX
m"H5X3j"FIXj"W5jZ"J(kXYNI3Zy!!0dU"""JONuJ$kj["U"H5X3j"[ZIXI[jmX"BF~
I54"jm2jy!!0dU"""J~kkF"U"k|mX"ZI3Zm6"[ZIXI[jmX3!!0dU"""""J~[[k6mJN'
uJON"U"YIF"[k654#"jk"F~I54"jm2j!!0dU"""""J5H4EYJON>JjkF[!!0dU""""""
"JI6|I4[m"JN'"JNM""U"I66"YE~j5F~5mX"jk"ZI3Z"|I~Emf""E354#xxx!!0dU""
"""""JI6|I4[mJON"e"U"jZ53"ZmXm"FXm|m4j3"JN'"HXkY"$m54#"jm3jm6"FXmYI
jEXm~i!!0dU"""""""J5H4EY"JN'PJjkF["JI6|I4[mJN'AJXI4#m"JH5"U"Yk6E~k"
IX5jZYmj5[!!0dU"""JXmFmIj!!0dU"U"*mH54m"m2[mFj5k43x""OkYFIXm"jZ53"F
IXj"W5jZ"Ja2[mFj!!0dU"""J[Ij[k6mnJ\\=u."U"nm3[IFmgf"J!!0dU"U"J[Ij[k
6mnJ\\tue"U"nkFm4gf"{"AA"E44m[m33IXi!!0dU"""J[Ij[k6mnJ\\Ou@"U"n[~k3
mgf"1!!0dU"""J[Ij[k6mnJ!!A4u?"""U"n3EFmX3[X5Fjgf"\"BHkX"Zm2"54FEjy
!!0dU"""J~[[k6mnJ"unJ!!20!!0dU"""J~[[k6mnJ!!20unJ!!0dU"U!!0dU"""Jm46
~54m[ZIXuAe"U"5#4kXm"~54m"$XmIG3"54"[k6m6"jm2j!!0dU"""J4mW~54m[ZIXu
nJ\\M!!0dU"""J~kWmX[I3mJ$#XkEFJ5YYm65IjmJWX5jmJN'J$#XkEF!!0dU"1!!0d
U!!0dU"s~k33IXi"kH"I$$Xm|5Ij5k43"I46"k$HE3[Ij5k43!!0dU"AAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA!!0dU!!0dU!!0dU"J~mj"""""""""""""""""
h!!0dU"JN'"""BJjk~mXI4[my"""Jf"""""BJZI3Z"|I~Emy""""""""Ce!!0dU"JNM
""""""""""""""""""Jx"""""BZI3Z"YE~j5F~5mXy""""C@!!0dU"JjkF["""BJYk4
jZy"""""J0"""""B~I3j"ZI3Zm6"[ZIX,"e@d"hy!!0dU"J$kj[""""""""""""""""
w-"""""BH5X3j"ZI3Zm6"[ZIX,"w-"Cy!!0dU"JXI4#m"""""""""""""JimIX""""B
JjkF[AJ$kj[/e,"Q@y!!0dU"JON"""""BJE[ZiFZy""""Jc"""""BI"[ZIXI[jmX"[k
6my!!0dU"J[Ij[k6m"""""""""""""J,!!0dU"J6mH"""""""""""""""""JC!!0dU"
Jm2FI46IHjmX"""""""""J\!!0dU"J[34IYm""""""""""""""J{!!0dU"Jm46[34IY
m"""""""""""J1!!0dU"J~[[k6m""""""""""""""J7!!0dU"JI6|I4[m""""""""""
"""JA!!0dU"J$#XkEF""""""""""""""JT!!0dU"J5H4EY"""""""""""""""J5H!!0d
U"Jm46~54m[ZIX"""""""""Jb!!0dU"J5H2"""""""""""""""J5H\\v!!0dU"J'G5
FFmX"""""""""""""JK!!0dU"J'G5Fr54m""""""""""""J&!!0dU"J*m[k6m=43WmX
""""""""J]!!0dU"J~kWmX[I3mJ$#XkEFJ5YYm65IjmJWX5jmJN'J$#XkEF""""""J8
!!0dU!!0dU"^Zm3m"I335#4Ym4j3"IXm"WX5jjm4"jk"jZm"kEjFEj"$i"JKX5jm_Ej
FEj"BI[jEI~~i!!0dU"$i"JK'9y"I46"jZm4"jZm"k$HE3[Ijm6"FXk[m3354#"[k6m
"53"WX5jjm4"B#5|m4!!0dU"I3"I"FIXIYmjmX"54356m"B"y"IHjmX"JKX5jm_EjFE
jyx"JKX5jm^ZmLm3j"53!!0dU"IEjkYIj5[I~~i"54|kGm6"jk"[kFi"jZm"}Em3j5k
4"I46"jZm4"jZm"I43WmXx!!0d!!0dJ6mHJKX5jm_EjFEjCeC@{J$m#54#XkEF!!0d"
"J[Ij[k6mnJ\ue@"J4mW~54m[ZIXuew!!0d""Jm6mHJJ{J4km2FI46JK'9{Ce1{C@11
JJ1!!0d!!0dJ6mHJK'9CeC@{J~mjJJJ3jX54#"U"|mX$Ij5Y53Zf"I|k56"3FI[m3"I
HjmX"[kYYI463!!0d""J5YYm65IjmJWX5jmJ_<5~m!!0d"""{JJJ~mjJJhJJJ~mjJJh
JJJCJJJ6mHJJJCJJJx{C@1JJhJJJfJJJjk~mXI4[mJJJfCe1U!!0d""JI~~kjZmXJ[I
j[k6mnBueJ[Ij[k6mnyu@!!0d""JIHjmXI335#4Ym4jJKX5jm^ZmLm3j!!0d""JjkG3
u1!!0d!!0dJ6mHJKX5jm^ZmLm3j{J5YYm65IjmJWX5jmJ_<5~m{JjZmJjkG3.1!!0d
""J[Ij[k6mnBue@J[Ij[k6mnyue@!!0d""J5YYm65IjmJWX5jmJ_<5~m{J3jX54#JqE
m3j5k4,1U!!0d""J5YYm65IjmJWX5jmJ_<5~m{J'mF1U!!0d""JOkFiqEm3j5k4!!0d
""J5YYm65IjmJWX5jmJ_<5~m{J'mF1U!!0d""{J=~~=[jNm2J(kXYNI3ZJa2[mFjJN5
6m'k~Ej5k41!!0d""Jm46#XkEF1!!0d!!0d!!0dJKX5jm_EjFEj{JjZmJN'1{JjZmJN
MU!!0d1BhJ0JYk4jZhJcJE[ZiFZhJ,J[Ij[k6mhJ\Jm2FI46IHjmXhJ{J[34IYm{hJC
J26mHhhJ3jX54#!!0dJCJe{h\\=1JCJw{h\\O1JCJR{h\\I11hJ1Jm46[34IYmhJT{h
J7J~[[k6mJCJ8{J4mW~54m[ZIX!!20*!!0dJ~kWmX[I3mJTJ5YYm65IjmJWX5jmJfJT
1hJAJI6|I4[mJimIXQ@hJ5HJ5H4EYhJbJm46~54m[ZIX!!0dJ0!!20?aJCJ\\-eEm3\
\R5k\\dm,{Jc."J~kkFJ,Jc!!20OJAJce"J5HJc>@-d"JXmFmIjJb!!20*JK1{J,!!20
\end{comment}
\begin{lcode}
*!!20O!!0dJ#6mHJKCe\\MC@\\M{J\JCJ{C@J1JSJJSSJS{=+\\Mf8DJk|mX1JCJ&C
Ce\\M{J\J5H\\vJ{CCeJ\U!!0dJ1J{C@J1J\J]Jm~3mJ\J&JH51J&11JCJ]{Jcw-J~k
kFJ7JfJcJ5HJc>J0JAJfJxJAJceJ5HJfPJ0!!0dJAJfAJimIXJH5JXmFmIjJ,eg.J,w
!!20@J,
[email protected][email protected]!!0dy!!0d!!0dJm46!!0d
!!0d!!03!!03!!A{end!!A}
\end{lcode}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%[Addendum 2: TeX encoder for my decoder. (mjd,18-Aug-1994)]
\subsection{Addendum 2}
TeX encoder for my decoder. (mjd,18-Aug-1994)
\begin{lcode}
% Source character set: 13,32-126 = 96
%
% (Note exclusion of tab. Assumption: Text to be translated will
% always be untabified first.)
%
% Target character set: 33-126.
%
% Carriage return (13) cannot be included in the target set because
% of the constraint to have a maximum line length of 72 in the
% encoded text. If 13 (carriage return) were included in the
% encoding, then the end of the current line would only occur at
% the next instance in the ciphered text of the character that
% translates to 13. And depending on what that character is, who
% knows how long the encoded line could be? Perhaps as long as the
% entire text.
%
% Space (32) is not included in the target set for a subtler
% reason. If spaces in the encoded text happen to fall at the end
% of a line, they will be dropped by TeX during the decoding
% process, instead of decoded. So we either must exclude them from
% the target set, or make sure that they never fall at the end of a
% line.
%
% By excluding space from the target set, we make it possible for
% the decoder to use a space as its argument delimiter. If we have
% only one space, at the end of the encoded text, it is not so hard
% to ensure that it does not fall at the end of a line. But note
% that the decoder must make sure to change the catcode of space to
% something other than 10, so that it will not disappear if it
% falls at the *beginning* of a line.
\def\colon{:}\def\arrow{->}%
\let\isx\message
%\def\isx#1{}
\iffalse
% OK, here is how the encoding works. Start with \mag = random (in
% the target range 33-125), first encoding value. Handle two
% special cases first: ^^M encodes to \mag, space encodes to \mag
% +1. Then start normal encoding at \fam = 35 (char 35 = # encodes
% to \mag +2, and so forth). When \mag reaches 126, we wrap it
% around to 33 (don't want to encode any character to space).
% Finally, when \fam reaches 126, we must handle the last three
% characters (126,33,34: ~!") as digraphs: encode them as ~x~y~z,
% where xyz are obtained by continuing to increment \mag.
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_ ! "#$%&'()*+,-./0123456789:;<=>?
R S~S~TTUVWXYZ[\]^_`abcdefghijklmnop
@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|} ~
qrstuvwxyz{|}!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQ~R
\fi % ^^^
\def\setup{%
\def\notilde{}% later will be defined to include a tilde
\def\encodeone{%
\catcode\fam\active\lccode126\fam\lccode 48\mag
\lowercase{\edef~{\notilde 0}%
\isx{[\string~\colon \notilde 0\space\number\fam\arrow\number\mag]}%
}%
\advance\mag7 \ifnum\mag>125\advance\mag-93 \fi
\advance\fam1
}%
\def\do{\encodeone \csname do\ifnum\fam>125 stop\fi\endcsname
}%
% ASSUMPTION: \mag initialized before the call of \setup
% Encode ^^M -> \mag
\fam13 \encodeone
% Encode space -> next \mag
\fam32 \encodeone
% Now encode the rest
\fam35 \let\dostop\relax \do
% Now \fam = 126, \mag = ?. We need to define encoding for
% characters 34,33,126 ("!~) as ~z ~y ~x. But what are convenient
% values for x y z? Why, just the next \mag's in sequence
\edef\notilde{\string ~}
\encodeone \fam33 \encodeone \encodeone
}
\def\outwrite{\immediate\write15{\outline}%
% If a digraph occurred at the end of the line, carry over the
% second character to the beginning of the next line.
\expandafter\ifx\csname 73\endcsname\relax
\else
\expandafter\let\expandafter\1\csname 73\endcsname
\expandafter\let\csname 73\endcsname\relax
\charnum 1
\fi
\checkeof}
% For fast looking on screen:
%\def\outwrite{\immediate\write16{\outline}\checkeof}
\begingroup
\let\0\catcode \0`\0 11 \0`\2 11 \0`\3 11 \0`\4 11 \0`\5 11
\0`\6 11 \0`\7 11 \0`\8 11 \0`\9 11 \0`\1 11
\gdef\outline{\1\2\3\4\5\6\7\8\9\10\11\12\13\14\15\16\17\18\19
\20\21\22\23\24\25\26\27\28\29\30\31\32\33\34\35\36\37\38\39
\40\41\42\43\44\45\46\47\48\49\50\51\52\53\54\55\56\57\58\59
\60\61\62\63\64\65\66\67\68\69\70\71\72}
\endgroup
\newcount\charnum
\def\checkeof{\futurelet\next\encodemore}
\def\tildecheck#1#2{\if \string~#1%
\expandafter\def\csname\number\charnum\endcsname{#1}%
\advance\charnum 1
\expandafter\def\csname\number\charnum\endcsname{#2}%
\fi}
\def\encodemore{\ifx\next\EOF
\let\next\outwrite \let\checkeof\relax
\global\tracingcommands2\global\tracingmacros2\global\tracingonline0
% At end of file, assume that there was a ^^M at the end,
% translated to the digraph ~|. Remove it, to reduce the number of
% blank lines that will be produced on screen during decoding.
% BUT, if \charnum = 72, leave the ^^M there to avoid having the
% space at the end of the line.
\ifnum\charnum<72
\expandafter\def\csname\number\charnum\endcsname{ }%
\else
\def\1{ }%
\fi
\else
\advance\charnum 1
\ifnum\charnum>72
\charnum 0 \let\next\outwrite
\else
\let\next\getnextchar
\fi
\fi
\next}
\def\getnextchar#1{%
\edef\0{#1}%
\expandafter\let\csname\number\charnum\endcsname\0\relax
\expandafter\tildecheck\0\relax\relax
\checkeof}%
% For this we need just a unique no-op value for \ifx comparison.
\def\EOF{\relax\relax}
\def\writefile#1{\expandafter\checkeof\input#1 \EOF}%
\begingroup
% Define \0 to read in the text for \writepreamble.
\def\0#1XXX#2^^JZZZ^^J{\endgroup
\def\writepreamble##1{\begingroup
% Convert ##1 into a hex number.
\newlinechar=10 \chardef\0=##1\def\1####1"{"}%
\immediate\write15{#1\expandafter\1\meaning\0#2}\endgroup}}%
% Now change all special catcodes to 12. We don't use \dospecials
% because we want to do backslash last, in conjunction with
% \afterassignment.
\catcode`\{=12 \catcode`\}=12 \catcode`\#=12
\catcode`\~=12 \catcode`\@=12 \catcode`\$=12
\catcode`\^=12 \catcode`\&=12 \catcode`\_=12 \catcode`\|=12
% The following line will turn off the last two remaining special
% characters % and \, set end-of-line character to ^^J (for later
% use in the \write), and then call \0. ^^M still has category 5 at
% this point and the new value of \endlinechar won't get applied
% until the *next* line is read, so the catcode assignment for \
% will get terminated properly by the space from ^^M, thus \0 will
% get called before TeX attempts to read the % at the beginning of
% the subsequent line.
\catcode`\%=12 \endlinechar=10 \afterassignment\0 \catcode`\\=12
%%%% Self-decoding answer: run the following text through plain TeX %%%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\+\u\uccode \m
13\c\m9\+\p\uppercase\d\i{\a\f7 \ifnum\f>125 \a\f-93 \fi}\d~{\u\f\m \c\m
12 \a\m1 \i \ifnum\m>125 \+~\1\fi~}\d\0#1{\ifnum`#1>"D \if#1 !\else "\fi
\else\string~\fi}\u`9"20\p{\d\1#19}{\newlinechar13\d\3{\immediate\write1
6}\+~\0\p{\3{}\3{#1}\batchmode\end}}\fXXX\u\f\m\i\m32\u\f\m\c\m12\i\m35~
ZZZ
\def\encodefile#1{%
\immediate\openout15=encode.out \relax
\begingroup
% Get a random number from \time, normalize it to fall in the range
% 33--125. First set \mag = \time mod 93, then add 33 to make it
% fall in the proper range.
\fam\time \mag\time \divide\fam93 \multiply\fam 93 \advance\mag-\fam
\advance\mag 33
\message{======= Code shift: time \number\time\space -->
mag \number\mag\space ============================}%
\writepreamble{\number\mag}%
% \setup uses \mag.
\setup \charnum=0
\immediate\write16{Starting to create file encode.out . . .}%
\writefile{#1}%
\endgroup
\immediate\closeout15 \relax
\immediate\write16{The encoded output is in the file encode.out.}%
}
\immediate\write16{Enter the name of the file you want to encode:}
{\catcode\endlinechar=9 \global\read-1 to\filnam}
\encodefile{\filnam}
\end
\end{lcode}
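
Both the encoder above and its decoder preamble advance the code shift in
steps of 7, wrapping from above 125 back into the range 33--125. Because 7
and 93 have no common factor, the stepping visits all 93 target codes before
it repeats, so the 93 singly encoded characters cannot collide. The following
plain TeX fragment is only a sketch for checking that claim; the counter names
\cmd{\code} and \cmd{\steps} are invented for the illustration and are not
part of the original programs.
\begin{lcode}
\newcount\code \newcount\steps
\code=33 \steps=0
\loop
  % same stepping rule as \encodeone: add 7, wrap above 125
  \advance\code by 7
  \ifnum\code>125 \advance\code by -93 \fi
  \advance\steps by 1
\ifnum\code>33 \repeat % stop on the first return to 33
\message{steps before repeating: \number\steps}% 93
\end{lcode}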
%$
%%\endinput
\chapter{Defining new control sequences}
\section{Exercise}
%%\input{ex012}
% ex012.tex
\begin{comment}
Date: 24 Sep 1993 16:11:36 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #12
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
========================================================================
*** Exercise 12:
\end{comment}
\ed{\oposted{1993/09/24}. \arch{exercise.012}.}
How many commands are there in plain TeX that can be used to define a
new (i.e., previously undefined) control sequence?
\begin{comment}
========================================================================
E-mail answers to my address, below. A summary will be posted circa
October 15, 1993.
Michael Downes ---------------------------------------------------------
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\section{Answers}
%%\input{ans012}
% ans012.tex
\begin{comment}
[The addendum was not included in the original post but added in my
archives later ---mjd]
Date: 25 Oct 1993 16:36:43 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #12, answer
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/10/25}. \arch{answer.012}.}
%Exercise 12 asked `How many commands are there in plain TeX that can
%be used to define a new (i.e., previously undefined) control
%sequence?'.
This exercise has latent ambiguities. The parenthetical
remark `(i.e., previously undefined)' was intended as a hint towards
the most comprehensive possible answer.
There are three main criteria that could be used for `new' status of a
control sequence:
\begin{enumerate}
\item If executed, the control sequence causes an `\texttt{Undefined control
sequence}' error.
\item The control sequence is \piif{ifx}-equivalent to \cmd{\relax} when constructed
with \cmd{\csname} \texttt{\ldots} \cmd{\endcsname}. This is the basis of the LaTeX
\cmd{\@ifundefined} test.
\item The control sequence has not yet been entered into the hash table.
\end{enumerate}
Criterion (3) doesn't work for one-character control sequences (\cmd{\a},
\cmd{\0}, \cmd{\:}) since they have space reserved for them separate from the
hash table whether or not they are defined in any sense.
Criterion (2) obviously gives a spurious true result if applied to
\cmd{\relax} or to something like LaTeX's \cmd{\protect} command that spends much
of its time being equivalent to \cmd{\relax}.
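
As a rough illustration of criterion (2), the following plain TeX sketch
imitates the LaTeX test. The name \cmd{\isundefined} is an assumption made
for this example (plain TeX has no such command), and \cmd{\foo} is assumed
not to have been defined.
\begin{lcode}
% true if \csname#1\endcsname is \ifx-equal to \relax
\def\isundefined#1{\expandafter\ifx\csname#1\endcsname\relax}
\isundefined{foo}\message{foo: new}\else\message{foo: old}\fi
% spurious: \relax is certainly defined, yet it tests as new
\isundefined{relax}\message{relax: new}\else\message{relax: old}\fi
\end{lcode}
The second test reports \cmd{\relax} as `new', which is the spurious result
just described.
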
Criterion (1) therefore seems best. Notice that control sequences can
enter into the hash table without becoming defined anywhere along the
way, so a control sequence can be `old' by criterion (3) but still
new by criterion (1). In all of the following examples the control
sequence \cmd{\foo} will get added to the hash table but remain undefined.
\begin{lcode}
\def\x{\foo}
\toks0{\foo}
\string\foo
\noexpand\foo
\gobble\foo (assuming \def\gobble#1{})
\uppercase{\iffalse\foo\fi}
\show\foo
\meaning\foo
\end{lcode}
Two notable cases where tokenization, but not hash-table-ization, of
\cmd{\foo} occurs are in an \piif{ifx} comparison or on the false branch of an
\piif{if}:
\begin{lcode}
\ifx\foo\something...
\iffalse\foo\fi
\end{lcode}
(\emph{TeXbook}, Appendix D, p.~384).
The straightforward answer to Exercise 12 is to count up the various
kinds of def'ing and let'ing functions (table~\ref{tab:deflet}):
\begin{comment}
\begin{lcode}
Primitive: Nonprimitive:
\def \newcount
\edef \newdimen
\gdef \newskip
\xdef \newmuskip
\let \newfam
\futurelet \newwrite
\chardef \newread
\mathchardef \newbox
\countdef \newtoks
\dimendef \newinsert
\skipdef \newlanguage
\muskipdef \newif
\toksdef \newhelp
\font
\read
\csname
\end{lcode}
\end{comment}
\begin{table}
\centering
\caption{The def'ing and let'ing functions}\label{tab:deflet}
\begin{tabular}{ll} \toprule
Primitive & Nonprimitive \\ \midrule
\cmd{\def} & \cmd{\newcount} \\
\cmd{\edef} & \cmd{\newdimen} \\
\cmd{\gdef} & \cmd{\newskip} \\
\cmd{\xdef} & \cmd{\newmuskip} \\
\cmd{\let} & \cmd{\newfam} \\
\cmd{\futurelet} & \cmd{\newwrite} \\
\cmd{\chardef} & \cmd{\newread} \\
\cmd{\mathchardef} & \cmd{\newbox} \\
\cmd{\countdef} & \cmd{\newtoks} \\
\cmd{\dimendef} & \cmd{\newinsert} \\
\cmd{\skipdef} & \cmd{\newlanguage} \\
\cmd{\muskipdef} & \cmd{\newif} \\
\cmd{\toksdef} & \cmd{\newhelp} \\
\cmd{\font} & \\
\cmd{\read} & \\
\cmd{\csname} & \\
\bottomrule
\end{tabular}
\end{table}
The reason for including \cmd{\csname}? After
\begin{lcode}
\csname foobar\endcsname
\end{lcode}
\cmd{\foobar} is no longer undefined; the change in its status is
indistinguishable from the change effected by the statement
\verb?\let\foobar\relax?. \cmd{\endcsname} is not counted separately because
\cmd{\csname} and \cmd{\endcsname} can only be used together.
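
The change of status can be observed directly. This is a minimal check in
plain TeX, assuming that \cmd{\foobar} has not been defined beforehand:
\begin{lcode}
\message{\meaning\foobar}% undefined
\csname foobar\endcsname
\message{\meaning\foobar}% \relax
\end{lcode}
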
So: 16 primitive, 13 non-primitive make 29 total. But to those should
be added two more, since the statement of the Exercise didn't exclude
`private' macros: (i) the internal function \cmd{\alloc@} of plain.tex
that is shared by all the \cmd{\newxxx} macros (except for \cmd{\newif} and
\cmd{\newhelp}), and (ii) the internal function \cmd{\@if} used by \cmd{\newif}.
That brings the total to 31.
Beyond that there can be added another, less obvious, class of
commands, if we paraphrase the exercise as follows:
\begin{quote}
Find all commands such that executing command \cmd{\xxx}, with its normal
arguments (if any), causes at least one control sequence to pass
from undefined status to defined status, where undefined status
means that executing the control sequence would generate the error
`Undefined control sequence'.
\end{quote}
For example, the first use of \cmd{\loop} causes \cmd{\body} and \cmd{\next} to become
defined. As it turns out, there are many of these in plain TeX
(tables~\ref{tab:user} and~\ref{tab:internal} as well as \verb?'? or \cmd{\rq}
in math mode only).
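
A minimal sketch of the \cmd{\loop} case, assuming a fresh plain TeX run in
which \cmd{\loop} has not yet been used:
\begin{lcode}
\message{\meaning\body}% undefined before the first \loop
\count255=0
\loop \ifnum\count255<3 \advance\count255 by 1 \repeat
\message{\meaning\body}% now a macro holding the loop body
\message{\meaning\next}% \relax, left behind by \iterate
\end{lcode}
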
\begin{comment}
User functions:
\begin{lcode}
\loop, \t, \smash, \vfootnote, \settabs, \phantom,
\vphantom, \hphantom, \footnote, \multispan, \longleftarrow,
\longrightarrow, \mathstrut, \longmapsto, \matrix, \pmatrix;
\end{lcode}
\verb?'? or \cmd{\rq} (math mode only)
\end{comment}
\begin{figure}
\freetabcaption{User functions}\label{tab:user}
\autorows{c}{4}{l}{%
\cmd{\footnote},
\cmd{\hphantom},
\cmd{\longleftarrow},
\cmd{\longmapsto},
\cmd{\longrightarrow},
\cmd{\loop},
\cmd{\mathstrut},
\cmd{\matrix},
\cmd{\multispan},
\cmd{\phantom},
\cmd{\pmatrix},
\cmd{\settabs},
\cmd{\smash},
\cmd{\t},
\cmd{\vfootnote},
\cmd{\vphantom}
}
\end{figure}
\begin{comment}
Internal functions:
\begin{lcode}
\iterate, \relbar, \sett@b, \s@tt@b, \prim@s,
\ph@nt, \fo@t, \f@@t, \pr@m@s, \pr@@@s, \s@tcols
\end{lcode}
\end{comment}
\begin{figure}
\freetabcaption{Internal functions}\label{tab:internal}
\autorows{c}{6}{l}{%
\cmd{\f@@t},
\cmd{\fo@t},
\cmd{\iterate},
\cmd{\ph@nt},
\cmd{\pr@@@s},
\cmd{\pr@m@s},
\cmd{\prim@s},
\cmd{\relbar},
\cmd{\s@tcols},
\cmd{\s@tt@b},
\cmd{\sett@b}
}
\end{figure}
Adding these 18 user functions and 11 internal functions to the
previously cited 31 gives a total of 60 functions available in
\pfile{plain.tex} that satisfy a strict interpretation of the exercise
statement.
Credit for the best answer goes to Dan Luecking\index{Luecking, Dan},
who found 29 of the
primary 31, and did not miss the other two (\cmd{\csname}, \cmd{\@if}) by
overlooking them but by considering them and believing they didn't
satisfy the requirements.
My own score in that part was 28: I overlooked \cmd{\read}, \cmd{\alloc@}, and
\cmd{\@if} until Luecking and Peter Schmitt\index{Schmitt, Peter}
brought them to my notice.
Ian Collier\index{Collier, Ian} also submitted a good answer, including
identification of
the secondary class of functions that define scratch macros as a side
effect.
%%========================================================================
Notes:
\begin{itemize}
\item \cmd{\iterate}, \cmd{\settabs}, \cmd{\sett@b}, \cmd{\s@tt@b},
\cmd{\t}, \cmd{\prim@s}, \cmd{\ph@nt}, \cmd{\smash},
\cmd{\vfootnote}, \cmd{\fo@t}, \cmd{\f@@t} all define \cmd{\next}.
\item \cmd{\loop} defines \cmd{\body}.
\item \cmd{\pr@m@s} defines \cmd{\nxt}.
\item \cmd{\prim@s} is called by active \verb?'? (mathcode \verb?"8000?)
and by \cmd{\pr@@@s}.
\item \cmd{\iterate} is called by \cmd{\loop}.
\item \cmd{\sett@b} is called by \cmd{\settabs}.
\item \cmd{\s@tt@b} is \emph{conditionally} called by \cmd{\sett@b}.
\item \cmd{\smash} is called by \cmd{\relbar}.
\item \cmd{\ph@nt} is called by \cmd{\phantom}, \cmd{\vphantom}, and
\cmd{\hphantom}.
\item \cmd{\vfootnote} is called by \cmd{\footnote}.
\item \cmd{\fo@t} is called by \cmd{\vfootnote}.
\item \cmd{\f@@t} is \emph{conditionally} called by \cmd{\fo@t}.
\item Active \verb?'? is produced by \cmd{\rq} if used in math mode.
\item \cmd{\pr@@@s} is called by \cmd{\pr@m@s}.
\item \cmd{\loop} is called by \cmd{\multispan} and \cmd{\s@tcols}.
\item \cmd{\relbar} is called by \cmd{\longleftarrow} and \cmd{\longrightarrow}.
\item \cmd{\vphantom} is called by \cmd{\mathstrut}.
\item \cmd{\pr@m@s} is called by \cmd{\prim@s}.
\item \cmd{\s@tcols} is \emph{conditionally} called by \cmd{\sett@b}.
\item \cmd{\longrightarrow} is called by \cmd{\longmapsto}.
\item \cmd{\mathstrut} is called by \cmd{\matrix}.
\item \cmd{\matrix} is called by \cmd{\pmatrix}.
\item \cmd{\prim@s} won't necessarily define \cmd{\next} because it does
a \cmd{\futurelet}
which will leave \cmd{\next} undefined if the next thing happens to be an
undefined control sequence (rather unlikely, however).
\item \cmd{\vfootnote} and \cmd{\settabs} also do a \cmd{\futurelet} but it is followed by
another macro that ensures that \cmd{\next} does not end up undefined.
\end{itemize}
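
The \cmd{\futurelet} caveat in the last two notes can be seen in isolation
with the following sketch. The names \cmd{\report} and \cmd{\nosuchmacro} are
invented for the example, and \cmd{\nosuchmacro} is assumed to be undefined.
\begin{lcode}
% \report discards the token that \futurelet has just inspected
\def\report#1{\message{\meaning\next}}
\futurelet\next\report\nosuchmacro % undefined
\futurelet\next\report\relax       % \relax
\end{lcode}
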
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{comment}
%$
\section{Addendum}
\enlargethispage{3\onelineskip}
\begin{comment}
Addendum: From comp.text.tex
===========================================================================
Archive-Date: Wed, 29 Sep 1993 13:21:40 CST
From:
[email protected] (Chris Thompson)
Subject: Re: Managing Large LaTeX Files. How ??
Date: Wed, 29 Sep 1993 16:36:23 GMT
To:
[email protected]
\end{comment}
From \texttt{comp.text.tex}
\begin{lcode}
From: [email protected] (Chris Thompson)
Subject: Re: Managing Large LaTeX Files. How ??
Date: Wed, 29 Sep 1993 16:36:23 GMT
To: [email protected]
In article <[email protected]>, Werenfried Spit <[email protected]>
writes:
|> In article <[email protected]>, [email protected]
|> (Richard Kaye) says:
|> >Has anyone else had save stack overflow when LaTeX read the .aux files?
|> >
|> >[Will a TeX guru please explain it to me? I thought \global\def's could not
|> >cause save stack overflow until I found this problem. If it's a general
|> >problem, it seems a bit silly that LaTeX should try to input so much
|> >information in this way.]
|> >
|> >I fixed it so that the data was read {\it outside} the group (as part of one
|>
|> Could someone explain it to me too? I'm even more puzzled after I tried
|> out Richards solution and played a bit with it. When you put in
|> your input file directly after the \documentstyle command the line
|> \input \jobname.aux
|> LaTeX reads the aux file without its memory getting overflowed; then
|> at \begin{document} it reads the aux file again (as expected), but
|> the memory doesn't overflow this time either. (If you leave out the
|> \input \jobname.aux LaTeX only reads the aux file during \begin{document}
|> and then chokes on an exceedence of the save size.)
\end{lcode}
[Chris Thompson] This was a hard one to track down. I could claim that it was all my fault\ldots
The entries on the save stack are not the result of the
\cmd{\global}\cmd{\@namedef},
which as suggested above never needs to use such a thing. They come from
the earlier \cmd{\@ifundefined} call in \cmd{\newlabel}.
Change \#337 in \pfile{tex82.bug} numbering, applied in TeX 2.9, changed the implicit
setting of an undefined control sequence referenced via \cmd{\csname}...\cmd{\endcsname}
to \cmd{\relax} (\emph{TeXbook}, page 213) from being (sort of) global to being local to
the current group. Don made this change as a direct result of my posting to
TeXhax (year 1987, digest 103) pointing out that the TeXbook didn't correctly
describe what happened.
The change was a potent source of new bugs, because TeX was not originally
designed to cope with token expansion having side-effects of modifying the
save stack (see in particular change \#371 in \pfile{tex82.bug}). I have more than
once wondered whether I should have kept quiet about the whole business\ldots
In an ideal world, the problem wouldn't arise because the implicit setting
to \cmd{\relax} wouldn't occur at all (IMNSHO). But everything (especially LaTeX)
relies on it now, so it's (far) too late to change it. Something to be got
right in the next incarnation.
\begin{lcode}
Chris Thompson
Cambridge University Computing Service
\end{lcode}
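
The mechanism described in this addendum can be sketched in plain TeX. This
is an illustration only, not the LaTeX source: \cmd{\newlabelsketch} and the
label names are invented for the example.
\begin{lcode}
% imitates the test-then-define pattern of \newlabel
\def\newlabelsketch#1{%
  \expandafter\ifx\csname r@#1\endcsname\relax \fi
  % r@#1 is now \relax locally: one save stack entry
  \expandafter\gdef\csname r@#1\endcsname{1}}
\begingroup
  \newlabelsketch{first}\newlabelsketch{second}
  % each call leaves an entry that is freed only at \endgroup
\endgroup
\message{\expandafter\meaning\csname r@first\endcsname}% macro:->1
\end{lcode}
Outside any group the implicit assignment needs no save stack entry, which
would be consistent with the report that reading the data outside the group
avoided the overflow.
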
%%\endinput
\chapter{\cs{endlinechar} and \cs{par}}
\section{Exercise (fast)}
%%\input{ex013}
% ex013.tex
\begin{comment}
Date: 13 Oct 1993 12:31:56 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #13
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/10/13}. \arch{exercise.013}.}
\begin{lcode}
%%%% Three lines of overhead for the self-decoding answer; see below %%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\f"20\d~{\c\f9
\a\f1 \ifnum\f>125\f002\d~{\a\f-1 \ifnum\f<1\egroup\fi}\fi~}\c`\^^M="9{~
\end{lcode}
%%========================================================================
%%*** Exercise 13 (fast):
(a) If the \cmd{\endlinechar} character does not have category 5, do you
still get a \piif{par} from a blank line?

(b) If \cmd{\endlinechar}=-1, do you still get a \piif{par} from a blank line?
\begin{comment}
========================================================================
Michael Downes =========================================================
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
Self-decoding answer given below. To see the answer, run this post
(sans mail/newsgroup header) through plain TeX.
\begin{lcode}
\d~{\u\f\m\c\m12 \a\m1\a\f1 \ifnum\f>125\f33 \fi\ifnum\m>125\+~\1\fi~}\+
\u\uccode\+\p\uppercase\d\0#1{\ifnum`#1>"D \if#1 !\else"\fi\else\string~
\end{lcode}
\ed{There are sixteen lines like this, all of which are in the archived
version if you need them. The last line is:}
\begin{comment}
\fi}\u`9"20\p{\d\1#19}{\newlinechar13 \d\3{\immediate\write16}\+~\0\p{\3
{}\3{#1}\batchmode\end}}\f"6C\m"0D\u\f\m\a\f"1\m32\u\f\m\c\m12\a\f1\m35~
/\aeS`amb]m/`]c\RmbVSm0S\Rmn|!(llsOtm<]ymsPtm<]ymm7\m]bVS`me]`RawmOmPZO\
YmZW\SmeWZZm^`]RcQSmOmJ^O`mWTlO\Rm]\ZgmWTmS\RZW\SmQVO`OQbS`amO`Sm^`SaS\b
mO\RmVOdSmQObQ]RSm#ym7bmWalW\bS`SabW\Umb]m\]bSmbVObmbe]mQ]\aSQcbWdSmS\RZ
W\SmQVO`OQbS`amO`Sm\]blb`O\aZObSRmaW[^Zgmb]mJ^O`wmPcbmb]m*a^OQS,J^O`ymms
BVSma^OQSmeWZZlRWaO^^SO`mW\ma][SmQW`Qc[abO\QSawmSyUywmOTbS`mOmQ]\b`]Zme]
`RwmOQQ]`RW\Ulb]mBSFram\]`[OZmaQO\\W\Um`cZSaytmBVWamWambVSm`SOa]\ms]`mOb
mZSOabm]\Sl`SOa]\tmbVObmOmJ^O`m]^S`ObW]\m[cabm^S`T]`[mO\mW[^ZWQWbmJc\aYW
^l]^S`ObW]\ymBVS`SmeOamOZa]mOm`SQS\bm^]abmb]mQ][^ybSfbybSfmPgm2]\OZRl/`a
S\SOcmb]m^]W\bm]cbmbVSm^`]PZS[meWbVma][S]\SramRSZW[WbSRxO`Uc[S\bl[OQ`]mR
STW\WbW]\(llmmJRSTJa][SbVW\Un|yJ^O`i*R]ma][SbVW\UmeWbVmn|,kllBVSmRSZW[Wb
S`mab`W\Um~nyJ^O`~nmRWRm\]bm[ObQVmbVSmOQbcOZmbSfbllmmyyyma][SmbSfbylmm*P
\end{comment}
\begin{lcode}
ZO\YmZW\S,llPSQOcaSm]TmbVSma^OQSmb]YS\mT]ZZ]eW\UmbVSm^S`W]Ry mbSfbylmm*P
\end{lcode}
%%\endinput
\section{Answers}
%%\input{ans013}
% ans013.tex
\ed{\arch{answer.013}.}
[This was included as a self-decoding answer in the posting of Exercise
\#13 which is archived as \pfile{exercise.013}.]
Answers to Around the Bend \#13:
(a) No. (b) No. In other words, a blank line will produce a \piif{par} if
and only if endline characters are present and have catcode 5. It is
interesting to note that two consecutive endline characters are not
translated simply to \piif{par}, but to \meta{space}\piif{par}. (The space will
disappear in some circumstances, e.g., after a control word, according
to TeX's normal scanning rules.) This is the reason (or at least one
reason) that a \piif{par} operation must perform an implicit \cmd{\unskip}
operation. There was also a recent post to \pfile{comp.text.tex} by Donald
Arseneau\index{Arseneau, Donald} to point out the problem with someone's
delimited-argument macro definition:
\begin{lcode}
\def\something#1.\par{<do something with #1>}
The delimiter string ".\par" did not match the actual text
... some text.
<blank line>
because of the space token following the period.
\end{lcode}
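
One possible repair, offered only as a sketch, is to make the space part of
the delimiter, so that the parameter text matches what TeX actually produces
when a paragraph is followed by a blank line. The macro body is a placeholder,
and the line after the sample text is intentionally blank.
\begin{lcode}
\def\something#1. \par{\message{[#1]}}
\something some text.

\message{done}
\end{lcode}
A definition of this form no longer matches an explicit \verb?.\par? with no
intervening space, so it is a trade-off rather than a general fix.
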
%%\endinput
\chapter{TeX's stomach}
\section{Exercise}
%%\input{ex014}
% ex014.tex
\begin{comment}
Date: 26 Oct 1993 09:29:08 -0400 (EDT)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #14
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/10/26}. \arch{exercise.014}.}
\begin{lcode}
%%%%% Two lines of overhead for the self-decoding answer; see below %%%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\c13 9{\c32'16
\end{lcode}
%% =======================================================================
\begin{quote}
Exercise 14 [proposed by Jonathan Fine]:
Which character code/category code pairs can actually reach TeX's
`stomach'?
\end{quote}
%% =======================================================================
This is a refinement of \emph{The TeXbook}'s Exercise 7.3. You need to be a
little careful about your answer. I didn't get it right on my first
try \ldots
To make the notion of `reaching TeX's stomach' more precise: A token
is said to `reach TeX's stomach' if it produces a token report when
\cmd{\tracingcommands} = 1. And a `token report' is a phrase in braces,
e.g.,
\begin{lcode}
{the letter A}
\end{lcode}
as produced by TeX in the log file when tracing commands.
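
For readers who want to experiment, a small plain TeX file along the
following lines should produce such token reports; the sample text is
arbitrary, and \cmd{\tracingonline} is set only so that the reports also
appear on the terminal.
\begin{lcode}
\tracingonline=1 % show the reports on the terminal as well
\tracingcommands=1
A$x$
\tracingcommands=0
\bye
\end{lcode}
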
\begin{comment}
Michael Downes ========================================================
[email protected] ASCII 32--55,56--126: !"#$%&'()*+,-./01234567
89:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
Self-decoding answer given below. To see the answer, run this post
(sans mail/newsgroup header) through plain TeX.
\begin{lcode}
}\d~{\u\f\m\c\m12\a\m1\a\f1 \ifnum\f>125\f33 \fi\ifnum\m>125\+~\1\fi~}\+
\u\uccode\+\p\uppercase\d\0#1{\ifnum`#1>"D \if#1 !\else"\fi\else\string~
\end{lcode}
\ed{In the archived form there are 20 lines like this, the last being:}
\begin{comment}
\fi}\u`9"20\p{\d\1#19}{\newlinechar13 \d\3{\immediate\write16}\+~\0\p{\3
{}\3{#1}\batchmode\end}}\f"39\m"0D\u\f\m\a\f"1\m32\u\f\m\c\m12\a\f1\m35~
Y).2}-:/*:Y-*0)|:/#}:Z})|:;ILR99::[y/{*|}:[#y-:[*|}.::::[y/{*|}:[#y-:[*|
}.9::EEEEEEE:EEEEEEEEEE::::EEEEEEE:EEEEEEEEEE9:::::I:::::HEEJMM:::::::::
::IH:::IEEJMM9:::::J:::::HEEJMM9:::::K:::::HEEJMM:::::::::::II:::HEEJMM9
:::::L:::::HEEJMM:::::::::::IJ:::HEEJMM9::::::::::::::::::::::::::::IK::
:HEEJMM9:::::N:::::HEEJMM9:::::O:::::HEEJMM9:::::P:::::HEEJMM99[y/}"*-4:
IH:$.:/#}:}3{}+/$*)y':{y.}F:[y/{*|}EIH:{#y-y{/}-.:2$/#:{#y-y{/}-9{*|}:TV
:KJ:{y):*)'4:z}:+-*|0{}|:z4:t0++}-{y.}Gt'*2}-{y.}:/-${&.:@l}pz**&D9Y++})
|$3:\AF:k*:/#}:+y$-:{#y-y{/}-:HD:{y/{*|}:IH:$.:)*/:+*..$z'}R:t0++}-{y.}9
y)|:t'*2}-{y.}:{y))*/:+-*|0{}:y:{#y-y{/}-:H:!-*(:y:)*)EH:{#y-y{/}-F99Y{/
$1}:{#y-y{/}-.:2$'':/}./:/-0}:!*-:{y/}"*-4:IH:2$/#:t$!{y/:$!:/#}4:y-}9t'
}/:},0y':/*:y:.+y{}:/*&})F:Z0/:$!:/#}:~9:{#y-y{/}-:@.y4A:#y.:z}}):.*9|}!
$)}|D:$/:2$'':)*/:(y/{#:y:.+y{}:$):/#}:|}'$($/}-:/}3/:*!:y:(y{-*:2$/#9|}
'$($/}|:y-"0(})/.F:Y)|:y{{*-|$)":/*:t/-y{$)"{*((y)|.:/#}:(}y)$)":*!:y)9y
{/$1}:/$'|}:/#y/:#y.:z}}):t'}/:},0y':/*:y:.+y{}:$.:~;z'y)&:.+y{}::~;D92#
\end{comment}
\begin{lcode}
}-}y.:/#}:(}y)$)":*!:y:{y/}"*-4EIH:/$'|}:$.:~;z'y)&:.+y{}:~9~;F ::~;D92#
\end{lcode}
%%\endinput
\section{Answers}
%%\input{ans014}
% ans014.tex
\ed{\arch{answer.014}.}
[This was included as a self-decoding answer in the posting of Exercise
\#14, which is archived as \pfile{exercise.014}.]
\begin{lcode}
Answer to Around the Bend #14:
Catcode  Char Codes     Catcode  Char Codes
-------  ----------     -------  ----------
   1      0--255          10      1--255
   2      0--255
   3      0--255          11      0--255
   4      0--255          12      0--255
                          13      0--255
   6      0--255
   7      0--255
   8      0--255
Category 10 is the exceptional case. Catcode-10 characters with character
code $\neq 32$ can only be produced by \cmd{\uppercase}/\cmd{\lowercase} tricks
(\emph{TeXbook}, Appendix D). So the pair (character 0, catcode 10) is not
possible: \cmd{\uppercase} and \cmd{\lowercase} cannot produce a character 0
from a non-0 character, because a \cmd{\uccode} or \cmd{\lccode} value of 0
means `leave the character unchanged'.
Active characters will test true for category 10 with \piif{ifcat} if they are
\cmd{\let} equal to a space token. But if the \verb?~? character (say) has been so
defined, it will not match a space in the delimiter text of a macro with
delimited arguments. And according to \cmd{\tracingcommands} the meaning of an
active tilde that has been \cmd{\let} equal to a space is
\verb?`blank space '?
whereas the meaning of a category-10 tilde is \verb?`blank space ~'?.
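
The \cmd{\lowercase} trick can be sketched as follows; \cmd{\tildespace} is a
name invented for the example. According to the description above, the token
report for the resulting token should read \verb?blank space ~?.
\begin{lcode}
\begingroup
\lccode`\ =`\~ % a space will be lowercased to code 126
\lowercase{\endgroup\def\tildespace{ }}% keeps catcode 10
\tracingonline=1 \tracingcommands=1
\tildespace
\tracingcommands=0
\end{lcode}
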
%%\endinput
\chapter{Space removal}
\section{Exercise}
%%\input{ex015}
% ex015.tex
\begin{comment}
Date: 05 Nov 1993 16:34:28 -0500 (EST)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #15
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
\ed{\oposted{1993/11/05}. \arch{exercise.015}.}
(a) Write a macro \cmd{\trimspace} that takes another macro as its argument and
removes a trailing space from the replacement text of the macro, if one
is present, and otherwise leaves it unchanged.
(b) Write a macro \cmd{\trimspaces} that removes a leading space, if
present, and then calls \cmd{\trimspace} to remove a trailing space.
%%========================================================================
Motivation: If a user inadvertently includes an extra space
in a text argument, such as a section heading:
\begin{lcode}
\section{Title of the section }
\end{lcode}
then you must usually take care to remove the space when typesetting
the text. The simple way is to perform an \cmd{\unskip} at the end (if the
text is immediately followed by \piif{par}, the \cmd{\unskip} operation is
built-in) and an \cmd{\ignorespaces} at the beginning, but various
complications can arise, so it would be preferable to be able to apply
a \cmd{\trimspaces} function when an argument is first read, and then have
the information in proper form for all subsequent uses.
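
The simple approach might look like this in plain TeX; \cmd{\heading} is a
hypothetical macro used only for illustration.
\begin{lcode}
\def\heading#1{\par\noindent
  {\bf\ignorespaces#1\unskip}\par}
\heading{ Title of the section }
\end{lcode}
This copes with the stray spaces at typesetting time, but the argument itself
still contains them, which is what a \cmd{\trimspaces} function would avoid.
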
\begin{comment}
Send answers to the address below. A summary will be posted
November 23, 1993 or thereabouts.
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\section{Answers}
%%\input{ans015}
% ans015.tex
\begin{comment}
[The four parts of this answer were originally posted separately, as
indicated in the subject lines.]
Date: 16 Dec 1993 16:34:45 -0500 (EST)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #15, answers
To:
[email protected]
\end{comment}
\ed{\oposted{1993/12/16}. \arch{answer.015}.}
Exercise 15 asked for a function \cmd{\trimspace} to trim a trailing space
from the replacement text of a macro, and a function \cmd{\trimspaces} to
trim both a leading and a trailing space. At the time of posting the
exercise I had no prepared solution; as luck would have it the problem
was rife with latent complications (including some hard questions
about limiting the domain of application), which propagated an
unusually diverse crop of approaches among the submitted solutions,
and which made the task of preparing a good summary extraordinarily
difficult. Even after breaking down the `summary' into two or three
pieces, to avoid a too formidably large monolith of a posting, I'll
have to leave out some material that I would otherwise have included.
I'd say Donald Arseneau\index{Arseneau, Donald} deserves credit for
the best analysis,
including an accurate survey of brace-stripping problems. Nearly
everyone, including myself, had missed a lurking flaw of that kind in
the first submitted version of their solution. Another good idea of
Donald's that caught my fancy was to use TeX's built-in scanning
procedures for \meta{optional space} to strip the leading space in
\cmd{\trimspaces}. I managed to work that into my own best solution, much to
my satisfaction.
Peter Schmitt\index{Schmitt, Peter} came up with perhaps the most
aerodynamic solution, on his second go-round. A solution by
Ian Collier\index{Collier, Ian} differed notably from
the others by using \cmd{\meaning} to look for a leading space. Another
submission, from
Gary McGary\index{McGary, Greg}\index{McGary, Gary|see{McGary, Greg}}
\ed{I think this is a typo for Greg McGary}, contained some
original syntactic ideas,
and explored the more general problem of removing an arbitrary token
pattern at the end of a token list.
A careless, off-the-cuff remark of mine in the statement of Exercise
15 that after removing a leading space, \cmd{\trimspaces} should call
\cmd{\trimspace} to remove a trailing space, was probably a mistake. In most
cases, at least, \cmd{\trimspaces} can be more elegantly written by letting
the two different space-removal procedures share a few tokens at a
lower level.
From Donald's\index{Arseneau, Donald} analysis:
\begin{quote}
When I first read the question, I thought `why isn't there an answer
with the question, because that one is easy?' As I started to type
my answer `cold', I realized that what I had used previously to
ignore leading spaces
\begin{lcode}
\def\something#1#2\weird{#1#2}
\end{lcode}
had the bad
side-effect of stripping braces if the parameter began with `\verb?{?'.
\end{quote}
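The brace-stripping effect that Donald describes can be reproduced in
a few lines of plain TeX. What follows is only a minimal sketch: the
macro names \cmd{\y} and \cmd{\z} and the sample texts are made up for
illustration, and \cmd{\message} is used merely to display the result.
In both cases the result is \texttt{abc}, so the group in the second
text has lost its braces.
\begin{lcode}
% The macro from the quotation: #1 is undelimited, #2 runs up to \weird.
\def\something#1#2\weird{#1#2}
% Intended use: the leading space is skipped while TeX looks for #1.
\def\y{ abc}
\edef\z{\expandafter\something\y\weird}
\message{[\meaning\z]}% [macro:->abc]
% Side effect: a brace group at the front is taken as #1 and loses
% its braces.
\def\y{{ab}c}
\edef\z{\expandafter\something\y\weird}
\message{[\meaning\z]}% [macro:->abc], not {ab}c
\end{lcode}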
I append below Peter Schmitt's\index{Schmitt, Peter}
solution, more or less as he wrote it.
The commentary refers to earlier correspondence in a place or two but I
believe there is sufficient context to make everything intelligible.
Test \#5 in the test suite traps the insidious brace-stripping problem
that infested most of the solutions in their first incarnation.
\begin{comment}
More on Exercise 15 to follow, some time in the next few days.
Michael Downes,
[email protected]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{comment}
\begin{solution}{Solution 1 (Peter Schmitt)}
%%>>Solution 1 (Peter Schmitt)
Since I wanted to stay with delimited arguments it was clear that one
has to add a token (or tokens) in order to hide braces, which finally
have to be removed again. First I came up with using \cmd{\empty}, as you
did, but then I switched to a non-expandable token because this can
be used more efficiently as a parameter delimiter.
\cmd{\trimspaces} and \cmd{\trimspace} are just used to expand the argument and
add delimiting tokens in front and at the end of it, and set up the
delimiting tokens for \cmd{\Trimspace} and \cmd{\Trimspaces}, too.
As Donald does, I do not call \cmd{\trimspace} from \cmd{\trimspaces} but rather
\cmd{\Trimspace}. It would be easy to offer \cmd{\TrimLeft},
\cmd{\TrimRight}, and \cmd{\TrimBoth}, and also \cmd{\TrimLeftS}, \cmd{\TrimRightS}, and
\cmd{\TrimBothS}, which iterate in the (very unlikely!) case that there are
several consecutive space tokens.

\cmd{\Trimspaces} and \cmd{\Trimspace} remove leading and trailing
spaces of the argument, respectively, but they both leave the delimiting tokens in
place. These (and outside tokens) are removed by \cmd{\TrimSpace} in the
process of redefining the initial control sequence.
\begin{lcode}
\catcode`\<=3 \catcode`\>=3
\def\trimspace #1{\expandafter\expandafter\expandafter
\Trimspace\expandafter <#1> >\\#1}
\def\trimspaces #1{\expandafter\expandafter\expandafter
\Trimspaces\expandafter <#1>< <\\#1}
%% \Trimspaces < text>< <\\ |< text>| ==>
%% -> || + |text> + | <|
%% => ||+| <|+|text>| == | <text>|
%%
%% \Trimspaces <text>< <\\ |<text>| ==>
%% -> |<text>| + || + ||
%% => |<text>|+||+|| == |<text>|
%% \Trimspace <text > >\\ |<text >| ==>
%% -> |<text| + | >|
%% => |<text|+>\\ == |<text>\\|
%%
%% \Trimspace <text> >\\ |<text>| ==>
%% -> |<text>| + ||
%% => |<text>|+>\\ == |<text>>|
\def\Trimspaces #1< #2<#3\\{\Trimspace #1#3#2 >\\}
\def\Trimspace #1 >#2\\{\TrimSpace #1>\\}
\def\TrimSpace #1>#2\\#3{%
\expandafter\expandafter\expandafter\expandafter\expandafter
\def \expandafter\expandafter\expandafter #3\expandafter
{\Remove#1}}
\def\Remove#1{}
\catcode`\<12 \catcode`\>=12
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\def\Test#1{\def\test{#1}\immediate\write0{|\test|}%
\trimspaces\test
\immediate\write0{|\test|}%
}
\let\trim\trimspace
\let\trim\trimspaces
%%%%%%%%%%%%%%%%%%%%%%%%%
\Test{}
\Test{ }
\Test{ a }
\Test{ {}{} }
\Test{{braces}}
\Test{ {braces} }
\Test{ { braces } }
\Test{no space and no space}
\Test{no space and a space: }
\Test{ :a space and no space}
\Test{ :a space and a space: }
\def\test{ \ifx/ }\trimspace\test\show\test
\def\test{ \ifx }\trimspaces\test\show\test
\end %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{lcode}
\end{solution}
%%\endinput
\begin{comment}
Date: 23 Dec 1993 16:21:21 -0500 (EST)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #15, answers, 2nd installment
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
Some exposition seems called for here in order to lay out various
considerations running through my mind and the minds of the other
solution-submitters.
\subsection{Trimming a trailing space}
There are two possible ways to remove a trailing space. The first one
is to step through the given text one token at a time, and construct a
new token list in parallel by adding the tokens one by one at the end.
If the next token is a space, delay adding it until the subsequent
token is checked, and if it turns out the text is exhausted, discard
the space instead of adding it. The hard part about this approach is
dealing with braces (character tokens with catcode 1 or 2) because a
lone brace cannot be passed as a macro argument. A recent posting by
\'Eamonn McManus to comp.text.tex on a different sort of problem
showed that the braces can indeed be dealt with; it's just not easy.

The second, simpler approach is to use TeX's scanning of delimited
macro arguments to scan for the ending space and discard it. If you
merely scan for a space token, however, you end up scanning through
the given text `word' by `word' (word = sequence of non-space
characters or brace-delimited groups) instead of token by token, which
is perhaps if anything even more awkward than the first method above,
since you still must deal with brace complications.

The key refinement, therefore, is to scan for a pair of tokens: a
space token and some well-chosen bizarre token that can't possibly
occur in the scanned text. If you put the bizarre token at the end of
the text, and if the text has a trailing space, then TeX's delimiter
matching will match at that point and not before, because the earlier
occurrences of space don't have the requisite other member of the pair.

Next consider the possibility that the trailing space is absent: TeX
will keep on scanning ahead for the pair \meta{space}\meta{bizarre} until either
it finds them or it decides to give up and signal a `Runaway
argument?' error. So you must add a stop pair to catch the runaway
argument possibility: a second instance of the bizarre token, preceded
by a space. If TeX doesn't find a match at the first bizarre token, it
will at the second one.

Now all that's left is to test somehow where the hit occurred in order
to fork properly. This can be done in various clever ways, as
exhibited in the solutions.
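The scan for a \meta{space}\meta{bizarre} pair backed up by a stop pair
can be sketched as follows. This is not one of the submitted solutions,
only an illustration under stated assumptions: the names
\cmd{\trimtrailing}, \cmd{\ttA}, \cmd{\ttB}, \cmd{\ttgobble} and
\cmd{\ttend} are hypothetical, the bizarre token is a letter Q with
catcode 3, and the text is assumed to contain neither that token, nor
\cmd{\ttend}, nor \# tokens. A guard token is placed before the text so
that a text consisting of a single brace group keeps its braces.
\begin{lcode}
\catcode`\Q=3 % Q with catcode 3 serves as the bizarre token
% \trimtrailing\x: expand \x once behind a guard token (\relax).
% `Q Q' puts a Q directly after the text and a stop pair <space>Q
% after that.
\def\trimtrailing#1{%
  \expandafter\ttA\expandafter\relax#1Q Q\ttend#1}
% \ttA: #1 ends at the first <space>Q pair, wherever the hit occurs.
% Putting a Q back gives \ttB a uniform delimiter in both cases.
\def\ttA#1 Q#2\ttend#3{\ttB#1Q\ttend#3}
% \ttB: #1 is the guard plus the trimmed text; leftover Qs go to #2.
% The guard is removed by \ttgobble just before the redefinition.
\def\ttB#1Q#2\ttend#3{%
  \expandafter\def\expandafter#3\expandafter{\ttgobble#1}}
\def\ttgobble#1{}
\catcode`\Q=11
\def\x{{ab} }\trimtrailing\x \message{[\meaning\x]}% [macro:->{ab}]
\def\x{a b}\trimtrailing\x \message{[\meaning\x]}% [macro:->a b]
\end{lcode}
In this sketch no explicit test of where the hit occurred is needed:
the Q that \cmd{\ttA} puts back makes the two cases look alike to
\cmd{\ttB}.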
%%\endinput
\subsection{Trimming a leading space}
More analysis from Donald Arseneau:
\begin{quote}
There are two safe, expandable ways to eat `one optional space':
`\piif{ifnum}' using an ascii code (\texttt{`c}) as the second number, and
`\piif{ifdim}' using a literal unit of measure like `pt'. Oh, yes,
it could also be done with parameter syntax too, but more on
that later.
\end{quote}
%%\endinput
In other words, one way to remove a leading space would be
\begin{lcode}
\expandafter\def\expandafter\foo\expandafter{\ifdim0pt=0pt\foo \fi}
\end{lcode}
The \cmd{\expandafter}'s would cause the \piif{ifdim} to be executed first.
Execution of the \piif{ifdim} will not terminate until the scanning of the
second `0pt' is finished; therefore TeX will start expanding \cmd{\foo} as
part of the scanning of the `0pt'. Then if a space is the first thing
inside the expansion of \cmd{\foo}, it will be removed by TeX as denoting
the end of the dimension. Otherwise the first non-space token will
terminate the dimension scanning and will be left in place (well, I am
glossing over the problem of an expandable token at the beginning of
\cmd{\foo}, which can be handled by further refinements).
Notice that as written the trailing \piif{fi} will be included in the
redefinition of \cmd{\foo}. No problem---just rewrite it with the \piif{fi}
after the closing brace:
\begin{lcode}
\expandafter\def\expandafter\foo\expandafter{\ifdim0pt=0pt\foo}\fi
\end{lcode}
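A quick check of this construction, as a sketch only: the sample texts
are made up, and the second half tries the variant with \cmd{\ifnum}
and an alphabetic constant that Donald mentions, on the assumption that
the text does not begin with an expandable token.
\begin{lcode}
\def\foo{ abc}
\expandafter\def\expandafter\foo\expandafter{\ifdim0pt=0pt\foo}\fi
\message{[\meaning\foo]}% [macro:->abc]
% Same idea with \ifnum: the optional space after the constant `a
% is looked for in the expansion of \foo.
\def\foo{ abc}
\expandafter\def\expandafter\foo\expandafter{\ifnum`a=`a\foo}\fi
\message{[\meaning\foo]}% [macro:->abc]
\end{lcode}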
[Now for a sharp little question: will that work with \cmd{\edef} instead of
\cmd{\def}?
\begin{lcode}
\edef\foo{\ifdim0pt=0pt\foo}\fi
\end{lcode}
See if you can guess before
testing it.]
%%\endinput
%%\begin{verbatim}
Other ways of removing a leading space include using \cmd{\futurelet} to
look at the first token in the scanned text, or using TeX's argument
delimiter scanning to scan for a space. The latter method is perhaps
most straightforwardly done as a mirror-image of the method for
removing a trailing space: make the delimiter \meta{bizarre}\meta{space}, and
then call the macro (let's say \cmd{\trimx}) by putting \meta{bizarre} before
the
scanned text and a stop pair \meta{bizarre}\meta{space} after it, in case a
leading space is not present:
\begin{lcode}
\trimx<bizarre>#1<bizarre> \endtrimx
\end{lcode}
It would be possible to do without the bizarre token and have the
delimiter consist only of a space, but with some ensuing
complications, I think, that would make it scarcely worthwhile.
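The mirror-image method can be sketched along the following lines. This
is an illustration rather than any contributor's code: the names
\cmd{\trimleading}, \cmd{\tlA}, \cmd{\tlB}, \cmd{\tlgobble},
\cmd{\tlmark} and \cmd{\tlend} are hypothetical, the bizarre token is
again a letter Q with catcode 3, and the text is assumed to contain
none of Q, \cmd{\tlmark}, \cmd{\tlend} or \# tokens.
\begin{lcode}
\catcode`\Q=3 % Q with catcode 3 serves as the bizarre token
% Put Q before the text and the stop pair Q<space> after it; \tlmark
% marks the end of the text proper.
\def\trimleading#1{%
  \expandafter\tlA\expandafter Q#1\tlmark Q \tlend#1}
% \tlA: the delimiter is Q<space>. With a leading space, #1 is empty
% and #2 holds the rest; without one, #1 is Q plus the text and #2
% is empty. Either way #1Q#2 begins with Q, then the trimmed text,
% then \tlmark.
\def\tlA#1Q #2\tlend#3{\tlB#1Q#2\tlend#3}
% \tlB: cut at \tlmark; the Q in front protects a brace group and
% is removed by \tlgobble before the redefinition.
\def\tlB#1\tlmark#2\tlend#3{%
  \expandafter\def\expandafter#3\expandafter{\tlgobble#1}}
\def\tlgobble#1{}
\catcode`\Q=11
\def\x{ {ab}c}\trimleading\x \message{[\meaning\x]}% [macro:->{ab}c]
\def\x{{ab}}\trimleading\x \message{[\meaning\x]}% [macro:->{ab}]
\end{lcode}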
\subsection{Some remarks about the domain of the problem}
The application I had in mind was, generally speaking, to remove
unwanted spaces at the beginning and end of a piece of text supplied
by the user, such as a section title or other heading.
Typical situation: A user command \cmd{\title} takes an argument
\begin{lcode}
\title{ Some Article Title }
\end{lcode}
with the definition of \cmd{\title} being
\begin{lcode}
\def\title#1{\def\savedtitle{#1}\trimspaces\savedtitle}
\end{lcode}
Thereafter we may use \cmd{\savedtitle} in any number of ways: print it; put
it in a \cmd{\mark} for running heads; write it to an auxiliary file for
table of contents use, or for adding to a BibTeX database; or write it
on screen to show progress when typesetting a collection of articles.
For the last two examples in particular trimming spaces with
\cmd{\ignorespaces} or \cmd{\unskip} is undesirable, since both commands
act only during typesetting and do nothing for text that is written to
a file or to the screen.
Notice also that \cmd{\unskip} will remove \emph{any} trailing glue, including
\cmd{\leaders} or explicit \cmd{\hskip}'s that might sometimes be added by
users for their own inscrutable purposes and whose unexpected
removal could be (indeed, has been in real life) the cause of
much consternation.
If we call \cmd{\trimspaces} in the definition of \cmd{\title}, then leading and
trailing spaces are removed once and for all, and none of the many
functions that later use \cmd{\savedtitle} need to worry about that task.
With this restricted domain of use in mind for \cmd{\trimspaces}, I screened
the submitted solutions through the following conditions.
\begin{description}
\item[Condition 1] The text has been stored in a macro. The result of
\cmd{\trimspaces} is a redefinition of the macro.
This is not exactly a necessary condition, but removal of this
condition would suggest that constructions like
\begin{lcode}
\def\foo#1{...
\message{Your argument "\trimspaces{#1}" makes me laugh}%
...}
\end{lcode}
should be supported. The full expansion done by \cmd{\message} or other such
commands, however, can't be applied carelessly to arbitrary
user-supplied text. You would need to deactivate problematic elements
(by changing catcodes, adding \cmd{\protect}'s, whatever). So supporting
full expansion for the operand of \cmd{\trimspaces} is of low relevance for
the envisioned normal applications.
\item[Condition 2] It suffices to remove a single space before and after the
text.
In almost any other programming language, a typical space-trimming
function would need to handle the possibility of multiple consecutive
spaces. But in text supplied by an average user through the normal TeX
lexical conventions, consecutive spaces will be reduced to a single
space before our trimming functions are ever called.
The next installment of this `summary' will include a recently arrived
solution by Jonathan Fine\index{Fine, Jonathan}
that handles multiple trailing spaces as
easily as a single one, without any extra implementation cost.
\item[Condition 3] For both the trailing space and the leading space, we
don't know whether or not they are present.
If we knew for certain that a given space was present, of course, the
procedure for removing it would be easier.
\end{description}
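Condition 2 rests on TeX's reading rules, which can be checked
directly. A small sketch (the macro name \cmd{\x} and the texts are
made up for illustration): spaces typed in a row reach the macro as a
single space token, whereas spaces produced by expansion, for instance
with \cmd{\space} inside \cmd{\edef}, are not merged.
\begin{lcode}
\def\x{   a   b   }% three spaces typed each time
\message{[\meaning\x]}% [macro:-> a b ]
\edef\x{\space\space a}% two space tokens made by expansion
\message{[\meaning\x]}% [macro:->  a]
\end{lcode}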
%%========================================================================
%%>>Solution 2 (Ian Collier)
%\begin{description}
\begin{solution}{Solution 2 (Ian Collier)}\index{Collier, Ian}
\ldots I used \cmd{\meaning} to find out whether or not the
first character of the argument is a space (because spaces are usually
ignored and this seems to be the only way to make the space visible).
I'm fairly sure that `blank space' is the only \cmd{\meaning} beginning with
`bl'. I had rather a lot of trouble with braces, because if the first
character is a brace then \cmd{\meaning} removes it and leaves an unmatched
right brace. However I finally realised that \verb?\iffalse...\fi? could be
used to remove it.
\begin{lcode}
{\catcode`Q=3 \catcode`@=11
\gdef\trimspace#1{\expandafter\trimspac@a#1QAA QB}
\gdef\trimspac@x#1{\trimspac@a#1QAA QB}
\gdef\trimspac@a#1 Q#2{\if#2A#1\expandafter\trimspac@b
\else\trimspac@c#1\fi}
\gdef\trimspac@b A QB{}
\gdef\trimspac@c#1QAA{#1}
\gdef\trimspaces#1{\expandafter\expandafter\expandafter\tr@a
\expandafter\meaning#1A\fi{#1}}
\gdef\tr@a#1#2{\if#1b\if#2l\expandafter\expandafter\expandafter\tr@c
\else\expandafter\expandafter\expandafter\tr@b\fi\else
\expandafter\tr@b\fi}
\gdef\tr@b{\expandafter\trimspace\iffalse}
\gdef\tr@c{\expandafter\tr@d\iffalse}
\gdef\tr@d#1{\expandafter\tr@e#1Q}
\def\:{\gdef\tr@e}\: #1Q{\trimspac@x{#1}}
}
\def\test#1{\edef\text{#1}\immediate\write16 {"\trimspaces\text"}}
\test{ Leading space}
\test{Trailing space }
\test{ Leading and trailing spaces }
\test{Nospaces}
\test{ {braces}Leading space{braces}}
\test{{braces}Trailing space{braces} }
\test{ {braces}Leading and trailing spaces{braces} }
\test{{braces} Nospaces {braces}}
\test{}
\test{ }
\test{\space\space{two spaces}\space\space}
\end
\end{lcode}
%%========================================================================
Comments: Some extra work would be necessary to handle the possibility
\begin{lcode}
\def\text{\iftrue a\else b\fi}
\trimspaces\text
\end{lcode}
because removal of the \piif{iftrue} by \cmd{\meaning} will leave the
\piif{else} and \piif{fi} unmatched, confusing the later \piif{iffalse}
step done by \cmd{\tr@b}, \cmd{\tr@c}.
But such a value for \cmd{\text} is rather unlikely in ordinary
user-supplied arguments.
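The \cmd{\meaning} test on which this solution rests can be seen in
isolation in the following sketch (the sample macros \cmd{\x} and
\cmd{\y} are made up): applied to the first token of the text,
\cmd{\meaning} yields `blank space' followed by the space itself when
that token is a space, and something else otherwise.
\begin{lcode}
\def\x{ abc}\def\y{abc}
\message{[\expandafter\meaning\x]}% [blank space  abc]
\message{[\expandafter\meaning\y]}% [the letter abc]
% A text beginning with a brace is the awkward case: \meaning would
% swallow the left brace and leave its partner unmatched.
\end{lcode}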
%\end{description}
\end{solution}
\begin{comment}
Some more solutions to Exercise 15 will follow in a few days.
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
Date: 30 Dec 1993 17:07:17 -0500 (EST)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #15, answers, 3rd installment
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
%$
I have done some slight condensing in the answers, indicated by
\verb?[...]?.
Solution 3 by Greg McGary contains an interesting idea for an
alternative syntax of the \cmd{\trimspaces} function: Instead of writing
\begin{lcode}
\def\savedtitle{#1}\trimspaces\savedtitle
\end{lcode}
you would write
\begin{lcode}
\trimmed\def\savedtitle{#1}
\end{lcode}
%%========================================================================
%%>>Solution 3 (Greg McGary)
%\begin{description}
\begin{solution}{Solution 3 (Greg McGary)}\index{McGary, Greg}
\begin{lcode}
%%% preliminaries: (Mad about those abbreviations!)
\catcode`@=11
\let\ea=\expandafter
\let\nx=\noexpand
\let\ag=\aftergroup
\def\agg{\ag\ag\ag}
\let\bg=\begingroup
\let\eg=\endgroup
[...]
%%% The underlying tool I use is \trimmed, which is used as a modifier for
%%% macro definitions to trim the trailing space from the body:
%%% \trimmed\def\foo{foo } will set \foo to {foo}
%%% Notice that any form of \def modifier may be interposed between \trimmed
%%% and \def, as in \trimmed\global\long\outer\def\foo{foo }
%%%
%%% As an aside, TeX has no \expanded modifier. Expanded definitions
%%% must be accomplished through use of \edef or \xdef (equivalent to
%%% \global\edef) This is annoying, as we might like to use \trimmed with
%%% expanded definitions and don't want to write a separate \etrimmed.
%%% Luckily, we can easily roll our own \expanded modifier, like so:
\def\expanded#1\def{#1\edef}
%%% Other modifiers may optionally be inserted between \expanded and
%%% \def, like so: \def\foo{foo} \outer\expanded\long\def\bar{\foo}
%%% Here's the definition of \trimmed:
\long\def\trimmed#1\def#2#3{\bg
\long\def\!##1##2 \!##3\trimmed@{\eg
\ifx\relax##3\relax
\trimmed@{##1}##2%
\else
##1{##2}%
\fi}%
\!{#1\def#2}#3\! \!\trimmed@}
\long\def\trimmed@#1#2\!{#1{#2}}
%%% Notice the use of \begingroup...\endgroup to make the definition of \!
%%% temporary so as not to disturb any previous definition, and so that the
%%% temporary will disappear once we're done with it. Notice that the
%%% \endgroup appears right away in the body of \!, so that the ensuing \def
%%% will occur in the proper group. \! was chosen as a name for the temporary
%%% macro because it is a non-alphabetic (non-catcode-11) character; any other
%%% non-alphabetic would suffice as well. Non-alphabetic macro-names have the
%%% desirable property of preserving any trailing space token.
%%%
%%% If we are really fastidious about keeping clutter out of the global name
%%% space, we can also define \trimmed@ as a temporary alongside \!. We would
%%% also want to use a name that's already defined, to avoid entering a new
%%% name into TeX's hashtable. A non-alphabetic name like \: seems like a
%%% good (though cryptic) choice:
\long\def\trimmed#1\def#2#3{\bg
\long\def\:##1##2\!{\eg##1{##2}}
\long\def\!##1##2 \!##3\:{%
\ifx\relax##3\relax
\:{##1}##2%
\else
\eg##1{##2}%
\fi}%
\!{#1\def#2}#3\! \!\:}
%%% Notice that we've had to delay the \endgroup until after our new
%%% temporary \: has been used.
%%%
%%% Anyway, we may now define \trimspace as follows:
\def\trimspace#1{\ea\trimmed\ea\def\ea#1\ea{#1}}
%%% Notice that the replacement definition is a normal \def, whereas the
%%% macro we started with could have had any number of modifiers attached,
%%% such as \long, \outer, or \global. A further exercise might be to fix
%%% this problem.
%%%
%%% A more generalized trim might allow any list of tokens to be trimmed off
%%% the tail of another list of tokens. Here, we add an initial argument to
%%% \trimmed specifying those tokens. In order to strip off trailing ".\par"
%%% for instance, we could write: \trimmed{.\par}\outer\long\def\foo{foo.\par}
%%%
%%% Here's the general definition of \trimmed:
\long\def\trimmed#1#2\def#3#4{\bg
\long\def\:##1##2\!{\eg##1{##2}}
\long\def\!##1##2#1\!##3\:{%
\ifx\relax##3\relax
\:{##1}##2%
\else
\eg##1{##2}%
\fi}%
\!{#2\def#3}#4\!#1\!\:}
%%% The auxiliary \trimmed@ remains unchanged. Notice that we no longer really
%%% need a non-alphabetic macro name for the temporary macro, since we don't
%%% have to preserve the literal space token following the macro.
%%%
%%% Unfortunately, the literal space token problem doesn't disappear, it's just
%%% pushed up a level. Now we have to give that space as an argument to \trimmed
%%% in the definition of \trimspace, and hop over it with \expandafter!
\edef\trimspace#1{\nx\ea\nx\trimmed\nx\ea
{\nx\ea\space\nx\ea}\nx\ea\def\nx\ea#1\nx\ea{#1}}
%%% N.B., The curly braces, "\nx\ea{...\nx\ea}" around the "\nx\ea\space"
%%% are necessary.
%%%
%%% This approach of defining \trimspace in terms of an underlying \trimmed
%%% \def'inition facility has the advantage of reusing code, but the
%%% disadvantage of forcing a macro redefinition even if there is no trailing
%%% space to remove. We could modify \trimmed to produce a new macro, \trim,
%%% that redefines a macro only if it has the trailing pattern of interest.
%%% (It also happens to be simpler!)
\long\def\trim#1#2{\bg
\long\def\!##1#1\!##2\:{\eg
\ifx\relax##2\relax \else
\def#2{##1}%
\fi}%
\ea\!#2\!#1\!\:}
%%% Now, we can define \trimspace in terms of \trim like so:
\edef\trimspace#1{\nx\ea\nx\trim\nx\ea{\nx\ea\space\nx\ea}\nx\ea#1}
%%% Ok, let's test it:
\def\HasTrailingSpace{has trailing space }
\def\NoTrailingSpace{no trailing space}
\trimspace\HasTrailingSpace \show\HasTrailingSpace
\trimspace\NoTrailingSpace \show\NoTrailingSpace
%%% While we're at it, let's test another pattern:
\def\HasTrailingDotPar{has trailing dot par.\par}
\def\NoTrailingDotPar{no trailing dot par}
\trim{.\par}\HasTrailingDotPar \show\HasTrailingDotPar
\trim{.\par}\NoTrailingDotPar \show\NoTrailingDotPar
%%% ### Exercise 15(b)
%%% Write a macro \trimspaces that removes a leading space, if
%%% present, and then calls \trimspace to remove a trailing space.
%%% I'm going to solve this in a quick and dirty way, as it's getting
%%% late and I'm running out of gas! Just use \futurelet sequestered
%%% in a \vbox to inspect the first token. If it's a \space, gobble
%%% the first token and subject the remaining tokens to \trimmed.
\def\redefSansSp@ce#1 #2\redefSansSp@ce{\def#1{#2}}
\def\redefSansSpace#1{\ea\redefSansSp@ce\ea#1#1\redefSansSp@ce}
\def\trimspaces#1{\bg\setbox0=\vbox{%
\def\maybeRedefSansSpace{\ea\ifx\space\@\agg\redefSansSpace\agg#1\fi}%
\ea\futurelet\ea\@\ea\maybeRedefSansSpace#1}\eg
\trimspace#1}
%%% \futurelet won't work for the more general case of trimming an
%%% arbitrary leading pattern, as it only looks at one token.
%%% I'll leave solving the general case as an exercise for the reader ;-)
%%%
%%% This is also not the most efficient solution, since we redefine the macro
%%% twice if there is a leading space. Notice that we put the \setbox0
%%% inside a group, to keep any previous definition of \box0 safe. This
%%% is probably overkill, since \box0 is a temporary register and users
%%% should be aware that it's fair game, but it doesn't hurt to be
%%% courteous... Also note the abbreviation \agg, which pushes its argument
%%% out two groups.
[...]
%%% Testing...
\def\foo{ foo }
\trimspaces\foo \show\foo
\end{lcode}
\end{solution}
%%========================================================================
In the previous posting I discussed the method of removing a trailing
space by scanning for a token pair \meta{space}\meta{bizarre}. In Schmitt's
solution, for example, the bizarre token was a greater-than character
with catcode 3. And in my solution, I used a letter Q with catcode
3. Solution 4 from Jonathan Fine takes the approach of using a second
\meta{space} token for the \meta{bizarre} token. In practice this works for
typical user-supplied text, as discussed before, since TeX's normal
reduction of multiple spaces to single spaces makes the pair
\meta{space}\meta{space} sufficiently bizarre. I have to admit I like this idea;
those who attempted a solution for this exercise and struggled with
various other delimiter possibilities will, I think, appreciate the
humor of it as I did.
As I mentioned last week, I found some theoretical interest in the
fact that if multiple space tokens were present at the end of the text
being trimmed, Fine's solution would remove them all, without needing
to use recursion. But another correspondent has since pointed out
that if multiple spaces were present at the end they might also be
presumed possible in the middle of the scanned text, and an occurrence
of multiple spaces in the middle would cause \cmd{\trim} to fail.
\begin{solution}{Solution 4 (Jonathan Fine)}\index{Fine, Jonathan}
\begin{lcode}
%% NOTE: I have benefited from Michael Downes posting of answers, dated
%% 16 December, particularly for stripping the leading space, and the
%% discussion of the hazards of grouped arguments
\catcode`\@=11
%% The Solution
\def\trim #1{\expandafter\trim@\expandafter{#1 }#1}
\def\trim@ #1{\trim@@ @#1 @ #1 @ @@}
\def\trim@@ #1@ #2@ #3@@{\trim@@@\empty #2 @}
\def\unbrace#1{#1}
\unbrace{\def\trim@@@ #1 } #2@#3{\expandafter\def
\expandafter #3\expandafter {#1}}
%% Test Code
\def\Test{\afterassignment\Test@ \def\test}
\def\Test@{\trim\test \afterassignment\Test@@ \def\test@}
\def\Test@@{\message{\ifx\test\test@ Y\else FAIL:|\meaning\test|\fi}}
\catcode`\@=12
%% Testing The Solution
\Test{}{}
\Test{ }{}
\Test{ a }{a}
\Test{ {}{} }{{}{}}
\Test{{braces}}{{braces}}
\Test{ {braces} }{{braces}}
\Test{ { braces } }{{ braces }}
\Test{no space and no space}{no space and no space}
\Test{no space and a space: }{no space and a space:}
\Test{ :a space and no space}{:a space and no space}
\Test{ :a space and a space: }{:a space and a space:}
\Test{ \ifx }{\ifx}
\Test{ \ifx/ }{\ifx/}
\end{lcode}
\end{solution}
\begin{comment}
Since my solution got rather long after I added some commentary I'll
post it separately in a couple of days, rather than double the size of
this post.
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
Date: 03 Jan 1994 17:14:14 -0500 (EST)
From: Michael Downes <
[email protected]>
Subject: Around the Bend #15, answers, 4th (last) installment
To:
[email protected]
X-ListName: TeX-Related Network Discussion List <
[email protected]>
\end{comment}
%$
My solution here is the result of weeks of incremental refinement,
ending only last week, and consequently benefits from analysis of the
other solutions.
%%========================================================================
\begin{solution}{Solution 5 (Michael Downes)}
\begin{lcode}
% Here I only solve part (b) of Exercise 15, in an attempt to make
% a solution of utmost compactness (3 control sequences, 45 tokens).
% Also, it seems likely that in actual use \trimspaces can be
% applied without harm whenever \trimspace might be needed.
%
% The method for pausing after each test might be of ancillary
% interest to some readers; unlike the alternative of setting
% \pausing=1, the \test's aren't required to be on separate lines.
\catcode`\Q=3
% \trimspaces\x redefines \x to have the same replacement text sans
% leading and trailing space tokens.
%
\def\trimspaces#1{%
% Use grouping to emulate a multi-token afterassignment queue.
\begingroup
% Put `\toks 0 {' into the afterassignment queue.
\aftergroup\toks\aftergroup0\aftergroup{%
% Apply \trimb to the replacement text of #1, adding a leading
% \noexpand to prevent brace stripping and to serve another purpose
% later.
\expandafter\trimb\expandafter\noexpand#1Q Q}%
% Transfer the trimmed text back into #1.
\edef#1{\the\toks0}%
}
% \trimb removes a trailing space if present, then calls \trimc to
% clean up any leftover bizarre Qs, and trim a leading space. In
% order for \trimc to work properly we need to put back a Q first.
%
\def\trimb#1 Q{\trimc#1Q}
% Execute \vfuzz assignment to remove leading space; the \noexpand
% will now prevent unwanted expansion of a macro or other expandable
% token at the beginning of the trimmed text. The \endgroup will feed
% in the \aftergroup tokens after the \vfuzz assignment is completed.
%
\def\trimc#1Q#2{\afterassignment\endgroup \vfuzz\the\vfuzz#1}
\catcode`\Q=11
\def\test#1{\errhelp{#1}\message{[\the\errhelp]}%
\edef\x{\the\errhelp}%
\global\tracingcommands2\global\tracingmacros2\global\tracingonline0
\trimspaces\x
\global\tracingcommands0\global\tracingmacros0\global\tracingonline0
\errhelp\expandafter{\x}\message{-> [\the\errhelp]}%
\read16 to\PressReturnToContinue
}
\test{ x } \test{ xy z } \test{} \test{{}}
\test{{}{}} \test{ {x} } \test{ } \test{{ }}
\test{\AA} \test{\fi} \test{\space x\space}
\test{ #1 }
\end
\end{lcode}
Commentary

Suppose we have a macro \cmd{\x} with replacement text \verb?" {xyz} "?.
The task of
\cmd{\trimspaces} is to construct a statement of the form
\begin{lcode}
\def\x{{xyz}}
\end{lcode}
i.e., to redefine \cmd{\x} with the same replacement text except for removal
of a leading or trailing space. However, a similar statement
\begin{lcode}
\toks0{{xyz}}\edef\x{\the\toks0}
\end{lcode}
is more robust if the replacement text might contain \# tokens. For
example,
\begin{lcode}
\def\x{\def\y##1{}}
\end{lcode}
works OK but after thus defining \cmd{\x}, the statements
\begin{lcode}
\def\trimx#1{\expandafter\def\expandafter\x\expandafter{#1}}
\trimx\x
\end{lcode}
fail with an error message because the `\#1' in the definition of \cmd{\y} is
misinterpreted as a parameter token for the redefinition of \cmd{\x}.
Although \# tokens seem highly unlikely in average user-supplied text, I
aimed for a statement of the second, more robust kind, as if I were writing
\cmd{\trimspaces} for use in a major macro package with thousands of
prospective users.
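The difference can be tried out with the same \cmd{\x} as above. The
following is only a sketch; the body given to \cmd{\y} and the final
\cmd{\message} are additions made here so that the outcome can be seen.
\begin{lcode}
\def\x{\def\y##1{[##1]}}% replacement text contains # tokens
% Route the text through \toks0: \the\toks0 inside \edef delivers
% the tokens without further expansion and copes with #.
\toks0\expandafter{\x}
\edef\x{\the\toks0}
\x \message{\y{ok}}% [ok]
\end{lcode}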
The basic structure of \cmd{\trimspaces} is therefore: First remove a trailing
space, then remove a leading space, then put the remaining text into
\cmd{\toks}\texttt{0}, then transfer the text to \cmd{\x} with \cmd{\edef}.
For removing the trailing space, I apply a macro scan with delimiter
\verb?<space,10><Q,3>?. Here the notation \verb?<c,n>? means the character token
consisting of character code \texttt{c} with catcode \texttt{n}.
The leading space is removed by executing the assignment
\verb?\vfuzz=\the\vfuzz? at the beginning of the operand text, in order to use
a side effect of the assignment: removal of a following space. (Credit
to Donald Arseneau for this good idea.) The main reason for using
\verb?\the\vfuzz? instead of 0pt is that it's slightly shorter (one token),
although if we did not have the group structure to localize the `change'
to \cmd{\vfuzz}, then using \verb?\the\vfuzz? would also be a good idea for the
sake of preserving the variable's previous value.
The statement \verb?\vfuzz=\vfuzz? (sans \cmd{\the}), by the way, would not gobble a
following space: when TeX recognizes a suitable variable on the
right-hand side of an assignment, it copies the value directly into the
left-hand side and skips the scanning process entirely.
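The contrast between the two forms can be made visible by comparing box
widths. This is a sketch under the assumption of plain TeX defaults;
the use of box registers 0 and 2 and of the letter X is arbitrary. The
second width reported should come out larger than the first, because
the space survives only in the second box.
\begin{lcode}
% \the\vfuzz expands to digits and `pt', so TeX scans a <dimen> and
% then looks for one optional space: the space from \space is eaten.
\setbox0\hbox{\vfuzz\the\vfuzz\space X}
% Here the value is copied from the register and no such look-ahead
% takes place: the space survives and becomes glue in the box.
\setbox2\hbox{\vfuzz\vfuzz\space X}
\message{[\the\wd0] [\the\wd2]}
\end{lcode}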
Here's a step-by-step breakdown of the operation of \cmd{\trimspaces} through
two possibilities, one where both a leading and a trailing space are
present, and one where neither is present.
\begin{lcode}
------------------------------------------------------------------------
Case 1 (spaces present)              Case 2 (no spaces to be removed)
------------------------------------------------------------------------
\def\x{ {xyz} } \trimspaces\x        \def\x{{xyz}} \trimspaces\x

Step 1:                              Step 1:
\begingroup...                       Same as for Case 1.
\expandafter\trimb
\expandafter\noexpand\x Q Q}...

Step 2:              ||              Step 2:              ||
\trimb\noexpand {xyz} Q Q...         \trimb\noexpand{xyz}Q Q...
      ^^^^^^^^^^^^^^^                      ^^^^^^^^^^^^^^^
Here the row of ^^^ indicates the    In this case the first Q is taken
material that is taken as argument   up as part of #1, which is passed
#1 of \trimb, and || indicates the   to \trimc. The second Q added by
tokens that match the macro          \trimb therefore falls after the
delimiter. #1 is now passed to       leftover Q instead of before.
\trimc, with another Q token added;
the leftover <space>Q token pair
follows.

Step 3:              |               Step 3:             |
\trimc\noexpand {xyz}Q Q...          \trimc\noexpand{xyz}QQ...
      ^^^^^^^^^^^^^^^ ^^                   ^^^^^^^^^^^^^^ ^
Here we have #1, delimiter token Q,  The situation at the end of the
and #2. The space before the second  trimmed text ends up being the same
Q is skipped by TeX because it's     as in Case 1, except for the
looking for a nondelimited argument  absence of a space between the Qs.
for #2.

Step 4:                              Step 4:
\afterassignment\endgroup            \afterassignment\endgroup
\vfuzz\the\vfuzz\noexpand {xyz}}...  \vfuzz\the\vfuzz\noexpand{xyz}}...
                         ^
Here the ^ marks the leading space
that is to be removed.

Step 5: \endgroup{xyz}}...           Step 5: \endgroup{xyz}}...
\endgroup is from \afterassignment.

Step 6:                              Step 6:
\toks0{{xyz}}                        \toks0{{xyz}}
^^^^^^^---from \aftergroup           ^^^^^^^---from \aftergroup
\edef\x{\the\toks0}                  \edef\x{\the\toks0}
\end{lcode}
\end{solution}
\begin{comment}
========================================================================
That's a wrap on Exercise 15.
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\chapter{Assorted numbers, skips, and modes}
\section{Exercise}
%%\input{ex016}
% ex016.tex
\begin{comment}
Date: 13 Jan 1994 16:42:27 -0500 (EST)
From: Michael Downes <[email protected]>
Subject: Around the Bend #16
To: [email protected]
X-ListName: TeX-Related Network Discussion List <[email protected]>
************************************************************************
*** Exercise 16:
\end{comment}
\ed{\oposted{1994/01/13}. \arch{exercise.016}.}
Predict the messages that will be produced by plain TeX for the
following test file.
\begin{lcode}
\catcode`\@=11 \newcount\m
\def\msg#1{\advance\m 1 \message{(\number\m): #1}}
\def\T{\msg{T}}\def\F{\msg{F}}
\mag=1728 \hfuzz=1pt \tabskip=1pt \baselineskip=12pt
\topskip=10pt \lineskiplimit=1pt \lineskip=1pt
\setbox0\vbox{%
\mag=\time \ifnum\mag>1500 \T\else\F\fi % (1)
\mag=\number\year \ifnum\mag>1500 \T\else\F\fi % (2)
\hfuzz=99pt \ifdim\hfuzz=99pt \T\else \F\fi % (3)
\tabskip=\z@ \ifdim\tabskip<\p@\T\else\F\fi % (4)
\tabskip=\p@ minus2pt \ifdim\tabskip>\z@\T\else\F\fi % (5)
\baselineskip=-\prevdepth \ifdim\baselineskip=12pt \T\else\F\fi % (6)
\advance\baselineskip 2\topskip % (7)
\ifdim\baselineskip>\@m\p@ \T\else\F\fi %
\lineskiplimit=\z@ \ifnum\lineskiplimit>0 \T\else\F\fi % (8)
\lineskip=\z@skip \ifdim\lineskip>\lineskiplimit \T\else\F\fi % (9)
\kern2pc\ifdim\lastkern=2pc \T \else\F\fi % (10)
\hskip1em
\ifvmode\T\else\ifdim\lastskip>\z@\msg{FT}\else\msg{FF}\fi\fi % (11)
\font\cmrtest=cmr10 \ifx\cmrtest\tenrm \T\else\F\fi % (12)
}
\end
\end{lcode}
Where should \cmd{\relax} be inserted?
\begin{comment}
************************************************************************
Answers will be posted circa January 27, 1994.
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\section{Answers}
%%\input{ans016}
% ans016.tex
\begin{comment}
[There was an error in the first posted version: \twelverm instead of
the first \tenrm in the statement
\font\tenrm = \fontname\tenrm scaled 1200
The posting containing this correction is appended below.]
Date: 27 Jan 1994 11:59:48 -0500 (EST)
From: Michael Downes <[email protected]>
Subject: Around the Bend #16, answers
To: [email protected]
X-ListName: TeX-Related Network Discussion List <[email protected]>
\end{comment}
\ed{\oposted{1994/01/27}. \arch{answer.016}.}
Here is my commentary on Around the Bend \#16.
\begin{lcode}
% \mag=1728 \hfuzz=1pt \tabskip=1pt \baselineskip=12pt
% \topskip=10pt \lineskiplimit=1pt \lineskip=1pt
% \mag=\time \ifnum\mag>1500 \T\else\F\fi % (1)
\end{lcode}
(1): F --- At the time of the \piif{ifnum}, \cmd{\mag} is in the range [0,1440)
depending on what time it was when you ran TeX.
\begin{lcode}
% \mag=\number\year \ifnum\mag>1500 \T\else\F\fi % (2)
\end{lcode}
(2): F --- At the time of the \piif{ifnum}, \cmd{\mag} still has its previous value
because TeX is still scanning for digits to add on after `1994'.
\begin{lcode}
% \hfuzz=99pt \ifdim\hfuzz=99pt \T\else \F\fi % (3)
\end{lcode}
(3): T --- Everything fine, dimension scanning terminated with the
space after `99pt'.
\begin{lcode}
% \tabskip=\z@ \ifdim\tabskip<\p@\T\else\F\fi % (4)
\end{lcode}
(4): F --- \cmd{\z@} is a dimension register, therefore it serves only as the
first part of the glue value that TeX is looking for. At the time of the
\piif{ifdim}, TeX is still looking for `plus' or `minus' and hasn't yet
finished the assignment of \cmd{\tabskip}.
\begin{lcode}
% \tabskip=\p@ minus2pt \ifdim\tabskip>\z@\T\else\F\fi % (5)
\end{lcode}
(5): T --- Glue value scanning terminated properly. \cmd{\p@} is a dimension
register like \cmd{\z@} but the additional clause `minus 2pt' fills out the
glue value to the required three parts. TeX assumes `plus 0pt' when it
finds a `minus' clause without a preceding `plus' clause. Note that TeX
does \emph{not} continue scanning for a possible `plus' after reading a minus
component. Unlike the height, depth, and width components of a \cmd{\vrule} or
\cmd{\hrule}, the components of a glue value have a required order and each
part can only occur once.
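A small sketch of this ordering rule, using the scratch register
\cmd{\skip0} (chosen here purely for illustration; it is not part of the
exercise):
\begin{lcode}
\skip0=1pt plus 2pt minus 3pt  % accepted: `plus' precedes `minus'
\skip0=1pt minus 3pt plus 2pt  % the glue ends after `minus 3pt'; the
                               % words `plus 2pt' are then read as text
\end{lcode}
In the second line TeX raises no error; the stray words simply land in
the typeset output, which makes this kind of slip easy to overlook.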
\begin{lcode}
% \baselineskip=-\prevdepth \ifdim\baselineskip=12pt \T\else\F\fi % (6)
\end{lcode}
(6): T --- At the beginning of a vbox or at the beginning of a TeX run
\cmd{\prevdepth} = -1000pt. So it would seem that \cmd{\baselineskip} should get set
to +1000pt and the test should be False; but \cmd{\prevdepth} is a dimension
register, not a glue register, so following stretch or shrink components
are still possible, and \cmd{\baselineskip} does not yet have its new value at
the time of the test.
\begin{lcode}
% \advance\baselineskip 2\topskip % (7)
% \ifdim\baselineskip>\@m\p@ \T\else\F\fi %
\end{lcode}
(7): F --- Without the factor 2 in front of \cmd{\topskip}, the test would
be True: \cmd{\topskip} is a glue register so TeX would add each component of
\cmd{\topskip} to the corresponding component of \cmd{\baselineskip}; then, having plus
and minus components already in hand, TeX would not scan ahead for
`plus' or `minus'. However, a preceding factor for a glue register
causes TeX to use only the first component of the glue register,
multiplied by the given factor, which means that additional scanning is
then attempted for possible stretch or shrink components.
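The effect of a factor can be observed in isolation with a few lines
like the following. This is only a sketch: the stretch and shrink given
to \cmd{\topskip} and the scratch registers \cmd{\skip0} and \cmd{\skip2} are
assumptions made for the demonstration.
\begin{lcode}
\topskip=10pt plus 2pt minus 1pt
\skip0=\topskip         \message{\the\skip0}
% shows 10.0pt plus 2.0pt minus 1.0pt: all three components are copied
\skip2=2\topskip\relax  \message{\the\skip2}
% shows 20.0pt: only the natural component survives the factor
\end{lcode}
The \cmd{\relax} in the second assignment is the explicit terminator that
statement (7) lacks; without it TeX would go on expanding tokens in
search of `plus' or `minus'.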
\begin{lcode}
% \lineskiplimit=\z@ \ifnum\lineskiplimit>0 \T\else\F\fi % (8)
\end{lcode}
(8): F --- Normal termination of dimension scanning. \cmd{\lineskiplimit}
is a dimen register, not a glue register, so the dimen constant \cmd{\z@} is
sufficient to complete the assignment and TeX scans no further.
\begin{lcode}
% \lineskip=\z@skip \ifdim\lineskip>\lineskiplimit \T\else\F\fi % (9)
\end{lcode}
(9): F --- Normal termination of glue scanning. \cmd{\z@skip} is a glue
register so it suffices to complete the assignment of \cmd{\lineskip}. Compare
to the \cmd{\tabskip} assignments above.
\begin{lcode}
% \kern2pc\ifdim\lastkern=2pc \T \else\F\fi % (10)
\end{lcode}
(10): F --- At the time of the \piif{ifdim}, TeX is still looking for
an optional final space at the end of the dimension value `2pc'. If it
were \verb?2\p@? instead of \verb?2pc?, the test would evaluate to True.
\begin{lcode}
% \hskip1em
% \ifvmode\T\else\ifdim\lastskip>\z@\msg{FT}\else\msg{FF}\fi\fi % (11)
\end{lcode}
(11): FF --- TeX enters horizontal mode as soon as the \cmd{\hskip} command
comes along, before it finishes scanning the skip amount. So the
\piif{ifvmode} test is false. The \piif{ifdim} test is also false because scanning
is not yet complete (TeX is looking ahead for a plus or minus component)
so the glue has not yet been entered into the horizontal list, so it is
not accessible to \cmd{\lastskip}.
For more on the switch into horizontal mode, see `TeX from \cmd{\indent} to
\cmd{\par}', Marek Ry{\'c}ko and Bogus{\l}aw Jackowski, TUGboat 14/3, October
1993 (1993 Annual Meeting Proceedings), pp. 171--176.
\begin{lcode}
% \font\cmrtest=cmr10 \ifx\cmrtest\tenrm \T\else\F\fi % (12)
\end{lcode}
(12): F --- Interestingly, the following versions of the \piif{ifx} test are
also false at that point:
\begin{lcode}
\ifx\cmrtest\undefined, \ifx\cmrtest\relax.
\end{lcode}
The reason is that after `\verb?\font\cmrtest?' TeX immediately sets
\verb?\cmrtest = \nullfont?, before scanning the rest of the font assignment. So the test
\verb?\ifx\cmrtest\nullfont? would yield True. According to the \emph{TeXbook},
the reason for this behavior is to allow statements of the form
\begin{lcode}
\font\cmrtest=cmr10 \cmrtest
\end{lcode}
for switching to the font \cmd{\cmrtest} immediately after it is defined. TeX
does a bit of boomeranging in such a case:
\begin{lcode}
\font\cmrtest % set \cmrtest = \nullfont
=cmr10 % space terminates font name, start looking for
% "at" or "scaled"
\cmrtest % \cmrtest = \nullfont = nonexpandable, not
% "a", not "s"; terminate the font assignment
% and put back the \cmrtest token to be read
% again:
\cmrtest % Now \cmrtest selects the given font
\end{lcode}
Although I sympathize with Knuth's desire to smooth out a potential
problem for naive users, I wonder if it only encourages users to pay
less attention to the nitty-gritty details of scanning and expansion,
and therefore lay themselves open to greater confusion later on when
something similar fails (inconsistently!) to work. I'd have thought it
better to require, and document, proper termination of font assignment
scanning by \cmd{\relax} or whatever. Users would have to be a little more
knowledgeable but they would be rewarded with a more consistent language
to work with. As it stands TeX unnaturally forbids certain
constructions that are perfectly colloquial to anyone who has an ear for
the TeX language, such as
\begin{lcode}
\font\tenrm = \fontname\tenrm\space scaled 1200
\end{lcode}
I hold a similar opinion for the way \cmd{\chardef} and \cmd{\mathchardef} set their
arguments to \cmd{\relax} before scanning the number on the right-hand-side of
the assignment. Occasionally I would \emph{like} to be able to write
something like
\begin{lcode}
\chardef\foo=\ifcase\foo 1\or 2\else 3\fi
\end{lcode}
but TeX doesn't allow that.
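One way around the restriction, offered here only as a sketch, is to
evaluate the conditional into a scratch register first, while \cmd{\foo}
still has its old meaning. The example assumes that \cmd{\foo} (a placeholder
name) has already been defined by \cmd{\chardef}, and it borrows \cmd{\count255}
as scratch space:
\begin{lcode}
\chardef\foo=0
\count255=\ifcase\foo 1\or 2\else 3\fi \relax
\chardef\foo=\count255 \relax
\end{lcode}
Starting from the value 0, \cmd{\foo} ends up as the character constant 1.
The extra assignment is the price of \cmd{\chardef} resetting its argument
before the right-hand side is scanned.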
One could argue that the \cmd{\chardef} behavior should for consistency be
imitated by \cmd{\edef}, \cmd{\xdef} so that if \cmd{\foo} is undefined then
\begin{lcode}
\edef\foo{a\foo}
\end{lcode}
should not give an undefined control-sequence error for the \cmd{\foo} in the
replacement text, but make it temporarily equivalent to \cmd{\relax} and leave
it there. (Of course, this means that executing \cmd{\foo} will then start up
an infinite loop, but my point was that it's the behavior of \cmd{\chardef}
that should be changed to achieve consistency, not the behavior of
\cmd{\edef}.)
%%%========================================================================
At the end of Exercise \#16 there was the question `Where should \cmd{\relax}
be inserted?'
\cmd{\relax} should be inserted just before the \piif{if}... in statements (2), (6),
(7), (11), and (12). In statement (4) \cmd{\z@skip} should be used instead of
\cmd{\z@}; then \cmd{\relax} is unnecessary. A space suffices instead of \cmd{\relax} in
(10). I would also tend to put a \cmd{\relax} at the end of the preliminary
assignments to \cmd{\baselineskip} and \cmd{\lineskip}, as a matter of principle; I
like to make sure that scanning is definitely terminated at the end of a
line, so that if any error occurs during the scanning, TeX will show the
line containing the assignment statement and not a later line. This is
particularly relevant for font assignments: If \pfile{foo10.tfm} does not exist
on your system, then the assignment
\begin{lcode}
\font\foo=foo10
<blank line>
\end{lcode}
will cause TeX to show you the blank line instead of the preceding line
in the error context:
\begin{lcode}
! Font \foo=foo10 not loadable: Metric (TFM) file not found.
<to be read again>
\par
l.2
\end{lcode}
And if the following material is some complicated macro instead of a
blank line, TeX will go into the replacement text of the macro, looking
for `at' or `scaled', before giving the error message!
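Putting the recommendations of this section together, the affected
lines of the test file might be repaired as follows. This is a sketch
of one possible arrangement, not the only one; the numbers in the
comments refer to the statements of Exercise 16.
\begin{lcode}
\mag=\number\year \relax \ifnum\mag>1500 \T\else\F\fi          % (2)
\tabskip=\z@skip \ifdim\tabskip<\p@\T\else\F\fi                % (4)
\baselineskip=-\prevdepth \relax                               % (6)
\ifdim\baselineskip=12pt \T\else\F\fi
\advance\baselineskip 2\topskip \relax                         % (7)
\ifdim\baselineskip>\@m\p@ \T\else\F\fi
\kern2pc \ifdim\lastkern=2pc \T \else\F\fi                     % (10)
\hskip1em \relax                                               % (11)
\ifvmode\T\else\ifdim\lastskip>\z@\msg{FT}\else\msg{FF}\fi\fi
\font\cmrtest=cmr10 \relax \ifx\cmrtest\tenrm \T\else\F\fi     % (12)
\end{lcode}
With these changes each assignment is complete before the test that
follows it is expanded, so every test sees the value just assigned.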
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
Date: 28 Jan 1994 08:01:12 -0500 (EST)
From: Michael Downes <[email protected]>
Subject: Around the Bend #16, answers, correction
To: [email protected]
Instead of
\font\twelverm = \fontname\tenrm\space scaled 1200
read
\font\tenrm = \fontname\tenrm\space scaled 1200
The latter line is what I originally wrote but I changed it in an obtuse
moment a day later, forgetting the very point it was supposed to
illustrate.
\end{comment}
%$
%%\endinput
\chapter{Missing \cs{input} file}
\section{Exercise}
%%\input{ex017}
% ex017.tex
\begin{comment}
Date: 14 Jan 1994 12:44:13 -0500 (EST)
From: Michael Downes <[email protected]>
Subject: Around the Bend #17
To: [email protected]
X-ListName: TeX-Related Network Discussion List <[email protected]>
\end{comment}
\ed{\oposted{1994/01/14}. \arch{exercise.017}.}
%%************************************************************************
%%*** Exercise 17:
When TeX cannot find an input file it prompts with `Please type another
input file name:'. On some systems you can enter `nul' in response to
this prompt to have TeX input a null file and continue processing. On
most systems TeX also allows you to enter a system-dependent end-of-file
character (Control-Z (DOS, VMS), Control-D (Unix), ...?), to which it
responds with an `Emergency stop' instead of continued processing.
An alternative would be to maintain a file called `\pfile{.tex}' containing an
error message so that merely pressing RETURN would cause TeX to read
`\pfile{.tex}' and issue the error message. Unlike the null file case or
EOF-character case, this would allow normal access to the full menu of
error recovery options, including e.g., exiting to an editor, inserting
or deleting tokens, or changing the interaction mode. It would probably
be nice to have the file also accessible under various aliases `\pfile{h.tex}',
`\pfile{help.tex}', `\pfile{?.tex}', `\pfile{q.tex}', `\pfile{quit.tex}',
`\pfile{x.tex}', `\pfile{exit.tex}', or
`\verb?@#&@%$.tex?' corresponding to typical responses from stumped users.
But making a robust `\pfile{.tex}' file for input error recovery is not so
simple a task as it might first seem. One needs to take into account, for
example, the possibility that an \cmd{\input} might be attempted when normal
catcodes or normal \cmd{\endlinechar} are not in effect.
Given the programmability of TeX, an all-encompassing solution is
probably not possible, so this exercise has two parts: consider what
would be a reasonable minimal set of assumptions for an input error
recovery file; and write a \pfile{.tex} file containing a suitable
error message and satisfying the assumptions.
%%************************************************************************
Motivation: From \url{comp.text.tex}:
\begin{lcode}
> From: [email protected] (Wayne Hayes)
> Subject: Why does TeX ignore interupts???
> Message-ID: <[email protected]>
> Date: 24 Dec 93 05:09:35 GMT
>
> If there's ONE thing that annoys me more than anything about a program,
> it's when it refuses to die on command, and for no good reason. The
> absolute worst case is when it's waiting for input and you don't know
> what to tell it, and would like to quit for now.
>
> Thus my extreme annoyance every time I mistype an \input command to TeX
> and it asks me on the terminal "Please input another file name: ", and
> I usually just want to exit and re-edit my file to fix the \input
> error. But TeX refuses to die when I press ^C at this moment, and will
> only die if I send a QUIT (^\), at which point it dumps a
> multi-megabyte core file into the current directory. ARGGGHHHH!! Why
> does it do this? I can't see any good reason why it ignores interupts
> at this point. Is this intended? Is it a bug? Does it drive anyone
> else as nuts as it drives me?? Can it be changed in the next release???
\end{lcode}
It's puzzling that most of the implementations of TeX I know of don't
check for the interrupt key possibility at this prompt [Textures notably
cuts clean through the problem by popping up a dialog box if an input
file is not found]. Seems as if interrupt-key checking at that point
would be a desirable addition to the set of system-dependent changes for
each system.
\begin{comment}
A summary will be posted circa February 17, 1994.
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\section{Answers}
%%\input{ans017}
% ans017.tex
\begin{comment}
[The TUGboat article mentioned below appeared as [info not yet
available--18-Aug-1994]]
Date: 17 Mar 1994 13:04:36 -0500 (EST)
From: Michael Downes <[email protected]>
Subject: Around the Bend #17, answers
To: [email protected]
X-ListName: TeX-Related Network Discussion List <[email protected]>
\end{comment}
\ed{\oposted{1994/03/17}. \arch{answer.017}.}
Exercise 17 (posted January 14) asked for an error recovery file to
provide better recovery from file input errors: When TeX cannot find an
input file, it prompts for an alternative file name and refuses to
continue until a valid file name is entered or the user presses some
(system-dependent) abort key. This can be rather unfriendly, especially
for novice users.
At the request of Barbara Beeton\index{Beeton, Barbara} (TUGboat's editor)
I wrote up the
results of this exercise as an article for publication in TUGboat, so
this posting will be largely redundant with that article.
%%%-------------------------------------
%%DON'T BOTHER, REDEFINE \cmd{\input} INSTEAD
\subsection{Don't bother, redefine \cs{input} instead}
Interestingly, both of the answers I received
(from Victor Eijkhout\index{Eijkhout, Victor} and
Donald Arseneau\index{Arseneau, Donald}) recommended redefining \cmd{\input}
instead of trying to
make an input error recovery file. Donald summed it up thus:
\begin{quotation}
Since verbatim file input is an important mainstream application,
the task is hopeless.
The right approach is to redefine \cmd{\input} and check for the file's
existence at the macro level.
\end{quotation}
I.e., consider the way a typical \cmd{\verbfile} command works: first, start
a group; next, deactivate all special characters such as \verb?\ { } # % }? by
changing their catcodes; then input the desired file; and finally close
the group to restore normal catcodes. If the desired file is not found
and an input error recovery file is read instead, the IERF will not be
able to do anything because of the deactivation of \verb?\ { }? etc.
%%----------------------------------------------
%%DIFFICULTIES ASSOCIATED WITH REDEFINING \cmd{\input}
\subsection{Difficulties associated with redefining \cs{input}}
Generally speaking I am in favor of redefining \cmd{\input} (for instance,
to make up for the deficiency in TeX that the current input file name
is not accessible like \cmd{\jobname} or \cmd{\inputlineno}), but there are
some practical problems:
\begin{itemize}
\item In order to serve all users, the redefinition of \cmd{\input} would
have to go into plain TeX, LaTeX, and any other major macro
packages that are not layered on top of plain TeX or LaTeX.
\item The most commonly used approach to test for the existence of an input
file is
\begin{lcode}
\openin N=file.name \ifeof N ...
\end{lcode}
but for
some TeX implementations \cmd{\openin} will only open a file in the
current directory, and not search through the entire `TeX inputs'
path. I believe that this restriction is canonical in \pfile{TeX.web}
and is therefore overridden only by the system-dependent changes of each TeX
implementation, according to the judgment of the individual implementor.
\item The details of how to redefine \cmd{\input} are nontrivial. If you
redefine \cmd{\input} to take an argument delimited by a space, for
example, there is some risk of bombing on existing files with
statements like
\begin{lcode}
\input x.y\relax
\end{lcode}
It becomes especially nontrivial if you want to use some method other
than simple \verb?\openin ... \ifeof? to test for file existence, so that
the method will be reliable across all systems.
It is worth noting that in LaTeX2e the \cmd{\input} command has
been dramatically overhauled so that it solves, among other things,
some of the problems mentioned here. Anyone doubting the claim that
the work is nontrivial is invited to look at the LaTeX2e definitions.
\item Redefining \cmd{\input} will (generally speaking) not help for the
jobname file itself. When the file name is given on the command line, or
following a ** prompt, the input operation is done directly by
TeX instead of through invoking the control sequence \cmd{\input}.
\item When a non-existing file is called for by a verb-file command,
TeX will prompt the user for a file name, and then if a \pfile{.tex} recovery
file exists, pressing \meta{return} will typeset the contents of that file;
but this is at least as good as inputting a null file, in that you are
not stuck at the prompt with no obvious way to quit.
\end{itemize}
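To make the second and third points concrete, here is a minimal sketch
of the simple \cmd{\openin} test wrapped around \cmd{\input}. The names
\cmd{\safeinput} and \cmd{\testin} are hypothetical, the macro takes a braced
argument rather than imitating the syntax of the primitive, and it
inherits the limitation noted above: on some implementations \cmd{\openin}
does not search the whole inputs path.
\begin{lcode}
\newread\testin
\def\safeinput#1{%
  \openin\testin=#1 %
  \ifeof\testin
    % file could not be opened: give the normal error menu
    \errmessage{File `#1' not found}%
  \else
    \closein\testin
    \input #1 %
  \fi}
\end{lcode}
A call such as \verb?\safeinput{fzrg}? then stops with an ordinary error
message, and its full menu of recovery options, instead of the file
name prompt. A more careful version would close the conditional before
the file is read, which is part of what makes the real thing nontrivial.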
%%----------------------------------------------------------
%%SOMEBODY ALREADY PUBLISHED SOME INPUT ERROR RECOVERY FILES
\subsection{Somebody already published some input error recovery files}
Coincidentally, reading through one of my books a few days after posting
Around the Bend \#17, I found that someone had already written and
published a suite of input error recovery files:
Frank Mittelbach\index{Mittelbach, Frank}, \emph{The
LaTeX Companion}, section 14-4 \ed{First edition}.
%%------------------------------------------------------
%%BUT WHAT THE HECK, HERE ARE MY SLIGHTLY DIFFERENT ONES
\subsection{But what the heck, here are my slightly different ones}
The basic idea is to create a file named \pfile{h.tex} that will produce an
\cmd{\errmessage}\verb?{...}? statement. Copies (or links) of this file will be made
under several different names corresponding to the typical user
responses to an input file error, to the extent that the operating
system permits.
So a first attempt would be something like this:
\begin{lcode}
\errmessage{Enter x to exit or ? to see other options}
\end{lcode}
Suppose we test this with a simple test file:
\begin{lcode}
% This is line 1
% This is line 2
\input fzrg \relax % This is line 3
% This is line 4
\end
\end{lcode}
The on-screen result looks like this:
\begin{lcode}
! I can't find file `fzrg.tex'.
l.3 \input fzrg
\relax % This is line 3
Please type another input file name: h
(h.tex
! Enter x to exit or ? to see other options.
l.1 ... to exit or ? to see other options}
?
\end{lcode}
Then if the user enters \texttt{?} they will see
\begin{lcode}
Type <return> to proceed,
S to scroll future error messages,
R to run without stopping,
Q to run quietly,
I to insert something,
E to edit your file,
1 or ... or 9 to ignore the next 1 to 9 tokens of input,
H for help, X to quit.
? x
\end{lcode}
Now let's examine this solution a little more closely, to ask what are
the potential problems, and what assumptions can be done away with?
One problem is the possibility of an unusual catcode for space, question
mark, left brace, right brace, backslash, or \cmd{\endlinechar}. For the
backslash (and the letters) we don't have much choice; if they don't
have normal catcodes, \pfile{h.tex} cannot issue an \cmd{\errmessage} command, or even
try to fix up the catcodes. (This is why the problem of verbatim file
input is insoluble, if primitive \cmd{\input} is used.) Note that for users of
a macro package such as texinfo, which has \verb?@? for the escape character
instead of backslash, a different IERF would be required.
The \cmd{\endlinechar} problem can be solved by adding a percent sign at the
end of the line:
\begin{lcode}
\errmessage{...}%
\end{lcode}
but at the cost of a new assumption: percent must have catcode 14. This
and some of the other catcode assumptions can be removed with a bit of
extra work:
\begin{lcode}
\begingroup\chardef\%37\catcode\%14\chardef\ 32\catcode\ 10\relax%
\catcode123 1\catcode125 2\catcode63 12 %
\errmessage{%
Enter x to exit or ? to see other options}%
\endgroup\endinput%
\end{lcode}
This enforces the desired catcodes for \verb|space, %, {, }, and ?|; and
putting \% at the end of each line makes \cmd{\endlinechar} harmless, no matter
what its prevailing value and catcode might happen to be. The
\cmd{\begingroup} ... \cmd{\endgroup} pair of course keeps the catcode changes local,
just in case (though I expect that the user will normally choose to exit
anyway). I write
\begin{lcode}
\chardef\%37\catcode\%14
\end{lcode}
in preference to the alternatives
\begin{lcode}
\catcode37 14
\catcode37=14
\catcode37'16
\catcode37"E
\catcode`\%14
\end{lcode}
which require assuming a usable catcode for one extra character (space
or = or ' or ...). Even using \cmd{\string}, as in
\begin{lcode}
\catcode37\string"E
\end{lcode}
would fail if \texttt{"} had catcode 5, 9, 10, 11, 14, or 15.
Here now is the screen output produced by the above IERF:
\begin{lcode}
! I can't find file `fzrg'.
l.3 \input fzrg
\relax % This is line 3
Please type another input file name: h
(h.tex
! Enter x to exit or ? to see other options.
l.5 Enter x to exit or ? to see other options}
%
? x
\end{lcode}
%%------------------
%%BEST FINAL VERSION
\subsection{Best final version}
There is one fairly obvious drawback of the above IERF: the error
message appears twice on screen, once from \cmd{\errmessage} and once in the
error context shown for line 5. There is a little trick that can be used
to fix that: Use only the error context for showing the message text, by
putting it in a comment rather than in the argument of \cmd{\errmessage}!
[Cf.\ the comment after \cmd{\patterns} in the original TeX hyphenation patterns
file \pfile{hyphen.tex}.]
\begin{lcode}
\begingroup\chardef\%37\catcode\%14\chardef\?63\catcode\?12\relax%
\chardef\{123\catcode\{1\chardef\ 32\catcode\ 2\relax%
\errmessage{Input\string canceled\string ..%
 % Enter x to exit or ? to see other options %
\endgroup\endinput%
\end{lcode}
I have thrown in some extra cleverness with the catcode of space to
clean up the screen output a tiny bit more. The result looks like this:
\begin{lcode}
! I can't find file `fzrg'.
l.3 \input fzrg
\relax % This is line 3
Please type another input file name: h
(h.tex
! Input canceled ...
l.4
% Enter x to exit or ? to see other options %
? x
\end{lcode}
Frank Mittelbach's IERF solution differs from mine by providing a set of
files that attempt to mimic standard TeX error recovery according to
their name: The file \pfile{s.tex}, for example, arranges to switch into
\cmd{\scrollmode} and continue processing, as would happen if you entered `s'
at a normal error message prompt. And there are files named \pfile{e.tex},
\pfile{x.tex}, \pfile{q.tex} that mimic the corresponding error message actions. His
IERFs also don't bother to worry about possible odd catcodes for \{,
space, \}, etc.---an approach whose simplicity perhaps outweighs the
minor added robustness of my version.
%%-----------
%%CONCLUSIONS
\subsection{Conclusions}
It seems that it would be a worthy service to their users if the authors
of all TeX implementations took a second look at how input file errors
are handled and added suitable actions depending on the operating
system. For example, under DOS it is difficult to create a file named
\pfile{.tex}, so perhaps emTeX, PCTeX, TurboTeX, etc., should check for the case
when the user presses the \meta{return} key at the prompt, and automatically
exit instead of trying to input a highly improbable file! Similar
arguments would hold for an input file name of \pfile{?} or \pfile{?.tex}
for operating
systems where \texttt{?} is an OS wild-card character.
And another part of improving the input error handling might be to add
to their standard distributions a set of IERFs in the TeX inputs area,
to help users who are using some macro package \emph{other} than LaTeX2e.
(Or, even for LaTeX2e users, to help in the case when it is the jobname
file itself that was not input-able.) I recommend of course my IERF
given above; my feelings would not be deeply wounded, however, if
Frank's version gets used instead. Installing either version would be
much better for end users than none at all.
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\chapter{Page breaking}
\section{Exercise}
%%\input{ex018}
% ex018.tex
\begin{comment}
Date: 21 Apr 1994 09:48:48 -0400 (EDT)
From: Michael Downes <[email protected]>
Subject: Around the Bend #18
To: [email protected]
X-ListName: TeX-Related Network Discussion List <[email protected]>
========================================================================
*** Exercise 18:
\end{comment}
\ed{\oposted{1994/04/21}. \arch{exercise.018}.}
On page 254 of the \emph{TeXbook} the following output routine is described:
\begin{lcode}
\output={\unvbox255 \penalty\outputpenalty}
\end{lcode}
and in the ensuing text Knuth writes `If the \cmd{\vsize} hasn't changed, and
if no insertions have been held over, the same page break will be
found.' This claim is rather false. Why? How should the output routine
be rewritten to work as intended?
%%========================================================================
Thanks to William Baxter\index{Baxter, William}
%([email protected])
for contributing this question.
\begin{comment}
Michael Downes %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[email protected] (Internet) ASCII 32--54,55--126: !"#$%&'()*+,-./0123456
789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
\end{comment}
%$
%%\endinput
\section{Answers}
%%\input{ans018}
% ans018.tex
\begin{comment}
Date: 27 May 1994 08:19:39 -0400 (EDT)
From: Michael Downes <[email protected]>
Subject: Around the Bend #18, answer
To: [email protected]
\end{comment}
\ed{\oposted{1994/05/27}. \arch{answer.018}.}
I intended to post this sooner but in researching the answer it turned
out that in order to clear up a couple of nagging questions I had to
follow some side trails a long way.
%%Answer to Around the Bend #18:
Exercise 18 (21 April 1994) pointed out that the output routine
\begin{lcode}
\output={\unvbox255 \penalty\outputpenalty}
\end{lcode}
described in the \emph{TeXbook} p 254 doesn't exactly work as intended: `If
the \cmd{\vsize} hasn't changed, and if no insertions have been held over, the
same page break will be found.'
The same page break will be found only if the original page break
occurred at a penalty item. Otherwise (\emph{TeXbook}, p 125) TeX
sets \cmd{\outputpenalty}\texttt{=10000} before firing up the user's output
routine. Consequently, the output routine constructs a vertical list in
which the original break point has disappeared.
By an optimization found in section 890 of \emph{TeX: The Program}, the
penalty between two paragraph lines---the sum of all applicable
penalties from the set \cmd{\interlinepenalty}, \cmd{\clubpenalty},
\cmd{\widowpenalty}, \cmd{\displaywidowpenalty}, and \cmd{\brokenpenalty}---is
not actually added to
the vertical list unless it is nonzero. Thus when \cmd{\interlinepenalty} =
0 (default from IniTeX/plain TeX) and hyphenated lines are not too
frequent, `most' pairs of lines in a paragraph have no intervening
penalty. And there is usually no penalty between ordinary text
paragraphs. Thus an \cmd{\outputpenalty} value of 10000 will occur fairly
often in practice.
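A quick way to check this on a given document, assuming plain TeX
(whose default output routine is \cmd{\plainoutput}), is to report
\cmd{\outputpenalty} on the terminal each time a page is broken:
\begin{lcode}
\output={\message{[outputpenalty=\the\outputpenalty]}\plainoutput}
\end{lcode}
On ordinary running text one would expect most of the reported values
to be 10000, in line with the argument above.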
W. E. Baxter\index{Baxter, William}\index{Baxter, W E|see{Baxter, William}}
(the submitter of this exercise)
looked into the
possibility of recompiling TeX without the cited optimization, but found
that the resulting version fails the trip test.
In order for the example to work as intended it would have to be
rewritten as
\begin{lcode}
\output={\unvbox255
\ifnum\outputpenalty=10000 \else \penalty\outputpenalty\fi}
\end{lcode}
For completeness it should be pointed out that the output routine
could come even closer to the goal of `doing nothing' if the parameter
\cmd{\holdinginserts}, added in TeX version 3.0 (circa 1990), were set to
some value greater than 0, so that the state of floating inserts would
be preserved; but that has to be done before the output routine is
entered.
I would have said that such a do-nothing output routine is useless, but
as a matter of fact I wrote something rather close to it as one cycle of
a multi-cycle output routine a couple of years ago. The goal was to look
at the values of \cmd{\pagetotal}, \cmd{\pagestretch}, etc in order to print a
complete survey of the page contents in a marginal note, to help the
person dealing with page break decisions when the automatic breaks
turned out to be inadequate. Unfortunately, the values of \cmd{\pagetotal} etc
reported in the output routine are not exactly the values that are
needed, because if the page break did not occur at a forcing penalty
($\le -10000$) then the values include material on the recent contributions
list, yet only the material up to the chosen page break is relevant. So
in order to get accurate values I had to insert a do-almost-nothing
cycle that merely inserted a forcing penalty at the break point after
dumping the contents of \texttt{box255} back on the main vertical list.
%%------------------------------------------------------------------------
\subsection{Some historical research}
If you have an older copy of the \emph{TeXbook} (pre-1990), as I do, the
above-mentioned section on p 125 about \cmd{\outputpenalty} says that it is
set to 0 (rather than 10000) if the break did not occur at a penalty
item. Thus the output routine example on p 254 seems to be another
case of a well-known phenomenon: documentation failing to keep up with
changes in the software. Make a note of it in your copy!
Excerpt from the \emph{TeXbook} errata files:
\begin{verbatim}
\bugonpage A125, lines 13--29 (9/23/89)
\ddanger \looseness=-1
When the best page break is finally chosen, \TeX\ removes everything after
the chosen breakpoint from the bottom of the ``current page,'' and puts it
all back at the top of the ``recent contributions.'' The chosen
breakpoint itself is placed at the very top of the recent contributions.
If it is a penalty item, the value of the penalty is recorded in
^|\outputpenalty| and the penalty in the contribution list is changed
to $10000$; otherwise |\outputpenalty| is set to 10000.
\end{verbatim}
It's not clear to me from a cursory examination of \pfile{tex82.bug},
\pfile{errata-five.tex}, and \pfile{tex.web} when this change occurred
in \pfile{tex.web}, but it
seems that it must have occurred rather early, perhaps in the work on
TeX82 (1982--1983); if so, then the claim that \cmd{\outputpenalty} was set to
0 was a five-year-old oversight when Knuth changed it in 1989. In
\pfile{tex82.bug} there is no reference to output\_penalty or even inf\_penalty
near 9/23/89, and tracing backwards from there didn't turn up anything
that seemed relevant to me. Furthermore, a copy of TeX version 2 (circa
1985) that I was able to dig up had \cmd{\outputpenalty} set to 10000 instead of 0,
following the erratum, and my 1986 copy of \emph{TeX: The Program} (i.e.
the woven version of tex.web) agrees with that.
Thanks again to W. E. Baxter\index{Baxter, William} for contributing
this exercise and several parts of the answer.
%%\endinput
\chapter{Author lists}
%%\input{bend019}
% bend019.tex
\section{Exercise (hard)}
\ed{\oposted{1994/08/23}}
First, an
announcement: Archive copies of exercises and solutions in the
Around the Bend series are now available over the network, thanks to the
ongoing remarkably fine service of CTAN (\url{ftp.shsu.edu},
\url{ftp.dante.de}, \url{ftp.tex.ac.uk},\ldots). Look in the directory
\url{tex-archive/info/aro-bend}.
%========================================================================
%%*** Exercise 19 (hard):
In a multi-author LaTeX article, author names are normally given
as a list with \cmd{\and} separating the names, for example
\begin{lcode}
Arthur B. Clark\and Damian Edlan\and Ferency G. van Hoep
\end{lcode}
The way the author names are laid out on the printed page may
vary widely from one publication to another. The generic
`article' documentclass provides a definition for \cmd{\and} to print
the author names together with their addresses in an array form.
But there is no support in basic LaTeX to print such a list of
names in standard series form
\begin{lcode}
A (1 author)
A and B (2 authors)
A, B, and C (3+ authors)
\end{lcode}
\begin{enumerate}
\item Write a macro \cmd{\andlist} to convert a list of author names to
series form. Assume that the names reside in a macro \cmd{\@author}.
Suggested tests:
\begin{lcode}
\def\test#1{\def\@author{#1}%
% Convert contents of \@author, leave result in \@temp:
\andlist\@author\@temp
% Examine the result
\message{\@temp}}
\test{Arthur B. Clark}
\test{Arthur B. Clark\and Damian Edlan}
\test{Arthur B. Clark \and Damian Edlan \and Ferency G. van Hoep}
\test{Arthur B. Clark \and Damian Edlan
\and Ferency G. van Hoep \and Irene Jackson}
\end{lcode}
to produce
\begin{lcode}
Arthur B. Clark
Arthur B. Clark and Damian Edlan
Arthur B. Clark, Damian Edlan, and Ferency G. van Hoep
Arthur B. Clark, Damian Edlan, Ferency G. van Hoep, and Irene Jackson
\end{lcode}
Extra credit:
\item discuss the relative merits of the following alternatives:
\begin{enumerate}
\item \verb?\andlist\@authors\@temp? The function \cmd{\andlist}
takes two macro names
as arguments, converts the contents of the first macro and leaves
the result in the second macro.
\item \verb?\andlist\@authors? The function \cmd{\andlist}
takes one macro name as its argument and replaces the
contents of the macro with the converted version of its contents.
\item \verb?\andlist\@authors? The function \cmd{\andlist}
takes one macro name as its argument; the converted contents
of the macro are executed instead of being put back into the
macro.
\item other?
\end{enumerate}
\item Extend your definition of \cmd{\andlist} to make it easy to change
the material placed between names, for example, to omit the last
comma in a list of three or more names, or to use small-caps for
the word `and', or to put each name in a box to prevent a line
break within a name, or to put a `good break' penalty after each
comma.
\item Consider the relative merits of different data structures:
\begin{lcode}
1. A\and B\and C
2. A,B,C
3. \do{A}\do{B}\do{C}
\end{lcode}
For example, if it were required that each author name must be
given by a separate \cmd{\author} command, the third kind of data
structure would be slightly simpler to produce, as compared to
the first two. Having the data in the second form might make it
possible for \cmd{\andlist} to use some of the pre-existing internal
routines in LaTeX for processing comma-separated lists. And so forth.
\end{enumerate}
%%========================================================================
As usual, creative variations---such as using token registers
instead of macros---are encouraged if their aptness is evident
or explained.
Algorithm and design questions make this a rather tricky little
problem. (Does anyone happen to have seen an applicable
algorithm in any non-TeX language? I imagine it may be needed in
some SGML applications.)
Solutions will be posted circa September 12, 1994.
%%Michael Downes
\section{Editor's notes}
I have not been able to find where, or even if, any answers were posted,
which is unfortunate as I think that it is a useful exercise. I therefore
decided to have a go at it myself, while claiming editorial privilege to
answer a slightly different exercise, done in a different order.
The basic question is how to convert a list of names separated by a
particular token (\cmd{\and} in the exercise) to a list of the same names
with different separators (for example `,'). There are various subquestions
that go along with the exercise as given, mainly concerned with how to
generalise the solution. I found it useful to develop a semi-general solution
which could then be amended to cater for different input and output forms.
Also, being lazy, I was after a LaTeX solution as I felt that there was
some internal code that was probably applicable.
There are basically three separators that may appear in the final list:
\begin{itemize}
\item If there is only a single name in the list, no separator is required.
\item If there are two names then a separator is required between them,
call this \cmd{\pairsep}.
\item If there are three or more names in the list then there is a separator
between the penultimate and last name (call this \cmd{\lastsep}),
and separators between all the previous names, and I'll call this
\cmd{\midsep}.
\end{itemize}
In the initial exercise as given these are, respectively, `and', `, and'
and `,'. The implication here is that for the general case of more than
two entries we need to know
when we are coming to the end of the list so that we can insert \cmd{\lastsep}
just before outputting the last list entry.
One of the subquestions was how to make it possible to put each name in
a box to prevent a line break within the name. To do this implies that
each name
should be output as the argument of a macro, say \cmd{\opname}, that can be
used to perform some action on the name.
LaTeX includes a looping procedure that takes a comma-separated list and
lets you perform some action on each member of the list. Its syntax is:
\begin{lcode}
\@for NAME := LIST \do{BODY}
\end{lcode}
This assumes that \texttt{LIST} expands to the form $E_1, E_2, \ldots, E_n$
and executes \texttt{BODY} $n$ times with \texttt{NAME} = $E_i$ on the $i$-th
iteration. This is what I will use as the basis of my solution.
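As a small illustration of the loop (a sketch only; \cmd{\namelist} is a made-up macro, and the code is assumed to sit in a \pfile{.sty} file or between \cmd{\makeatletter} and \cmd{\makeatother}), the following should write each entry to the terminal in turn:
\begin{lcode}
\def\namelist{Alpha,Beta,Gamma}% hypothetical list
\@for\@tempa:=\namelist\do{\typeout{Entry: \@tempa}}
\end{lcode}
Note that \cmd{\@for} splits only at the commas, so a space typed after a comma stays at the front of the next entry.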
Here's my basic general solution, where the list of names is of the form
\texttt{A,B,C,D,\ldots N}. I'm assuming that this is in a \pfile{.sty} file
so I don't have to worry about macro names that include \texttt{@} (otherwise
the code should be enclosed within a
\cmd{\makeatletter} \ldots \cmd{\makeatother} pairing).
\begin{lcode}
%% these are in LaTeX kernel
\providecommand{\z@}{0}
\providecommand{\@ne}{1}
\providecommand{\tw@}{2}
\newcount\totalcnt % total number of names in list
\newcount\entrycnt % number of `current' name
\newcommand*{\opname}[1]{#1}
\newcommand*{\pairsep}{\space and}
\newcommand*{\midsep}{\unskip,}
\newcommand*{\lastsep}{\unskip, and}
%% \commaed is the key part of the solution, converting
%% the separators in a comma-separated list to something else
\newcommand*{\commaed}[1]{%
%%% #1 is comma-separated list of names
%% get number of names
\totalcnt\z@% zero \totalcnt
\@for\@tempa:=#1\do{\advance\totalcnt\@ne}%
%% process the list
\entrycnt\@ne% initialise \entrycnt to 1
\@for\@tempa:=#1\do{%
\advance\entrycnt\@ne% \entrycnt is now (position of current name) + 1
\ifnum\totalcnt=\@ne
%% a single entry
\opname{\@tempa}
\else
\ifnum\totalcnt=\tw@
%% just two entries
\ifnum\entrycnt=\tw@
\opname{\@tempa}\pairsep
\else
\opname{\@tempa}
\fi
\else
%% More than two entries in list
\ifnum\entrycnt<\totalcnt
%% in the middle of the list
\opname{\@tempa}\midsep
\else
\ifnum\entrycnt=\totalcnt
%% current name is the penultimate
\opname{\@tempa}\lastsep
\else
%% this is the last name
\opname{\@tempa}
\fi
\fi
\fi
\fi
}% end of do
}% end of definition
\end{lcode}
The macro \cmd{\commaed} takes a comma-separated list as its argument and
outputs a revised list.
\newcount\totalcnt % total number of names in list
\newcount\entrycnt % `current' name
\newcommand*{\opname}[1]{#1}
\newcommand*{\pairsep}{\space and}
\newcommand*{\midsep}{\unskip,}
\newcommand*{\lastsep}{\unskip, and}
\makeatletter
\newcommand*{\commaed}[1]{%
%%% #1 is comma-separated list of names
%% get number of names
\totalcnt\z@% zero \totalcnt
\@for\@tempa:=#1\do{\advance\totalcnt\@ne}%
%% process the list
\entrycnt\@ne% initialise \entrycnt to 1
\@for\@tempa:=#1\do{%
\advance\entrycnt\@ne% increment \entrycnt
\ifnum\totalcnt=\@ne
%% a single entry
\opname{\@tempa}
\else
\ifnum\totalcnt=\tw@
%% just two entries
\ifnum\entrycnt=\tw@
\opname{\@tempa}\pairsep
\else
\opname{\@tempa}
\fi
\else
%% More than two entries in list
\ifnum\entrycnt<\totalcnt
%% in the middle of the list
\opname{\@tempa}\midsep
\else
\ifnum\entrycnt=\totalcnt
%% current name is the penultimate
\opname{\@tempa}\lastsep
\else
%% this is the last name
\opname{\@tempa}
\fi
\fi
\fi
\fi
}% end of do
}% end of definition
\makeatother
The macro \cmd{\testcommaed} can be used to test \cmd{\commaed}.
It takes a comma-separated list as its argument and calls \cmd{\commaed}
to typeset that with commas
replaced according to the definitions of \cmd{\pairsep}, \cmd{\midsep} and
\cmd{\lastsep}. The macro \cmd{\opname} is used to typeset the elements. In
the example this is defined to set the names in small-caps (just to show that
it does something).
\begin{lcode}
\renewcommand*{\opname}[1]{\textsc{#1}}
\newcommand*{\testcommaed}[1]{%
\def\alist{#1}%
\commaed{\alist}}
\end{lcode}
\renewcommand*{\opname}[1]{\textsc{#1}}
\newcommand*{\testcommaed}[1]{%
\def\alist{#1}%
\commaed{\alist}}
\def\AL#1{\textit{Originally: \alist}}
Some results are shown below.
\begin{itemize}
\item \verb?\testcommaed{Arthur B. Clark} ->? \\
\testcommaed{Arthur B. Clark}
\item \verb?\testcommaed{Arthur B. Clark, Damian Edlan} ->? \\
\testcommaed{Arthur B. Clark, Damian Edlan}
\item \verb?\testcommaed{Arthur B. Clark, Damian Edlan ,? \\
\verb?Ferency G. van Hoep} ->? \\
\testcommaed{Arthur B. Clark, Damian Edlan , Ferency G. van Hoep}
\item \verb?\testcommaed{Arthur B. Clark, Damian Edlan,? \\
\verb?Ferency G. van Hoep , Irene Jackson} ->? \\
\testcommaed{Arthur B. Clark, Damian Edlan,
Ferency G. van Hoep , Irene Jackson}
\end{itemize}
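Changing the material placed between the names is then a matter of redefining the separator macros. As a sketch (one possible choice, not the only one), the following omits the comma before the final `and' and sets the word in small caps:
\begin{lcode}
% Sketch: no comma before the final `and'; `and' in small caps
\renewcommand*{\pairsep}{\space\textsc{and}}
\renewcommand*{\lastsep}{\unskip\space\textsc{and}}
\end{lcode}
As with the original definitions, the space in front of the following name is the one typed after the comma in the list itself.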
The macro \cmd{\anded} is similar to \cmd{\commaed} except that the
separator between list elements is \cmd{\and} instead of a comma. It is
implemented using \cmd{\commaed}.
\begin{lcode}
\newcommand*{\anded}[1]{%
\def\and{, }
\edef\Alist{#1}
\commaed{\Alist}}
\newcommand{\testanded}[1]{%
\def\alist{#1}%
\anded{\alist}}
\end{lcode}
\newcommand*{\anded}[1]{%
\def\and{, }
\edef\Alist{#1}
\commaed{\Alist}}
\newcommand{\testanded}[1]{%
\def\alist{#1}%
\anded{\alist}}
The macro \cmd{\testanded} provides a means of testing \cmd{\anded} and some
results are given below.
\begin{itemize}
\item \verb?\testanded{Arthur B. Clark} ->? \\
\testanded{Arthur B. Clark}
\item \verb?\testanded{Arthur B. Clark\and Damian Edlan} ->? \\
\testanded{Arthur B. Clark\and Damian Edlan}
\item \verb?\testanded{Arthur B. Clark \and Damian Edlan\and? \\
\verb?Ferency G. van Hoep} ->? \\
\testanded{Arthur B. Clark \and Damian Edlan\and
Ferency G. van Hoep}
\item \verb?\testanded{Arthur B. Clark\and Damian Edlan\and? \\
\verb?Ferency G. van Hoep \and Irene Jackson} ->? \\
\testanded{Arthur B. Clark\and Damian Edlan\and
Ferency G. van Hoep \and Irene Jackson}
\end{itemize}
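The conversion step inside \cmd{\anded} can be examined on its own. In this sketch the names are shortened to single letters and \cmd{\show} is used to display what has been stored:
\begin{lcode}
\def\and{, }
\edef\Alist{A\and B\and C}
\show\Alist % TeX should report: macro:->A, B, C
\end{lcode}
Because \cmd{\edef} expands everything in the list, names that contain fragile commands may fail at this step. Using the kernel's \cmd{\protected@edef} instead is one possible remedy, though it is untested here.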
Finally, here is an answer to Michael's initial exercise (with a change
in the names of macros to avoid the use of \texttt{@}). This is built on the
\cmd{\anded} macro. Test results are shown after the code definitions.
\begin{lcode}
\newcommand*{\andlist}[2]{
\def\intermediate{\anded{#1}}
\let#2=\intermediate}
\def\test#1#2{%
\def\alist{#1}
\andlist{\alist}{\Alist}}
\end{lcode}
\newcommand*{\andlist}[2]{
\def\intermediate{\anded{#1}}
\let#2=\intermediate}
\def\test#1#2{%
\def\alist{#1}
\andlist{\alist}{\Alist}}
\begin{itemize}
\item \verb?\test{Arthur B. Clark}{\Alist} \Alist ->? \\
\test{Arthur B. Clark}{\Alist} \Alist
\item \verb?\test{Arthur B. Clark\and Damian Edlan}{\Alist} \Alist ->? \\
\test{Arthur B. Clark\and Damian Edlan}{\Alist} \Alist
\item \verb?\test{Arthur B. Clark \and Damian Edlan\and? \\
\verb?Ferency G. van Hoep}{\Alist} \Alist ->? \\
\test{Arthur B. Clark \and Damian Edlan\and Ferency G. van Hoep}{\Alist} \Alist
\item \verb?\test{Arthur B. Clark\and Damian Edlan\and? \\
\verb?Ferency G. van Hoep \and Irene Jackson}{\Alist} \Alist ->? \\
\test{Arthur B. Clark\and Damian Edlan\and
Ferency G. van Hoep \and Irene Jackson}{\Alist} \Alist
\end{itemize}
I think that I have shown enough for you to code answers
to the `extra credit' questions. By now, it should be obvious that I find
the \verb?A,B,C...? data structure to be advantageous compared with the
\verb?A\and B\and C...? structure because of the LaTeX \cmd{\@for} code I used.
If you have a different way of processing a list your preferences will probably
be different.
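For comparison, here is a sketch of how the third data structure, \verb?\do{A}\do{B}\do{C}?, might be processed without \cmd{\@for}. The names \cmd{\doseries}, \cmd{\docnt} and \cmd{\dototal} are invented for this illustration, and the separators are hard-wired to keep it short:
\begin{lcode}
\newcount\docnt   % hypothetical: position of current name
\newcount\dototal % hypothetical: total number of names
\newcommand*{\doseries}[1]{% #1 is a macro holding \do{A}\do{B}...
  \begingroup
  \dototal=0 \docnt=0
  \def\do##1{\advance\dototal 1 }#1% first pass: count
  \def\do##1{%
    \advance\docnt 1
    \ifnum\docnt>1
      \ifnum\dototal=2 \space and\space
      \else\ifnum\docnt=\dototal ,\space and\space
      \else ,\space \fi\fi
    \fi
    ##1}%
  #1% second pass: typeset with separators
  \endgroup}
% usage: \def\authors{\do{A}\do{B}\do{C}} \doseries{\authors}
\end{lcode}
Here each separator is placed before a name rather than after it, so the only thing that has to be known in advance is the total count.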
%%\endinput
\chapter{Math symbols}
%%\input{bend020}
% bend020.tex
\section{Exercise}
\ed{\oposted{1994/08/30}}
%%%*** Exercise 20:
Why does plain.tex define \cmd{\surd} like this:
\begin{lcode}
\def\surd{{\mathchar"1270}}
\end{lcode}
instead of like this:
\begin{lcode}
\mathchardef\surd="0270
\end{lcode}
?
%========================================================================
% Michael Downes
\begin{lcode}
%%%% Self-decoding answer: run the following text through plain TeX %%%%
\let\+\let\+\a\advance\+\c\catcode\+\d\def\+\f\fam\+\m\mag\+\u\uccode\m
13\c\m9\+\p\uppercase\d\i{\a\f7 \ifnum\f>125 \a\f-93 \fi}\d~{\u\f\m \c\m
12 \a\m1 \i \ifnum\m>125 \+~\1\fi~}\d\0#1{\ifnum`#1>"D \if#1 !\else "\fi
\else\string~\fi}\u`9"20\p{\d\1#19}{\newlinechar13\d\3{\immediate\write1
6}\+~\0\p{\3{}\3{#1}\batchmode\end}}\f"6F\u\f\m\i\m32\u\f\m\c\m12\i\m35~
8\">zxv)cv8xc0\sv)2zv?z\sv},{doo;sz$;"0xsZZ;U^)2l2^x~}%,O{hhvjxcs0lz"v^v
U^)2cxsv^)cUv>9)2v)2zv"LUecNo7zx)9l^NNLvlz\)zxzsvc\v)2zvU^)2v^E9"mvFN^""
v%fff)2zv$9x")vs9+9)fffU^Gz"o^vU^)2cjv^)cU_v>2c"zvlc\)z\)"v^xzvlz\)zxzsv
eLv`z|v9$v)2zLv^xzv\c)29\+oe0)v^v"9\+Nzv$c\)vl2^x^l)zxkv)2zvzE)x^v"z)vc$
vex^lz"v)2z\vl^0"zv`z|v)coj^lGv)2zvlz\)zxzsvl2^x^l)zxv9\)cv^vU^)2cxsv^)c
U_vxz"0N)9\+v9\v)2zosz"9xzsvU^)2cxsv"j^l9\+vc\v)2zvNz$)v^\svx9+2)mv=\v)2
zvc)2zxv2^\so;U^)2l2^xsz$;"0xsy~}{,O{_v>29Nzv")9NNvjxcs0l9\+v^vU^)2cxsv^
)cU_v>c0NsoL9zNsv^vxz^NNLv9\)zxz")9\+vjc"9)9c\vc$v)2zv"LUecNvCjxce^eNLv\
c)v>2^)vLc0o>c0Nsv+0z""kv)xLv9)v^\sv"zzJmvF$mvR0Nzv%%v9\v8jjz\s9Evbvc$v`
2zv`z|eccGm >c0Nsv+0z""kv)xLv9)v^\sv"zzJmvF$mvR0Nzv%%v9\v8jjz\s9Evbvc$v`
\end{lcode}
\section{Answer}
\begin{comment}
%%%% the result of TeXing the above
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
(./codeans20.tex
Answer to Around the Bend #20:
\end{comment}
\ed{I ran the above through pdfTeX and it produced the following (less the formatting
that I added to the plain ASCII) as the answer. I suspect, though, that the command
\cs{ver} below is a typo and should not be there.}
\begin{lcode}
\def\surd{{\mathchar"1270}}
\end{lcode}
produces a mathord atom with the symbol
vertically centered on the math axis. Class 1---the first digit---makes
a mathop atom, whose contents are centered by TeX if they are nothing
but a single font character; the extra set of braces then causes TeX to
pack the centered character into a mathord atom, resulting in the
desired mathord spacing on the left and right. On the other hand
\begin{lcode}
\ver\mathchardef\surd="0270
\end{lcode}
while still producing a mathord atom, would
yield a really interesting position of the symbol (probably not what you
would guess; try it and see). Cf. Rule 11 in Appendix G of \emph{The TeXbook}.
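To try it, a small plain TeX test file along the following lines should do. The names \cmd{\surdA} and \cmd{\surdB} are invented here so that plain's own \cmd{\surd} is left alone:
\begin{lcode}
\def\surdA{{\mathchar"1270}}
\mathchardef\surdB="0270
$x \surdA y \qquad x \surdB y$
\bye
\end{lcode}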
%%\endinput
\chapter{Variable number of arguments}
%%\input{bend021}
% bend021.tex
\begin{comment}
\documentclass{memoir}
\usepackage{bend}
\usepackage{comment}
\usepackage{url}
\begin{document}
\end{comment}
\section{Remarks}
\ed{\oposted{2002/09/13}}
Back in the days when
there existed an INFO-TeX mail list whose postings were
automatically piped (by suitable arrangements) into
\url{comp.text.tex}, I launched a thing called `Around the Bend'
with the following explanation:
\begin{quote}
[Date: Thu 10 Oct 91]
I would like to propose a regular department for INFO-TeX,
called `Around the Bend'.
It will
consist of macro-writing challenges on the level of the
dangerous-bend exercises
in the \emph{TeXbook},
with interested parties invited to
collaborate and/or compete to find the best solution. My
motivation for doing this is partly selfish: to get more
feedback from other macro writers about some of the interesting
macro-writing problems that I run into.
\end{quote}
There was never any attempt to establish a regular schedule for
Around the Bend postings; I would simply do another one whenever I ran across an
interesting problem, if I was able to spare some time to do so. The
series is archived at \url{CTAN:pub/tex/info/aro-bend}
for anyone who has an interest in looking at it. I also noticed that the
exercises and answers are available in \url{comp.text.tex} archives
through \url{groups.google.com}.
In response to a question on July 24, 2002 from Antoine
Chambert-Loir\index{Chambert-Loir, Antoine} (with apologies for the delay in answering):
\begin{quote}
\ldots why did `Around the Bend' stop?
There were nice challenges proposed there.
\end{quote}
I am tempted to say `Well, actually they didn't stop, there was
just an unusually large gap in the aperiodic schedule'.
But what I also wanted to say is that there are others quite as
capable as I am of devising good Around the Bend
exercises---I am thinking of a recent post by David Kastrup\index{Kastrup, David}
about a completely expandable string comparison macro---and it
occurred to me it might be better to invite interested parties
to sign up for an informal `editorial board' to issue further
exercises, so that other demands on my time do not have such a
dampening effect on the rate of output. I don't have any desire
to put restrictions on what goes out in continuation of the
series apart from a (fairly crucial) one of striving for high
quality and creativity. Send e-mail if you are interested, to
the address below. There are only some obvious questions of
coordination to address, such as trying (I think) to avoid two
different people posting different exercises at the same time.
Turning now to the next exercise, prompted by a recent
\url{comp.text.tex} question from David Reitter\index{Reitter, David}:
%========================================================================
%%*** Exercise 21:
\section{Exercise}
Define a macro that takes a variable number
of arguments. Do it in the best way possible. For the sake of
concreteness, consider this somewhat contrived example as a test
case that your solution should be able to handle, though
possibly using a different syntax:
\begin{lcode}
\printdate -> today's date in preferred form
\printdate[Tuesday] -> Tuesday
\printdate[Tuesday][17] -> Tuesday the 17th
\printdate[Tuesday][17][9] -> Tuesday, September 17th
\printdate[Tuesday][17][9][2002] -> and so on
\printdate[Tuesday][17][9][2002][Gregorian calendar] -> and so forth
\end{lcode}
The lines above illustrate six different ways of calling the
\cmd{\printdate} macro. The macro should print something appropriate
in each case, but the exact form of the output is a matter of
taste; it need not follow exactly what I have given here.
Part of a good solution will be a good analysis of why one way
might be better than another. The solution that I came up with
is based on the question from David Reitter\index{Reitter, David} that originally
inspired this exercise, thus it assumes the context is LaTeX and
tries to solve the problem in a way that is natural for LaTeX.
A straightforward solution based on existing examples of
multiple-option commands in the LaTeX kernel would qualify as
natural, but definitely not elegant since that would require
defining a separate macro to handle each stage of the multiple
option scanning. Non-LaTeX solutions are also considered to be
of interest.
%========================================================================
I suggest posting your answers directly to comp.text.tex instead
of mailing them to me (as was done in the past), though
depending on how late you stayed up working on this entertaining
exercise instead of writing your thesis or balancing your
checkbook as you \emph{ought} to have been doing, you might want to
beware of posting in haste and wait until you have had some
sleep and a chance to reread what you wrote, to avoid
embarrassing oversights [\ldots said he, speaking from experience].
Please e-mail a copy in addition (or instead, if you like) to the
Around the Bend Editorial Board ... hmm, that gives me an idea \ldots [pausing to
consult the dictionary] make that the Supremely Honorable,
Ingenious and, in Special Honor of Knuth, Around the Bend Editorial
Board---whose size will not long remain one I dare say,
especially after the establishment of this glamorous name---at
\url{<see acronym>@pobox.com}
%%Regards, Michael Downes
\section{Answers}
%\textbf{David Kastrup (2002/09/14)}
\begin{solution}{Solution 1 (David Kastrup)}\index{Kastrup, David}
\ed{\oposted{2002/09/14}}
\begin{lcode}
\def\printdate{\count@\z@\toks@{}\printdate@a}
\def\printdate@a{\@ifnextchar[{\printdate@b}{\printdate@c}}
\def\printdate@b[#1]{\toks@\expandafter{\the\toks@{#1}}%
\advance\count@\@ne\printdate@a}
\def\printdate@c{\csname printdate@@\romannumeral\count@
\expandafter\endcsname\the\toks@}
\end{lcode}
You can now define the one-argument macro \cmd{\printdate@@i}, the
5-argument macro \cmd{\printdate@@v} and so on.
\cmd{\printdate@c} might also contain other stuff. For testing,
we just define it as
\begin{lcode}
\def\printdate@c{\message{\number\count@\space arguments: \the\toks@}}
\end{lcode}
This needs the LaTeX macro \cmd{\@ifnextchar}, of course.
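Returning to the dispatching version of \cmd{\printdate@c}, the macros it calls might begin as follows. This is only a sketch: the wording of the output simply follows the forms listed in the exercise, and the `th' suffix is naive.
\begin{lcode}
\def\printdate@@{\today}%            no arguments
\def\printdate@@i#1{#1}%             [Tuesday]
\def\printdate@@ii#1#2{#1 the #2th}% [Tuesday][17]
\end{lcode}
With no arguments \cmd{\romannumeral} produces nothing, so the name without a suffix is the one that gets called.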
If you want to have various defaults in sequence and just want to
call \cmd{\printdate@@v}, you could write something like
\begin{lcode}
\def\printdate@c{\let\gobble@x\relax\expandafter\newcommand
\expandafter\gobble@x\expandafter[\number\count@]{}%
\edef\next{{Tuesday}{17}{9}{2002}{Gregorian calendar}%
\the\toks@}\expandafter\expandafter\expandafter
\printdate@@v\expandafter\gobble@x\next}
\end{lcode}
Ok, this latter proposal is ugly. Better ideas?
% -- David Kastrup, Kriemhildstr. 15, 44793 Bochum Email:
\end{solution}
\begin{solution}{Solution 2 (mine)}
\ed{\oposted{2002/09/20}}
%\textbf{Michael J Downes (Sep 20, 2002)}
Exercise 21 asked for a macro that takes a variable number of arguments,
and gave the following example application:
\begin{lcode}
\printdate -> today's date in preferred form
\printdate[Tuesday] -> Tuesday
\printdate[Tuesday][17] -> Tuesday the 17th
\printdate[Tuesday][17][9] -> Tuesday, September 17th
\printdate[Tuesday][17][9][2002] -> and so on
\end{lcode}
My solution (see below), written with LaTeX in mind, has the
following characteristics:
\begin{itemize}
\item The kernel of the solution is not specific to a particular
user-level command; for each user-level command, only two
command-specific macros are needed: the top-level one invoked by
the user, and the internal one that handles all the arguments.
By contrast, the standard LaTeX method of handling multiple
options requires a separate command-specific macro for each step
of the argument scanning.
\item The number of optional arguments is quasi-limited. The number
of default values that you give in a command's definition
becomes an upper limit on the number of arguments that will be
scanned for. And if you supply twenty default values, the code
that ends up handling them will have to be more than a simple
TeX macro since macro arguments only go up to 9.
\item Commands defined with this method can be nested, because the
delimiters for the optional arguments are regular curly braces \verb?{ }?,
not square brackets [ ].
\end{itemize}
The choice of square brackets in LaTeX for optional arguments is
OK for arguments whose values are suitably restricted, but when
used for arguments that may contain arbitrary text---in
particular, other commands with optional arguments---it becomes
a pitfall that many users have fallen into over the years,
generally costing them an amount of lost time in inverse
proportion to their understanding of catcodes. (I.e., its worst
effects are on the kind of users that LaTeX was intended to
serve in the first place.) The most common examples in practice
are perhaps \cmd{\twocolumn}\verb?[...]? and \verb?\begin{thm}[...]?, but it could
also happen in the optional arguments of \cmd{\section}, \cmd{\caption}, or
\cmd{\cite}.
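A typical instance of the pitfall, sketched here with a made-up citation key \texttt{key}:
\begin{lcode}
% The first ] ends the optional argument too early:
\section[Short \cite[p.~3]{key}]{Long title}
% Usual workaround: hide the inner brackets in a brace group
\section[{Short \cite[p.~3]{key}}]{Long title}
\end{lcode}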
The chief argument against using braces for optional arguments
came out coincidentally in another thread only a couple of days
ago, as stated by Heiko Oberdiek\index{Oberdiek, Heiko} on \url{comp.text.tex}
(17 Sep 2002):
\begin{quote}
How do you want to distinguish between a parameter and a
group, both enclosed in \verb?"{}"? Example:
\begin{lcode}
\foo{bar}{\bfseries bla}
\end{lcode}
\end{quote}
But in practice it seems to me that this is not a significant
drawback. Savvy users would normally use the \verb?\textbf{...}? form
anyway (I hope).
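Should the ambiguity arise with a command built on the \cmd{\VariableArgs} code given at the end of this solution, one possible workaround (a sketch, relying on \cmd{\@ifnextchar} finding something other than an opening brace) is to put \cmd{\relax} between the last intended argument and the group:
\begin{lcode}
\printdate{Tuesday}{\bfseries bla}      % group is read as the 2nd argument
\printdate{Tuesday}\relax{\bfseries bla}% scanning stops at \relax
\end{lcode}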
In fact the \verb?"{\whatever ...}"? form (called a \emph{declaration} in the
LaTeX book) is, in a certain sense, quite unnatural for a linear
language like TeX where the macro expansion works by simple
left-to-right substitution. At least, if used at document level
such a syntax makes it unnecessarily difficult to remap the
functions involved and therefore is a stumbling block in many
special applications. For example, it becomes feasible to add
italic corrections automatically only when we use the \cmd{\emph}\verb?{...}?
form rather than the \verb?{?\cmd{\em}\verb?...}? form. (There is an
\cmd{\aftergroup}
trick that would sort of do the job but only by placing some
assumptions on the usage that do not hold in the real world.)
%%%Regards, Michael Downes
% <P>------------------------------------------------------------------------
\begin{lcode}
\documentclass{article}
\usepackage{ifmtarg}
\makeatletter
% If \MyCmd is defined as
% \VariableArgs{\MyCode ...}{{Default1}{Default2}}
% then
% \MyCmd -> \MyCode...{Default1}{Default2}
% \MyCmd{aaa} -> \MyCode...{aaa}{Default2}
% \MyCmd{a}{bc} -> \MyCode...{a}{bc}
% In other words, \VariableArgs takes two arguments <code> and <defaults>
% and if the invocation via \MyCmd finds $n$ actual arguments, the first
% $n$ default values are replaced by the actual arguments.
%
% In principle the number of optional arguments is "whatever \MyCode is
% able to handle" but if the number of defaults is $d$ then scanning
% will stop as soon as $d$ arguments have been read, if not before.
% In practice things will begin to get unwieldy after a dozen or so
% arguments, because the process of scanning one more
% actual argument involves rescanning the whole list of arguments
% each time (actual arguments read previously plus any remaining defaults).
\newcommand{\VariableArgs}[2]{%
\toks@{#1}%
\@ifnextchar\bgroup{\AddArg #2{}@}{#1#2}}
\def\AddArg#1#2@#3{%
\toks@\expandafter{\the\toks@{#3}}%
\edef\RunIt{\the\toks@}%
\@ifnextchar\bgroup{%
\ifx @#2@%
\begingroup
\def\AddArg{\endgroup \expandafter\RunIt\@gobble}%
\fi
\AddArg #2@%
}{%
\RunIt #2%
}%
}
\newcommand{\printdate}{%
% If zero args, use \today.
\VariableArgs{\PrintDateFive}{{\today}{}{}{}{}}}
% This example is slightly more complicated than necessary because it
% behaves differently depending on the number of arguments.
\newcommand{\PrintDateFive}[5]{%
% Always print #1, which might be \today (from the default value).
#1%
\@ifnotmtarg{#2#3#4#5}{%
% If only #1 & #2 are given, use a slightly different form.
\@ifmtarg{#3#4#5}{ the}{,}%
% Args 2,3,4,5: Print each one if nonempty, but rearranging the
% order slightly.
\@ifnotmtarg{#3}{ \MonthName{#3}}%
\@ifnotmtarg{#2}{ \OrdinalDay{#2}}%
\@ifnotmtarg{#4}{, #4}%
\@ifnotmtarg{#5}{ (#5)}%
}}
\def\MonthName#1{%
\ifcase 0#1 \number\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December%
\else Thirteen's Month\fi}
% If #2 is not a digit, use #1
\def\LastDigit#1#2{%
\ifodd 0#21 \else #1\expandafter\@gobbletwo\fi\LastDigit #2}
\def\OrdinalDay#1{#1%
\ifcase\LastDigit #1\space th\or st\or nd\or rd\else th\fi}
\begin{document}
\noindent Testing:
\begin{enumerate}\setcounter{enumi}{-1}
\item \printdate
\item \printdate{Tuesday}
\item \printdate{Tuesday}{17}
\item \printdate{Tuesday}{17}{9}
\item \printdate{Tuesday}{17}{9}{2002}
\item \printdate{Tuesday}{17}{9}{2002}{Gregorian calendar}
\end{enumerate}
\end{document}
\end{lcode}
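\ed{The following is a minimal sketch, not part of the original posting.}
To see the mechanism on a smaller scale, here is a two-argument wrapper
built with \cmd{\VariableArgs}. The names \cmd{\Range} and \cmd{\RangeTwo}
are hypothetical, chosen only for illustration, and the fragment assumes
that the definitions of \cmd{\VariableArgs} and \cmd{\AddArg} given above
are already in force.
\begin{lcode}
% Hypothetical example: \Range takes zero, one or two braced arguments.
% Omitted arguments fall back to the defaults {1} and {10}.
\newcommand{\Range}{\VariableArgs{\RangeTwo}{{1}{10}}}
\newcommand{\RangeTwo}[2]{from #1 to #2}
% In the document body:
\begin{itemize}
\item \Range          % from 1 to 10
\item \Range{3}       % from 3 to 10
\item \Range{3}{7}    % from 3 to 7
\end{itemize}
\end{lcode}
Because scanning relies on \verb?\@ifnextchar?, which skips spaces while
looking for the next opening brace, a space that follows the last actual
argument is likely to be lost, and a brace group that follows fewer than
the full number of arguments will be read as the next argument.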
\end{solution}
\begin{solution}{Solution 3 (Donald Arseneau)}\index{Arseneau, Donald}
%%\textbf{Donald Arseneau (2002/09/24)}
\ed{\oposted{2002/09/24}}
*** Exercise 21: \\
Define a macro that takes a variable number of arguments.
\begin{lcode}
\printdate[Tuesday][17][9][2002][Gregorian calendar] -> and so forth
\end{lcode}
I did it (actually before MD posed the challenge)
using \verb?{ }?, not \verb?[ ]?, and this answer does not match the challenge
in other ways. But I haven't got around to working on it in the last week or so.
Two features notably missing are: error checking for a bad
number when specifying the number of arguments, and provision
of default values for omitted arguments (they are all null
here).
(I also think I could make do with one fewer
\cmd{\MultiArgCollect} macro.)
I think \verb?{}? delimiters really are the `best way' with regard to
nesting macros. The one problem is confusion with
non-explicit \verb?{?, and so I handle the most common case of \cmd{\bgroup}.
\begin{lcode}
\makeatletter
\let\MultiArgBgroup={
\def\MultiArg#1#2{\begingroup
\let\bgroup\begingroup \let\egroup\endgroup
\expandafter\MultiArgCollect\romannumeral\number#1001\delimiter{#2}}
\def\MultiArgCollect#1{\csname MultiArgCollect#1\endcsname}
\def\MultiArgCollectm#1\delimiter#2{%
\@ifnextchar\MultiArgBgroup
{\MultiArgCollectA#1\delimiter{#2}}%
{\MultiArgCollect#1\delimiter{#2{}}}}
\def\MultiArgCollectA#1\delimiter#2#3{%
\MultiArgCollect#1\delimiter{#2{#3}}}
\def\MultiArgCollecti#1\delimiter#2{\endgroup#2}
\newcommand\DeclareMultiArgCommand[2]{\expandafter
\Declare@MultiArg@ \csname MA\string_\string#1\endcsname{#1}{#2}}
\def\Declare@MultiArg@#1#2#3{%
\DeclareRobustCommand{#2}{\MultiArg{#3}{#1}}
\newcommand{#1}[#3]}
\DeclareMultiArgCommand {\printdate}{6}{...}
\end{lcode}
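\ed{The following is a minimal sketch, not part of the original posting.}
As a usage illustration, the fragment below declares a hypothetical
three-argument command \cmd{\ShowThree} (the name and its body are
invented here) with \cmd{\DeclareMultiArgCommand}, assuming the
definitions above have been loaded. Trailing arguments that are not
supplied should arrive as empty, since no defaults are provided.
\begin{lcode}
% Hypothetical example; \ShowThree is not from the original posting.
\DeclareMultiArgCommand{\ShowThree}{3}{[#1][#2][#3]}
% In the document body:
\begin{itemize}
\item \ShowThree{a}{b}{c}   % [a][b][c]
\item \ShowThree{a}{b}      % [a][b][]
\item \ShowThree            % [][][]
\end{itemize}
\end{lcode}
The count of arguments is encoded as a run of \texttt{m}'s ending in
\texttt{i}, produced by \cmd{\romannumeral}, so each collected or
defaulted argument consumes one \texttt{m} and the final \texttt{i}
triggers the call of the real command.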
\end{solution}
%%\endinput
\indexintoc
\printindex
\end{document}