\" @(#)u3 6.1 (Berkeley) 5/22/86
\"
sp
SH
III. DOCUMENT PREPARATION
PP
UC UNIX
systems are used extensively for document preparation.
There are two major
formatting
programs,
that is,
programs that produce a text with
justified right margins, automatic page numbering and titling,
automatic hyphenation,
and the like.
UL nroff
is designed to produce output on terminals and
line-printers.
UL troff
(pronounced ``tee-roff'')
instead drives a phototypesetter,
which produces very high quality output
on photographic paper.
This paper was formatted with
UL troff .
SH
Formatting Packages
PP
The basic idea of
UL nroff
and
UL troff
is that the text to be formatted contains within it
``formatting commands'' that indicate in detail
how the formatted text is to look.
For example, there might be commands that specify how long
lines are, whether to use single or double spacing,
and what running titles to use on each page.
PP
Because
UL nroff
and
UL troff
are relatively hard to learn to use effectively,
several
``packages'' of canned formatting requests are available
to let you specify
paragraphs, running titles, footnotes, multi-column output,
and so on, with little effort
and without having to learn
UL nroff
and
UL troff .
These packages take a modest effort to learn,
but the rewards for using them are so great
that it is time well spent.
PP
In this section,
we will provide a hasty look at the ``manuscript''
package known as
UL \-ms .
Formatting requests typically consist of a period and two upper-case letters,
such as
UL .TL ,
which is used to introduce a title,
or
UL .PP
to begin a new paragraph.
PP
A document is typed so it looks something like this:
P1
\&.TL
title of document
\&.AU
author name
\&.SH
section heading
\&.PP
paragraph ...
\&.PP
another paragraph ...
\&.SH
another section heading
\&.PP
etc.
P2
The lines that begin with a period are the formatting requests.
For example,
UL .PP
calls for starting a new paragraph.
The precise meaning of
UL .PP
depends on what output device is being used
(typesetter or terminal, for instance),
and on what publication the document will appear in.
For example,
UL \-ms
normally assumes that a paragraph is preceded by a space
(one line in
UL nroff ,
\(12 line in
UL troff ),
and the first word is indented.
These rules can be changed if you like,
but they are changed by changing the interpretation
of
UL .PP ,
not by re-typing the document.
PP
To actually produce a document in standard format
using
UL \-ms ,
use the command
P1
troff -ms files ...
P2
for the typesetter, and
P1
nroff -ms files ...
P2
for a terminal.
The
UL \-ms
argument tells
UL troff
and
UL nroff
to use the manuscript package of formatting requests.
PP
There are several similar packages;
check with a local expert to determine which ones
are in common use on your machine.
SH
Supporting Tools
PP
In addition to the basic formatters,
there is
a host of supporting programs
that help with document preparation.
The list in the next few paragraphs
is far from complete,
so browse through the manual
and check with people around you for other possibilities.
PP
UL eqn
and
UL neqn
let you integrate mathematics
into the text of a document,
in an easy-to-learn language that closely resembles the way
you would speak it aloud.
For example, the
UL eqn
input
P1
sum from i=0 to n x sub i ~=~ pi over 2
P2
produces the output
EQ
sum from i=0 to n x sub i ~=~ pi over 2
EN
PP
The program
UL tbl
provides an analogous service for preparing tabular material;
it does all the computations necessary to align complicated columns
with elements of varying widths.
PP
UL refer
prepares bibliographic citations from a data base,
in whatever style is defined by the formatting package.
It looks after all the details of numbering references in sequence,
filling in page and volume numbers,
getting the author's initials and the journal name right,
and so on.
PP
UL spell
and
UL typo
detect possible spelling mistakes in a document.\(dg
FS
\(dg "typo" is not provided with Berkeley Unix.
FE
UL spell
works by comparing the words in your document
to a dictionary,
printing those that are not in the dictionary.
It knows enough about English spelling to detect plurals and the like,
so it does a very good job.
UL typo
looks for words which are ``unusual'',
and prints those.
Spelling mistakes tend to be more unusual,
and thus show up early when the most unusual words
are printed first.
PP
UL grep
looks through a set of files for lines
that contain a particular text pattern
(rather like the editor's context search does,
but on a bunch of files).
For example,
P1
grep \(fming$\(fm chap*
P2
will find all lines that end with
the letters
UL ing
in the files
UL chap* .
(It is almost always a good practice to put single quotes around
the pattern you're searching for,
in case it contains characters like
UL *
or
UL $
that have a special meaning to the shell.)
UL grep
is often useful for finding out in which of a set of files
the misspelled words detected by
UL spell
are actually located.
PP
UL diff
prints a list of the differences between
two files,
so you can compare
two versions of something automatically
(which certainly beats proofreading by hand).
PP
UL wc
counts the words, lines and characters in a set of files.
UL tr
translates characters into other characters;
for example it will convert upper to lower case and vice versa.
This translates upper into lower:
P1
tr A-Z a-z <input >output
P2
PP
UL sort
sorts files in a variety of ways;
UL cref
makes cross-references;
UL ptx
makes a permuted index
(keyword-in-context listing).
UL sed
provides many of the editing facilities
of
UL ed ,
but can apply them to arbitrarily long inputs.
UL awk
provides the ability to do both pattern matching and numeric computations,
and to conveniently process fields within lines.
These programs are for more advanced users,
and they are not limited to document preparation.
Put them on your list of things to learn about.
PP
Most of these programs are either independently documented
(like
UL eqn
and
UL tbl ),
or are sufficiently simple that the description in
the
ul 2
UC UNIX
Programmer's Manual
is adequate explanation.
SH
Hints for Preparing Documents
PP
Most documents go through several versions (always more than you expected) before they
are finally finished.
Accordingly, you should do whatever possible to make
the job of changing them easy.
PP
First, when you do the purely mechanical operations of typing,
type so that subsequent editing will be easy.
Start each sentence on a new line.
Make lines short,
and break lines at natural places,
such as after commas and semicolons,
rather than randomly.
Since most people change documents by rewriting phrases
and adding, deleting and rearranging sentences,
these precautions simplify any editing
you have to do later.
PP
Keep the individual files of a document down
to modest size,
perhaps ten to fifteen thousand characters.
Larger files edit more slowly,
and of course if you make a dumb mistake
it's better to have clobbered a small file than a big one.
Split into files at natural boundaries in the document,
for the same reasons that you start each sentence
on a new line.
PP
The second aspect of making change easy
is to not commit yourself to formatting details too early.
One of the advantages of formatting packages like
UL \-ms
is that they permit you to delay decisions
to the last possible moment.
Indeed,
until a document is printed,
it is not even decided whether it will be typeset
or put on a line printer.
PP
As a rule of thumb, for all but the most trivial jobs,
you should type a document in terms of a set of requests
like
UL .PP ,
and then define them appropriately,
either by using one of the canned packages
(the better way)
or by defining your own
UL nroff
and
UL troff
commands.
As long as you have entered the text in some systematic way,
it can always be cleaned up and re-formatted
by a judicious combination of
editing commands and request definitions.