de P1
KS
DS
ft CW
ta 5n 10n 15n 20n 25n 30n 35n 40n 45n 50n 55n 60n 65n 70n 75n 80n
.
de P2
ft 1
DE
KE
.
de CW
lg 0
\%\&\\$3\f(CW\\$1\fP\&\\$2
lg
.
de WC
lg 0
\%\&\\$3\f(CI\\$1\fP\&\\$2
lg
.
TL
A tutorial for the
CW sam
B
command language
AU
Rob Pike
AI
MH
AB
CW sam
is an interactive text editor with a command language that makes heavy use
of regular expressions.
Although the language is syntactically similar to
CW ed (1),
the details are interestingly different.
This tutorial introduces the command language, but does not discuss
the screen and mouse interface.
With apologies to those unfamiliar with the Ninth Edition Blit software,
it is assumed that the similarity of
CW sam
to
CW mux (9)
at this level makes
CW sam 's
mouse language easy to learn.
PP
The
CW sam
command language applies identically to two environments:
when running
CW sam
on an ordinary terminal
(\f2via\f1
CW sam\ -d ),
and in the command window of a
I downloaded
CW sam ,
that is, one using the bitmap display and mouse.
AE
SH
Introduction
PP
This tutorial describes the command language of
CW sam ,
an interactive text editor that runs on Blits and
some computers with bitmap displays.
For most editing tasks, the mouse-based editing features
are sufficient, and they are easy to use and to learn.
PP
The command language is often useful, however, particularly
when making global changes.
Unlike the commands in
CW ed ,
which are necessary to make changes,
CW sam
commands tend to be used
only for complicated or repetitive editing tasks.
It is in these more involved uses that
the differences between
CW sam
and other text editors are most evident.
PP
CW sam 's
language makes it easy to do some things that other editors,
including programs like
CW sed
and
CW awk ,
do not handle gracefully, so this tutorial serves partly as a
lesson in
CW sam 's
manner of manipulating text.
The examples below therefore concentrate entirely on the language,
assuming that facility with the use of the mouse in
CW sam
is at worst easy to pick up.
In fact,
CW sam
can be run without the mouse at all (not
I downloaded ),
by specifying the
CW -d
flag, and it is this domain that the tutorial
occupies; the command language in the two modes
is identical.
PP
A word to the Unix adept:
although
CW sam
is syntactically very similar to
CW ed ,
it is fundamentally and deliberately different in design and detailed semantics.
You might use knowledge of
CW ed
to predict how the substitute command works,
but you'd only be right if you had used some understanding of
CW sam 's
workings to influence your prediction.
Be particularly careful about idioms.
Idioms form in curious nooks of languages and depend on
undependable peculiarities.
CW ed
idioms simply don't work in
CW sam :
CW 1,$s/a/b/
makes one substitution in the whole file, not one per line.
CW sam
has its own idioms.
Much of the purpose of this tutorial is to publish them
and make fluency in
CW sam
a matter of learning, not cunning.
PP
The tutorial assumes familiarity with regular expressions;
some experience with a more traditional Unix editor may also be helpful.
To aid readers familiar with
CW ed ,
I have pointed out in square brackets [] some of
the relevant differences between
CW ed
and
CW sam .
Read these comments only if you wish
to understand the differences; the lesson is about
CW sam ,
not
CW sam
I vs.
CW ed .
Another typographic convention is that output appears in
CW "this font,
while typed input appears as
WC "slanty text.
PP
Nomenclature:
CW sam
keeps a copy of the text it is editing.
This copy is called a
I file .
To avoid confusion, I have called the permanent storage on disc a
I
Unix file.
R
SH
Text
PP
To get started, we need some text to play with.
Any text will do; try something from
James Gosling's Emacs manual:
P1
$ \f(CIsam -d
a
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
\&.
ft
P2
WC "sam -d
starts
CW sam
running.
The
CW a
command adds text until a line containing just a period, and sets the
I
current text
R
(also called
I dot )
to what was typed \(em everything between the
CW a
and the period.
CW ed "" [
would leave dot set to only the last line.]
The
CW p
command prints the current text:
P1
WC p
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
P2
[Again,
CW ed
would print only the last line.]
The
CW a
command adds its text
I after
dot; the
CW i
command is like
CW a ,
but adds the text
I before
dot.
P1
ft CI
i
Introduction
\&.
p
ft
Introduction
P2
There is also a
CW c
command that changes (replaces) the current text,
and
CW d
that deletes it; these are illustrated below.
PP
To see all the text, we can specify what text to print;
for the moment, suffice it to say that
WC 0,$
specifies the entire file.
CW ed "" [
users would probably type
WC 1,$ ,
which in practice is the same thing, but see below.]
P1
WC 0,$p
Introduction
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
P2
Except for the
CW w
command described below,
I all
commands,
including
CW p ,
set dot to the text they touch.
Thus,
CW a
and
CW i
set dot to the new text,
CW p
to the text printed, and so on.
Similarly, all commands
(except
CW w )
by default operate on the current
text [unlike
CW ed ,
for which some commands (such as
CW g )
default to the entire file].
PP
Things are not going to get very interesting until we can
set dot arbitrarily.
This is done by
I addresses ,
which specify a piece of the file.
The address
CW 1 ,
for example, sets dot to the first line of the file.
P1
WC 1p
Introduction
WC c
WC Preamble
WC .
P2
The
CW c
command didn't need to specify dot; the
CW p
left it on line one.
It's therefore easy to delete the first line utterly;
the last command left dot set to line one:
P1
WC d
WC 1p
This manual is organized in a rather haphazard manner. The first
P2
(Line numbers change
to reflect changes to the file.)
PP
The address \f(CW/\f2text\f(CW/\f1
sets dot to the first appearance of
I text ,
after dot.
CW ed "" [
matches the first line containing
I text .]
If
I text
is not found, the search restarts at the beginning of the file
and continues until dot.
P1
WC /Emacs/p
Emacs
P2
It's difficult to indicate typographically, but in this example no newline appears
after
CW Emacs :
the text to be printed is the string
CW Emacs ', `
exactly.
(The final
CW p
may be left off \(em it is the default command.
When downloaded, however, the default is instead to select the text,
to highlight it,
and to make it visible by moving the window on the file if necessary.
Thus,
CW /Emacs/
indicates on the display the next occurrence of the text.)
PP
Imagine we wanted to change the word
CW haphazard
to
CW thoughtless .
Obviously, what's needed is another
CW c
command, but the method used so far to insert text includes a newline.
The syntax for including text without newlines is to surround the
text with slashes (which is the same as the syntax for
text searches, but what is going on should be clear from context).
The text must appear immediately after the
CW c
(or
CW a
or
CW i ).
Given this, it is easy to make the required change:
P1
WC /haphazard/c/thoughtless/
WC 1p
This manual is organized in a rather thoughtless manner. The first
P2
[Changes can always be done with a
CW c
command, even if the text is smaller than a line.]
You'll find that this way of providing text to commands is much
more common than is the multiple-lines syntax.
If you want to include a slash
CW /
in the text, just precede it with a backslash
CW \e ,
and use a backslash to protect a backslash itself.
P1
WC /Emacs/c/Emacs\e\e360/
WC 4p
general introduction to the commands in Emacs\e360 and to try to show
P2
We could also make this particular change by
P1
WC /Emacs/a/\e\e360/
P2
PP
This is as good a place as any to introduce the
CW u
command, which undoes the last command.
A second
CW u
will undo the penultimate command, and so on.
P1
WC u
WC 4p
general introduction to the commands in Emacs and to try to show
WC u
WC 3p
This manual is organized in a rather haphazard manner. The first
P2
Undoing can only back up; there is no way to undo a previous
CW u .
SH
Addresses
PP
We've seen the simplest forms of addresses, but there is more
to learn before we can get too much further.
An address selects a region in the file \(em a substring \(em
and therefore must define the beginning and the end of a region.
Thus, the address
CW 13
selects from the beginning of line thirteen to the end of line thirteen, and
CW /Emacs/
selects from the beginning of the word
CW Emacs ' `
to the end.
PP
Addresses may be combined with a comma:
P1
13,15
P2
selects lines thirteen through fifteen. The definition of the comma
operator is to select from the beginning of the left hand address (the
beginning of line 13) to the end of the right hand address (the
end of line 15).
PP
A few special simple addresses come in handy:
CW .
(a period) represents dot, the current text,
CW 0
(line zero) selects the null string at the beginning of the file, and
CW $
selects the null string at the end of the file
[not the last line of the file].
Therefore,
P1
0,13
P2
selects from the beginning of the file to the end of line thirteen,
P1
\&.,$
P2
selects from the beginning of the current text to the end of the file, and
P1
0,$
P2
selects the whole file [that is, a single string containing the whole file,
not a list of all the lines in the file].
PP
These are all
I absolute
addresses: they refer to specific places in the file.
CW sam
also has relative addresses, which depend
on the value of dot,
and in fact we have already seen one form:
CW /Emacs/
finds the first occurrence of
CW Emacs
searching forwards from dot.
Which occurrence of
CW Emacs
it finds depends on the value of dot.
What if you wanted the first occurrence
I before
dot? Just precede the pattern with a minus sign, which reverses the direction
of the search:
P1
-/Emacs/
P2
In fact, the complete syntax for forward searching is
P1
+/Emacs/
P2
but the plus sign is the default, and in practice is rarely used.
Here is an example that includes it for clarity:
P1
0+/Emacs/
P2
selects the first occurrence of
CW Emacs
in the file; read it as ``go to line 0, then search forwards for
CW Emacs .''
Since the
CW +
is optional, this can be written
CW 0/Emacs/ .
Similarly,
P1
$-/Emacs/
P2
finds the last occurrence in the file, so
P1
0/Emacs/,$-/Emacs/
P2
selects the text from the first to last
CW Emacs ,
inclusive.
Slightly more interesting:
P1
/Emacs/+/Emacs/
P2
(there is an implicit
CW .+
at the beginning) selects the second
CW Emacs
following dot.
PP
Line numbers may also be relative.
P1
-2
P2
selects the second previous line, and
P1
+5
P2
selects the fifth following line (here the plus sign is obligatory).
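PP
Relative line addresses combine with the comma like any others.
As a sketch (an illustration, not part of the session above),
and assuming dot lies within a single line that has
enough lines on either side of it,
P1
WC -2,+5p
P2
should print from the second line before dot through the fifth line after it,
each side of the comma being taken relative to the current value of dot.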
PP
Since addresses may select (and dot may be) more than one line,
we need a definition of `previous' and `following:'
`previous' means
I
before the beginning
R
of dot, and `following'
means
I
after the end
R
of dot.
For example, if the file contains \f(CWA\f(CIAA\f(CWA\f1,
with dot set to the middle two
CW A 's
(the slanting characters),
CW -/A/
sets dot to the first
CW A ,
and
CW +/A/
sets dot to the last
CW A .
Except under odd circumstances (such as when the only occurrence of the
text in the file is already the current text), the text selected by a
search will be disjoint from dot.
PP
To select the
CW "troff -ms
paragraph containing dot, however long it is, use
P1
-/.PP/,/.PP/-1
P2
which will include the
CW .PP
that begins the paragraph, and exclude the one that ends it.
PP
When typing relative line number addresses, the default number is
CW 1 ,
so the above could be written slightly more simply:
P1
-/.PP/,/.PP/-
P2
PP
What does the address
CW +1-1
or the equivalent
CW +-
mean? It looks like it does nothing, but recall that dot need not be a
complete line of text.
CW +1
selects the line after the end of the current text, and
CW -1
selects the line before the beginning. Therefore
CW +1-1
selects the line before the line after the end of dot, that is,
the complete line containing the end of dot.
We can use this construction to expand a selection to include a complete line,
say the first line in the file containing
CW Emacs :
P1
WC 0/Emacs/+-p
general introduction to the commands in Emacs and to try to show
P2
The address
CW +-
is an idiom.
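PP
The idiom can be used on either side of a comma.
As a sketch (an illustration, not from the original session),
and assuming the file is as the undo commands above left it,
P1
WC 0/hastily/+-,0/Emacs/+-p
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
P2
should print the complete lines from the one containing the first
CW hastily
through the one containing the first
CW Emacs .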
SH
Loops
PP
Above, we changed one occurrence of
CW Emacs
to
CW Emacs\e360 ,
but if the name of the editor is really changing, it would be useful
to change
I all
instances of the name in a single command.
CW sam
provides a command,
CW x
(extract), for just that job.
The syntax is
\f(CWx/\f2pattern\f(CW/\f2command\f1.
For each occurrence of the pattern in the selected text,
CW x
sets dot to the occurrence and runs command.
For example, to change
CW Emacs
to
CW vi ,
P1
WC 0,$x/Emacs/c/vi/
WC 0,$p
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in vi and to try to show
the method in the madness that is the vi command structure.
P2
This
works by subdividing the current text
CW 0,$ "" (
\(em the whole file) into appearances of its textual argument
CW Emacs ), (
and then running the command that follows
CW c/vi/ ) (
with dot set to the text.
We can read this example as, ``find all occurrences of
CW Emacs
in the file, and for each one,
set the current text to the occurrence and run the command
CW c/vi/ ,
which will replace the current text by
CW vi .''
[This command is somewhat similar to
CW ed 's
CW g
command. The differences will develop below, but note that the
default address, as always, is dot rather than the whole file.]
PP
A single
CW u
command is sufficient to undo an
CW x
command, regardless of how many individual changes the
CW x
makes.
P1
WC u
WC 0,$p
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
P2
PP
Of course,
CW c
is not the only command
CW x
can run. An
CW a
command can be used to put proprietary markings on
CW Emacs :
P1
WC 0,$x/Emacs/a/{TM}/
WC /Emacs/+-p
general introduction to the commands in Emacs{TM} and to try to show
P2
[There is no way to see the changes as they happen, as in
CW ed 's
CW g/Emacs/s//&{TM}/p ;
see the section on Multiple Changes, below.]
PP
The
CW p
command is also useful when driven by an
CW x ,
but be careful that you say what you mean;
P1
WC 0,$x/Emacs/p
EmacsEmacs
P2
since
CW x
sets dot to the text in the slashes, printing only that text
is not going to be very
informative. But the command that
CW x
runs can contain addresses. For example, if we want to print all
lines containing
CW Emacs ,
just use
CW +- :
P1
WC 0,$x/Emacs/+-p
general introduction to the commands in Emacs{TM} and to try to show
the method in the madness that is the Emacs{TM} command structure.
P2
Finally, let's restore the state of the file with another
CW x
command, and make use of a handy shorthand:
a comma in an address has its left side default to
CW 0 ,
and its right side default to
CW $ ,
so the easy-to-type address
CW ,
refers to the whole file:
P1
WC ",x/Emacs/ /{TM}/d
WC ,p
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
P2
Notice what this
CW x
does: for each occurrence of
CW Emacs ,
find the
CW {TM}
that follows, and delete it.
PP
The `text'
CW sam
accepts
for searches in addresses and in
CW x
commands is not simple text, but rather
I regular\ expressions.
Unix has several distinct interpretations of regular expressions.
The form used by
CW sam
is that of
CW egrep (1),
including parentheses
CW ()
for grouping and an `or' operator
CW |
for matching strings in parallel.
CW sam
makes two extensions:
although
CW .
(the most overloaded character in Unix) matches any character
I except
newline, the regular expression
CW @
(think of it as a big dot) matches any character, even newlines;
and the character sequence
CW \en
matches a newline character.
Replacement text, such as used in the
CW a
and
CW c
commands, is still plain text, but the sequence
CW \en
represents newline in that context, too.
PP
Here is an example. Say we wanted to double space the document, that is,
turn every newline into two newlines.
The following all do the job:
P1
WC ",x/\en/ a/\en/
WC ",x/\en/ c/\en\en/
WC ",x/$/ a/\en/
WC ",x/^/ i/\en/
P2
The last example is slightly different, because it puts a newline
I before
each line; the other examples place it after.
The first two examples manipulate newlines directly
[something outside
CW ed 's
ken]; the last two
use regular expressions:
CW $
is the empty string at the end of a line, while
CW ^
is the empty string at the beginning.
PP
These solutions all have a possible drawback: if there is already a blank line
(that is, two consecutive newlines), they make it much larger (four
consecutive newlines).
A better method is to extend every group of newlines by one:
P1
WC ",x/\en+/ a/\en/
P2
The regular expression operator
CW +
means `one or more;'
CW \en+
is identical to
CW \en\en* .
Thus, this example
takes every sequence of newlines and adds another
to the end.
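PP
The reverse operation can be sketched the same way
(an illustrative command, not one from the session above):
to squeeze out blank lines, replace every run of two or more
newlines by a single one.
P1
WC ",x/\en\en+/ c/\en/
P2
Lines not followed by a blank line are separated by only one newline,
which the pattern does not match, so they are left alone.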
PP
A more common example is indenting a block of text by a tab stop.
The following all work,
although the first is arguably the cleanest (the blank text in slashes is a tab):
P1
WC ",x/^/a/ /
WC ",x/^/c/ /
WC ",x/.*\en/i/ /
P2
The last example uses the pattern (idiom, really)
CW .*\en
to match lines:
CW .*
matches the longest possible string of non-newline characters.
Taking initial tabs away is just as easy:
P1
WC ",x/^ /d
P2
In these examples I have specified an address (the whole file), but
in practice commands like these are more likely to be run without
an address, using the value of dot set by selecting text with the mouse.
SH
Conditionals
PP
The
CW x
command is a looping construct:
for each match of a regular expression,
it extracts (sets dot to) the match and runs a command.
CW sam
also has a conditional,
CW g :
\f(CWg/\f2pattern\f(CW/\f2command\f1
runs the command if dot contains a match of the pattern
I
without changing the value of dot.
R
The inverse,
CW v ,
runs the command if dot does
I not
contain a match of the pattern.
(The letters
CW g
and
CW v
are historical and have no mnemonic significance. You might
think of
CW g
as `guard.')
CW ed "" [
users should read the above definitions very carefully; the
CW g
command in
CW sam
is fundamentally different from that in
CW ed .]
Here is an example of the difference between
CW x
and
CW g :
P1
,x/Emacs/c/vi/
P2
changes each occurrence of the word
CW Emacs
in the file to the word
CW vi ,
but
P1
,g/Emacs/c/vi/
P2
changes the
I "whole file
to
CW vi
if there is the word
CW Emacs
anywhere in the file.
PP
Neither of these commands is particularly interesting in isolation,
but they are valuable when combined with
CW x
and with themselves.
SH
Composition
PP
One way to think about the
CW x
command is that, given a selection (a value of dot)
it iterates through interesting subselections (values of dot within).
In other words, it takes a piece of text and cuts it into smaller pieces.
But the text that it cuts up may already be a piece cut by a previous
CW x
command or selected by a
CW g .
CW sam 's
most interesting property is the ability to define a sequence of commands
to perform a particular task.\(dg
FS
\(dg
The obvious analogy with shell pipelines is only partially valid,
because the individual
CW sam
commands are all working on the same text; it is only how the text is
sliced up that is changing.
FE
A simple example is to change all occurrences of
CW Emacs
to
CW emacs ;
certainly the command
P1
WC ",x/Emacs/ c/emacs/
P2
will work, but we can use an
CW x
command to save retyping most of the word
CW Emacs :
P1
WC ",x/Emacs/ x/E/ c/e/
P2
(Blanks can be used
to separate commands on a line to make them easier to read.)
What this command does is find all occurrences of
CW Emacs
CW ,x/Emacs/ ), (
and then
I
with dot set to that text,
R
find all occurrences of the letter
CW E
CW x/E/ ), (
and then
I
with dot set to that text,
R
run the command
CW c/e/
to change the character to lower case.
Note that the address for the command \(em the whole file, specified by a comma
\(em is only given to the leftmost
piece of the command; the rest of the pieces have dot set for them by
the execution of the pieces to their left.
PP
As another simple example, consider a problem
solved above: printing all lines in the file containing the word
CW Emacs :
P1
WC ",x/.*\en/ g/Emacs/p
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
P2
This command says to break the file into lines
CW ,x/.*\en/ ), (
and for each line that contains the string
CW Emacs
CW g/Emacs/ ), (
run the command
CW p
with dot set to the line (not the match of
CW Emacs ),
which prints the line.
To save typing, because
CW .*\en
is a common pattern in
CW x
commands,
if the
CW x
is followed immediately by a space, the pattern
CW .*\en
is assumed.
Therefore, the above could be written more succinctly:
P1
WC ",x g/Emacs/p
P2
The solution we used before was
P1
WC ,x/Emacs/+-p
P2
which runs the command
CW +-p
with dot set to each match of
CW Emacs
in the file (recall that the idiom
CW +-p
prints the line containing the end of dot).
PP
The two commands usually produce the same result
(the
CW +-p
form will print a line twice if it contains
CW Emacs
twice). Which is better?
CW ,x/Emacs/+-p
is easier to type and will be much faster if the file is large and
there are few occurrences of the string, but it is really an odd special case.
CW ",x/.*\en/ g/Emacs/p
is slower \(em it breaks each line out separately, then examines
it for a match \(em but is conceptually cleaner, and generalizes more easily.
For example, consider the following piece of the Emacs manual:
P1
command name="append-to-file", key="[unbound]"
Takes the contents of the current buffer and appends it to the
named file. If the files doesn't exist, it will be created.

command name="apropos", key="ESC-?"
Prompts for a keyword and then prints a list of those commands
whose short description contains that keyword. For example,
if you forget which commands deal with windows, just type
"@b[ESC-?]@t[window]@b[ESC]".

\&\f2and so on\f(CW
P2
This text consists of groups of non-empty lines, with a simple format
for the text within each group.
Imagine that we wanted to find the description of the `apropos'
command.
The problem is to break the file into individual descriptions,
and then to find the description of `apropos' and to print it.
The solution is straightforward:
P1
WC ,x/(.+\en)+/\ g/command\ name="apropos"/p
command name="apropos", key="ESC-?"
Prompts for a keyword and then prints a list of those commands
whose short description contains that keyword. For example,
if you forget which commands deal with windows, just type
"@b[ESC-?]@t[window]@b[ESC]".
P2
The regular expression
CW (.+\en)+
matches one or more lines with one or more characters each, that is,
the text between blank lines, so
CW ,x/(.+\en)+/
extracts each description; then
CW g/command\ name="apropos"/
selects the description for `apropos' and
CW p
prints it.
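PP
The inverse conditional
CW v
composes in the same way.
As a sketch (an illustration, not from the original session),
P1
WC ",x/.*\en/ v/Emacs/ d
P2
should delete every line that does not contain
CW Emacs :
the
CW x
breaks the file into lines, the
CW v
passes only those lines with no match, and the
CW d
deletes each such line, newline included.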
PP
Imagine that we had a C program containing the variable
CW n ,
but we wanted to change it to
CW num .
This command is a first cut:
P1
WC ",x/n/ c/num/
P2
but is obviously flawed: it will change all
CW n 's
in the file, not just the
I identifier
CW n .
A better solution is to use an
CW x
command to extract the identifiers, and then use
CW g
to find the
CW n 's:
P1
WC ",x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
P2
It looks awful, but it's fairly easy to understand when read
left to right.
A C identifier is an alphabetic or underscore followed by zero or more
alphanumerics or underscores, that is, matches of the regular expression
CW [a-zA-Z_][a-zA-Z_0-9]* .
The
CW g
command selects those identifiers containing
CW n ,
and the
CW v
is a trick: it rejects those identifiers containing more than one
character. Hence the
CW c/num/
applies only to free-standing
CW n 's.
PP
There is still a problem here:
we don't want to change
CW n 's
that are part of the character constant
CW \en .
There is a command
CW y ,
complementary to
CW x ,
that is just what we need:
\f(CWy/\f2pattern\f(CW/\f2command\f1
runs the command on the pieces of text
I between
matches of the pattern;
if
CW x
selects,
CW y
rejects.
Here is the final command:
P1
WC ",y/\e\en/ x/[a-zA-Z_][a-zA-Z_0-9]*/ g/n/ v/../ c/num/
P2
The
CW y/\e\en/
(with backslash doubled to make it a literal character)
removes the two-character sequence
CW \en
from consideration, so the rest of the command will not touch it.
There is more we could do here; for example, another
CW y
could be prefixed to protect comments in the code.
I won't elaborate the example any further, but you should have
an idea of the way in which the looping and conditional commands
in
CW sam
may be composed to do interesting things.
SH
Grouping
PP
There is another way to arrange commands.
By enclosing them in brace brackets
CW {} ,
commands may be applied in parallel.
This example uses the
CW =
command, which reports the line and character numbers of dot,
together with
CW p ,
to report on appearances of
CW Emacs
in our original file:
P1
WC ,p
This manual is organized in a rather haphazard manner. The first
several sections were written hastily in an attempt to provide a
general introduction to the commands in Emacs and to try to show
the method in the madness that is the Emacs command structure.
ft CI
,x/Emacs/{
=
+-p
}
ft
3; #171,#176
general introduction to the commands in Emacs and to try to show
4; #234,#239
the method in the madness that is the Emacs command structure.
P2
(The number before the semicolon is the line number;
the numbers beginning with
CW #
are character numbers.)
As a more interesting example, consider changing all occurrences of
CW Emacs
to
CW vi
and vice versa. We can type
P1
ft CI
,x/Emacs|vi/{
g/Emacs/ c/vi/
g/vi/ c/Emacs/
}
ft
P2
or even
P1
ft CI
,x/[a-zA-Z]+/{
g/Emacs/ v/....../ c/vi/
g/vi/ v/.../ c/Emacs/
}
ft
P2
to make sure we don't change strings embedded in words.
SH
Multiple Changes
PP
You might wonder why, once
CW Emacs
has been changed to
CW vi
in the above example,
the second command in the braces doesn't put it back again.
The reason is that the commands are run in parallel:
within any top-level
CW sam
command, all changes to the file refer to the state of the file
before any of the changes in that command are made.
After all the changes have been determined, they are all applied
simultaneously.
PP
This means, as mentioned, that commands within a compound
command see the state of the file before any of the changes apply.
This method of evaluation makes some things easier (such as the exchange of
CW Emacs
and
CW vi ),
and some things harder.
For instance, it is impossible to use a
CW p
command to print the changes as they happen,
because they haven't happened when the
CW p
is executed.
An indirect ramification is that changes must occur in forward
order through the file,
and must not overlap.
SH
Unix
PP
CW sam
has a few commands to connect to Unix processes.
The simplest is
CW ! ,
which runs the command with input and output connected to the terminal.
P1
WC !date
Wed May 28 23:25:21 EDT 1986
!
P2
(When downloaded, the input is connected to
CW /dev/null
and only the first few lines of output are printed;
any overflow is stored in
CW $HOME/sam.err .)
The final
CW !
is a prompt to indicate when the command completes.
PP
Slightly more interesting is
CW > ,
which provides the current text as standard input to the Unix command:
P1
WC "1,2 >wc
2 22 131
!
P2
The complement of
CW >
is, naturally,
CW < :
it replaces the current text with the standard output of the Unix command:
P1
WC "1 <date
!
WC 1p
Wed May 28 23:26:44 EDT 1986
P2
The last command is
CW | ,
which is a combination of
CW <
and
CW > :
the current text is provided as standard input to the Unix command,
and the Unix command's standard output is collected and used to
replace the original text.
For example,
P1
WC ",| sort
P2
runs
CW sort (1)
on the file, sorting the lines of the text lexicographically.
Note that
CW < ,
CW >
and
CW |
are
CW sam
commands, not Unix shell operators.
PP
The next example converts all appearances of
CW Emacs
to upper case using
CW tr (1):
P1
WC ",x/Emacs/ | tr a-z A-Z
P2
CW tr
is run once for each occurrence of
CW Emacs .
Of course, you could do this example more efficiently with a simple
CW c
command, but here's a trickier one:
given a Unix mail box as input,
convert all the
CW Subject
headers to distinct fortunes:
P1
WC ",x/^Subject:.*\en/ x/[^:]*\en/ < /usr/games/fortune
P2
(The regular expression
CW [^:]
refers to any character
I except
CW :
and newline; the negation operator
CW ^
excludes newline from the list of characters.)
Again,
CW /usr/games/fortune
is run once for each
CW Subject
line, so each
CW Subject
line is changed to a different fortune.
SH
A few other text commands
PP
For completeness, I should mention three other commands that
manipulate text. The
CW m
command moves the current text to after the text specified by the
(obligatory) address after the command.
Thus
P1
WC "/Emacs/+- m 0
P2
moves the next line containing
CW Emacs
to the beginning of the file.
Similarly,
CW t
(another historic character) copies the text:
P1
WC "/Emacs/+- t 0
P2
would make, at the beginning of the file, a copy of the next line
containing
CW Emacs .
PP
The third command is more interesting: it makes substitutions.
Its syntax is
\f(CWs/\f2pattern\f(CW/\f2replacement\f(CW/\f1.
Within the current text, it finds the first occurrence of
the pattern and replaces it by the replacement text,
leaving dot set to the entire address of the substitution.
P1
WC 1p
This manual is organized in a rather haphazard manner. The first
WC s/haphazard/thoughtless/
WC p
This manual is organized in a rather thoughtless manner. The first
P2
Occurrences of the character
CW &
in the replacement text stand for the text matching the pattern.
P1
WC s/T/"&&&&"/
WC p
"TTTT"his manual is organized in a rather thoughtless manner. The first
P2
There are two variants. The first is that a number may be specified
after the
CW s ,
to indicate which occurrence of the pattern to substitute; the default
is the first.
P1
WC s2/is/was/
WC p
"TTTT"his manual was organized in a rather thoughtless manner. The first
P2
The second is that suffixing a
CW g
(global) causes replacement of all occurrences, not just the first.
P1
WC s/[a-zA-Z]/x/g
WC p
"xxxx"xxx xxxxxx xxx xxxxxxxxx xx x xxxxxx xxxxxxxxxxx xxxxxxx xxx xxxxx
P2
Notice that in all these examples
dot is left
set to the entire line.
PP
[The substitute command is vital to
CW ed ,
because it is the only way to make changes within a line.
It is less valuable in
CW sam ,
in which the concept of a line is much less important.
For example, many
CW ed
substitution idioms are handled well by
CW sam 's
basic commands. Consider the commands
P1
s/good/bad/
s/good//
s/good/& bye/
P2
which are equivalent in
CW sam
to
P1
/good/c/bad/
/good/d
/good/a/ bye/
P2
and for which the context search is likely unnecessary because the desired
text is already dot.
Also, beware this
CW ed
idiom:
P1
1,$s/good/bad/
P2
which changes the first
CW good
on each line; the same command in
CW sam
will only change the first one in the whole file.
The correct
CW sam
version is
P1
,x s/good/bad/
P2
but what is more likely meant is
P1
,x/good/ c/bad/
P2
CW sam
operates under different rules.]
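PP
The difference between the line-oriented
CW ed
idiom and its
CW sam
counterparts shows up on even a tiny sample.
The Python sketch below is a rough model, not
CW sam
itself:
the sample text and the function names are invented for illustration.

```python
import re

SAMPLE = "good day\ngood night, good luck\n"

def first_on_each_line(text):
    """ed's 1,$s/good/bad/ (and sam's ,x s/good/bad/): first match per line."""
    lines = text.splitlines(keepends=True)
    return "".join(re.sub("good", "bad", line, count=1) for line in lines)

def first_in_whole_text(text):
    """1,$s/good/bad/ typed to sam: only the first match in the whole text."""
    return re.sub("good", "bad", text, count=1)

def every_match(text):
    """sam's ,x/good/ c/bad/: every match is changed."""
    return re.sub("good", "bad", text)

print(first_on_each_line(SAMPLE))
print(first_in_whole_text(SAMPLE))
print(every_match(SAMPLE))
```

Only the last function touches the second
CW good
on the second line of the sample.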
SH
Files
PP
So far, we have only been working with a single file,
but
CW sam
is a multi-file editor.
Only one file may be edited at a time, but
it is easy to change which file is the `current' file for editing.
To see how to do this, we need a
CW sam
with a few files;
the easiest way to get one is to start it
with a list of Unix file names to edit.
P1
$ \f(CIecho *.ms\f(CW
conquest.ms death.ms emacs.ms famine.ms slaughter.ms
$ \f(CIsam -d *.ms\f(CW
 -. conquest.ms
P2
(I'm sorry the Horsemen don't appear in liturgical order.)
The line printed by
CW sam
is an indication that the Unix file
CW conquest.ms
has been read, and is now the current file.
CW sam
does not read the Unix file until
the associated
CW sam
file becomes current.
PP
The
CW n
command prints the names of all the files:
P1
WC n
 -. conquest.ms
 -  death.ms
 -  emacs.ms
 -  famine.ms
 -  slaughter.ms
P2
This list is also available in the menu on mouse button 3.
The command
CW f
tells the name of just the current file:
P1
WC f
 -. conquest.ms
P2
The characters to the left of the file name encode helpful information about
the file.
The minus sign becomes a plus sign if the file has a window open, and an
asterisk if more than one is open.
The period (another meaning of dot) identifies the current file.
The leading blank changes to an apostrophe if the file is different
from the contents of the associated Unix file, as far as
CW sam
knows.
This becomes evident if we make a change.
P1
WC 1d
WC f
\&'-. conquest.ms
P2
If the file is restored by an undo command, the apostrophe disappears.
P1
WC u
WC f
 -. conquest.ms
P2
The file name may be changed by providing a new name with the
CW f
command:
P1
WC "f pestilence.ms
\&'-. pestilence.ms
P2
CW f
prints the new status of the file;
that is, it changes the name if one is provided, and prints the
name regardless.
A file name change may also be undone.
P1
WC u
WC f
 -. conquest.ms
P2
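PP
The encoding of the status characters described above is compact enough
to restate as code.
This Python sketch is a model for illustration only:
the function
CW status_line
and its parameters are invented,
and the blank printed in the third column for a file that is not current
is an assumption.

```python
def status_line(name, modified=False, windows=0, current=False):
    """Build a line like the ones printed by the f and n commands.

    modified: the file differs from the associated Unix file
    windows:  number of windows open on the file
    current:  the file is the current file
    """
    changed = "'" if modified else " "
    if windows == 0:
        window = "-"
    elif windows == 1:
        window = "+"
    else:
        window = "*"
    dot = "." if current else " "  # assumption: blank when not current
    return changed + window + dot + " " + name


print(status_line("conquest.ms", current=True))
print(status_line("conquest.ms", modified=True, current=True))
```

The second call corresponds to the state after the
CW 1d
command in the example above.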
PP
When
CW sam
is downloaded, the current file may be changed simply by selecting
the desired file from the menu (selecting the same file subsequently
cycles through the windows opened on the file).
Otherwise, the
CW b
command can be used to choose the desired file:\(dg
FS
\(dg A bug prevents the
CW b
command from working when downloaded.
Because the menu is more convenient anyway, and
because the method
of choosing files from the command language is slated to change,
the bug hasn't been fixed.
FE
P1
WC "b emacs.ms
 -. emacs.ms
P2
Again,
CW sam
prints the name (actually, executes an implicit
CW f
command) because the Unix file
CW emacs.ms
is being read for the first time.
It is an error to ask for a file
CW sam
doesn't know about, but the
CW B
command will prime
CW sam 's
menu with a new file, and make it current.
P1
WC "b flood.pic
?no such file `flood.pic'
WC "B flood.pic
 -. flood.pic
WC n
 -  conquest.ms
 -  death.ms
 -  emacs.ms
 -  famine.ms
 -. flood.pic
 -  slaughter.ms
P2
Both
CW b
and
CW B
will accept a list of file names.
CW b
simply takes the first file in the list, but
CW B
loads them all.
The list may be typed on one line \(em
P1
WC "B devil.tex satan.tex 666.tex emacs.tex
P2
\(em or generated by a Unix command \(em
P1
WC "B <echo *.tex
P2
The latter form requires a Unix command;
CW sam
does not understand the shell file name metacharacters, so
CW "B *.tex
attempts to load a single file named
CW *.tex .
(The
CW <
form is of course derived from
CW sam 's
CW <
command.)
CW echo
is not the only useful command to run subservient to
CW B ;
for example,
P1
WC "B <grep -l Emacs *
P2
will load only those files containing the string
CW Emacs .
Finally, a special case: a
CW B
with no arguments creates an empty, nameless file within
CW sam .
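PP
The two ways of giving
CW B
a list of names can be modelled in a few lines.
The Python sketch below is an illustration under stated assumptions:
the function
CW names_for_B
is invented,
names are assumed to be separated by white space,
and a POSIX shell is assumed to be available to run the command.

```python
import subprocess

def names_for_B(argument):
    """Return the list of file names a B command would be handed.

    A leading < means the rest is a Unix command whose output is the list.
    Otherwise the argument is split as typed; shell metacharacters such as
    * are NOT expanded, mirroring the behaviour described in the text.
    """
    if argument.startswith("<"):
        result = subprocess.run(
            argument[1:], shell=True, capture_output=True, text=True, check=True
        )
        return result.stdout.split()
    return argument.split()


print(names_for_B("devil.tex satan.tex 666.tex emacs.tex"))
print(names_for_B("*.tex"))
print(names_for_B("<echo devil.tex satan.tex"))
```

In the second call the asterisk is passed through untouched,
so the result is a single name,
CW *.tex .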
PP
The complement of
CW B
is
CW D :
P1
WC "D devil.tex satan.tex 666.tex emacs.tex
P2
eradicates the files from
CW sam 's
memory (not from the Unix machine's disc).
CW D
without any file names removes the current file from
CW sam .
PP
There are three other commands that relate the current file
to Unix files.
The
CW w
command writes the file to disc;
without arguments, it writes the entire file to the Unix file associated
with the current file in
CW sam
(it is the only command whose default address is not dot).
Of course, you can specify an address to be written,
and a different file name, with the obvious syntax:
P1
WC "1,2w /tmp/revelations
/tmp/revelations: #44
P2
CW sam
responds with the file name and the number of characters written to the file.
The
CW write
command on the button 3 menu is identical in function to an unadorned
CW w
command.
PP
The other two commands,
CW e
and
CW r ,
read data from Unix files.
The
CW e
command clears out the current file,
reads the data from the named file (or uses the current file's old name if
none is explicitly provided), and sets the file name.
It's much like a
CW B
command, but puts the information in the current file instead of a new one.
CW e
without any file name is therefore an easy way to refresh
CW sam 's
copy of a Unix file.
[Unlike in
CW ed ,
CW e
doesn't complain if the file is modified. The principle is not
to protect against things that can be undone if wrong.]
Since its job is to replace the whole text,
CW e
never takes an address.
PP
The
CW r
command is like
CW e ,
but it doesn't clear the file:
the text in the Unix file replaces dot, or the specified text if an
address is given.
P1
WC "r emacs.ms
P2
has essentially the effect of
P1
WC "<cat emacs.ms
P2
The commands
CW r
and
CW w
will set the name of the file if the current file has no name already defined;
CW e
sets the name even if the file already has one.
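PP
The naming rule for
CW e ,
CW r
and
CW w
can be written as a small decision function.
This Python sketch is a model only;
the function
CW name_after
and its arguments are invented for illustration.

```python
def name_after(command, current_name, given_name):
    """Return the file's name after command is run with given_name.

    current_name is None for a file that has no name yet;
    given_name is None when the command is typed without a file name.
    """
    if command == "e":
        # e sets the name even if the file already has one;
        # with no name given it reuses the old one.
        return given_name or current_name
    if command in ("r", "w"):
        # r and w only name a file that has no name already.
        return current_name or given_name
    raise ValueError("only e, r and w are modelled")


print(name_after("w", "conquest.ms", "/tmp/revelations"))
print(name_after("w", None, "/tmp/revelations"))
print(name_after("e", "conquest.ms", "emacs.ms"))
```

Writing part of a named file elsewhere therefore leaves the file's own name alone.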
PP
There is a command, analogous to
CW x ,
that iterates over files instead of pieces of text:
CW X
(capital
CW x ).
The syntax is easy; it's just like that of
CW x
\(em \f(CWX/\f2pattern\f(CW/\f2command\f1.
(The complementary command is
CW Y ,
analogous to
CW y .)
The effect is to run the command in each file whose menu entry
(that is, whose line printed by an
CW f
command) matches the pattern.
For example, since an apostrophe identifies modified files,
P1
WC "X/'/ w
P2
writes the changed files out to disc.
Here is a longer example: find all uses of a particular variable
in the C source files:
P1
WC "X/\e.c$/ ,x/variable/+-p
P2
We can use an
CW f
command to identify which file the variable appears in:
P1
ft CI
X/\e.c$/ ,g/variable/ {
f
,x/variable/+-{
=
p
}
}
ft
P2
Here, the
CW g
command guarantees that only the names of files containing the variable
will be printed (but beware that
CW sam
may confuse matters by printing the names of files it reads in during
the command).
The
CW =
command shows where in the file the variable appears, and the
CW p
command prints the line.
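PP
The nesting in the longer example can be spelled out as ordinary loops.
The Python sketch below is a model, not
CW sam :
the function
CW uses_of ,
the dictionary of sample files and their contents are all invented,
and Python regular expressions stand in for
CW sam 's.

```python
import re

def uses_of(files, name_pattern, variable):
    """files maps a menu line (as printed by f) to the text of the file.

    Returns (menu line, line number, line) for every match of variable
    in every file whose menu line matches name_pattern.
    """
    found = []
    for entry, text in files.items():
        if not re.search(name_pattern, entry):  # X/pattern/ selects files
            continue
        if not re.search(variable, text):       # ,g/variable/ guards the block
            continue
        for match in re.finditer(variable, text):   # ,x/variable/
            start = text.rfind("\n", 0, match.start()) + 1
            end = text.find("\n", match.end())
            if end == -1:
                end = len(text)
            number = text.count("\n", 0, start) + 1  # what = would report
            found.append((entry, number, text[start:end]))  # what p prints
    return found


files = {
    " -. main.c": "int total;\nint main(void) {\n    total = 1;\n}\n",
    " -  util.c": "void helper(void) {}\n",
    " -  notes.ms": "total is a variable\n",
}
for entry, number, line in uses_of(files, r"\.c$", "total"):
    print(entry, number, line)
```

The file
CW notes.ms
mentions the variable but is skipped because its menu line does not match the pattern,
and
CW util.c
matches the pattern but contributes nothing because the guard fails.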
PP
The
CW D
command is handy as the target of an
CW X .
This example deletes from the menu all C files that do not contain
a particular variable:
P1
WC "X/\e.c$/ ,v/variable/ D
P2
If no pattern is provided for the
CW X ,
the command (which defaults to
CW f )
is run in all files, so
P1
WC "X D
P2
cleans
CW sam
up for a fresh start.
PP
But rather than working any further, let's stop now:
P1
WC q
$
P2
fi
PP
Some of the file-manipulating commands can be undone:
undoing an
CW f ,
CW e ,
or
CW r
restores the previous state of the file,
but
CW w ,
CW B
and
CW D
are irrevocable.
And, of course, so is
CW q .