HTML "Rc — The Plan 9 Shell
\" /*% refer -k -e -n -l3,2 -s < % | tbl | troff -ms | lp -dfn
Tm shell programming language g
de TP \" An indented paragraph describing some command, tagged with the command name
IP "\\f(CW\\$1\\fR" 5
if \\w'\\f(CW\\$1\\fR'-4n .br
.
de CI
nr Sf \\n(.f
\%\&\\$3\f(CW\\$1\fI\&\\$2\f\\n(Sf
.
TL
Rc \(em The Plan 9 Shell
AU
Tom Duff
AB
I Rc
is a command interpreter for Plan 9 that
provides similar facilities to UNIX's
Bourne shell,
with some small additions and less idiosyncratic syntax.
This paper uses numerous examples to describe
I rc 's
features, and contrasts
I rc
with the Bourne shell, a model that many readers will be familiar with.
AE
NH
Introduction
PP
I Rc
is similar in spirit to UNIX's
Bourne shell but differs from it in detail. This paper describes
I rc 's
principal features with many small examples and a few larger ones.
It assumes familiarity with the Bourne shell.
NH
Simple commands
PP
For the simplest uses
I rc
has syntax familiar to Bourne-shell users.
All of the following behave as expected:
P1
date
cat /lib/news/build
who >user.names
who >>user.names
wc <file
echo [a-f]*.c
who | wc
who; date
vc *.c &
mk && v.out /*/bin/fb/*
rm -r junk || echo rm failed!
P2
NH
Quotation
PP
An argument that contains a space or one of
I rc 's
other syntax characters must be enclosed in apostrophes
CW ' ): (
P1
rm 'odd file name'
P2
An apostrophe in a quoted argument must be doubled:
P1
echo 'How''s your father?'
P2
NH
Patterns
PP
An unquoted argument that contains any of the characters
CW *
CW ?
CW [
is a pattern to be matched against file names.
A
CW *
character matches any sequence of characters,
CW ?
matches any single character, and
CW [\fIclass\fP]
matches any character in the
CW class ,
unless the first character of
I class
is
CW ~ ,
in which case the class is complemented.
The
I class
may also contain pairs of characters separated by
CW - ,
standing for all characters lexically between the two.
The character
CW /
must appear explicitly in a pattern, as must the path name components
CW .
and
CW .. .
A pattern is replaced by a list of arguments, one for each path name matched,
except that a pattern matching no names is not replaced by the empty list;
rather it stands for itself.
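PP
As a brief illustration of these rules, the following sketch assumes a
directory holding a few C source files. The names matched are hypothetical,
and the comments describe what the rules above imply rather than recorded output.
P1
echo [a-f]*.c	# names beginning with a through f and ending in .c
echo [~a-f]*.c	# the complemented class: any other first character
echo */*.c	# the / is written explicitly to look one directory down
echo *.nosuch	# if nothing matches, the argument remains *.nosuch
P2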
NH
Variables
PP
UNIX's Bourne shell offers string-valued variables.
I Rc
provides variables whose values are lists of arguments \(em
that is, arrays of strings. This is the principal difference
between
I rc
and traditional UNIX command interpreters.
Variables may be given values by typing, for example:
P1
path=(. /bin)
user=td
font=/lib/font/bit/pelm/ascii.9.font
P2
The parentheses indicate that the value assigned to
CW path
is a list of two strings. The variables
CW user
and
CW font
are assigned lists containing a single string.
PP
The value of a variable can be substituted into a command by
preceding its name with a
CW $ ,
like this:
P1
echo $path
P2
If
CW path
had been set as above, this would be equivalent to
P1
echo . /bin
P2
Variables may be subscripted by numbers or lists of numbers,
like this:
P1
echo $path(2)
echo $path(2 1 2)
P2
These are equivalent to
P1
echo /bin
echo /bin . /bin
P2
There can be no space separating the variable's name from the
left parenthesis; otherwise, the subscript would be considered
a separate parenthesized list.
PP
The number of strings in a variable can be determined by the
CW $#
operator. For example,
P1
echo $#path
P2
would print 2 for this example.
PP
The following two assignments are subtly different:
P1
empty=()
null=''
P2
The first sets
CW empty
to a list containing no strings.
The second sets
CW null
to a list containing a single string,
but the string contains no characters.
PP
Although these may seem like more or less
the same thing (in Bourne's shell, they are
indistinguishable), they behave differently
in almost all circumstances.
Among other things
P1
echo $#empty
P2
prints 0, whereas
P1
echo $#null
P2
prints 1.
PP
All variables that have never been set have the value
CW () .
PP
Occasionally, it is convenient to treat a variable's value
as a single string. The elements of a list are concatenated
into a single string, with spaces between the elements, by
the
CW $"
operator.
Thus, if we set
P1
list=(How now brown cow)
string=$"list
P2
then both
P1
echo $list
P2
and
P1
echo $string
P2
cause the same output, viz:
P1
How now brown cow
P2
but
P1
echo $#list $#string
P2
will output
P1
4 1
P2
because
CW $list
has four members, but
CW $string
has a single member, with three spaces separating its words.
NH
Arguments
PP
When
I rc
is reading its input from a file, the file has access
to the arguments supplied on
I rc 's
command line. The variable
CW $*
initially has the list of arguments assigned to it.
The names
CW $1 ,
CW $2 ,
etc. are synonyms for
CW $*(1) ,
CW $*(2) ,
etc.
In addition,
CW $0
is the name of the file from which
I rc 's
input is being read.
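PP
A minimal sketch of a script using these names follows. The script name
CW args
and its contents are hypothetical.
P1
# contents of a hypothetical file named args
echo $0 received $#* arguments
echo the first is $1
echo the first two, reversed, are $*(2 1)
P2
Invoked as
CW "rc args a b c" ,
this should report three arguments, give
CW a
as the first, and give
CW "b a"
for the reversed pair.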
NH
Concatenation
PP
I Rc
has a string concatenation operator, the caret
CW ^ ,
to build arguments out of pieces.
P1
echo hully^gully
P2
is exactly equivalent to
P1
echo hullygully
P2
Suppose variable
CW i
contains the name of a command.
Then
P1
vc $i^.c
vl -o $i $i^.v
P2
might compile the command's source code, leaving the
result in the appropriate file.
PP
Concatenation distributes over lists. The following
P1
echo (a b c)^(1 2 3)
src=(main subr io)
cc $src^.c
P2
are equivalent to
P1
echo a1 b2 c3
cc main.c subr.c io.c
P2
In detail, the rule is: if both operands of
CW ^
are lists of the same non-zero number of strings, they are concatenated
pairwise. Otherwise, if one of the operands is a single string,
it is concatenated with each member of the other operand in turn.
Any other combination of operands is an error.
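PP
The cases of this rule can be sketched as follows. The comments give the
argument lists that the rule implies; the words themselves are arbitrary.
P1
echo (a b c)^(1 2 3)	# pairwise: a1 b2 c3
echo lib^(c m)	# single string on the left: libc libm
echo (main subr)^.c	# single string on the right: main.c subr.c
echo (a b)^(1 2 3)	# an error: lists of unequal length
P2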
NH
Free carets
PP
User demand has dictated that
I rc
insert carets in certain places, to make the syntax
look more like the Bourne shell. For example, this:
P1
cc -$flags $stems.c
P2
is equivalent to
P1
cc -^$flags $stems^.c
P2
In general,
I rc
will insert
CW ^
between two arguments that are not separated by white space.
Specifically, whenever one of
CW "$'`
follows a quoted or unquoted word, or an unquoted word follows
a quoted word with no intervening blanks or tabs, an implicit
CW ^
is inserted between the two. If an unquoted word immediately following a
CW $
contains a character other than an alphanumeric, underscore or
CW * ,
a
CW ^
is inserted before the first such character.
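PP
As a sketch of these rules, suppose
CW flags
and
CW stems
are set as below (the values are made up for illustration).
P1
flags=(g O)
stems=(main subr)
echo -$flags $stems.c	# same as: echo -^$flags $stems^.c
echo $stems_old	# no caret: this names the variable stems_old
echo $stems^_old	# an explicit caret is needed here
P2
The first
CW echo
prints
CW "-g -O main.c subr.c" .
The second refers to a different variable,
CW stems_old ,
because an underscore may be part of a variable name.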
NH
Command substitution
PP
It is often useful to build an argument list from the output of a command.
I Rc
allows a command, enclosed in braces and preceded by a left quote,
CW "`{...}" ,
anywhere that an argument is required. The command is executed and its
standard output captured.
The characters stored in the variable
CW ifs
are used to split the output into arguments.
For example,
P1
cat `{ls -tr|sed 10q}
P2
will concatenate the ten oldest files in the current directory in temporal order, given the
default
CW ifs
setting of space, tab, and newline.
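PP
A small sketch of the splitting, assuming the default
CW ifs
has not been changed:
P1
words=`{echo How now brown cow}
echo $#words	# 4: the output was split at the spaces
n=`{ls | wc -l}	# blanks around the count are discarded
echo $n files
P2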
NH
Pipeline branching
PP
The normal pipeline notation is general enough for almost all cases.
Very occasionally it is useful to have pipelines that are not linear.
Pipeline topologies more general than trees can require arbitrarily large pipe buffers,
or worse, can cause deadlock.
I Rc
has syntax for some kinds of non-linear but treelike pipelines.
For example,
P1
cmp <{old} <{new}
P2
will regression-test a new version of a command.
CW <
or
CW >
followed by a command in braces causes the command to be run with
its standard output or input attached to a pipe. The parent command
CW cmp "" (
in the example)
is started with the other end of the pipe attached to some file descriptor
or other, and with an argument that will connect to the pipe when opened
(e.g.,
CW /dev/fd/6 ).
Some commands are unprepared to deal with input files that turn out not to be seekable.
For example
CW diff
needs to read its input twice.
NH
Exit status
PP
When a command exits it returns status to the program that executed it.
On Plan 9 status is a character string describing an error condition.
On normal termination it is empty.
PP
I Rc
captures command exit status in the variable
CW $status .
For a simple command the value of
CW $status
is just as described above. For a pipeline
CW $status
is set to the concatenation of the statuses of the pipeline components with
CW |
characters for separators.
PP
I Rc
has several kinds of control flow,
many of them conditioned by the status returned from previously
executed commands. Any
CW $status
containing only
CW 0 's
and
CW | 's
has boolean value
I true .
Any other status is
I false .
NH
Command grouping
PP
A sequence of commands enclosed in
CW {}
may be used anywhere a command is required.
For example:
P1
{sleep 3600;echo 'Time''s up!'}&
P2
will wait an hour in the background, then print a message.
Without the braces,
P1
sleep 3600;echo 'Time''s up!'&
P2
would lock up the terminal for an hour,
then print the message in the background.
NH
Control flow \(em \f(CWfor\fP
PP
A command may be executed once for each member of a list
by typing, for example:
P1
for(i in printf scanf putchar) look $i /usr/td/lib/dw.dat
P2
This looks for each of the words
CW printf ,
CW scanf
and
CW putchar
in the given file.
The general form is
P1
for(\fIname\fP in \fIlist\fP) \fIcommand\fP
P2
or
P1
for(\fIname\fP) \fIcommand\fP
P2
In the first case
I command
is executed once for each member of
I list
with that member assigned to variable
I name .
If the clause
CW in "" ``
I list ''
is missing,
CW in "" ``
CW $* ''
is assumed.
NH
Conditional execution \(em \f(CWif\fP
PP
I Rc
also provides a general if-statement. For example:
P1
for(i in *.c) if(cpp $i >/tmp/$i) vc /tmp/$i
P2
runs the C compiler on each C source program that
cpp processes without error.
An `if not' statement provides a two-tailed conditional.
For example:
P1
for(i){
	if(test -f /tmp/$i) echo $i already in /tmp
	if not cp $i /tmp
}
P2
This loops over each file in
CW $* ,
copying to
CW /tmp
those that do not already appear there, and
printing a message for those that do.
NH
Control flow \(em \f(CWwhile\fP
PP
I Rc 's
while statement looks like this:
P1
while(newer subr.c subr.v) sleep 5
P2
This waits until
CW subr.v
is newer than
CW subr.c ,
presumably because the C compiler finished with it.
PP
If the controlling command is empty, the loop will not terminate.
Thus,
P1
while() echo y
P2
emulates the
I yes
command.
NH
Control flow \(em \f(CWswitch\fP
PP
I Rc
provides a switch statement to do pattern-matching on
arbitrary strings. Its general form is
P1
switch(\fIword\fP){
case \fIpattern ...\fP
	\fIcommands\fP
case \fIpattern ...\fP
	\fIcommands\fP
\&...
}
P2
I Rc
attempts to match the word against the patterns in each case statement in turn.
Patterns are the same as for filename matching, except that
CW /
and
CW .
and
CW ..
need not be matched explicitly.
PP
If any pattern matches, the
commands following that case up to
the next case (or the end of the switch)
are executed, and execution of the switch
is complete. For example,
P1
switch($#*){
case 1
	cat >>$1
case 2
	cat >>$2 <$1
case *
	echo 'Usage: append [from] to'
}
P2
is an append command. Called with one file argument,
it appends its standard input to the named file. With two, the
first is appended to the second. Any other number
elicits an error message.
PP
The built-in
CW ~
command also matches patterns, and is often more concise than a switch.
Its arguments are a string and a list of patterns. It sets
CW $status
to true if and only if any of the patterns matches the string.
The following example processes option arguments for the
I man (1)
command:
P1
opt=()
while(~ $1 -* [1-9] 10){
	switch($1){
	case [1-9] 10
		sec=$1 secn=$1
	case -f
		c=f s=f
	case -[qwnt]
		cmd=$1
	case -T*
		T=$1
	case -*
		opt=($opt $1)
	}
	shift
}
P2
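PP
Because
CW ~
only sets
CW $status ,
it combines naturally with
CW if .
The following is a sketch with made-up messages, not part of the
I man
script:
P1
if(~ $#* 0) echo no arguments given
if(~ $1 *.c *.h) echo $1 looks like C source
if(~ $1 '*') echo the argument is a literal star
P2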
NH
Functions
PP
Functions may be defined by typing
P1
fn \fIname\fP { \fIcommands\fP }
P2
Subsequently, whenever a command named
I name
is encountered, the remainder of the command's
argument list will be assigned to
CW $*
and
I rc
will execute the
I commands .
The value of
CW $*
will be restored on completion.
For example:
P1
fn g {
	grep $1 *.[hcyl]
}
P2
defines
CI g " pattern
to look for occurrences of
I pattern
in all program source files in the current directory.
PP
Function definitions are deleted by writing
P1
fn \fIname\fP
P2
with no function body.
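PP
A short sketch of defining, calling and deleting a function follows; the name
CW hello
is arbitrary.
P1
fn hello { echo Hello, $* }
hello world	# prints: Hello, world
fn hello	# the definition is deleted
P2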
NH
Command execution
PP
I Rc
does one of several things to execute a simple command.
If the command name is the name of a function defined using
CW fn ,
the function is executed.
Otherwise, if it is the name of a built-in command, the
built-in is executed directly by
I rc .
Otherwise, directories mentioned in the variable
CW $path
are searched until an executable file is found.
Extensive use of the
CW $path
variable is discouraged in Plan 9. Instead, use the default
CW (.
CW /bin)
and bind what you need into
CW /bin .
NH
Built-in commands
PP
Several commands are executed internally by
I rc
because they are difficult to implement otherwise.
TP ". [-i] \fIfile ...\f(CW
Execute commands from
I file .
CW $*
is set for the duration to the remainder of the argument list following
I file .
CW $path
is used to search for
I file .
Option
CW -i
indicates interactive input \(em a prompt
(found in
CW $prompt )
is printed before each command is read.
TP "builtin \fIcommand ...\f(CW
Execute
I command
as usual except that any function named
I command
is ignored.
For example,
P1
fn cd{
	builtin cd $* && pwd
}
P2
defines a replacement for the
CW cd
built-in (see below) that announces the full name of the new directory.
TP "cd [\fIdir\f(CW]
Change the current directory to
I dir .
The default argument is
CW $home .
CW $cdpath
is a list of places in which to search for
I dir .
TP "eval [\fIarg ...\f(CW]
The arguments are concatenated (separated by spaces) into a string, read as input to
I rc ,
and executed. For example,
P1
x='$y'
y=Doody
eval echo Howdy, $x
P2
would echo
P1
Howdy, Doody
P2
since the arguments of
CW eval
would be
P1
echo Howdy, $y
P2
after substituting for
CW $x .
TP "exec [\fIcommand ...\f(CW]
I Rc
replaces itself with the given
I command .
This is like a
I goto
\(em
I rc
does not wait for the command to exit, and does not return to read any more commands.
TP "exit [\fIstatus\f(CW]
I Rc
exits immediately with the given status. If none is given, the current value of
CW $status
is used.
TP "flag \fIf\f(CW [+-]
This command manipulates and tests the command line flags (described below).
P1
flag \fIf\f(CW +
P2
sets flag
I f .
P1
flag \fIf\f(CW -
P2
clears flag
I f .
P1
flag \fIf\f(CW
P2
tests flag
I f ,
setting
CW $status
appropriately.
Thus
P1
if(flag x) flag v +
P2
sets the
CW -v
flag if the
CW -x
flag is already set.
TP "rfork [nNeEsfF]
This uses the Plan 9
I rfork
system entry to put
I rc
into a new process group with the following attributes:
TS
box;
l l l
lfCW l l.
Flag	Name	Function
_
n	RFNAMEG	Make a copy of the parent's name space
N	RFCNAMEG	Start with a new, empty name space
e	RFENVG	Make a copy of the parent's environment
E	RFCENVG	Start with a new, empty environment
s	RFNOTEG	Make a new note group
f	RFFDG	Make a copy of the parent's file descriptor space
F	RFCFDG	Make a new, empty file descriptor space
TE
Section
I fork (2)
of the Programmer's Manual describes these attributes in more detail.
TP "shift [\fIn\f(CW]
Delete the first
I n
(default 1) elements of
CW $* .
TP "wait [\fIpid\fP]
Wait for the process with the given
I pid
to exit. If no
I pid
is given, all outstanding processes are waited for.
TP "whatis \fIname ...\f(CW
Print the value of each
I name
in a form suitable for input to
I rc .
The output is an assignment to a variable, the definition of a function,
a call to
CW builtin
for a built-in command, or the path name of a binary program.
For example,
P1
whatis path g cd who
P2
might print
P1
path=(. /bin)
fn g {grep $1 *.[hcyl]}
builtin cd
/bin/who
P2
TP "~ \fIsubject pattern ...\f(CW
The
I subject
is matched against each
I pattern
in turn. On a match,
CW $status
is set to true.
Otherwise, it is set to
CW "'no match'" .
Patterns are the same as for filename matching.
The
I patterns
are not subjected to filename replacement before the
CW ~
command is executed, so they need not be enclosed in
quotation marks, unless of course, a literal match for
CW *
CW [
or
CW ?
is required.
For example
P1
~ $1 ?
P2
matches any single character, whereas
P1
~ $1 '?'
P2
only matches a literal question mark.
NH
Advanced I/O Redirection
PP
I Rc
allows redirection of file descriptors other than 0 and 1
(standard input and output) by specifying the file descriptor
in square brackets
CW "[ ]
after the
CW <
or
CW > .
For example,
P1
vc junk.c >[2]junk.diag
P2
saves the compiler's diagnostics from standard error in
CW junk.diag .
PP
File descriptors may be replaced by a copy, in the sense of
I dup (2),
of an already-open file by typing, for example
P1
vc junk.c >[2=1]
P2
This replaces file descriptor 2 with a copy of file descriptor 1.
It is more useful in conjunction with other redirections, like this
P1
vc junk.c >junk.out >[2=1]
P2
Redirections are evaluated from left to right, so this redirects
file descriptor 1 to
CW junk.out ,
then points file descriptor 2 at the same file.
By contrast,
P1
vc junk.c >[2=1] >junk.out
P2
redirects file descriptor 2 to a copy of file descriptor 1
(presumably the terminal), and then directs file descriptor 1
to a file. In the first case, standard and diagnostic output
will be intermixed in
CW junk.out .
In the second, diagnostic output will appear on the terminal,
and standard output will be sent to the file.
PP
File descriptors may be closed by using the duplication notation
with an empty right-hand side.
For example,
P1
vc junk.c >[2=]
P2
will discard diagnostics from the compilation.
PP
Arbitrary file descriptors may be sent through
a pipe by typing, for example,
P1
vc junk.c |[2] grep -v '^$'
P2
This deletes blank lines
from the C compiler's error output. Note that the output
of
CW grep
still appears on file descriptor 1.
PP
Occasionally you may wish to connect the input side of
a pipe to some file descriptor other than zero.
The notation
P1
cmd1 |[5=19] cmd2
P2
creates a pipeline with
CW cmd1 's
file descriptor 5 connected through a pipe to
CW cmd2 's
file descriptor 19.
NH
Here documents
PP
I Rc
procedures may include data, called ``here documents'',
to be provided as input to commands, as in this version of the
I tel
command
P1
for(i) grep $i <<!
\&...
tor 2T-402 2912
kevin 2C-514 2842
bill 2C-562 7214
\&...
!
P2
A here document is introduced by the redirection symbol
CW << ,
followed by an arbitrary EOF marker
CW ! "" (
in the example). Lines following the command,
up to a line containing only the EOF marker are saved
in a temporary file that is connected to the command's
standard input when it is run.
PP
I Rc
does variable substitution in here documents. The following command:
P1
ed $3 <<EOF
g/$1/s//$2/g
w
EOF
P2
changes all occurrences of
CW $1
to
CW $2
in file
CW $3 .
To include a literal
CW $
in a here document, type
CW $$ .
If the name of a variable is followed immediately by
CW ^ ,
the caret is deleted.
PP
Variable substitution can be entirely suppressed by enclosing
the EOF marker following
CW <<
in quotation marks, as in
CW <<'EOF' .
PP
Here documents may be provided on file descriptors other than 0 by typing, for example,
P1
cmd <<[4]End
\&...
End
P2
PP
If a here document appears within a compound block, the contents of the document
must be after the whole block:
P1
for(i in $*){
	mail $i <<EOF
}
words to live by
EOF
P2
NH
Catching Notes
PP
I Rc
scripts normally terminate when an interrupt is received from the terminal.
A function with the name of a UNIX signal, in lower case, is defined in the usual way,
but called when
I rc
receives the corresponding note.
The
I notify (2)
section of the Programmer's Manual discusses notes in some detail.
Notes of interest are:
TP sighup
The note was `hangup'.
Plan 9 sends this when the terminal has disconnected from
I rc .
TP sigint
The note was `interrupt', usually sent when
the interrupt character (ASCII DEL) is typed on the terminal.
TP sigterm
The note was `kill', normally sent by
I kill (1).
TP sigexit
An artificial note sent when
I rc
is about to exit.
PP
As an example,
P1
fn sigint{
	rm /tmp/junk
	exit
}
P2
sets a trap for the keyboard interrupt that
removes a temporary file before exiting.
PP
Notes will be ignored if the note routine is set to
CW {} .
Signals revert to their default behavior when their handlers'
definitions are deleted.
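PP
For instance, a script might ignore interrupts during a critical step and
then restore the default behavior. This is only a sketch;
CW update
is a hypothetical command.
P1
fn sigint {}	# interrupts are now ignored
update /lib/news/build
fn sigint	# handler deleted: default behavior returns
P2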
NH
Environment
PP
The environment is a list of name-value pairs made available to
executing binaries.
On Plan 9, the environment is stored in a file system named
CW #e ,
normally mounted on
CW /env .
The value of each variable is stored in a separate file, with components
terminated by zero bytes.
(The file system is
maintained entirely in core, so no disk or network access is involved.)
The contents of
CW /env
are shared on a per-process group basis; when a new process group is
created it effectively attaches
CW /env
to a new file system initialized with a copy of the old one.
A consequence of this organization is that commands can change environment
entries and see the changes reflected in
I rc .
PP
Functions also appear in the environment, named by prefixing
CW fn#
to their names, like
CW /env/fn#roff .
NH
Local Variables
PP
It is often useful to set a variable for the duration
of a single command. An assignment followed by a command
has this effect. For example
P1
a=global
a=local echo $a
echo $a
P2
will print
P1
local
global
P2
This works even for compound commands, like
P1
f=/fairly/long/file/name {
	{ wc $f; spell $f; diff $f.old $f } |
	  pr -h 'Facts about '$f | lp -dfn
}
P2
NH
Examples \(em \fIcd, pwd\fP
PP
Here is a pair of functions that provide
enhanced versions of the standard
CW cd
and
CW pwd
commands. (Thanks to Rob Pike for these.)
P1
ps1='% '	# default prompt
tab='	'	# a tab character
fn cd{
	builtin cd $1 &&
	switch($#*){
	case 0
		dir=$home
		prompt=($ps1 $tab)
	case *
		switch($1){
		case /*
			dir=$1
			prompt=(`{basename `{pwd}}^$ps1 $tab)
		case */* ..*
			dir=()
			prompt=(`{basename `{pwd}}^$ps1 $tab)
		case *
			dir=()
			prompt=($1^$ps1 $tab)
		}
	}
}
fn pwd{
	if(~ $#dir 0)
		dir=`{/bin/pwd}
	echo $dir
}
P2
Function
CW pwd
is a version of the standard
CW pwd
that caches its value in variable
CW $dir ,
because the genuine
CW pwd
can be quite slow to execute.
(Recent versions of Plan 9 have very fast implementations of
CW pwd ,
reducing the advantage of the
CW pwd
function.)
PP
Function
CW cd
calls the
CW cd
built-in, and checks that it was successful.
If so, it sets
CW $dir
and
CW $prompt .
The prompt will include the last component of the
current directory (except in the home directory,
where it will be null), and
CW $dir
will be reset either to the correct value or to
CW () ,
so that the
CW pwd
function will work correctly.
NH
Examples \(em \fIman\fP
PP
The
I man
command prints pages of the Programmer's Manual.
It is called, for example, as
P1
man 2 sinh
man rc
man -t cat
P2
In the first case, the page for
I sinh
in section 2 is printed.
In the second case, the manual page for
I rc
is printed. Since no manual section is specified,
all sections are searched for the page, and it is found
in section 1.
In the third case, the page for
I cat
is typeset (the
CW -t
option).
P1
cd /sys/man || {
	echo $0: No manual! >[1=2]
	exit 1
}
NT=n	# default nroff
s='*'	# section, default try all
for(i) switch($i){
case -t
	NT=t
case -n
	NT=n
case -*
	echo Usage: $0 '[-nt] [section] page ...' >[1=2]
	exit 1
case [1-9] 10
	s=$i
case *
	eval 'pages='$s/$i
	for(page in $pages){
		if(test -f $page)
			$NT^roff -man $page
		if not
			echo $0: $i not found >[1=2]
	}
}
P2
Note the use of
CW eval
to make a list of candidate manual pages.
Without
CW eval ,
the
CW *
stored in
CW $s
would not trigger filename matching
\(em it's enclosed in quotation marks,
and even if it weren't, it would be expanded
when assigned to
CW $s .
Eval causes its arguments
to be re-processed by
I rc 's
parser and interpreter, effectively delaying
evaluation of the
CW *
until the assignment to
CW $pages .
NH
Examples \(em \fIholmdel\fP
PP
The following
I rc
script plays the deceptively simple game
I holmdel ,
in which the players alternately name Bell Labs locations,
the winner being the first to mention Holmdel.
KF
P1
t=/tmp/holmdel$pid
fn read{
	$1=`{awk '{print;exit}'}
}
ifs='
\&' # just a newline
fn sigexit sigint sigquit sighup{
	rm -f $t
	exit
}
cat <<'!' >$t
Allentown
Atlanta
Cedar Crest
Chester
Columbus
Elmhurst
Fullerton
Holmdel
Indian Hill
Merrimack Valley
Morristown
Neptune
Piscataway
Reading
Short Hills
South Plainfield
Summit
Whippany
West Long Branch
!
while(){
	lab=`{fortune $t}
	echo $lab
	if(~ $lab Holmdel){
		echo You lose.
		exit
	}
	while(read lab; ! grep -i -s $lab $t) echo No such location.
	if(~ $lab [hH]olmdel){
		echo You win.
		exit
	}
}
P2
KE
PP
This script is worth describing in detail
(rather, it would be if it weren't so silly.)
PP
Variable
CW $t
is an abbreviation for the name of a temporary file.
Including
CW $pid ,
initialized by
I rc
to its process-id,
in the names of temporary files ensures that their
names won't collide, in case more than one instance
of the script is running at a time.
PP
Function
CW read 's
argument is the name of a variable into which a
line gathered from standard input is read.
CW $ifs
is set to just a newline. Thus
CW read 's
input is not split apart at spaces, but the terminating
newline is deleted.
PP
A handler is set to catch
CW sigint ,
CW sigquit ,
and
CW sighup ,
and the artificial
CW sigexit
signal. It just removes the temporary file and exits.
PP
The temporary file is initialized from a here
document containing a list of Bell Labs locations, and
the main loop starts.
PP
First, the program guesses a location (in
CW $lab )
using the
CW fortune
program to pick a random line from the location list.
It prints the location, and if it guessed Holmdel, prints
a message and exits.
PP
Then it uses the
CW read
function to get lines from standard input and validity-check
them until it gets a legal name.
Note that the condition part of a
CW while
can be a compound command. Only the exit status of the
last command in the sequence is checked.
PP
Again, if the result
is Holmdel, it prints a message and exits.
Otherwise it goes back to the top of the loop.
NH
Design Principles
PP
I Rc
draws heavily from Steve Bourne's
CW /bin/sh .
Any successor of the Bourne shell is bound to
suffer in comparison. I have tried to fix its
best-acknowledged shortcomings and to simplify things
wherever possible, usually by omitting inessential features.
Only when irresistibly tempted have I introduced novel ideas.
Obviously I have tinkered extensively with Bourne's syntax.
PP
The most important principle in
I rc 's
design is that it's not a macro processor. Input is never
scanned more than once by the lexical and syntactic analysis
code (except, of course, by the
CW eval
command, whose
I "raison d'être
is to break the rule).
PP
Bourne shell scripts can often be made
to run wild by passing them arguments containing spaces.
These will be split into multiple arguments using
CW IFS ,
often at inopportune times.
In
I rc ,
values of variables, including command line arguments, are not re-read
when substituted into a command.
Arguments have presumably been scanned in the parent process, and ought
not to be re-read.
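PP
A sketch of the difference, using a made-up file name:
P1
f='odd file name'
echo $#f	# 1: the value is a single string
rm $f	# rm receives one argument
P2
In the Bourne shell, the corresponding unquoted
CW "rm $f"
would pass three arguments,
CW odd ,
CW file
and
CW name .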
PP
Why does Bourne re-scan commands after variable substitution?
He needs to be able to store lists of arguments in variables whose values are
character strings.
If we eliminate re-scanning, we must change the type of variables, so that
they can explicitly carry lists of strings.
PP
This introduces some
conceptual complications. We need a notation for lists of words.
There are two different kinds of concatenation, for strings \(em
CW $a^$b ,
and lists \(em
CW "($a $b)" .
The difference between
CW ()
and
CW ''
is confusing to novices,
although the distinction is arguably sensible \(em
a null argument is not the same as no argument.
PP
Bourne also rescans input when doing command substitution.
This is because the text enclosed in back-quotes is not
a string, but a command. Properly, it ought to
be parsed when the enclosing command is, but this makes
it difficult to
handle nested command substitutions, like this:
P1
size=`wc -l \e`ls -t|sed 1q\e``
P2
The inner back-quotes must be escaped
to avoid terminating the outer command.
This can get much worse than the above example;
the number of
CW \e 's
required is exponential in the nesting depth.
I Rc
fixes this by making the backquote a unary operator
whose argument is a command, like this:
P1
size=`{wc -l `{ls -t|sed 1q}}
P2
No escapes are ever required, and the whole thing
is parsed in one pass.
PP
For similar reasons
I rc
defines signal handlers as though they were functions,
instead of associating a string with each signal, as Bourne does,
with the attendant possibility of getting a syntax error message
in response to typing the interrupt character. Since
I rc
parses input when typed, it reports errors when you make them.
PP
For all this trouble, we gain substantial semantic simplifications.
There is no need for the distinction between
CW $*
and
CW $@ .
There is no need for four types of quotation, nor the
extremely complicated rules that govern them. In
I rc
you use quotation marks when you want a syntax character
to appear in an argument, or an argument that is the empty string,
and at no other time.
CW IFS
is no longer used, except in the one case where it was indispensable:
converting command output into argument lists during command substitution.
PP
This also avoids an important UNIX security hole.
In UNIX, the
I system
and
I popen
functions call
CW /bin/sh
to execute a command. It is impossible to use either
of these routines with any assurance that the specified command will
be executed, even if the caller of
I system
or
I popen
specifies a full path name for the command. This can be devastating
if it occurs in a set-userid program.
The problem is that
CW IFS
is used to split the command into words, so an attacker can just
set
CW IFS=/
in his environment and leave a Trojan horse
named
CW usr
or
CW bin
in the current working directory before running the privileged program.
I Rc
fixes this by never rescanning input for any reason.
PP
Most of the other differences between
I rc
and the Bourne shell are not so serious. I eliminated Bourne's
peculiar forms of variable substitution, like
P1
echo ${a=b} ${c-d} ${e?error}
P2
because they are little used, redundant and easily
expressed in less abstruse terms.
I deleted the builtins
CW export ,
CW readonly ,
CW break ,
CW continue ,
CW read ,
CW return ,
CW set ,
CW times
and
CW unset
because they seem redundant or
only marginally useful.
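PP
As a rough sketch of such rewrites, the three substitutions above might be
expressed in
I rc
as follows. The correspondence is approximate:
CW $#a
is zero both for a variable that was never set and for one set to
CW () ,
and the error message and exit status are made up.
P1
if(~ $#a 0) a=b	# ${a=b}: assign a default
echo $a
if(~ $#c 0) echo d	# ${c-d}: use d without assigning
if not echo $c
if(~ $#e 0){	# ${e?error}: complain and exit
	echo error >[1=2]
	exit error
}
echo $e
P2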
PP
Where Bourne's syntax draws from Algol 68,
I rc 's
is based on C or Awk. This is harder to defend.
I believe that, for example
P1
if(test -f junk) rm junk
P2
is better syntax than
P1
if test -f junk; then rm junk; fi
P2
because it is less cluttered with keywords,
it avoids the semicolons that Bourne requires
in odd places,
and the syntax characters better set off the
active parts of the command.
PP
The one bit of large-scale syntax that Bourne
unquestionably does better than
I rc
is the
CW if
statement with
CW "else
clause.
I Rc 's
CW if
has no terminating
CW fi -like
bracket. As a result, the parser cannot
tell whether or not to expect an
CW "else
clause without looking ahead in its input.
The problem is that after reading, for example
P1
if(test -f junk) echo junk found
P2
in interactive mode,
I rc
cannot decide whether to execute it immediately and print
CW $prompt(1) ,
or to print
CW $prompt(2)
and wait for the
CW "else
to be typed.
In the Bourne shell, this is not a problem, because the
CW if
command must end with
CW fi ,
regardless of whether it contains an
CW else
or not.
PP
I Rc 's
admittedly feeble solution is to declare that the
CW else
clause is a separate statement, with the semantic
proviso that it must immediately follow an
CW if ,
and to call it
CW "if not
rather than
CW else ,
as a reminder that something odd is going on.
The only noticeable consequence of this is that
the braces are required in the construction
P1
for(i){
	if(test -f $i) echo $i found
	if not echo $i not found
}
P2
and that
I rc
resolves the ``dangling else'' ambiguity in opposition
to most people's expectations.
PP
It is remarkable that in the four most recent editions of the UNIX system
programmer's manual the Bourne shell grammar described in the manual page
does not admit the command
CW who|wc .
This is surely an oversight, but it suggests something darker:
nobody really knows what the Bourne shell's grammar is. Even examination
of the source code is little help. The parser is implemented by recursive
descent, but the routines corresponding to the syntactic categories all
have a flag argument that subtly changes their operation depending on the
context.
I Rc 's
parser is implemented using
I yacc ,
so I can say precisely what the grammar is.
NH
Acknowledgements
PP
Rob Pike, Howard Trickey and other Plan 9 users have been insistent, incessant
sources of good ideas and criticism. Some examples in this document are plagiarized
from [Bourne],
as are most of
I rc 's
good features.
NH
Reference
LP
S. R. Bourne,
UNIX Time-Sharing System: The UNIX Shell,
Bell System Technical Journal, Volume 57 number 6, July-August 1978