The vi/ex Editor, Part 6: Addresses and Columns | |
Screen-Mode Addresses | |
A Few Address Principles | |
Useful Addresses | |
Editing in Columns | |
Single-Character Columns | |
Multi-Character Columns | |
Next Installment | |
By popular demand I'm trying something new in the tutorial, | |
starting with this installment. The e-mail I receive from | |
tutorial readers most often asks me how to do some specific type | |
of editing job, using whatever editor tools are needed. So, I'm | |
now mixing my general-principle explanations with in-depth | |
coverage of particular work areas. | |
The first application area I'm covering is the one readers ask | |
about most often, by far: editing files where columns are a major | |
factor. Future areas are up to you readers. If you have an | |
application area you'd like to see explained in some depth, | |
e-mail me your suggestion. | |
Screen-Mode Addresses | |
You use them all the time. They're the address targets that tell | |
screen-mode commands like c d y which stretch of your file to act | |
on. And even more often you use such addresses without commands, | |
to move around in the file. | |
For starters, I'll tell you some basics of screen-mode addressing | |
that aren't particularly clear to most editor users. Then it's | |
on to a few powerful but obscure addresses that most of us rarely | |
or never use. | |
A FEW ADDRESS PRINCIPLES | |
The first fact of screen-mode range addresses is simple enough: | |
one end of the range to be affected by the command is always | |
marked by the cursor itself. The address you give the command | |
(always a single address) indicates where the other end of the | |
affected range is to be. The address target can be either forward | |
or backward from the cursor position, in most cases. But exactly | |
how the cursor and the target terminate the two ends of the range | |
is variable. | |
At the start we have to distinguish between line addresses and | |
character addresses. Line addresses are very straightforward: | |
the command affects the entire line the cursor is on, the entire | |
line where the address point is located, and all the lines in | |
between. If you are using an address without a command, in order | |
to move the cursor, a line address generally puts the cursor on | |
the first non-whitespace character in the line addressed. | |
But line versus character addresses affect a lot more than | |
exactly what's included in the range. As one example, if you | |
yank or delete text using a line address and then place that text | |
somewhere with a p or P command, that text will appear on a new | |
line or lines, above or below the line you are on, respectively. | |
But if you yanked or deleted with a character address, when you | |
put the text back in, it will appear within the line you are on, | |
just just ahead of or behind the cursor. And to dispose of one | |
editor fallacy here and now, it does not make a bit of difference | |
that the range of text you yanked or deleted with a character | |
address amounts to exactly one or more lines -- it will still | |
behave as any other text yanked or deleted with a character | |
address. | |
So which addresses are line addresses? That depends on what your | |
command is. | |
Besides the three commands I cited as examples above, there are | |
four other, less-used commands -- ! < > = -- that also take | |
addresses. The only thing you have to know right now about these | |
four commands is that they can act only on entire lines; that's | |
inherent in what they do. So with these four commands, every | |
address is a line address. (Except a handful of addresses, such | |
as "f", that cannot be used with these commands at all.) | |
With the three more-used commands c d y or with an address used | |
by itself to move the cursor, an individual address is either | |
always a line address or always a character address -- usually. | |
There are exceptions to this rule also, such as the address | |
"j", which is a character address when you are just moving the | |
cursor, but a line address to any command. | |
So just where does a character address take you? When you are | |
just moving around in the file, the cursor lands on the character | |
that is the target you sought. Or if the target was a string of | |
characters, the character address puts the cursor on the first of | |
these. | |
When you are using a character address with a command, the | |
situation is more complex. The one firm rule is that if the | |
character address is farther down in the file than the cursor | |
position, the cursor position is included in the range the | |
command affects; while if the address target is earlier in the | |
file than the cursor, the cursor position is not included in the | |
range. | |
The question of whether the address target is included in the | |
command's range, like all the other open questions raised in the | |
last few paragraphs, will have to be answered separately for each | |
address. (But the usual rule is that if the address target is | |
forward of the cursor, the target is not included; if the target | |
lies backward from the cursor, the target is included.) | |
Note also that a count given with any of these seven commands is | |
passed to the address. You may give the count before or after | |
the command character itself, but always before the address. | |
What the address does with the count, if anything, is also a | |
case-by-case question. | |
USEFUL ADDRESSES | |
There are four addresses that together resemble a miniaturized, | |
localized version of the / and ? search patterns. In each case, | |
the search takes place only in the current line, and only for a | |
single character. To use any of them, you type one of the four | |
letters designating the kind of inline search, immediately | |
followed by the character to be searched for. (There are no | |
metacharacters used with these addresses.) | |
The letter "f" means that the search will go forward in the | |
current line and stop on the character typed next. "F" makes | |
the search run backward within the current line, otherwise the | |
same as "f". A "t" search is the same as an "f" search | |
except that the search stops with the character just short of the | |
one you type after the "t", and a "T" search is like a "t" | |
search but running backward within the current line. Any of | |
these addresses can take a preceding count, which tells the | |
search not to stop at the first instance of the character sought, | |
but to go on to the nth, where n is the count. | |
Any of these search commands, including the repeat-search | |
commands mentioned below, are character addresses and can be used | |
as an address for any of the three range commands that does not | |
require a line address. In every case, the character on which | |
the cursor would have landed had there been no command is the | |
furthest character included in the range the command will affect. | |
A few examples. "Fp" would cause a search that went backward and | |
landed on the closest prior letter "p". "3f-" would make the | |
search run forward within the current line and stop on the third | |
instance of a hyphen. "2T " would cause a backward search that | |
ended one character short of the second closest space character. | |
This search system has its own repeat-search characters, which | |
use storage buffers completely independent of those used for | |
storing previous / and ? search strings. A semicolon ";" repeats | |
the last inline search, in the same direction. A comma "," | |
repeats the last search but reverses the direction. Any count to | |
the original search is not included in the repeat, but you can | |
give a count to either repeat character which will be passed to | |
the search command that is repeated. While a search is limited | |
to the current line, you can run a search, move to another line, | |
then use a semicolon or comma to repeat the original search on | |
the new line. | |
Another very useful address that operates within a single line | |
is the vertical bar "|". When preceded by a count, this address | |
takes the cursor to the nth character on the current line, where | |
n is the count, regardless of where the cursor was when the | |
address was given. (In this address, n is absolute, not | |
relative, starting from character one at the left edge of the | |
text.) | |
This address can also be used with a command. If the target | |
character position is forward from the cursor position, the | |
furthest character affected will be the last one before the | |
target character. If the target is backward from the cursor, the | |
target character as well as all those between it and the cursor | |
will be affected by the command. | |
Editing in Columns | |
Although the Vi/Ex editor was not specifically designed to deal | |
with columnar material, there are ways to use it effectively for | |
this kind of work. Your choice of techniques will depend on | |
whether you are dealing with single-character columns wherein | |
each character in a line is in a separate column, or | |
multi-character columns where the columns are set apart from each | |
other by a separator character. | |
SINGLE-CHARACTER COLUMNS | |
Here I'm using "columns" the way most programmers do. A column | |
in this sense is simply the characters in a vertical section of a | |
file, one character wide. That is, the first character on each | |
line of the file is in the first column, the second character of | |
each line is in the second column, and so on. You'll find this | |
usage in systems that use punch-card images, such as early | |
Fortran programs; in the blocked records in certain databases, | |
such as the ones used for very large mailing lists; etcetera. | |
The essential point is that the systems that use these records | |
absolutely depend on each piece of information being entirely | |
within a certain column or range of columns, and nothing else | |
being within those columns except padding characters to fill up | |
any column positions not needed for the information in a | |
particular record. | |
For example, a mailing list may require that a suite or apartment | |
number be in columns 122 through 125 in each record (line), with | |
any padding following the actual number, so that an address | |
printing program that finds "316 " in those columns will | |
print ", #316" at the end of the street address line. If it | |
finds "3A " it will then print ", #3A", etcetera. | |
Should the suite number be even partially shifted out of the | |
designated columns, the system will either print garbage as the | |
suite number or issue an error message and skip that address | |
altogether. The principle is the same, and even more important, | |
with computer programs in punch-card image form. | |
When you are making changes in existing records, and editing | |
visually, the first important point is to be sure your are at the | |
start of the particular field you need to modify. The "|" | |
address I've explained above takes care of that -- wherever you | |
are in a line, typing 122| brings the cursor to the 122nd column. | |
Unless there are not 122 columns in that line: then the cursor | |
will be placed in the last column that does exist, without any | |
warning or error message. But files of this sort have generally | |
been checked for exact block sizing, and if yours have not been, | |
it's easy to check visually. | |
To check visually that all the lines in the file are of the | |
proper length, start by running a :se list command, which will | |
display a dollar sign at the end of each file line. Then scan | |
through the file to check that all those dollar signs are aligned | |
vertically. If so, then check that the uniform line length is | |
the correct one -- if your line length should be 66 characters | |
(not counting the nonvisible newline), then run a 65| command on | |
any line, and make sure that the cursor lands one column away | |
from the end of the line. | |
When you are at the start of the field to be changed, you have a | |
choice of ways to change it. If the change area is 12 characters | |
long, then typing 12cl followed by the 12 new characters and then | |
the escape key will do it. But if you miss the count by even one | |
character; if the actual number of characters you type in is 11 | |
or 13; then all the subsequent fields on that line will be | |
shifted one character out of place, which is probably a recipe | |
for disaster. | |
To avoid this hazard, make use of the little-known R command. It | |
starts like the familiar r command, in that when you type the | |
letter "R" in visual command mode the system waits to see what | |
character you type next, and whatever that next character is, it | |
replaces the character that was under the cursor. But instead of | |
then returning you to command mode, the R command then moves the | |
cursor one character to the right and again waits to see what | |
character you type next -- the character you now type replaces | |
the character that is now under the cursor. This process | |
continues until you stop it by hitting the escape key. So if | |
your cursor is on the capital P in the following line: | |
but the greatest ancient Greek was Plato, who | |
and you type in "RHomer" followed by the escape key, your line will | |
now read: | |
but the greatest ancient Greek was Homer, who | |
and the cursor will be on the letter r at the end of "Homer". | |
This character at a time replacement is the way to make sure you | |
don't inadvertently shift any fields. Just be certain that you | |
don't keep typing replacement characters beyond the existing end | |
of the line; you would extend the line length that way. You can | |
give a count to the R command, but you don't want to in this use | |
because the count will multiply the number of times the new | |
character string is inserted. That is, in that example above | |
about replacing "Plato" with "Homer", if you had typed 3R instead | |
of R your revised line would read: | |
but the greatest ancient Greek was HomerHomerHomer, who | |
Entering completely new lines of information is another matter. | |
You should just type them straight across, as you would with any | |
text entry, but if the existing lines are cryptic to human eyes | |
you may not be able to tell by looking just where one field ends | |
and another begins. You can try to keep count of the characters, | |
of course, but a single mistake will throw all the subsequent | |
fields in that line out of position. | |
What you need here is an on-screen template to show you what goes | |
where. You can make one on the spot, just by typing a template | |
line into your file, entering each data line just above it, and | |
deleting that template line when you are finished adding lines. | |
For example, suppose you are adding to a name file where each | |
record (line) starts with a month, day and year, continues with a | |
source code (each of the preceding as a two-digit number, with a | |
leading zero to pad it if necessary), and then has fields for a | |
last name, first name, and middle initial. It would not be | |
practical to judge where fields break just by looking at the | |
existing data lines, which might look like this: | |
07215854von TarekenstuttLeopold J | |
12077338Henderson-Blyth La Toya P | |
10108972Thistlethwaites Geraldine | |
But a simple template line can clear it all up. Here is one for the | |
job above: | |
m|d|y|s|LLLLLLLLLLLLLLL|FFFFFFFF|M | |
It has mnemonic characters to remind you of what goes in each | |
field, and the "|" to indicate the last position of each field | |
more noticeably. I've even used a lower-case letter for each | |
field that takes numeric characters right justified and zero | |
padded, and a capital letter for each field that takes alpha | |
characters left justified and space padded. | |
The way to use this template is to start entering data lines | |
immediately above the template line. That way, as you hit return | |
to start a new line, that new line replaces the one you've just | |
finished in the position right above the template line. Yes, | |
eventually the template line will be driven down off the bottom | |
of the screen, but returning to command mode and typing the | |
lower-case letter "z" followed by the return key will move the | |
template line and the lines around it to the top of the screen. | |
But there will be times when you don't want to spend time making | |
individual changes that you should be able to handle globally. | |
Suppose an obsolescent operations code has been replaced, and you | |
now need to change every "B27" to "K53" throughout your file, but | |
only when the "B27" appears in the operations code columns, which | |
are columns 9 through 11. Th is odd-looking command will do it: | |
:%s/^\(........\)B27/\1K53 | |
Those eight consecutive dots in the search pattern guarantee that | |
a match will occur only when there are exactly eight characters | |
between the beginning of the line and the "B27". So of | |
necessity, the "B" must occur in column 9, and so on. The "\1" | |
puts those eight characters right back in again, so only the | |
"B27" is actually replaced. | |
If your columnar file has all lines of equal length, as most do, | |
you can use this technique from the right side, too. If all | |
lines in the file have 66 characters, then typing that last | |
command as: | |
:%s/B27\(...\)$/K53\1 | |
will accomplish the changes in a case where the operations code | |
columns are 61 through 63, without the need to type (and | |
carefully count) sixty consecutive dots. | |
But there will be times when the columns to be changed are in the | |
middle of horrendously long record lines. There are still a | |
couple of tricks you may be able to use. One is to find a | |
landmark somewhere in mid-line. Does column 158 always contain | |
either a "*" or a "|" character, neither of which can appear | |
anywhere else in the lines? Then you can make the above change | |
in columns 163 through 165 by typing: | |
:%s/\([*|]....\)B27/\1K53 | |
Failing a landmark, let the editor count out a long string of | |
dots for you. To use this technique, you must first create your | |
substitution command as a text line within the file you are | |
editing, next write that line as a separate file (and then delete | |
the command line from your original file), and finally use the | |
:so command to pull in that one-line file and run it as a | |
line-mode command. If you need a string of 92 consecutive dots in | |
your command, create a blank line at the end of your file, next | |
type: | |
:1,92g/^/$s/^/. | |
to put those 92 dots there, and finally put the rest of the | |
command around that dot string. | |
MULTI-CHARACTER COLUMNS | |
The other meaning of "editing in columns" has to do with text | |
rather than data files. It refers to tables of data such as you | |
might find accompanying a technical article, columns of text | |
and/or illustrations running in parallel as you'd find on a | |
newspaper page, and the like. | |
Yes, Unix formatting utilities and some word processing programs | |
will format your final output into columns. But you may not have | |
all these utilities, you may not want to spend time trying to get | |
the results you want from those benighted programs, or you may | |
plan to direct your output where formatters won't work. | |
Visually editing the columns of data in a table requires little | |
explanation. The one thing to remember: use the R as far as | |
possible, to avoid shifting subsequent columns out of alignment | |
inadvertently. This holds for creating tables, too; start by | |
setting up a rectangular block of space characters, then replace | |
spaces with the column entries you want, to keep your next entry | |
from misaligning previous ones. This is also the best way to | |
create pictures, diagrams, graphs and maps using ASCII | |
characters. | |
Things become problematic when you want to shift whole columns | |
around -- there are no built-in Vi facilities for doing this. | |
Here is what it is practical to do in the editor. As a real life | |
example, consider the piece below, which I use as the tail end of | |
Usenet (Net news) posts that announce Indonesian classical music | |
and dance performances at a local restaurant: | |
It's at the Dutch East Indies ;,,,,;,,,,;,,,,;,,,,; | |
Restaurant on Oakland's /%%%%%%%%%%%%%%%%%%%%%\ | |
downtown waterfront. The /%%%%%%%%%%%%%%%%%%%%%%%\ | |
food there is very good "|""|"""|"""""|"""|""|" | |
Indonesian cuisine at _|__|___|_ _|___|__|_ | |
reasonable prices - dinners =|==|===|=====|===|==|= | |
$8.95 to $17.50. Views are ~~~~~~~~~~~( (~~~~~~~~~~~ | |
spectacular from the second ) ) | |
floor picture windows, out | |
over the water to Jack London Square, Alameda and San Francisco. | |
Formality is medium - cloth napkins and oil candles at the | |
tables, but no supercilious waiters, and the wall decorations are | |
mostly Indonesian handicrafts. The phone number for information | |
and reservations is 510/444-6555. | |
( ( ( | Broadway ||I The Dutch East Indies | |
) ) )Jack London |==========||== Restaurant is in Jack London | |
( ( ( Square |E ||8 Village, a boutiques & | |
) ) ) |m ||8 bistros cluster that is just | |
( ( ( JACK LONDON |b ||0 down the estuary from Jack | |
) ) )VILLAGE |a || London Square. Jack London | |
( ( ( Alice|r Amtrak ||f Village is rustic, | |
) ) ) -----------|c station ||r picturesque, quiet and safe. | |
( ( ( Street|a ||e To get there from the | |
) ) ) |d Jackson||e Interstate 880 freeway | |
( ( ( parking lot |e ------||-- heading north, take the Oak | |
) ) ) |r Street||w Street exit and turn left; | |
( ( ( |o ||a five blocks will bring you | |
) ) ) | ||y to Embarca- dero on your | |
( ( ( -------------||-- right, just before Oak | |
) ) ) Oak Street|| curves away to the left. | |
(Going south on I-880, take the Jackson Street exit and go two | |
blocks straight ahead before you turn right on Oak Street.) Turn | |
right onto Embarcadero and go three blocks, until you go under an | |
overpass of Victorian ironwork. Immediately turn left onto Alice | |
Street, where you will see Jack London Village on your right, and | |
a large lot that offers validated parking on the left. Walk into | |
the Village's central courtyard, and you'll see the Dutch East | |
Indies on the estuary side, toward the right, and upstairs. | |
To create this, I started by drawing the stylized building and | |
then the map. In each case I created a large rectangular block | |
of space characters, then began trying ideas with the R command | |
until I had something that satisfied me. (The pavilion sketch | |
eventually became wider than I had planned, so I had to run a | |
:%s/.*/ & / command to give me more working space.) Next I | |
put additional blocks of space characters on the left of the | |
drawing and the right of the map, to make a place for the text I | |
wanted to include. Then I started replacing spaces with text, | |
rewriting the text as I went along to fit it in nicely. When the | |
text reached the bottom of the figure I was fitting it to, I went | |
to full-width text lines, entering them the usual way. A tedious | |
labor, but pretty straightforward. | |
Now suppose I decided to redo this piece, by moving the picture | |
to where the map is now, and vice versa. A few well chosen | |
substitution and deletion commands would make copies of the two | |
figures minus the text, and I could just as easily copy the text | |
without the two figures. But how would I recombine them? | |
Short of typing the text in again from scratch, the best I could | |
do is to yank the lines of each figure, one at a time, and put | |
them after (or before) the appropriate text lines, one at a time. | |
Not that I would have to move back and forth between files with | |
each yank and put; I could yank up to 26 lines into the named | |
buffers, then move to the other file and put all 26 in their | |
proper places. But there is no Vi command to yank a rectangular | |
block of characters. | |
Also take note that I should yank using addresses that are not | |
line addresses, even though I will be yanking whole lines. If I | |
should yank with line addresses, putting the pieces into the | |
other file must make those pieces separate lines -- then I would | |
have to join each pair of lines to create the columns I want. | |
Next Time Around | |
In the next part of this tutorial, I will go over host of | |
complications and opportunities that come from allowing the | |
replacement commands I've discussed to use metacharacters. Then | |
I'll answer a couple of questions from readers that should be of | |
use to quite a few of you from time to time. | |
Part 7: The Replacement Commands | |
Back to the index |