The vi/ex Editor, Part 1: Vi Editor Fundamentals | |
Why Vi? | |
A Heartwarming Edit | |
The Plan Of This Ongoing Tutorial | |
The Editor's Basic Concepts | |
Search Patterns | |
Searching From Where You Are Now | |
The Find-Them-All Search | |
Simple Search Patterns | |
Metacharacters | |
Table Of Search Pattern Metacharacters | |
Character Classes. | |
What's Coming For The Next Installment. | |
Why Vi? | |
A HEARTWARMING EDIT. Pity poor Hal, a corporate maintenance | |
programmer. A large module of badly-broken, poorly-patched legacy | |
code -- the spaghetti variety -- finally broke down completely | |
yesterday, leaving one corporate division running at half speed. | |
By dint of some inspired fixes during an all-nighter, Hal has the | |
module up and running again this morning... but just as he's ready | |
to go out for food that isn't from a vending machine, in walks the | |
corporation's VP of IS, with a big surprise. | |
"Nice work on that crash fix, Hal; but right now I need some | |
formatted technical data about it, in a hurry. The Board of | |
Directors' Information Systems Committee has called a rush | |
meeting this morning to convince themselves they're on top of | |
the problem. I'll be in the hotseat, and I need technical data | |
I can put up on the video projector to keep them occupied. | |
"They'll want me to discuss the logfile of errors that led up to | |
the crash . . . yes, I know that's in /oltp/err/m7, but | |
appending puts the latest report lines at the bottom of the | |
file. Those suits aren't interested in what they think is | |
ancient history, and they wouldn't be caught reading anything | |
but a commuter train timetable from the bottom up, so you'll | |
have to make a copy with the order of the lines reversed: what | |
was the last line becomes the first line, what was the second to | |
the last line is now line number two, and so on. | |
"And let's take a look at that logfile. | |
374a12 44872 130295/074457 nonabort | |
5982d34 971 130295/221938 nonabort | |
853f7 2184 140295/102309 abort | |
... | |
Hmmm. Explaining the second column to them would be advertising | |
the fact that we knew this failure was just waiting for a chance | |
to happen. So while you're at it, go through and erase all but | |
the first and last digits of each number in column two. | |
"Oh, and when they get tired of that they'll want to scrutinize | |
the Lint report. Last month I told them that our Lint | |
substitute was the greatest thing since Marilyn Monroe, so now | |
they'll want me to tell them why the messages it still generates | |
on this module aren't real hazards. Just run Lint over the | |
revamped module; then combine the Lint output with a copy of the | |
source file by taking each message line like: | |
Line 257: obsolete operator += | |
and putting the significant part at the end of the source line | |
it refers to. And put a separator, like XXX, between the source | |
line and the message so I can page through quickly. Nothing like | |
a hefty dose of source code they can't begin to fathom to make | |
the meeting break up early. | |
"And get right on this. The meeting starts in 35 minutes." | |
Our VP walks away inwardly smiling, thinking he's getting out of | |
detailed explanations and putting all the blame on an underling, | |
just by demanding more editing than anyone could do in the time | |
available. "I'll tell the Information Systems Committee that I | |
made it perfectly clear to the programmer that we needed this at | |
9:30, but when I asked him for it a minute ago he said it wasn't | |
finished and he wasn't sure when it would be. Then I'll remark | |
that those programmers just can't understand that keeping | |
management informed is every bit as important as writing code!" | |
But Hal has a secret weapon against this squeeze play: an expert | |
knowledge of the Vi editor. | |
Reversing the order of the lines in a file is a piece of cake with | |
this editor. The eight keystrokes in: | |
:g/^/m0(ret) | |
will do it. Taking the digits out of the middle of the second | |
column throughout the file also requires just one command line: | |
:%s/^\([^ ]* [0-9]\)[0-9]*\([0-9] \)/\1\2(ret) | |
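For readers who would like to check those two commands outside the editor, here is a rough Python sketch of what they accomplish. It is only an approximation under stated assumptions: the function names are hypothetical, the sample data is just the three logfile lines quoted above, and Python's pattern syntax writes the grouping parentheses without the backslashes that Vi requires.

```python
import re

# The three logfile lines quoted in the story above.
log_lines = [
    "374a12 44872 130295/074457 nonabort",
    "5982d34 971 130295/221938 nonabort",
    "853f7 2184 140295/102309 abort",
]

def reverse_lines(lines):
    """Mimic :g/^/m0 by moving each line, in turn, to the top of the result."""
    result = []
    for line in lines:
        result.insert(0, line)  # "m0" puts the line just after line zero
    return result

# Python spelling of the Vi pattern: Vi's \( \) become plain ( ) here.
SQUEEZE = re.compile(r"^([^ ]* [0-9])[0-9]*([0-9] )")

def squeeze_column_two(line):
    """Keep only the first and last digits of the number in column two."""
    return SQUEEZE.sub(r"\1\2", line, count=1)

if __name__ == "__main__":
    for line in reverse_lines(log_lines):
        print(squeeze_column_two(line))
```

The loop in reverse_lines is deliberately naive: it follows the way the global command visits every line and moves each one to the top, which is why the last line visited ends up first.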
And integrating the Lint messages into a copy of the source code? | |
Even that can be automated with the Vi editor. The editor | |
command: | |
:%s/Line \([0-9][0-9]*\): \(.*\)/\1s;$; XXX \2(ret) | |
will turn that file of Lint messages into an editor script, and | |
running that script on a copy of the source file will mark it up | |
as requested. | |
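The two-step idea (turn the messages into a script, then run the script on the source) can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, and apply_script assumes every script line has exactly the "Ns;$; XXX message" shape that the substitution above produces.

```python
import re

# Python spelling of the Vi pattern used on the Lint output.
LINT_LINE = re.compile(r"Line ([0-9][0-9]*): (.*)")

def lint_to_script(lint_lines):
    """Rewrite each 'Line N: message' as the editor command 'Ns;$; XXX message'."""
    return [LINT_LINE.sub(r"\1s;$; XXX \2", line) for line in lint_lines]

def apply_script(script, source_lines):
    """Simulate running the generated script: append each message to line N.

    Assumes every command looks like 'Ns;$; XXX message' (hypothetical helper).
    """
    marked = list(source_lines)
    for command in script:
        number, _, appended = command.partition("s;$;")
        marked[int(number) - 1] += appended
    return marked

if __name__ == "__main__":
    script = lint_to_script(["Line 2: obsolete operator +="])
    print(script)
    print(apply_script(script, ["first line", "second line", "third line"]))
```

Each generated command is an ordinary substitute command with a line-number address, semicolons as delimiters, and the end-of-line metacharacter as its whole search pattern, so it simply tacks the separator and message onto the end of the addressed line.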
Rather than being portrayed as a bungler, Hal can have it all | |
ready in a couple of minutes, just by typing a few lines. He'll | |
even have time to guard against vice-presidential prevarication, | |
by disappearing into the coffee shop across the street and | |
reappearing just as the meeting is getting started, to tell the VP | |
(and everyone else in earshot), "Those files you wanted are in | |
slash-temp-slash-hal". | |
THE PLAN OF THIS ONGOING TUTORIAL. I'm writing here for editor | |
users who have some fluency in Vi/Ex at the surface level. That | |
is, you know how to do the ordinary things that are belabored in | |
all the "Introducing Vi" books on the market, but rarely venture | |
beyond that level. | |
This tutorial series will explore a lot of other capabilities that | |
hardly anyone knows are in Vi/Ex. That includes quite a few | |
tricks that may be built on editor functions we all use every day, | |
but which nonetheless are not obvious--for instance, telling the | |
global command to mark every line it encounters. I'll also be | |
clarifying the real nature of the many misunderstood aspects of | |
this editor. | |
To do all this, I'll be explaining things in more depth than you | |
might think warranted at first. I'll also throw in examples | |
wherever they seem helpful. And to save you readers from gross | |
information overload, I'll write this tutorial in a large number | |
of fairly small modules, to be put up on our website at a calm, | |
reasonable pace. | |
The Editor's Basic Concepts | |
To get a real grasp on this editor's power, you need to know the | |
basic ideas embodied in it, and a few fundamental building blocks | |
that are used throughout its many functions. | |
One cause of editor misuse is that most users, even experienced | |
ones, don't really know what the editor is good at and what it's | |
not capable of. Here's a quick rundown on its capabilities. | |
First, it's strictly a general-purpose editor. It doesn't format | |
the text; it doesn't have the handholding of a word processor; it | |
doesn't have built-in special facilities for editing binaries, | |
graphics, tables, outlines, or any programming language except | |
Lisp. | |
It's two editors in one. Visual mode is a better full-screen | |
editor than most, and it runs faster than those rivals that have a | |
larger bag of screen-editing commands. Line editing mode dwarfs | |
the "global search and replace" facilities found in word | |
processors and simple screen editors; its only rivals are | |
non-visual editors like Sed where you must know in advance exactly | |
what you want to do. But in the Vi/Ex editor, the two sides are | |
very closely linked, giving the editor a combination punch that no | |
other editor I've tried can rival. | |
Finally, this editor is at its best when used by people who have | |
taken the trouble to learn it thoroughly. It's too capable to be | |
learned well in an hour or two, and too idiosyncratic to be | |
mastered in a week, and yet the power really is in it, for the few | |
who care to delve into it. A large part of that power requires | |
custom-programming the editor: that's not easy or straightforward, | |
but what can be done by the skillful user goes beyond the direct | |
programmability of any editor except (possibly) Emacs. | |
Search Patterns | |
In quite a few functions of this editor, you can use | |
string-pattern searching to say where something is to be done or | |
how far some effect is to extend. These search patterns are a | |
good example of an editor function that is very much in the Unix | |
style, but not exactly the same in detail as search patterns in | |
any other Unix utility. | |
Search patterns function in both line editing and visual editing | |
modes, and they work the same way in both, with just a few
exceptions. But how you tell the editor you're typing in a search | |
pattern will vary with the circumstances. | |
SEARCHING FROM WHERE YOU ARE NOW. The more common use for search | |
patterns is to go to some new place in the file, or make some | |
editing change that will extend from your present position to the | |
place the pattern search finds. (In line editing mode it's also | |
possible to have an action take place from one pattern's location | |
to where another pattern is found, but both searches still start | |
from your present location.) | |
If you want to search forward in the file from your present | |
location (toward the end of the file), precede the search pattern | |
with a slash (/) character, and type another to end the pattern. | |
So if you want to move forward to the next instance of the string | |
"j++" in your file, typing: | |
/j++/(ret) | |
will do it. And so will: | |
/j++(ret) | |
When there is nothing between the pattern and the RETURN key, the | |
RETURN itself will indicate the end of the search pattern, so the | |
second slash is not necessary. And if you are in visual mode, the | |
ESCAPE key works as well as RETURN does for ending search input, | |
so | |
/j++(esc) | |
is yet another way to make the same request from visual mode. | |
To search backward (toward the start of the file), begin and end | |
with a question mark instead of a slash. The same rules of | |
abbreviation apply to backward searches, so | |
?j++?(ret) | |
?j++(ret) | |
?j++(esc) | |
are all ways to head backward in the file to the same pattern. | |
Either way, you've expressed both your request for a pattern | |
search and the direction the search is to take in just one | |
keystroke. But don't assume that if you search backward, any | |
matching pattern the editor finds will be above your present | |
position in the file, and vice versa if you search forward. The | |
editor looks there first, certainly, but if it gets to the top or | |
bottom line of the file and hasn't found a match yet, it wraps | |
around to the other end of the file and continues the search in | |
the same direction. That is, if you used a question mark to order | |
a backward search and the editor searches all the way through the | |
top line of the file without finding a match, it will go on to | |
search the bottom line next, then the second-to-the-bottom line, | |
and so on until (if necessary) it gets back to the point where the | |
search started. Or if you were searching forward and the editor | |
found no match up through the very last line of the file, it would | |
next search the first line, then the second line, etcetera. | |
If you don't want searches to go past either end of the file, | |
you'll need to type in a line mode command: | |
:set nowrapscan(ret) | |
This will disable the wraparound searching during the present | |
session in the editor. If you want to restore the wraparound | |
searching mechanism before you leave the editor, typing | |
:set wrapscan(ret) | |
will do it, and you can turn this on and off as often as you like. | |
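The wraparound behavior is easier to see in a small simulation. The Python sketch below is an assumption-laden model, not the editor's actual code: it works on whole lines the way line mode does, looks for a plain string rather than a full search pattern, and the function names are hypothetical.

```python
def search_forward(lines, current, text, wrapscan=True):
    """Return the index of the next line after `current` containing `text`."""
    # First the lines below the current one, in order...
    candidates = list(range(current + 1, len(lines)))
    if wrapscan:
        # ...then wrap to the top and continue down to the starting line.
        candidates += list(range(0, current + 1))
    for index in candidates:
        if text in lines[index]:
            return index
    return None  # no match (the editor would report a failed search)

def search_backward(lines, current, text, wrapscan=True):
    """Return the index of the nearest line before `current` containing `text`."""
    candidates = list(range(current - 1, -1, -1))
    if wrapscan:
        # Wrap to the bottom line and keep heading upward.
        candidates += list(range(len(lines) - 1, current - 1, -1))
    for index in candidates:
        if text in lines[index]:
            return index
    return None

sample = ["j++ at top", "plain", "cursor here", "plain again"]

if __name__ == "__main__":
    print(search_forward(sample, 2, "j++"))                  # wraps past the end
    print(search_forward(sample, 2, "j++", wrapscan=False))  # stops at the end
```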
THE FIND-THEM-ALL SEARCH. Up to now, I've been considering | |
searches that find just one instance of the pattern; the one | |
closest to your current location in the file, in the direction you | |
chose for the search. But there is another style of search, used | |
primarily by certain line editing mode commands, such as global | |
and substitute. This search finds every line in the file (or in a | |
selected part of the file) that contains the pattern and operates | |
on them all. | |
Don't get confused when using the global and substitute commands. | |
You'll often use both styles of search pattern in one command | |
line. But the find-one-instance pattern or patterns will go | |
before the command name or abbreviation, while the find-them-all | |
pattern will come just behind it. For example, in the command: | |
:?Chapter 10?,/The End/substitute/cat/dog/g(ret) | |
the first two patterns refer to the preceding line closest to the | |
current line that contains the string "Chapter 10" and the closest | |
following line containing the string "The End". Note that each | |
address finds only one line. Combined with the intervening comma, | |
they indicate that the substitute command is to operate on those | |
two lines and all the lines in between them. But the patterns | |
immediately after the substitute command itself tell the command | |
to find every instance of the string "cat" within that range of
lines and replace it with the string "dog". | |
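To make the division of labor concrete, here is a minimal Python sketch of that command line. It rests on simplifying assumptions: the two addresses are treated as plain strings rather than patterns, wraparound is ignored, both addresses are assumed to find a line, and the function name is hypothetical.

```python
def substitute_in_range(lines, current, start_text, end_text, old, new):
    """Model of  :?start_text?,/end_text/substitute/old/new/g  (plain strings only)."""
    # ?start_text? finds ONE line: the nearest line above the current one.
    start = next(i for i in range(current - 1, -1, -1) if start_text in lines[i])
    # /end_text/ finds ONE line: the nearest line below the current one.
    end = next(i for i in range(current + 1, len(lines)) if end_text in lines[i])
    # The substitute pattern is find-them-all: every instance in every line of the range.
    return [
        line.replace(old, new) if start <= i <= end else line
        for i, line in enumerate(lines)
    ]

sample = [
    "cat before the range",
    "Chapter 10",
    "a cat and another cat",
    "current line",
    "The End of the cat",
    "cat after the range",
]

if __name__ == "__main__":
    for line in substitute_in_range(sample, 3, "Chapter 10", "The End", "cat", "dog"):
        print(line)
```

Lines outside the addressed range keep their "cat" untouched, while every "cat" inside it changes, including both on the same line.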
Aside from the difference in meaning, the two styles also have | |
different standards for the delimiters that mark pattern | |
beginnings and (sometimes) endings. With a find-them-all pattern, | |
there's no need to indicate whether to search forward or backward. | |
Thus, you aren't limited to slash and question mark as your | |
pattern delimiters. Almost any punctuation mark will do, because | |
the editor takes note of the first punctuation mark to appear | |
after the command name, and regards it as the delimiter in that | |
instance. So | |
:?Chapter 10?,/The End/substitute;cat;dog;g(ret) | |
:?Chapter 10?,/The End/substitute+cat+dog+g(ret) | |
:?Chapter 10?,/The End/substitute{cat{dog{g(ret) | |
are all equivalent to the substitution command above. (It is a | |
good idea to avoid using punctuation characters that might have a | |
meaning in the command, such as an exclamation point, which often | |
appears as a switch at the end of a command name.) | |
The benefit of this liberty comes when the slash mark will appear | |
as itself in the search pattern. For example, suppose our | |
substitution command above was to find each pair of consecutive | |
slash marks in the text, and separate them with a hyphen--that is, | |
change // to /-/. Obviously, | |
:?Chapter 10?,/The End/substitute/////-//g(ret) | |
won't work; the command will only regard the first three slashes | |
as delimiters, and everything after that as extraneous characters | |
at the end of the command. This can be solved by backslashing: | |
:?Chapter 10?,/The End/substitute/\/\//\/-\//g(ret) | |
but this is even harder to type correctly than the first attempt | |
was. But with another punctuation mark as the separator | |
:?Chapter 10?,/The End/substitute;//;/-/;g(ret) | |
the typing is easy and the final command is readable. | |
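The delimiter rule can be modeled with a short Python sketch. This is my own simplified reading of the behavior described above, not the editor's real parser: the function name is hypothetical, and it only handles the part of the command that follows the command name.

```python
def split_substitute(args):
    """Split text such as ';//;/-/;g' into (pattern, replacement, trailing text).

    The first character is taken as the delimiter; a backslashed character
    never acts as a delimiter. Simplified model, not Vi's real parser.
    """
    delimiter = args[0]
    fields = [""]
    i = 1
    while i < len(args):
        char = args[i]
        if char == "\\" and i + 1 < len(args):
            # Keep the backslash and the character it protects together.
            fields[-1] += args[i:i + 2]
            i += 2
            continue
        if char == delimiter and len(fields) < 3:
            fields.append("")  # only the first three delimiters count
        else:
            fields[-1] += char
        i += 1
    while len(fields) < 3:
        fields.append("")
    return tuple(fields)

if __name__ == "__main__":
    print(split_substitute("/cat/dog/g"))
    print(split_substitute(";//;/-/;g"))
    print(split_substitute("/////-//g"))  # the failed attempt from the text
```

In the failed attempt, the pattern and replacement both come out empty and everything after the third slash is left over as trailing text, which matches the complaint about extraneous characters at the end of the command.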
SIMPLE SEARCH PATTERNS. The simplest search pattern is just a | |
string of characters you want the editor to find, exactly as | |
you've typed them in. For instance: "the cat". But already there
are several caveats: | |
1. This search finds a string of characters, which may or may | |
not be words by themselves. That is, it may find its target in | |
the middle of the phrase "we fed the cat boiled chicken", or in | |
the middle of "we sailed a lithe catamaran down the coast". It's | |
all a matter of which it encounters first. | |
2. Whether the search calls "The Cat" a match or not depends on how | |
you've set an editor variable named ignorecase. If you've left | |
that variable in its default setting, the capitalized version will | |
not match. If you want a capital letter to match its lower-case | |
equivalent, and vice versa, type in the line mode command | |
:set ignorecase(ret) | |
To resume letting caps match only caps and vice versa, type | |
:set noignorecase(ret) | |
3. The search absolutely will not find a match where "the" | |
occurs at the end of one line and "cat" is at the start of the | |
next line: | |
and with Michael's careful help, we prodded the
cat back into its cage. Next afternoon several
It makes no difference whether there is or isn't a space | |
character between one of the words and the linebreak. Finding a | |
pattern that may break across a line ending is a practically | |
impossible task with this line-oriented editor. | |
4. Where the search starts depends on which editor mode you're | |
using. A search in visual mode starts with the character next to | |
the cursor. In line mode, the search starts with the line | |
adjacent to the current line. | |
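The first three caveats can be demonstrated with a few lines of Python. Treat this as a sketch under assumptions: it searches line by line for a plain string, as this line-oriented editor does, the sample text is borrowed from the examples above plus one made-up line, and the function name is hypothetical.

```python
sample = [
    "we sailed a lithe catamaran down the coast",
    "and with Michael's careful help, we prodded the",
    "cat back into its cage.",
    "The Cat sat.",  # made-up line for the ignorecase caveat
]

def find_line(lines, target, ignorecase=False):
    """Return the 1-based number of the first line containing `target`, or None."""
    for number, line in enumerate(lines, start=1):
        if ignorecase:
            found = target.lower() in line.lower()
        else:
            found = target in line
        if found:
            return number
    return None

if __name__ == "__main__":
    # Caveat 1: the match is inside "lithe catamaran", not a pair of words.
    print(find_line(sample, "the cat"))
    # Caveat 3: "the" + linebreak + "cat" is never a match.
    print(find_line(sample[1:3], "the cat"))
    # Caveat 2: "The Cat" only matches once case is ignored.
    print(find_line(sample[1:], "the cat", ignorecase=True))
```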
METACHARACTERS. Then there are search metacharacters or "wild | |
cards": characters that represent something other than themselves | |
in the search. As an example, the metacharacters . and * in | |
/Then .ed paid me $50*!/(ret) | |
could cause the pattern to match any of: | |
Then Ted paid me $5! | |
Then Red paid me $5000! | |
Then Ned paid me $50! | |
or a myriad of other strings. Metacharacters are what give search | |
patterns their real power, but they need to be well understood. | |
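If you want to experiment with that example away from the editor, the Python sketch below comes close. One difference is an assumption worth flagging: Python's pattern syntax treats a dollar sign as special everywhere, so it has to be backslashed here, whereas the Vi pattern above leaves it bare because it is not the last character.

```python
import re

# Python spelling of  /Then .ed paid me $50*!/  (note the backslashed dollar sign).
PATTERN = re.compile(r"Then .ed paid me \$50*!")

matching = [
    "Then Ted paid me $5!",
    "Then Red paid me $5000!",
    "Then Ned paid me $50!",
]
not_matching = [
    "Then Fred paid me $5!",  # the period stands for exactly one character
    "Then Ted paid me $!",    # the asterisk applies only to the zero, not the five
]

if __name__ == "__main__":
    for text in matching + not_matching:
        print(text, "->", bool(PATTERN.search(text)))
```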
To understand these, you must know the varied uses of the | |
backslash (\) metacharacter in turning the "wild card" value of | |
metacharacters on and off. | |
In many cases, the meta value of the metacharacter is on whenever | |
the character appears in a search pattern unless it is preceded by | |
a backslash; when the backslash is ahead of it the meta value is | |
turned off and the character simply represents itself. As an | |
example, the backslash is a metacharacter by itself, even if it | |
precedes a character that never has a meta value. The only way to | |
put an actual backslash in your search pattern is to precede it | |
with another backslash to remove its meta value. That is, to | |
search for the pattern "a\b", type | |
/a\\b/(ret) | |
as your search pattern. If you type | |
/a\b/(ret) | |
the backslash will be interpreted as a metacharacter without any | |
effect (since the letter b is never a metacharacter) and your | |
search pattern will find the string "ab". | |
Less-often-used metacharacters are used in exactly the opposite | |
way. This sort of character represents only itself when it | |
appears by itself. You must use a preceding backslash to turn the | |
meta value on. For example, in | |
/\<cat/ | |
the left angle bracket (<) is a metacharacter; in | |
/<cat/ | |
it only represents itself. These special metacharacters are | |
pointed out in the list below. | |
Finally there is a third class, the most difficult to keep track | |
of. Usually these metacharacters have their meta values on in | |
search patterns, and must be backslashed to make them represent | |
just themselves: like our first example, the backslash character | |
itself. But if you've changed the default value of an editor | |
variable named magic to turn it off, they work oppositely--you | |
then must backslash them to turn their meta value on: like our | |
second example, the left angle bracket. (Not that you are likely
to have any reason to turn magic off.) These oddities are
also noted in the list below. | |
And don't forget the punctuation character that starts and ends | |
your search pattern, whether it is slash or question mark or | |
something else. Whatever it is, if it is also to appear as a | |
character in the pattern you are searching for, you'll have to | |
backslash it there to prevent the editor thinking it is the end of | |
the pattern. | |
TABLE OF SEARCH PATTERN METACHARACTERS | |
. | |
A period in a search pattern matches any single character, whether
a letter of the alphabet (upper or lower case), a digit, or a
punctuation mark: in fact, any ASCII character except the newline.
So to find "default value" when it might be spelled | |
"default-value" or "default/value" or "default_value", etcetera, | |
use /default.value/ as your search pattern. When the editor | |
variable magic is turned off, you must backslash the period to | |
give it its meta value. | |
* | |
An asterisk, plus the character that precedes it, match any length | |
string (even zero length) of the character that precedes the | |
asterisk. So the search string /ab*c/ would match "ac" or "abc" | |
or "abbc" or "abbbc", and so on. (To find a string with at least | |
one "b" in it, use /abb*c/ as your search string.) When the | |
asterisk follows another metacharacter, the two match any length | |
string of characters that the metacharacter matches. That means | |
that /a.*b/ will find "a" followed by "b" with anything (or | |
nothing) between them. When the editor variable magic is turned | |
off, you must backslash the asterisk to give it its meta value. | |
^ | |
A circumflex as the first character in a search pattern means that | |
a match will be found only if the matching string occurs at the | |
start of a line of text. It doesn't represent any character at | |
the start of the line, of course, and a circumflex anywhere in a | |
search pattern except as the first character will have no meta | |
value. So /^cat/ will find "cat", but only at the start of a | |
line, while /cat^/ will find "cat^" anywhere in a line. | |
$ | |
A dollar sign as the last character in a search pattern means the | |
match must occur at the end of a line of text. Otherwise it's the | |
same as circumflex, above. | |
\< | |
At the start of a search pattern, a backslashed left angle bracket | |
means the match can only occur at the start of a simple word; at | |
any other position in a search pattern it is not a metacharacter. | |
(In this editor, a "simple" word is either a string of one or more | |
alphanumeric character(s) or a string of one or more | |
non-alphanumeric, non-whitespace character(s), so "shouldn't" | |
contains three simple words.) Thus /\<cat/ will find the last | |
three characters in "the cat" or in "tom-cat", but not in | |
"tomcat". To remove the meta value from the left angle bracket, | |
remove the preceding backslash: /<cat/ will find "<cat" regardless | |
of what precedes it. | |
\> | |
At the end of a search pattern, a backslashed right angle bracket | |
means the match can occur only at the end of a simple word. | |
Otherwise the same as the left angle bracket, above. | |
~ | |
The tilde represents the last string you put into a line by means | |
of a line mode substitute command, regardless of whether you were | |
in line mode then or ran it from visual mode by preceding it with | |
a colon (":"). For instance, if your last line mode substitution | |
command was s/dog/cat/ then a /the ~/ search pattern will find
"the cat". But the replacement string of a substitute command can use
metacharacters of its own, and if your last use involved any of | |
those metacharacters then a tilde in your search pattern will give | |
you either an error message or a match that is not what you | |
expected. When the editor variable magic is turned off, you must | |
backslash the tilde to give it its meta value. | |
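Several entries in this table can be tried out in Python, with some caution. The sketch below is an approximation, not a faithful emulation: Python has no \< metacharacter, so \b stands in for it (and \b counts the underscore as a word character, which may differ from the editor's idea of a simple word); Python also treats a circumflex as special anywhere in a pattern, so the literal one must be backslashed. The helper name is hypothetical.

```python
import re

def matches(pattern, text):
    """True if the Python pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

if __name__ == "__main__":
    # Period: any single character.
    print(matches(r"default.value", "default-value"))  # True
    print(matches(r"default.value", "defaultvalue"))   # False
    # Asterisk: zero or more of the preceding character.
    print(matches(r"ab*c", "ac"))                      # True
    print(matches(r"abb*c", "ac"))                     # False
    # Circumflex and dollar sign: start and end of line.
    print(matches(r"^cat", "tomcat"))                  # False
    print(matches(r"cat$", "tomcat"))                  # True
    # Literal circumflex: Vi's /cat^/ needs a backslash in Python.
    print(matches(r"cat\^", "a cat^ here"))            # True
    # Start of word: \b is the nearest Python stand-in for Vi's \<.
    print(matches(r"\bcat", "tom-cat"))                # True
    print(matches(r"\bcat", "tomcat"))                 # False
```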
CHARACTER CLASSES. There is one metastring form (a "multicharacter | |
metacharacter") used in search patterns. When several characters | |
are enclosed within a set of brackets ([]), the group matches any | |
one of the characters inside the brackets. That is, /part [123]/ | |
will match "part 1", "part 2" or "part 3", whichever the search | |
comes to first. One frequent use for this feature is in finding a | |
string that may or may not be capitalized, when the editor | |
variable ignorecase is turned off (as it is by default). Typing | |
/[Cc]at/ will find either "Cat" or "cat", and /[Cc][Aa][Tt]/ will | |
find those or "CAT". (In case there was a slip of the shift key | |
when "CAT" was typed in, the last pattern will even find "CaT", | |
"CAt", etcetera.) | |
There's more power (and some complication) in another feature of | |
this metastring: there can be metacharacters inside it. Inside the | |
brackets, a circumflex as the first character reverses the | |
meaning. Now the metastring matches any one character that is NOT | |
within the brackets. A /^[^ ]/ search pattern finds a line that | |
does not begin with a space character. (You're so right if you | |
think that the different meta values of the circumflex inside and | |
outside the character class brackets is not one of the editor's | |
best points.) A circumflex that is not the first character inside | |
the brackets represents just an actual circumflex. | |
A hyphen can be a metacharacter within the brackets, too. When | |
it's between two characters, and the first of the two other | |
characters has a lower ASCII value than the second, it's as if | |
you'd typed in all of the characters in the ASCII collating | |
sequence from the first to the second one, inclusive. So /[0-9]%/ | |
will find any numeral followed by the percent sign (%), just as | |
/[0123456789]%/ would. A /[a-z]/ search pattern will match any | |
lower-case letter, and /[a-zA-Z]/ matches any letter, capital or | |
lower case. These two internal metacharacters can be combined: | |
/[^A-Z]/ will find any character except a capital letter. A | |
hyphen that is either the first or the last character inside the | |
brackets has no meta value. When a character-hyphen-character | |
string has a first character with a higher ASCII value than the | |
last character, the hyphen and the two characters that surround it | |
are all ignored by the pattern search, so /[ABz-a]/ is the same as | |
/[AB]/. | |
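Character classes carry over to Python almost unchanged, so they make an easy experiment. The sketch below is only an illustration, with a hypothetical helper name. It also shows one place where the two tools part ways: Python refuses a backward range such as z-a outright, rather than ignoring it as described above.

```python
import re

def first_match(pattern, text):
    """Return the first substring the pattern matches, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

def compiles(pattern):
    """True if Python accepts the pattern at all."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

if __name__ == "__main__":
    print(first_match(r"part [123]", "see part 2 and part 3"))  # part 2
    print(first_match(r"[Cc][Aa][Tt]", "a CaT slipped by"))     # CaT
    print(first_match(r"^[^ ]", " indented line"))              # None
    print(first_match(r"[0-9]%", "up 5% today"))                # 5%
    print(first_match(r"[^A-Z]", "ABc"))                        # c
    print(compiles(r"[ABz-a]"))  # False: Python rejects the backward range
```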
Backslashing character classes is complex. Within the brackets | |
you must backslash a right bracket that's part of the class; | |
otherwise the editor will mistake it for the bracket that closes | |
the class. Of course you must backslash a backslash that you want | |
to be part of the class, and you can backslash a circumflex at the | |
start or a hyphen between two characters if you want them in the | |
class literally and don't want to move them elsewhere in the | |
construct. Elsewhere in a search pattern you will have to | |
backslash a left bracket that you want to appear as itself, or | |
else the editor will take it as your attempt to begin a character | |
class. Finally, if magic is turned off, you'll have to backslash | |
a left bracket when you do want it to begin a character class. | |
Coming Up Next | |
In the second part of this tutorial, I'll be following up on all | |
this information about search patterns, by showing the right ways | |
to combine them with other elements to generate command addresses. | |
As a second part finale, I'll show how to tap the enormous power | |
of the command that looks like an address: the global command. | |