The vi/ex Editor, Part 6: Addresses and Columns

	Screen-Mode Addresses
	A Few Address Principles
	Useful Addresses
	Editing in Columns
	Single-Character Columns
	Multi-Character Columns
	Next Installment

	By popular demand I'm trying something new in the tutorial,
	starting with this installment. The e-mail I receive from
	tutorial readers most often asks me how to do some specific type
	of editing job, using whatever editor tools are needed. So, I'm
	now mixing my general-principle explanations with in-depth
	coverage of particular work areas.

	The first application area I'm covering is the one readers ask
	about most often, by far: editing files where columns are a major
	factor. Future areas are up to you readers. If you have an
	application area you'd like to see explained in some depth,
	e-mail me your suggestion.

	Screen-Mode Addresses

	You use them all the time. They're the address targets that tell
	screen-mode commands like c d y which stretch of your file to act
	on. And even more often you use such addresses without commands,
	to move around in the file.

	For starters, I'll tell you some basics of screen-mode addressing
	that aren't particularly clear to most editor users. Then it's
	on to a few powerful but obscure addresses that most of us rarely
	or never use.

	A FEW ADDRESS PRINCIPLES

	The first fact of screen-mode range addresses is simple enough:
	one end of the range to be affected by the command is always
	marked by the cursor itself. The address you give the command
	(always a single address) indicates where the other end of the
	affected range is to be. The address target can be either forward
	or backward from the cursor position, in most cases. But exactly
	how the cursor and the target terminate the two ends of the range
	is variable.

	At the start we have to distinguish between line addresses and
	character addresses. Line addresses are very straightforward:
	the command affects the entire line the cursor is on, the entire
	line where the address point is located, and all the lines in
	between. If you are using an address without a command, in order
	to move the cursor, a line address generally puts the cursor on
	the first non-whitespace character in the line addressed.

	But line versus character addresses affect a lot more than
	exactly what's included in the range. As one example, if you
	yank or delete text using a line address and then place that text
	somewhere with a p or P command, that text will appear on a new
	line or lines, below or above the line you are on, respectively.
	But if you yanked or deleted with a character address, when you
	put the text back in, it will appear within the line you are on,
	just after or just before the cursor. And to dispose of one
	editor fallacy here and now, it does not make a bit of difference
	that the range of text you yanked or deleted with a character
	address amounts to exactly one or more lines -- it will still
	behave as any other text yanked or deleted with a character
	address.
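
	If it helps to see that rule spelled out, here is a minimal
	Python sketch of it. This is only a model of the behavior
	described above, not the editor's own code; the function name
	put and its parameters are invented for the illustration, and
	the character case assumes the text has no embedded newlines.

```python
def put(lines, row, col, text, linewise, after=True):
    """Model of p (after=True) and P (after=False).

    lines is the file as a list of strings; row and col give the
    0-based cursor position; text is what was yanked or deleted.
    """
    if linewise:
        # Line-address text always lands on new lines of its own.
        at = row + 1 if after else row
        return lines[:at] + text.split("\n") + lines[at:]
    # Character-address text is spliced into the current line,
    # even when it happens to be a whole line's worth of text.
    at = col + 1 if after else col
    line = lines[row]
    return lines[:row] + [line[:at] + text + line[at:]] + lines[row + 1:]


buffer = ["alpha", "beta"]
print(put(buffer, 1, 0, "alpha", linewise=True))
print(put(buffer, 1, 0, "alpha", linewise=False))
```

	The two calls put the same five characters; only the kind of
	address used for the yank decides whether they arrive as a new
	line or inside the existing one.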

	So which addresses are line addresses? That depends on what your
	command is.

	Besides the three commands I cited as examples above, there are
	four other, less-used commands -- ! < > = -- that also take
	addresses. The only thing you have to know right now about these
	four commands is that they can act only on entire lines; that's
	inherent in what they do. So with these four commands, every
	address is a line address. (Except a handful of addresses, such
	as "f", that cannot be used with these commands at all.)

	With the three more-used commands c d y or with an address used
	by itself to move the cursor, an individual address is either
	always a line address or always a character address -- usually.
	There are exceptions to this rule also, such as the address
	"j", which is a character address when you are just moving the
	cursor, but a line address to any command.

	So just where does a character address take you? When you are
	just moving around in the file, the cursor lands on the character
	that is the target you sought. Or if the target was a string of
	characters, the character address puts the cursor on the first of
	these.

	When you are using a character address with a command, the
	situation is more complex. The one firm rule is that if the
	character address is farther down in the file than the cursor
	position, the cursor position is included in the range the
	command affects; while if the address target is earlier in the
	file than the cursor, the cursor position is not included in the
	range.

	The question of whether the address target is included in the
	command's range, like all the other open questions raised in the
	last few paragraphs, will have to be answered separately for each
	address. (But the usual rule is that if the address target is
	forward of the cursor, the target is not included; if the target
	lies backward from the cursor, the target is included.)
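
	One way to keep these rules straight is to think of the range
	as a half-open span of character offsets. The Python sketch
	below is only an illustration of the rules as stated here;
	char_range, delete_chars and the target_included flag are names
	I made up, and the flag stands in for the per-address exceptions.

```python
def char_range(cursor, target, target_included=False):
    """Return (start, end) as a half-open span of 0-based offsets."""
    if target >= cursor:
        # Forward: the cursor position is always in the range; the
        # target is in it only for addresses that include it.
        end = target + 1 if target_included else target
        return cursor, end
    # Backward: the target is in, the cursor position is out.
    return target, cursor


def delete_chars(text, cursor, target, target_included=False):
    """Model a d command with a character address on one line."""
    start, end = char_range(cursor, target, target_included)
    return text[:start] + text[end:]


print(delete_chars("abcdefgh", 2, 5))   # cursor on c, target f
print(delete_chars("abcdefgh", 5, 2))   # cursor on f, target c
```

	Under the usual rule both calls remove the same three
	characters, which is why the forward and backward cases feel
	symmetrical in practice.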

	Note also that a count given with any of these seven commands is
	passed to the address. You may give the count before or after
	the command character itself, but always before the address.
	What the address does with the count, if anything, is also a
	case-by-case question.

	USEFUL ADDRESSES

	There are four addresses that together resemble a miniaturized,
	localized version of the / and ? search patterns. In each case,
	the search takes place only in the current line, and only for a
	single character. To use any of them, you type one of the four
	letters designating the kind of inline search, immediately
	followed by the character to be searched for. (There are no
	metacharacters used with these addresses.)

	The letter "f" means that the search will go forward in the
	current line and stop on the character typed next. "F" makes
	the search run backward within the current line, otherwise the
	same as "f". A "t" search is the same as an "f" search
	except that the search stops with the character just short of the
	one you type after the "t", and a "T" search is like a "t"
	search but running backward within the current line. Any of
	these addresses can take a preceding count, which tells the
	search not to stop at the first instance of the character sought,
	but to go on to the nth, where n is the count.

	Any of these search commands, including the repeat-search
	commands mentioned below, are character addresses and can be used
	as an address for any of the three range commands that does not
	require a line address. In every case, the character on which
	the cursor would have landed had there been no command is the
	furthest character included in the range the command will affect.

	A few examples. "Fp" would cause a search that went backward and
	landed on the closest prior letter "p". "3f-" would make the
	search run forward within the current line and stop on the third
	instance of a hyphen. "2T " would cause a backward search that
	ended one character short of the second closest space character.
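
	For readers who like to see the logic written down, here is a
	rough Python model of the four inline searches. It is a sketch
	under my own assumptions (0-based offsets, a failed search
	returns None, edge cases of the real editor ignored), and the
	name inline_search is invented for the example.

```python
def inline_search(line, cursor, kind, char, count=1):
    """Return the offset where f, F, t or T would leave the cursor.

    kind is one of "f", "F", "t", "T"; returns None when the line
    does not hold enough instances of char in that direction.
    """
    if kind not in ("f", "F", "t", "T"):
        raise ValueError("kind must be f, F, t or T")
    step = 1 if kind in ("f", "t") else -1
    pos = cursor
    for _ in range(count):
        pos += step
        # Walk along the line until the next instance of char.
        while 0 <= pos < len(line) and line[pos] != char:
            pos += step
        if not 0 <= pos < len(line):
            return None
    if kind in ("t", "T"):
        pos -= step          # stop one character short of the target
    return pos


line = "top-of-the-heap pile"
print(inline_search(line, 0, "f", "-", count=3))    # like 3f-
print(inline_search(line, 19, "F", "p"))            # like Fp
print(inline_search("one two three", 12, "T", " ", count=2))  # like 2T<space>
```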

	This search system has its own repeat-search characters, which
	use storage buffers completely independent of those used for
	storing previous / and ? search strings. A semicolon ";" repeats
	the last inline search, in the same direction. A comma ","
	repeats the last search but reverses the direction. Any count to
	the original search is not included in the repeat, but you can
	give a count to either repeat character which will be passed to
	the search command that is repeated. While a search is limited
	to the current line, you can run a search, move to another line,
	then use a semicolon or comma to repeat the original search on
	the new line.

	Another very useful address that operates within a single line
	is the vertical bar "|". When preceded by a count, this address
	takes the cursor to the nth character on the current line, where
	n is the count, regardless of where the cursor was when the
	address was given. (In this address, n is absolute, not
	relative, starting from character one at the left edge of the
	text.)

	This address can also be used with a command. If the target
	character position is forward from the cursor position, the
	furthest character affected will be the last one before the
	target character. If the target is backward from the cursor, the
	target character as well as all those between it and the cursor
	will be affected by the command.
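
	Here is the same behavior as a small Python sketch, with a d
	command standing in for "a command". The names column_address
	and delete_to_column are mine, and I am assuming a line with no
	tab characters, so that character position and screen column
	are the same thing.

```python
def column_address(line, n):
    """0-based offset reached by the n| address on this line."""
    if not line:
        return 0
    # A count past the end of the line lands on the last character.
    return min(max(n, 1), len(line)) - 1


def delete_to_column(line, cursor, n):
    """Model d followed by n| with the cursor at offset cursor."""
    target = column_address(line, n)
    # Forward: cursor up to just before the target.
    # Backward: target up to just before the cursor.
    start, end = sorted((cursor, target))
    return line[:start] + line[end:]


print(delete_to_column("0123456789", 2, 6))   # cursor on "2", d6|
print(delete_to_column("0123456789", 7, 3))   # cursor on "7", d3|
```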

	Editing in Columns

	Although the Vi/Ex editor was not specifically designed to deal
	with columnar material, there are ways to use it effectively for
	this kind of work. Your choice of techniques will depend on
	whether you are dealing with single-character columns wherein
	each character in a line is in a separate column, or
	multi-character columns where the columns are set apart from each
	other by a separator character.

	SINGLE-CHARACTER COLUMNS

	Here I'm using "columns" the way most programmers do. A column
	in this sense is simply the characters in a vertical section of a
	file, one character wide. That is, the first character on each
	line of the file is in the first column, the second character of
	each line is in the second column, and so on. You'll find this
	usage in systems that use punch-card images, such as early
	Fortran programs; in the blocked records in certain databases,
	such as the ones used for very large mailing lists; etcetera.

	The essential point is that the systems that use these records
	absolutely depend on each piece of information being entirely
	within a certain column or range of columns, and nothing else
	being within those columns except padding characters to fill up
	any column positions not needed for the information in a
	particular record.

	For example, a mailing list may require that a suite or apartment
	number be in columns 122 through 125 in each record (line), with
	any padding following the actual number, so that an address
	printing program that finds "316 " in those columns will
	print ", #316" at the end of the street address line. If it
	finds "3A  " it will then print ", #3A", etcetera.
	Should the suite number be even partially shifted out of the
	designated columns, the system will either print garbage as the
	suite number or issue an error message and skip that address
	altogether. The principle is the same, and even more important,
	with computer programs in punch-card image form.

	When you are making changes in existing records, and editing
	visually, the first important point is to be sure you are at the
	start of the particular field you need to modify. The "|"
	address I've explained above takes care of that -- wherever you
	are in a line, typing 122| brings the cursor to the 122nd column.
	Unless there are not 122 columns in that line: then the cursor
	will be placed in the last column that does exist, without any
	warning or error message. But files of this sort have generally
	been checked for exact block sizing, and if yours have not been,
	it's easy to check visually.

	To check visually that all the lines in the file are of the
	proper length, start by running a :se list command, which will
	display a dollar sign at the end of each file line. Then scan
	through the file to check that all those dollar signs are aligned
	vertically. If so, then check that the uniform line length is
	the correct one -- if your line length should be 66 characters
	(not counting the nonvisible newline), then run a 65| command on
	any line, and make sure that the cursor lands one column away
	from the end of the line.
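
	If the file is too long to scan by eye, the same check can be
	made outside the editor. The following Python sketch is one
	possible way to do it; the function name bad_length_lines and
	the file name records.dat in the usage note are hypothetical.

```python
def bad_length_lines(lines, expected):
    """Return (line number, actual length) for every off-size line.

    lines must already have their newlines stripped; line numbers
    start at 1, the way the editor counts them.
    """
    return [(number, len(line))
            for number, line in enumerate(lines, start=1)
            if len(line) != expected]


sample = ["x" * 66, "x" * 65, "x" * 66]
for number, length in bad_length_lines(sample, 66):
    print("line", number, "has", length, "characters")
```

	To run it against a real file you could read the lines with
	something like open("records.dat").read().splitlines() and pass
	the result in place of sample.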

	When you are at the start of the field to be changed, you have a
	choice of ways to change it. If the change area is 12 characters
	long, then typing 12cl followed by the 12 new characters and then
	the escape key will do it. But if you miss the count by even one
	character (if the actual number of characters you type in is 11
	or 13), then all the subsequent fields on that line will be
	shifted one character out of place, which is probably a recipe
	for disaster.

	To avoid this hazard, make use of the little-known R command. It
	starts like the familiar r command, in that when you type the
	letter "R" in visual command mode the system waits to see what
	character you type next, and whatever that next character is, it
	replaces the character that was under the cursor. But instead of
	then returning you to command mode, the R command then moves the
	cursor one character to the right and again waits to see what
	character you type next -- the character you now type replaces
	the character that is now under the cursor. This process
	continues until you stop it by hitting the escape key. So if
	your cursor is on the capital P in the following line:

	but the greatest ancient Greek was Plato, who

	and you type in "RHomer" followed by the escape key, your line will
	now read:

	but the greatest ancient Greek was Homer, who

	and the cursor will be on the letter r at the end of "Homer".
	This character-at-a-time replacement is the way to make sure you
	don't inadvertently shift any fields. Just be certain that you
	don't keep typing replacement characters beyond the existing end
	of the line; you would extend the line length that way. You can
	give a count to the R command, but you don't want to in this use
	because the count will multiply the number of times the new
	character string is inserted. That is, in that example above
	about replacing "Plato" with "Homer", if you had typed 3R instead
	of R your revised line would read:

	but the greatest ancient Greek was HomerHomerHomer, who
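
	The effect of R, with and without a count, can be modeled in a
	few lines of Python. This is a sketch of the behavior shown in
	the examples above and nothing more; overtype is an invented
	name, and cursor is a 0-based offset.

```python
def overtype(line, cursor, typed, count=1):
    """Model R, the typed characters, and the escape key."""
    # The first pass overwrites one existing character per typed
    # character; typing past the end simply lengthens the line.
    result = line[:cursor] + typed + line[cursor + len(typed):]
    # With a count, the extra copies are inserted, not overtyped.
    at = cursor + len(typed)
    return result[:at] + typed * (count - 1) + result[at:]


line = "but the greatest ancient Greek was Plato, who"
start = line.index("P")
print(overtype(line, start, "Homer"))
print(overtype(line, start, "Homer", count=3))
```

	Note that the first result is exactly as long as the original
	line, which is the whole point in a columnar file, while the
	second has grown by ten characters.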

	Entering completely new lines of information is another matter.
	You should just type them straight across, as you would with any
	text entry, but if the existing lines are cryptic to human eyes
	you may not be able to tell by looking just where one field ends
	and another begins. You can try to keep count of the characters,
	of course, but a single mistake will throw all the subsequent
	fields in that line out of position.

	What you need here is an on-screen template to show you what goes
	where. You can make one on the spot, just by typing a template
	line into your file, entering each data line just above it, and
	deleting that template line when you are finished adding lines.
	For example, suppose you are adding to a name file where each
	record (line) starts with a month, day and year, continues with a
	source code (each of the preceding as a two-digit number, with a
	leading zero to pad it if necessary), and then has fields for a
	last name, first name, and middle initial. It would not be
	practical to judge where fields break just by looking at the
	existing data lines, which might look like this:

	07215854von TarekenstuttLeopold  J
	12077338Henderson-Blyth La Toya  P
	10108972Thistlethwaites Geraldine

	But a simple template line can clear it all up. Here is one for the
	job above:

	m|d|y|s|LLLLLLLLLLLLLLL|FFFFFFFF|M

	It has mnemonic characters to remind you of what goes in each
	field, and the "|" to indicate the last position of each field
	more noticeably. I've even used a lower-case letter for each
	field that takes numeric characters right justified and zero
	padded, and a capital letter for each field that takes alpha
	characters left justified and space padded.

	The way to use this template is to start entering data lines
	immediately above the template line. That way, as you hit return
	to start a new line, that new line replaces the one you've just
	finished in the position right above the template line. Yes,
	eventually the template line will be driven down off the bottom
	of the screen, but returning to command mode and typing the
	lower-case letter "z" followed by the return key will move the
	template line and the lines around it to the top of the screen.
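
	The same template can double as a checking tool once the lines
	are typed. Below is a minimal Python sketch, assuming each "|"
	in the template marks the last position of a field as described
	above; field_widths and split_record are names I invented, and
	the sample record is padded to the widths the template implies.

```python
TEMPLATE = "m|d|y|s|" + "L" * 15 + "|" + "F" * 8 + "|M"


def field_widths(template):
    """Width of each field; a "|" counts as its field's last column."""
    parts = template.split("|")
    widths = [len(part) + 1 for part in parts[:-1]]
    if parts[-1]:
        widths.append(len(parts[-1]))   # final field has no "|"
    return widths


def split_record(record, template=TEMPLATE):
    """Cut a data line into fields, refusing lines of the wrong size."""
    widths = field_widths(template)
    if len(record) != sum(widths):
        raise ValueError("record is %d characters, template is %d"
                         % (len(record), sum(widths)))
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width])
        pos += width
    return fields


record = "07215854" + "von Tarekenstutt" + "Leopold  " + "J"
print(field_widths(TEMPLATE))
print(split_record(record))
```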

	But there will be times when you don't want to spend time making
	individual changes that you should be able to handle globally.
	Suppose an obsolescent operations code has been replaced, and you
	now need to change every "B27" to "K53" throughout your file, but
	only when the "B27" appears in the operations code columns, which
	are columns 9 through 11. This odd-looking command will do it:

	:%s/^\(........\)B27/\1K53

	Those eight consecutive dots in the search pattern guarantee that
	a match will occur only when there are exactly eight characters
	between the beginning of the line and the "B27". So of
	necessity, the "B" must occur in column 9, and so on. The "\1"
	puts those eight characters right back in again, so only the
	"B27" is actually replaced.
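
	For comparison, here is what the same idea looks like in
	Python's regular expression syntax. It is only an analogue of
	the editor command, not a replacement for it; the function name
	replace_at_column is hypothetical, and columns are counted from
	1 as in the text.

```python
import re


def replace_at_column(line, start_col, old, new):
    """Replace old with new only where old begins at start_col."""
    # "^(.{8})" plays the part of the eight dots inside \( \),
    # and the saved group is put back just as \1 does.
    pattern = "^(.{%d})%s" % (start_col - 1, re.escape(old))
    return re.sub(pattern, lambda m: m.group(1) + new, line)


print(replace_at_column("12345678B27xxxB27", 9, "B27", "K53"))
print(replace_at_column("1234567B27B27xxx", 9, "B27", "K53"))
```

	The second line is left alone: it contains "B27" twice, but
	neither instance starts in column 9.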

	If your columnar file has all lines of equal length, as most do,
	you can use this technique from the right side, too. If all
	lines in the file have 66 characters, then typing that last
	command as:

	:%s/B27\(...\)$/K53\1

	will accomplish the changes in a case where the operations code
	columns are 61 through 63, without the need to type (and
	carefully count) sixty consecutive dots.
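
	The arithmetic behind the right-side version is simply line
	length minus last column: 66 - 63 leaves three characters after
	the code. A short Python sketch of the same anchoring, with the
	invented name replace_before_tail, assuming every line really
	is the stated length:

```python
import re


def replace_before_tail(line, tail, old, new):
    """Replace old only where exactly tail characters follow it."""
    pattern = "%s(.{%d})$" % (re.escape(old), tail)
    return re.sub(pattern, lambda m: new + m.group(1), line)


line_length, last_col = 66, 63
tail = line_length - last_col            # characters after column 63
record = "x" * 60 + "B27" + "yyy"
print(replace_before_tail(record, tail, "B27", "K53"))
```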

	But there will be times when the columns to be changed are in the
	middle of horrendously long record lines. There are still a
	couple of tricks you may be able to use. One is to find a
	landmark somewhere in mid-line. Does column 158 always contain
	either a "*" or a "|" character, neither of which can appear
	anywhere else in the lines? Then you can make the above change
	in columns 163 through 165 by typing:

	:%s/\([*|]....\)B27/\1K53

	Failing a landmark, let the editor count out a long string of
	dots for you. To use this technique, you must first create your
	substitution command as a text line within the file you are
	editing, next write that line as a separate file (and then delete
	the command line from your original file), and finally use the
	:so command to pull in that one-line file and run it as a
	line-mode command. If you need a string of 92 consecutive dots in
	your command, create a blank line at the end of your file, next
	type:

	:1,92g/^/$s/^/.

	to put those 92 dots there, and finally put the rest of the
	command around that dot string.
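
	If you have a scripting language handy, you can also build the
	whole command line outside the editor and save it in the file
	you will pull in with :so. Here is a minimal Python sketch; the
	function name is my own, and it assumes the old and new strings
	contain no characters that are special in a substitution.

```python
def column_substitute_command(start_col, old, new):
    """Build an ex substitution that changes old to new at start_col."""
    dots = "." * (start_col - 1)     # one dot per column to skip
    # No leading colon: it is not needed in a sourced file.
    return r"%s/^\(" + dots + r"\)" + old + r"/\1" + new


print(column_substitute_command(9, "B27", "K53"))
print(column_substitute_command(93, "B27", "K53").count("."))
```

	The second call is the 92-dot case: a code starting in column
	93 needs 92 dots ahead of it.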

	MULTI-CHARACTER COLUMNS

	The other meaning of "editing in columns" has to do with text
	rather than data files. It refers to tables of data such as you
	might find accompanying a technical article, columns of text
	and/or illustrations running in parallel as you'd find on a
	newspaper page, and the like.

	Yes, Unix formatting utilities and some word processing programs
	will format your final output into columns. But you may not have
	all these utilities, you may not want to spend time trying to get
	the results you want from those benighted programs, or you may
	plan to direct your output where formatters won't work.

	Visually editing the columns of data in a table requires little
	explanation. The one thing to remember: use the R command as far
	as possible, to avoid shifting subsequent columns out of alignment
	inadvertently. This holds for creating tables, too; start by
	setting up a rectangular block of space characters, then replace
	spaces with the column entries you want, to keep your next entry
	from misaligning previous ones. This is also the best way to
	create pictures, diagrams, graphs and maps using ASCII
	characters.

	Things become problematic when you want to shift whole columns
	around -- there are no built-in Vi facilities for doing this.
	Here is what it is practical to do in the editor. As a real life
	example, consider the piece below, which I use as the tail end of
	Usenet (Net news) posts that announce Indonesian classical music
	and dance performances at a local restaurant:

	It's at the Dutch East Indies ;,,,,;,,,,;,,,,;,,,,;
	Restaurant on Oakland's /%%%%%%%%%%%%%%%%%%%%%\
	downtown waterfront. The /%%%%%%%%%%%%%%%%%%%%%%%\
	food there is very good "|""|"""|"""""|"""|""|"
	Indonesian cuisine at _|__|___|_ _|___|__|_
	reasonable prices - dinners =|==|===|=====|===|==|=
	$8.95 to $17.50. Views are ~~~~~~~~~~~( (~~~~~~~~~~~
	spectacular from the second ) )
	floor picture windows, out
	over the water to Jack London Square, Alameda and San Francisco.
	Formality is medium - cloth napkins and oil candles at the
	tables, but no supercilious waiters, and the wall decorations are
	mostly Indonesian handicrafts. The phone number for information
	and reservations is 510/444-6555.

	( ( ( | Broadway ||I The Dutch East Indies
	) ) )Jack London |==========||== Restaurant is in Jack London
	( ( ( Square |E ||8 Village, a boutiques &
	) ) ) |m ||8 bistros cluster that is just
	( ( ( JACK LONDON |b ||0 down the estuary from Jack
	) ) )VILLAGE |a || London Square. Jack London
	( ( ( Alice|r Amtrak ||f Village is rustic,
	) ) ) -----------|c station ||r picturesque, quiet and safe.
	( ( ( Street|a ||e To get there from the
	) ) ) |d Jackson||e Interstate 880 freeway
	( ( ( parking lot |e ------||-- heading north, take the Oak
	) ) ) |r Street||w Street exit and turn left;
	( ( ( |o ||a five blocks will bring you
	) ) ) | ||y to Embarca- dero on your
	( ( ( -------------||-- right, just before Oak
	) ) ) Oak Street|| curves away to the left.

	(Going south on I-880, take the Jackson Street exit and go two
	blocks straight ahead before you turn right on Oak Street.) Turn
	right onto Embarcadero and go three blocks, until you go under an
	overpass of Victorian ironwork. Immediately turn left onto Alice
	Street, where you will see Jack London Village on your right, and
	a large lot that offers validated parking on the left. Walk into
	the Village's central courtyard, and you'll see the Dutch East
	Indies on the estuary side, toward the right, and upstairs.

	To create this, I started by drawing the stylized building and
	then the map. In each case I created a large rectangular block
	of space characters, then began trying ideas with the R command
	until I had something that satisfied me. (The pavilion sketch
	eventually became wider than I had planned, so I had to run a
	:%s/.*/ & / command to give me more working space.) Next I
	put additional blocks of space characters on the left of the
	drawing and the right of the map, to make a place for the text I
	wanted to include. Then I started replacing spaces with text,
	rewriting the text as I went along to fit it in nicely. When the
	text reached the bottom of the figure I was fitting it to, I went
	to full-width text lines, entering them the usual way. A tedious
	labor, but pretty straightforward.

	Now suppose I decided to redo this piece, by moving the picture
	to where the map is now, and vice versa. A few well chosen
	substitution and deletion commands would make copies of the two
	figures minus the text, and I could just as easily copy the text
	without the two figures. But how would I recombine them?

	Short of typing the text in again from scratch, the best I could
	do is to yank the lines of each figure, one at a time, and put
	them after (or before) the appropriate text lines, one at a time.
	Not that I would have to move back and forth between files with
	each yank and put; I could yank up to 26 lines into the named
	buffers, then move to the other file and put all 26 in their
	proper places. But there is no Vi command to yank a rectangular
	block of characters.
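
	Outside the editor, the missing rectangular operation is easy
	enough to sketch. The Python below is a rough illustration
	only, assuming the text and the figure meet at one fixed
	character column on every line; swap_halves and its parameters
	are invented names, not anything the editor provides.

```python
def swap_halves(lines, split_col, gap=" "):
    """Move what lies right of split_col to the left, and vice versa."""
    lefts = [line[:split_col].rstrip() for line in lines]
    rights = [line[split_col:] for line in lines]
    # Pad the new left-hand block to a uniform width so the block
    # that follows it still starts in the same column on every line.
    width = max((len(right) for right in rights), default=0)
    return [(right.ljust(width) + gap + left).rstrip()
            for left, right in zip(lefts, rights)]


for line in swap_halves(["ab  XY", "cd  Z"], 4):
    print(line)
```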

	Also take note that I should yank using addresses that are not
	line addresses, even though I will be yanking whole lines. If I
	should yank with line addresses, putting the pieces into the
	other file must make those pieces separate lines -- then I would
	have to join each pair of lines to create the columns I want.

	Next Time Around

	In the next part of this tutorial, I will go over a host of
	complications and opportunities that come from allowing the
	replacement commands I've discussed to use metacharacters. Then
	I'll answer a couple of questions from readers that should be of
	use to quite a few of you from time to time.
