6. OTHER ISSUES

6.1. I have a certain problem that stumps me. Where can I get help?

  Newsgroups:

     - alt.comp.editors.batch  (best choice)
     - comp.editors
     - comp.unix.questions
     - comp.unix.shell

  Send e-mail to:  [email protected]

  Your question will be posted on the "sed-users" mailing list, where
  many sed users will be able to see your question. Sending your
  question will not automatically subscribe you to the list.

6.2. How does sed compare with awk, perl, and other utilities?

  Awk is a much richer language with many features of a programming
  language, including variable names, math functions, arrays, system
  calls, etc. Its command structure is similar to sed:

     address { command(s) }

  which means that for each line or range of lines that matches the
  address, execute the command(s). In both sed and awk, an address
  can be a line number or a RE somewhere on the line, or both.

  In program size, awk is 3-10 times larger than sed. Awk has most
  of the functions of sed, but not all. Notably, sed supports
  backreferences (\1, \2, ...) to previous expressions, and awk does
  not have any comparable function or syntax.
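
  To make the shared "address { command }" shape concrete, the sketch
  below runs the same two selections through both tools and then
  shows a sed backreference. The sample text and the variable names
  are invented for this illustration only.

```shell
# Sample input; the contents are arbitrary.
input='alpha
beta
gamma'

# Regex address: act only on lines that begin with "b".
sed_re=$(printf '%s\n' "$input" | sed -n '/^b/p')
awk_re=$(printf '%s\n' "$input" | awk '/^b/ { print }')

# Line-number address: act only on line 3.
sed_ln=$(printf '%s\n' "$input" | sed -n '3p')
awk_ln=$(printf '%s\n' "$input" | awk 'NR == 3 { print }')

# Backreferences in sed: swap two captured words.
swapped=$(echo 'foo bar' | sed 's/\(foo\) \(bar\)/\2 \1/')

echo "$sed_re $awk_re $sed_ln $awk_ln $swapped"
```

  In awk the line-number test is written as a comparison on NR,
  whereas sed takes the bare number as the address.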

  Perl is a general-purpose programming language, with many features
  beyond text processing and interprocess communication, taking it
  well past awk or other scripting languages. Perl supports every
  feature sed does and has its own set of extended regular
  expressions, which give it extensive power in pattern matching and
  processing. (Note: the standard perl distribution comes with 's2p',
  a sed-to-perl conversion script. See section 3.6 for more info.)
  Like sed and awk, perl scripts do not need to be compiled into
  binary code. Like sed, perl can also run many useful "one-liners"
  from the command line, though with greater flexibility; see
  question 4.3 ("How do I make substitutions in every file in a
  directory, or in a complete directory tree?").

  On the other hand, the current version of perl is from 8 to 35
  times larger than sed in its executables alone (perl's library
  modules and allied files not included!). Further, for most simple
  tasks such as substitution, sed executes more quickly than either
  perl or awk. All these utilities serve to process input text,
  transforming it to meet our needs . . . or our arbitrary whims.

6.3. When should I use sed?

  When you need a small, fast program to modify words, lines, or
  blocks of lines in a textfile.

6.4. When should I NOT use sed?

  You should not use sed when you have "dedicated" tools which can do
  the job faster or with an easier syntax. Do not use sed when you
  only want to:

  - delete individual characters. Instead of "s/[abcd]//g", use

       tr -d "abcd"

  - squeeze sequential characters. Instead of "s/ee*/e/g", use

       tr -s "{character-set}"

  - change individual characters. Instead of "y/abcdef/ABCDEF/", use

       tr "[a-f]" "[A-F]"

  - print individual lines, based on patterns within the line itself.
    Instead, use "grep".

  - print blocks of lines, with 1 or more lines of context above
    and/or below a specific regular expression. Instead, use the GNU
    version of grep as follows:

       grep -A{number} -B{number}

  - remove individual lines, based on patterns within the line
    itself. Instead, use "grep -v".

  - print line numbers.  Instead, use "nl" or "cat -n".

  - reformat lines or paragraphs. Instead, use "fold", "fmt" or "par".

  Though sed can perfectly emulate certain functions of cat, grep,
  nl, rev, sort, tac, tail, tr, uniq, and other utilities, producing
  identical output, the native utilities are usually optimized to do
  the job more quickly than sed.
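
  A quick way to convince yourself that the dedicated tools give the
  same results is to run both forms side by side. The following is
  only a sketch; the sample strings and variable names are made up.

```shell
text='see the bees flee'

# Squeeze runs of "e": sed and tr should agree.
squeeze_sed=$(printf '%s\n' "$text" | sed 's/ee*/e/g')
squeeze_tr=$(printf '%s\n' "$text" | tr -s 'e')

# Change individual characters a-f to upper case.
upper_sed=$(printf '%s\n' "$text" | sed 'y/abcdef/ABCDEF/')
upper_tr=$(printf '%s\n' "$text" | tr 'abcdef' 'ABCDEF')

# Remove lines that contain "b".
lines='apple
banana
cherry'
drop_sed=$(printf '%s\n' "$lines" | sed '/b/d')
drop_grep=$(printf '%s\n' "$lines" | grep -v 'b')

echo "$squeeze_tr"    # prints "se the bes fle"
echo "$upper_tr"      # prints "sEE thE BEEs FlEE"
```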

6.5. When should I ignore sed and use Awk or Perl instead?

  If you can write the same script in Awk or Perl and do it in less
  time, then use Perl or Awk. There's no reason to spend an hour
  writing and debugging a sed script if you can do it in Perl in 10
  minutes (assuming that you know Perl already) and if the processing
  time or memory use is not a factor. Don't hunt pheasants with a .22
  if you have a shotgun at your side . . . unless you simply enjoy
  the challenge!

  Specifically, if you need to:

  - heavily comment what your scripts do. Use GNU sed, awk, or perl.
  - do case insensitive searching. Use gsed302, sedmod, awk or perl.
  - count fields (words) in a line. Use awk.
  - count lines in a block or objects in a file. Use awk.
  - check lengths of strings or do math operations. Use awk or perl.
  - handle very long lines or need very large buffers. Use gsed or perl.
  - handle binary data (control characters). Use perl (binmode).
  - loop through an array or list. Use awk or perl.
  - test for file existence, filesize, or fileage. Use perl or shell.
  - treat each paragraph as a line. Use awk or perl.
  - indicate /alternate|options/ in regexes. Use gsed, awk or perl.
  - use syntax like \xNN to match hex codes. Use gsed-3.02.80 or perl.
  - use (nested (regexes)) with backreferences. Use perl.
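
  For a sense of how short the awk versions of some of these tasks
  can be, here is a small sketch. The input strings and variable
  names are invented for the example.

```shell
# Count fields (words) on each line.
field_counts=$(printf 'one two three\nfour five\n' | awk '{ print NF }')

# Count the lines that match a pattern.
ok_lines=$(printf 'ok\nfail\nok\n' | awk '/ok/ { n++ } END { print n + 0 }')

# Check the length of a string and do simple math on it.
len_math=$(printf 'hello\n' | awk '{ print length($0), length($0) * 2 }')

echo "$ok_lines"    # prints "2"
echo "$len_math"    # prints "5 10"
```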

  Perl lovers: I know that perl can do everything awk can do, but
  please don't write me to complain. Why heft a shotgun when a .45
  will do? As we all know, "There is more than one way to do it."

6.6. Known limitations among sed versions

  The limits below apply to the distributed versions; the source code
  for most free versions of sed can be modified and recompiled.
  The term "no limit" when used below means there is no "fixed"
  limit. Limits are actually determined by one's hardware, memory,
  operating system, and which C library is used to compile sed.

6.6.1. Maximum line length

     GNU sed 3.02: no limit
     GNU sed 2.05: no limit
     sedmod 1.0:   4096 bytes
     HHsed:        4000 bytes

6.6.2. Maximum size for all buffers (pattern space + hold space)

     GNU sed 3.02: no limit
     GNU sed 2.05: no limit
     sedmod 1.0:   4096 bytes
     HHsed:        4000 bytes

6.6.3. Maximum number of files that can be read with 'r' command

     GNU sed 3.02: no limit
     GNU sed 2.05: total no. of r and w commands may not exceed 32
     sedmod 1.0:   total no. of r and w commands may not exceed 20

6.6.4. Maximum number of files that can be written with 'w' command

     GNU sed 3.02: no limit (but typical Unix is 253)
     GNU sed 2.05: total no. of r and w commands may not exceed 32
     sedmod 1.0:   10
     HHsed:        10

6.6.5. Limits on length of label names

     BSD sed:      8 characters
     GNU sed 3.02: no limit
     GNU sed 2.05: no limit
     HHsed:        no limit

6.6.6. Limits on length of write-file names

     BSD sed:      40 characters
     GNU sed 3.02: no limit
     GNU sed 2.05: no limit
     HHsed:        no limit

6.6.7. Limits on branch/jump commands

     HHsed:        50

  As a practical consequence, this means that HHsed will not read
  more than 50 lines into the pattern space via an N command, even if
  the pattern space is only a few hundred bytes in size. HHsed exits
  with an error message, "infinite branch loop at line {nn}".

6.7. Known bugs among sed versions

A. GNU sed v3.02.80

  (1) N does not discard the contents of the pattern space upon
  reaching the end of file; not a bug. See section 6.8.6, below.

B. GNU sed v3.02

  (1) Affects only v3.02 binaries compiled with DJGPP for MS-DOS and
  MS-Windows: 'l' (list) command does not display a lone carriage
  return (0x0D, ^M) embedded in a line.

  (2) The expression "\<" causes problems when attempting the
  following types of substitutions, which should print "+aaa +bbb":

     echo aaa bbb | sed 's/\</+/g'    # prints "+a+a+a +b+b+b"
     echo aaa bbb | sed 's/\<./+&/g'  # prints "+a+a+a +b+b+b"

  (3) The N command no longer discards the contents of the pattern
  space upon reaching the end of file. This is not a bug, it's a
  feature. See section 6.8.6 "Commands which operate differently".

C. GNU sed v2.05

  (1) If a number follows the substitute command (e.g., s/f/F/10) and
  the number exceeds the possible matches on the pattern space, the
  command 't label' always jumps to the specified label. 't' should
  jump only if the substitution was successful (or returned "true").

  (2) 'l' (list) command does not convert the following characters to
  hex values, but passes them through unchanged: 0xF7, 0xFB, 0xFC,
  0xFD, 0xFE.

  (3) A range address like "/foo/,14d" should delete every line from
  the first occurrence of "foo" until line 14, inclusive, and then if
  /foo/ occurs thereafter, delete only those lines. In gsed 2.05, if
  a second "foo" occurs in the file, that line and everything to the
  end of file will be deleted (since gsed is looking for line 14 to
  occur again!).

  (4) The regex /\'/ is not interpreted as an apostrophe or a single
  quote mark, as it should be. Instead, it is interpreted as $,
  representing the end-of-line! This can be proven by these tests:

     echo hello | gsed "/\'/d"        # entire line is deleted!
     echo hello | gsed "s/\'/YYY/"    # 'YYY' appended to string

  (5) Multiple occurrences of the 'w' command fail, as shown here,
  given that both "aaa" and "bbb" occur within the file:

     gsed -e "/aaa/w FILE" -e "/bbb/w FILE" input.txt

  (6) The expression "\<" causes problems when attempting the
  following type of substitution, which should print "+aaa +bbb":

     echo aaa bbb | sed 's/\</+/g'    # sed hangs up with no output

  The syntax 's/\<./+&/g' issues the proper output.
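
  Where "\<" misbehaves, one possible workaround is to match each run
  of non-space characters instead of the word boundary. This is only
  a sketch, and it assumes that words are separated by spaces alone:

```shell
# Prefix every space-delimited word with "+" without using \<.
result=$(echo 'aaa bbb' | sed 's/[^ ][^ ]*/+&/g')
echo "$result"    # prints "+aaa +bbb"
```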

D. GNU sed v1.18

  (1) same as #1 for GNU sed v2.05, above.

  (2) The following command will lock the computer under Win95. Echos
  is an echo command that does not issue a trailing newline:

     echos any_word | gsed "s/[ ]*$//"

  (3) same as #3 for GNU sed v2.05, above.

E. GNU sed v1.03 (by Frank Whaley)

  (1) The \w and \W escape sequences both match only nonword
  characters. \w is misdefined and should match word characters.

  (2) The underscore is defined as a nonword character; it should be
  defined as a word character.

  (3) same as #3 for GNU sed v2.05, above.

F. HHsed v1.5 (by Howard Helman)

  (1) If a number follows the substitute command (e.g., s/foo/bar/2),
  in a sed script entered from the command line, two semicolons must
  follow the number, or they must be separated by an -e switch.
  Normally, only 1 semicolon is needed to separate commands.

     echo bit bet | HHsed "s/b/n/2;;s/b/B/"          # solution 1
     echo bit bet | HHsed -e "s/b/n/2" -e "s/b/B/"   # solution 2

  (2) If the substitute command is followed by a number and a "p"
  flag, when the -n switch is used, the "p" flag must occur first.

     echo aaa | HHsed -n "s/./B/3p"    # bug! nothing prints
     echo aaa | HHsed -n "s/./B/p3"    # prints "aaB" as expected

  (3) The following commands will cause HHsed to lock the computer
  under MS-DOS or Win95. Note that they occur because of malformed
  regular expressions which will match no characters.

     sed -n "p;s/\<//g;" file
     sed -n "p;s/[char-set]*//g;" file

  (4) The range command '/RE1/,/RE2/' in HHsed will match one line if
  both regexes occur on the same line (see section 6.8.5, below).
  Though this could be construed as a feature, it should probably be
  considered a bug since its operation differs from every other
  version of sed. For example, '/----/,/----/{s/^/>>/;}' should put
  two angle brackets ">>" before every line which is sandwiched
  between two rows of 4 or more hyphens. With HHsed, this command
  will only prefix the hyphens themselves with the angle brackets.

  (5) If the hold space is empty, the H command copies the pattern
  space to the hold space but fails to prepend a leading newline. The
  H command is supposed to add a newline, followed by the contents of
  the pattern space, to the hold space at all times. A workaround is
  "{G;s/^\(.*\)\(\n\)$/\2\1/;H;s/\n$//;}", but it requires knowing
  that the hold space is empty and using the command only once.
  Another alternative is to use the G or the A command alone at key
  points in the script.

  (6) If grouping is followed by an '*' or '+' operator, HHsed does
  not match the pattern, but issues no warning. See below:

     echo aaa | HHsed "/\(a\)*/d"      # nothing is deleted
     echo aaa | HHsed "/\(a\)+/d"      # nothing is deleted
     echo aaa | HHsed "s/\(a\)*/\1B/"  # nothing is changed
     echo aaa | HHsed "s/\(a\)+/\1B/"  # nothing is changed

  (7) If grouping is followed by an interval expression, HHsed halts
  with the error message "garbled command", in all of the following
  examples:

     echo aaa | HHsed "/\(a\)\{3\}/d"
     echo aaa | HHsed "/\(a\)\{1,5\}/d"
     echo aaa | HHsed "s/\(a\)\{3\}/\1B/"

  (8) In interval expressions, 0 is not supported. E.g., \{0,3\}

G. sedmod v1.0 (by Hern Chen)

  Technically, the following are limits (or features?) of sedmod, not
  bugs, since the docs for sedmod do not claim to support these
  missing features.

  (1) sedmod does not support standard range arguments \{...\}
  present in nearly all versions of sed.

  (2) If grouping is followed by an '*' or '+' operator, sedmod gives
  a "garbled command" message. However, if the grouped expressions
  are string literals with no metacharacters, a partial workaround
  can be done like so:

     \(string\)\1*    # matches 1 or more instances of 'string'
     \(string\)\1+    # matches 2 or more instances of 'string'

  (3) sedmod does not support a numeric argument after the s///
  command, as in 's/a/b/3', present in nearly all versions of sed.

  The following are bugs in sedmod v1.0:

  (4) When the -i (ignore case) switch is used, the '/regex/d'
  command is not properly obeyed. Sedmod may miss one or more lines
  matching the expression, regardless of where they occur in the
  script. Workaround: use "/regex/{d;}" instead.

H. HP-UX sed

  (1) Versions of HP-UX sed up to and including version 10.20 are
  buggy. According to the README file, which comes with the GNU cc
  at <ftp://ftp.ntua.gr/pub/gnu/sed-2.05.bin.README>:

  "When building gcc on a hppa*-*-hpux10 platform, the `fixincludes'
  step (which involves running a sed script) fails because of a bug
  in the vendor's implementation of sed.  Currently the only known
  workaround is to install GNU sed before building gcc.  The file
  sed-2.05.bin.hpux10 is a precompiled binary for that platform."

I. SunOS 4.1 sed

  (1) Bug occurs in RE pattern matching when a non-null '[char-set]*'
  is followed by a null '\NUM' pattern recall, illustrated here and
  reported by Greg Ubben:

     s/\(a\)\(b*\)cd\1[0-9]*\2foo/bar/  # between '[0-9]*' and '\2'
     s/\(a\{0,1\}\).\{0,1\}\1/bar/      # between '.\{0,1\}' and '\1'

  Workaround: add a do-nothing 'X*' expression which will not match
  any characters on the line between the two components. E.g.,

     s/\(a\)\(b*\)cd\1[0-9]*X*\2foo/bar/
     s/\(a\{0,1\}\).\{0,1\}X*\1/bar/

J. SunOS 5.6 sed

  (1) If grouping is followed by an asterisk, SunOS sed does not match
  the null string, which it should do. The following command:

     echo foo | sed 's/f\(NO-MATCH\)*/g\1/'

  should transform "foo" to "goo" under normal versions of sed.

K. Ultrix 4.3 sed

  (1) If grouping is followed by an asterisk, Ultrix sed replies with
  "command garbled", as shown in the following example:

     echo foo | sed 's/f\(NO-MATCH\)*/g\1/'

  (2) If grouping is followed by a numeric operator such as \{0,9\},
  Ultrix sed does not find the match.

L. Digital Unix sed

  (1) The following comes from the man pages for sed distributed with
  new, 1998 versions of Digital Unix (reformatted to fit our
  margins):

  [Digital]  The h subcommand for sed does not work properly.  When
  you use the  h subcommand to place text into the hold area, only
  the last line of the specified text is saved.  You can use the H
  subcommand to append text to the hold area. The H subcommand and
  all others dealing with the hold area work correctly.

  (2) "$d" command issues an error message, "cannot parse".  Reported
  by Carlos Duarte on 8 June 1998.

6.8. Known incompatibilities between sed versions

6.8.1. Issuing commands from the command line

  Most versions of sed permit multiple commands to be issued on the
  command line, separated by a semicolon (;). Thus,

     sed 'G;G' file

  should triple-space a file. However, certain commands REQUIRE
  separate expressions on the command line. These include:

     - all labels (':a', ':more', etc.)
     - all branching instructions ('b', 't')
     - commands to read and write files ('r' and 'w')
     - any closing brace, '}'

  If these commands are used, they must be the LAST commands of an
  expression. Subsequent commands must use another expression
  (another -e switch plus arguments).  E.g.,

     sed  -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/' files

  GNU sed and HHsed v1.5 allow these commands to be followed by a
  semicolon, and the previous script can be written like this:

     sed  ':a;s/^.\{1,77\}$/ &/;ta;s/\( *\)\1/\1/' files
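
  The two forms can be compared directly. The sketch below
  right-aligns a word in a 10-column field, which is a narrower
  variant of the 78-column script above. The second command assumes
  a sed that accepts semicolons after labels and branches, as GNU sed
  does.

```shell
# Portable form: the label and the branch each end their own -e.
portable=$(echo 'abc' | sed -e :a -e 's/^.\{1,9\}$/ &/;ta')

# Semicolon form: everything in one expression (GNU sed assumed).
gnu_form=$(echo 'abc' | sed ':a;s/^.\{1,9\}$/ &/;ta')

printf '[%s]\n[%s]\n' "$portable" "$gnu_form"
```

  Both commands loop, adding one leading space per pass, until the
  line is 10 characters long.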

  Versions differ in implementing the 'a' (append), 'c' (change), and
  'i' (insert) commands:

     sed "/foo/i New text here"              # HHsed/sedmod/gsed-30280
     gsed -e "/foo/i\\" -e "New text here"   # GNU sed
     sed1 -e "/foo/i" -e "New text here"     # one version of sed
     sed2 "/foo/i\ New text here"            # another version

6.8.2. Using comments (prefixed by the '#' sign)

  Most versions of sed permit comments to appear in sed scripts only
  on the first line of the script. Comments on line 2 or thereafter
  are not recognized and will generate an error like "unrecognized
  command" or "command [bad-line-here] has trailing garbage".

  GNU sed, HHsed, sedmod, and HP-UX sed permit comments to appear on
  any line of the script, except after labels and branching commands
  (b,t), provided that a semicolon (;) occurs after the command
  itself. This syntax makes sed similar to awk and perl, which use a
  similar commenting structure in their scripts.  Thus,

     # GNU style sed script
     $!N;                        # except for last line, get next line
     s/^\([0-9]\{5\}\).*\n\1.*//;    # if first 5 digits of each line
                                     # match, delete BOTH lines.
     t skip
     P;                              # print 1st line only if no match
     :skip
     D;                    # delete 1st line of pattern space and loop
     #---end of script---

  is a valid script for GNU sed and Helman's sed, but is unrecognized
  for most other versions of sed.
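
  One way to try a commented script like this is to save it to a file
  and run it with the -f switch. The sketch below assumes a sed that
  accepts comments on any line (such as GNU sed) and a system that
  provides mktemp; the sample input is invented.

```shell
# Save the commented script to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
# GNU style sed script
$!N;                          # except for last line, get next line
s/^\([0-9]\{5\}\).*\n\1.*//;  # if first 5 digits of both lines
                              # match, delete BOTH lines.
t skip
P;                            # print 1st line only if no match
:skip
D;                            # delete 1st line of pattern space and loop
EOF

# Lines 1 and 2 share the prefix 12345, so only line 3 survives.
survivors=$(printf '12345 a\n12345 b\n99999 c\n' | sed -f "$script")

# With no shared prefix, both lines are printed unchanged.
untouched=$(printf '11111 a\n22222 b\n' | sed -f "$script")

rm -f "$script"
echo "$survivors"    # prints "99999 c"
```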

6.8.3. Special syntax in REs

A. GNU sed v2.05 and higher versions

  BEGIN~STEP selection: GNU sed can select a series of lines in the
  form M~N, where M and N are integers (with gsed v2.05, M must be
  less than N). Beginning at line M (M may equal 0), every Nth line
  is selected. Thus,

     gsed '1~3d' file    # delete every 3rd line, starting with line 1
                         # deletes lines 1, 4, 7, 10, 13, 16, ...

     gsed -n '2~5p' file # print every 5th line, starting with line 2
                         # prints lines 2, 7, 12, 17, 22, 27, ...

  With gsed v3.02, M may be any valid line number. With gsed v2.05,
  if M is greater than or equal to N (the STEP value), nothing will
  be selected, except in one pointless case, 0~0, which selects every
  line.
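
  For scripts that must also run where M~N is unavailable, awk's NR
  variable offers one possible substitute. The sketch below assumes
  GNU sed for the first command and the seq utility for generating
  test input; the variable names are invented.

```shell
# GNU sed step address: every 5th line, starting with line 2.
step=$(seq 1 20 | sed -n '2~5p' | paste -s -d ' ' -)

# Equivalent selection with awk, which does not need the extension.
portable=$(seq 1 20 | awk 'NR % 5 == 2' | paste -s -d ' ' -)

echo "$step"    # prints "2 7 12 17"
```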

  The following expressions can be used for /RE/ addresses or in the
  LHS of a substitution:

     \`  - matches the beginning of the pattern space (same as "^")
     \'  - matches the end of the pattern space (same as "$")
     \?  - 0 or 1 occurrences of previous character: same as \{0,1\}
     \+  - 1 or more occurrences of previous character: same as \{1,\}
     \|  - matches the string on either side, e.g., foo\|bar
     \b  - boundary between word and nonword chars (reversible)
     \B  - boundary between 2 word or between 2 nonword chars
     \n  - embedded newline (usable after N, G, or similar commands)
     \w  - any word character: [A-Za-z0-9_]
     \W  - any nonword char: [^A-Za-z0-9_]
     \<  - boundary between nonword and word character
     \>  - boundary between word and nonword character

  On \b, \B, \<, and \>, see section 6.8.4 ("Word boundaries"),
  below.

  Beginning with version 3.02.80, the following escape sequences can
  now be used on both sides of a "s///" substitution:

     \a      "alert" beep     (BEL, Ctrl-G, 0x07)
     \f      formfeed         (FF, Ctrl-L, 0x0C)
     \n      newline          (LF, Ctrl-J, 0x0A)
     \r      carriage-return  (CR, Ctrl-M, 0x0D)
     \t      horizontal tab   (HT, Ctrl-I, 0x09)
     \v      vertical tab     (VT, Ctrl-K, 0x0B)
     \oNNN   a character with the octal value NNN
     \dNNN   a character with the decimal value NNN
     \xNN    a character with the hexadecimal value NN

  Note that versions of gsed before 3.02.80 do not have any syntax
  for designating characters in octal or hex notation. Traditionally,
  \ooo or \hh or \xhh have been used by the GNU project to do this,
  but they are not implemented in those earlier versions. Note that
  GNU sed also supports "character classes", a POSIX extension to
  regexes, described in section 3.7, above.
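
  As an illustration of the escape sequences, the sketch below
  replaces a tab with a comma in two ways. The first command assumes
  a sed that understands \t (such as GNU sed 3.02.80 or later); the
  second embeds a literal tab in the script and should not depend on
  that support. The variable names are invented.

```shell
# Use the \t escape directly in the regex.
with_escape=$(printf 'a\tb\n' | sed 's/\t/,/')

# Fallback: let the shell supply a literal tab character.
tab=$(printf '\t')
with_literal=$(printf 'a\tb\n' | sed "s/$tab/,/")

echo "$with_literal"    # prints "a,b"
```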

B. GNU sed v1.03 (by Frank Whaley)

  When used with the -x (extended) switch on the command line, or
  when '#x' occurs as the first line of a script, Whaley's gsed103
  supports the following expressions in both the LHS and RHS of a
  substitution:

     \|      matches the expression on either side
     ?       0 or 1 occurrences of previous RE: same as \{0,1\}
     +       1 or more occurrences of previous RE: same as \{1,\}
     \a      "alert" beep     (BEL, Ctrl-G, 0x07)
     \b      backspace        (BS, Ctrl-H, 0x08)
     \f      formfeed         (FF, Ctrl-L, 0x0C)
     \n      newline          (LF, Ctrl-J, 0x0A)
     \r      carriage-return  (CR, Ctrl-M, 0x0D)
     \t      horizontal tab   (HT, Ctrl-I, 0x09)
     \v      vertical tab     (VT, Ctrl-K, 0x0B)
     \bBBB   binary char, where BBB are 1-8 binary digits, [0-1]
     \dDDD   decimal char, where DDD are 1-3 decimal digits, [0-9]
     \oOOO   octal char, where OOO are 1-3 octal digits, [0-7]
     \xXX    hex char, where XX are 1-2 hex digits, [0-9A-F]

  In normal mode, with or without the -x switch, the following escape
  sequences are also supported in regex addressing or in the LHS of a
  substitution:

     \`      matches beginning of pattern space: same as /^/
     \'      matches end of pattern space: same as /$/
     \B      boundary between 2 word or 2 nonword characters
     \w      any nonword character [*BUG!* should be a word char]
     \W      any nonword character: same as /[^A-Za-z0-9]/
     \<      boundary between nonword and word char
     \>      boundary between word and nonword char

C. HHsed v1.5 (by Howard Helman)

  The following expressions can be used for /RE/ addresses or in the
  LHS and RHS of a substitution:

     +    - 1 or more occurrences of previous RE: same as \{1,\}
     \a   - bell         (ASCII 07, 0x07)
     \b   - backspace    (ASCII 08, 0x08)
     \e   - escape       (ASCII 27, 0x1B)
     \f   - formfeed     (ASCII 12, 0x0C)
     \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)
     \r   - return       (ASCII 13, 0x0D)
     \t   - tab          (ASCII 09, 0x09)
     \v   - vertical tab (ASCII 11, 0x0B)
     \xhh - the ASCII character corresponding to 2 hex digits hh.
     \<   - boundary between nonword and word character
     \>   - boundary between word and nonword character

D. sedmod v1.0 (by Hern Chen)

  The following expressions can be used for /RE/ addresses or in the
  LHS of a substitution:

     +    - 1 or more occurrences of previous RE: same as \{1,\}
     \a   - any alphanumeric: same as [a-zA-Z0-9]
     \A   - 1 or more alphas: same as \a+
     \d   - any digit: same as [0-9]
     \D   - 1 or more digits: same as \d+
     \h   - any hex digit: same as [0-9a-fA-F]
     \H   - 1 or more hexdigits: same as \h+
     \l   - any letter: same as [A-Za-z]
     \L   - 1 or more letters: same as \l+
     \n   - newline      (read as 2 bytes, 0D 0A or ^M^J, in DOS)
     \s   - any whitespace character: space, tab, or vertical tab
     \S   - 1 or more whitespace chars: same as \s+
     \t   - tab          (ASCII 09, 0x09)
     \<   - boundary between nonword and word character
     \>   - boundary between word and nonword character

  The following expressions can be used in the RHS of a substitution.
  "Elements" refer to \1 .. \9, &, $0, or $1 .. $9:

     &    - insert regexp defined on LHS
     \e   - end case conversion of next element
     \E   - end case conversion of remaining elements
     \l   - change next element to lower case
     \L   - change remaining elements to lower case
     \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)
     \t   - tab          (ASCII 09, 0x09)
     \u   - change next element to upper case
     \U   - change remaining elements to upper case
     $0   - insert pattern space BEFORE the substitution
     $1-$9 - match Nth word on the pattern space

E. UnixDos sed

  The following expressions can be used in text, LHS, and RHS:

     \n   - newline      (printed as 2 bytes, 0D 0A or ^M^J, in DOS)

6.8.4. Word boundaries

  GNU sed, HHsed, and sedmod use certain symbols to define the
  boundary between a "word character" and a nonword character. A word
  character fits the regex "[A-Za-z0-9_]". Note: a word character
  includes the underscore "_" but not the hyphen, probably because
  the underscore is permissible as a label in sed and in other
  scripting languages. (In gsed103, a word character did NOT include
  the underscore; it included alphanumerics only.)

  These symbols include '\<' and '\>' (gsed, HHsed, sedmod) and '\b'
  and '\B' (gsed only). Note that the boundary symbols do not
  represent a character, but a position on the line. Word boundaries
  are used with literal characters or character sets to let you match
  (and delete or alter) whole words without affecting the spaces or
  punctuation marks outside of those words. They can only be used in
  a "/pattern/" address or in the LHS of a 's/LHS/RHS/' command. The
  following table shows how these symbols may be used in HHsed and
  GNU sed. Sedmod matches the syntax of HHsed.

     Match position      Possible word boundaries   HHsed   GNU sed
     ---------------------------------------------------------------
     start of word    [nonword char]^[word char]      \<    \< or \b
     end of word         [word char]^[nonword char]   \>    \> or \b
     middle of word      [word char]^[word char]     none      \B
     outside of word  [nonword char]^[nonword char]  none      \B
     ---------------------------------------------------------------
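
  The rows of the table can be tried out as follows. This sketch
  assumes GNU sed, since \B is not available in HHsed or sedmod; the
  sample line is invented.

```shell
line='cat concat cats'

# \< and \> anchor both ends, so only the standalone word changes.
whole=$(echo "$line" | sed 's/\<cat\>/dog/g')

# \B demands a non-boundary, so only the "cat" inside "concat" changes.
inner=$(echo "$line" | sed 's/\Bcat/dog/g')

echo "$whole"    # prints "dog concat cats"
echo "$inner"    # prints "cat condog cats"
```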

6.8.5. Range addressing with GNU sed and HHsed

  When addressing a range of lines, as in the following example to
  delete all lines between /RE1/ and /RE2/,

     sed '/RE1/,/RE2/d' file

  if /RE1/ and /RE2/ both occur on the same line, HHsed will delete
  that single line and then look forward in the file for the next
  occurrence of /RE1/ to attempt the deletion. GNU sed will match the
  first line containing /RE1/ but will look forward to the next and
  succeeding lines to match /RE2/. If /RE1/ and /RE2/ cannot be found
  on two different lines, nothing will be deleted.
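
  The GNU sed behavior described above can be checked with a small
  test. This is a sketch with invented input, where START and END
  stand in for /RE1/ and /RE2/:

```shell
input='keep 1
START and END on one line
middle
END
keep 2'

# The search for /END/ begins on the line AFTER the one that matched
# /START/, so lines 2 through 4 are deleted, not line 2 alone.
kept=$(printf '%s\n' "$input" | sed '/START/,/END/d')
echo "$kept"
```

  Under this interpretation only "keep 1" and "keep 2" remain.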

  GNU sed v2.05 has a bug in range addressing (see section 6.7.C(3),
  above). This was fixed in gsed v3.02.
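
  To check whether a given sed handles this case correctly, a test
  along the following lines may help. It is a sketch with invented
  input that uses line 3 rather than line 14 to keep the sample
  short.

```shell
input='a
foo
b
c
foo
d'

# The first "foo" (line 2) opens a range that closes at line 3.
# The second "foo" (line 5) is already past line 3, so it is deleted
# on its own and line 6 survives.
kept=$(printf '%s\n' "$input" | sed '/foo/,3d' | paste -s -d ' ' -)
echo "$kept"    # prints "a c d"
```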

  GNU sed v3.02.80 supports 0 in range addressing, which means that
  the range "0,/RE/" will match every line from the top of the file
  to the first line containing /RE/, inclusive, and if /RE/ occurs on
  the first line of the file, only line 1 will be matched.
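
  The difference between "1,/RE/" and "0,/RE/" shows up only when the
  regex matches on line 1. The sketch below assumes a GNU sed that
  supports the 0 address; the input is invented.

```shell
input='foo first
bar
foo second
baz'

# "1,/foo/" cannot end on line 1, so it runs to the NEXT "foo".
from_one=$(printf '%s\n' "$input" | sed '1,/foo/d')

# "0,/foo/" may end on line 1 itself, so only line 1 is deleted.
from_zero=$(printf '%s\n' "$input" | sed '0,/foo/d')

echo "$from_one"    # prints "baz"
```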

6.8.6. Commands which operate differently

A. GNU sed version 3.02 and 3.02.80

  The N command no longer discards the contents of the pattern space
  upon reaching the end of file. This is not a bug, it's a feature.
  However, it breaks certain scripts which relied on the older
  behavior of N.

  'N' adds the Next line to the pattern space, enabling multiple
  lines to be stored and acted upon. Upon reaching the last line of
  the file, if the N command was issued again, the contents of the
  pattern space would be silently deleted and the script would abort
  (this has been the traditional behavior). For this reason, sed
  users generally wrote:

     $!N;   # to add the Next line to every line but the last one.

  However, certain sed scripts relied on this behavior, such as the
  script to delete trailing blank lines at the end of a file (see
  script #12 in section 3.2, "Common one-line sed scripts", above).
  Also, classic textbooks such as Dale Dougherty and Arnold Robbins'
  sed & awk documented the older behavior.

  The GNU sed maintainer felt that despite the portability problems
  this would cause, changing the N command to print (rather than
  delete) the pattern space was more consistent with one's intuitions
  about how a command to "append the Next line" ought to behave.
  Another fact favoring the change was that "{N;command;}" will
  delete the last line if the file has an odd number of lines, but
  print the last line if the file has an even number of lines.

  To convert scripts which used the former behavior of N (deleting
  the pattern space upon reaching the EOF) to scripts compatible with
  all versions of sed, change a lone "N;" to "$d;N;".
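
  Both idioms can be seen in the following sketch, which joins lines
  in pairs on a three-line input. The input and variable names are
  invented for the example.

```shell
# "$!N" joins lines in pairs and leaves an unpaired last line alone.
paired=$(printf '1\n2\n3\n' | sed '$!N;s/\n/ /')

# "$d;N" gives the traditional result, in which the unpaired last
# line is dropped, however a bare N behaves at end of file.
dropped=$(printf '1\n2\n3\n' | sed '$d;N;s/\n/ /')

echo "$dropped"    # prints "1 2"
```

  Here "$paired" holds two lines, "1 2" and "3", while "$dropped"
  holds only "1 2".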


[end-of-file]

 89. mailto:[email protected]
 90. mailto:[email protected]
 91. http://www.urc.bl.ac.yu/manuals/progunix/sed.txt
 92. http://www.softlab.ntua.gr/unix/docs/sed.txt
 93. http://plan9.bell-labs.com/7thEdMan/vol2/sed
 94. http://cm.bell-labs.com/7thEdMan/vol2/sed
 95. http://www.dbnet.ece.ntua.gr/~george/sed/sedtut_1.html
 96. http://wuarchive.wustl.edu/systems/ibmpc/garbo.uwasa.fi/editor/u-sedit2.zip
 97. ftp://ftp.cs.umu.se/pub/pc/u-sedit2.zip
 98. ftp://ftp.uni-stuttgart.de/pub/systems/msdos/util/unixlike/u-sedit2.zip
 99. ftp://sunsite.icm.edu.pl/vol/d2/garbo/pc/editor/u-sedit2.zip
100. ftp://ftp.sogang.ac.kr/.1/msdos_garbo/editor/u-sedit2.zip
101. http://www.cornerstonemag.com/sed/u-sedit3.zip
102. http://www.dreamwvr.com/sed-info/sed-faq.html
103. http://www.math.fu-berlin.de/~leitner/sed/tutorial.html
104. http://dontask.caltech.edu:457/cgi-bin/printchapter/OSUserG/BOOKCHAPTER-14.html
105. http://www.multisoft.it:457/OSUserG/_Manipulating_text_with_sed.html
106. ftp://ftp.u-aizu.ac.jp/u-aizu/doc/Tech-Report/1997/97-2-007.tar.gz
107. mailto:[email protected]
108. http://seders.icheme.org/
109. http://www.cis.nctu.edu.tw/~gis84806/sed/
110. http://www.math.fu-berlin.de/~guckes/sed/
111. http://www.math.fu-berlin.de/~leitner/sed/
112. http://www.dbnet.ece.ntua.gr/~george/sed/
113. http://www.cornerstonemag.com/sed/
114. http://spacsun.rice.edu/FAQ/sed.html
115. ftp://algos.inesc.pt/pub/users/cdua/scripts/sed
116. ftp://algos.inesc.pt/pub/users/cdua/scripts/sh
117. http://www.cornerstonemag.com/sed/sed1line.txt
118. http://www.dbnet.ece.ntua.gr/~george/sed/1liners.html
119. http://www.opengroup.org/onlinepubs/7908799/xcu/sed.html
120. http://ftp.uni-klu.ac.at/sed/sed.html
121. http://www.bluesky.com.au:457/OSUserG/_Comments_in_sed.html
122. http://www.multisoft.it:457/OSUserG/_Using_sed_main.html
123. http://www.delorie.com/djgpp/faq/converting/asm2s-sed.html
124. http://www.altavista.com/cgi-bin/query?pg=q&kl=XX&stype=stext&q=%22sed+script%22
125. http://www.google.com/search?q=%22sed+script%22
126. http://www.hotbot.com/?MT=%22sed+script%22&SM=MC&DV=0&LG=any&DC=10&DE=2
127. http://hiwaay.net/~crispen/src/mail2html.zip
128. http://www.fys.uio.no/~hakonrk/vim/syntax/sed.vim
129. http://users.cybercity.dk/~bse26236/batutil/help/SED.HTM
130. http://www.cornerstonemag.com/sed/sed1line.txt
131. mailto:[email protected]
132. ftp://garbo.uwasa.fi/pc/ts/tsbat61.zip
133. ftp://hobbes.nmsu.edu/pub/os2/util/disk/forall72.zip
134. http://www.geocities.com/SiliconValley/Lakes/2414/fortn711.zip
135. http://garbo.uwasa.fi/pc/filefind/target15.zip
136. mailto:[email protected]
137. http://www.buerg.com/list.html
138. http://www.simtel.net/pub/simtelnet/gnu/djgpp/v2gnu/txt122b.zip
139. http://www.simtel.net/pub/simtelnet/gnu/djgpp/v2misc/csdpmi4b.zip
140. ftp://ftp.cdrom.com/pub/simtelnet/gnu/djgpp/v2misc/csdpmi4b.zip
141. mailto:[email protected]
142. ftp://ftp.ntua.gr/pub/gnu/sed-2.05.bin.README
