Apopuli.124
net.general
utcsrgv!utzoo!decvax!ucbvax!G:tut
Fri Mar 26 10:49:53 1982
Regularizing Regular Expressions
Does it ever bother you that the metacharacter CDOT (match any single
character) is ? in the shell and . in the editor, grep, and lex?  How
many users have you known who wanted to remove some weird file named
???? and tried "rm ????", not realizing that this removes all files
with four-character names?  And
how many times have you cursed the editor when you had to do /^\.DS/
rather than /^.DS/ to find the next display?
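
To make the two complaints concrete, here is a minimal sketch in Python,
using the standard fnmatch module as a stand-in for shell matching and the
re module as a stand-in for the editor.  The file names and text lines are
made up for illustration, and Python's matchers only approximate the 1982
shell and ed.

```python
import fnmatch
import re

# Hypothetical directory listing, used only for illustration.
names = ["????", "core", "a.out", "temp", "xyz"]

# Shell style: each ? matches exactly one character, so the pattern
# "????" selects every four-character name, not just the file "????".
four_char = fnmatch.filter(names, "????")

# Bracketing each ? makes the shell pattern match only the literal name.
literal_only = fnmatch.filter(names, "[?][?][?][?]")

# Editor style: an unescaped . matches any character, so /^.DS/ also
# finds lines that merely have "DS" in columns two and three.
lines = [".DS", "xDS is not a display", ".DE"]
unescaped = [l for l in lines if re.search(r"^.DS", l)]
escaped = [l for l in lines if re.search(r"^\.DS", l)]

print(four_char)
print(literal_only)
print(unescaped)
print(escaped)
```

The same character (? or .) is harmless in one program and a wildcard in
the other, which is the inconsistency at issue.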

There may be historical reasons why regular expressions look the way
they do, but I don't know them.  Nor do I know the reason for the
inconsistency between the shell and the editor-based programs.

How about using the underscore _ in both the shell and the editor to
indicate any single character?  This character doesn't appear often in
English text, and is not part of any macro name I know.  Even though it
does appear in C programs and in some filenames, that should not make
much difference.
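
A rough sketch of how the underscore proposal might behave, again in
Python.  The function underscore_to_regex is hypothetical and exists only
for this illustration; it assumes that _ means any single character, that
a backslash quotes the next character, and that everything else
(including ? and .) is literal.

```python
import re

def underscore_to_regex(pattern: str) -> str:
    """Translate a hypothetical pattern in which _ means 'any one
    character' and every other character is literal into a Python regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash quotes the next character, so \_ is a literal _.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        parts.append("." if ch == "_" else re.escape(ch))
        i += 1
    return "".join(parts)

def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches the underscore-style pattern."""
    return re.fullmatch(underscore_to_regex(pattern), text) is not None

print(matches("????", "core"))        # ? is now an ordinary character
print(matches("____", "core"))        # _ is the one-character wildcard
print(matches(".DS", "xDS"))          # . is now an ordinary character
print(matches("my_file", "my-file"))  # unquoted _ still acts as a wildcard
print(matches("my\\_file", "my-file"))
```

Under these assumptions "rm ????" and /^.DS/ would both do what a new
user expects.  The price is that a literal underscore in a C identifier
or file name has to be quoted, as the last two calls show.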

Some other things that may be worth considering:
1)  .* in the editor is equivalent to * in the shell
2)  NCCL (negated character class) is [^...] instead of the more
    Unix-like [!...]
3)  \( and \) are hard to type and hard to read
4)  \< and \> are useful but available only in Berkeley software
5)  egrep offers the OR | operator but no AND & operator
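
Items 1, 2, and 5 above can be sketched in Python, with fnmatch standing
in for the shell and re standing in for the editor and egrep.  The names
and lines are invented for the example, and the "AND" shown is just two
separate searches, not an operator that egrep provides.

```python
import fnmatch
import re

# Hypothetical file names, used only for illustration.
names = ["notes.txt", "a.c", "b.c", "Makefile"]

# 1) Shell * corresponds to .* in a regular expression.
shell_star = fnmatch.filter(names, "*.c")
regex_star = [n for n in names if re.fullmatch(r".*\.c", n)]

# 2) Shell [!...] corresponds to [^...] in a regular expression.
shell_neg = fnmatch.filter(names, "[!ab]*")
regex_neg = [n for n in names if re.fullmatch(r"[^ab].*", n)]

# 5) OR is written with |; AND is emulated here by requiring that
#    both patterns match the same line.
lines = ["foo bar", "foo only", "bar only"]
either = [l for l in lines if re.search(r"foo|bar", l)]
both = [l for l in lines if re.search(r"foo", l) and re.search(r"bar", l)]

print(shell_star, regex_star)
print(shell_neg, regex_neg)
print(either)
print(both)
```

In each of the first two pairs the shell and regular-expression forms
select the same names; only the spelling differs.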

               Bill Tuthill

-----------------------------------------------------------------
gopher://quux.org/ conversion by John Goerzen
of http://communication.ucsd.edu/A-News/


This Usenet Oldnews Archive
article may be copied and distributed freely, provided:

1. There is no money collected for the text(s) of the articles.

2. The following notice remains appended to each copy:

The Usenet Oldnews Archive: Compilation Copyright (C) 1981, 1996
Bruce Jones, Henry Spencer, David Wiseman.