awk.1 - 9base - revived minimalist port of Plan 9 userland to Unix | |
git clone git://git.suckless.org/9base | |
--- | |
awk.1 (10645B) | |
--- | |
1 .TH AWK 1 | |
2 .SH NAME | |
3 awk \- pattern-directed scanning and processing language | |
4 .SH SYNOPSIS | |
5 .B awk | |
6 [ | |
7 .BI -F fs | |
8 ] | |
9 [ | |
10 .BI -v | |
11 .I var=value | |
12 ] | |
13 [ | |
14 .BI -mr n | |
15 ] | |
16 [ | |
17 .BI -mf n | |
18 ] | |
19 [ | |
20 .B -f | |
21 .I prog | |
22 | | |
23 .I prog | |
24 ] | |
25 [ | |
26 .I file ... | |
27 ] | |
28 .SH DESCRIPTION | |
29 .I Awk | |
30 scans each input | |
31 .I file | |
32 for lines that match any of a set of patterns specified literally in | |
33 .IR prog | |
34 or in one or more files | |
35 specified as | |
36 .B -f | |
37 .IR file . | |
38 With each pattern | |
39 there can be an associated action that will be performed | |
40 when a line of a | |
41 .I file | |
42 matches the pattern. | |
43 Each line is matched against the | |
44 pattern portion of every pattern-action statement; | |
45 the associated action is performed for each matched pattern. | |
46 The file name | |
47 .L - | |
48 means the standard input. | |
49 Any | |
50 .IR file | |
51 of the form | |
52 .I var=value | |
53 is treated as an assignment, not a file name, | |
54 and is executed at the time it would have been opened if it were a file … | |
55 The option | |
56 .B -v | |
57 followed by | |
58 .I var=value | |
59 is an assignment to be done before | |
60 .I prog | |
61 is executed; | |
62 any number of | |
63 .B -v | |
64 options may be present. The | |
65 .B \-F | |
66 .IR fs | |
67 option defines the input field separator to be the regular expression | |
68 .IR fs . | |
69 .PP | |
70 An input line is normally made up of fields separated by white space, | |
71 or by regular expression | |
72 .BR FS . | |
73 The fields are denoted | |
74 .BR $1 , | |
75 .BR $2 , | |
76 \&..., while | |
77 .B $0 | |
78 refers to the entire line. | |
79 If | |
80 .BR FS | |
81 is null, the input line is split into one field per character. | |
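As a rough illustration of the field rules above, the following sketch runs awk from a POSIX shell. The input records and the shell variable names are made up for the example.

```sh
# Default splitting: runs of blanks and tabs separate the fields.
out1=$(printf 'alpha  beta\tgamma\n' | awk '{ print NF, $2 }')
printf '%s\n' "$out1"    # prints: 3 beta

# -F sets FS; here a colon separates the fields of a made-up record.
out2=$(printf 'root:x:0:0\n' | awk -F: '{ print $1, $3, NF }')
printf '%s\n' "$out2"    # prints: root 0 4
```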
82 .PP | |
83 To compensate for inadequate implementation of storage management, | |
84 the | |
85 .B \-mr | |
86 option can be used to set the maximum size of the input record, | |
87 and the | |
88 .B \-mf | |
89 option to set the maximum number of fields. | |
90 .PP | |
91 A pattern-action statement has the form | |
92 .IP | |
93 .IB pattern " { " action " } | |
94 .PP | |
95 A missing | |
96 .BI { " action " } | |
97 means print the line; | |
98 a missing pattern always matches. | |
99 Pattern-action statements are separated by newlines or semicolons. | |
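A small sketch of the two defaults just described, using invented three-line input: a pattern with no action prints the matching lines, and an action with no pattern runs on every line.

```sh
# Pattern only: lines containing "t" are printed unchanged.
out1=$(printf 'one\ntwo\nthree\n' | awk '/t/')
printf '%s\n' "$out1"

# Action only: the counter is incremented for every input line.
out2=$(printf 'one\ntwo\nthree\n' | awk '{ n++ } END { print n }')
printf '%s\n' "$out2"    # prints: 3
```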
100 .PP | |
101 An action is a sequence of statements. | |
102 A statement can be one of the following: | |
103 .PP | |
104 .EX | |
105 .ta \w'\fLdelete array[expression]'u | |
106 if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]… | |
107 while(\fI expression \fP)\fI statement\fP | |
108 for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI stateme… | |
109 for(\fI var \fPin\fI array \fP)\fI statement\fP | |
110 do\fI statement \fPwhile(\fI expression \fP) | |
111 break | |
112 continue | |
113 {\fR [\fP\fI statement ... \fP\fR] \fP} | |
114 \fIexpression\fP #\fR commonly\fP\fI var = expression\fP | |
115 print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\… | |
116 printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI… | |
117 return\fR [ \fP\fIexpression \fP\fR]\fP | |
118 next #\fR skip remaining patterns on this input line\fP | |
119 nextfile #\fR skip rest of this file, open next, start at top\fP | |
120 delete\fI array\fP[\fI expression \fP] #\fR delete an array eleme… | |
121 delete\fI array\fP #\fR delete all elements of array\fP | |
122 exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; stat… | |
123 .EE | |
124 .DT | |
125 .PP | |
126 Statements are terminated by | |
127 semicolons, newlines or right braces. | |
128 An empty | |
129 .I expression-list | |
130 stands for | |
131 .BR $0 . | |
132 String constants are quoted \&\fL"\ "\fR, | |
133 with the usual C escapes recognized within. | |
134 Expressions take on string or numeric values as appropriate, | |
135 and are built using the operators | |
136 .B + \- * / % ^ | |
137 (exponentiation), and concatenation (indicated by white space). | |
138 The operators | |
139 .B | |
140 ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?: | |
141 are also available in expressions. | |
142 Variables may be scalars, array elements | |
143 (denoted | |
144 .IB x [ i ] ) | |
145 or fields. | |
146 Variables are initialized to the null string. | |
147 Array subscripts may be any string, | |
148 not necessarily numeric; | |
149 this allows for a form of associative memory. | |
150 Multiple subscripts such as | |
151 .B [i,j,k] | |
152 are permitted; the constituents are concatenated, | |
153 separated by the value of | |
154 .BR SUBSEP . | |
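The sketch below shows string subscripts and a multiple subscript in use. The array names and the input are assumptions chosen for the example.

```sh
out=$(printf 'a 1\nb 2\na 3\n' | awk '
    { sum[$1] += $2 }                 # subscripts are the strings "a" and "b"
    END {
        print sum["a"], sum["b"]
        m["x", "y"] = 1               # stored under "x" SUBSEP "y"
        if (("x" SUBSEP "y") in m)
            print "joined by SUBSEP"
    }')
printf '%s\n' "$out"
```

Because `sum` is uninitialized at first use, each element starts as the null string and acts as 0 under `+=`.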
155 .PP | |
156 The | |
157 .B print | |
158 statement prints its arguments on the standard output | |
159 (or on a file if | |
160 .BI > file | |
161 or | |
162 .BI >> file | |
163 is present or on a pipe if | |
164 .BI | cmd | |
165 is present), separated by the current output field separator, | |
166 and terminated by the output record separator. | |
167 .I file | |
168 and | |
169 .I cmd | |
170 may be literal names or parenthesized expressions; | |
171 identical string values in different statements denote | |
172 the same open file. | |
173 The | |
174 .B printf | |
175 statement formats its expression list according to the format | |
176 (see | |
177 .IR fprintf (2)) . | |
178 The built-in function | |
179 .BI close( expr ) | |
180 closes the file or pipe | |
181 .IR expr . | |
182 The built-in function | |
183 .BI fflush( expr ) | |
184 flushes any buffered output for the file or pipe | |
185 .IR expr . | |
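A minimal sketch of printing to a pipe and closing it, assuming a `sort` command is available on the path. Both `print` statements use the same string value, so they write to the same open pipe.

```sh
out=$(awk 'BEGIN {
    cmd = "sort"
    print "pear"  | cmd
    print "apple" | cmd
    close(cmd)            # wait for sort to finish and write its output
    print "done"
}')
printf '%s\n' "$out"
```

Without the `close`, the sorted lines would normally appear only when awk exits, after `done`.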
186 .PP | |
187 The mathematical functions | |
188 .BR exp , | |
189 .BR log , | |
190 .BR sqrt , | |
191 .BR sin , | |
192 .BR cos , | |
193 and | |
194 .BR atan2 | |
195 are built in. | |
196 Other built-in functions: | |
197 .TF length | |
198 .TP | |
199 .B length | |
200 the length of its argument | |
201 taken as a string, | |
202 or of | |
203 .B $0 | |
204 if no argument. | |
205 .TP | |
206 .B rand | |
207 random number on [0,1) | |
208 .TP | |
209 .B srand | |
210 sets seed for | |
211 .B rand | |
212 and returns the previous seed. | |
213 .TP | |
214 .B int | |
215 truncates to an integer value | |
216 .TP | |
217 .B utf | |
218 converts its numerical argument, a character number, to a | |
219 .SM UTF | |
220 string | |
221 .TP | |
222 .BI substr( s , " m" , " n\fL) | |
223 the | |
224 .IR n -character | |
225 substring of | |
226 .I s | |
227 that begins at position | |
228 .IR m | |
229 counted from 1. | |
230 .TP | |
231 .BI index( s , " t" ) | |
232 the position in | |
233 .I s | |
234 where the string | |
235 .I t | |
236 occurs, or 0 if it does not. | |
237 .TP | |
238 .BI match( s , " r" ) | |
239 the position in | |
240 .I s | |
241 where the regular expression | |
242 .I r | |
243 occurs, or 0 if it does not. | |
244 The variables | |
245 .B RSTART | |
246 and | |
247 .B RLENGTH | |
248 are set to the position and length of the matched string. | |
249 .TP | |
250 .BI split( s , " a" , " fs\fL) | |
251 splits the string | |
252 .I s | |
253 into array elements | |
254 .IB a [1]\f1, | |
255 .IB a [2]\f1, | |
256 \&..., | |
257 .IB a [ n ]\f1, | |
258 and returns | |
259 .IR n . | |
260 The separation is done with the regular expression | |
261 .I fs | |
262 or with the field separator | |
263 .B FS | |
264 if | |
265 .I fs | |
266 is not given. | |
267 An empty string as field separator splits the string | |
268 into one array element per character. | |
269 .TP | |
270 .BI sub( r , " t" , " s\fL) | |
271 substitutes | |
272 .I t | |
273 for the first occurrence of the regular expression | |
274 .I r | |
275 in the string | |
276 .IR s . | |
277 If | |
278 .I s | |
279 is not given, | |
280 .B $0 | |
281 is used. | |
282 .TP | |
283 .B gsub | |
284 same as | |
285 .B sub | |
286 except that all occurrences of the regular expression | |
287 are replaced; | |
288 .B sub | |
289 and | |
290 .B gsub | |
291 return the number of replacements. | |
292 .TP | |
293 .BI sprintf( fmt , " expr" , " ...\fL) | |
294 the string resulting from formatting | |
295 .I expr ... | |
296 according to the | |
297 .I printf | |
298 format | |
299 .I fmt | |
300 .TP | |
301 .BI system( cmd ) | |
302 executes | |
303 .I cmd | |
304 and returns its exit status | |
305 .TP | |
306 .BI tolower( str ) | |
307 returns a copy of | |
308 .I str | |
309 with all upper-case characters translated to their | |
310 corresponding lower-case equivalents. | |
311 .TP | |
312 .BI toupper( str ) | |
313 returns a copy of | |
314 .I str | |
315 with all lower-case characters translated to their | |
316 corresponding upper-case equivalents. | |
317 .PD | |
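The following sketch exercises several of the string functions listed above on a made-up string; the variable names are arbitrary.

```sh
out=$(awk 'BEGIN {
    s = "hello world"
    print substr(s, 7, 5)             # 5 characters starting at position 7
    print index(s, "o")               # position of the first "o"
    match(s, /wor/)
    print RSTART, RLENGTH             # where the match starts and how long it is
    n = split("a:b:c", parts, ":")
    print n, parts[3]
    t = s
    k = gsub(/o/, "0", t)             # replaces every "o" in t, returns the count
    print k, t
    print toupper(s)
}')
printf '%s\n' "$out"
```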
318 .PP | |
319 The ``function'' | |
320 .B getline | |
321 sets | |
322 .B $0 | |
323 to the next input record from the current input file; | |
324 .B getline | |
325 .BI < file | |
326 sets | |
327 .B $0 | |
328 to the next record from | |
329 .IR file . | |
330 .B getline | |
331 .I x | |
332 sets variable | |
333 .I x | |
334 instead. | |
335 Finally, | |
336 .IB cmd " | getline | |
337 pipes the output of | |
338 .I cmd | |
339 into | |
340 .BR getline ; | |
341 each call of | |
342 .B getline | |
343 returns the next line of output from | |
344 .IR cmd . | |
345 In all cases, | |
346 .B getline | |
347 returns 1 for a successful input, | |
348 0 for end of file, and \-1 for an error. | |
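As a sketch of the pipe form, the loop below reads the output of a command line by line and tests the return value against 0, so that both end of file and an error end the loop. The command string is an assumption for the example.

```sh
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)   # 1 per line read, 0 at end of output
        n++
    close(cmd)
    print n, line                      # line still holds the last line read
}')
printf '%s\n' "$out"    # prints: 2 two
```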
349 .PP | |
350 Patterns are arbitrary Boolean combinations | |
351 (with | |
352 .BR "! || &&" ) | |
353 of regular expressions and | |
354 relational expressions. | |
355 Regular expressions are as in | |
356 .IR regexp (6). | |
357 Isolated regular expressions | |
358 in a pattern apply to the entire line. | |
359 Regular expressions may also occur in | |
360 relational expressions, using the operators | |
361 .BR ~ | |
362 and | |
363 .BR !~ . | |
364 .BI / re / | |
365 is a constant regular expression; | |
366 any string (constant or variable) may be used | |
367 as a regular expression, except in the position of an isolated regular e… | |
368 in a pattern. | |
369 .PP | |
370 A pattern may consist of two patterns separated by a comma; | |
371 in this case, the action is performed for all lines | |
372 from an occurrence of the first pattern | |
373 through an occurrence of the second. | |
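A short sketch of a two-pattern range on invented input; note that the lines matching the first and second patterns are themselves included.

```sh
out=$(printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/')
printf '%s\n' "$out"
```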
374 .PP | |
375 A relational expression is one of the following: | |
376 .IP | |
377 .I expression matchop regular-expression | |
378 .br | |
379 .I expression relop expression | |
380 .br | |
381 .IB expression " in " array-name | |
382 .br | |
383 .BI ( expr , expr,... ") in " array-name | |
384 .PP | |
385 where a | |
386 .I relop | |
387 is any of the six relational operators in C, | |
388 and a | |
389 .I matchop | |
390 is either | |
391 .B ~ | |
392 (matches) | |
393 or | |
394 .B !~ | |
395 (does not match). | |
396 A conditional is an arithmetic expression, | |
397 a relational expression, | |
398 or a Boolean combination | |
399 of these. | |
400 .PP | |
401 The special patterns | |
402 .B BEGIN | |
403 and | |
404 .B END | |
405 may be used to capture control before the first input line is read | |
406 and after the last. | |
407 .B BEGIN | |
408 and | |
409 .B END | |
410 do not combine with other patterns. | |
411 .PP | |
412 Variable names with special meanings: | |
413 .TF FILENAME | |
414 .TP | |
415 .B CONVFMT | |
416 conversion format used when converting numbers | |
417 (default | |
418 .BR "%.6g" ) | |
419 .TP | |
420 .B FS | |
421 regular expression used to separate fields; also settable | |
422 by option | |
423 .BI \-F fs\f1. | |
424 .TP | |
425 .BR NF | |
426 number of fields in the current record | |
427 .TP | |
428 .B NR | |
429 ordinal number of the current record | |
430 .TP | |
431 .B FNR | |
432 ordinal number of the current record in the current file | |
433 .TP | |
434 .B FILENAME | |
435 the name of the current input file | |
436 .TP | |
437 .B RS | |
438 input record separator (default newline) | |
439 .TP | |
440 .B OFS | |
441 output field separator (default blank) | |
442 .TP | |
443 .B ORS | |
444 output record separator (default newline) | |
445 .TP | |
446 .B OFMT | |
447 output format for numbers (default | |
448 .BR "%.6g" ) | |
449 .TP | |
450 .B SUBSEP | |
451 separates multiple subscripts (default 034) | |
452 .TP | |
453 .B ARGC | |
454 argument count, assignable | |
455 .TP | |
456 .B ARGV | |
457 argument array, assignable; | |
458 non-null members are taken as file names | |
459 .TP | |
460 .B ENVIRON | |
461 array of environment variables; subscripts are names. | |
462 .PD | |
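A brief sketch of some of these variables, with made-up input: `OFS` is set in a `BEGIN` action and then separates the values printed for each record.

```sh
out=$(printf 'a b\nc d e\n' | awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }')
printf '%s\n' "$out"
```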
463 .PP | |
464 Functions may be defined (at the position of a pattern-action statement)… | |
465 .IP | |
466 .L | |
467 function foo(a, b, c) { ...; return x } | |
468 .PP | |
469 Parameters are passed by value if scalar and by reference if array name; | |
470 functions may be called recursively. | |
471 Parameters are local to the function; all other variables are global. | |
472 Thus local variables may be created by providing excess parameters in | |
473 the function definition. | |
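The parameter rules above can be sketched as follows. The function name `fill` and the variables are hypothetical; the extra parameter `i` is never passed by the caller and so serves as a local.

```sh
out=$(awk '
function fill(arr, n,    i) {   # i is an extra parameter used as a local
    for (i = 1; i <= n; i++)
        arr[i] = i * i
    n = 0                       # scalar passed by value: caller is unaffected
}
BEGIN {
    i = 99; n = 3
    fill(sq, n)                 # array passed by reference
    print sq[3], n, i
}')
printf '%s\n' "$out"    # prints: 9 3 99
```

Separating the locals from the real parameters with extra spaces is only a convention; awk does not require it.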
474 .SH EXAMPLES | |
475 .TP | |
476 .L | |
477 length($0) > 72 | |
478 Print lines longer than 72 characters. | |
479 .TP | |
480 .L | |
481 { print $2, $1 } | |
482 Print first two fields in opposite order. | |
483 .PP | |
484 .EX | |
485 BEGIN { FS = ",[ \et]*|[ \et]+" } | |
486 { print $2, $1 } | |
487 .EE | |
488 .ns | |
489 .IP | |
490 Same, with input fields separated by comma and/or blanks and tabs. | |
491 .PP | |
492 .EX | |
493 { s += $1 } | |
494 END { print "sum is", s, " average is", s/NR } | |
495 .EE | |
496 .ns | |
497 .IP | |
498 Add up first column, print sum and average. | |
499 .TP | |
500 .L | |
501 /start/, /stop/ | |
502 Print all lines between start/stop pairs. | |
503 .PP | |
504 .EX | |
505 BEGIN { # Simulate echo(1) | |
506 for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i] | |
507 printf "\en" | |
508 exit } | |
509 .EE | |
510 .SH SOURCE | |
511 .B /sys/src/cmd/awk | |
512 .SH SEE ALSO | |
513 .IR sed (1), | |
514 .IR regexp (6), | |
515 .br | |
516 A. V. Aho, B. W. Kernighan, P. J. Weinberger, | |
517 .I | |
518 The AWK Programming Language, | |
519 Addison-Wesley, 1988. ISBN 0-201-07981-X | |
520 .SH BUGS | |
521 There are no explicit conversions between numbers and strings. | |
522 To force an expression to be treated as a number add 0 to it; | |
523 to force it to be treated as a string concatenate | |
524 \&\fL""\fP to it. | |
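A sketch of both idioms, using made-up values: string constants compare as strings until 0 is added, and numbers compare as numbers until the null string is concatenated.

```sh
out=$(awk 'BEGIN {
    a = "10"; b = "9"
    print (a < b)             # string comparison: "1" sorts before "9"
    print (a + 0 < b + 0)     # adding 0 forces numeric comparison
    x = 10; y = 9
    print (x "" < y "")       # concatenating "" forces string comparison
}')
printf '%s\n' "$out"
```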
525 .br | |
526 The scope rules for variables in functions are a botch; | |
527 the syntax is worse. |