awk.1 - 9base - revived minimalist port of Plan 9 userland to Unix
git clone git://git.suckless.org/9base
---
awk.1 (10645B)
---
.TH AWK 1
.SH NAME
awk \- pattern-directed scanning and processing language
.SH SYNOPSIS
.B awk
[
.BI -F fs
]
[
.BI -v
.I var=value
]
[
.BI -mr n
]
[
.BI -mf n
]
[
.B -f
.I progfile
|
.I prog
]
[
.I file ...
]
.SH DESCRIPTION
.I Awk
scans each input
.I file
for lines that match any of a set of patterns specified literally in
.IR prog
or in one or more files
specified as
.B -f
.IR progfile .
With each pattern
there can be an associated action that will be performed
when a line of a
.I file
matches the pattern.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
The file name
.L -
means the standard input.
Any
.IR file
of the form
.I var=value
is treated as an assignment, not a file name,
and is executed at the time it would have been opened if it were a file …
The option
.B -v
followed by
.I var=value
is an assignment to be done before
.I prog
is executed;
any number of
.B -v
options may be present.
The
.B \-F
.IR fs
option defines the input field separator to be the regular expression
.IR fs .
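.PP
As an illustration of the options above, the following sketch sets the field separator with -F, presets a variable with -v, and changes it again with a var=value operand placed between two input files.
The function name demo_options, the variable tag and the scratch files are assumptions made for this example only.

```shell
demo_options() {
    # Two small colon-separated input files in a scratch directory.
    dir=$(mktemp -d) || return 1
    printf 'alice:30\n' > "$dir/a.txt"
    printf 'bob:25\n' > "$dir/b.txt"
    # -F sets FS to ":"; -v assigns tag before the program starts;
    # tag=second is executed when awk reaches it in the operand list.
    awk -F: -v tag=first '{ print tag, $1 }' "$dir/a.txt" tag=second "$dir/b.txt"
    rm -rf "$dir"
}
demo_options
```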
.PP
An input line is normally made up of fields separated by white space,
or by regular expression
.BR FS .
The fields are denoted
.BR $1 ,
.BR $2 ,
\&..., while
.B $0
refers to the entire line.
If
.BR FS
is null, the input line is split into one field per character.
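.PP
A minimal sketch of default field splitting, assuming white-space separated input.
The function name demo_fields and the sample line are hypothetical.

```shell
demo_fields() {
    # Leading and repeated white space does not create empty fields,
    # while $0 keeps the line exactly as it was read.
    printf '  one   two three\n' | awk '{ print $2; print $1 "-" $3; print $0 }'
}
demo_fields
```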
.PP
To compensate for inadequate implementation of storage management,
the
.B \-mr
option can be used to set the maximum size of the input record,
and the
.B \-mf
option to set the maximum number of fields.
.PP
A pattern-action statement has the form
.IP
.IB pattern " { " action " }
.PP
A missing
.BI { " action " }
means print the line;
a missing pattern always matches.
Pattern-action statements are separated by newlines or semicolons.
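.PP
The two defaults can be seen side by side in the sketch below: one statement has only a pattern, one has only an action.
The function name demo_pattern_action and the input data are assumptions; the END pattern used to report the count is described later in this page.

```shell
demo_pattern_action() {
    printf 'apple 3\npear 12\nfig 7\n' | awk '
        $2 > 5                  # pattern only: matching lines are printed
        { n++ }                 # action only: runs for every line
        END { print n, "lines" }
    '
}
demo_pattern_action
```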
.PP
An action is a sequence of statements.
A statement can be one of the following:
.PP
.EX
.ta \w'\fLdelete array[expression]'u
if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]…
while(\fI expression \fP)\fI statement\fP
for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI stateme…
for(\fI var \fPin\fI array \fP)\fI statement\fP
do\fI statement \fPwhile(\fI expression \fP)
break
continue
{\fR [\fP\fI statement ... \fP\fR] \fP}
\fIexpression\fP #\fR commonly\fP\fI var = expression\fP
print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\…
printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI…
return\fR [ \fP\fIexpression \fP\fR]\fP
next #\fR skip remaining patterns on this input line\fP
nextfile #\fR skip rest of this file, open next, start at top\fP
delete\fI array\fP[\fI expression \fP] #\fR delete an array eleme…
delete\fI array\fP #\fR delete all elements of array\fP
exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; stat…
.EE
.DT
.PP
Statements are terminated by
semicolons, newlines or right braces.
An empty
.I expression-list
stands for
.BR $0 .
String constants are quoted \&\fL"\ "\fR,
with the usual C escapes recognized within.
Expressions take on string or numeric values as appropriate,
and are built using the operators
.B + \- * / % ^
(exponentiation), and concatenation (indicated by white space).
The operators
.B
! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
are also available in expressions.
Variables may be scalars, array elements
(denoted
.IB x [ i ] )
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
Multiple subscripts such as
.B [i,j,k]
are permitted; the constituents are concatenated,
separated by the value of
.BR SUBSEP .
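.PP
The following sketch illustrates string subscripts, multiple subscripts joined by SUBSEP, and the null initial value of variables.
The names demo_arrays, count, pair and nosuch are hypothetical.

```shell
demo_arrays() {
    printf 'a x\nb y\na x\n' | awk '
        { count[$1]++; pair[$1, $2]++ }     # string and multiple subscripts
        END {
            print "a seen", count["a"], "times"
            # pair["a", "x"] is stored under the single key "a" SUBSEP "x"
            key = "a" SUBSEP "x"
            print "a,x seen", pair[key], "times"
            # a variable that was never assigned compares equal to ""
            empty = (nosuch == "")
            print "unset equals empty string:", empty
        }
    '
}
demo_arrays
```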
.PP
The
.B print
statement prints its arguments on the standard output
(or on a file if
.BI > file
or
.BI >> file
is present or on a pipe if
.BI | cmd
is present), separated by the current output field separator,
and terminated by the output record separator.
.I file
and
.I cmd
may be literal names or parenthesized expressions;
identical string values in different statements denote
the same open file.
The
.B printf
statement formats its expression list according to the format
(see
.IR fprintf (2)) .
The built-in function
.BI close( expr )
closes the file or pipe
.IR expr .
The built-in function
.BI fflush( expr )
flushes any buffered output for the file or pipe
.IR expr .
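.PP
As a sketch of output redirection, the example below prints through a pipe to sort, closes the pipe so that the sorted lines appear, and then uses printf.
It assumes a sort command is available; the function name demo_output and the data are made up for the example.

```shell
demo_output() {
    printf 'pear 2\napple 10\n' | awk '
        BEGIN { OFS = "-" }                 # separator placed between print arguments
        { print $1, $2 | "sort" }           # same command string, same open pipe
        END {
            close("sort")                   # sort finishes and writes its output here
            printf "%s=%05.1f\n", "pi", 3.14159
        }
    '
}
demo_output
```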
.PP
The mathematical functions
.BR exp ,
.BR log ,
.BR sqrt ,
.BR sin ,
.BR cos ,
and
.BR atan2
are built in.
Other built-in functions:
.TF length
.TP
.B length
the length of its argument
taken as a string,
or of
.B $0
if no argument.
.TP
.B rand
random number on (0,1)
.TP
.B srand
sets seed for
.B rand
and returns the previous seed.
.TP
.B int
truncates to an integer value
.TP
.B utf
converts its numerical argument, a character number, to a
.SM UTF
string
.TP
.BI substr( s , " m" , " n\fL)
the
.IR n -character
substring of
.I s
that begins at position
.IR m
counted from 1.
.TP
.BI index( s , " t" )
the position in
.I s
where the string
.I t
occurs, or 0 if it does not.
.TP
.BI match( s , " r" )
the position in
.I s
where the regular expression
.I r
occurs, or 0 if it does not.
The variables
.B RSTART
and
.B RLENGTH
are set to the position and length of the matched string.
.TP
.BI split( s , " a" , " fs\fL)
splits the string
.I s
into array elements
.IB a [1]\f1,
.IB a [2]\f1,
\&...,
.IB a [ n ]\f1,
and returns
.IR n .
The separation is done with the regular expression
.I fs
or with the field separator
.B FS
if
.I fs
is not given.
An empty string as field separator splits the string
into one array element per character.
.TP
.BI sub( r , " t" , " s\fL)
substitutes
.I t
for the first occurrence of the regular expression
.I r
in the string
.IR s .
If
.I s
is not given,
.B $0
is used.
.TP
.B gsub
same as
.B sub
except that all occurrences of the regular expression
are replaced;
.B sub
and
.B gsub
return the number of replacements.
.TP
.BI sprintf( fmt , " expr" , " ...\fL)
the string resulting from formatting
.I expr ...
according to the
.I printf
format
.I fmt
.TP
.BI system( cmd )
executes
.I cmd
and returns its exit status
.TP
.BI tolower( str )
returns a copy of
.I str
with all upper-case characters translated to their
corresponding lower-case equivalents.
.TP
.BI toupper( str )
returns a copy of
.I str
with all lower-case characters translated to their
corresponding upper-case equivalents.
.PD
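.PP
A short sketch exercising several of the string functions listed above.
The function name demo_strings and the sample strings are assumptions; results are stored in variables before printing so that the order of evaluation is explicit.

```shell
demo_strings() {
    awk 'BEGIN {
        s = "hello world"
        print substr(s, 1, 5)                   # 5 characters starting at position 1
        print index(s, "wor")                   # position of a plain string
        pos = match(s, /o+/)                    # also sets RSTART and RLENGTH
        print pos, RSTART, RLENGTH
        n = split("a:b:c", parts, ":")          # returns the number of elements
        print n, parts[3]
        t = s
        changed = gsub(/o/, "0", t)             # returns the number of replacements
        print changed, t
        print sprintf("%03d", 7) toupper("x")   # formatted string, then concatenation
    }'
}
demo_strings
```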
.PP
The ``function''
.B getline
sets
.B $0
to the next input record from the current input file;
.B getline
.BI < file
sets
.B $0
to the next record from
.IR file .
.B getline
.I x
sets variable
.I x
instead.
Finally,
.IB cmd " | getline
pipes the output of
.I cmd
into
.BR getline ;
each call of
.B getline
returns the next line of output from
.IR cmd .
In all cases,
.B getline
returns 1 for a successful input,
0 for end of file, and \-1 for an error.
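.PP
The sketch below reads the output of a command with getline, once into $0 and once into a variable.
The echo commands and the names demo_getline, lines and word are hypothetical and chosen only to keep the example self-contained.

```shell
demo_getline() {
    awk 'BEGIN {
        cmd = "echo x; echo y"
        # Each successful call returns 1 and sets $0 to the next output line;
        # the loop stops on 0 (end of file) or -1 (error).
        while ((cmd | getline) > 0)
            lines = lines $0
        close(cmd)
        # With a variable name, getline sets that variable instead of $0.
        "echo hi" | getline word
        print lines, word
    }'
}
demo_getline
```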
.PP
Patterns are arbitrary Boolean combinations
(with
.BR "! || &&" )
of regular expressions and
relational expressions.
Regular expressions are as in
.IR regexp (6).
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions, using the operators
.BR ~
and
.BR !~ .
.BI / re /
is a constant regular expression;
any string (constant or variable) may be used
as a regular expression, except in the position of an isolated regular e…
in a pattern.
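.PP
A sketch of the matching operators, assuming two-field input lines.
It combines an isolated regular expression with a match against a string variable; the names demo_regex and digits are made up for the example.

```shell
demo_regex() {
    printf 'foo 12\nbar x7\nbaz 300\n' | awk '
        BEGIN { digits = "^[0-9]+$" }       # a string used as a regular expression
        /^ba/ && $2 ~ digits  { print "numeric:", $1 }
        $2 !~ digits          { print "other:", $1 }
    '
}
demo_regex
```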
.PP
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
from an occurrence of the first pattern
through an occurrence of the second.
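.PP
A small sketch showing that both ends of such a range are included in the output.
The function name demo_range and the input lines are assumptions.

```shell
demo_range() {
    # Lines before "start" and after "stop" are skipped; the end points are kept.
    printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/'
}
demo_range
```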
.PP
A relational expression is one of the following:
.IP
.I expression matchop regular-expression
.br
.I expression relop expression
.br
.IB expression " in " array-name
.br
.BI ( expr , expr,... ") in " array-name
.PP
where a
.I relop
is any of the six relational operators in C,
and a
.I matchop
is either
.B ~
(matches)
or
.B !~
(does not match).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
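.PP
The membership forms can be sketched as follows, with both a single subscript and a parenthesized subscript list.
The names demo_relational, want and grid are hypothetical.

```shell
demo_relational() {
    printf 'x 1\ny 2\nz 3\n' | awk '
        BEGIN { want["x"]; want["z"]; grid[1, 2] = "set" }
        ($1 in want) && $2 >= 1 { print $1 }    # membership test combined with a relop
        END { if ((1, 2) in grid) print "grid[1,2] exists" }
    '
}
demo_relational
```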
.PP
The special patterns
.B BEGIN
and
.B END
may be used to capture control before the first input line is read
and after the last.
.B BEGIN
and
.B END
do not combine with other patterns.
.PP
Variable names with special meanings:
.TF FILENAME
.TP
.B CONVFMT
conversion format used when converting numbers
(default
.BR "%.6g" )
.TP
.B FS
regular expression used to separate fields; also settable
by option
.BI \-F fs\f1.
.TP
.BR NF
number of fields in the current record
.TP
.B NR
ordinal number of the current record
.TP
.B FNR
ordinal number of the current record in the current file
.TP
.B FILENAME
the name of the current input file
.TP
.B RS
input record separator (default newline)
.TP
.B OFS
output field separator (default blank)
.TP
.B ORS
output record separator (default newline)
.TP
.B OFMT
output format for numbers (default
.BR "%.6g" )
.TP
.B SUBSEP
separates multiple subscripts (default 034)
.TP
.B ARGC
argument count, assignable
.TP
.B ARGV
argument array, assignable;
non-null members are taken as file names
.TP
.B ENVIRON
array of environment variables; subscripts are names.
.PD
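.PP
A brief sketch of a few of these variables in use: NR and NF are read, while OFS and ORS are assigned to change how print joins and terminates its output.
The function name demo_variables and the input are assumptions.

```shell
demo_variables() {
    printf 'a b c\nd e\n' | awk '
        BEGIN { OFS = ":"; ORS = ";\n" }    # join with ":" and end records with ";"
        { print NR, NF, $NF }               # record number, field count, last field
        END { print "records", NR }
    '
}
demo_variables
```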
.PP
Functions may be defined (at the position of a pattern-action statement)…
.IP
.L
function foo(a, b, c) { ...; return x }
.PP
Parameters are passed by value if scalar and by reference if array name;
functions may be called recursively.
Parameters are local to the function; all other variables are global.
Thus local variables may be created by providing excess parameters in
the function definition.
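.PP
These rules can be sketched with a recursive function that uses an excess parameter as a local variable, and a second function that modifies an array passed by reference.
The names demo_functions, fact and fill are hypothetical.

```shell
demo_functions() {
    awk '
        # n is passed by value; the excess parameter i acts as a local variable.
        function fact(n,    i) { i = (n <= 1) ? 1 : n * fact(n - 1); return i }
        # arr is an array name, so the caller sees the assignment.
        function fill(arr) { arr["k"] = "changed" }
        BEGIN {
            i = "global"                    # untouched by the local i inside fact
            fill(a)
            print fact(5), a["k"], i
        }
    '
}
demo_functions
```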
.SH EXAMPLES
.TP
.L
length($0) > 72
Print lines longer than 72 characters.
.TP
.L
{ print $2, $1 }
Print first two fields in opposite order.
.PP
.EX
BEGIN { FS = ",[ \et]*|[ \et]+" }
{ print $2, $1 }
.EE
.ns
.IP
Same, with input fields separated by comma and/or blanks and tabs.
.PP
.EX
{ s += $1 }
END { print "sum is", s, " average is", s/NR }
.EE
.ns
.IP
Add up first column, print sum and average.
.TP
.L
/start/, /stop/
Print all lines between start/stop pairs.
.PP
.EX
BEGIN { # Simulate echo(1)
for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
printf "\en"
exit }
.EE
.SH SOURCE
.B /sys/src/cmd/awk
.SH SEE ALSO
.IR sed (1),
.IR regexp (6),
.br
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.I
The AWK Programming Language,
Addison-Wesley, 1988. ISBN 0-201-07981-X
.SH BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate
\&\fL""\fP to it.
.br
The scope rules for variables in functions are a botch;
the syntax is worse.
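.PP
The conversion idioms mentioned above can be sketched as follows.
The function name demo_conversions and the values are assumptions; each comparison is stored in a variable before it is printed.

```shell
demo_conversions() {
    awk 'BEGIN {
        a = "10"; b = "9"                   # string constants
        as_strings = (a < b)                # compared as strings: "10" sorts before "9"
        as_numbers = (a + 0 < b + 0)        # adding 0 forces a numeric comparison
        print as_strings
        print as_numbers
        x = 2; y = 10                       # numeric constants
        num = (x < y)                       # compared as numbers
        str = (x "" < y "")                 # concatenating "" forces a string comparison
        print num, str
    }'
}
demo_conversions
```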