GAWK Manual - Invoking awk

Invoking awk

There are two ways to run awk: with an explicit program, or with one or more program files. Here are templates for both of them; items enclosed in [...] in these templates are optional.

Besides traditional one-letter POSIX-style options, gawk also supports GNU long named options.

	awk [POSIX or GNU style options] -f progfile [--] file ...
	awk [POSIX or GNU style options] [--] 'program' file ...
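As a rough illustration of the two templates, the sketch below runs the same one-rule program both ways. The file names prog.awk and data.txt are hypothetical and are created in a temporary directory only for this example.

```shell
# Hypothetical scratch files, used only for this illustration.
tmp=$(mktemp -d)
printf 'alpha 1\nbeta 2\n' > "$tmp/data.txt"
printf '{ print $1 }\n' > "$tmp/prog.awk"

# First template: the program is read from a file named with -f.
from_file=$(awk -f "$tmp/prog.awk" "$tmp/data.txt")

# Second template: the program is the first non-option argument.
inline=$(awk '{ print $1 }' "$tmp/data.txt")

printf '%s\n' "$from_file"
printf '%s\n' "$inline"
rm -rf "$tmp"
```

Both commands should print the first field of every input line.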

Command Line Options

Options begin with a minus sign, and consist of a single character. GNU style long named options consist of two minus signs and a keyword that can be abbreviated if the abbreviation allows the option to be uniquely identified. If the option takes an argument, then the keyword is immediately followed by an equals sign (=) and the argument's value. For brevity, the discussion below refers only to the traditional short options; however, the long and short options are interchangeable in all contexts.
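The interchangeability of short and long options can be checked with a sketch like the following. It assumes a gawk binary is on your PATH (other awk implementations generally do not accept the long forms), so the commands are guarded; the input line is made up.

```shell
# Assumes gawk is installed; long named options are a gawk feature.
if command -v gawk >/dev/null 2>&1; then
    # Short form: the argument follows the option letter.
    short=$(printf 'a:b:c\n' | gawk -F : '{ print $2 }')
    # Long form: keyword, equals sign, then the argument's value.
    long=$(printf 'a:b:c\n' | gawk --field-separator=: '{ print $2 }')
    printf '%s %s\n' "$short" "$long"
else
    short=skipped long=skipped
    echo "gawk not found; skipping"
fi
```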

Each long named option for gawk has a corresponding POSIX-style option. The options and their meanings are as follows:

  • -F fs
  • --field-separator=fs Sets the FS variable to fs (see section Specifying how Fields are Separated).

  • -f source-file
  • --file=source-file Indicates that the awk program is to be found in source-file instead of in the first non-option argument.

  • -v var=val
  • --assign=var=val Sets the variable var to the value val before execution of the program begins. Such variable values are available inside the BEGIN rule (see below for a fuller explanation).

    The -v option can only set one variable, but you can use it more than once, setting another variable each time, like this: -v foo=1 -v bar=2.

  • -W gawk-opt Following the POSIX standard, options that are implementation specific are supplied as arguments to the -W option. With gawk, these arguments may be separated by commas, or quoted and separated by whitespace. Case is ignored when processing these options. These options also have corresponding GNU style long named options. The following gawk-specific options are available:

  • -W compat
  • --compat Specifies compatibility mode, in which the GNU extensions in gawk are disabled, so that gawk behaves just like Unix awk. See the section on extensions in gawk, which summarizes the extensions. Also see section Downward Compatibility and Debugging.

  • -W copyright
  • --copyleft
  • --copyright Print the short version of the General Public License. This option may disappear in a future version of gawk.

  • -W help
  • -W usage
  • --help
  • --usage Print a ``usage'' message summarizing the short and long style options that gawk accepts, and then exit.

  • -W lint
  • --lint Provide warnings about constructs that are dubious or non-portable to other awk implementations. Some warnings are issued when gawk first reads your program. Others are issued at run-time, as your program executes.

  • -W posix
  • --posix Operate in strict POSIX mode. This disables all gawk extensions (just like -W compat), and adds further restrictions; for example, the special case for -Ft described below does not apply.

    Although you can supply both -W compat and -W posix on the command line, -W posix will take precedence.

  • -W source=program-text
  • --source=program-text Program source code is taken from the program-text. This option allows you to mix awk source code in files with program source code that you would enter on the command line. This is particularly useful when you have library functions that you wish to use from your command line programs (see also: The AWKPATH Environment Variable).

  • -W version
  • --version Prints version information for this particular copy of gawk. This is so you can determine if your copy of gawk is up to date with respect to whatever the Free Software Foundation is currently distributing. This option may disappear in a future version of gawk.

  • -- Signals the end of the command line options. The following arguments are not treated as options even if they begin with -. This interpretation of -- follows the POSIX argument parsing conventions.

    This is useful if you have file names that start with -, or in shell scripts, if you have file names that will be specified by the user which could start with -.

    Any other options are flagged as invalid with a warning message, but are otherwise ignored.
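To make the short options listed above concrete, here is a minimal sketch that combines -F, two uses of -v, -f, and --. The names prog.awk, -data, prefix, and suffix are hypothetical; the data file is deliberately given a name that starts with a minus sign.

```shell
tmp=$(mktemp -d)
printf 'root:x:0\n' > "$tmp/-data"      # a file name that starts with "-"
printf '{ print prefix $1 suffix }\n' > "$tmp/prog.awk"

# -F sets FS; each -v sets exactly one variable before the program starts.
# "--" ends option parsing, so "-data" is taken as an input file.
out=$(cd "$tmp" && awk -F: -v prefix='<' -v suffix='>' -f prog.awk -- -data)
printf '%s\n' "$out"
rm -rf "$tmp"
```

Without the --, an argument such as -data placed where an option could appear risks being parsed as an option rather than as a file name.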

    In compatibility mode, as a special case, if the value of fs supplied to the -F option is t, then FS is set to the tab character ("\t"). This is only true for -W compat, and not for -W posix (see section Specifying how Fields are Separated).

    If the -f option is not used, then the first non-option command line argument is expected to be the program text.

    The -f option may be used more than once on the command line. If it is, awk reads its program source from all of the named files, as if they had been concatenated together into one big file. This is useful for creating libraries of awk functions. Useful functions can be written once, and then retrieved from a standard place, instead of having to be included into each individual program.

    You can still type in a program at the terminal and use library functions, by specifying -f /dev/tty. awk will read a file from the terminal to use as part of the awk program. After typing your program, type Control-d (the end-of-file character) to terminate it. (You may also use -f - to read program source from the standard input, but then you will not be able to also use the standard input as a source of data.)
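The library technique described above might look like the following sketch. The file names lib.awk and main.awk and the function twice are hypothetical, chosen only for this example.

```shell
tmp=$(mktemp -d)
# lib.awk: a reusable function kept in its own file.
printf 'function twice(x) { return 2 * x }\n' > "$tmp/lib.awk"
# main.awk: a program that relies on the library function.
printf '{ print twice($1) }\n' > "$tmp/main.awk"

# Both files are read as if they were concatenated into one program.
out=$(printf '4\n21\n' | awk -f "$tmp/lib.awk" -f "$tmp/main.awk")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Because the data arrives on the standard input here, the program itself has to come from files rather than from -f -.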

    Because it is clumsy using the standard awk mechanisms to mix source file and command line awk programs, gawk provides the --source option. This does not require you to pre-empt the standard input for your source code, and allows you to easily mix command line and library source code (see section The AWKPATH Environment Variable).

    If no -f or --source option is specified, then gawk will use the first non-option command line argument as the text of the program source code.

Other Command Line Arguments

    Any additional arguments on the command line are normally treated as input files to be processed in the order specified. However, an argument of the form var=value assigns the value value to the variable var; it does not specify a file at all.

    All these arguments are made available to your awk program in the ARGV array (see section Built-in Variables). Command line options and the program text (if present) are omitted from the ARGV array. All other arguments, including variable assignments, are included.
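A quick way to see what ends up in ARGV is a sketch like this one. The names n, x, file1, and file2 are hypothetical; since the program has only a BEGIN rule, no input is read and the files need not exist. The loop starts at index 1 because ARGV[0] holds the name awk was invoked under, which varies between systems.

```shell
# -v n=5 and the program text are left out of ARGV;
# the assignment x=1 and the file names are kept.
out=$(awk -v n=5 'BEGIN { for (i = 1; i < ARGC; i++) print i, ARGV[i] }' x=1 file1 file2)
printf '%s\n' "$out"
```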

    The distinction between file name arguments and variable-assignment arguments is made when awk is about to open the next input file. At that point in execution, it checks the ``file name'' to see whether it is really a variable assignment; if so, awk sets the variable instead of reading a file.

    Therefore, the variables actually receive the specified values after all previously specified files have been read. In particular, the values of variables assigned in this fashion are not available inside a BEGIN rule (see section BEGIN and END Special Patterns), which is run before awk begins scanning the argument list. The values given on the command line are processed for escape sequences (see section Constant Expressions).
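The timing difference between an assignment argument and the -v option can be observed with a sketch along these lines. The variable x and the file data.txt are hypothetical.

```shell
tmp=$(mktemp -d)
printf 'one line\n' > "$tmp/data.txt"
prog='BEGIN { print "BEGIN sees:", (x == "" ? "(unset)" : x) }
      { print "rule sees:", x }'

# Assignment given as an argument: applied only when awk reaches it
# in the argument list, which is after the BEGIN rule has run.
late=$(awk "$prog" x=1 "$tmp/data.txt")

# Assignment given with -v: applied before the BEGIN rule runs.
early=$(awk -v x=1 "$prog" "$tmp/data.txt")

printf '%s\n---\n%s\n' "$late" "$early"
rm -rf "$tmp"
```

In the first run the BEGIN rule finds x unset while the ordinary rule sees 1; in the second run both see 1.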

    In some earlier implementations of awk, when a variable assignment occurred before any file names, the assignment would happen before the BEGIN rule was executed. Some applications came to depend upon this ``feature.'' When awk was changed to be more consistent, the -v option was added to accommodate applications that depended upon this old behavior.

    The variable assignment feature is most useful for assigning to variables such as RS, OFS, and ORS, which control input and output formats, before scanning the data files. It is also useful for controlling state if multiple passes are needed over a data file. For example:

    	awk 'pass == 1  { pass 1 stuff }
    	     pass == 2  { pass 2 stuff }' pass=1 datafile pass=2 datafile
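    One way to fill in the template above is sketched here: the first pass accumulates a total and the second pass reports each value as a percentage of it. The data values and the variable name total are made up for the illustration.

```shell
tmp=$(mktemp -d)
printf '10\n30\n60\n' > "$tmp/datafile"

# The same file is named twice, with "pass" reassigned in between.
shares=$(awk 'pass == 1 { total += $1 }
              pass == 2 { printf "%d is %d%%\n", $1, 100 * $1 / total }' \
             pass=1 "$tmp/datafile" pass=2 "$tmp/datafile")
printf '%s\n' "$shares"
rm -rf "$tmp"
```

    The second pass can use total only because the assignment pass=2 is not processed until the first reading of the file has finished.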

    Given the variable assignment feature, the -F option is not strictly necessary. It remains for historical compatibility.
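    The equivalence can be sketched as follows, using a hypothetical colon-separated file data.txt. The last command sets OFS the same way; assigning $1 to itself makes awk rebuild the record with the new output separator.

```shell
tmp=$(mktemp -d)
printf 'a:b:c\n' > "$tmp/data.txt"

# Field separator set with the -F option.
with_option=$(awk -F: '{ print $2 }' "$tmp/data.txt")
# Field separator set by an assignment placed before the file name.
with_assignment=$(awk '{ print $2 }' FS=: "$tmp/data.txt")
# Input and output separators both set by assignment arguments.
rebuilt=$(awk '{ $1 = $1; print }' FS=: OFS=- "$tmp/data.txt")

printf '%s %s %s\n' "$with_option" "$with_assignment" "$rebuilt"
rm -rf "$tmp"
```

    The two forms differ in one respect: -F takes effect before the BEGIN rule, while FS=: does not take effect until awk reaches that argument.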

The AWKPATH Environment Variable

    The previous section described how awk program files can be named on the command line with the -f option. In some awk implementations, you must supply a precise path name for each program file, unless the file is in the current directory.

    But in gawk, if the file name supplied in the -f option does not contain a /, then gawk searches a list of directories (called the search path), one by one, looking for a file with the specified name.

    The search path is actually a string consisting of directory names separated by colons. gawk gets its search path from the AWKPATH environment variable. If that variable does not exist, gawk uses the default path, which is .:/usr/lib/awk:/usr/local/lib/awk. (Programs written by system administrators should use an AWKPATH variable that does not include the current directory, '.'.)

    The search path feature is particularly useful for building up libraries of useful awk functions. The library files can be placed in a standard directory that is in the default path, and then specified on the command line with a short file name. Otherwise, the full file name would have to be typed for each file.

    By combining the --source and -f options, your command line awk programs can use facilities in awk library files.
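    A minimal sketch of that combination follows. It assumes a gawk binary is on your PATH, so it is guarded; the library directory, the file lib.awk, and the function twice are hypothetical.

```shell
# Assumes gawk is installed; AWKPATH and --source are gawk features.
if command -v gawk >/dev/null 2>&1; then
    libdir=$(mktemp -d)                 # stands in for a library directory
    printf 'function twice(x) { return 2 * x }\n' > "$libdir/lib.awk"

    # "lib.awk" contains no "/", so gawk looks for it along AWKPATH.
    # --source supplies the rest of the program on the command line.
    answer=$(AWKPATH="$libdir" gawk -f lib.awk --source 'BEGIN { print twice(21) }')
    printf '%s\n' "$answer"
    rm -rf "$libdir"
else
    answer=skipped
    echo "gawk not found; skipping"
fi
```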

    Path searching is not done if gawk is in compatibility mode. This is true for both -W compat and -W posix. See section Command Line Options.

    Note: if you want files in the current directory to be found, you must include the current directory in the path, either by writing . as an entry in the path, or by writing a null entry in the path. (A null entry is indicated by starting or ending the path with a colon, or by placing two colons next to each other (::).) If the current directory is not included in the path, then files cannot be found in the current directory. This path search mechanism is identical to the shell's.

Obsolete Options and/or Features

    This section describes features and/or command line options from the previous release of gawk that are either not available in the current version, or that are still supported but deprecated (meaning that they will not be in the next release).

    For version 2.15 of gawk, the following command line options from version 2.11.1 are no longer recognized.

  • -c Use -W compat instead.

  • -V Use -W version instead.

  • -C Use -W copyright instead.

  • -a
  • -e These options produce an ``unrecognized option'' error message but have no effect on the execution of gawk. The POSIX standard now specifies traditional awk regular expressions for the awk utility.

    The public-domain version of strftime that is distributed with gawk changed for the 2.14 release. The %V conversion specifier that used to generate the date in VMS format was changed to %v. This is because the POSIX standard for the date utility now specifies a %V conversion specifier. See section Functions for Dealing with Time Stamps, for details.

Undocumented Options and Features

    This section intentionally left blank.