GAWK Manual - Implementation Notes
This appendix contains information mainly of interest to implementors and
maintainers of gawk. Everything in it applies specifically to
gawk, and not to other implementations.
See section Extensions in
gawk not in POSIX awk, for the extensions to the
awk language and program.
All of these features can be turned off by invoking
gawk with the
-W compat option, or with the -W posix option.
If gawk is compiled for debugging with -DDEBUG, then there
is one more option available on the command line:
Print out the parse stack information as the program is being parsed.
This option is intended only for serious gawk developers,
and not for the casual user. It probably has not even been compiled into
your version of
gawk, since it slows down execution.
This section briefly lists extensions that indicate the directions we are
currently considering for
gawk. The file FUTURES in the
gawk distributions lists these extensions, as well as several others.
RS as a regexp
The meaning of
RS may be generalized along the lines of
Control of subprocess environment
Changes made in
gawk to the array
ENVIRON may be
propagated to subprocesses run by gawk.
It may be possible to map a GDBM/NDBM/SDBM file into an
The null string,
"", as a field separator, will cause field
splitting and the
split function to separate individual characters.
Thus, split(a, "abcd", "") would yield
a[1] == "a",
a[2] == "b", and so on.
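Until such an extension exists, a program can get the same per-character result with substr. The following is only a minimal sketch: the names split_chars and chars are hypothetical and are not part of gawk.

```shell
# split_chars is a hypothetical wrapper name used only for this sketch.
split_chars() {
    awk -v s="$1" '
    # Fill array a with the characters of s; return how many there are.
    function chars(s, a,    i, n) {
        n = length(s)
        for (i = 1; i <= n; i++)
            a[i] = substr(s, i, 1)
        return n
    }
    BEGIN {
        n = chars(s, a)
        for (i = 1; i <= n; i++)
            printf "a[%d] == \"%s\"\n", i, a[i]
    }'
}

split_chars abcd
```

The loop costs one substr call per character, which is the overhead a built-in null-string separator would avoid.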
There are more things that could be checked for portability.
RECLEN variable for fixed length records
Along with FIELDWIDTHS, this would speed up the processing of fixed-length records.
RT variable to hold the record terminator
It is occasionally useful to have access to the actual string of
characters that matched the
RS variable. The RT
variable would hold these characters.
restart would restart the pattern
matching loop, without reading a new record from the input.
A |& redirection
The |& redirection, in place of |, would open a two-way
pipeline for communication with a sub-process (via
IGNORECASE affecting all comparisons
The effects of the
IGNORECASE variable may be generalized to
all string comparisons, and not just regular expression operations.
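For now, a string comparison that should ignore case has to fold both operands explicitly. A minimal sketch using the standard tolower function follows; the wrapper name same_ignoring_case is hypothetical.

```shell
# same_ignoring_case is a hypothetical wrapper name used only for this sketch.
same_ignoring_case() {
    awk -v x="$1" -v y="$2" '
    BEGIN {
        # Fold both operands to lower case before the string comparison.
        if (tolower(x) == tolower(y)) {
            print "equal"
        } else {
            print "different"
        }
    }'
}

same_ignoring_case Hello hELLO
```

If IGNORECASE were generalized as described, the explicit tolower calls would no longer be needed.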
A way to mix command line source code and library files
There may be a new option that would make it possible to easily use library
functions from a program entered on the command line.
GNU-style long options
We will add GNU-style long options to
gawk for compatibility with other GNU programs.
(For example, --field-separator=: would be equivalent to
Here are some projects that would-be
gawk hackers might like to take
on. They vary in size from a few days to a few weeks of programming,
depending on which one you choose and how fast a programmer you are. Please
send any improvements you write to the maintainers at the GNU
gawk uses a Bison (YACC-like)
parser to convert the script given it into a syntax tree; the syntax
tree is then executed by a simple recursive evaluator. This method incurs
a lot of overhead, since the recursive evaluator performs many procedure
calls to do even the simplest things.
It should be possible for
gawk to convert the script's parse tree
into a C program which the user would then compile, using the normal
C compiler and a special
gawk library to provide all the needed
functions (regexps, fields, associative arrays, type coercion, and so on).
An easier possibility might be for an intermediate phase of gawk to
convert the parse tree into a linear byte code form like the one used
in GNU Emacs Lisp. The recursive evaluator would then be replaced by
a straight line byte code interpreter that would be intermediate in speed
between running a compiled program and doing what gawk does now.
This may actually happen for the 3.0 version of gawk.
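To make the idea of a straight line interpreter concrete, here is a toy sketch written in awk itself for brevity. It is not a design for gawk: a real interpreter would be written in C inside gawk, and the opcodes push, add, mul and print, as well as the name run_bytecode, are invented for this illustration. The expression (2 + 3) * 4 is stored as a flat list of instructions and executed by a single loop over a value stack, with no recursion.

```shell
# run_bytecode is a hypothetical name; the opcodes are invented for this sketch.
run_bytecode() {
    awk '
    BEGIN {
        # Linear code for the expression (2 + 3) * 4, in postfix order.
        n = split("push 2,push 3,add,push 4,mul,print", code, ",")
        sp = 0
        # One flat loop takes the place of a recursive walk over a tree.
        for (pc = 1; pc <= n; pc++) {
            split(code[pc], ins, " ")
            if (ins[1] == "push") {
                stack[++sp] = ins[2] + 0
            } else if (ins[1] == "add") {
                stack[sp - 1] = stack[sp - 1] + stack[sp]
                sp--
            } else if (ins[1] == "mul") {
                stack[sp - 1] = stack[sp - 1] * stack[sp]
                sp--
            } else if (ins[1] == "print") {
                print stack[sp--]
            }
        }
    }'
}

run_bytecode
```

Running it prints 20. Each instruction is handled by one pass through the loop body, rather than by a chain of nested procedure calls.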
An error message section has not been included in this version of the
manual. Perhaps some nice beta testers will document some of the messages
for the future.
The programs in the test suite could use documenting in this manual.
The programs and data files in the manual should be available in
separate files to facilitate experimentation.
See the FUTURES file for more ideas. Contact us if you would
seriously like to tackle any of the items listed there.