
Debugging Your Parser

Zyacc provides facilities to get a debugger compiled into your program. The resulting program can be debugged in several different ways:

  1. You can interact with the program using a command-line interface and control it using simple single letter debugging commands. This approach has the disadvantage that debugger I/O is interspersed with the I/O of your program.
  2. As in (1) above, but separate processes are used to run your program and the debugger interface. These processes communicate using a socket-based interface. Hence the program being debugged can be on a computer different from the one on which you are doing the debugging, or it could be on the same one (the usual case). When debugging and running your program on the same computer, you can run your program in one window and debug it from another window. This avoids the problem in (1) where debugger and program I/O are interspersed.
  3. As in (2) above, but the interaction uses a java-based GUI application rather than a command-line interface. Most of the facilities of the command-line interface are available via the GUI.
  4. By specifying certain environmental variables when you run your program, you can get it to generate HTML files which can then be accessed using any Web browser. The GUI used in (3) is then run as a java applet by accessing the generated HTML files using your favorite Web browser. This approach has the advantage that the GUI can automatically use your Web browser to display the current state of the parser in the HTML files generated using the --HTML option (see section Invoking Zyacc).

Building Debugging Parsers

To build a parser which can be debugged, it is necessary to define a C preprocessor symbol when compiling the generated parser and to link your program with the Zyacc library. Optionally, if you would like token semantics to be printed out, then you need to define an auxiliary function.

Compiling Debugging Parsers

To enable compilation of debugging facilities, you must define the C preprocessor macro YY_ZYACC_DEBUG to be nonzero when you compile the parser. This can be done in any of the following ways:

  1. You can define the YY_ZYACC_DEBUG macro in the C-declarations section of section 1 of your Zyacc source file.
  2. You can use the -t or --debug option when you run Zyacc (see section Invoking Zyacc). This results in a definition of YY_ZYACC_DEBUG automatically being added to the generated parser.
  3. You can define the YY_ZYACC_DEBUG macro on the compiler command line when you compile the generated parser. For most compilers, this can typically be done by specifying the option -DYY_ZYACC_DEBUG.

The third approach is the recommended approach as it defers the debugging decision to the latest possible point in the build process.

For compatibility with other parser generators, the macro YYDEBUG can be used instead of YY_ZYACC_DEBUG. Zyacc provides YY_ZYACC_DEBUG in addition to YYDEBUG for the following reason: YYDEBUG is also used to turn on debugging in lex-based scanner generators; if a project uses both lex and yacc, and its compilation is controlled by a Makefile then it can sometimes be inconvenient to turn on debugging in the parser but not in the scanner, or vice-versa.
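
Your own code can check at compile time whether debugging was requested on the compiler command line. The following is only a sketch: the names PARSER_DEBUG_BUILD, parser_debug_build and announce_build are hypothetical and are not part of Zyacc. It also sees only definitions visible in the file being compiled, so a definition added to the generated parser by the -t option is not visible from other source files.

```c
#include <assert.h>
#include <stdio.h>

/* Inside #if an undefined macro evaluates to 0, so this test works whether
 * or not either macro was defined (e.g. with -DYY_ZYACC_DEBUG, which
 * defines the macro as 1).  PARSER_DEBUG_BUILD is a hypothetical name. */
#if YY_ZYACC_DEBUG || YYDEBUG
#define PARSER_DEBUG_BUILD 1
#else
#define PARSER_DEBUG_BUILD 0
#endif

/* Returns 1 if this file was compiled with parser debugging requested. */
int parser_debug_build(void)
{
  return PARSER_DEBUG_BUILD;
}

/* Prints a one-line note, e.g. at program startup. */
void announce_build(FILE *out)
{
  assert(out != NULL);
  fprintf(out, "parser debugging %s\n",
          parser_debug_build() ? "requested" : "not requested");
}
```

Calling announce_build(stderr) at startup makes it easy to tell a debugging build from a production build without changing the Makefile.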

Linking Debugging Parsers

A debugging parser must be linked with the Zyacc library. For most compilers, this is done using the -L option (to specify the library path) and the -l option (to specify the library name). Depending on your system, you may also need to link with the networking libraries; typically this means adding -lsocket and -lnsl to the link command line. For example, if your project consists of a parser with object file parse.o and some helper functions in helpers.o, and the Zyacc library is installed in the lib subdirectory of your home directory, then a typical link command line is:

	cc parse.o helpers.o -L$HOME/lib -lzyacc -lsocket -lnsl -o my_prj
It is usually necessary that the -L and -l options occur after the object files.

Printing Token Semantics

If you would like to have the debugger print out tokens using their semantics (rather than the token names), then you need to define a function to print them out. If your function is semFn, then it should have the prototype

void semFn(FILE *out, int tokNum, void *yylvalP);
and should print to the FILE out the semantics associated with the token whose token number is tokNum and whose yylval semantic value (see section Semantic Values of Tokens) is pointed to by yylvalP.

You may call this function whatever you like. You should communicate the name of the function to Zyacc by defining (in the C declarations section of the source file) the C preprocessor macro YY_SEM_FN to the chosen name.

If you are only using the textual interface, then this function may print anything you wish. However, if you wish to use the GUI, then it is imperative that the function not print any newlines.

If you are using the GUI, then you may also want to define the C macro YY_SEM_MAX to specify the maximum number of characters your semantic function will print. If you do not define this macro, then a default value is used.
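
As an illustration, here is a minimal sketch of such a function. The token numbers NUM_TOK and ID_TOK, the members of YYSTYPE and the value chosen for YY_SEM_MAX are assumptions made up for this example; in a real project the token numbers and YYSTYPE come from your grammar. What to print for tokens without interesting semantics is also a choice made for this example only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical token numbers and semantic-value type; in a real project
 * these come from the %token and %union declarations of the grammar. */
enum { NUM_TOK = 258, ID_TOK = 259 };

typedef union {
  int num;         /* value of a NUM_TOK */
  const char *id;  /* spelling of an ID_TOK */
} YYSTYPE;

/* Maximum number of characters semFn prints (assumed value). */
#define YY_SEM_MAX 16

/* Communicate the name of the semantics function to Zyacc. */
#define YY_SEM_FN semFn

/* Print to out the semantics of token tokNum whose yylval is at yylvalP.
 * It never prints a newline, so it can also be used with the GUI. */
void semFn(FILE *out, int tokNum, void *yylvalP)
{
  const YYSTYPE *v = yylvalP;
  assert(out != NULL);
  switch (tokNum) {
  case NUM_TOK:
    fprintf(out, "%d", v->num);
    break;
  case ID_TOK:
    /* The precision truncates long names to YY_SEM_MAX characters. */
    fprintf(out, "%.*s", YY_SEM_MAX, v->id);
    break;
  default:
    /* Single-character tokens: print only printable characters. */
    if (tokNum > 0 && tokNum < 256 && isprint(tokNum))
      fprintf(out, "'%c'", tokNum);
    break;
  }
}
```

Truncating with a precision keeps the output within YY_SEM_MAX regardless of how long an identifier is, and restricting the default case to printable characters ensures that a newline token never reaches the GUI.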

Environmental Variables

Since the program being run is your program and Zyacc has no control over the arguments accepted by your program from the command-line, all arguments to the debugger are provided via environmental variables.

If your shell is derived from the C-shell (csh, tcsh, etc), then you can specify an environmental variable using the setenv command. For example,

	setenv	ZYDEBUG_PORT 1

If your shell is derived from the Bourne-shell (sh, ksh, bash, etc), then you can specify an environmental variable using the export command. For example,

	export	ZYDEBUG_PORT=1

If you prefer to avoid cluttering up your environment with gratuitous definitions, then the sh-based derivatives provide a neat alternative. You can simply type the variable definitions immediately before the name of your program, as illustrated by the following example:

	ZYDEBUG_PORT=1 my_prg my_arg1 my_arg2

The environmental variables used by the Zyacc debugger are listed below. They are discussed in more detail later.

ZYDEBUG_APPLET
Specifies the name of an applet to be generated by the debugger.
ZYDEBUG_CODEBASE
Specifies the relative path to the java-based GUI debugger.
ZYDEBUG_HTMLBASE
Specifies the relative path to the HTML files describing your parser.
ZYDEBUG_PORT
Suggests a socket port to be used by the debugger.
ZYDEBUG_SRCBASE
Specifies the relative path to the source file for your parser.
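
When a debugging session does not start in the mode you expected, it can help to print what your program actually sees in its environment. The sketch below is not part of Zyacc; the names DebugMode, debug_mode and report_debug_env are hypothetical. The mode rules it applies are simply those described in this chapter: no variables gives the single-process interface, ZYDEBUG_PORT gives a socket-based interface (for either zydebug or the standalone GUI), and ZYDEBUG_APPLET causes HTML files for the applet to be generated.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names for the ways of debugging described in this chapter. */
typedef enum { MODE_SINGLE_PROCESS, MODE_MULTI_PROCESS, MODE_APPLET } DebugMode;

/* Apply the documented rules to the raw variable values (NULL if unset). */
DebugMode debug_mode(const char *applet, const char *port)
{
  if (applet != NULL) return MODE_APPLET;       /* ZYDEBUG_APPLET is set */
  if (port != NULL) return MODE_MULTI_PROCESS;  /* ZYDEBUG_PORT is set */
  return MODE_SINGLE_PROCESS;                   /* nothing is set */
}

/* Print every ZYDEBUG_* variable and the mode they imply. */
void report_debug_env(FILE *out)
{
  static const char *const names[] = {
    "ZYDEBUG_APPLET", "ZYDEBUG_CODEBASE", "ZYDEBUG_HTMLBASE",
    "ZYDEBUG_PORT", "ZYDEBUG_SRCBASE"
  };
  static const char *const modes[] = {
    "single process", "multiple process (socket)", "applet (HTML files)"
  };
  size_t i;
  assert(out != NULL);
  for (i = 0; i < sizeof names / sizeof names[0]; i++) {
    const char *value = getenv(names[i]);
    fprintf(out, "%s=%s\n", names[i], value != NULL ? value : "(unset)");
  }
  fprintf(out, "mode: %s\n",
          modes[debug_mode(getenv("ZYDEBUG_APPLET"), getenv("ZYDEBUG_PORT"))]);
}
```

Calling report_debug_env(stderr) just before yyparse shows, for example, whether a variable set with setenv in one window is really visible to the program started from another.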

Debugging Parsers Using a Textual Interface

Parsers can be debugged using a textual interface with a single process for both your program and the debugging interface, or by using a separate process for your program and a separate process for the debugger interface. The single process alternative is fine as long as your program does not generate any terminal I/O. If your program does generate terminal I/O, then sorting out your program's I/O from the debugger's I/O could get messy and multi-process debugging is to be preferred.

In a networked environment, the multi-process debugger also allows remote debugging, with the program being debugged running on one computer and the debugging interface running on another computer. For this to work, your network environment must support BSD-style sockets; this is the case for most modern systems.

Starting a Single Process Textual Debugger

To debug your program using a single-process textual interface, you must first compile and link your program as outlined in section Building Debugging Parsers. Then start your program the way you normally do, specifying any command-line arguments required. You should not specify any of the debugger's environmental variables. Your program starts up as normal and, when the parsing function yyparse is first entered, the debugger takes control and allows you to interact with it using the debugger commands (see section Textual Debugger Commands). (As mentioned earlier, if your program does any terminal I/O, then the debugger interaction will be interspersed with your program's I/O.)

Starting a Multiple Process Textual Debugger

To debug your program using a textual interface with multiple processes, you must first compile and link your program as outlined in section Building Debugging Parsers. To start your program, you should define the environmental variables described below and then start your program the way you normally do, specifying any command-line arguments required. To make the debugger start in multiple-process mode, the environmental variable ZYDEBUG_PORT (see section Environmental Variables) must be defined when the program is started. The value specified for ZYDEBUG_PORT should be a suggested socket port number to use: Zyacc merely uses it as a starting point in its search for a free socket, and it is usually best to specify it simply as 1. Once it finds a free port, it outputs its number to the terminal as follows:

	zydebug port: 6001
Given the port number, you can start that portion of the debugger which communicates with your program. To do this, you should run the zydebug program (which should have been installed along with Zyacc) giving it the above socket number:
	zydebug 6001
If you want to run zydebug on a computer different from the one where your program is running then simply type:
	zydebug HOST PORT
where HOST is the network address (hostname or dotted IP address) of the computer on which your program is running, and PORT is the port number output when you started your program. For example, if I started the program above on host alpha.romeo.com, I would use:
	zydebug alpha.romeo.com 6001

Textual Debugger Commands

Irrespective of the manner in which you start the debugger, the commands it accepts are always the same: a single letter followed by an optional argument. These commands allow you to specify breakpoints and displaypoints. When the parser encounters a breakpoint the debugger suspends execution of the parser and allows you to interact with it. When the parser encounters a displaypoint, the debugger displays information about the state of the parse. The concepts of displaypoints and breakpoints are orthogonal -- i.e. it is possible to display information at a point in the parse without suspending parser execution at that point, or vice-versa.

The commands which are relevant to human users are the following:

b [breakSpec]
Set or list breakpoint(s) as specified by breakSpec. At a breakpoint, the parser stops and the user can type in commands (it may or may not display its current state, depending on whether or not a displaypoint is set too). If breakSpec is omitted then list all breakpoints. breakSpec can have one of the following forms with the specified meaning:
TERM
Terminal with name TERM is about to be shifted.
NON_TERM
Any rule with LHS nonterminal named NON_TERM is about to be reduced.
RULE_NUM
Rule number RULE_NUM is about to be reduced.
%n
Reduce action on any nonterminal.
%t
Shift action on any terminal.
*
Both shift and reduce actions.
B [breakSpec]
Clear breakpoint(s) as specified by breakSpec. At a breakpoint, the parser stops and the user can type in commands (it may or may not display its current state, depending on whether or not a displaypoint is set too). If breakSpec is omitted then clear all breakpoints. breakSpec is as specified for the b command.
c [temporaryBreakSpec]
Continue execution until a breakpoint is entered. If temporaryBreakSpec is specified, then it specifies a temporary breakpoint which is automatically cleared whenever the next breakpoint is entered. temporaryBreakSpec can have the same form as breakSpec for the b command.
d [displaySpec]
Set or list displaypoint(s) as specified by displaySpec. At a displaypoint, the parser displays its current state (it may or may not stop to let the user interact with it, depending on whether or not a breakpoint is set too). If displaySpec is omitted then list all displaypoints. displaySpec can have the same form as breakSpec for the b command.
D [displaySpec]
Clear displaypoint(s) as specified by displaySpec. At a displaypoint, the parser displays its current state (it may or may not stop to let the user interact with it, depending on whether or not a breakpoint is set too). If displaySpec is omitted then clear all displaypoints. displaySpec can have the same form as breakSpec for the b command.
h [cmd]
Print help on single-letter command cmd. If cmd is omitted, give help on all commands.
l [listSpec]
List terminals, non-terminals or rules as specified by listSpec. If listSpec is omitted, then all rules are listed. Otherwise listSpec can be one of the following:
%n
List all non-terminal symbols.
%t
List all terminal symbols.
RULE_NUM
List rule with number RULE_NUM.
m [depth]
Set maximum depth printed for the stack. If depth is 0 or omitted, then entire stack is printed.
n
Execute parser till next shift action. Equivalent to c %t.
p
Print current parser state.
q
Quit debugger and run parser without debugging.
s
<blank line>
Single-step parser to next shift or reduce action. Equivalent to c *.
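
The forms accepted for breakSpec and displaySpec can be told apart largely by their spelling. The sketch below illustrates this; it is not the debugger's own code, and the names SpecKind and classify_spec are hypothetical. Deciding whether a name is a TERM or a NON_TERM requires the grammar's symbol table, so both are reported as a symbol name here. The argument is assumed to have no trailing whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of the argument to b, B, c, d and D. */
typedef enum {
  SPEC_NONE,         /* argument omitted */
  SPEC_ALL,          /* "*": both shift and reduce actions */
  SPEC_ANY_NONTERM,  /* "%n": reduce action on any nonterminal */
  SPEC_ANY_TERM,     /* "%t": shift action on any terminal */
  SPEC_RULE_NUM,     /* all digits: a rule number */
  SPEC_SYMBOL        /* anything else: a TERM or NON_TERM name */
} SpecKind;

/* Classify spec purely by its spelling, skipping leading whitespace. */
SpecKind classify_spec(const char *spec)
{
  assert(spec != NULL);
  while (isspace((unsigned char)*spec)) spec++;
  if (*spec == '\0') return SPEC_NONE;
  if (strcmp(spec, "*") == 0) return SPEC_ALL;
  if (strcmp(spec, "%n") == 0) return SPEC_ANY_NONTERM;
  if (strcmp(spec, "%t") == 0) return SPEC_ANY_TERM;
  /* strspn gives the length of the leading run of digits; if that run
   * covers the whole (non-empty) string, the spec is a rule number. */
  if (spec[strspn(spec, "0123456789")] == '\0') return SPEC_RULE_NUM;
  return SPEC_SYMBOL;
}
```

For instance, under these assumptions the argument of b 12 is classified as a rule number, while the argument of c %t is classified as a shift on any terminal, which is what the n command abbreviates.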

Debugging Parsers Using a Graphical User Interface

It is also possible to use a java-based GUI to debug parsers generated by Zyacc. This can be done in two ways:

  1. The GUI can be used as a standalone java application. To do this, you should have a java runtime system installed on your computer.
  2. The GUI can be used as an applet from within a web browser which supports java.

Starting a GUI Debugger as a Java Application

To debug your program using a GUI java application, you must first compile and link your program as outlined in section Building Debugging Parsers. To use the GUI as a standalone application, you should first start your program the way you normally do, specifying any command-line arguments required. The following environmental variables (see section Environmental Variables) are used by your program to control the setup of the debugger.

ZYDEBUG_PORT
This is required. As discussed earlier (see section Starting a Multiple Process Textual Debugger), its value is best specified simply as 1.
ZYDEBUG_SRCBASE
This should be the relative path to the parser source file (the `.y' file) from the directory which is current when the parsing function yyparse is first entered. (The directory in which yyparse is entered is usually the same directory in which your program executable resides, assuming that your program has not performed any chdir() calls and that you started the program in the directory in which it resides.) If it is not specified, it is assumed that your parser source file lives in the directory which was current when the parsing function was first entered.

When your program starts it will output the port number of the socket it has connected to; for example:

	zydebug port: 6001
This port number will be used to connect the GUI to your executing program.

To start the GUI, you must run the java runtime system on your machine, telling it where to find the java classfiles referenced by zdu.zydebug.ZYDebug which form the GUI and providing it with arguments telling it how to connect to your executing program. The exact procedures may be different on your machine, but as of this writing, the most common setup is as follows:

Java is started by simply using the command java. For the java command to work, it must be in your PATH (see section Environmental Variables) or you should specify its full path name. For java to find the java classfiles for the debugger, those files must be in your CLASSPATH (see section Environmental Variables). Finally, you must specify the port number output by your program (optionally preceded by the hostname or dotted IP address of the computer on which your program is running, if it is different from the one on which you are running the GUI).

The following example shows how to start the debugger GUI under csh or its derivatives.


% setenv CLASSPATH /usr/local/share/classes
% java zdu.zydebug.ZYDebug 6001 &

Under sh or derivatives, the following command can be used to connect to a program running on a different machine:

$ CLASSPATH=/usr/local/share/classes java zdu.zydebug.ZYDebug pear 6001 &
where pear is the name of the machine on which the program is running.

Starting a GUI Debugger as a Java Applet

A java applet can only be run by being embedded within an HTML file which provides the environment in which the applet lives. Among other things, the HTML file provides the arguments to the applet.

The debugging applet talks to the program being debugged using a socket referred to by a port number. Since this port number can vary for different executions of the program being debugged, it has to be provided as an argument to the applet. As an applet gets its argument from its associated HTML file, and this particular port number argument can be different for different executions of the program being debugged, the HTML file associated with the debugging applet has to be generated dynamically.

The HTML file is generated by your enhanced program when it is started (actually a second HTML file is generated as well). Using environmental variables (see section Environmental Variables) you can specify the names of the HTML files, as well as paths to various other resources required by the applet.

To debug your program using a GUI java applet, you must first compile and link your program as outlined in section Building Debugging Parsers. You should then start your program the way you normally do, specifying any command-line arguments required. The following environmental variables (see section Environmental Variables) are used by your program to control the setup of the debugger.

ZYDEBUG_APPLET
This is required. It specifies the root name used for the generated HTML documents which are used to access the debugging applet.
ZYDEBUG_CODEBASE
This should provide the path to the debugger's java classfiles relative to the directory where the generated HTML documents live (relative to the DOCBASE directory in java terminology). If not specified, then it is assumed that the java classfiles reside in the same directory as the generated HTML documents.
ZYDEBUG_HTMLBASE
This should provide the path to the parser's HTML description files (generated using the --HTML option; see section Invoking Zyacc), relative to the directory where the generated HTML documents live (relative to the DOCBASE directory in java terminology). If not specified, then it is assumed that the HTML description files reside in the same directory as the generated HTML documents.
ZYDEBUG_PORT
This is not required. As discussed earlier (see section Starting a Multiple Process Textual Debugger), its value is best specified simply as 1.
ZYDEBUG_SRCBASE
This is exactly as for the standalone GUI application. It should be the relative path to the parser source file (the `.y' file) from the directory which is current when the parsing function yyparse is first entered. (The directory in which yyparse is entered is usually the same directory in which your program executable resides, assuming that your program has not performed any chdir() calls and that you started the program in the directory in which it resides.) If it is not specified, it is assumed that your parser source file lives in the directory which was current when the parsing function was first entered.

Usually, the parser source file and the HTML description files are in the same directory as the program executable and all you need to specify are ZYDEBUG_APPLET and ZYDEBUG_CODEBASE.

Consider the following more complicated situation:

Under csh and derivatives, you can use the following sequence of commands:

setenv ZYDEBUG_APPLET $HOME/tmp/XXX
setenv ZYDEBUG_CODEBASE /usr/local/share/classes
setenv ZYDEBUG_HTMLBASE `pwd`/..
setenv ZYDEBUG_SRCBASE `pwd`/..
./foo bar

Under sh and derivatives, the following command suffices:

ZYDEBUG_APPLET=$HOME/tmp/XXX ZYDEBUG_CODEBASE=/usr/local/share/classes \
  ZYDEBUG_HTMLBASE=`pwd`/.. ZYDEBUG_SRCBASE=`pwd`/.. \
  ./foo bar

Having generated the HTML files, you can now use a browser to start the debugger's GUI and debug your parser. There are two methods of doing this available in most browsers:

Using the Debugger GUI

As the applet is talking to your compiled program which has not been modified in any way, it is not possible to have the applet restart your program once it has completed its parse. Instead you will have to restart the parser and the debugger's GUI applet.

The applet has four main windows. Clockwise from the top-left they are the following:

Parse Forest Window
Shows the current parse forest. The nodes in the top row are the nodes currently on the parse stack. Terminal nodes are in red, non-terminal nodes are in green and error nodes are in pink. The last active node is highlighted in yellow. Each node contains text of the form S/Sym, where S is the state at which that node was created and Sym is the grammar symbol or token semantics corresponding to the node. Each non-leaf node in the forest is clickable. Clicking on such a node hides all its subtrees; clicking again on that node displays the subtrees again. This can be useful as the parse forest typically gets pretty large for practical parsers.
Trace Window
Shows the parse stack in gray (each entry is in the same format as a parse tree node), the current lookahead in red and the following action in blue.
Breakpoint Window
This allows you to set/clear breakpoints on all or selected nonterminals and terminals. Clicking a line in the window sets a breakpoint on the symbol displayed on that line; clicking it again clears the breakpoint. The currently selected breakpoints are highlighted.
Source Window
Shows the parser source file. During a reduction, the line corresponding to the reduction is highlighted.

The debugger is controlled by the following controls:

Shadows Checkbox
This checkbox controls whether or not the parser shows crude shadows while displaying the parse forest. It is useful to avoid having shadows cluttering up the display of large forests.
Update Checkbox
Selecting this checkbox results in the parser displaying the current state in the LR(0) machine in a browser frame. For this to work, the parser should have been generated using the --HTML option (see section Invoking Zyacc), and the environmental variable ZYDEBUG_HTMLBASE should have been specified when the parser was started.
Step Button
Steps the parser by a single step.
Next Button
Steps the parser to the next shift action.
Continue Button
Steps the parser till the next breakpoint. If no breakpoints are set, then the parser runs to completion. As mentioned above, it is not possible to restart the parser.

Tradeoffs between Debugging Approaches

The popular adage "a picture is worth a thousand words" may be true in the real world, but it may not hold for the GUI debugger with its primitive visualization techniques. For practical parsers, the parse forest displayed by the debugger rapidly becomes unmanageable, even after hiding many large subtrees. If the large amount of screen real estate taken up by the parse forest were occupied instead by words, more information might be conveyed.

It is probably best to use the GUI debugger only under the following conditions:

Another problem with GUI debugging is that it can be quite slow as the java GUI has to build up the parse forest, trace output, etc. I have not had a chance to analyze its performance, but I would not be surprised if it is spending a fair amount of time merely collecting garbage.

For debugging most practical parsers, I would recommend the textual interface.

Feedback: Please email any feedback to zdu@acm.org.

