The TOM compiler compiles TOM source to C, intended for compilation by GNU
CC. Every effort is made to ensure that tools available for
programs written in C can be applied to TOM programs as well. TOM
programs can be profiled using gprof or Quantify,
and tested using PureCoverage.
You won't need a malloc debugger, though Purify is
known to work. Other testing products probably work as well;
those mentioned here have actually been used on TOM programs. (If
you know of any developer tool that apparently does or does not
work with TOM, please mail tiggr at gerbil.org.)
TOM programs can be debugged using the GNU debugger, GDB, provided GNU
CC on your system supports GDB debugging (this includes all
machines currently supported by TOM). Using an
unmodified GDB, debugging TOM code is at least as easy as
debugging C code. If GDB is unavailable on your system, you can
probably get the same functionality using any other debugger,
modulo the differences between that debugger and GDB. What
follows is a collection of remarks and hints on the subject; they
are the result of more than a year of experience with using GDB on
TOM programs.
Example input to GDB is prefixed by the usual GDB prompt,
`(gdb) '. Shell input is prefixed by `$ '.
- Foolproof
- To usefully run programs under a debugger, you need to pass the
-g option to the C compiler for every C source file
you compile. In the usual setup of TOM compilation (using the TOM
makefiles) -g is already passed to the C compiler
automatically.
The TOM compiler emits #line directives to point the
C compiler and the debugger to the actual TOM source file instead
of the intermediate C file emitted by the TOM compiler. Thus, any
error message emitted by the C compiler (which should not happen
or you'd have encountered a bug in the TOM compiler) and every
action in the debugger will be concerned with the TOM source.
You never see the intermediate C file. Throw it away if
you feel like it; it won't matter.
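To see what a #line directive does, here is a minimal C sketch. The file name Cow.t and the line number 42 are made up for illustration; they are not taken from actual TOM compiler output.

```c
/* After a #line directive, the C compiler (and hence the debug
   information) numbers the following lines as if they came from the
   named file.  "Cow.t" and 42 are hypothetical. */
#line 42 "Cow.t"
int reported_line(void) { return __LINE__; }         /* counts as Cow.t:42 */
const char *reported_file(void) { return __FILE__; } /* counts as Cow.t:43 */
```

A breakpoint on reported_line should therefore be shown by the debugger as located in Cow.t rather than in the C file that contains the function.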
- Local variables
- Every TOM method has a corresponding C function (see methods). Local variables in a method map
directly to the C local variables understood by GDB. When
compiling with optimization turned on, GCC can map multiple local
variables to the same stack slot or register; it can even
eliminate local variables. If this hurts, compile with
optimization turned off, i.e. do not pass -O or -O2 to the
compiler. With the TOM makefiles, simply invoke make with extra
CFLAGS.
- Class variables
- Class variables, which are the closest TOM has resembling global
variables, come in three flavors: local (to a thread), static, and
normal. Thread-local variables won't be discussed here.
Static variables in TOM correspond to global variables in C, with some
simple prefixing to ensure a unique name. For example, the static
variable num_cows in the class MyClass
in the unit MyUnit is available in the debugger as
c_MyUnit_MyClass_num_cows. If an extension named
MyExtension of MyClass defines a static
variable num_bulls, it will be available as
c_MyUnit_MyClass_MyExtension_num_bulls in the
debugger. Note that the name of the unit containing the extension
is irrelevant; extension names must be unique within a class
anyway.
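The prefixing rule can be written down as a small C helper. This is only a sketch of the naming scheme as described here; static_var_symbol is a hypothetical function and not part of the TOM tools or runtime.

```c
#include <stdio.h>

/* Build the C symbol name of a static class variable:
   c_<unit>_<class>_<variable>, or, for a variable defined by an
   extension, c_<unit>_<class>_<extension>_<variable>.
   Pass NULL as the extension for a variable of the main class.
   Hypothetical helper, for illustration only. */
int static_var_symbol(char *buf, size_t size, const char *unit,
                      const char *cls, const char *extension,
                      const char *var)
{
    if (extension != NULL)
        return snprintf(buf, size, "c_%s_%s_%s_%s",
                        unit, cls, extension, var);
    return snprintf(buf, size, "c_%s_%s_%s", unit, cls, var);
}
```

Note that the unit of the extension does not appear among the arguments: only the unit of the class matters.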
A normal class variable is part of the class object and of the class
object of every subclass. Such variables can be examined in much the
same way as instance variables. For this purpose, you
need a pointer to the class object, which you can retrieve in two
different ways, again using MyUnit.MyClass as an
example:
&_md_c_MyUnit_MyClass
- This is the actual address of the class object. It will never
change, since the class object is a statically allocated C struct.
_mr_c_MyUnit_MyClass
- This refers to the class object in the same way that invoking a class
method does. The value of this pointer is thus affected by
posing. Without posing, its value is
&_md_c_MyUnit_MyClass.
To refer to the meta class object, replace the c (for
Class) by an m (for Meta class), as in
_mr_m_MyUnit_MyClass and
&_md_m_MyUnit_MyClass.
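The difference between the two kinds of symbol can be pictured with a toy model in C. Everything here is a stand-in: the struct layout, the names md_MyClass, md_Poser and mr_MyClass, and the function toy_pose are assumptions made for illustration and do not reflect the actual runtime.

```c
/* Toy stand-in for a class object. */
struct toy_class { const char *name; };

/* Statically allocated class objects: their addresses never change
   (the role played by the _md_ symbols). */
struct toy_class md_MyClass = { "MyClass" };
struct toy_class md_Poser   = { "Poser" };

/* The reference used when invoking class methods
   (the role played by the _mr_ symbols). */
struct toy_class *mr_MyClass = &md_MyClass;

/* Posing redirects the reference; the static object stays put. */
void toy_pose(struct toy_class *poser) { mr_MyClass = poser; }
```

Before toy_pose is called, mr_MyClass equals &md_MyClass; afterwards only the reference has changed.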
- Methods
- Every method implementation is a C function to GDB. Given the class
implementing the method and the selector to which the method
corresponds, the name of the C function is easily deduced. For
example, the
alloc method implemented by the
tom.State class corresponds to the function
c_tom_State_r_alloc. The prefix is the same as for
class variables, except that for instance methods, the leading
c should be replaced by an i. The
suffix of the method name is the mangled name of the corresponding
selector. Once you've seen a few examples, you'll get the hang of
it; see the TOM language reference manual for a discussion of the
selector name mangling scheme.
To invoke a method from the debugger, there are several options. The
simplest is of course to directly invoke the function implementing
the method. For example, to allocate a new State
object:
|
(gdb) print c_tom_State_r_alloc (_mr_c_tom_State, 0)
$3 = (struct trt_instance *) 0x2002380
|
The first argument is the receiver, in this case the
State class object. The second argument is the
second implicit argument, the selector. Since we `know' that the
alloc method won't use the selector, we can safely
pass 0. If we were to pass the actual selector, it would be
&_sd_r_alloc, i.e. the mangled selector name
preceded by _sd_. Any arguments to the method come
after the selector.
An advantage of directly invoking the implementation is that you know
which implementation you invoke. In fact, you've just done method
binding by hand. This is not free of perils: you can make errors,
for example forgetting that the object is (an instance of) a
subclass that actually overrides that method. To overcome this
problem, the function send_msg, or its shorthand
s, can be used, which first checks that the receiver
is a valid object and then dispatches the message as usual.
|
(gdb) p s(_mr_c_tom_State,&_sd_r_alloc)
$4 = (struct trt_instance *) 0x2002390
|
Keep in mind that this is a little trickier than a direct call,
since the types of the arguments are not known to the compiler, as
send_msg employs va_arg to retrieve the
arguments to be passed to the method implementation.
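Why the argument types matter can be shown with a toy variadic dispatcher in C. This is not the real send_msg: toy_send and scale_impl are hypothetical, and the dispatch is hard-wired to one implementation instead of being driven by the selector.

```c
#include <stdarg.h>

/* Hypothetical method implementation: (receiver, selector, int, double). */
double scale_impl(void *self, void *selector, int n, double f)
{
    (void)self;
    (void)selector;
    return n * f;
}

/* Toy dispatcher in the spirit of send_msg.  It cannot see the types
   used at the call site, so it fetches each argument with the type the
   implementation expects.  The caller must pass exactly those types. */
double toy_send(void *receiver, void *selector, ...)
{
    va_list ap;
    va_start(ap, selector);
    int n = va_arg(ap, int);
    double f = va_arg(ap, double);
    va_end(ap);
    return scale_impl(receiver, selector, n, f);
}
```

Calling toy_send(0, 0, 3, 2.0) is fine. Writing 2 instead of 2.0 passes an int where a double is fetched, which is undefined behavior in C. The same care is needed when typing arguments to s at the GDB prompt.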
What is left is a word about multiple return values. All but the first
are passed as invisible last arguments. In the following example,
a Cons cell is allocated and initialized with the two
previously allocated State objects.
|
(gdb) p c_tom_Cons_r_with__rr_ (_mr_c_tom_Cons, 0, $3, $4)
$6 = (struct i_tom_Cons *) 0x20023a0
|
We now want to invoke the decons method, which returns
the two objects referenced by the Cons object.
The first return value is returned as normal; the second is
returned by reference, so we need some space for it, for
instance by calling malloc.
|
(gdb) p malloc (sizeof (void *))
$7 = 3380356
|
And now we can retrieve the actual objects:
|
(gdb) p i_tom_Cons__rr__decons ($6, 0, $7)
$8 = (struct i__builtin__Any *) 0x2002380
(gdb) x/x $7
0x339484: 0x02002390
|
Obviously, invoking a method with multiple return values isn't all
that trivial. Luckily, you won't need to perform this exercise
in the debugger very often since most methods return a single
value.
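Seen from C, the convention for multiple return values looks roughly like the following sketch. The struct toy_cons and the function toy_decons are stand-ins invented for this example; the code actually generated for tom.Cons will differ.

```c
#include <stddef.h>

/* Toy stand-in for a Cons cell holding two object references. */
struct toy_cons { void *car; void *cdr; };

/* The first return value is the ordinary C return value.  Every
   further return value is stored through an extra pointer argument
   that follows the receiver, the selector and any normal arguments. */
void *toy_decons(struct toy_cons *self, void *selector, void **second_out)
{
    (void)selector;
    *second_out = self->cdr;
    return self->car;
}
```

In C you would simply pass the address of a local variable as second_out. At the GDB prompt no such variable is at hand, which is why the session above borrows space from malloc.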
- Examining objects
- At compile time, every TOM object is declared to the C compiler as a C
struct. Unless you have modified the runtime library to not
provide the functions described below, you should never need to
look at an object directly,
for several reasons:
- The struct only reflects information known to the compiler at
compile time while compiling the source file containing the
current method. Objects can be amended and extended at compile,
link, and run time and the odds are that the order of the
variables in the object will actually be different from the
ordering of tags in the struct.
- With multiple inheritance, the order of direct superclasses at various
places in the tools and runtime library is almost random. This
means that it will be very likely, extremely likely, for the
compiler to have a different view of the contents of an object
than the object itself at run time.
- Need a stronger reason? Here's one: It Just Doesn't Work. Do Not Try
This Anywhere, Kids.
The TOM runtime library has two functions which provide much more
functionality than what you get by staring at a struct:
print_object, or d for short, lets an
object descriptively print itself; dump_object, or
u for short, dumps the instance variables of an
object, recursively to the level specified. Normally you'll use
the short names d and u, unless a local
variable hides the name, in which case GDB will complain `called object is
not a variable'.
Both functions send their output to the err stream
declared by the stdio class. Also, they check that
the address of the object passed to them actually denotes an
object. Instead of causing a fatal signal, they'll moan about an
address that does not denote an object, or that denotes a dead
object.
void print_object (tom_object o);
- Invokes the object's
write method, which is declared as
|
OutputStream
write OutputStream s;
|
This is similar to simply printing an object to a stream; the
following call also invokes the write method:
|
[[stdio err] print my_obj];
|
Objects are free to implement the write method as
they like. For example, an array prints its elements, and a
string its characters. The default implementation of the method
prints the object's address and class:
|
(gdb) call d(c_tom_stdio_err)
#<instance 020020a0 ByteStream>
|
An object can add extra information to this by implementing the
writeFields method. For example, this is what a
File outputs:
|
(gdb) finish
Value returned is $2 = (struct i_tom_File *) 0x20002f0
(gdb) call d($)
#<instance 020002f0 File filename=TestParse.tp flags=256>
|
void dump_object (tom_object o, int simple, int level);
- If
print_object does not provide enough information,
dump_object can be called. This function calls the
dump simple method, which is implemented by the
tom.All instance and is not intended to be
overridden. There are, however, several ways in which an object
can adjust the way it is dumped.
Dumping an object outputs the object's variables:
|
(gdb) call u($,1,1)
#(File 020002f0 asi=0 descriptor=3 name="TestParse.tp" flags=256)
|
The level argument restricts the level up to which
object references are traversed. If a variable references another
object, that object is dumped with one level less to traverse. If,
however, the level would become 0, the object is simply printed,
which should cause only limited, albeit descriptive, output.
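The effect of the level argument can be sketched as a depth-limited recursion in C. This is a toy model under stated assumptions: struct toy_node, toy_print and toy_dump are hypothetical, each object has at most one reference, and the output format only loosely imitates the real one.

```c
#include <stdio.h>
#include <string.h>

/* Toy object: a name plus at most one reference to another object. */
struct toy_node { const char *name; const struct toy_node *ref; };

/* Brief form, the analogue of simply printing the object. */
static void toy_print(const struct toy_node *o, char *out, size_t size)
{
    size_t used = strlen(out);
    snprintf(out + used, size - used, "#<%s>", o->name);
}

/* Append a dump of o to out, which must hold a valid (possibly empty)
   string.  Referenced objects are dumped with one level less; at
   level 0 an object is only printed. */
void toy_dump(const struct toy_node *o, int level, char *out, size_t size)
{
    size_t used;

    if (level <= 0) {
        toy_print(o, out, size);
        return;
    }
    used = strlen(out);
    snprintf(out + used, size - used, "#(%s", o->name);
    if (o->ref != NULL) {
        used = strlen(out);
        snprintf(out + used, size - used, " ref=");
        toy_dump(o->ref, level - 1, out, size);
    }
    used = strlen(out);
    snprintf(out + used, size - used, ")");
}
```

With three nodes a, b and c chained by ref, level 1 dumps a and merely prints b, while level 2 dumps a and b and prints c.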
If an object responds positively to the question
dump_simple_p, it will not be dumped in the usual
way. Instead, its dump_simple method is invoked.
String objects use this to simply print themselves instead of
their instance variables, which aren't all that interesting anyway
(an int and a pointer).
If the argument simple to dump_object is not
0, and an object responds positively to the question
dump_self_p, it will be given the opportunity to dump
itself, for which it must implement the method dumpSelf indent
simple level to. Array objects use this to actually dump
their elements instead of having their instance variables dumped,
which are just as boring as a string's ivars.
See the documentation for the All instance for
more information on these methods.
(This highlight is a slight revamp of the developer's corner
on the old TOM site.)