703 lines
27 KiB
Plaintext
703 lines
27 KiB
Plaintext
@node Debugging Tools
|
|
@appendix Debugging Tools
|
|
|
|
Many tools lie at your disposal for debugging PintOS. This appendix
|
|
introduces you to a few of them.
|
|
|
|
@menu
|
|
* printf::
|
|
* ASSERT::
|
|
* Function and Parameter Attributes::
|
|
* Backtraces::
|
|
* GDB::
|
|
* Triple Faults::
|
|
* Debugging Tips::
|
|
@end menu
|
|
|
|
@node printf
|
|
@section @code{printf()}
|
|
|
|
Don't underestimate the value of @func{printf}. The way
|
|
@func{printf} is implemented in PintOS, you can call it from
|
|
practically anywhere in the kernel, whether it's in a kernel thread or
|
|
an interrupt handler, almost regardless of what locks are held.
|
|
|
|
@func{printf} is useful for more than just examining data.
|
|
It can also help figure out when and where something goes wrong, even
|
|
when the kernel crashes or panics without a useful error message. The
|
|
strategy is to sprinkle calls to @func{printf} with different strings
|
|
(e.g.@: @code{"<1>"}, @code{"<2>"}, @dots{}) throughout the pieces of
|
|
code you suspect are failing. If you don't even see @code{<1>} printed,
|
|
then something bad happened before that point, if you see @code{<1>}
|
|
but not @code{<2>}, then something bad happened between those two
|
|
points, and so on. Based on what you learn, you can then insert more
|
|
@func{printf} calls in the new, smaller region of code you suspect.
|
|
Eventually you can narrow the problem down to a single statement.
|
|
@xref{Triple Faults}, for a related technique.
|
|
|
|
@node ASSERT
|
|
@section @code{ASSERT}
|
|
|
|
Assertions are useful because they can catch problems early, before
|
|
they'd otherwise be noticed. Ideally, each function should begin with a
|
|
set of assertions that check its arguments for validity. (Initializers
|
|
for functions' local variables are evaluated before assertions are
|
|
checked, so be careful not to assume that an argument is valid in an
|
|
initializer.) You can also sprinkle assertions throughout the body of
|
|
functions in places where you suspect things are likely to go wrong.
|
|
They are especially useful for checking loop invariants.
|
|
|
|
PintOS provides the @code{ASSERT} macro, defined in @file{<debug.h>},
|
|
for checking assertions.
|
|
|
|
@defmac ASSERT (expression)
|
|
Tests the value of @var{expression}. If it evaluates to zero (false),
|
|
the kernel panics. The panic message includes the expression that
|
|
failed, its file and line number, and a backtrace, which should help you
|
|
to find the problem. @xref{Backtraces}, for more information.
|
|
@end defmac
|
|
|
|
@node Function and Parameter Attributes
|
|
@section Function and Parameter Attributes
|
|
|
|
These macros defined in @file{<debug.h>} tell the compiler special
|
|
attributes of a function or function parameter. Their expansions are
|
|
GCC-specific.
|
|
|
|
@defmac UNUSED
|
|
Appended to a function parameter to tell the compiler that the
|
|
parameter might not be used within the function. It suppresses the
|
|
warning that would otherwise appear.
|
|
@end defmac
|
|
|
|
@defmac NO_RETURN
|
|
Appended to a function prototype to tell the compiler that the
|
|
function never returns. It allows the compiler to fine-tune its
|
|
warnings and its code generation.
|
|
@end defmac
|
|
|
|
@defmac NO_INLINE
|
|
Appended to a function prototype to tell the compiler to never emit
|
|
the function in-line. Occasionally useful to improve the quality of
|
|
backtraces (see below).
|
|
@end defmac
|
|
|
|
@defmac PRINTF_FORMAT (@var{format}, @var{first})
|
|
Appended to a function prototype to tell the compiler that the function
|
|
takes a @func{printf}-like format string as the argument numbered
|
|
@var{format} (starting from 1) and that the corresponding value
|
|
arguments start at the argument numbered @var{first}. This lets the
|
|
compiler tell you if you pass the wrong argument types.
|
|
@end defmac
|
|
|
|
@node Backtraces
|
|
@section Backtraces
|
|
|
|
When the kernel panics, it prints a ``backtrace,'' that is, a summary
|
|
of how your program got where it is, as a list of addresses inside the
|
|
functions that were running at the time of the panic. You can also
|
|
insert a call to @func{debug_backtrace}, prototyped in
|
|
@file{<debug.h>}, to print a backtrace at any point in your code.
|
|
@func{debug_backtrace_all}, also declared in @file{<debug.h>},
|
|
prints backtraces of all threads.
|
|
|
|
The addresses in a backtrace are listed as raw hexadecimal numbers,
|
|
which are difficult to interpret. We provide a tool called
|
|
@command{backtrace} to translate these into function names and source
|
|
file line numbers.
|
|
Give it the name of your @file{kernel.o} as the first argument and the
|
|
hexadecimal numbers composing the backtrace (including the @samp{0x}
|
|
prefixes) as the remaining arguments. It outputs the function name
|
|
and source file line numbers that correspond to each address.
|
|
|
|
If the translated form of a backtrace is garbled, or doesn't make
|
|
sense (e.g.@: function A is listed above function B, but B doesn't
|
|
call A), then it's a good sign that you're corrupting a kernel
|
|
thread's stack, because the backtrace is extracted from the stack.
|
|
Alternatively, it could be that the @file{kernel.o} you passed to
|
|
@command{backtrace} is not the same kernel that produced
|
|
the backtrace.
|
|
|
|
Sometimes backtraces can be confusing without any corruption.
|
|
Compiler optimizations can cause surprising behaviour. When a function
|
|
has called another function as its final action (a @dfn{tail call}), the
|
|
calling function may not appear in a backtrace at all. Similarly, when
|
|
function A calls another function B that never returns, the compiler may
|
|
optimize such that an unrelated function C appears in the backtrace
|
|
instead of A. Function C is simply the function that happens to be in
|
|
memory just after A. In the threads task, this is commonly seen in
|
|
backtraces for test failures; see @ref{The pass function fails, ,
|
|
@func{pass} fails}, for more information.
|
|
|
|
@menu
|
|
* Backtrace Example::
|
|
@end menu
|
|
|
|
@node Backtrace Example
|
|
@subsection Example
|
|
|
|
Here's an example. Suppose that PintOS printed out this following call
|
|
stack, which is taken from an actual PintOS submission:
|
|
|
|
@example
|
|
Call stack: 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319
|
|
0xc010325a 0x804812c 0x8048a96 0x8048ac8.
|
|
@end example
|
|
|
|
You would then invoke the @command{backtrace} utility like shown below,
|
|
cutting and pasting the backtrace information into the command line.
|
|
This assumes that @file{kernel.o} is in the current directory. You
|
|
would of course enter all of the following on a single shell command
|
|
line, even though that would overflow our margins here:
|
|
|
|
@example
|
|
backtrace kernel.o 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67
|
|
0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8
|
|
@end example
|
|
|
|
The backtrace output would then look something like this:
|
|
|
|
@example
|
|
0xc0106eff: debug_panic (lib/debug.c:86)
|
|
0xc01102fb: file_seek (filesys/file.c:405)
|
|
0xc010dc22: seek (userprog/syscall.c:744)
|
|
0xc010cf67: syscall_handler (userprog/syscall.c:444)
|
|
0xc0102319: intr_handler (threads/interrupt.c:334)
|
|
0xc010325a: intr_entry (threads/intr-stubs.S:38)
|
|
0x0804812c: (unknown)
|
|
0x08048a96: (unknown)
|
|
0x08048ac8: (unknown)
|
|
@end example
|
|
|
|
(You will probably not see exactly the same addresses if you run the
|
|
command above on your own kernel binary, because the source code you
|
|
compiled and the compiler you used are probably different.)
|
|
|
|
The first line in the backtrace refers to @func{debug_panic}, the
|
|
function that implements kernel panics. Because backtraces commonly
|
|
result from kernel panics, @func{debug_panic} will often be the first
|
|
function shown in a backtrace.
|
|
|
|
The second line shows @func{file_seek} as the function that panicked,
|
|
in this case as the result of an assertion failure. In the source code
|
|
tree used for this example, line 405 of @file{filesys/file.c} is the
|
|
assertion
|
|
|
|
@example
|
|
ASSERT (file_ofs >= 0);
|
|
@end example
|
|
|
|
@noindent
|
|
(This line was also cited in the assertion failure message.)
|
|
Thus, @func{file_seek} panicked because it passed a negative file offset
|
|
argument.
|
|
|
|
The third line indicates that @func{seek} called @func{file_seek},
|
|
presumably without validating the offset argument. In this submission,
|
|
@func{seek} implements the @code{seek} system call.
|
|
|
|
The fourth line shows that @func{syscall_handler}, the system call
|
|
handler, invoked @func{seek}.
|
|
|
|
The fifth and sixth lines are the interrupt handler entry path.
|
|
|
|
The remaining lines are for addresses below @code{PHYS_BASE}. This
|
|
means that they refer to addresses in the user program, not in the
|
|
kernel. If you know what user program was running when the kernel
|
|
panicked, you can re-run @command{backtrace} on the user program, like
|
|
so: (typing the command on a single line, of course):
|
|
|
|
@example
|
|
backtrace tests/filesys/extended/grow-too-big 0xc0106eff 0xc01102fb
|
|
0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96
|
|
0x8048ac8
|
|
@end example
|
|
|
|
The results look like this:
|
|
|
|
@example
|
|
0xc0106eff: (unknown)
|
|
0xc01102fb: (unknown)
|
|
0xc010dc22: (unknown)
|
|
0xc010cf67: (unknown)
|
|
0xc0102319: (unknown)
|
|
0xc010325a: (unknown)
|
|
0x0804812c: test_main (...xtended/grow-too-big.c:20)
|
|
0x08048a96: main (tests/main.c:10)
|
|
0x08048ac8: _start (lib/user/entry.c:9)
|
|
@end example
|
|
|
|
You can even specify both the kernel and the user program names on
|
|
the command line, like so:
|
|
|
|
@example
|
|
backtrace kernel.o tests/filesys/extended/grow-too-big 0xc0106eff
|
|
0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c
|
|
0x8048a96 0x8048ac8
|
|
@end example
|
|
|
|
The result is a combined backtrace:
|
|
|
|
@example
|
|
In kernel.o:
|
|
0xc0106eff: debug_panic (lib/debug.c:86)
|
|
0xc01102fb: file_seek (filesys/file.c:405)
|
|
0xc010dc22: seek (userprog/syscall.c:744)
|
|
0xc010cf67: syscall_handler (userprog/syscall.c:444)
|
|
0xc0102319: intr_handler (threads/interrupt.c:334)
|
|
0xc010325a: intr_entry (threads/intr-stubs.S:38)
|
|
In tests/filesys/extended/grow-too-big:
|
|
0x0804812c: test_main (...xtended/grow-too-big.c:20)
|
|
0x08048a96: main (tests/main.c:10)
|
|
0x08048ac8: _start (lib/user/entry.c:9)
|
|
@end example
|
|
|
|
Here's an extra tip for anyone who read this far: @command{backtrace}
|
|
is smart enough to strip the @code{Call stack:} header and @samp{.}
|
|
trailer from the command line if you include them. This can save you
|
|
a little bit of trouble in cutting and pasting. Thus, the following
|
|
command prints the same output as the first one we used:
|
|
|
|
@example
|
|
backtrace kernel.o Call stack: 0xc0106eff 0xc01102fb 0xc010dc22
|
|
0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8.
|
|
@end example
|
|
|
|
@node GDB
|
|
@section GDB
|
|
|
|
You can run PintOS under the supervision of the GDB debugger.
|
|
First, start PintOS with the @option{--gdb} option, e.g.@:
|
|
@command{pintos --gdb -- run mytest}. Second, open a second terminal on
|
|
the same machine and
|
|
use @command{pintos-gdb} to invoke GDB on
|
|
@file{kernel.o}:@footnote{@command{pintos-gdb} is a wrapper around
|
|
@command{gdb} (80@var{x}86) that loads the PintOS macros at startup.}
|
|
@example
|
|
pintos-gdb kernel.o
|
|
@end example
|
|
@noindent and issue the following GDB command:
|
|
@example
|
|
target remote localhost:1234
|
|
@end example
|
|
@noindent or alternatively issue the following GDB macro:
|
|
@example
|
|
debugpintos
|
|
@end example
|
|
|
|
Now GDB is connected to the simulator over a local
|
|
network connection. You can now issue any normal GDB
|
|
commands. If you issue the @samp{c} command, the simulated BIOS will take
|
|
control, load PintOS, and then PintOS will run in the usual way. You
|
|
can pause the process at any point with @key{Ctrl+C}.
|
|
|
|
@menu
|
|
* Using GDB::
|
|
* Example GDB Session::
|
|
* GDB FAQ::
|
|
@end menu
|
|
|
|
@node Using GDB
|
|
@subsection Using GDB
|
|
|
|
You can read the GDB manual by typing @code{info gdb} at a
|
|
terminal command prompt. Here's a few commonly useful GDB commands:
|
|
|
|
@deffn {GDB Command} c
|
|
Continues execution until @key{Ctrl+C} or the next breakpoint.
|
|
@end deffn
|
|
|
|
@deffn {GDB Command} break function
|
|
@deffnx {GDB Command} break file:line
|
|
@deffnx {GDB Command} break *address
|
|
Sets a breakpoint at @var{function}, at @var{line} within @var{file}, or
|
|
@var{address}.
|
|
(Use a @samp{0x} prefix to specify an address in hex.)
|
|
|
|
Use @code{break main} to make GDB stop when PintOS starts running.
|
|
@end deffn
|
|
|
|
@deffn {GDB Command} p expression
|
|
Evaluates the given @var{expression} and prints its value.
|
|
If the expression contains a function call, that function will actually
|
|
be executed.
|
|
@end deffn
|
|
|
|
@deffn {GDB Command} l *address
|
|
Lists a few lines of code around @var{address}.
|
|
(Use a @samp{0x} prefix to specify an address in hex.)
|
|
@end deffn
|
|
|
|
@deffn {GDB Command} bt
|
|
Prints a stack backtrace similar to that output by the
|
|
@command{backtrace} program described above.
|
|
@end deffn
|
|
|
|
@deffn {GDB Command} p/a address
|
|
Prints the name of the function or variable that occupies @var{address}.
|
|
(Use a @samp{0x} prefix to specify an address in hex.)
|
|
@end deffn
|
|
|
|
@deffn {GDB Command} diassemble function
|
|
Disassembles @var{function}.
|
|
@end deffn
|
|
|
|
We also provide a set of macros specialized for debugging PintOS,
|
|
written by Godmar Back @email{gback@@cs.vt.edu}. You can type
|
|
@code{help user-defined} for basic help with the macros. Here is an
|
|
overview of their functionality, based on Godmar's documentation:
|
|
|
|
@deffn {GDB Macro} debugpintos
|
|
Attach debugger to a waiting pintos process on the same machine.
|
|
Shorthand for @code{target remote localhost:1234}.
|
|
@end deffn
|
|
|
|
@deffn {GDB Macro} dumplist list type element
|
|
Prints the elements of @var{list}, which must be passed by reference and should be a @code{struct list}
|
|
that contains elements of the given @var{type} (without the word
|
|
@code{struct}) in which @var{element} is the @struct{list_elem} member
|
|
that links the elements.
|
|
|
|
Example: @code{dumplist &all_list thread allelem} prints all elements of
|
|
@struct{thread} that are linked in @code{struct list all_list} using the
|
|
@code{struct list_elem allelem} which is part of @struct{thread}.
|
|
@end deffn
|
|
|
|
@deffn {GDB Macro} dumphash hash type element
|
|
Similar to @code{dumplist}. Prints the elements of @var{hash}, which must be passed by reference and should be a @code{struct hash}
|
|
that contains elements of the given @var{type} (without the word
|
|
@code{struct}) in which @var{element} is the @struct{hash_elem} member
|
|
that links the elements.
|
|
@end deffn
|
|
|
|
@deffn {GDB Macro} btthread thread
|
|
Shows the backtrace of @var{thread}, which is a pointer to the
|
|
@struct{thread} of the thread whose backtrace it should show. For the
|
|
current thread, this is identical to the @code{bt} (backtrace) command.
|
|
It also works for any thread suspended in @func{schedule},
|
|
provided you know where its kernel stack page is located.
|
|
@end deffn
|
|
|
|
@deffn {GDB Macro} btthreadlist list element
|
|
Shows the backtraces of all threads in @var{list}, which must be passed by reference and is the @struct{list} in which the threads are kept.
|
|
Specify @var{element} as the @struct{list_elem} field used inside @struct{thread} to link the threads together.
|
|
|
|
Example: @code{btthreadlist &all_list allelem} shows the backtraces of
|
|
all threads contained in @code{struct list all_list}, linked together by
|
|
@code{allelem}. This command is useful to determine where your threads
|
|
are stuck when a deadlock occurs. Please see the example scenario below.
|
|
@end deffn
|
|
|
|
@deffn {GDB Macro} btthreadall
|
|
Short-hand for @code{btthreadlist all_list allelem}.
|
|
@end deffn
|
|
|
|
@deffn {GDB Macro} hook-stop
|
|
GDB invokes this macro every time the simulation stops, which QEMU will
|
|
do for every processor exception, among other reasons. If the
|
|
simulation stops due to a page fault, @code{hook-stop} will print a
|
|
message that says and explains further whether the page fault occurred
|
|
in the kernel or in user code.
|
|
|
|
If the exception occurred from user code, @code{hook-stop} will say:
|
|
@example
|
|
pintos-debug: a page fault exception occurred in user mode
|
|
pintos-debug: hit 'c' to continue, or 's' to step to intr_handler
|
|
@end example
|
|
|
|
In Task 2, a page fault in a user process leads to the termination of
|
|
the process. You should expect those page faults to occur in the
|
|
robustness tests where we test that your kernel properly terminates
|
|
processes that try to access invalid addresses. To debug those, set a
|
|
break point in @func{page_fault} in @file{exception.c}, which you will
|
|
need to modify accordingly.
|
|
|
|
In Task 3, a page fault in a user process no longer automatically
|
|
leads to the termination of a process. Instead, it may require reading in
|
|
data for the page the process was trying to access, either
|
|
because it was swapped out or because this is the first time it's
|
|
accessed. In either case, you will reach @func{page_fault} and need to
|
|
take the appropriate action there.
|
|
|
|
If the page fault did not occur in user mode while executing a user
|
|
process, then it occurred in kernel mode while executing kernel code.
|
|
In this case, @code{hook-stop} will print this message:
|
|
@example
|
|
pintos-debug: a page fault occurred in kernel mode
|
|
@end example
|
|
|
|
Before Task 3, a page fault exception in kernel code is always a bug
|
|
in your kernel, because your kernel should never crash. Starting with
|
|
Task 3, the situation will change if you use the @func{get_user} and
|
|
@func{put_user} strategy to verify user memory accesses
|
|
(@pxref{Accessing User Memory}).
|
|
|
|
@c ----
|
|
@c Unfortunately, this does not work with Bochs's gdb stub.
|
|
@c ----
|
|
@c If you don't want GDB to stop for page faults, then issue the command
|
|
@c @code{handle SIGSEGV nostop}. GDB will still print a message for
|
|
@c every page fault, but it will not come back to a command prompt.
|
|
@end deffn
|
|
|
|
@node Example GDB Session
|
|
@subsection Example GDB Session
|
|
|
|
This section narrates a sample GDB session, provided by Godmar Back
|
|
(modified by Mark Rutland and Feroz Abdul Salam, and updated by Fidelis Perkonigg).
|
|
This example illustrates how one might debug a Task 1 solution in
|
|
which occasionally a thread that calls @func{timer_sleep} is not woken
|
|
up. With this bug, tests such as @code{mlfqs_load_1} get stuck.
|
|
|
|
Program output is shown in normal type, user input in @strong{strong}
|
|
type.
|
|
|
|
@noindent First, we start PintOS using the QEMU emulator:
|
|
|
|
@smallexample
|
|
@code{$ pintos -v --qemu --gdb -- -q -mlfqs run mlfqs-load-1}
|
|
|
|
qemu-system-i386 -drive file=/tmp/GKWoGG8QE6.dsk,index=0,media=disk,format=raw -m 4 -net none -nographic -s -S
|
|
@end smallexample
|
|
|
|
@noindent This starts QEMU but pauses the execution of PintOS immediately to
|
|
allow us to attach GDB to PintOS. We open a second window in the same directory
|
|
on the same machine and start GDB:
|
|
|
|
@smallexample
|
|
$ @strong{pintos-gdb kernel.o}
|
|
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
|
|
Copyright (C) 2020 Free Software Foundation, Inc.
|
|
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
|
|
This is free software: you are free to change and redistribute it.
|
|
There is NO WARRANTY, to the extent permitted by law.
|
|
Type "show copying" and "show warranty" for details.
|
|
This GDB was configured as "x86_64-linux-gnu".
|
|
...
|
|
For help, type "help".
|
|
Type "apropos word" to search for commands related to "word"...
|
|
Reading symbols from kernel.o...done.
|
|
The target architecture is assumed to be i386
|
|
@end smallexample
|
|
|
|
@noindent Then, we tell GDB to attach to the waiting PintOS emulator:
|
|
|
|
@smallexample
|
|
(gdb) @strong{debugpintos}
|
|
0x0000fff0 in ?? ()
|
|
@end smallexample
|
|
|
|
@noindent Now we instruct GDB to continue the execution of PintOS by using the
|
|
command @code{continue} (or the abbreviation @code{c}):
|
|
|
|
@smallexample
|
|
(gdb) @strong{c}
|
|
Continuing.
|
|
@end smallexample
|
|
|
|
@noindent Now PintOS will continue and output:
|
|
|
|
@smallexample
|
|
PiLo hda1
|
|
Loading..........
|
|
Kernel command line: -q -mlfqs run mlfqs-load-1
|
|
PintOS booting with 3,968 kB RAM...
|
|
367 pages available in kernel pool.
|
|
367 pages available in user pool.
|
|
Calibrating timer... 838,041,600 loops/s.
|
|
Boot complete.
|
|
Executing 'mlfqs-load-1':
|
|
(mlfqs-load-1) begin
|
|
(mlfqs-load-1) spinning for up to 45 seconds, please wait...
|
|
@end smallexample
|
|
|
|
@noindent
|
|
... until it gets stuck due to the bug in the code.
|
|
We hit @key{Ctrl+C} in the debugger window to stop PintOS.
|
|
|
|
@smallexample
|
|
Program received signal SIGINT, Interrupt.
|
|
intr_get_level () at ../../threads/interrupt.c:66
|
|
(gdb)
|
|
@end smallexample
|
|
|
|
@noindent
|
|
The thread that was running when we stopped PintOS happened to be the main
|
|
thread. If we run @code{backtrace}, it shows this backtrace:
|
|
|
|
@smallexample
|
|
(gdb) @strong{bt}
|
|
#0 intr_get_level () at ../../threads/interrupt.c:66
|
|
#1 0xc0021103 in intr_enable () at ../../threads/interrupt.c:90
|
|
#2 0xc0021150 in intr_set_level (level=INTR_ON) at ../../threads/interrupt.c:83
|
|
#3 0xc0023422 in timer_ticks () at ../../devices/timer.c:75
|
|
#4 0xc002343d in timer_elapsed (then=23) at ../../devices/timer.c:84
|
|
#5 0xc002aabf in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:33
|
|
#6 0xc002b610 in run_test (name=0xc0007d4c "mlfqs-load-1") at ../../tests/devices/tests.c:72
|
|
#7 0xc00201a2 in run_task (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:290
|
|
#8 0xc0020687 in run_actions (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:340
|
|
#9 main () at ../../threads/init.c:133
|
|
@end smallexample
|
|
|
|
@noindent
|
|
Not terribly useful. What we really like to know is what's up with the
|
|
other thread (or threads). Since we keep all threads in a linked list
|
|
called @code{all_list}, linked together by a @struct{list_elem} member
|
|
named @code{allelem}, we can use the @code{btthreadlist} macro.
|
|
@code{btthreadlist} iterates through the list of threads and prints the
|
|
backtrace for each thread:
|
|
|
|
@smallexample
|
|
(gdb) @strong{btthreadlist &all_list allelem}
|
|
pintos-debug: dumping backtrace of thread 'main' @@0xc000e000
|
|
#0 intr_get_level () at ../../threads/interrupt.c:66
|
|
#1 0xc0021103 in intr_enable () at ../../threads/interrupt.c:90
|
|
#2 0xc0021150 in intr_set_level (level=INTR_ON) at ../../threads/interrupt.c:83
|
|
#3 0xc0023422 in timer_ticks () at ../../devices/timer.c:75
|
|
#4 0xc002343d in timer_elapsed (then=23) at ../../devices/timer.c:84
|
|
#5 0xc002aabf in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:33
|
|
#6 0xc002b610 in run_test (name=0xc0007d4c "mlfqs-load-1") at ../../tests/devices/tests.c:72
|
|
#7 0xc00201a2 in run_task (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:290
|
|
#8 0xc0020687 in run_actions (argv=0xc00349e8 <argv+8>) at ../../threads/init.c:340
|
|
#9 main () at ../../threads/init.c:133
|
|
|
|
pintos-debug: dumping backtrace of thread 'idle' @@0xc0103000
|
|
#0 0xc0020dc5 in schedule () at ../../threads/thread.c:579
|
|
#1 0xc0020e01 in thread_block () at ../../threads/thread.c:235
|
|
#2 0xc0020e6d in idle (idle_started_=0xc000ef7c) at ../../threads/thread.c:414
|
|
#3 0xc0020efc in kernel_thread (function=0xc0020e45 <idle>, aux=0xc000ef7c)
|
|
at ../../threads/thread.c:439
|
|
#4 0x00000000 in ?? ()
|
|
@end smallexample
|
|
|
|
@noindent
|
|
In this case, there are only two threads, the main thread and the idle
|
|
thread. The kernel stack pages (to which the @struct{thread} points)
|
|
are at @t{0xc000e000} and @t{0xc0103000}, respectively. The main thread
|
|
was in @func{timer_elapsed}, called from @code{test_mlfqs_load_1} when stopped.
|
|
|
|
Knowing where threads are can be tremendously useful, for instance
|
|
when diagnosing deadlocks or unexplained hangs.
|
|
|
|
@deffn {GDB Macro} loadusersymbols
|
|
|
|
You can also use GDB to debug a user program running under PintOS.
|
|
To do that, use the @code{loadusersymbols} macro to load the program's
|
|
symbol table:
|
|
@example
|
|
loadusersymbols @var{program}
|
|
@end example
|
|
@noindent
|
|
where @var{program} is the name of the program's executable (in the host
|
|
file system, not in the PintOS file system). For example, you may issue:
|
|
@smallexample
|
|
(gdb) @strong{loadusersymbols tests/userprog/exec-multiple}
|
|
add symbol table from file "tests/userprog/exec-multiple" at
|
|
.text_addr = 0x80480a0
|
|
(gdb)
|
|
@end smallexample
|
|
|
|
After this, you should be
|
|
able to debug the user program the same way you would the kernel, by
|
|
placing breakpoints, inspecting data, etc. Your actions apply to every
|
|
user program running in PintOS, not just to the one you want to debug,
|
|
so be careful in interpreting the results: GDB does not know
|
|
which process is currently active (because that is an abstraction
|
|
the PintOS kernel creates). Also, a name that appears in
|
|
both the kernel and the user program will actually refer to the kernel
|
|
name. (The latter problem can be avoided by giving the user executable
|
|
name on the GDB command line, instead of @file{kernel.o}, and then using
|
|
@code{loadusersymbols} to load @file{kernel.o}.)
|
|
@code{loadusersymbols} is implemented via GDB's @code{add-symbol-file}
|
|
command.
|
|
|
|
@end deffn
|
|
|
|
@node GDB FAQ
|
|
@subsection FAQ
|
|
|
|
@table @asis
|
|
@item GDB can't connect to QEMU (Error: localhost:1234: Connection refused)
|
|
|
|
If the @command{target remote} command fails, then make sure that both
|
|
GDB and @command{pintos} are running on the same machine by
|
|
running @command{hostname} in each terminal. If the names printed
|
|
differ, then you need to open a new terminal for GDB on the
|
|
machine running @command{pintos}.
|
|
|
|
@item GDB doesn't recognize any of the macros.
|
|
|
|
If you start GDB with @command{pintos-gdb}, it should load the PintOS
|
|
macros automatically. If you start GDB some other way, then you must
|
|
issue the command @code{source @var{pintosdir}/src/misc/gdb-macros},
|
|
where @var{pintosdir} is the root of your PintOS directory, before you
|
|
can use them.
|
|
|
|
@item Can I debug PintOS with DDD?
|
|
|
|
Yes, you can. DDD invokes GDB as a subprocess, so you'll need to tell
|
|
it to invokes @command{pintos-gdb} instead:
|
|
@example
|
|
ddd --gdb --debugger pintos-gdb
|
|
@end example
|
|
|
|
@item Can I use GDB inside Emacs?
|
|
|
|
Yes, you can. Emacs has special support for running GDB as a
|
|
subprocess. Type @kbd{M-x gdb} and enter your @command{pintos-gdb}
|
|
command at the prompt. The Emacs manual has information on how to use
|
|
its debugging features in a section titled ``Debuggers.''
|
|
|
|
@end table
|
|
|
|
@node Triple Faults
|
|
@section Triple Faults
|
|
|
|
When a CPU exception handler, such as a page fault handler, cannot be
|
|
invoked because it is missing or defective, the CPU will try to invoke
|
|
the ``double fault'' handler. If the double fault handler is itself
|
|
missing or defective, that's called a ``triple fault.'' A triple fault
|
|
causes an immediate CPU reset.
|
|
|
|
Thus, if you get yourself into a situation where the machine reboots in
|
|
a loop, that's probably a ``triple fault.'' In a triple fault
|
|
situation, you might not be able to use @func{printf} for debugging,
|
|
because the reboots might be happening even before everything needed for
|
|
@func{printf} is initialized.
|
|
|
|
Currently, the only option is ``debugging by infinite loop.''
|
|
Pick a place in the PintOS code, insert the infinite loop
|
|
@code{for (;;);} there, and recompile and run. There are two likely
|
|
possibilities:
|
|
|
|
@itemize @bullet
|
|
@item
|
|
The machine hangs without rebooting. If this happens, you know that
|
|
the infinite loop is running. That means that whatever caused the
|
|
reboot must be @emph{after} the place you inserted the infinite loop.
|
|
Now move the infinite loop later in the code sequence.
|
|
|
|
@item
|
|
The machine reboots in a loop. If this happens, you know that the
|
|
machine didn't make it to the infinite loop. Thus, whatever caused the
|
|
reboot must be @emph{before} the place you inserted the infinite loop.
|
|
Now move the infinite loop earlier in the code sequence.
|
|
@end itemize
|
|
|
|
If you move around the infinite loop in a ``binary search'' fashion, you
|
|
can use this technique to pin down the exact spot that everything goes
|
|
wrong. It should only take a few minutes at most.
|
|
|
|
@node Debugging Tips
|
|
@section Tips
|
|
|
|
The page allocator in @file{threads/palloc.c} and the block allocator in
|
|
@file{threads/malloc.c} clear all the bytes in memory to
|
|
@t{0xcc} at time of free. Thus, if you see an attempt to
|
|
dereference a pointer like @t{0xcccccccc}, or some other reference to
|
|
@t{0xcc}, there's a good chance you're trying to reuse a page that's
|
|
already been freed. Also, byte @t{0xcc} is the CPU opcode for ``invoke
|
|
interrupt 3,'' so if you see an error like @code{Interrupt 0x03 (#BP
|
|
Breakpoint Exception)}, then PintOS tried to execute code in a freed page or
|
|
block.
|