Linux Device Drivers (3rd Edition) [Electronic resources]

Jonathan Corbet, Greg Kroah-Hartman, Alessandro Rubini

4.6. Debuggers and Related Tools


The last resort in debugging modules is using a debugger to step
through the code, watching the values of variables and machine
registers. This approach is time-consuming and should be avoided
whenever possible. Nonetheless, the fine-grained perspective on the
code that is achieved through a debugger is sometimes invaluable.

Using an interactive debugger on the kernel is a challenge. The
kernel runs in its own address space on behalf of all the processes
on the system. As a result, a number of common capabilities provided
by user-space debuggers, such as breakpoints and single-stepping, are
harder to come by in the kernel. In this section we look at several
ways of debugging the kernel; each of them has advantages and
disadvantages.


4.6.1. Using gdb


gdb can be quite useful for looking at the
system internals. Proficient use of the debugger at this level
requires some confidence with
gdb

commands, some understanding of assembly code for the target
platform, and the ability to match source code and optimized
assembly.

The debugger must be invoked as though the
kernel were an application. In addition to specifying the filename
for the ELF kernel image, you need to provide the name of a core file
on the command line. For a running kernel, that core file is the
kernel core image, /proc/kcore. A typical
invocation of gdb looks like the following:

gdb /usr/src/linux/vmlinux /proc/kcore

The first argument is the name of the uncompressed ELF kernel
executable, not the zImage or
bzImage or anything built specifically for the
boot environment.

The second argument on the gdb command line is
the name of the core file. Like any file in
/proc, /proc/kcore is
generated when it is read. When the read system
call executes in the /proc filesystem, it maps
to a data-generation function rather than a data-retrieval one;
we've already exploited this feature in Section 4.3.1.
kcore is used to represent the kernel
"executable" in the format of a
core file; it is a huge file, because it represents the whole kernel
address space, which corresponds to all physical memory. From within
gdb, you can look at kernel variables by issuing
the standard gdb commands. For example,
p jiffies prints the number of clock ticks from
system boot to the current time.
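Since the tick rate HZ varies with kernel configuration, converting a
jiffies reading to wall-clock time is a simple division. The sketch
below is only an illustration; the HZ value of 1000 is an assumption,
not something the text specifies:

```python
# Convert a jiffies reading (as printed by "p jiffies") into seconds
# of uptime. HZ is the kernel's timer tick rate; it depends on the
# kernel configuration, so the 1000 used here is an assumption made
# purely for illustration.
HZ = 1000

def jiffies_to_seconds(jiffies):
    """Seconds elapsed since boot for a given tick count."""
    return jiffies / HZ
```

With HZ at 1000, a jiffies value of 3,600,000 would correspond to one
hour of uptime.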

When you print
data from gdb, the kernel is still running, and
the various data items have different values at different times;
gdb, however, optimizes access to the core file
by caching data that has already been read. If you try to look at the
jiffies variable once again,
you'll get the same answer as before. Caching values
to avoid extra disk access is a correct behavior for conventional
core files but is inconvenient when a
"dynamic" core image is used. The
solution is to issue the command core-file
/proc/kcore
whenever you want to flush the
gdb cache; the debugger gets ready to use a new
core file and discards any old information. You
won't, however, always need to issue
core-file when reading a new datum;
gdb reads the core in chunks of a few kilobytes
and caches only chunks it has already referenced.

Numerous capabilities normally provided by gdb
are not available when you are working with the kernel. For example,
gdb is not able to modify kernel data; it
expects to be running a program to be debugged under its own control
before playing with its memory image. It is also not possible to set
breakpoints or watchpoints, or to single-step through kernel
functions.

Note that, in order to have symbol information available for
gdb, you must compile your kernel with the
CONFIG_DEBUG_INFO option set. The result is a far
larger kernel image on disk, but, without that information, digging
through kernel variables is almost impossible.
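In the 2.6 configuration system, this option lives under the
"Kernel hacking" menu; the resulting line in the kernel
configuration file is simply:

```
CONFIG_DEBUG_INFO=y
```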

With the debugging information available, you can learn a lot about
what is going on inside the kernel. gdb happily
prints out structures, follows pointers, etc. One thing that is
harder, however, is examining modules. Since modules are not part of
the vmlinux image passed to
gdb, the debugger knows nothing about them.
Fortunately, as of kernel 2.6.7, it is possible to teach
gdb what it needs to know to examine loadable
modules.

Linux loadable modules are ELF-format executable images; as such,
they have been divided up into numerous sections. A typical module
can contain a dozen or more sections, but only three are normally
relevant in a debugging session:

.text


This section contains the executable code for the module. The
debugger must know where this section is to be able to give
tracebacks or set breakpoints. (Neither of these operations is
relevant when running the debugger on
/proc/kcore, but they can be useful when working
with kgdb, described below).


.bss

.data


These two sections hold the module's variables. Any
variable that is not initialized at compile time ends up in
.bss, while those that are initialized go into
.data.



Making gdb work with loadable modules requires
informing the debugger about where a given module's
sections have been loaded. That information is available in sysfs,
under /sys/module. For example, after loading
the scull module, the directory
/sys/module/scull/sections contains files with
names such as .text; the content of each file is
the base address for that section.

We are now in a position to issue a gdb command
telling it about our module. The command we need is
add-symbol-file; this command takes as parameters
the name of the module object file, the .text
base address, and a series of optional parameters describing where
any other sections of interest have been put. After digging through
the module section data in sysfs, we can construct a command such as:

(gdb) add-symbol-file .../scull.ko 0xd0832000 \
-s .bss 0xd0837100 \
-s .data 0xd0836be0

We have included a small script in the sample source
(gdbline) that can create this command for a
given module.
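As a rough illustration of the same idea as that script, the
hypothetical helper below (the function name and error handling are
our own assumptions, not the book's gdbline) reads
the section files a loaded module exposes under sysfs and emits the
corresponding add-symbol-file command:

```python
# Build a gdb add-symbol-file command for a loaded module by reading
# the section base addresses it exports under /sys/module/<name>/sections.
# This is a sketch of the idea behind the book's gdbline script, not a
# copy of it; the function name is hypothetical.
import os

def add_symbol_file_command(module_name, object_file,
                            sysfs_root="/sys/module"):
    """Return the gdb command that maps a module's sections."""
    sections_dir = os.path.join(sysfs_root, module_name, "sections")
    sections = {}
    for name in os.listdir(sections_dir):
        with open(os.path.join(sections_dir, name)) as f:
            sections[name] = f.read().strip()
    # .text supplies the mandatory base address; every other section
    # is passed to gdb with a -s option.
    cmd = "add-symbol-file %s %s" % (object_file, sections.pop(".text"))
    for name, addr in sorted(sections.items()):
        cmd += " \\\n    -s %s %s" % (name, addr)
    return cmd
```

Calling add_symbol_file_command("scull", "scull.ko")
on a system with the module loaded would produce a command of the same
shape as the one shown above.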

We can now use gdb to examine variables in our
loadable module. Here is a quick example taken from a
scull debugging session:

(gdb) add-symbol-file scull.ko 0xd0832000 \
-s .bss 0xd0837100 \
-s .data 0xd0836be0
add symbol table from file "scull.ko" at
.text_addr = 0xd0832000
.bss_addr = 0xd0837100
.data_addr = 0xd0836be0
(y or n) y
Reading symbols from scull.ko...done.
(gdb) p scull_devices[0]
$1 = {data = 0xcfd66c50,
quantum = 4000,
qset = 1000,
size = 20881,
access_key = 0,
...}

Here we see that the first scull device
currently holds 20,881 bytes. If we wanted, we could follow the
data chain, or look at anything else of interest
in the module.

One other useful trick worth knowing about is this:

(gdb) list *(address)

Here, fill in a hex address for address; the
output is the file and line number for the code corresponding to that
address, followed by a few lines of the surrounding source. This
technique may be useful, for example, to find out where a function
pointer really points.

We still cannot perform typical debugging tasks like setting
breakpoints or modifying data; to perform those operations, we need
to use a tool like kdb (described next) or
kgdb (which we get to shortly).


4.6.2. The kdb Kernel Debugger


Many readers may be wondering why the kernel does not have any more
advanced debugging features built into it. The answer, quite simply,
is that Linus does not believe in interactive debuggers. He fears
that they lead to poor fixes, those which patch up symptoms rather
than addressing the real cause of problems. Thus, no built-in
debuggers.

Other kernel developers, however, see an occasional use for
interactive debugging tools. One such tool is the
kdb built-in kernel debugger, available as a
nonofficial patch from oss.sgi.com. To use
kdb, you must obtain the patch (be sure to get a
version that matches your kernel version), apply it, and rebuild and
reinstall the kernel. Note that, as of this writing,
kdb works only on IA-32 (x86) systems (though a
version for the IA-64 existed for a while in the mainline kernel
source before being removed).

Once you are running a kdb-enabled kernel, there
are a couple of ways to enter the debugger. Pressing the Pause (or
Break) key on the console starts up the debugger.
kdb also starts up when a kernel oops happens or
when a breakpoint is hit. In any case, you see a message that looks
something like this:

Entering kdb (0xc0347b80) on processor 0 due to Keyboard Entry
[0]kdb>

Note that just about everything the kernel does stops when
kdb is running. Nothing else should be running
on a system where you invoke kdb; in particular,
you should not have networking turned on (unless, of course, you
are debugging a network driver). It is generally a good idea to boot
the system in single-user mode if you will be using
kdb.

As an example, consider a quick scull debugging
session. Assuming that the driver is already loaded, we can tell
kdb to set a breakpoint in
scull_read as follows:

[0]kdb> bp scull_read
Instruction(i) BP #0 at 0xd087c5dc (scull_read)
is enabled globally adjust 1
[0]kdb> go

The bp command tells kdb to
stop the next time the kernel enters scull_read.
You then type go to continue execution. After
putting something into one of the scull devices,
we can attempt to read it by running cat under a
shell on another terminal, yielding the following:

Instruction(i) breakpoint #0 at 0xd087c5dc (adjusted)
0xd087c5dc scull_read: int3
Entering kdb (current=0xcf09f890, pid 1575) on processor 0 due to
Breakpoint @ 0xd087c5dc
[0]kdb>

We are now positioned at the beginning of
scull_read. To see how we got there, we can get
a stack trace:

[0]kdb> bt
ESP EIP Function (args)
0xcdbddf74 0xd087c5dc [scull]scull_read
0xcdbddf78 0xc0150718 vfs_read+0xb8
0xcdbddfa4 0xc01509c2 sys_read+0x42
0xcdbddfc4 0xc0103fcf syscall_call+0x7
[0]kdb>

kdb attempts to print out the arguments to every
function in the call trace. It gets confused, however, by
optimization tricks used by the compiler. Therefore, it fails to
print the arguments to scull_read.

Time to look at some data. The mds command
manipulates data; we can query the value of the
scull_devices pointer with a command such as:

[0]kdb> mds scull_devices 1
0xd0880de8 cf36ac00 ....

Here we asked for one (4-byte) word of data starting at the location
of scull_devices; the answer tells us that our
device array is at the address 0xd0880de8; the
first device structure itself is at 0xcf36ac00. To
look at that device structure, we need to use that address:

[0]kdb> mds cf36ac00
0xcf36ac00 ce137dbc ....
0xcf36ac04 00000fa0 ....
0xcf36ac08 000003e8 ....
0xcf36ac0c 0000009b ....
0xcf36ac10 00000000 ....
0xcf36ac14 00000001 ....
0xcf36ac18 00000000 ....
0xcf36ac1c 00000001 ....

The eight lines here correspond to the beginning part of the
scull_dev structure. Therefore, we see that the
memory for the first device is allocated at
0xce137dbc, the quantum is 4000 (hex
fa0), the quantum set size is 1000 (hex
3e8), and there are currently 155 (hex
9b) bytes stored in the device.
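The hex arithmetic above can be spelled out mechanically. The small
sketch below decodes the first four words of the mds
dump; the field names and their order are taken from the
scull_devices[0] printout earlier in this section:

```python
# Decode the leading words of the kdb "mds" dump into the first fields
# of struct scull_dev: the data pointer, the quantum, the quantum-set
# size, and the byte count. The field order matches the gdb print of
# scull_devices[0] shown earlier in this section.
words = [0xce137dbc, 0x00000fa0, 0x000003e8, 0x0000009b]
fields = ["data", "quantum", "qset", "size"]
device = dict(zip(fields, words))

assert device["quantum"] == 4000   # hex fa0
assert device["qset"] == 1000      # hex 3e8
assert device["size"] == 155       # hex 9b
```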

kdb can change data as well. Suppose we wanted
to trim some of the data from the device:

[0]kdb> mm cf36ac0c 0x50
0xcf36ac0c = 0x50

A subsequent cat on the device will now return
less data than before.

kdb has a number of other capabilities,
including single-stepping (by instructions, not lines of C source
code), setting breakpoints on data access, disassembling code,
stepping through linked lists, accessing register data, and more.
After you have applied the kdb patch, a full set
of manual pages can be found in the
Documentation/kdb directory in your kernel
source tree.


4.6.3. The kgdb Patches


The two interactive debugging
approaches we have looked at so far (using gdb
on /proc/kcore and kdb)
both fall short of the sort of environment that user-space
application developers have become used to. Wouldn't
it be nice if there were a true debugger for the kernel that
supported features like changing variables, breakpoints, etc.?

As it turns out, such a solution does exist. There are, as of this
writing, two separate patches in circulation that allow
gdb, with full capabilities, to be run against
the kernel. Confusingly, both of these patches are called
kgdb. They work by separating the system running
the test kernel from the system running the debugger; the two are
typically connected via a serial cable. Therefore, the developer can
run gdb on his or her stable desktop system,
while operating on a kernel running on a sacrificial test box.
Setting up gdb in this mode takes a little time
at the outset, but that investment can pay off quickly when a
difficult bug shows up.

These patches are in a strong state of flux, and may even be merged
at some point, so we avoid saying much about them beyond where they
are and their basic features. Interested readers are encouraged to
look and see the current state of affairs.

The first kgdb patch is currently found in the
-mm kernel tree, the staging area for patches
on their way into the 2.6 mainline. This version of the patch
supports the x86, SuperH, ia64, x86_64, SPARC, and 32-bit PPC
architectures. In addition to the usual mode of operation over a
serial port, this version of kgdb can also
communicate over a local-area network. It is simply a matter of
enabling the Ethernet mode and booting with the
kgdboe parameter set to indicate the IP address
from which debugging commands can originate. The documentation under
Documentation/i386/kgdb describes how to set
things up.[4]

[4] It does neglect to point out that you
should have your network adapter driver built into the kernel,
however, or the debugger fails to find it at boot time and will shut
itself down.
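As a hedged illustration only (the authoritative syntax is in
Documentation/i386/kgdb, it may differ between
patch versions, and the addresses here are made up), a boot command
line enabling the Ethernet mode might look something like:

```
kgdboe=@192.168.0.2/,@192.168.0.1/
```

Here the first address would be the test machine's own IP and the
second the debugging host allowed to originate commands.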


As an alternative, you can use the kgdb patch
found on http://kgdb.sf.net/.
This version of the debugger does not support the network
communication mode (though that is said to be under development), but
it does have some built-in support for working with loadable modules.
It supports the x86, x86_64, PowerPC, and S/390 architectures.


4.6.4. The User-Mode Linux Port


User-Mode Linux (UML) is an interesting
concept. It is structured as a separate port of the Linux kernel with
its own arch/um subdirectory. It does not run on
a new type of hardware, however; instead, it runs on a virtual
machine implemented on the Linux system call interface. Thus, UML
allows the Linux kernel to run as a separate, user-mode process on a
Linux system.

Having a copy of the kernel running as a user-mode process brings a
number of advantages. Because it is running on a constrained, virtual
processor, a buggy kernel cannot damage the
"real" system. Different hardware
and software configurations can be tried easily on the same box. And,
perhaps most significantly for kernel developers, the user-mode
kernel can be easily manipulated with gdb or
another debugger. After all, it is just another process. UML clearly
has the potential to accelerate kernel development.

However, UML has a big shortcoming from the point of view of driver
writers: the user-mode kernel has no access to the host
system's hardware. Thus, while it can be useful for
debugging most of the sample drivers in this book, UML is not yet
useful for debugging drivers that have to deal with real hardware.

See http://user-mode-linux.sf.net/ for more
information on UML.


4.6.5. The Linux Trace Toolkit



The
Linux Trace Toolkit (LTT) is a kernel patch and a set of related
utilities that allow the tracing of events in the kernel. The trace
includes timing information and can create a reasonably complete
picture of what happened over a given period of time. Thus, it can be
used not only for debugging but also for tracking down performance
problems.

LTT, along with extensive documentation, can be found at
http://www.opersys.com/LTT.


4.6.6. Dynamic Probes


Dynamic Probes (or DProbes) is a
debugging tool released (under the GPL) by IBM for Linux on the IA-32
architecture. It allows the placement of a
"probe" at almost any place in the
system, in both user and kernel space. The probe consists of some
code (written in a specialized, stack-oriented language) that is
executed when control hits the given point. This code can report
information back to user space, change registers, or do a number of
other things. The useful feature of DProbes is that once the
capability has been built into the kernel, probes can be inserted
anywhere within a running system without kernel builds or reboots.
DProbes can also work with the LTT to insert new tracing events at
arbitrary locations.

The DProbes tool can be downloaded from
IBM's open source site:
http://oss.software.ibm.com.

