Recipe 9.15 Tracing Processes
9.15.1 Problem
You want to know what an unfamiliar
process is doing.
9.15.2 Solution
To attach to a running
process and trace system calls:
# strace -p pid
To trace network system calls:
# strace -e trace=network,read,write ...
9.15.3 Discussion
The strace
command lets you observe a given process in detail, printing its
system calls as they occur. It expands all arguments, return values,
and errors (if any) for the system calls, showing all information
passed between the process and the kernel. (It can also trace
signals.) This provides a very complete picture of what the process
is doing.

Use the strace -p option to attach to and trace a
process, identified by its process ID, say, 12345:
# strace -p 12345
To detach and stop tracing, just kill strace (pressing Ctrl-C in its
terminal is enough). Other than a performance penalty, which can be
noticeable for programs that make many system calls, strace has
no effect on the traced process.

Tracing all system calls for a process can produce overwhelming
output, so you can select sets of interesting system calls to print.
For monitoring network activity, the -e
trace=network option is appropriate. Network sockets often
use the generic read and
write system calls as well, so trace those too:
$ strace -e trace=network,read,write finger katie@server.example.com
...
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 4
connect(4, {sin_family=AF_INET,
sin_port=htons(79),
sin_addr=inet_addr("10.12.104.222")}, 16) = 0
write(4, "katie", 5) = 5
write(4, "\r\n", 2) = 2
read(4, "Login: katie          \t\t\tName: K"..., 4096) = 244
read(4, "", 4096) = 0
...
The trace shows the creation of a TCP socket, followed by a
connection to port 79 for the finger service at the IP address for
the server. The program then follows the finger protocol by writing
the username and reading the response.

By default, strace prints only 32 characters of
string arguments, which can lead to the truncated output shown. For a
more complete trace, use the -s option to specify
a larger maximum data size. Similarly, strace
abbreviates some large structure arguments, such as the environment
for new processes: supply the -v option to print
this information in full.

You can trace most network activity effectively by following file
descriptors: in the previous example, the value is 4 (returned by the
socket-creation call, and used as the first argument for the
subsequent system calls). Then match these values to the file
descriptors displayed in the FD column by lsof. [Recipe 9.14]

When you identify an interesting file descriptor, you can print the
transferred data in both hexadecimal and ASCII using the options
-e [read|write]=fd:
$ strace -e trace=read -e read=4 finger katie@server.example.com
...
read(4, "Login: katie          \t\t\tName: K"..., 4096) = 244
| 00000  4c 6f 67 69 6e 3a 20 6b  61 74 69 65 20 20 20 20  Login: k atie     |
| 00010  20 20 20 20 20 20 09 09  09 4e 61 6d 65 3a 20 4b        .. .Name: K |
...
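The descriptor-following technique described above can also be applied
to a saved trace. The following is a minimal sketch, assuming the trace
was written to a file with the strace -o option, without -f or any
timestamp options, so that each line starts with the system call name.
The filename trace.log and the function names socket_fds and
calls_on_fd are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch: follow one file descriptor through a saved strace log.
# Assumes the log was produced by something like:
#   strace -o trace.log -e trace=network,read,write finger katie@server.example.com
# (trace.log is a hypothetical filename).

# Print the descriptors returned by successful socket() calls.
# $1 = log file
socket_fds() {
    # Matches lines such as: socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 4
    # Failed calls (= -1 ...) do not match and are skipped.
    sed -n 's/^socket(.*) = \([0-9][0-9]*\)$/\1/p' "$1"
}

# Print only the calls whose first argument is the given descriptor.
# $1 = descriptor number, $2 = log file
calls_on_fd() {
    grep -E "^[a-z_0-9]+\($1[,)]" "$2"
}
```

For example, socket_fds trace.log lists candidate descriptors, and
calls_on_fd 4 trace.log then shows only the connect, read, and write
calls made on descriptor 4. For a process that is still running, a
command along the lines of lsof -a -p 12345 -d 4 should narrow the lsof
output to that same descriptor.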
strace watches data transfers much like network
packet sniffers do, but it also can observe input/output involving
local files and other system activities.

If you trace programs for long periods, ask strace
to annotate its output with timestamps. The -t
option records absolute times (repeat the option for more detail),
the -r option records relative times between
system calls, and -T records time spent in the
kernel within system calls. Finally, add the strace
-f option to follow child processes.[6]

[6] To follow child processes created by vfork, include
the -F option as well, but this requires support
from the kernel that is not widely available at press time. Also,
strace does not currently work well with
multithreaded processes: be sure you have the latest version, and a
kernel Version 2.4 or later, before attempting thread tracing.

With -f, each line of the trace is prefixed with the process ID of
the process that made the call.
Alternatively, you can untangle the system calls by directing the
trace for each child process to a separate file, using the options:
$ strace -f -ff -o filename ...
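Once each child has its own file, the timing information from -T can
be totaled per process. The following is a rough sketch, assuming the
trace was collected with something like strace -f -ff -T -o trace ...,
so that the files are named trace.PID and each completed call ends with
a duration in angle brackets. The prefix trace and the function names
syscall_times and summarize_children are hypothetical.

```shell
#!/bin/sh
# Sketch: total the per-call durations that strace -T appends to each
# line, e.g.:  read(4, "..."..., 4096) = 244 <0.000123>

# Print "name count total_seconds" for each system call in one file.
# $1 = trace file
syscall_times() {
    awk '
        # Only lines ending in <seconds> are completed, timed calls;
        # signal and exit lines are skipped.
        match($0, /<[0-9.]+>$/) {
            name = $0
            sub(/\(.*/, "", name)          # keep text before the first "("
            secs = substr($0, RSTART + 1, RLENGTH - 2)
            total[name] += secs
            count[name]++
        }
        END {
            for (n in total)
                printf "%s %d %.6f\n", n, count[n], total[n]
        }
    ' "$1" | sort
}

# Summarize every per-process file written by -ff -o PREFIX.
# $1 = the prefix given to -o
summarize_children() {
    for f in "$1".*; do
        [ -f "$f" ] || continue
        echo "== ${f##*.}"                 # the PID suffix of the file
        syscall_times "$f"
    done
}
```

Running summarize_children trace prints one block per process ID. For
a single process, the strace -c option produces a comparable summary
directly; this sketch is meant for per-child files that have already
been collected.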
9.15.4 See Also
strace(1), and the manpages for the system calls appearing in
strace output.