Performance Tuning for Linux Servers [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Performance Tuning for Linux Servers [Electronic resources] - نسخه متنی

Sandra K. Johnson

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید



OProfile



OProfile is a system-wide tracing tool that is available for the Intel, Power PC, AMD, sparc64, and PA-RISC platforms.


The OProfile tool works off of the system timer support or the performance counter support, if available. OProfile is a useful tool for obtaining a profile of a running system. This profile shows the amount of time being spent in an application and the number of times an application was running when OProfile made its measurement. It also gives information by application and library symbols.


The most unique function of OProfile is that it provides a report that shows the source code for an application with the number of measurements made for each line of code in the application. This function applies to C code and is very useful to determine where in an application to concentrate the tuning efforts.


OProfile is now shipped as part of some Red Hat distributions and is available from other Linux operating system providers. Note that the OProfile web site contains a disclaimer that they are not responsible for the version that ships with Red Hat. Some versions of Red Hat require that SMP support be enabled in the kernel configuration and that the OProfile driver must be built as a loadable module and not built into the kernel. If it's built into the kernel, duplicate function name errors occur.


One other item to note is that on the version that is downloaded from the OProfile development web site, you must run the autosetup.sh command as the first step. This is not always clear in the documentation.


It is best to use the OProfile patch on kernel source that is obtained from the www.kernel.org web site. This patch ensures that all the required support is included in the kernel source. If you use a specific vendor version of the kernel source, you might not have the latest patches. As vendors upgrade their kernel source to newer versions of the Linux kernel, the patches will be included.


Refer to the OProfile web site at http://oprofile.sourceforge.net/ for detailed information on the tool and to download a copy.


The next sections examine the various parts of OProfile, including opcontrol, oprof_start, oprofpp, op_time, op_to_source, and op_merge.


opcontrol



opcontrol initializes the OProfile tool and starts and stops the data collection. The specific command options available with opcontrol are as follows:



--init.
Causes the OProfile driver to be loaded if it isn't already active and makes the driver interface available.



--setup.
Defines the specific setup parameters to be used by OProfile. You can bypass this option by accepting the defaults or by using one of the setup options directly. Some of the more useful setup options are buffer size, performance counter mask to use, number of events between samples, and filters for looking at a specific PID or PID group.



--start-daemon.
Starts the profiling daemon without starting the profiler. Use the --start option to start profiling.



--start.
Starts the profile measurement.



--dump.
Causes the collected profiling data to be dumped to the oprofile daemon.



--stop.
Stops the profiling data collection.



--shutdown.
Stops data collection and removes the daemon.



--reset.
Clears current session data but not saved session data.



Refer to the OProfile documentation for further information.


The following examples show the initialization and starting of the OProfile functions for counter support, creating separate profiles for libraries and the kernel, and saving session data.


Using the Counter Support



The following command initializes OProfile to start counter 0 to use the CPU clock not-halted event and to make the measurement after every 400,000 counter events. Performance counter 1 is set to data memory references, and the measurement is made after every 10,000 events. The Linux kernel executable file is located at /usr/src/Linux and is called vmlinux. After the initialization, the measurement is started.



opcontrol --ctr0-event=CPU_CLK_UNHALTED --ctr0-count=400000
opcontrol --ctr1-event=DATA_MEM_REFS --ctr1-count=10000
opcontrol --vmlinux=/usr/src/Linux/vmlinux
opcontrol --start


When using performance counters, ensure that the count values are large enough to prevent all the time from being spent in processing the performance counter handler. If the counter count value is too low, it is possible that the system will stop functioning and require a reboot. The performance counters are set up to interrupt when a counter overflow occurs. Therefore, if the count is too low, the time will be spent processing the counter overflow interrupt and nothing else.


Separate Profiles for Libraries and the Kernel



opcontrol --separate=kernel --vmlinux=/usr/src/Linux/vmlinux
opcontrol --start
my_test --run
pprofpp -kl -p /lib/modules/Linux/kernel /usr/local/bin/oprofiled


These commands profile the OProfile daemon itself, including when the daemon was running inside the kernel driver and the libraries it is using.


Saving Session Data



The following command saves the profiling data that has been collected in a file called /var/lib/oprofile/samples/run1. This name can then be passed on to the reporting functions to be processed.



opcontrol --save=run1


oprof_start



oprof_start is a GUI interface for opcontrol. oprof_start contains window sections where you can select counter events and a sampling rate, configuration options, and the status of the profiling.


Data Profiling Tools



The following sections describe the tools that profile the sample data OProfile collects. These tools are oprofpp, op_time, op_to_source, and op_merge.


oprofpp



oprofpp produces the following types of reports:



List symbol mode.
Provides a sorted histogram report of sample counts against functions, as shown in a system data walkthrough.



Detailed symbol mode.
Provides individual sample counts for instructions inside a function.



grprof mode, which produces a flat grprof-style report.



Some of the more interesting options that can be used to control the report output are as follows:



Symbol processing: -l is a histogram of counts by symbol name; -s is a detailed listing of the position of the address and the number of samples by symbol; -L is a detailed listing of all symbols.



-g dumps the report to a file in gprof format.



-o reports the function and line number for all samples.



-k reports on shared libraries and kernel entries with respect to the application.



-d and -D are used to detangle the C++ symbols. -D is a more simplified style of reporting.



The following sample listing shows the counts and time by kernel symbols.[View full width]


# oprofpp l
Cpu type: PIII
Cpu speed was (MHz estimation) : 863.886
Counter 0 counted CPU_CLK_UNHALTED events (clocks processor is not halted) with a unit
mask of 0x00 (No unit mask) count 431500
vma samples % symbol name image name
c0123290 24 0.0182187 do_timer /usr/src/linux/vmlinux
c0138810 24 0.0182187 sys_fchdir /usr/src/linux/vmlinux
. . .
c0122950 100 0.0759111 ptrace_readdata /usr/src/linux/vmlinux
c0122a10 103 0.0781885 ptrace_writedata /usr/src/linux/vmlinux
c024b690 103 0.0781885 tcp_rfree /usr/src/linux/vmlinux
c0107df0 107 0.0812249 copy_siginfo_to_user /usr/src/linux/vmlinux
c011eb40 108 0.081984 sys_setitimer /usr/src/linux/vmlinux
c0157240 108 0.081984 proc_file_lseek /usr/src/linux/vmlinux
c0142e30 109 0.0827431 pipe_write /usr/src/linux/vmlinux
c0123900 111 0.0842613 alloc_uid /usr/src/linux/vmlinux
c0122af0 112 0.0850205 init_timervecs /usr/src/linux/vmlinux
. . .
c011f8c0 406 0.308199 do_softirq /usr/src/linux/vmlinux
c010d464 411 0.311995 IRQ0x77_interrupt /usr/src/linux/vmlinux
c0133ad0 416 0.31579 swap_out_pmd /usr/src/linux/vmlinux
c01073f0 437 0.331732 release_segments /usr/src/linux/vmlinux
c024b6b0 454 0.344636 tcp_poll /usr/src/linux/vmlinux
c0131e50 463 0.351469 alloc_area_pmd /usr/src/linux/vmlinux
c01483e0 463 0.351469 dcache_dir_close /usr/src/linux/vmlinux
c014c6b0 465 0.352987 move_lock_status /usr/src/linux/vmlinux
c014c440 479 0.363614 lock_get_status /usr/src/linux/vmlinux
c010c850 522 0.396256 sys_ptrace /usr/src/linux/vmlinux
c0134250 528 0.400811 rmqueue /usr/src/linux/vmlinux
c024b480 541 0.410679 tcp_mem_schedule /usr/src/linux/vmlinux
c0122c70 555 0.421307 second_overflow /usr/src/linux/vmlinux
c01c9b70 564 0.428139 config_chipset_for_pio /usr/src/linux/vmlinux
c011f2b0 571 0.433453 do_adjtimex /usr/src/linux/vmlinux
c0148a00 572 0.434212 filldir64 /usr/src/linux/vmlinux
c01b3150 603 0.457744 figure_loop_size /usr/src/linux/vmlinux
c0114920 613 0.465335 mtrr_ioctl /usr/src/linux/vmlinux
c01b30d0 630 0.47824 compute_loop_size /usr/src/linux/vmlinux
c0208bb0 649 0.492663 qla1280_initialize_adapter /usr/src/linux/vmlinux
c018aea0 676 0.513159 sys_semop /usr/src/linux/vmlinux
c01b1e10 723 0.548837 floppy_release_irq_and_dma /usr/src/linux/vmlinux
c0247ed0 725 0.550356 ip_build_and_send_pkt /usr/src/linux/vmlinux
c0133bf0 726 0.551115 try_to_swap_out /usr/src/linux/vmlinux
c0118a60 728 0.552633 sys_sched_getparam /usr/src/linux/vmlinux
c01325e0 757 0.574647 kmem_cache_grow /usr/src/linux/vmlinuxc0129f80
959 0.727988 do_swap_page /usr/src/linux/vmlinux
c02484b0 1011 0.767461 ip_build_xmit_slow /usr/src/linux/vmlinux
c01c9180 1417 1.07566 hpt34x_dmaproc /usr/src/linux/vmlinux
c0148400 1540 1.16903 dcache_dir_lseek /usr/src/linux/vmlinux
c01069c0 1828 1.38766 gunzip /usr/src/linux/vmlinux
c0208fd0 1890 1.43472 qla1280_pci_config /usr/src/linux/vmlinux
c0108b60 1950 1.48027 handle_signal /usr/src/linux/vmlinux
c0148150 2193 1.66473 sys_ioctl /usr/src/linux/vmlinux
c010e5f0 2414 1.83249 __constant_c_and_count_memset /usr/src/linux/vmlinux
c013a1a0 2482 1.88411 sys_writev /usr/src/linux/vmlinux
c01174c0 2813 2.13538 do_page_fault /usr/src/linux/vmlinux
c0118cb0 3646 2.76772 show_task /usr/src/linux/vmlinux
c02489f0 3718 2.82238 ip_build_xmit /usr/src/linux/vmlinux
c013a230 6183 4.69358 sys_pread /usr/src/linux/vmlinux
c010e4e0 8802 6.6817 __constant_memcpy /usr/src/linux/vmlinux
c013ae60 42015 31.8941 write_locked_buffers /usr/src/linux/vmlinux


op_time



op_time produces summary reports relative to the binaries that are running on the system. Use this report to find out where time is being spent when your test case is running. You can then use other reports to narrow your focus to specific functions and code segments within the application to see where the problem may exist.


Many of the same options that are available for oprofpp are also available for op_time.


The following sample report uses the -D option.



# op_time -D
1 0.0043 0.0000 /lib/libhistory.so.4.3
1 0.0043 0.0000 /opt/kde3/lib/kwin.so
1 0.0043 0.0000 /opt/kde3/lib/libartsflow.so.1.0.0
1 0.0043 0.0000 /usr/X11R6/lib/X11/locale/lib/common/ximcp.so.2
1 0.0043 0.0000 /usr/lib/vslick/bin/vs
2 0.0087 0.0000 /bin/gawk
2 0.0087 0.0000 /lib/modules/2.4.19-4GB/kernel
/drivers/usb/usb-uhci.o
2 0.0087 0.0000 /opt/kde3/lib/libartsmidi.so.0.0.0
3 0.0130 0.0000 /bin/ps
3 0.0130 0.0000 /lib/modules/2.4.19-4GB/oprofile/oprofile.o
4 0.0173 0.0000 /lib/libreadline.so.4.3
4 0.0173 0.0000 /opt/kde3/lib/libkdecore.so.4.0.0
5 0.0216 0.0000 /opt/kde3/lib/libmcop.so.1.0.0
5 0.0216 0.0000 /usr/local/bin/op_help
16 0.0692 0.0000 /usr/X11R6/lib/libX11.so.6.2
16 0.0692 0.0000 /usr/lib/gconv/ISO8859-1.so
19 0.0822 0.0000 /lib/libpthread.so.0
20 0.0865 0.0000 /opt/kde3/lib/libkio.so.4.0.0
25 0.1082 0.0000 /usr/lib/libstdc++.so.5.0.0
26 0.1125 0.0000 /usr/X11R6/bin/XFree86
33 0.1428 0.0000 /opt/kde3/lib/libkonsolepart.so
45 0.1947 0.0000 /bin/bash
59 0.2553 0.0000 /sbin/insmod
68 0.2942 0.0000 /lib/ld-2.2.5.so
163 0.7052 0.0000 /lib/modules/2.4.19-4GB/kernel/fs
/reiserfs/reiserfs.o
297 1.2849 0.0000 /lib/libc.so.6
308 1.3325 0.0000 /usr/lib/qt-3.0.5/lib/libqt-mt.so.3.0.5
661 2.8597 0.0000 /usr/src/Linux/vmlinux
3950 17.0892 0.0000 /usr/local/bin/oprofiled
17373 75.1622 0.0000 /piperf/bin/pi_watch


op_to_source



The op_to_source tool generates annotated source for assembly listings. If the profiled application is built with debug information, a report contains the actual source code with the profiled counts interleaved. If the application is built without the debug information, op_to_source can produce the assembler report. Use the --assembly (-a) option to produce this type of report.[View full width]


# op_to_source
:/*
* IBM Performance Inspector
* Copyright (c) International Business Machines Corp., 2003
*
* This library is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published
* by the Free Software Foundation; either version 2.1 of the License, or
* (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
* the GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this library; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int One(unsigned long long cps);
int Two(unsigned long long cps);
int Four(unsigned long long cps);
int Eight(unsigned long long cps);
int Sixteen(unsigned long long cps);
int ThirtyTwo(unsigned long long cps);
int SixtyFour(unsigned long long cps);
int val = 0;
/************************************/
10 0.003% 580 0.000% :void pc50() { /* pc50 total: 153902 52.31% 40667094 52.46% */
: int i;
:
119226 40.52% 29583773 38.16% : for(i = 0; i < 5000; i++) {
34613 11.76% 11071427 14.28% : val = val + i;
: }
53 0.018% 11314 0.014% :}
: :/************************************/
8 0.002% 290 0.000% :void pc25() { /* pc25 total: 70463 23.95% 18765427 24.20% */
: int i;
:
60711 20.63% 16897936 21.79% : for(i = 0; i < 2500; i++) {
9693 3.295% 1851007 2.387% : val = val + i;
: }
51 0.017% 16194 0.020% :}
:
:/************************************/
13 0.004% 5455 0.007% :void pc15() { /* pc15 total: 43572 14.81% 11245713 14.50% */
: int i;
:
39415 13.39% 11069245 14.27% : for(i = 0; i < 1500; i++) {
4046 1.375% 136253 0.175% : val = val + i;
: }
98 0.033% 34760 0.044% :}
:
:/************************************/
9 0.003% 116 0.000% :void pc10() { /* pc10 total: 26148 8.888% 6818423 8.795% */
: int i;
:
20936 7.116% 5433262 7.008% : for(i = 0; i < 1000; i++) {
5154 1.752% 1365913 1.762% : val = val + i;
: }
49 0.016% 19132 0.024% :}
:
:/************************************/
:int main(int argc, char *argv[]) { /* main total: 84
0.028% 22107 0.028% */
: int i;
: int iter;
:
: if(argc == 2) {
: iter = atoi(argv[1]);
: }
: else {
: fprintf(stderr, " Usage: watch iter\n");
: fflush(stderr);
: exit(0);
: }
:
: val = 0;
47 0.015% 10143 0.013% : for(i = 0; i < iter; i++) {
5 0.001% 1011 0.001% : pc50();
11 0.003% 6072 0.007% : pc25();
9 0.003% 1313 0.001% : pc15();
12 0.004% 3568 0.004% : pc10();
: }
:
: fprintf(stderr, " val = %d\n", val);
: return(0);
:}
:
:
/* * Total samples for file : "/piperf/src/pi_watch.c"
*
* 294169 100.0% 77518764 100.0%
*/
/*
* Command line: op_to_source --source-dir=/piperf/src --output-dir=/temp /piperf/bin/pi_watch
*
* Interpretation of command line:
* Output annotated source file with samples
* Output all files
*
* Cpu type: PIII
* Cpu speed (MHz estimation) : 662.229
*
*
Counter 0 counted CPU_CLK_UNHALTED events (clock's processor is not halted) with a unit
mask of 0x00 (No unit mask) count 400000
* Total samples : 294169
*
Counter 1 counted DATA_MEM_REFS events (all memory references, cacheable and non) with a
unit mask of 0x00 (No unit mask) count 1000
* Total samples : 77518764
*/


op_merge



op_merge merges profiling samples created with the --separate option when applied to the same binary. op_merge can produce reports for separate shared libraries and then merge the reports for a complete system view. You can also produce reports on a combination of a subset of profiled applications.



/ 227