Performance Tuning for Linux Servers [Electronic resources]

Sandra K. Johnson




Memory Utilization



Workloads have a tendency to consume all available memory. Linux provides reasonably efficient access to physical memory and also provides access to potentially huge amounts of "virtual" memory. Virtual memory is usually little more than the operating system's ability to offload less frequently used data to disk storage while presenting the illusion that the system has an enormous amount of physical memory. Unfortunately, accessing memory that has been offloaded to disk can be ten to a hundred times slower, and those high latencies can hurt application response times dramatically if the wrong memory is paged out to disk, or if the application's active memory footprint is larger than the size of physical memory.


Many performance problems are caused by insufficient memory, which triggers system swapping. Thus, it is useful to have tools that monitor memory utilization: for example, how much memory is consumed per process or per thread, and how much memory is consumed by the kernel data structures, along with their counts and sizes. As with CPU utilization, understanding how both the system and individual processes are behaving is key to tracking down any performance problems caused by memory shortages.


/proc/meminfo and /proc/slabinfo



Linux provides facilities to monitor the utilization of overall system memory resources under the /proc file system, namely /proc/meminfo and /proc/slabinfo. These two files capture the state of the physical memory. A partial display of /proc/meminfo is as follows:



MemTotal: 8282420 kB
MemFree: 7942396 kB
Buffers: 46992 kB
Cached: 191936 kB
SwapCached: 0 kB
HighTotal: 7470784 kB
HighFree: 7232384 kB
LowTotal: 811636 kB
LowFree: 710012 kB
SwapTotal: 618492 kB
SwapFree: 618492 kB
Mapped: 36008 kB
Slab: 36652 kB


MemTotal gives the total amount of physical memory of the system, whereas MemFree gives the total amount of unused memory.


Buffers corresponds to the buffer cache for I/O operations. Cached corresponds to the memory for reading files from the disk.


SwapCached represents the amount of memory that has been written out to the swap space but is still also held in memory.


SwapTotal represents the total amount of disk space set aside for swapping. If an IA32-based system has more than 1GB of physical memory, HighTotal is nonzero.


HighTotal corresponds to the portion of physical memory above approximately 860MB (high memory).


LowTotal is the low memory region, which the kernel can use directly. Mapped corresponds to memory used for files that are memory-mapped.


Slab corresponds to the memory used for the kernel data structures. By capturing /proc/meminfo periodically, you can establish a pattern of memory utilization. With the aid of simple scripts and graphics tools, the pattern can also be summarized visually.
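Because /proc/meminfo is a plain text file, the periodic capture mentioned above needs nothing more than a small script. The following is a minimal sketch, not a standard utility: it samples a few illustrative fields (MemFree, Buffers, Cached, and Slab) every 10 seconds and appends one timestamped line per sample to a log file named meminfo.log for later graphing.

#!/bin/sh
# Append one timestamped sample of selected /proc/meminfo fields every 10 seconds.
while true
do
    printf '%s ' "$(date +%s)"
    awk '/^(MemFree|Buffers|Cached|Slab):/ { printf "%s%s ", $1, $2 }' /proc/meminfo
    echo
    sleep 10
done >> meminfo.log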


To understand kernel memory consumption, examine /proc/slabinfo. A partial display of /proc/slabinfo is as follows:



tcp_bind_bucket 56 224 32 2 2 1
tcp_open_request 16 58 64 1 1 1
inet_peer_cache 0 0 64 0 0 1
secpath_cache 0 0 32 0 0 1
flow_cache 0 0 64 0 0 1


The first column lists the names of the kernel data structures. Taking tcp_bind_bucket as an example, there is a total of 224 tcp_bind_bucket objects, 56 of which are active. Each object takes up 32 bytes. Two slabs contain at least one active object, two slabs are allocated in total, and each slab occupies one page. This information highlights certain data structures that merit closer attention, such as those with larger counts or sizes. Thus, by capturing meminfo and slabinfo together, you can begin to understand which elements of the operating system are consuming the most memory. If the values of LowFree or HighFree are relatively small (or smaller than usual), that might indicate that the system is handling more requests for memory than usual, which may lead to a reduction in overall performance or application response times.
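If you only want to see the heaviest consumers, a short pipeline can rank the caches by approximate footprint. The sketch below is one possible approach rather than a standard command: it assumes the column layout shown above (total object count in column 3, object size in column 4) and, on newer kernels, may require root privileges to read /proc/slabinfo.

# Approximate each cache's footprint as total objects * object size,
# then list the ten largest consumers.
awk '$2 ~ /^[0-9]+$/ { printf "%-24s %12d bytes\n", $1, $3 * $4 }' /proc/slabinfo |
    sort -k2 -rn | head -10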


ps



To find out how memory is used by a particular process, start with ps, which gives an overview of memory use per process:



$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 1528 528 ? S 15:24 0:00 init [2]
root 2 0.0 0.0 0 0 ? SN 15:24 0:00 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? S< 15:24 0:00 [events/0]
root 4 0.0 0.0 0 0 ? S< 15:24 0:00 [khelper]
root 5 0.0 0.0 0 0 ? S< 15:24 0:00 [kacpid]
root 48 0.0 0.0 0 0 ? S< 15:24 0:00 [kblockd/0]
root 63 0.0 0.0 0 0 ? S 15:24 0:00 [pdflush]
root 64 0.0 0.0 0 0 ? S 15:24 0:00 [pdflush]


The output of the ps aux command shows the total percentage of system memory that each process consumes, as well as its virtual memory footprint (VSZ) and the amount of physical memory that the process is currently using (RSS). You can also use top(1) to sort the process listing interactively to see which processes are consuming the most memory and how that consumption changes as the system runs.
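On systems with the procps version of ps (the common case on Linux), the listing can also be sorted non-interactively. For example, the following invocation, shown here only as an illustration, lists the five processes with the largest resident set:

# Show the five processes holding the most physical memory (largest RSS);
# head -6 keeps the header line plus five processes.
ps aux --sort=-rss | head -6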


After you have identified a few processes of interest, you can look into the specific memory allocations a process is using by examining the layout of its virtual address space. /proc/pid/maps, where pid is the process ID of a particular process as found through ps(1) or top(1), contains all of the mappings in the process's address space and their sizes. Each map shows the address range that is allocated, the permissions on the pages, and the location of the backing store associated with that address range (if any). /proc/pid/maps is not a performance tool per se; however, it provides insight into how memory is allocated. For example, for performance purposes, you can confirm whether a certain amount of shared memory is allocated between 1GB and 2GB in the virtual address space. The following map can be used for this kind of examination:


$ cat /proc/3162/maps
08048000-08056000 r-xp 00000000 03:05 33015      /usr/lib/gnome-applets/battstat-applet-2
08056000-08058000 rw-p 0000d000 03:05 33015      /usr/lib/gnome-applets/battstat-applet-2
08058000-08163000 rw-p 08058000 00:00 0
40000000-40016000 r-xp 00000000 03:02 40006      /lib/ld-2.3.2.so
40016000-40017000 rw-p 00015000 03:02 40006      /lib/ld-2.3.2.so
40017000-40018000 rw-p 40017000 00:00 0
40018000-4001a000 r-xp 00000000 03:05 578493     /usr/X11R6/lib/X11/locale/lib/common/xlcDef.so.2
4001a000-4001b000 rw-p 00001000 03:05 578493     /usr/X11R6/lib/X11/locale/lib/common/xlcDef.so.2
4001b000-4001d000 r-xp 00000000 03:05 128867     /usr/lib/gconv/ISO8859-1.so
4001d000-4001e000 rw-p 00001000 03:05 128867     /usr/lib/gconv/ISO8859-1.so
4001f000-40023000 r-xp 00000000 03:05 514375     /usr/lib/gtk-2.0/2.4.0/loaders/libpixbufloader-png.so
40023000-40024000 rw-p 00003000 03:05 514375     /usr/lib/gtk-2.0/2.4.0/loaders/libpixbufloader-png.so
40025000-40031000 r-xp 00000000 03:05 337881     /usr/lib/libpanel-applet-2.so.0.0.19
40031000-40032000 rw-p 0000c000 03:05 337881     /usr/lib/libpanel-applet-2.so.0.0.19
40032000-400d2000 r-xp 00000000 03:05 337625     /usr/lib/libgnomeui-2.so.0.600.1
400d2000-400d6000 rw-p 0009f000 03:05 337625     /usr/lib/libgnomeui-2.so.0.600.1
400d6000-400d7000 rw-p 400d6000 00:00 0
400d7000-400df000 r-xp 00000000 03:05 53         /usr/X11R6/lib/libSM.so.6.0
400df000-400e0000 rw-p 00007000 03:05 53         /usr/X11R6/lib/libSM.so.6.0
400e0000-400f4000 r-xp 00000000 03:05 51         /usr/X11R6/lib/libICE.so.6.3
400f4000-400f5000 rw-p 00013000 03:05 51         /usr/X11R6/lib/libICE.so.6.3
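As a simple illustration of putting the map to work, the following one-liner is a rough sketch that sums the size of every mapping for the same example process (PID 3162). It computes end minus start for each address range and assumes GNU awk, whose strtonum() function converts the hexadecimal addresses:

# Total the size of all address ranges in the maps file, reported in megabytes.
gawk -F'[- ]' '{ total += strtonum("0x" $2) - strtonum("0x" $1) }
               END { printf "%.1f MB mapped\n", total / (1024 * 1024) }' /proc/3162/maps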


vmstat



vmstat was introduced in the section on CPU utilization. However, its primary purpose is to monitor memory availability and swapping activity, and it also provides an overview of I/O activity. vmstat can be used to help find unusual system activity, such as heavy paging or excessive context switching, that can lead to a degradation in system performance. A sample of the vmstat output is as follows:



procs -----------memory----------
r b swpd free buff cache
18 8 0 5626196 3008 122788
18 15 0 5625132 3008 122828
17 12 0 5622004 3008 122828
22 2 0 5621644 3008 122828
23 5 0 5621616 3008 122868
21 14 0 5621868 3008 122868
22 10 0 5625216 3008 122868
---swap-- -----io---- --system-- ----cpu----
si so bi bo in cs us sy id wa
0 0 330403 454 2575 4090 91 8 1 0
0 0 328767 322 2544 4264 91 8 0 0
0 0 327956 130 2406 3998 92 8 0 0
0 0 327892 689 2445 4077 92 8 0 0
0 0 323171 407 2339 4037 92 8 1 0
0 0 323663 23 2418 4160 91 9 0 0
0 0 328828 153 2934 4518 90 9 1 0



The memory-related data reported by vmstat includes the following:



memory reports the amount of memory being swapped out (swpd), free memory (free), buffer cache for I/O data structures (buff), and cached memory for files read from the disk (cache) in kilobytes.



swap, in kilobytes per second, is the amount of memory swapped in (si) from disk and swapped out (so) to disk.



io reports the number of blocks read in from the devices (bi) and blocks written out to the devices (bo) per second.



For I/O-intensive workloads, you can monitor bi and bo for the transfer rate, and in for the interrupt rate. You can monitor swpd, si, and so to see whether the system is swapping, and if it is, check the swapping rate. Perhaps the most commonly watched metric is CPU utilization, covered by the us, sy, id, and wa columns. If wa is large, examine the I/O subsystem; you might conclude that more I/O controllers and disks are needed to reduce the I/O wait time.
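In practice, vmstat is usually run with an interval and a count so that the columns above can be watched over time. The interval and sample count below are arbitrary choices for illustration:

# Report statistics every 5 seconds, 12 times. The first line shows averages
# since boot; each subsequent line covers the most recent 5-second interval.
vmstat 5 12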


