Linux Device Drivers (3rd Edition) [Electronic resources], text version

8.3. get_free_page and Friends

If a module needs to allocate big chunks
of memory, it is usually better to use a
page-oriented technique.
Requesting whole pages also has other advantages, which are
introduced in Chapter 15.

To allocate pages, the following functions are available:

get_zeroed_page(unsigned int flags);

Returns
a pointer to a new page and fills the page with zeros.

__get_free_page(unsigned int flags);

Similar to get_zeroed_page, but
doesn't clear the page.

__get_free_pages(unsigned int flags, unsigned int order);

Allocates and returns a pointer to the first byte of a memory area
that is potentially several (physically contiguous) pages long but
doesn't zero the area.

The flags argument works in the same way as with kmalloc; usually
either GFP_KERNEL or GFP_ATOMIC is used, perhaps with the addition of
the __GFP_DMA flag (for memory that can be used for ISA
direct-memory-access operations) or __GFP_HIGHMEM when high memory
can be used.^[2]

order is the base-two logarithm of the number of pages you are
requesting or freeing (i.e., log₂N). For example, order is 0 if you
want one page and 3 if you request eight pages. If order is too big
(no contiguous area of that size is available), the page allocation
fails. The get_order function, which takes an integer argument, can
be used to extract the order from a size (which must be a power of
two) for the hosting platform. The maximum allowed value for order is
10 or 11 (corresponding to 1024 or 2048 pages), depending on the
architecture. The chances of an order-10 allocation succeeding on
anything other than a freshly booted system with a lot of memory are
small, however.
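The relationship between a size in bytes and an order can be sketched
in ordinary user-space C. This is not the kernel's get_order
implementation, only an illustration of the arithmetic; the names
SKETCH_PAGE_SHIFT, order_for_size, and bytes_for_order are made up for
this example, and a 4 KB page size is assumed. Unlike the description
above, this sketch rounds sizes that are not a power of two up to the
next order.

```c
#include <stddef.h>

/* Assumption: 4 KB pages. In the kernel the real value is PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * The caller must pass a nonzero size.
 */
unsigned int order_for_size(size_t size)
{
    unsigned int order = 0;
    /* Number of whole pages needed to hold size bytes. */
    size_t pages = (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Number of bytes covered by an allocation of the given order. */
size_t bytes_for_order(unsigned int order)
{
    return SKETCH_PAGE_SIZE << order;
}
```

With these assumptions, a request for eight pages (32768 bytes) maps
to order 3, and a request one byte larger than a page needs order 1,
that is, two full pages.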

^[2] alloc_pages (described shortly) should really be used for
allocating high-memory pages, however, for reasons we can't really
get into until Chapter 15.

If you are curious, /proc/buddyinfo tells you
how many blocks of each order are available for each memory zone on
the system.
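As a rough illustration, the following user-space C sketch parses one
line of /proc/buddyinfo and adds up the free pages it describes. It
assumes the usual line layout, "Node N, zone NAME" followed by one
count per order starting at order 0; the function names
parse_buddyinfo_line and total_free_pages are invented for this
example.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a line such as
 *   "Node 0, zone   Normal     12      7      3      0      1"
 * counts[i] receives the number of free blocks of order i.
 * Returns the number of orders parsed, or -1 if the line does not
 * start with the expected "Node N, zone NAME" prefix.
 */
int parse_buddyinfo_line(const char *line, unsigned long *counts,
                         int max_orders)
{
    int node = 0;
    int consumed = 0;
    char zone[32];
    const char *p;
    int n = 0;

    if (sscanf(line, "Node %d, zone %31s%n", &node, zone, &consumed) < 2
        || consumed <= 0)
        return -1;

    p = line + consumed;
    while (n < max_orders) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);

        if (end == p)       /* no more numbers on this line */
            break;
        counts[n++] = v;
        p = end;
    }
    return n;
}

/* A free block of order i contributes 2^i pages. */
unsigned long total_free_pages(const unsigned long *counts, int n)
{
    unsigned long total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += counts[i] << i;
    return total;
}
```

A real tool would read the file line by line with fgets and call the
parser on each line; a zero in the column for a given order means that
an allocation of that order from that zone cannot be satisfied without
first freeing or merging memory.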

When
a program is done with the pages, it can free them with one of the
following functions. The first function is a macro that falls back on
the second:

void free_page(unsigned long addr);
void free_pages(unsigned long addr, unsigned long order);

If you try to free a different number of pages from what you
allocated, the memory map becomes corrupted, and the system gets in
trouble at a later time.

It's worth stressing that __get_free_pages and the
other functions can be called at any time, subject to the same rules
we saw for kmalloc. The functions can fail to
allocate memory in certain circumstances, particularly when
GFP_ATOMIC is used. Therefore, the program calling
these allocation functions must be prepared to handle an allocation
failure.
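One possible way to cope with a failed multipage request is to retry
with a smaller order. The user-space sketch below models that idea
only; it is not taken from scullp or from the kernel. The allocator is
passed in as a function pointer standing in for __get_free_pages, and
the names page_alloc_fn, alloc_with_fallback, fake_alloc_small, and
fake_alloc_none are all hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for __get_free_pages: returns an address,
 * or 0 on failure, as the kernel function does.
 */
typedef unsigned long (*page_alloc_fn)(unsigned int order);

/*
 * Try max_order first, then progressively smaller orders.
 * On success, *got_order holds the order actually obtained; that same
 * order must later be passed to the matching free call.
 * Returns 0 if even an order-0 request fails.
 */
unsigned long alloc_with_fallback(page_alloc_fn alloc,
                                  unsigned int max_order,
                                  unsigned int *got_order)
{
    unsigned int order = max_order;

    for (;;) {
        unsigned long addr = alloc(order);

        if (addr) {
            *got_order = order;
            return addr;
        }
        if (order == 0)
            return 0;       /* nothing left to try */
        order--;
    }
}

/* Test allocator: pretends only blocks of order 2 or less exist. */
unsigned long fake_alloc_small(unsigned int order)
{
    return order <= 2 ? (0x1000UL << order) : 0;
}

/* Test allocator: pretends memory is exhausted. */
unsigned long fake_alloc_none(unsigned int order)
{
    (void)order;
    return 0;
}
```

Recording the order that was actually granted matters because the
free functions must be given the same order that was allocated.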

Although
kmalloc(GFP_KERNEL) sometimes fails when there is
no available memory, the kernel does its best to fulfill allocation
requests. Therefore, it's easy to degrade system
responsiveness by allocating too much memory. For example, you can
bring the computer down by pushing too much data into a
scull device; the system starts crawling while
it tries to swap out as much as possible in order to fulfill the
kmalloc request. Since every resource is being
sucked up by the growing device, the computer is soon rendered
unusable; at that point, you can no longer even start a new process
to try to deal with the problem. We don't address
this issue in scull, since it is just a sample
module and not a real tool to put into a multiuser system. As a
programmer, you must be careful nonetheless, because a module is
privileged code and can open new security holes in the system (the
most likely is a denial-of-service hole like the one just outlined).

8.3.1. A scull Using Whole Pages: scullp

In order to test page allocation for
real, we have released the scullp module
together with other sample code. It is a reduced
scull, just like scullc
introduced earlier.

Memory quanta allocated by scullp are whole
pages or page sets: the scullp_order variable
defaults to 0 but can be changed at either compile
or load time.

The following lines show how it allocates memory:

/* Here's the allocation of a single quantum */
if (!dptr->data[s_pos]) {
    dptr->data[s_pos] =
        (void *)__get_free_pages(GFP_KERNEL, dptr->order);
    if (!dptr->data[s_pos])
        goto nomem;
    memset(dptr->data[s_pos], 0, PAGE_SIZE << dptr->order);
}

The code to deallocate memory in scullp looks
like this:

/* This code frees a whole quantum-set */
for (i = 0; i < qset; i++)
    if (dptr->data[i])
        free_pages((unsigned long)(dptr->data[i]),
                   dptr->order);

At the user level, the perceived difference is primarily a speed
improvement and better memory use, because there is no internal
fragmentation of memory. We ran some tests copying 4 MB from
scull0 to scull1 and then
from scullp0 to scullp1;
the results showed a slight improvement in kernel-space processor
usage.

The performance improvement is not dramatic, because
kmalloc is designed to be fast. The main
advantage of page-level allocation isn't actually
speed, but rather more efficient memory usage. Allocating by pages
wastes no memory, whereas using kmalloc wastes
an unpredictable amount of memory because of allocation granularity.
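The effect of allocation granularity can be made concrete with a small
user-space calculation. The sketch assumes, as a simplification, that
requests are rounded up to pure power-of-two size classes starting at
32 bytes; the real kmalloc size classes depend on the kernel version
and architecture. The names SKETCH_MIN_CLASS, pow2_class, and
pow2_waste are invented for this example.

```c
#include <stddef.h>

/* Assumption: the smallest size class is 32 bytes. */
#define SKETCH_MIN_CLASS 32

/* Round a request up to the next power-of-two size class. */
size_t pow2_class(size_t req)
{
    size_t cls = SKETCH_MIN_CLASS;

    while (cls < req)
        cls <<= 1;
    return cls;
}

/* Bytes handed out but not asked for. */
size_t pow2_waste(size_t req)
{
    return pow2_class(req) - req;
}
```

Under these assumptions, a 4000-byte request wastes 96 bytes, while a
4097-byte request is rounded up to 8192 bytes and wastes 4095, almost
half of the block.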

But the biggest advantage of the __get_free_page functions is that
the pages obtained are
completely yours, and you could, in theory, assemble the pages into a
linear area by appropriate tweaking of the page tables. For example,
you can allow a user process to mmap memory
areas obtained as single unrelated pages. We discuss this kind of
operation in Chapter 15, where
we show how scullp offers memory mapping,
something that scull cannot offer.

8.3.2. The alloc_pages Interface

For completeness, we introduce another interface for memory
allocation, even though we will not be prepared to use it until after
Chapter 15. For now, suffice it to say that struct page
is an internal kernel structure that describes a page of memory. As
we will see, there are many places in the kernel where it is
necessary to work with page structures; they are
especially useful in any situation where you might be dealing with
high memory, which does not have a constant address in kernel space.

The real core of the Linux page allocator is a function called
alloc_pages_node:

struct page *alloc_pages_node(int nid, unsigned int flags,
                              unsigned int order);

This function also has two variants (which are simply macros); these
are the versions that you will most likely use:

struct page *alloc_pages(unsigned int flags, unsigned int order);
struct page *alloc_page(unsigned int flags);

The core function, alloc_pages_node, takes three arguments. nid is
the ID of the NUMA node^[3] whose memory should be allocated, flags is
the usual set of GFP_ allocation flags, and order is the size of the
allocation, expressed as the base-two logarithm of the number of
pages. The return value is a pointer to the first of (possibly many)
page structures describing the allocated memory, or, as usual, NULL
on failure.

^[3] NUMA (nonuniform memory access) computers are multiprocessor
systems where memory is "local" to
specific groups of processors
("nodes"). Access to local memory
is faster than access to nonlocal memory. On such systems, allocating
memory on the correct node is important. Driver authors do not
normally have to worry about NUMA issues, however.

alloc_pages simplifies the situation by
allocating the memory on the current NUMA node (it calls
alloc_pages_node with the return value from
numa_node_id as the nid
parameter). And, of course, alloc_page omits the
order parameter and allocates a single page.

To release pages allocated in this manner, you should use one of the
following:

void __free_page(struct page *page);
void __free_pages(struct page *page, unsigned int order);
void free_hot_page(struct page *page);
void free_cold_page(struct page *page);

If you have specific knowledge of whether a single
page's contents are likely to be resident in the
processor cache, you should communicate that to the kernel with
free_hot_page (for cache-resident pages) or
free_cold_page. This information helps the
memory allocator optimize its use of memory across the
system.
