Chapter 15. The Page Cache and Page Writeback
The linux kernel implements a primary disk cache called the page cache . The goal of this cache is to minimize disk I/O by storing in physical memory data that would otherwise be accessed from disk. This chapter deals with the page cache and page writeback.Chapter 13, "The Block I/O Layer," that a buffer is the in-memory representation of a single physical disk block. Buffers act as descriptors that map pages in memory to disk blocks; thus, the page cache also reduces disk access during block I/O operations by both caching disk blocks and buffering block I/O operations until later. This caching is often referred to as the "buffer cache," although in reality it is not a separate cache and is part of the page cache. Let's look at the sort of operations and data that end up in the page cache. The page cache is primarily populated by page I/O operations, such as read() and write(). Page I/O operations manipulate entire pages of data at a time; this entails operations on more than one disk block. Consequently, the page cache caches page-size chunks of files. Block I/O operations manipulate a single disk block at a time. A common block I/O operation is reading and writing inodes. The kernel provides the bread() function to perform a low-level read of a single block from disk. Via buffers, disk blocks are mapped to their associated in-memory pages and, thus, cached in the page cache. For example, when you first open a source file in a text editor, data from the file is read into memory from disk. As you edit the file, more and more pages are read in. When you later compile the file, the kernel can use the pages directly from the page cache; it need not reread the file from disk. Because users tend to read and manipulate the same files repeatedly, the page cache reduces the need for a large number of disk operations.