Advanced Programming in the UNIX Environment: Second Edition [Electronic resources]

W. Richard Stevens; Stephen A. Rago

نسخه متنی -صفحه : 369/ 231

14.7. readv and writev Functions

[View full width]

#include <sys/uio.h> ssize_t readv(int

filedes , const struct iovec *

iov , int

iovcnt ); ssize_t writev(int

filedes , const struct iovec *

iov , int

iovcnt );

Both return: number of bytes read or written, 1 on error

The second argument to both functions is a pointer to an array of iovec structures:

struct iovec { void *iov_base; /* starting address of buffer */ size_t iov_len; /* size of buffer */ };

The number of elements in the

iov array is specified by

iovcnt . It is limited to IOV_MAX (Recall Figure 2.10). Figure 14.27 shows a picture relating the arguments to these two functions and the iovec structure.

Figure 14.27. The iovec structure for readv and writev

[View full size image]

The writev function gathers the output data from the buffers in order:

iov[0] ,

iov[1] , through

iov[iovcnt

1] ; writev returns the total number of bytes output, which should normally equal the sum of all the buffer lengths.

The readv function scatters the data into the buffers in order, always filling one buffer before proceeding to the next. readv returns the total number of bytes that were read. A count of 0 is returned if there is no more data and the end of file is encountered.

These two functions originated in 4.2BSD and were later added to SVR4. These two functions are included in the XSI extension of the Single UNIX Specification.

Although the Single UNIX Specification defines the buffer address to be a void *, many implementations that predate the standard still use a char * instead.

Example

Section 20.8, in the function _db_writeidx, we need to write two buffers consecutively to a file. The second buffer to output is an argument passed by the caller, and the first buffer is one we create, containing the length of the second buffer and a file offset of other information in the file. There are three ways we can do this.

Call write twice, once for each buffer.

Allocate a buffer of our own that is large enough to contain both buffers, and copy both into the new buffer. We then call write once for this new buffer.

Call writev to output both buffers.

The solution we use in Section 20.8 is to use writev, but it's instructive to compare it to the other two solutions.

Section 8.16) to obtain the user CPU time, system CPU time, and wall clock time before and after the writes. All three times are shown in seconds.

As we expect, the system time increases when we call write twice, compared to calling either write or writev once. This correlates with the results in Figure 3.5.

Next, note that the sum of the CPU times (user plus system) is less when we do a buffer copy followed by a single write compared to a single call to writev. With the single write, we copy the buffers to a staging buffer at user level, and then the kernel will copy the data to its internal buffers when we call write. With writev, we should do less copying, because the kernel only needs to copy the data directly into its staging buffers. The fixed cost of using writev for such small amounts of data, however, is greater than the benefit. As the amount of data we need to copy increases, the more expensive it will be to copy the buffers in our program, and the writev alternative will be more attractive.

Be careful not to infer too much about the relative performance of Linux to Mac OS X from the numbers shown in