The second argument to both functions is a pointer to an array of iovec structures:
struct iovec {
    void   *iov_base;   /* starting address of buffer */
    size_t  iov_len;    /* size of buffer */
};
The number of elements in the iov array is specified by iovcnt. It is limited to IOV_MAX (recall Figure 2.10). Figure 14.27 shows a picture relating the arguments to these two functions and the iovec structure.
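As a minimal sketch of checking that limit at run time, the helper below queries sysconf with _SC_IOV_MAX. The name iov_limit is hypothetical and not from the text; a program could equally use the IOV_MAX constant where <limits.h> defines it.

```c
#include <unistd.h>

/*
 * iov_limit is a hypothetical helper (not from the text): ask the system
 * at run time how many iovec elements a single readv or writev call
 * accepts. Returns -1 if sysconf cannot report a limit.
 */
long
iov_limit(void)
{
    return sysconf(_SC_IOV_MAX);
}
```

A caller with more buffers than this limit would have to split them across several readv or writev calls.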
The writev function gathers the output data from the buffers in order: iov[0], iov[1], through iov[iovcnt-1]; writev returns the total number of bytes output, which should normally equal the sum of all the buffer lengths.
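The gather behavior can be sketched with a two-element iov array. The helper name write_two and its string arguments are assumptions made for illustration; only writev and struct iovec come from the text.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/*
 * write_two is a hypothetical helper (not from the text): gather a header
 * string and a body string into one writev call. It returns whatever
 * writev returns: the total bytes output, or -1 on error.
 */
ssize_t
write_two(int fd, const char *hdr, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)hdr;    /* cast drops const; writev only reads */
    iov[0].iov_len  = strlen(hdr);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    return writev(fd, iov, 2);        /* iov[0] is output first, then iov[1] */
}
```

A careful caller would compare the return value against the sum of the two lengths, since a short count is possible.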
The readv function scatters the data into the buffers in order, always filling one buffer before proceeding to the next. readv returns the total number of bytes that were read. A count of 0 is returned if there is no more data and the end of file is encountered.
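The scatter behavior can be sketched the same way. The helper name read_split and its parameters are hypothetical; the point is only that the first buffer is filled completely before any data lands in the second.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * read_split is a hypothetical helper (not from the text): scatter one
 * read across two caller-supplied buffers. Buffer a is filled first;
 * any remaining data goes into b. Returns the total bytes read,
 * 0 at end of file, or -1 on error.
 */
ssize_t
read_split(int fd, char *a, size_t alen, char *b, size_t blen)
{
    struct iovec iov[2];

    iov[0].iov_base = a;
    iov[0].iov_len  = alen;
    iov[1].iov_base = b;
    iov[1].iov_len  = blen;

    return readv(fd, iov, 2);
}
```

This shape suits a fixed-size header followed by a variable-size body, where the caller wants each part in its own buffer.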
These two functions originated in 4.2BSD and were later added to SVR4. They are included in the XSI extension of the Single UNIX Specification.
Although the Single UNIX Specification defines the buffer address to be a void *, many implementations that predate the standard still use a char * instead.
In Section 20.8, in the function _db_writeidx, we need to write two buffers consecutively to a file. The second buffer to output is an argument passed by the caller, and the first buffer is one we create, containing the length of the second buffer and a file offset of other information in the file. There are three ways we can do this.
1. Call write twice, once for each buffer.
2. Allocate a buffer of our own that is large enough to contain both buffers, and copy both into the new buffer. We then call write once for this new buffer.
3. Call writev to output both buffers.
The solution we use in Section 20.8 is to use writev, but it's instructive to compare it to the other two solutions.
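The three alternatives can be sketched side by side. This is not the code from Section 20.8; the function names and the generic header/data parameters are hypothetical, and the sketch assumes nonzero lengths.

```c
#include <sys/types.h>
#include <sys/uio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Way 1: two write calls, one per buffer (two system calls). */
ssize_t
two_writes(int fd, const void *h, size_t hlen, const void *d, size_t dlen)
{
    ssize_t n1, n2;

    if ((n1 = write(fd, h, hlen)) < 0)
        return -1;
    if ((n2 = write(fd, d, dlen)) < 0)
        return -1;
    return n1 + n2;
}

/* Way 2: copy both buffers into a user-level staging buffer, write once. */
ssize_t
copy_then_write(int fd, const void *h, size_t hlen, const void *d, size_t dlen)
{
    char    *tmp;
    ssize_t  n;

    if ((tmp = malloc(hlen + dlen)) == NULL)
        return -1;
    memcpy(tmp, h, hlen);
    memcpy(tmp + hlen, d, dlen);
    n = write(fd, tmp, hlen + dlen);
    free(tmp);
    return n;
}

/* Way 3: one writev call that gathers both buffers, no user-level copy. */
ssize_t
gather_write(int fd, const void *h, size_t hlen, const void *d, size_t dlen)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)h;    /* cast drops const; writev only reads */
    iov[0].iov_len  = hlen;
    iov[1].iov_base = (void *)d;
    iov[1].iov_len  = dlen;
    return writev(fd, iov, 2);
}
```

All three produce the same bytes in the file; they differ in the number of system calls and in how much copying happens at user level.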
We used times (Section 8.16) to obtain the user CPU time, system CPU time, and wall clock time before and after the writes. All three times are shown in seconds.
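A minimal sketch of taking such before-and-after readings with the POSIX times function follows. It is not the test program behind the reported numbers; struct elapsed, time_call, and the stand-in workload burn_cpu are hypothetical names introduced here.

```c
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical result type: all three values are in seconds. */
struct elapsed {
    double real;    /* wall clock time */
    double user;    /* user CPU time */
    double sys;     /* system CPU time */
};

/* Stand-in workload for illustration; a real test would do the writes. */
void
burn_cpu(void)
{
    volatile unsigned long i;

    for (i = 0; i < 1000000UL; i++)
        ;
}

/*
 * time_call (hypothetical): call times before and after fn, then convert
 * the clock-tick differences to seconds using sysconf(_SC_CLK_TCK).
 * Returns 0 on success, -1 on error.
 */
int
time_call(void (*fn)(void), struct elapsed *out)
{
    struct tms start, end;
    clock_t    c0, c1;
    long       tck;

    if ((tck = sysconf(_SC_CLK_TCK)) <= 0)
        return -1;
    if ((c0 = times(&start)) == (clock_t)-1)
        return -1;
    fn();
    if ((c1 = times(&end)) == (clock_t)-1)
        return -1;

    out->real = (double)(c1 - c0) / tck;
    out->user = (double)(end.tms_utime - start.tms_utime) / tck;
    out->sys  = (double)(end.tms_stime - start.tms_stime) / tck;
    return 0;
}
```

Because the resolution is one clock tick, a workload has to run long enough for the differences to be meaningful.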
As we expect, the system time increases when we call write twice, compared to calling either write or writev once. This correlates with the results in Figure 3.5.
Next, note that the sum of the CPU times (user plus system) is less when we do a buffer copy followed by a single write compared to a single call to writev. With the single write, we copy the buffers to a staging buffer at user level, and then the kernel copies the data to its internal buffers when we call write. With writev, we should do less copying, because the kernel only needs to copy the data directly into its staging buffers. The fixed cost of using writev for such small amounts of data, however, is greater than the benefit. As the amount of data we need to copy increases, copying the buffers in our program becomes more expensive, and the writev alternative becomes more attractive.
Be careful not to infer too much about the relative performance of Linux to Mac OS X from the numbers shown in