Advanced Programming in the UNIX Environment: Second Edition [Electronic resources] نسخه متنی

اینجــــا یک کتابخانه دیجیتالی است

با بیش از 100000 منبع الکترونیکی رایگان به زبان فارسی ، عربی و انگلیسی

Advanced Programming in the UNIX Environment: Second Edition [Electronic resources] - نسخه متنی

W. Richard Stevens; Stephen A. Rago

| نمايش فراداده ، افزودن یک نقد و بررسی
افزودن به کتابخانه شخصی
ارسال به دوستان
جستجو در متن کتاب
بیشتر
تنظیمات قلم

فونت

اندازه قلم

+ - پیش فرض

حالت نمایش

روز نیمروز شب
جستجو در لغت نامه
بیشتر
لیست موضوعات
توضیحات
افزودن یادداشت جدید



3.6. lseek Function


Every open file has an associated "current file offset," normally a non-negative integer that measures the number of bytes from the beginning of the file. (We describe some exceptions to the "non-negative" qualifier later in this section.) Read and write operations normally start at the current file offset and cause the offset to be incremented by the number of bytes read or written. By default, this offset is initialized to 0 when a file is opened, unless the O_APPEND option is specified.

An open file's offset can be set explicitly by calling lseek.

#include <unistd.h>
off_t lseek(int

filedes , off_t

offset , int

whence );

Returns: new file offset if OK, 1 on error

The interpretation of the

offset depends on the value of the

whence argument.

  • If

    whence is SEEK_SET, the file's offset is set to

    offset bytes from the beginning of the file.

  • If

    whence is SEEK_CUR, the file's offset is set to its current value plus the

    offset . The

    offset can be positive or negative.

  • If

    whence is SEEK_END, the file's offset is set to the size of the file plus the

    offset . The

    offset can be positive or negative.


Because a successful call to lseek returns the new file offset, we can seek zero bytes from the current position to determine the current offset:

off_t currpos;
currpos = lseek(fd, 0, SEEK_CUR);

This technique can also be used to determine if a file is capable of seeking. If the file descriptor refers to a pipe, FIFO, or socket, lseek sets errno to ESPIPE and returns 1.

The three symbolic constantsSEEK_SET, SEEK_CUR, and SEEK_ENDwere introduced with System V. Prior to this,

whence was specified as 0 (absolute), 1 (relative to current offset), or 2 (relative to end of file). Much software still exists with these numbers hard coded.

The character l in the name lseek means "long integer." Before the introduction of the off_t data type, the

offset argument and the return value were long integers. lseek was introduced with Version 7 when long integers were added to C. (Similar functionality was provided in Version 6 by the functions seek and tell.)


Example

The program in Figure 3.1 tests its standard input to see whether it is capable of seeking.

If we invoke this program interactively, we get

$

./a.out < /etc/motd
seek OK
$

cat < /etc/motd | ./a.out
cannot seek
$

./a.out < /var/spool/cron/FIFO
cannot seek


Figure 3.1. Test whether standard input is capable of seeking

#include "apue.h"
int
main(void)
{
if (lseek(STDIN_FILENO, 0, SEEK_CUR) == -1)
printf("cannot seek\n");
else
printf("seek OK\n");
exit(0);
}

Figure 2.20), we lose a factor of 2 in the maximum file size. If off_t is a 32-bit integer, the maximum file size is 231-1 bytes.

lseek only records the current file offset within the kernelit does not cause any I/O to take place. This offset is then used by the next read or write operation.

The file's offset can be greater than the file's current size, in which case the next write to the file will extend the file. This is referred to as creating a hole in a file and is allowed. Any bytes in a file that have not been written are read back as 0.

A hole in a file isn't required to have storage backing it on disk. Depending on the file system implementation, when you write after seeking past the end of the file, new disk blocks might be allocated to store the data, but there is no need to allocate disk blocks for the data between the old end of file and the location where you start writing.


Example

The program shown in Figure 3.2 creates a file with a hole in it.

Running this program gives us

$

./a.out
$

ls -l file.hole

check its size
-rw-r--r-- 1 sar 16394 Nov 25 01:01 file.hole
$

od -c file.hole

let's look at the actual contents
0000000 a b c d e f g h i j \0 \0 \0 \0 \0 \0
0000020 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0040000 A B C D E F G H I J
0040012

We use the od(1) command to look at the contents of the file. The -c flag tells it to print the contents as characters. We can see that the unwritten bytes in the middle are read back as zero. The seven-digit number at the beginning of each line is the byte offset in octal.

To prove that there is really a hole in the file, let's compare the file we've just created with a file of the same size, but without holes:

$

ls -ls file.hole file.nohole

compare sizes
8 -rw-r--r-- 1 sar 16394 Nov 25 01:01 file.hole
20 -rw-r--r-- 1 sar 16394 Nov 25 01:03 file.nohole

Although both files are the same size, the file without holes consumes 20 disk blocks, whereas the file with holes consumes only 8 blocks.

In this example, we call the write function (Section 3.8). We'll have more to say about files with holes in Section 4.12.


Figure 3.2. Create a file with a hole in it

#include "apue.h"
#include <fcntl.h>
char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";
int
main(void)
{
int fd;
if ((fd = creat("file.hole", FILE_MODE)) < 0)
err_sys("creat error");
if (write(fd, buf1, 10) != 10)
err_sys("buf1 write error");
/* offset now = 10 */
if (lseek(fd, 16384, SEEK_SET) == -1)
err_sys("lseek error");
/* offset now = 16384 */
if (write(fd, buf2, 10) != 10)
err_sys("buf2 write error");
/* offset now = 16394 */
exit(0);
}

Because the offset address that lseek uses is represented by an off_t, implementations are allowed to support whatever size is appropriate on their particular platform. Most platforms today provide two sets of interfaces to manipulate file offsets: one set that uses 32-bit file offsets and another set that uses 64-bit file offsets.

The Single UNIX Specification provides a way for applications to determine which environments are supported through the sysconf function (Section 2.5.4.). Figure 3.3 summarizes the sysconf constants that are defined.

Figure 3.3. Data size options and

name arguments to sysconf

Name of option

Description

name argument

_POSIX_V6_ILP32_OFF32

int, long, pointer, and off_t types are 32 bits.

_SC_V6_ILP32_OFF32

_POSIX_V6_ILP32_OFFBIG

int, long, and pointer types are 32 bits; off_t types are at least 64 bits.

_SC_V6_ILP32_OFFBIG

_POSIX_V6_LP64_OFF64

int types are 32 bits; long, pointer, and off_t types are 64 bits.

_SC_V6_LP64_OFF64

_POSIX_V6_LP64_OFFBIG

int types are 32 bits; long, pointer, and off_t types are at least 64 bits.

_SC_V6_LP64_OFFBIG


    / 369