Chapter 9. Directories - Perl Cd Bookshelf [Electronic resources] (text version)

Chapter 9. Directories


Contents:

Introduction

Getting and Setting Timestamps

Deleting a File

Copying or Moving a File

Recognizing Two Names for the Same File

Processing All Files in a Directory

Globbing, or Getting a List of Filenames Matching a Pattern

Processing All Files in a Directory Recursively

Removing a Directory and Its Contents

Renaming Files

Splitting a Filename into Its Component Parts

Working with Symbolic File Permissions Instead of Octal Values

Program: symirror

Program: lst

Unix has its weak points, but its file system is not one of them.

-- Chris Torek


9.0. Introduction


To fully understand directories, you need to be acquainted with the
underlying mechanics. The following explanation is slanted toward the
Unix filesystem, for whose syscalls and behavior Perl's directory
access routines were designed, but it is applicable to some degree to
most other platforms.

A filesystem consists of two parts: a set of data blocks where the
contents of files and directories are kept, and an index to those
blocks. Each entity in the filesystem has an entry in the index, be
it a plain file, a directory, a link, or a special file like those in
/dev. Each entry in the index is called an inode (short for index
node). Since the index is a flat index, inodes are addressed by number.

A directory is a specially formatted file, whose inode entry marks it
as a directory. A directory's data blocks contain a set of pairs.
Each pair consists of the name of something in that directory and the
inode number of that thing. The data blocks for
/usr/bin might contain:

Name    Inode
bc      17
du      29
nvi     8
pine    55
vi      8

Every directory is like this, even
the root directory (/). To read the file
/usr/bin/vi, the operating system reads the
inode for /, reads its data blocks to find the
entry for /usr, reads
/usr's inode, reads its data block to find
/usr/bin, reads /usr/bin's
inode, reads its data block to find /usr/bin/vi,
reads /usr/bin/vi's inode, and then reads the
data from its data block.
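
The lookup just described can be modeled in a few lines of Perl. This is only a toy sketch: the %inode table, the resolve function, and the inode numbers 2, 5, and 11 are hypothetical illustrations (8 and 17 come from the table above), and no real kernel stores its data this way.

```perl
use strict;
use warnings;

# Toy inode table: number => type, plus entries (directory) or data (file).
# Inode numbers 2, 5, and 11 are made up for this sketch.
my %inode = (
    2  => { type => "dir",  entries => { usr => 5 } },    # "/"
    5  => { type => "dir",  entries => { bin => 11 } },   # "/usr"
    11 => { type => "dir",  entries => { vi => 8, nvi => 8, bc => 17 } },
    8  => { type => "file", data => "vi program text" },
    17 => { type => "file", data => "bc program text" },
);

# Walk a path one component at a time, starting from the root inode.
sub resolve {
    my ($path) = @_;
    my $ino = 2;                                   # root directory
    for my $name (grep { length } split m{/}, $path) {
        my $node = $inode{$ino};
        return unless $node->{type} eq "dir";      # can't descend into a file
        $ino = $node->{entries}{$name};
        return unless defined $ino;                # no such entry
    }
    return $ino;
}

print "/usr/bin/vi  is inode ", resolve("/usr/bin/vi"),  "\n";
print "/usr/bin/nvi is inode ", resolve("/usr/bin/nvi"), "\n";
```

Each pass through the loop corresponds to one "read the inode, read its data block, find the entry" step in the paragraph above.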

The name in a directory entry isn't fully qualified. The file
/usr/bin/vi has an entry with the name
vi in the /usr/bin
directory. If you open the directory /usr/bin
and read entries one by one, you get filenames like
patch, rlogin, and
vi instead of fully qualified names like
/usr/bin/patch,
/usr/bin/rlogin, and
/usr/bin/vi.

The inode has more than a pointer to the data blocks. Each inode also
contains the type of thing it represents (directory, plain file,
etc.), the size of the thing, a set of permissions bits, owner and
group information, the time the thing was last modified, the number
of directory entries that point to this inode, and so on.

Some operations on files change the contents of the file's data
blocks; others change just the inode. For instance, appending to or
truncating a file updates its inode by changing the size field. Other
operations change the directory entry that points to the file's
inode. Changing a file's name changes only the directory entry; it
updates neither the file's data nor its inode.
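
As a small illustration of that last point, the following sketch renames a scratch file and compares what stat (described later in this Introduction) reports before and after. The filenames are hypothetical, and File::Temp is used only to get a throwaway directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);           # scratch directory, removed at exit
my ($old, $new) = ("$dir/before.txt", "$dir/after.txt");   # hypothetical names

open(my $fh, ">", $old) or die "Can't create $old: $!";
print $fh "some data\n";
close($fh) or die "Can't close $old: $!";

my @before = stat($old) or die "Couldn't stat $old: $!";
rename($old, $new)      or die "Couldn't rename $old to $new: $!";
my @after  = stat($new) or die "Couldn't stat $new: $!";

# Element 1 of stat's return list is the inode number, element 7 the size.
printf "inode before: %s, after: %s\n", $before[1], $after[1];
printf "size  before: %d, after: %d\n", $before[7], $after[7];
```

The inode number and size should come back identical, since only the name in the directory was replaced.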


Three fields in the inode structure contain the last access, change, and
modification times: atime,
ctime, and mtime. The
atime field is updated each time the pointer to
the file's data blocks is followed and the file's data is read. The
mtime field is updated each time the file's data
changes. The ctime field is updated each time the
file's inode changes. The ctime is
not creation time; there is no way under
standard Unix to find a file's creation time.

Reading a file changes its atime only. Changing a
file's name doesn't change the file's atime or
mtime, because only the directory entry changed;
classically ctime is untouched as well, although
some systems (Linux among them) do update it on a rename. Renaming
does change the mtime and ctime of the
directory the file is in. Truncating a file doesn't change
its atime (because we haven't read; we've just
changed the size field in its inode), but it does change
its ctime because we changed its size field and
its mtime because we changed its contents (even
though we didn't follow the pointer to do so).
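
One way to watch these fields move is to backdate a scratch file and then append to it. This is a rough sketch rather than a definitive test: the filename and the chosen timestamp are arbitrary, and utime and stat (both Perl built-ins) are used only to set and read the times.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/stamp.txt";               # hypothetical filename

open(my $fh, ">", $file) or die "Can't create $file: $!";
close($fh)               or die "Can't close $file: $!";

# Backdate atime and mtime to a fixed moment in 2001 so changes stand out.
# (utime is itself an inode change, so ctime is already "now" afterward.)
my $then = 1_000_000_000;
utime($then, $then, $file) or die "Couldn't utime $file: $!";
my ($atime0, $mtime0, $ctime0) = (stat($file))[8, 9, 10];

# Appending changes the file's data, so mtime and ctime are updated.
open($fh, ">>", $file) or die "Can't append to $file: $!";
print $fh "new data\n";
close($fh)             or die "Can't close $file: $!";
my ($atime1, $mtime1, $ctime1) = (stat($file))[8, 9, 10];

printf "atime: %d -> %d\n", $atime0, $atime1;
printf "mtime: %d -> %d\n", $mtime0, $mtime1;
printf "ctime: %d -> %d\n", $ctime0, $ctime1;
```

On a typical Unix filesystem the atime line should show no change, because the file was written but never read.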

We can access the inode of a file or directory by calling the
built-in function stat on its name. For instance,
to get the inode for /usr/bin/vi,
say:

@entry = stat("/usr/bin/vi") or die "Couldn't stat /usr/bin/vi : $!";

To get the inode for the directory /usr/bin, say:

@entry = stat("/usr/bin") or die "Couldn't stat /usr/bin : $!";

You can stat filehandles, too:

@entry = stat(INFILE) or die "Couldn't stat INFILE : $!";

The stat function returns a list of the values of
the fields in the inode. If it couldn't get this
information (for instance, if the file doesn't exist), it returns an
empty list. It's this empty list we test for using the
or die construct. Be careful of
using || die because that
throws the expression into scalar context, in which case
stat only reports whether it worked. It doesn't
return the list of values. The underscore ( _ )
cache referred to later will still be updated,
though.
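
The difference between the two forms is easy to see by counting what each one stores. A minimal sketch, assuming a Unix-like system where stat returns its usual 13-element list; the path "." is just a convenient thing that exists:

```perl
use strict;
use warnings;

my $path = ".";      # current directory; any existing path would do

# "or" binds loosely: stat runs in list context and @list gets every field.
my @list = stat($path) or die "Couldn't stat $path: $!";

# "||" binds tightly: stat runs in scalar context and yields one true/false
# value, so @flag holds a single element instead of the field list.
my @flag = stat($path) || die "Couldn't stat $path: $!";

printf "with or: %d elements\n", scalar @list;
printf "with ||: %d elements\n", scalar @flag;
```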

The values returned by stat are listed in
Table 9-1.

Table 9-1. Stat return values

Element  Abbreviation  Description
0        dev           Device number of filesystem
1        ino           Inode number (the "pointer" field)
2        mode          File mode (type and permissions)
3        nlink         Number of (hard) links to the file
4        uid           Numeric user ID of file's owner
5        gid           Numeric group ID of file's owner
6        rdev          The device identifier (special files only)
7        size          Total size of file, in bytes
8        atime         Last access time, in seconds, since the Epoch
9        mtime         Last modify time, in seconds, since the Epoch
10       ctime         Inode change time, in seconds, since the Epoch
11       blksize       Preferred block size for filesystem I/O
12       blocks        Actual number of blocks allocated

The standard File::stat module provides a named interface to these
values. It overrides the stat function, so instead
of returning the preceding array, it returns an object with a method
for each attribute:

use File::stat;
$inode = stat("/usr/bin/vi");
$ctime = $inode->ctime;
$size = $inode->size;
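
To see that the object and the list carry the same information, the two interfaces can be called side by side. In this sketch the scratch file comes from File::Temp, and CORE::stat is used to reach the built-in stat that File::stat has overridden.

```perl
use strict;
use warnings;
use File::stat;                    # replaces stat with an object-returning one
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);   # scratch file, removed at exit
print $fh "twelve bytes";
close($fh) or die "Can't close $path: $!";

my $st   = stat($path)       or die "Couldn't stat $path: $!";
my @list = CORE::stat($path) or die "Couldn't stat $path: $!";

# Each method corresponds to one position in the built-in's return list.
printf "size:  %d (object) vs %d (element 7)\n", $st->size,  $list[7];
printf "mtime: %d (object) vs %d (element 9)\n", $st->mtime, $list[9];
printf "nlink: %d (object) vs %d (element 3)\n", $st->nlink, $list[3];
```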

In addition, Perl provides operators that call
stat and return one value only (see
Table 9-2). These are collectively referred to as the
-X operators because they all take the form of a
dash followed by a single character. They're modeled on the shell's
test operators.

Table 9-2. File test operators

-X   Stat field   Meaning
-r   mode         File is readable by effective UID/GID
-w   mode         File is writable by effective UID/GID
-x   mode         File is executable by effective UID/GID
-o   mode         File is owned by effective UID

-R   mode         File is readable by real UID/GID
-W   mode         File is writable by real UID/GID
-X   mode         File is executable by real UID/GID
-O   mode         File is owned by real UID

-e                File exists
-z   size         File has zero size
-s   size         File has nonzero size (returns size)

-f   mode,rdev    File is a plain file
-d   mode,rdev    File is a directory
-l   mode         File is a symbolic link
-p   mode         File is a named pipe (FIFO)
-S   mode         File is a socket
-b   rdev         File is a block special file
-c   rdev         File is a character special file
-t   rdev         Filehandle is opened to a tty

-u   mode         File has setuid bit set
-g   mode         File has setgid bit set
-k   mode         File has sticky bit set

-T   N/A          File is a text file
-B   N/A          File is a binary file (opposite of -T)

-M   mtime        Age of file in days when script started
-A   atime        Same for access time
-C   ctime        Same for inode change time (not creation)

The stat function and the -X operators
cache the values that the stat(2) syscall
returned. If you then call stat or a
-X operator with the special filehandle
_ (a single underscore), it won't call
stat again but will instead return information
from its cache. This lets you test many properties of a single file
without calling stat(2) many times or
introducing a race condition:

open(F, "<", $filename)
    or die "Opening $filename: $!\n";
unless (-s F && -T _) {
    die "$filename doesn't have text in it.\n";
}
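
The same idea works with a filename instead of a filehandle. In the sketch below, the first test performs the real lookup and every later test on _ reuses it; the %about hash and the scratch file are hypothetical, there only to make the results visible.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $filename) = tempfile(UNLINK => 1);   # scratch file, removed at exit
print $fh "plain text\n";
close($fh) or die "Can't close $filename: $!";

# -e calls stat(2) once; the tests on _ read the cached result.
my %about;
if (-e $filename) {
    $about{plain}     = (-f _) ? 1 : 0;
    $about{directory} = (-d _) ? 1 : 0;
    $about{readable}  = (-r _) ? 1 : 0;
    $about{size}      = -s _;
}
print "$_: $about{$_}\n" for sort keys %about;
```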

The stat call just returns the information in one
inode, though. How do we list the directory contents? For that, Perl
provides opendir, readdir, and
closedir:

opendir(DIRHANDLE, "/usr/bin") or die "couldn't open /usr/bin : $!";
while ( defined ($filename = readdir(DIRHANDLE)) ) {
    print "Inside /usr/bin is something called $filename\n";
}
closedir(DIRHANDLE);

These directory-reading functions are designed to look like the file
open and close functions. Where open takes a
filehandle, though, opendir takes a directory
handle. They may look the same to you (the same bare word), but they
occupy different namespaces. Therefore, you could
open(BIN, "/a/file") and
opendir(BIN, "/a/dir"), and
Perl won't get confused. You might, but Perl won't. Because
filehandles and directory handles are different, you can't use the
<> operator to read from a directory handle
(<> calls readline on the
filehandle).

Similar to what happens with open and the other
functions that initialize filehandles, you can supply
opendir an undefined scalar variable where the
directory handle is expected. If the function succeeds, Perl
initializes that variable with a reference to a new, anonymous
directory handle.

opendir(my $dh, "/usr/bin") or die "couldn't open /usr/bin : $!";
while (defined ($filename = readdir($dh))) {
    # ...
}
closedir($dh);

Just like any other autovivified reference, when this one is no
longer used (for example, when it goes out of scope and no other
references to it are held), Perl automatically deallocates it. And
just as close is implicitly called on filehandles
autovivified through open at that point, directory
handles autovivified through opendir have
closedir called on them, too.



Filenames in a directory aren't necessarily
stored alphabetically. For an alphabetical list of files, read the
entries and sort them yourself.
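
A minimal sketch of doing so, using a scratch directory populated with a few names borrowed from the earlier /usr/bin table. It also shows the point made earlier about unqualified names: readdir returns bare names, so the directory has to be prepended before the paths can be used.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);               # scratch directory
for my $name (qw(pine bc vi du)) {             # sample names from the table
    open(my $fh, ">", "$dir/$name") or die "Can't create $dir/$name: $!";
    close($fh) or die "Can't close $dir/$name: $!";
}

opendir(my $dh, $dir) or die "Couldn't open $dir: $!";
# Drop the "." and ".." entries that every directory contains.
my @entries = grep { $_ ne "." && $_ ne ".." } readdir($dh);
closedir($dh);

my @names = sort @entries;                     # alphabetical order
my @paths = map { "$dir/$_" } @names;          # fully qualified names
print "$_\n" for @paths;
```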

The separation of directory information from inode information can
create some odd situations. Operations that update the
directory—such as linking, unlinking, or renaming a
file—all require write permission only on the directory, not on
the file. This is because the name of a file is actually something
the directory calls that file, not a property inherent to the file
itself. Only directories hold names of files; files are ignorant of
their own names. Only operations that change information in the file
data itself demand write permission on the file. Lastly, operations
that alter the file's permissions or other metadata are restricted to
the file's owner or the superuser. This can lead to the interesting
situation of being able to delete (i.e., unlink from its directory) a
file you can't read, or write to a file you can't delete.
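
The first of those situations can be reproduced in a scratch directory. This sketch assumes a Unix-like system; the filename is hypothetical, and when run as the superuser the open succeeds too, since permission bits don't restrict root.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);      # we own it and can write to it
my $file = "$dir/locked.txt";          # hypothetical filename

open(my $fh, ">", $file) or die "Can't create $file: $!";
print $fh "secret\n";
close($fh) or die "Can't close $file: $!";

chmod(0000, $file) or die "Couldn't chmod $file: $!";   # no access for anyone

# Unless we're the superuser, opening the file now fails...
my $can_read = open(my $in, "<", $file) ? 1 : 0;
close($in) if $can_read;

# ...but removing its name needs only write permission on the directory.
my $removed = unlink($file);

print "could read: $can_read, removed: $removed\n";
```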

Although these situations may make the filesystem structure seem odd
at first, they're actually the source of much of Unix's power. Links,
two filenames that refer to the same file, are now extremely simple.
The two directory entries just list the same inode number. The inode
structure includes a count of the number of directory entries
referring to the file (nlink in the values
returned by stat). This lets the operating system
store and maintain only one copy of the modification times, size, and
other file attributes. When one directory entry is
unlinked, data blocks are deleted only if the
directory entry was the last one that referred to the file's
inode—and no processes still have the file open. You can
unlink an open file, but its disk space won't be
released until the last close.
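
Both behaviors can be observed with the built-in link and unlink functions. The sketch below is illustrative only: it assumes a filesystem that supports hard links, and the names nvi and vi simply echo the earlier table.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my ($first, $second) = ("$dir/nvi", "$dir/vi");    # names echo the table

open(my $out, ">", $first) or die "Can't create $first: $!";
print $out "editor\n";
close($out) or die "Can't close $first: $!";

# A second directory entry for the same inode.
link($first, $second) or die "Couldn't link $first to $second: $!";
my ($ino1, $nlink1) = (stat($first))[1, 3];
my ($ino2, $nlink2) = (stat($second))[1, 3];
print "inodes $ino1 and $ino2, link count $nlink1\n";

# Open the file, then remove both names; the data survives until close.
open(my $in, "<", $first) or die "Can't open $first: $!";
unlink($first, $second) == 2 or die "Couldn't unlink both names: $!";
my $line = <$in>;
close($in);
print "read after unlink: $line";
```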


Links come in two forms. The kind described previously, where two
directory entries list the same inode number (like vi and
nvi in the earlier table), is called a
hard link. The operating system cannot tell the
first directory entry of a file (the one created when the file was
created) from any subsequent hard links to it. The other kind, the
soft or symbolic link, is
very different. A soft link is a special type of file whose data
block stores the filename the file is linked to. Soft links have a
different mode value, indicating they're not
regular files. The operating system, when asked to
open a soft link, instead opens the filename
contained in the data block.
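
A short sketch makes the distinction concrete, assuming a system that supports symbolic links. It uses the built-ins symlink, readlink, and lstat (which, unlike stat, examines the link itself rather than following it); the filenames are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir    = tempdir(CLEANUP => 1);
my $target = "$dir/real.txt";          # hypothetical filenames
my $link   = "$dir/alias.txt";

open(my $out, ">", $target) or die "Can't create $target: $!";
print $out "hello\n";
close($out) or die "Can't close $target: $!";

symlink($target, $link) or die "Couldn't symlink $link to $target: $!";

my $is_link   = (-l $link) ? 1 : 0;    # -l uses lstat, so it sees the link
my $points_to = readlink($link);       # the filename stored in the link
my $link_ino  = (lstat($link))[1];     # inode of the link itself
my $file_ino  = (stat($link))[1];      # stat follows the link to the target
my $real_ino  = (stat($target))[1];

# Opening the link opens the file it names.
open(my $in, "<", $link) or die "Can't open $link: $!";
my $line = <$in>;
close($in);

print "link? $is_link, points to $points_to\n";
print "link inode $link_ino, target inode $file_ino\n";
print "read through link: $line";
```

Unlike the two names of a hard link, the link and its target have different inode numbers; only stat, which follows the link, reports the target's inode.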

9.0.1. Executive Summary


Filenames are kept in a directory, separate from the size,
protections, and other metadata kept in an inode.

The stat function returns the inode information
(metadata).

opendir, readdir, and friends
provide access to filenames in a directory through a
directory handle.

Directory handles look like filehandles, but they are not the same.
In particular, you can't use <> on directory
handles.

Permissions on a directory determine whether you can read and write
the list of filenames. Permissions on a file determine whether you
can change the file's metadata or contents.

Three different times are stored in an inode. None of them is the
file's creation time.







Copyright © 2003 O'Reilly & Associates. All rights reserved.
