5.3 UNIX File System Implementation

Disk formatting divides a physical disk into regions called partitions. Each partition can have its own file system associated with it. A particular file system can be mounted at any node in the tree of another file system. The topmost node in a file system is called the root of the file system. The root directory of a process (denoted by /) is the topmost directory that the process can access. All fully qualified paths in UNIX start from the root directory /.

Figure 5.2 shows a typical root file system tree containing some of the standard UNIX subdirectories. The /dev directory holds specifications for the devices (special files) on the system. The /etc directory holds files containing information regarding the network, accounts and other databases that are specific to the machine. The /home directory is the default directory for user accounts. The /opt directory is a standard location for applications in System V Release 4. The /usr/include directory holds include files. The /var directory contains system files that vary and can grow arbitrarily large (e.g., log files, or mail that has arrived but has not yet been read). POSIX does not require that a file system have these subdirectories, but many systems organize their directory structure in a similar way.

Figure 5.2. Structure of a typical UNIX file system

5.3.1 UNIX file implementation

POSIX does not mandate any particular representation of files on disk, but traditionally UNIX files have been implemented with a modified tree structure, as described in this section. Directory entries contain a filename and a reference to a fixed-length structure called an inode. The inode contains information about the file size, the file location, the owner of the file, the time of creation, time of last access, time of last modification, permissions and so on. Figure 5.3 shows the inode structure for a typical file.
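
The descriptive information kept in an inode can be inspected from a program. The following is a minimal sketch in Python, assuming a UNIX-like system; the function name describe_inode and the dictionary keys are illustrative choices, not part of any standard. It relies only on os.stat, which reports the inode number, size, owner, link count, permissions and timestamps of a file.

```python
import os
import stat
import tempfile
import time


def describe_inode(path):
    """Return selected inode fields for path (hypothetical helper)."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                           # inode number
        "size": st.st_size,                           # file size in bytes
        "owner_uid": st.st_uid,                       # numeric user ID of the owner
        "links": st.st_nlink,                         # number of hard links
        "permissions": stat.filemode(st.st_mode),     # e.g. '-rw-------'
        "last_access": time.ctime(st.st_atime),
        "last_modification": time.ctime(st.st_mtime),
        # On UNIX, st_ctime is the time of the last inode status change.
        "last_status_change": time.ctime(st.st_ctime),
    }


if __name__ == "__main__":
    # Create a small temporary file so the example does not depend on
    # any particular path existing on the system.
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"hello")
        f.flush()
        for field, value in describe_inode(f.name).items():
            print(f"{field:>20}: {value}")
```

Note that the filename is passed in by the caller and is not among the values returned by os.stat.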
In addition to descriptive information about the file, the inode contains pointers to the first few data blocks of the file. If the file is large, the indirect pointer is a pointer to a block of pointers that point to additional data blocks. If the file is still larger, the double indirect pointer is a pointer to a block of indirect pointers. If the file is really huge, the triple indirect pointer contains a pointer to a block of double indirect pointers. The word block can mean different things (even within UNIX). In this context a block is typically 8K bytes. The number of bytes in a block is always a power of 2.

Figure 5.3. Schematic structure of a traditional UNIX file.

Exercise 5.11
Suppose that an inode is 128 bytes, pointers are 4 bytes long, and the status information takes up 68 bytes. Assume a block size of 8K bytes and block pointers of 32 bits each. How much room is there for pointers in the inode? How big a file can be represented with direct pointers? Indirect? Double indirect? Triple indirect?

Answer:
The single, double, and triple indirect pointers take 4 bytes each (12 bytes in total), so 128 - 68 - 12 = 48 bytes are available for 12 direct pointers. The size of the inode and the block size depend on the system. A file as large as 8192 × 12 = 98,304 bytes can be represented solely with direct pointers. If the block size is 8K bytes, the single indirect pointer addresses an 8K block that can hold 8192 ÷ 4 = 2048 pointers to data blocks. Thus, the single indirect pointer provides the capability of addressing an additional 2048 × 8192 = 16,777,216 bytes or 16 megabytes of information. Double indirect addressing provides 2048 × 2048 pointers with the capability of addressing an additional 32 gigabytes. Triple indirect addressing provides 2048 × 2048 × 2048 pointers with the capability of addressing an additional 64 terabytes. However, since 2048^3 = 2^33 exceeds the 2^32 blocks that a 32-bit pointer can distinguish, pointers would need to be longer than 4 bytes to fully address this storage.
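
The arithmetic in the answer above can be checked with a short calculation. The following is a sketch in Python; the function inode_capacity and its parameter names are hypothetical, and the default values are simply the figures given in Exercise 5.11. It assumes that the inode holds exactly three indirect pointers (single, double, triple) and that all remaining space is used for direct pointers.

```python
def inode_capacity(block_size=8192, pointer_size=4,
                   inode_size=128, status_size=68):
    """Compute how many bytes each pointer level can address.

    Assumes three indirect pointers in the inode; everything left over
    after the status information is used for direct pointers.
    """
    indirect_bytes = 3 * pointer_size
    direct_count = (inode_size - status_size - indirect_bytes) // pointer_size
    per_block = block_size // pointer_size      # pointers held by one block
    return {
        "direct_pointers": direct_count,
        "pointers_per_block": per_block,
        "direct": direct_count * block_size,
        "single": per_block * block_size,
        "double": per_block ** 2 * block_size,
        "triple": per_block ** 3 * block_size,
    }


if __name__ == "__main__":
    for level, value in inode_capacity().items():
        print(f"{level:>20}: {value:,}")
```

Changing pointer_size to 8 models 64-bit pointers: each block then holds half as many pointers, so every level of indirection addresses correspondingly less data.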
Exercise 5.12
How large a file can you access using only the single indirect, double indirect, and triple indirect pointers if the block size is 8K bytes and pointers are 64 bits?

Answer:
A block can now hold only 8192 ÷ 8 = 1024 pointers, so the single indirect pointer can address 1024 × 8192 = 8,388,608 bytes. Double indirect addressing provides 1024 × 1024 pointers with the capability of addressing an additional 8 gigabytes. Triple indirect addressing provides 1024 × 1024 × 1024 pointers with the capability of addressing an additional 8 terabytes.

Exercise 5.13
How big can you make a disk partition if the block size is 8K bytes and pointers are 32 bits? How can bigger disks be handled? What are the tradeoffs?

Answer:
32-bit addresses can access approximately 4 billion blocks (4,294,967,296 to be exact). 8K blocks give 2^45 bytes (32 terabytes).

Exercise 5.14
Give some limitations of a file implementation based on inodes.

Answer:
The file must fit entirely in a single disk partition. The partition size and maximum number of files are fixed when the system is set up.

5.3.2 Directory implementation

A directory is a file containing a correspondence between filenames and file locations. UNIX has traditionally implemented the location specification as an inode number, but as noted above, POSIX does not require this. The inode itself does not contain the filename. When a program references a file by pathname, the operating system traverses the file system tree to find the filename and inode number in the appropriate directory. Once it has the inode number, the operating system can determine other information about the file by accessing the inode. (For performance reasons, this is not as simple as it seems, because the operating system caches both directory entries and inode entries in main memory.)

A directory implementation that contains only names and inode numbers has the following advantages.