File System Fundamentals

This section looks at some file system basics, including factors to take into account when creating a new file system, creating optimized file systems, and basic file system terminology.

In the following sections, the term file system is used to indicate both a type of file system implementation and an actual hierarchical implementation of user or system data.

File System Implementation Considerations

File systems are one of the most important parts of an operating system. File systems store and manage user data on disk drives and ensure that what is read from storage is identical to what was written. In addition to storing user data in files, file systems also create and manage information about files and about themselves. Besides guaranteeing the integrity of the data, file systems need to be extremely reliable and have excellent performance.

Before creating file systems, you need a plan for their layout. The following are some general considerations to be aware of when planning your system:

I/O workload should be distributed as evenly as possible across the disk drives.

The number of file systems on any one disk should be kept to a minimum. All the Linux file systems are better able to manage fragmentation of a file system in a larger partition/volume than in a small, completely full partition.

If a large set of files (in size, number, or both) has characteristics that make the files significantly different from "typical" files, create a separate file system for these files that is tuned to their requirements.

The parameters that affect file system performance are set when the file system is initially defined. When a file system is created with the mkfs command, a set of default values is applied to define the file system, unless the defaults are specifically overridden.
Although it is possible to use the file system's tuning command (for example, tune2fs for Ext2 and Ext3) to modify some of the parameters after the file system is defined, not all parameters can be changed after the file system is created. Most of the time, it is simpler to use the mkfs command to define the file system correctly from the beginning. Advance planning is therefore essential when creating a file system. Later in this chapter, we examine some of the specifics that can be set when a new file system is created.

Creating Optimized File Systems

mkfs is the front end to each file system's format program and its subprograms. mkfs creates file systems on disk partitions (or volumes, if a volume manager is being used).

The mkfs command can usually run without the optional parameters that affect performance. To a large degree, this is because the default parameters are adequate in most cases. mkfs calculates the appropriate parameters to use based on the information it can ascertain about the volume or disk drive, and then it calls the file-system-specific mkfs program to actually create the file system.

Basic File System Terms

Before delving any further into file systems on Linux, it's important to learn some of the common file system terms. These terms are used throughout the remainder of this chapter.

A logical block is the smallest unit of storage that can be allocated by the file system. A logical block is measured in bytes, and it may take several blocks to store a single file.

A logical volume can be one or more physical disks or some subset of the physical disk space.

Block allocation is a method of allocating blocks where the file system allocates one block at a time. In this method, a pointer to every block in a file is maintained and recorded. Ext2 uses block allocation.

Extent allocation is a method in which large numbers of contiguous blocks, called extents, are allocated to the file and are tracked as a unit. Only a pointer to the beginning of the extent needs to be maintained.
Because a single pointer is used to track a large number of blocks, the bookkeeping for large files is much more efficient.

Fragmentation is the scattering of files into blocks that are not contiguous and is a problem that all file systems encounter. Fragmentation builds up as files are created and deleted. File systems cannot avoid it entirely, but they can use advanced allocation algorithms to reduce it.

Internal fragmentation occurs when a file does not fill a block completely. For example, if a file is 10K and a block is 8K, the file system allocates two blocks (16K) to hold the file, and 6K is wasted. Notice that as blocks get bigger, so does the waste.

External fragmentation occurs when the logical blocks that make up a file are scattered all over the disk. External fragmentation can cause poor performance.

An extent is a large number of contiguous blocks. Each extent is described by a triple consisting of (file offset, starting block number, length), where file offset is the offset of the extent's first block from the beginning of the file, starting block number is the first block in the extent, and length is the number of blocks in the extent. Extents are allocated and tracked as a single unit, meaning that a single pointer tracks a group of blocks. For large files, extent allocation is a much more efficient technique than block allocation. Figure 11-1 shows how extents are used.

Figure 11-1. An extent is described by its block offset in the file, the location of the first block in the extent, and the length of the extent.

If file sample.txt requires 18 blocks, and the file system can allocate one extent of length 8, a second extent of length 5, and a third extent of length 5, the file system would look something like Figure 11-1. The first extent has offset 0 (block A in the file), location 10, and length 8. The second extent has offset 8 (block I), location 20, and length 5. The last extent has offset 13 (block N), location 35, and length 5.
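
To make the (file offset, starting block number, length) triple concrete, the following is a minimal Python sketch of how a block within sample.txt could be translated to its location on disk using the three extents from the example. The names Extent, SAMPLE_EXTENTS, and disk_block are hypothetical and chosen only for illustration; real file systems keep extents in on-disk structures and search them more efficiently than this linear scan.

```python
from typing import List, NamedTuple, Optional


class Extent(NamedTuple):
    """Hypothetical in-memory form of the (offset, start, length) triple."""
    file_offset: int  # offset of the extent's first block within the file
    start_block: int  # on-disk location of the extent's first block
    length: int       # number of contiguous blocks in the extent


# The three extents of sample.txt from the example in the text.
SAMPLE_EXTENTS: List[Extent] = [
    Extent(file_offset=0, start_block=10, length=8),
    Extent(file_offset=8, start_block=20, length=5),
    Extent(file_offset=13, start_block=35, length=5),
]


def disk_block(extents: List[Extent], file_block: int) -> Optional[int]:
    """Return the disk block that holds file_block, or None if unmapped."""
    for ext in extents:
        # A file block belongs to an extent if it falls inside the
        # half-open range [file_offset, file_offset + length).
        if ext.file_offset <= file_block < ext.file_offset + ext.length:
            return ext.start_block + (file_block - ext.file_offset)
    return None


if __name__ == "__main__":
    # Block I is at file offset 8, the first block of the second extent.
    print(disk_block(SAMPLE_EXTENTS, 8))
    # Total number of blocks described by the three triples.
    print(sum(ext.length for ext in SAMPLE_EXTENTS))
```

Three triples are enough to describe all 18 blocks of the file, whereas block allocation would need 18 separate block pointers for the same file.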