NTFS Attributes
As I mentioned earlier, the MFT is a database. The records in this database contain attributes that describe elements of the file system such as files, directories, filenames, and so forth. These attributes are defined by the $AttrDef metadata record. Each attribute is given a code number that identifies it in the MFT records. Following is a brief description of each attribute. If you would like to see this list in a production file system, the attributes are defined in the $AttrDef metadata record. You can't see this record from the operating system shell, but you can scan the NTFS volume using a hex editor and search for any of the attribute names. Remember that they are prefixed with a dollar sign ($).
- Header. The first part of every MFT record consists of header information that is not stored as an attribute. This information includes the MFT record number, an increment counter for tracking changes to the record, and the MFT record number of the directory that contains the file or folder represented by the record.
- $Standard_Information. Contains the standard file attributes (Read-only, Hidden, System, and Archive) along with a set of timestamps. In addition, for NTFS 3.0 and later, $Standard_Information contains a pointer to a security descriptor in the $Secure metadata record. See the "Security Descriptor" section for more information.
- $File_Name. Contains the name of the MFT record. If a record has a long name, a second $File_Name attribute is added with the short, DOS-compatible name (unless short name generation has been disabled).
- $Security_Descriptor. This attribute is no longer used. Starting with NTFS 3.0, the $Security_Descriptor attribute was replaced by entries in the $Secure metadata record. The $Standard_Information attribute in an MFT record contains an index number for the $Secure entry that represents the security descriptor for the MFT record.
- $Data. This attribute stores what is commonly thought of as the contents of a file. An MFT record can have multiple $Data attributes. See the "Named Data Streams" section for more information.
- $Index_Root, $Index_Allocation, and $Bitmap. These attributes are used to index MFT attributes for quick access. They are primarily used by directory records to index filenames, but they are also used to index other attributes to support features such as link tracking and reparse points.
- $Reparse_Point. This attribute contains a pointer to a volume, folder record, or device. When the record containing this pointer is opened, the file system opens the target of the pointer instead. The $Reparse_Point attribute supports features such as mount points and Remote Storage Services.
- $Logged_Utility_Stream. This attribute is used by the Encrypting File System. See Chapter 17, "Managing File Encryption," for details.
- $Ea and $Ea_Information. These attributes were originally used to support the High-Performance File System (HPFS) used by the OS/2 subsystem. The OS/2 subsystem and HPFS are no longer supported by Windows 2000 or Windows Server 2003.
Security Descriptor
Before looking at the way NTFS 3.x deals with security descriptors, let's consider the handling of attributes in general. NTFS attributes are classified by whether they reside completely in the MFT record (resident) or sit somewhere else on the disk with a pointer in the MFT record (non-resident).
Figure 15.8. Layout of the $Secure database.

The security descriptors themselves are stored in a data attribute called $SDS. Along with the security descriptor itself, the system stores these additional values in each $SDS entry:
- A hash of the security descriptor to use as an index key.
- A sequence number assigned to each security descriptor to act as an identifier.
- The security descriptor size.
- The offset of the security descriptor from the start of the $SDS data stream.
These last two entries are important because security descriptors vary in size, which means they can't be found with a simple fixed-record lookup algorithm.
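Conceptually, each $SDS entry carries that bookkeeping in a small fixed header ahead of the descriptor itself. The following is a simplified sketch, not the authoritative on-disk format; the field order and sizes (4-byte hash, 4-byte sequence number, 8-byte offset, 4-byte size) are an assumption based on common NTFS documentation:

```python
import struct

# Simplified model of an $SDS entry header: hash, sequence number,
# offset of the entry within the $SDS stream, and total entry size.
# Field order and sizes are an assumption, not a guaranteed layout.
SDS_HEADER = struct.Struct("<IIQI")

def parse_sds_header(buf, pos=0):
    h, seq, offset, size = SDS_HEADER.unpack_from(buf, pos)
    return {"hash": h, "id": seq, "offset": offset, "size": size}

# Round-trip a synthetic entry to show the fields.
raw = SDS_HEADER.pack(0xDEADBEEF, 2, 0, 0x78)
entry = parse_sds_header(raw)
```

Because the offset and size travel with each entry, a reader can walk the $SDS stream without assuming a fixed record length.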
Security Descriptor Links
The $Standard_Information attribute for each MFT record contains the sequence number of its assigned security descriptor. This makes it simple to implement inheritance. When a new file is created, NTFS links the MFT record to the same security descriptor as the other files in the folder by putting the same $SDS sequence number in the $Standard_Information attribute.

If you modify the security descriptor of a file or folder by adding entries to its access control list (ACL), or you select the Don't Inherit option in the ACL Editor to break the chain of inheritance, a new security descriptor is created in $SDS and the $Standard_Information attribute in the MFT record is updated to contain the sequence number assigned to the new security descriptor.

In practice, NTFS uses separate security descriptors for folders and files. So, if you do not make any changes to the security permissions for files or folders in an NTFS volume, there will be just two entries in $SDS: one for all the folders in the volume and one for all the files under those folders.
Security Descriptor Lookups
The $Secure record maintains two indexes named $SDH and $SII. These indexes are shown in Figure 15.8 and are described as follows:
- The $SDH index sorts the $SDS entries by their hash key.
- The $SII index sorts the contents of the $SDH index by the security descriptor sequence number.
The $SDH index helps to limit the number of security descriptors by recycling those that are already in $SDS. If you set the contents of a security descriptor for a file or folder to match those of another file or folder, including all the inherited permissions, the system uses the sequence number of the existing $SDS entry.

To make this trick work, the system needs a way to quickly scan for identical security descriptors. That is where the hash index in $SDH comes into play. The statistical likelihood of two security descriptors having an identical hash is vanishingly small, so the file system calculates the hash for a new security descriptor and then scans for a match in $SDH. If it finds one, it uses the sequence number in the index to update the $Standard_Information attribute in the MFT record.

What is the end result? An NTFS volume has a compact set of security descriptors that is fully cross-indexed to use for controlling access permissions to files and folders. This improves performance and simplifies inheritance.
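The recycling logic can be sketched in a few lines. This is a toy model, not NTFS code; the sha256 hash stands in for the file system's own hash function:

```python
import hashlib

class DescriptorStore:
    """Toy model of $SDS with an $SDH-style hash index."""

    def __init__(self):
        self._by_hash = {}   # hash -> (sequence number, descriptor bytes)
        self._next_id = 1

    def assign(self, descriptor):
        """Return a sequence number, reusing an identical descriptor."""
        key = hashlib.sha256(descriptor).digest()
        hit = self._by_hash.get(key)
        if hit is not None and hit[1] == descriptor:
            return hit[0]                 # identical entry already stored
        seq = self._next_id               # otherwise create a new entry
        self._next_id += 1
        self._by_hash[key] = (seq, descriptor)
        return seq

store = DescriptorStore()
a = store.assign(b"ACL-for-folders")
b = store.assign(b"ACL-for-files")
c = store.assign(b"ACL-for-folders")      # identical -> reused sequence number
```

Note the toy model still compares the full descriptor after a hash hit, so a rare hash collision cannot cause two different ACLs to share an entry.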
Security Descriptor Highlights
You don't need to remember the details of the various components of the $Secure database. Here is a quick checklist of the operational requirements based on how the $Secure database works:
- If you add ACE entries to the ACL of a file or folder while retaining inheritance, a new $SDS entry is created with a security descriptor that has an ACL reflecting the inherited and explicitly applied access control entries (ACEs).
- If you move a file to another location on the same volume, the MFT record location is not changed and neither is the $SDS index number. Therefore, the file retains its old security settings plus inherits the new settings from its new folder.
- If you copy a file to another location on the same volume, a new MFT record is created. This new record gets the same $SDS index number as its parent folder.
- If you move a file to a different volume, once again, a new MFT record is created. The record gets the $SDS index number from its parent folder.
If you use xcopy /o to copy a file to a new location while retaining its security permissions, the result depends on whether the file had explicitly assigned entries in its security descriptor:
- If the file had its own security descriptor in $SDS, the sequence number of that entry is copied to the new MFT record.
- If the file had explicitly assigned ACEs along with inherited ACEs, the system scans to see if another $SDS entry has the same combination. If not, it creates a new $SDS entry and puts the sequence number in the new MFT record.
- If the file had no explicitly assigned ACEs, the system uses the $SDS sequence number for the security descriptor used by the other files in the folder. (Remember that files use different security descriptor entries than folders.)
File Records
A file record is used to store data for access by the operating system or by user applications. A file record in the MFT also stores information about the file itself, such as its name, when it was created, how big it is, and whether it has any special attributes such as Read-Only, Hidden, or Compressed. Figure 15.9 shows the layout of a simple file record. It has a header and three attributes: $Standard_Information, $File_Name, and $Data.
Figure 15.9. Attributes of a simple file record.

Header Information
Some of the more important elements of the file header include the following:
- MFT record number. A sequential number assigned to the record that helps identify it. Indexes such as directories use this information to correlate a filename to an MFT record.
- Record type flag. Indicates whether this is a file record or a directory record. If this flag is set to 0, it indicates that the record has been flagged for deletion and can be overwritten. Other record types are 1, File, and 2, Directory.
- Actual size and Allocated size. The actual size is the true number of bytes in the file. The allocated size is the number of bytes in the clusters assigned to the file. The larger the clusters, the more likely it is to have large differences in these numbers.
- Update Sequence Number, or USN. This acts as a version number. Each time the file is modified, this number is incremented by one. If a file is deleted and the MFT record subsequently reused, this number is set to a starting value of 2. This indicates to the system that no references are being made to deleted files.
$Standard_Information Elements
This MFT attribute contains a set of timestamps, a pointer to the security descriptor for the MFT record, and a flag that is commonly referred to as the file attributes. Here are the important file attribute flag settings:
- Standard DOS file attributes: Read-Only, Hidden, System, and Archive
- Compressed
- Reparse Point
- Sparse file
- Not Indexed (Content Indexer flag)
- Encrypted
Keep in mind as you work with files and folders that these flags are kept in a separate attribute from the actual contents of the file, which are stored in the $Data attribute. For instance, if you copy a file to a new location, you build a new MFT record with a new $Standard_Information attribute and therefore new file attributes. But if you move a file, the MFT record remains the same and so does the setting of the attribute flags in $Standard_Information.
$File_Name
Every MFT record has at least one $File_Name attribute. The filename in this attribute can be up to 255 characters long, a limitation based on the single-byte Length field for the name entry. A file can have more than one filename. For instance, if a name does not meet DOS 8.3 naming standards, the file system generates a short name and places it in a second $File_Name attribute. This supports DOS/Windows 3.x clients on the network plus any DOS applications at the console. See the following "Hard Links" sidebar for another example of multiple filenames.
Short Name Generation
DOS compatibility is still important and will remain so for the next few years. (I'm sure that by 2020, the last lawyer who insists on using WordPerfect 5.1 will have retired.) DOS support requires 8.3 filenames. All the file systems in Windows Server 2003 will generate short, DOS-compliant filenames automatically when a long name is assigned. You can see the short names from the command line using dir /x.

Creating a short, 8.3 filename to represent a long 255-character name is not as straightforward as you might think. The system cannot simply take the first eight characters of the long name and the first three characters of the extension and call it a job well done. What if several files start with the same eight letters? Or how would you differentiate between .HTM files and .HTML files?

Windows Server 2003 uses two algorithms to generate short filenames. Both preserve the first letters of the long name for alphabetical sorting. The algorithm changes when five or more files in the same directory start with the same letters. Here's the algorithm for fewer than five files:
- Delete all Unicode characters that do not map to standard ANSI characters.
- Remove spaces, internal periods, and other illegal DOS characters. The name Long.File.Name.Test becomes LongFileName.Test.
- Keep the first three characters after the last period as an extension. LongFileName.Test becomes LongFileName.Tes.
- Drop all characters after the first six. LongFileName.Tes becomes LongFi.Tes.
- Append a tilde (~) followed by a sequence numeral to the filename to prevent duplicate filenames. LongFi.Tes becomes LongFi~1.Tes.
- Finally, convert the name to uppercase. The final short form of Long.File.Name.Test is LONGFI~1.TES.
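Those steps translate into a short routine. This is an illustrative sketch, not Microsoft's implementation; the set of legal DOS characters is simplified:

```python
import string

# Simplified set of characters allowed in a DOS name (illustrative).
LEGAL = set(string.ascii_letters + string.digits + "~!@#$%^&()-_'{}")

def short_name(long_name, seq=1):
    """Sketch of the fewer-than-five-collisions algorithm."""
    # Split off the extension at the last period.
    base, _, ext = long_name.rpartition(".")
    if not base:
        base, ext = long_name, ""
    # Steps 1-2: drop non-ANSI characters, spaces, internal periods,
    # and other illegal DOS characters.
    base = "".join(c for c in base if ord(c) < 128 and c in LEGAL)
    # Step 3: keep the first three characters of the extension.
    ext = ext[:3]
    # Step 4: drop all characters after the first six.
    base = base[:6]
    # Steps 5-6: append ~<sequence>, then convert to uppercase.
    name = f"{base}~{seq}" + (f".{ext}" if ext else "")
    return name.upper()
```

Running short_name("Long.File.Name.Test") walks the same example as the steps above and produces LONGFI~1.TES.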
The fifth and subsequent file with a long name that starts with the same six letters as other files in the folder is treated somewhat differently:
- Drop Unicode characters, spaces, and extra periods (same process).
- Keep the first three characters after the last period as an extension (same process).
- Drop all characters after the first two instead of six. At this stage, Long.File.Name.Test5 becomes Lo.Tes.
- Append four hexadecimal numbers derived via an algorithm applied to the remaining characters in the long filename. Long.File.Name.Test5 yields D623 and Long.File.Name.Test6 becomes E623. At this stage, the short name is LoD623.Tes.
- Append a tilde (~) followed by a sequence numeral to the new filename just in case the algorithm comes up with duplicate names. LoD623.Tes becomes LoD623~1.Tes.
- Finally, convert the name to uppercase. The final short form of Long.File.Name.Test5 becomes LOD623~1.TES.
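The collision-heavy variant can be sketched the same way. Microsoft has not documented the checksum that produces the four hex digits, so the simple checksum below is a stand-in; its digits will not match the D623/E623 values a real volume generates:

```python
def short_name_hashed(long_name, seq=1):
    """Sketch of the five-or-more-collisions variant."""
    base, _, ext = long_name.rpartition(".")
    if not base:
        base, ext = long_name, ""
    # Drop non-ANSI characters, spaces, and extra periods (same process).
    base = "".join(c for c in base if ord(c) < 128 and c not in " .")
    # Keep the first three characters of the extension (same process).
    ext = ext[:3]
    # Keep only the first two base characters, then append four hex
    # digits. The real checksum is undocumented; this sum is a stand-in.
    digits = sum(ord(c) for c in base) & 0xFFFF
    name = f"{base[:2]}{digits:04X}~{seq}" + (f".{ext}" if ext else "")
    return name.upper()
```

For Long.File.Name.Test5 this yields a name of the shape LOxxxx~1.TES, matching the structure of the steps above even though the hex digits differ.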
Long filenames have been around for years now, so I'm sure you are well aware of the standard pitfalls. Keep the following important items in mind:
- Moving long-name files between machines. The short name algorithm used by the NT family of Windows works differently than the 9x family, and both use a different algorithm than the two long namespaces in NetWare: Os2.nam for version 3.x and Longname.nam for 4.1x and above.
- Excessively long names slow performance. It's a good idea to keep names shorter than 32 characters for optimal performance. If a directory gets heavily fragmented, the short and long name attributes can get separated, causing a dramatic increase in file lookup times.
- Use caution in batch files. The CMD.EXE command interpreter in Windows Server 2003 does not act the same as COMMAND.COM. For example, using CMD, you do not need to enclose long names in quotes when changing directories. You can enter cd c:\dir one and go right to the directory. This is not standard for all commands, however. If you enter del dir one*, you'll delete every file starting with dir and every file starting with one.
- Special handling for file extensions. File extensions also affect the operation of wildcards. Consider Long.File.Name.htm and Long.File.Name.html as examples. If you go to the command prompt and do a directory listing for *.htm, you get both files in the list instead of just the .HTM file. This seems like a fairly innocuous bug, but what if you enter del *.htm, thinking to get rid of only old .HTM files? You delete the .HTML files, as well.
- DOS applications delete long names. If a DOS application changes a short name, the long name is deleted. This has the potential for upsetting Windows users.
$Data
A file record stores user data inside an attribute called $Data. The $Data attribute resembles a nestling. While it's small, a few hundred bytes or so, it lives with the MFT record. When it gets too big for the nest, the file system locates some free space out on the disk and pushes the data out there. It becomes non-resident.

Data in a non-resident $Data attribute is stored in a contiguous set of clusters called a run. The portion of the $Data attribute remaining in the MFT contains a pointer to the location of this run. The pointer gives the following information:
- The number of the cluster that starts the run. This is called the Logical Cluster Number , or LCN. The LCN is measured from the start of the volume, so the 1500th cluster from the beginning of the volume would have an LCN of 1500.
- The length of the run in numbers of clusters. A 1024-byte file on a disk with 512-byte clusters would have a Run Length of 2.
If you add any data to the file, the file system simply appends the bytes onto the existing run and keeps expanding into new clusters as necessary.
Figure 15.10. Fragmented $Data attribute.

Unlike the pointer at the first run, which uses a Logical Cluster Number to identify the start of the run, the pointer for the second and subsequent runs identifies the starting cluster in relation to the run before it. This is called a Virtual Cluster Number, or VCN. For instance, if Run 1 starts at cluster 100 and Run 2 starts at cluster 350, the VCN in the pointer to the second run would be 250.

As the file continues to grow, it might encounter occupied clusters, forcing the file system to fragment the file even further. As the file becomes more and more fragmented, it requires more and more pointers in the MFT record.

At some point, the file becomes so fragmented that the pointers themselves will not fit in the 1K MFT record. When this happens, the file system creates another MFT record to hold the pointers. It leaves behind another type of pointer in an attribute called $Attribute_List. This pointer identifies the MFT record number that holds the additional pointers. This only happens when a file is severely fragmented.
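The relative-pointer arithmetic is easy to sketch. This toy function ignores the compact on-disk encoding of run lists and just models the bookkeeping described above:

```python
def resolve_runs(pointers):
    """pointers: list of (start, length) pairs. The first start is an
    absolute LCN; each later start is relative to the run before it.
    Returns absolute (LCN, length) pairs."""
    runs, lcn = [], 0
    for start, length in pointers:
        lcn += start                  # accumulate relative offsets
        runs.append((lcn, length))
    return runs

# Run 1 starts at LCN 100 and is 250 clusters long; Run 2's pointer
# stores the relative value 250, so it actually starts at cluster 350.
runs = resolve_runs([(100, 250), (250, 50)])
```

The same walk is what the file system performs when it translates a position in the file into a physical cluster on the volume.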
MFT and Fragmentation
The MFT is a file, just like any other file in NTFS, and as such it can become fragmented if it is forced to grow into portions of the disk that already have clusters claimed by files.

MFT fragmentation can seriously degrade performance, so NTFS tries to prevent it if possible. The strategy used by NTFS to protect the MFT is the same strategy used by nations to protect their territory. NTFS sets up a defensive zone in front of the MFT and avoids this zone when assigning new data clusters on the drive.

By default, this MFT buffer zone takes up 12.5 percent of the volume size. On an 18GB volume, the MFT buffer would take up about 2.25GB. That is enough space for the MFT to hold more than 2 million files and directories without getting fragmented.

If the disk starts to get full, the file system behaves like a pirate nation and starts encroaching on the MFT buffer. At some point, if this encroachment continues, you are likely to get excessive MFT fragmentation. You should never let an NTFS volume get to less than 15 percent free space, with 25 percent free space being the optimal lower limit to allow room for defragmentation.

You can increase the MFT buffer size if you want, but the default should work fine as long as you don't overload your volumes. The following Registry entry controls the buffer size:
Key: HKLM | System | CurrentControlSet | Control | FileSystem
Value: NtfsMftZoneReservation
Data: 1-4 (REG_DWORD)
A setting of 1 represents the default 12.5 percent buffer allocation. Entering 2 carves out 25 percent. A 3 takes 50 percent, and a 4 takes 75 percent. The system works hard to keep the MFT buffer immaculate, so if you designate a bigger buffer, you should reduce your maximum volume loadings accordingly.
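The mapping from Registry setting to reserved space is simple arithmetic, shown here as a quick sketch:

```python
# NtfsMftZoneReservation settings 1-4 map to fixed fractions of the
# volume: 12.5, 25, 50, and 75 percent, per the text above.
ZONE_FRACTION = {1: 0.125, 2: 0.25, 3: 0.50, 4: 0.75}

def mft_zone_bytes(volume_bytes, setting=1):
    """Return the size of the MFT buffer zone for a given setting."""
    return volume_bytes * ZONE_FRACTION[setting]

GB = 2**30
default_zone = mft_zone_bytes(18 * GB)   # the 18GB example: 2.25GB
```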
Named Data Streams
As we've seen, data saved to an NTFS file is stored in an attribute called $Data. As it turns out, an MFT record can have any number of $Data attributes. The default $Data attribute is like the star in an old-style spaghetti western. It has no name. When an application issues an API call to read from or write to a file, NTFS delivers the contents of this unnamed $Data attribute unless told otherwise. You can see named streams in action with a few commands:
- Build a file named Superman.txt by echoing a few characters from the command prompt into a file as follows:
C:\>echo It's a bird. > Superman.txt
This creates a Master File Table record for a file named Superman.txt with an unnamed data attribute that contains the characters It's a bird.
- Add a second data attribute by echoing text to a named stream in the same file as follows:
C:\>echo It's a plane. > superman.txt:stream1
- Now, add a third data attribute with a different name:
C:\>echo It's SUPERMAN. > superman.txt:stream2
It can be something of a trick to view the contents of a named data stream. The application you use for viewing must be able to address the stream by name. Very few applications support this feature. In this simple example, let's use the MORE command to expose the named data streams:
C:\>more < superman.txt
It's a bird.
C:\>more < superman.txt:stream1
It's a plane.
C:\>more < superman.txt:stream2
It's SUPERMAN.
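A toy model makes the structure clear: one record, several $Data attributes keyed by stream name, with the empty string standing in for the unnamed default stream. This only mimics the behavior shown above; it does not touch NTFS itself:

```python
class FileRecord:
    """Toy MFT file record holding multiple $Data attributes."""

    def __init__(self):
        self.data = {}                  # stream name -> contents

    def write(self, text, stream=""):
        self.data[stream] = text        # "" is the unnamed default stream

    def read(self, stream=""):
        return self.data[stream]

rec = FileRecord()
rec.write("It's a bird.")               # unnamed $Data attribute
rec.write("It's a plane.", "stream1")   # second, named $Data attribute
rec.write("It's SUPERMAN.", "stream2")  # third, named $Data attribute
```

A normal read (no stream name) returns only the unnamed attribute, which is exactly why ordinary tools never show the named streams.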
Implementations of Named Data Streams
Named data streams have more uses than just parlor tricks. Microsoft uses them to support several features, such as Services for Macintosh (SFM). An SFM volume uses named data streams to support dual-fork Macintosh files.

Another feature that makes use of named data streams is Summary Information. You can see this feature by opening the Properties window for a file and selecting the Summary tab. Figure 15.11 shows an example.
Figure 15.11. Summary tab for a typical file.

When you store information about a file using Summary Information, the data is stored in named data streams using the GUID for the file as the stream name. Because named data streams are supported by earlier versions of NTFS, you can copy files from Windows Server 2003 and Windows 2000 servers to NT servers without losing the summary information. You cannot access Summary Information from NT, though, because the interface is not coded to look for it.
Named Data Streams and WebDAV
Another feature coming to prominence also makes use of named data streams. This feature, called Web-based Distributed Authoring and Versioning, or WebDAV, is covered in Chapter 16, "Managing Shared Resources."

The reason I bring up WebDAV at this point is because it uses named data streams in a way that may surprise you the first time you use the feature. To get an idea of how this works, set up a shared web folder on a Windows Server 2003 that is running IIS. To do this, open the Properties window for a folder and select the Web Sharing tab. Click the Share This Folder radio button and accept the default options. This creates a virtual folder in the IIS metabase.

You must also configure IIS to publish WebDAV shares. It does not do this by default. Open the Internet Information Services console, right-click the server icon, and select Security from the flyout menu. This launches a Security Lockdown wizard. Step through the wizard and in the Enable Request Handlers window, select Enable WebDAV Publishing under the ISAPI Handlers icon.

Instead of using a browser to connect to the web folder, use the new WebDAV redirector in Windows Server 2003 or XP by opening a command prompt and entering this command:
net use * http://<server_name>/<webshare_name>
You may be prompted for credentials. After the connection is established and the drive has been redirected, create a couple of files in the network drive. If you were to take a look at the network traffic at this point, you would see that communications with this shared resource occur using HTTP rather than the Server Message Block (SMB) commands that would normally be used between Windows machines.

Open the Properties window for one of the files you created. You'll notice that there are only a few attributes you can change: Read-Only, Hidden, and Archive. You can also set the Encryption attribute, which will encrypt the temporary copy of the file on your machine and then copy the encrypted blob over the network to the web share. Read more about this functionality in Chapter 17, "Managing File Encryption."

If you set the Read-Only or Hidden flags and click OK, you'll notice that the file shows these attributes in the Explorer window. Now go to the server and open the properties for the file. You'll notice that the attributes have not changed.

Here's the reason for this attribute duality. When you set an attribute via WebDAV, it is saved into a named data stream in the file. It does not touch the flags in the $Standard_Information attribute. This allows the system to manage WebDAV attributes via standard HTTP methods such as PROPFIND and PROPPATCH. It also means that WebDAV attributes must be managed separately from NTFS attributes. Keep this behavior in mind, because if you copy a WebDAV file to a location that is formatted with anything other than NTFS, you'll lose the WebDAV attributes.
Directory Records
Every database needs an index to locate records and speed lookups. NTFS is no exception. The most familiar attribute index is a directory, which indexes $File_Name attributes. The MFT permits indexing any attribute, though. Following are other attributes that also have indexes in NTFS:
- Security Descriptors. These attributes are stored in the $Secure metadata record and indexed by $SDH and $SII.
- Globally Unique Identifiers (GUIDs). If a file record is the target of an object linking and embedding (OLE) link, it is assigned a GUID. These GUIDs are stored in an $Object_ID attribute in the file's MFT record, and they are indexed in the $ObjID metadata record.
- Quota charges. A file record header contains quota information that is indexed in the $Quota metadata folder.
- Reparse points. Folders with symbolic links to other folders, volumes, or devices contain a $Reparse_Point attribute. These attributes are indexed in the $Reparse metadata folder.
Directory Record Components
A directory record in the MFT is a special form of a file record. It has a header plus a $Standard_Information attribute and at least one $File_Name attribute. Instead of a $Data attribute, though, the MFT uses three additional attributes to store index information. Figure 15.12 shows how these attributes fit together in a typical directory:
- $Index_Root. This holds a copy of the indexed attribute. For example, in a directory record, the $Index_Root attribute contains a copy of the $File_Name attributes from each file and folder in the directory. The $Index_Root attribute is always resident.
- $Index_Allocation. When the number of indexed entries grows to the point that the $Index_Root attribute cannot fit in its MFT record, the indexed entries are moved onto the disk into a set of 8K buffers. The $Index_Root attribute cannot be made non-resident, so the entries are put into a new attribute, $Index_Allocation, that contains the LCN of the start of the buffer run, the size of the buffer, and the length of the run.
- $Bitmap. This attribute assists in housekeeping by mapping out the free space in the index buffers.
Figure 15.12. Example Directory record structure.

B-Tree Sorting
When index entries are made non-resident, the $Index_Root attribute retains the first entry of each buffer to act as a sorting mechanism. These root entries form a b-tree, a structured format that speeds sorting. Figure 15.13 shows the b-tree structure for a shallow directory. A b-tree lookup doesn't take much work on the part of the file system. In the example, if the system is searching for a filename that sorts lexicographically before 120.txt, it goes down the left path. Otherwise, it goes to the right.
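The descent can be sketched with Python's bisect module. Real index buffers are binary structures; this just mirrors the comparison logic, using hypothetical buffer contents:

```python
from bisect import bisect_right

# The root retains the first entry of each non-resident index buffer.
root_keys = ["100.txt", "120.txt", "240.txt"]
buffers = [
    ["100.txt", "110.txt"],
    ["120.txt", "130.txt", "200.txt"],
    ["240.txt", "300.txt"],
]

def lookup(name):
    # Compare against the root entries to pick a buffer (the left/right
    # decision described above), then scan the chosen buffer.
    i = max(bisect_right(root_keys, name) - 1, 0)
    return name in buffers[i]
```

Only one buffer is read per lookup, which is the payoff of keeping the buffers' first entries resident in $Index_Root.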
Figure 15.13. Directory record showing b-tree entries and several non-resident index buffers.

Short Names and B-Tree Sorting
Short filenames add complexity to the b-tree sorting scheme. Filename entries are placed in index buffers alphanumerically without distinguishing between short names and long names. For example, if you have three long names such as Twilight of the Gods.txt, Twilight Double-Header.txt, and Twilight Zone.txt in the same directory, the short names TWILIG~1, TWILIG~2, and TWILIG~3 will sort to the top of the index buffer above the long filenames.

When you open a folder in Explorer or do a DIR from the command line, the file system retrieves both the short and long names. If you have many, many long filenames that all start with the same few letters, you can seriously degrade performance by forcing the system to do a full scan of all the index buffers looking for corresponding short names.

If you must continue to use short filenames to support downlevel clients or DOS applications, work hard to come up with naming schemes that do not use the same first letters. If that is not possible, consider breaking up your directory tree into many smaller folders to reduce the size of the index buffers.
Directory Fragmentation
Directories can become fragmented just like files. If a run of index buffers encounters a cluster owned by another file or folder, the file system is forced to start another run. This creates a second pointer in the $Index_Allocation attribute.

A heavily fragmented directory might fill the MFT record with pointers, at which time the system moves the pointers to another MFT record and leaves behind a pointer in an $Attribute_List attribute.

Fragmented directories slow performance as much as or more than fragmented files because they force the file system to scrabble around on the drive collecting index buffers. Let's take a look at how NTFS handles defragmentation of files and directories.
Defragmentation
Like Windows 2000, Windows Server 2003 and XP include a defragmentation utility using code licensed from Executive Software. This code makes use of API calls created by Microsoft. These API calls are designed to safely move clusters without risking file system corruption should there be a power interruption or system lockup.

The defragmentation engine consists of two executables, Dfrgfat.exe and Dfrgntfs.exe. The Dfrgfat engine works with both FAT and FAT32. The management interface is the Disk Defragmenter console, Dfrg.msc. Figure 15.14 shows a typical defragmentation analysis graph for a volume.
Figure 15.14. Defragmentation console showing typical fragmentation analysis graph.

For details on performing defragmentation, including how to use the new command-line defragger utility, see the "Defragmentation Operations" section later in this chapter. Defragmentation in Windows Server 2003 has improved considerably compared to Windows 2000. Many of the nagging restrictions have been removed. Here is a quick list of the improvements:
- MFT defrag.
The MFT can now be defragmented using the defrag API. Further, the MFT can be defragged while online. Earlier versions of Windows required running a commercial defragger at boot time to defrag the MFT. If you've ever sat through several nail-biting hours waiting for a boot-time defrag to finish so you could get a server back online, this is a welcome new feature. - Deeper defrag.
The defrag API has been tweaked to permit access to corners and cracks of the file system that were inaccessible in prior versions. For example, previous versions of the API were unable to defrag heavily fragmented files that used Attribute Lists. It was also unable to defragment extensive bitmaps or reparse points. All of these elements can be defragged using the new API. - Compressed file defrag.
Defragmentation now works with compressed files, but you still cannot completely defrag a heavily fragmented compressed volume. In production, compressed volumes tend to get very fragmented even when you defrag regularly. The only workaround is to get a backup, wipe the volume, and restore from tape. - Improved encrypted file security.
Encrypted files are defragged without being opened. This eliminates a potential vulnerability where temp files created during defrag could expose sensitive data. - Less intrusive defrag.
The same API fix that protects encrypted files also makes it possible to run the defragmenter with just Read Attributes and Synchronize permissions. This makes defragmentation less intrusive, but you still need Administrative permissions to defragment. - Flexible cluster sizes.
The defrag engine now works with any cluster size. Previous versions were limited to a maximum of a 4K cluster. This means that you can increase cluster size on volumes holding large database files without worrying that you cannot defrag the files. - Command-line operation.
A new command-line interface called DEFRAG makes it possible to use batch files to kick off a defrag. Using the batch file and Task Scheduler, you can schedule periodic defrags to run after hours.
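As a sketch of scripting this, the helper below assembles a DEFRAG command line and only attempts to run it on Windows. The switches assumed here are the Windows Server 2003 set (-a for analysis only, -f to force a run on low free space, -v for a verbose report); the function name is illustrative, not part of any Microsoft tool.

```python
import subprocess
import sys

def build_defrag_cmd(volume, analyze_only=False, force=False, verbose=False):
    """Assemble a DEFRAG command line (Windows Server 2003 syntax assumed)."""
    cmd = ["defrag", volume]
    if analyze_only:
        cmd.append("-a")   # report fragmentation without defragmenting
    if force:
        cmd.append("-f")   # run even when free space is low
    if verbose:
        cmd.append("-v")   # verbose analysis/defrag report
    return cmd

if __name__ == "__main__":
    cmd = build_defrag_cmd("C:", analyze_only=True, verbose=True)
    print(" ".join(cmd))
    # Task Scheduler can invoke this script after hours; skip execution
    # on non-Windows systems where defrag.exe does not exist.
    if sys.platform == "win32":
        subprocess.run(cmd, check=False)
```

A scheduled task pointing at a one-line batch file that calls this script (or DEFRAG directly) is all a periodic after-hours defrag requires.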
In addition to these features, a performance-tuning service runs every three days to jockey frequently used files into more advantageous locations on the drive. This tuning service does not perform a full defragmentation, but it does a nice job of tidying up on a regular basis.

Two defrag limitations remain:
- Paging file.
You cannot defrag the paging file.
- Registry.
You cannot defrag the Registry.
You can prevent the paging file from becoming fragmented by setting the same value for its initial and maximum size, which stops the file from growing. The simplest way to correct Registry fragmentation is to use the Pagedefrag utility from www.sysinternals.com.

Executive Software ships a commercial version of Diskeeper that has additional functionality and runs much faster than the engine included with Windows Server 2003. Other third-party defraggers include the following:
- PerfectDisk from Raxco Software, www.raxco.com
- SpeedDisk from Symantec, www.symantec.com
- Defrag Commander from Winternals, www.winternals.com
SpeedDisk uses a proprietary method for defragging that does not make use of the Microsoft APIs. This permits the product to defrag more thoroughly at each pass. Cautious administrators have expressed concern about this proprietary engine, but I have not heard of widespread problems. Make sure any defrag product you use has been certified for Windows Server 2003.
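Returning to the paging file for a moment: the fixed-size setting described earlier is stored in the PagingFiles value under HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management. A minimal sketch of composing such an entry follows; the helper name is my own, and writing the value back would use the winreg module with a REG_MULTI_SZ value, which is not shown.

```python
def paging_file_entry(path: str, size_mb: int) -> str:
    """Compose one PagingFiles entry with the initial size equal to the
    maximum size, so the file never grows and therefore never fragments."""
    return f"{path} {size_mb} {size_mb}"

# One entry per line of the REG_MULTI_SZ PagingFiles value (assumption:
# the standard "path initial max" format used by Memory Management).
print(paging_file_entry(r"C:\pagefile.sys", 2048))
```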
File Compression
When it comes to disk capacity, you can never have too much, and it can never be too fast. As of this writing, 15,000 rpm Ultra160 SCSI drives cost about 1.6 cents a megabyte. By the time you read this, drives with double that capacity will probably sell for about the same price.

Even at those low prices, storage isn't free, and the time it takes to install and configure new drives certainly comes at a price. This is especially true of drives on user machines, which often contain data that must be backed up prior to replacing and imaging the drive. Compression helps to resolve storage problems quickly and cheaply, but it has its limitations.

Using NTFS, you can compress an individual file, all the files in a folder, or all files on a volume. The compression algorithm balances speed against disk storage. The compression engine has been improved in Windows Server 2003 to permit compressing files of any size as long as the $Data attribute is non-resident. Earlier versions of NTFS were limited to compressing files of 16 clusters or more. The maximum cluster size that can be handled by the compression API is 4K. If you do not plan on using compression and you have applications that store data in large files, you can format a volume with larger cluster sizes to improve performance.
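Before committing a volume to compression, it is worth checking how compressible the data actually is. NTFS uses its own LZ-based algorithm, so zlib numbers are only a rough proxy, but a quick pass over sample data gives a feel for the likely payoff:

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size / original size, using zlib as a rough stand-in
    for NTFS compression (the algorithms differ, so treat this as an
    estimate only)."""
    return len(zlib.compress(data)) / len(data)

# Repetitive, text-like data compresses well; random data barely at all.
repetitive = b"quarterly sales figures, region 7\n" * 2000
random_blob = os.urandom(64 * 1024)

print(f"text-like data: {compression_ratio(repetitive):.2f}")
print(f"random data:    {compression_ratio(random_blob):.2f}")
```

If a representative sample of a volume's files comes back near 1.0, compression will cost CPU cycles and buy almost no space.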
Registry Tip: Disk Defragmenter Keys

The controls for the Disk Defragmenter are located at the key HKLM | Software | Microsoft | Dfrg.

The path to the Disk Defragmenter console, Dfrg.msc, is stored in HKLM | Software | Microsoft | Windows | CurrentVersion | Explorer | MyComputer | DefragPath. The default defrag path launches the console using the command %systemroot%\System32\dfrg.msc %c:, which opens the console with the focus set to the selected volume.

See the "File Compression Operations" section later in this chapter for procedures to manage compressed files using Explorer and the command-line COMPACT utility.

File Compression and Performance

Compression exacts a significant performance penalty on file and print servers. Microsoft publishes numbers ranging from a 5 percent to 15 percent reduction in end-to-end data transfer times. My own experience points to much higher throughput degradation. Exact numbers are difficult to quantify because busy production servers have hundreds of connected users doing who-knows-what with applications, personal databases, and data files from 1000 different vendors, and so on. Imposing compression on this mishmash generally makes you unpopular. Compressing personal files on a server makes better use of the feature than wholesale compression. Because users can compress their own files, take this into account when moving data.

You should never compress database files. The performance penalty of handling random access into a compressed file is simply too high to be acceptable. Transfer the database files to a larger drive if you need more space.

File Compression Highlights

Working with compressed files in a production environment can result in some surprises. Here are some general operational guidelines:
Sparse Files

Database and imaging applications typically allocate large amounts of disk space that they don't necessarily fill right away. Windows Server 2003 supports an API that can build file structures called sparse files. A sparse file specifies a certain size for itself but does not actually claim the disk space until the file begins to fill up. Because sparse files are handled at the application level, the disk savings come without the performance penalty of regular file compression.

You cannot create a sparse file simply by filling a text file with zeros. Nor do you necessarily get a sparse file when you build huge databases with lots of wasted space in the records. The database application must use the sparse file API. The only Windows Server 2003 application that uses sparse files is the Content Indexer, which stores its catalog information in sparse format. No special settings or Registry hacks are available for sparse file handling.

NTFS Conversion

You can convert a FAT or FAT32 partition to NTFS without losing data. That's the good news. The bad news is that you cannot change your mind and do the reverse. If you want to go back to FAT or FAT32, you must back up your data, reformat the partition, then restore the data from tape.

Windows Server 2003 and XP contain many file system improvements that focus on NTFS conversion. That is because NTFS is available in XP Home Edition, making this the first time that NTFS has been available in a consumer product. Over the next few years, millions of Windows 9x and ME desktops will be upgraded to Windows XP. The conversion to NTFS needs to be a smooth one.

Conversion and Setup

One of the first improvements in NTFS conversion was to eliminate the need for conversion at all, at least in fresh installs of Windows Server 2003. The Setup program can now format an NTFS partition directly rather than going through an interim step of formatting with FAT/FAT32 and then converting.
This means that the initial bulk file copy from CD puts the files directly into an NTFS file system, virtually eliminating the nasty MFT and system file fragmentation that normally occurs during Setup.

Still, not many shops do their server or desktop installations from CD. Most administrators prefer to install across the network to take advantage of scripted installations or Remote Installation Service (RIS). If you install using the network, you must first format the system partition as FAT or FAT32 and then use WINNT to transfer the setup files to the local drive. Converting this partition to NTFS would normally cause fragmentation, but Windows Server 2003 improves the situation in two ways:
There are also improvements in the conversion process itself. Conversion is much faster thanks to additional memory assigned to the task. Also, the existing FAT or FAT32 cluster size can be retained for cluster sizes up to 4K as long as the partition was formatted using Windows Server 2003 or XP. This is a big improvement over previous conversion programs, which insisted on using 512-byte clusters regardless of the partition size. The conversion can retain cluster sizes on volumes formatted by Windows 9x or NT only if the FAT/FAT32 cluster boundaries happen to fall at the required NTFS cluster boundaries. If this does not happen, conversion falls back to a 512-byte cluster size.

Conversion and Free Space

The conversion process preserves the integrity of the FAT right up until the last moment. All temporary writes are done to free space, so you need lots of elbow room on the volume to convert it. Use this rough computation as a guideline:
For example, the computation for a 4GB volume with 100,000 files looks like this:
This volume needs approximately 134MB of free space to do the NTFS conversion. That represents less than 5 percent of the total space. You can get by with slim margins of free space, but for best results, give the conversion a lot more room than that. Otherwise, you will fragment the volume and spend lots of time defragging. You should also specify a conversion file in another partition for building the MFT to eliminate MFT fragmentation.

Conversion and File Security

Another weakness of previous conversion utilities was the way they left the file system completely open by putting the Everyone group on the security descriptor of every file and folder. Windows 2000 improved the situation somewhat by making it easier to change the NTFS permissions at the top of the file system and letting inheritance take care of the rest, but you had to remember to do that extra step. Windows Server 2003 avoids the problem entirely by assigning the same default ACL to a newly converted partition that it assigns during the initial installation of the operating system. This consists of the following:
You can add other groups to the permissions list after the volume has been converted.
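The conversion itself is run with the CONVERT utility. The sketch below assembles its command line, assuming the XP/Server 2003 syntax: /FS:NTFS is required, /CvtArea names a placeholder file used to hold the MFT, and /NoSecurity applies the old wide-open Everyone permissions instead of the defaults described above. The helper function is illustrative, not a Microsoft tool.

```python
def build_convert_cmd(volume, verbose=False, cvtarea=None, nosecurity=False):
    """Assemble a CONVERT command line (Windows XP/Server 2003 syntax assumed)."""
    cmd = ["convert", volume, "/FS:NTFS"]
    if verbose:
        cmd.append("/V")
    if cvtarea:
        cmd.append(f"/CvtArea:{cvtarea}")   # placeholder file for the MFT
    if nosecurity:
        cmd.append("/NoSecurity")           # Everyone-accessible ACL instead of defaults
    return cmd

print(" ".join(build_convert_cmd("D:", verbose=True, cvtarea="mftspace.tmp")))
```

Leaving nosecurity off gives you the locked-down default ACL; the open Everyone permissions are there only for backward compatibility.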