The Ultimate Windows Server 2003 System Administrator's Guide [Electronic resources]

Robert Williams, Mark Walla


INDEXING SERVICE


Windows Server 2003 Indexing Service scans documents on local and remote disk drives and builds an index catalog that contains information about their contents and properties. Like a regular index, this catalog includes all words that will help a user locate data. Typically, 15 to 30 percent of the contents of a document are cataloged. The remaining words are screened out by an exception list, which includes common nouns, verbs, articles, prepositions, and other words deemed not useful for finding data.

Document properties cataloged include creation date, author name, and size in number of characters. Each document type treats properties differently: some properties are generated automatically and others must be entered manually, as is the case with products like Microsoft Office. Meta tags found in HTML files must be stored in the property cache in order for HTML properties to be indexed.

The indexing process involves scanning documents, on either a full or an incremental basis. The first time indexing is run, all documents in the selected directory or directories are scanned for the catalog. A new full scan can be started at any time from the Indexing Service snap-in: select Directories, right-click the desired directory, select All Tasks, and choose Full Scan (Figure 17.3). Manual incremental scans are initiated in the same manner except, of course, for the final selection. Automatic incremental scans are conducted each time Indexing Service is started. In addition, when a document is modified, Indexing Service is notified that the document requires rescanning. Incremental scans index only new or modified documents.
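The difference between a full and an incremental scan can be sketched in a few lines of Python. This is only an illustration of the idea, not how Indexing Service is implemented; the function name `select_for_scan`, the timestamp dictionaries, and the sample file names are all assumptions made for the example.

```python
def select_for_scan(current, last_indexed, full=False):
    """Return the paths that a scan would process, sorted.

    current      -- mapping of path -> last-modified timestamp on disk
    last_indexed -- mapping of path -> timestamp recorded at the previous scan
    full         -- True for a full scan, False for an incremental scan
    """
    if full:
        # A full scan revisits every document in the directory.
        return sorted(current)
    # An incremental scan picks up only new or modified documents.
    return sorted(
        path for path, mtime in current.items()
        if path not in last_indexed or mtime > last_indexed[path]
    )


# Hypothetical state: a.doc is unchanged, b.doc was modified, c.doc is new.
on_disk = {"a.doc": 100, "b.doc": 250, "c.doc": 300}
previous = {"a.doc": 100, "b.doc": 200}

print(select_for_scan(on_disk, previous))             # incremental scan
print(select_for_scan(on_disk, previous, full=True))  # full scan
```

In this sketch the incremental scan skips a.doc because its timestamp matches the one recorded earlier, while the full scan processes all three files.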

Figure 17.3. The MMC Indexing Service Snap-in Full Scan Option


The full process invoked by Indexing Service has a number of steps. First, it identifies the type of file involved and then loads the appropriate document filter. A filter merely reflects the document's structure. Word processors from different vendors, for example, have different structures, and the filters strip encoded structural information from the content. They also determine the international language in which the document is written and parse the information into words. At that point the exception list of verbs, common nouns, articles, and prepositions for that language is applied, and those words are dropped from the process. Every international language supported by Windows Server 2003 has an exception list. All words that remain are placed in the index. Properties are also collected and stored in the properties cache.
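The word-breaking and exception-list steps described above can be approximated with a small inverted index. The following Python sketch is a simplified model under stated assumptions: the `EXCEPTION_LIST` contents, the `build_index` function, and the sample documents are hypothetical, and the real filters and per-language exception lists are considerably more elaborate.

```python
import re
from collections import defaultdict

# Tiny hypothetical exception list; the real lists are per-language and far longer.
EXCEPTION_LIST = {"a", "all", "an", "and", "are", "the", "of", "in", "is", "to"}


def build_index(documents):
    """Map each retained word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        # Word breaking: split the already-filtered text into lowercase words.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            # Words on the exception list are dropped from the process.
            if word not in EXCEPTION_LIST:
                index[word].add(name)
    return index


docs = {
    "hounds.txt": "All basset hounds are cool",
    "cats.txt": "The cats are in the garden",
}
index = build_index(docs)
print(sorted(index))
```

Only the words that survive the exception list become keys in the index, which is why common words such as "the" and "are" cannot be used to locate a document.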

NOTE: In terms of security, users will be able to view only those files for which they have permission on NTFS partitions. Catalogs saved on a FAT system will be visible to all users. Encrypted documents are never indexed.

Using Indexing Service to Find Data


Indexed information can be found in one of two ways. The one best known by users is the Start menu Search function, discussed in Chapter 4. Any found item can be opened by double-clicking it.

The second method of locating content is the Indexing Service query form, which locates content and data output according to several criteria. Figure 17.4 shows a simple query and its results. Note that the query provides direct hot links to each found item. To open one of the files, simply double-click its icon (see Figure 17.5).

Figure 17.4. Content Location from the Search Dialog


Figure 17.5. The Indexing Service Query Form and Output


Indexing Service Sizing


As with any other Windows Server 2003 service, indexing large numbers of documents increases the pressure on the host server. The complexity and rapidity of queries also dictate system requirements. For fewer than 100,000 documents, 64 MB to 128 MB of memory should be sufficient. However, for 500,000 documents, the memory demands can easily exceed 256 MB. The system administrator should monitor memory demands and routinely add memory as the number and complexity of queries shift upward. CPU speed also impacts indexing efficiency, as does disk space. As a general rule, allocate an additional 30 percent of disk space for the catalog when the documents are saved on a FAT system. Catalogs on NTFS require about 15 percent of the disk space used by the indexed documents.
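The disk-space rule of thumb lends itself to a quick calculation. The Python sketch below simply applies the 15 percent (NTFS) and 30 percent (FAT) ratios from the text; the function name `estimate_catalog_mb` and the 2,000 MB corpus are assumptions for illustration, and actual catalog sizes will vary with document type and content.

```python
# Overhead ratios taken from the rule of thumb in the text.
CATALOG_OVERHEAD = {"NTFS": 0.15, "FAT": 0.30}


def estimate_catalog_mb(corpus_mb, file_system):
    """Estimate catalog disk space (in MB) for a corpus of the given size."""
    return corpus_mb * CATALOG_OVERHEAD[file_system.upper()]


# Hypothetical corpus of 2,000 MB of indexed documents.
for fs in ("NTFS", "FAT"):
    print(fs, round(estimate_catalog_mb(2000, fs)), "MB")
```

An estimate like this is best treated as a starting point for capacity planning and then checked against the catalog's real size once indexing has run.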

Indexing Service Query Language


A query is simply a structured string of data that narrows the scope of a search. Its syntax is precise and generally takes the form


[Mode] [property name] [query_text] [attribute=value] query_text [/mode]

The mode is optional and defines the type of query, including free text, phrase, relational, and regular expressions (regex). Free text is the default if no mode is given. Short-form queries use @ for phrase mode, # for regular expressions, and $ for free text. Long-form queries use curly braces to define the type. For example, the long form for a query about a title is {prop name=title}; the short form is #name #title.

Indexing Service recognizes more than 40 property types that can be queried. These include creation time, time last accessed, physical path, number of characters, and document author. Refer to the Windows Server 2003 Help menu for a full list of query properties.

RULES AND OPERATORS


Indexing Service requires adherence to several query rules, within which standard query operators can be applied. The following rules are applicable to all Indexing Service queries:


Queries are not case-sensitive.


A number of special characters must be enclosed in quotation marks (' '). They are &, @, $, #, ^, ( ), and |. (How these characters can be applied is covered later.)


Two forms of date and time queries are recognized: yyyy/mm/dd hh:mm:ss and yyyy-mm-dd hh:mm:ss.


No words in the exception list can be used. Remember, these include verbs, common nouns, articles, and prepositions.


Numeric values can be hexadecimals or decimals.
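Two of the rules above, the accepted date and time layouts and the characters that must be quoted, are easy to check before a query is submitted. The Python sketch below is a hypothetical pre-validation helper, not part of Indexing Service; the names `is_valid_datetime` and `needs_quoting` are assumptions, and the date check tests only the layout, not whether the date exists on the calendar.

```python
import re

# The two date/time layouts named in the rules: yyyy/mm/dd or yyyy-mm-dd,
# followed by hh:mm:ss. The backreference keeps the separator consistent.
DATE_TIME = re.compile(r"\d{4}([/-])\d{2}\1\d{2} \d{2}:\d{2}:\d{2}")

# Characters the rules say must be enclosed in quotation marks.
SPECIAL = set("&@$#^()|")


def is_valid_datetime(value):
    """True if value has the layout yyyy/mm/dd hh:mm:ss or yyyy-mm-dd hh:mm:ss."""
    return DATE_TIME.fullmatch(value) is not None


def needs_quoting(term):
    """True if the term contains a character that must be quoted."""
    return any(ch in SPECIAL for ch in term)


print(is_valid_datetime("2003/04/24 09:30:00"))
print(is_valid_datetime("24-04-2003 09:30"))
print(needs_quoting("AT&T"))
```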



Available Operators


Standard industry operators can be applied to queries. The following describes them and how they are applied. This is not a comprehensive treatment; the Help menu provides additional information.


Contains and Equals.
The Contains operator is employed to find specific words or phrases. For example, if you want to find the phrase "all basset hounds are cool," use the short form @DocTitle "all basset hounds are cool" or the long form {prop name=DocTitle} Contains {phrase}all basset hounds are cool{/phrase}{/prop}. The Equals operator specifies an exact equivalent condition. The short form is @DocTitle = "all basset hounds are cool", which finds documents with that exact phrase.


Boolean.
Boolean operators use conditions that are strung together by AND (&), OR (|), NOT (&!), and NEAR (~). For example, to find documents that contain the words "basset hound" and "cool," use "basset hound & cool" or "basset hound and cool." (Don't forget the quotation marks.) The order of precedence for Booleans is NOT, AND or NEAR, and OR.
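The order of precedence matters whenever operators are mixed without grouping. The Python sketch below models each word as the set of documents that contain it and evaluates a mixed query step by step; the document IDs and word sets are hypothetical, and set operations stand in for the query engine purely for illustration.

```python
# Hypothetical postings: the set of document IDs that contain each word.
all_docs = {1, 2, 3, 4}
basset = {1, 2}
cool = {2, 3}
cat = {2, 4}

# Query: cool OR basset AND NOT cat
# NOT is applied first, then AND, then OR.
not_cat = all_docs - cat        # NOT cat: documents without "cat"
basset_and = basset & not_cat   # basset AND NOT cat
result = cool | basset_and      # cool OR (basset AND NOT cat)

# Evaluating the OR first would give a different answer.
or_first = (cool | basset) & not_cat

print(sorted(result), sorted(or_first))
```

Because the two evaluation orders return different document sets, it is worth grouping terms explicitly when a query combines OR with AND or NOT.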


Free text.
Free text is the default operator. The words or phrases are entered without filtering.


Phrase.
In a phrase query, the short form encloses the phrase in quotation marks: "basset hounds are cool". In the long form, the phrase is enclosed in {phrase} tags: {phrase} basset hounds are cool {/phrase}.


Wildcards.
Wildcards are used for very simple pattern matching using the asterisk or question mark. Appending the asterisk to the letters b, a, and s (bas*) returns findings such as "basset," "basket," "bass," and many others. The short form of the query is @filename = bas*.
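The behavior of the two wildcards can be tried out with Python's standard `fnmatch` module, which uses the same asterisk and question-mark conventions. This is an analogy only: Indexing Service has its own matcher, and the word list and `match` helper below are assumptions for the example.

```python
from fnmatch import fnmatchcase

WORDS = ["basset", "basket", "bass", "base", "hound"]


def match(pattern, words=WORDS):
    """Return the words that match a * / ? wildcard pattern, ignoring case."""
    pattern = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), pattern)]


# "*" matches any run of characters; "?" matches exactly one character.
print(match("bas*"))
print(match("bas?"))
```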


Regular expression pattern matching.
Regular expressions are used commonly in statistical and other forms of analysis. If you are unfamiliar with regular expressions, refer to the Help pages.


Alternative word forms.
These forms are often called fuzzy or imprecise queries and are of two types. The first finds all words with common prefixes, and the short form simply uses the wildcard asterisk. The second type involves alternative forms of the word, such as "come" and "came." Here the long form for the inflected operator is {Generate method=inflect} come {/Generate}.


Relational.
These forms permit you to set parameters that include constraints such as less than (<), less than or equal to (<=), equal to (=), greater than or equal to (>=), greater than (>), and not equal to (!=).


Vector space queries.
In the vector space model that is common in information retrieval, the query terms can be separated by a comma (basset, hound), and a ranking can be applied to set the priority of the query match. In this case, ranking basset [20], hound [4] places a greater importance on the word "basset."


Term weighting queries.
This operator assigns a relative value to each word being queried. The syntax is {weight} word, and a value between 0.0 and 1.0 is assigned. For example, a query on 'basset hounds are cool' might be weighted with this structure: {weight value=.300} basset AND {weight value=.600} hound AND {weight value=.100} cool. In this example, "hound" has a much heavier weight than "cool."
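The effect of term weights on ranking can be illustrated with a simple additive score. The Python sketch below reuses the weights from the example above, but the scoring rule itself is an assumption: it sums the weights of the query words found in each document and ignores the AND requirement, so it shows only how weights shift the ranking, not how Indexing Service actually computes rank. The document names and contents are hypothetical.

```python
# Weights from the example above; the scoring rule is a simplification.
WEIGHTS = {"basset": 0.3, "hound": 0.6, "cool": 0.1}


def score(text, weights=WEIGHTS):
    """Sum the weights of the query words that occur in the text."""
    words = set(text.lower().split())
    return sum(w for term, w in weights.items() if term in words)


docs = {
    "a.txt": "my hound is loud",
    "b.txt": "a cool basset",
}

# Rank documents from highest to lowest score.
ranked = sorted(docs, key=lambda name: score(docs[name]), reverse=True)
print(ranked)
```

Here a single match on the heavily weighted word "hound" outranks a document that matches both of the lighter terms.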



Common Indexing Service Administrative Tasks


Indexing Service is one of the simplest Windows Server 2003 applications to administer. In this section, we will review only some of the common administrative tasks; the complete task scope is fairly obvious when exploring the Indexing Service MMC snap-in or the Indexing Service node under Services and Applications in the Computer Management snap-in. Both provide the same set of options and can be used interchangeably, but for the sake of simplicity we will refer to the Indexing Service snap-in in the following examples.

Launch Indexing Service by opening the Indexing Service snap-in, clicking the Action menu, and selecting Start. To pause or stop Indexing Service, follow the same steps, but select Pause or Stop as appropriate.

CREATING A CATALOG


The creation of an Indexing Service catalog requires the following steps:



Open the Indexing Service snap-in.

Select the Action menu, click New, and select Catalog.

In the Add Catalog dialog box, type the name of the catalog and enter the location. Use the Browse button to navigate to the desired location (Figure 17.6).

Figure 17.6. The Add Catalog Dialog Box

Click OK to complete the process.


ADDING OR EXCLUDING A DIRECTORY TO THE CATALOG


To add a directory to the catalog or exclude one from it, follow these steps:



Open the Indexing Service snap-in.

Open the Systems tree console and click the Directories folder.

Select the Action menu, click New, and select Directory.

In the Add Directory dialog box, type the name of the catalog and enter the location. Use the Browse button to navigate to the desired location. Enter an alias name for the directory in the Alias (UNC) text box (Figure 17.7).

Figure 17.7. The Add Directory Dialog Box

If this is a remote directory, add an authorized user name and password in the appropriate text boxes.

To include the directory, select Yes; to exclude the directory, select No.

Click OK to complete the process.


The removal of a directory from a catalog is accomplished by loading the Indexing Service snap-in, double-clicking the targeted catalog, double-clicking the Directories folder, selecting the targeted directory, clicking the Action menu, and clicking Delete.

ADDING, EDITING, AND REMOVING PROPERTIES


To add, edit, or remove properties in the properties cache, follow these steps:



Open the Indexing Service snap-in.

Double-click the targeted catalog.

Click Properties from the catalog console tree.

In the right-hand details pane, move to the Property Set column and select the property ID to add to the cache.

Click the Action menu and select Properties.

From within the Properties dialog box, check the Cached option and select the data type appropriate for the property from the Datatype list. Set the size in bytes for the property in the Size box.


To remove the properties, follow the preceding process in steps 1 through 5. Then in the Properties box, remove the check mark on the Cached option. Edit properties by following these steps and making appropriate changes.

The Indexing Service snap-in does not have to be installed for a typical user to find things using the Start menu Search function. However, with the service in place, searches can be more detailed, accurate, and efficient. Remember, the primary function of Indexing Service is to catalog document content and properties for fast retrieval.

