1.5. Special Syntax
In addition to the basic AND,
OR, and phrase searches, Google offers some rather
extensive special syntax for narrowing your searches.
As a full-text search engine, Google indexes entire web pages instead
of just titles and descriptions. Additional commands, called
special syntax or advanced
operators, let Google users search specific parts of web
pages for specific types of information. This comes in handy when
you're dealing with more than eight billion web
pages and need every opportunity to narrow your search results.
Specifying that your query words must appear only in the title or URL
of a returned web page is a great way to make your results very
specific without making the keywords themselves too specific.
Following are descriptions of the special syntax elements, ordered by
common usage and function.
intitle :
intitle: restricts your
search to the titles of web pages. The variation
allintitle: finds pages
wherein all the words specified appear in the title of the web page.
Using allintitle: is basically the same as putting
intitle: before each keyword.
intitle:"george bush"
allintitle:"money supply" economics
You may wish to avoid the allintitle: variation,
because it doesn't mix well with some of the other
syntax elements.
intext :
intext: searches only
body text (i.e., ignores link text, URLs, and titles). While its uses
are limited, it's perfect for finding query words
that might be too common in URLs or link titles.
intext:"yahoo.com"
intext:html
There's an
allintext:
variation, but again, this doesn't play well with
others.
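Because an all* variation is described above as basically the same as repeating the single-word operator before each keyword, one way to sidestep the mixing problems is to expand it yourself. The following is a minimal sketch of that idea; the function name expand_all_variant is hypothetical, and the equivalence is only approximate, as the text itself hedges.

```python
import re


def expand_all_variant(operator, keywords):
    """Rewrite an all* query as one single-word operator per keyword."""
    if not operator.startswith("all"):
        raise ValueError("expected an all* operator such as 'allintitle:'")
    single = operator[len("all"):]  # 'allintitle:' -> 'intitle:'
    # Keep double-quoted phrases together; split everything else on whitespace.
    terms = re.findall(r'"[^"]*"|\S+', keywords)
    return " ".join(single + term for term in terms)


# Prints the expanded form of the allintitle: example shown earlier.
print(expand_all_variant("allintitle:", '"money supply" economics'))
```

The expanded query uses only the plain intitle: form, so it can be combined with other syntax elements one term at a time.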
inanchor :
inanchor: searches for
text in a page's link anchors. A link anchor is the
descriptive text of a link. For example, the link anchor in the HTML
code <a href="http://www.oreilly.com">O'Reilly Media</a> is
"O'Reilly Media."
inanchor:"tom peters"
As with other in*: syntax elements,
there's an allinanchor:
variation, which works in a similar way (i.e., all the keywords
specified must appear in a page's link anchors).
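To make the notion of a link anchor concrete, here is a small sketch that pulls the anchor text out of a piece of HTML using Python's standard html.parser module. It only illustrates what "anchor text" means; it says nothing about how Google itself extracts or indexes anchors, and the names AnchorTextParser and link_anchors are made up for this example.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collect the visible text found between <a> and </a> tags."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._current = None  # text pieces of the <a> being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Join the pieces and collapse runs of whitespace.
            self.anchors.append(" ".join("".join(self._current).split()))
            self._current = None


def link_anchors(html):
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.anchors


print(link_anchors('<a href="http://www.oreilly.com">O\'Reilly Media</a>'))
```

Run against the HTML fragment from the text, the list contains the single anchor "O'Reilly Media", which is the kind of text an inanchor: query matches.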
site :
site: allows
you to narrow your search by either a site or a top-level domain. The
AltaVista search engine, by contrast, has two syntax elements for
this function (host: and
domain:), but Google has only the one.
site:loc.gov
site:thomas.loc.gov
site:edu
site:nc.us
Be aware that site: is no good for
searching for pages that exist beneath the main or default site (i.e.,
in a subdirectory such as /~sam/album/). For
example, if you're looking for something below the
main GeoCities site, you can't use
site: to find all the pages in http://www.geocities.com/Heartland/Meadows/6485/;
Google returns no results. Use inurl: instead.
inurl :
inurl:
restricts your search to the URLs of web pages. This syntax tends to
work well for finding search and help pages, because they tend to be
rather regular in composition. An
allinurl:
variation finds all the words listed in a URL but
doesn't mix well with some other special syntax.
inurl:help
allinurl:search help
You'll see that using the inurl:
query instead of the site: query has one immediate
advantage: you can use it to search subdirectories.
You can also use inurl: in combination with the
site: syntax to draw out information on
subdomains. For example, how many subdomains does
oreilly.com really have? A quick query will help
you figure that out:
site:oreilly.com -inurl:www.oreilly.com
This query asks Google to list all pages from the
oreilly.com domain, but leave out those pages
that are from the common subdomain www, since
you already know about that one.
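As you discover more subdomains, you can keep excluding them in the same way. The sketch below builds such a query from a domain and a list of subdomains you already know about; the helper name other_subdomains_query is hypothetical, and it simply generalizes the single-exclusion query shown above.

```python
def other_subdomains_query(domain, known_subdomains):
    """Build a site: query that excludes already-known subdomains."""
    terms = [f"site:{domain}"]
    # One -inurl: exclusion per subdomain you have already seen.
    terms += [f"-inurl:{sub}.{domain}" for sub in known_subdomains]
    return " ".join(terms)


# Reproduces the query from the text.
print(other_subdomains_query("oreilly.com", ["www"]))
```

Each time the results reveal a new subdomain, add it to the list and run the resulting query again.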
link :
link: returns a
list of pages linking to the specified URL. Enter
link: followed by the address of the Google home page and you'll
get a list of pages that link to it. The syntax works with deeper URLs
(http://www.raelity.org/apps/blosxom/, for
instance) as well as with top-level URLs such as
raelity.org.
cache :
cache: finds a
copy of the page that Google indexed even if that page is no longer
available at its original URL or has since changed its content
completely.
cache:www.yahoo.com
If Google returns a result that appears to have little to do with
your query, you're almost sure to find what
you're looking for in the latest cached version of
the page at Google.
The Google cache is particularly useful for retrieving a previous
version of a page that changes often.
daterange :
daterange:
limits your search to a particular date or range of dates on which a
page was indexed. It's important to note that a
daterange: search has nothing to do with when a
page was created, only with when it was indexed by Google. So a page
created on February 2 but not indexed by Google until April 11 would
turn up in a daterange: search for April 11.
"Geri Halliwell" "Spice Girls" daterange:2450958-2450968
For an in-depth treatment of finding content either by the date it
was created or when it was first noticed by Google, see [Hack #16].
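The numbers in the daterange: example are Julian day numbers, a running count of days; read that way, 2450958 and 2450968 decode to May 24 and June 3, 1998. The following sketch converts ordinary calendar dates into that form using only Python's standard library. The helper names julian_day and daterange_term are made up for this illustration, and the offset constant assumes whole-day Julian numbers with no fractional part.

```python
from datetime import date

# Difference between a Julian day number and Python's proleptic Gregorian
# ordinal (date.toordinal()); e.g., January 1, 2000 is Julian day 2451545.
JDN_OFFSET = 1721425


def julian_day(d):
    """Return the whole-number Julian day for a calendar date."""
    return d.toordinal() + JDN_OFFSET


def daterange_term(start, end):
    """Build a daterange: term covering start through end."""
    return f"daterange:{julian_day(start)}-{julian_day(end)}"


# Rebuilds the date range used in the example query.
print(daterange_term(date(1998, 5, 24), date(1998, 6, 3)))
```

Append the resulting term to your keywords, as in the example query, to limit results to pages indexed within that span.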
filetype :
filetype:
searches filename suffixes, or extensions. These usually, but
not necessarily, correspond to different file types;
filetype:htm and filetype:html
will give you different result counts, even though
they're the same file type. You can even search for
different page generators (such as ASP, PHP, CGI, and so
forth), presuming the site isn't hiding them
behind redirection and proxying. Google indexes several different
Microsoft formats, including PowerPoint (.ppt),
Excel (.xls), and Word
(.doc).
homeschooling filetype:pdf
"leading economic indicators" filetype:ppt
related :
related:, as
you might expect, finds pages that are related to the specified page.
This is a good way to find categories of pages; a search for
related:google.com returns a variety of search
engines, including Lycos, Yahoo!, and Northern Light.
related:www.yahoo.com
related:www.cnn.com
While it's an increasingly rare occurrence, you'll find
that not all pages are related to other pages.
info :
info: provides
a page of links to more information about a specified URL. This
information includes a link to the URL's cache, a
list of pages that link to the URL, pages that are related to the
URL, and pages that contain the URL.
info:www.oreilly.com
info:www.nytimes.com/technology
Note that this information depends on whether Google has indexed
the specified URL; if not, the information will be far more
limited.
phonebook :
phonebook:, as
you might expect, looks up phone numbers.
phonebook:John Doe CA
phonebook:(510) 555-1212
The phonebook is covered in detail in [Hack #6].