Hack 38. Perform Proximity Searches

Sometimes it would be advantageous to search both
forward and backward. For example, if you're doing
genealogy research, you might find your uncle John Smith as either
"John Smith" or "Smith John." Similarly, some pages
might include John's middle initial: "John Q Smith" or
"Smith John Q."
At other times, you want to find words that appear near each other but
don't make up a phrase. For example, you might want
to learn about keeping squirrels out of your bird feeder. Various
attempts to create a phrase based on this idea might not work, but
just searching for several words might not find specific enough
results.

GAPS, created by Kevin Shay, allows you to run searches both
forward and backward and within a certain number of words of each
other. GAPS stands for Google API Proximity Search, and
that's exactly what this application is: a way to
search for topics within a few words of each other without having to
run several queries in a row. The program runs the queries and
automatically organizes the results.

You enter two terms (there is an option to add more terms that will
not be searched for in proximity) and specify how far apart you want
them (1, 2, or 3 words). You can specify that the words be found only
in the order you request (wordA, wordB) or in either order (wordA,
wordB or wordB, wordA). You can also specify how many results you want
and how they are sorted (by title, URL, ranking, or
proximity).

Search results are formatted much like regular Google results, except that a
distance ranking is included beside each title. The distance ranking,
between one and three, specifies how far apart the two query words
were on the page. Figure 2-12 shows a GAPS search
for google and hacks within two
words of one another, order intact.
Figure 2-12. GAPS search for "google" and "hacks" within two words of one another

Google directly.
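The GAPS source is not reproduced here, so the following is only a minimal sketch of how a tool of this kind could expand two terms into a set of queries. It assumes that each query is a quoted phrase in which Google's `*` operator stands in for one whole word, and that a distance of 1 means the two words are adjacent. The function name `proximity_queries` and its parameters are hypothetical and are not taken from GAPS.

```python
def proximity_queries(word_a, word_b, max_distance=3,
                      either_order=False, extra_terms=()):
    """Build (distance, query) pairs for a two-word proximity search.

    Assumptions (not taken from the GAPS source):
    - distance 1 means the two words are adjacent;
    - each extra step of distance adds one "*" whole-word wildcard
      inside the quoted phrase;
    - extra_terms are appended outside the phrase, so they are
      required on the page but not searched for in proximity.
    """
    if not 1 <= max_distance <= 3:
        raise ValueError("max_distance must be 1, 2, or 3")

    # Search in the requested order only, or in both orders.
    orders = [(word_a, word_b)]
    if either_order:
        orders.append((word_b, word_a))

    extras = " ".join(extra_terms)
    queries = []
    for distance in range(1, max_distance + 1):
        wildcards = ["*"] * (distance - 1)
        for first, second in orders:
            phrase = " ".join([first, *wildcards, second])
            query = f'"{phrase}"'
            if extras:
                query = f"{query} {extras}"
            queries.append((distance, query))
    return queries


if __name__ == "__main__":
    # "google" and "hacks" within two words of one another, order intact.
    for distance, query in proximity_queries("google", "hacks", max_distance=2):
        print(distance, query)
```

Keeping the distance alongside each query string means the results of every query can later be labeled with the distance that produced them, which is one plausible way to arrive at a per-result distance ranking.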
2.20.1. Making the Most of GAPS
GAPS works best when you have words on the same page that are
ambiguously or not at all related to one another. For example, if
you're looking for information on Google and search
engine optimization (SEO), you might find that searching for the
words Google and SEO doesn't find the results that
you want, while using GAPS to search for the words Google and SEO
within three words of each other finds material focused much more on
search engine optimization for Google.

GAPS also works well when you're searching for
information about two famous people who might often appear on the
same page, though not necessarily in proximity to each other. For
example, you might want information on Bill Clinton and Alan
Greenspan, but might find that you're getting too
many pages that happen to list the two of them. By searching for
their names in proximity to each other, you'll get
better results.

Finally, you might find GAPS useful in medical research. Many times
your search results will include index pages
that list several symptoms. However, searching for symptoms or other
medical terms within a few words of each other can help you find more
relevant results. Note that this technique will take some
experimentation. Many pages about medical conditions contain long
lists of symptoms and effects, and there's no guarantee
that one symptom will appear within a few words of another.
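Because this kind of searching takes experimentation, it can help to measure word distances yourself on text you already have, such as result snippets or a saved page. The sketch below is not part of GAPS. The helper names `min_distance` and `rank_by_proximity` are hypothetical, and the code assumes that a distance of 1 means the two words are adjacent and that words are compared case-insensitively.

```python
import re


def min_distance(text, word_a, word_b, ordered=False):
    """Return the smallest word distance between word_a and word_b in text.

    A distance of 1 means the words are adjacent (an assumption made
    for this sketch). With ordered=True, only occurrences where word_a
    comes before word_b count. Returns None if no qualifying pair exists.
    """
    words = re.findall(r"\w+", text.lower())
    a_positions = [i for i, w in enumerate(words) if w == word_a.lower()]
    b_positions = [i for i, w in enumerate(words) if w == word_b.lower()]

    best = None
    for i in a_positions:
        for j in b_positions:
            if i == j or (ordered and j < i):
                continue
            distance = abs(j - i)
            if best is None or distance < best:
                best = distance
    return best


def rank_by_proximity(snippets, word_a, word_b, max_distance=3):
    """Keep snippets where the words fall within max_distance, closest first."""
    scored = []
    for snippet in snippets:
        distance = min_distance(snippet, word_a, word_b)
        if distance is not None and distance <= max_distance:
            scored.append((distance, snippet))
    # sorted() is stable, so snippets with equal distance keep their order.
    return sorted(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    # A symptom list puts unrelated terms close together.
    print(min_distance("fever, cough, fatigue, headache", "fever", "headache"))
    print(rank_by_proximity(
        ["google has many clever hacks", "google hacks", "hacks about google"],
        "google", "hacks"))
```

Running a check like this over a symptom list shows how easily two unrelated terms can land within three words of each other, which is why proximity alone does not always filter out index pages.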
2.20.2. The Code
The GAPS source code is rather lengthy, so we're not
making it available here. You can, however, get it
online at http://www.staggernation.com/gaps/readmel.
2.20.3. See Also
If you like GAPS, you might want to try a couple of other scripts
from Staggernation:
GAWSH (http://www.staggernation.com/gawsh)
Stands for Google API Web Search by Host. This program allows you to
enter a query and get a list of domains that contain information on
that query. If you click on the triangle beside any domain name,
you'll get a list of pages in that domain that match
your query. This program uses DHTML, which means that
it'll only work with Internet Explorer or
Mozilla/Netscape.
GARBO (http://www.staggernation.com/garbo)
Stands for Google API Relation Browsing Outliner. Like GAWSH, this
program uses DHTML, so it'll work only with
Mozilla/Netscape and Internet Explorer. When you enter a URL, GARBO
will do a search for either pages that link to the URL you specify or
pages related to that URL. Run a search and you'll
get a list of URLs with triangles beside them. If you click on a
triangle, you'll get a list of pages that either
link to the URL you chose or are related to the URL you chose,
depending on what you chose in the initial query.