Google Hacks 2nd Edition [Electronic resources] (text version)

Tara Calishain

Hack 28. Permute a Query

Run all permutations of query keywords and
phrases to squeeze the last drop of results from the Google
index.

Google, ah, Google.

Search engine of over eight billion
pages and zillions of possible results. If you're a
search engine geek like I am, few things are more entertaining than
trying various tweaks with your Google search to see what exactly
makes a difference to the results.

It's amazing what makes a difference. For example,
you wouldn't think that word order would make much
of an impact, but it does. In fact, buried in
Google's documentation is the admission that the
word order of a query will impact search results.

While that's an interesting thought, who has time to
generate and run every possible ordering of a multiword query?
Google API to the rescue! This hack takes a query of up to four
keywords or "quoted phrases" (special syntaxes are
supported too) and runs all possible
permutations, showing result counts by permutation and the top
results across all permutations.
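
To get a feel for how many queries that means, here is a minimal sketch that generates every ordering of a short keyword list in plain Perl, without any CPAN modules. The permutations helper and the sample terms are assumptions made for illustration; the hack's own script relies on Algorithm::Permute instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# permutations() is a hypothetical helper, not part of the hack's script.
# It recursively builds every ordering of the items it is given and
# returns a list of array references, one per ordering.
sub permutations {
    my @items = @_;
    return ([]) unless @items;          # one (empty) ordering of nothing
    my @result;
    for my $i (0 .. $#items) {
        my @rest = @items;
        my ($picked) = splice(@rest, $i, 1);   # take item $i out of the list
        push @result, [ $picked, @$_ ] for permutations(@rest);
    }
    return @result;
}

# Sample terms; a quoted phrase would count as a single item.
my @keywords = ('applescript', 'google', 'api');
my @queries  = map { join ' ', @$_ } permutations(@keywords);

print scalar(@queries), " permutations:\n";
print "  $_\n" for @queries;
```

Three terms give 3! = 6 orderings, and the four-term maximum gives 4! = 24, which is why the script caps the query length: every extra term multiplies the number of API calls.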


2.10.1. The Code


Save the following code as a CGI script ["How to Run
the Hacks" in the Preface] named
order_matters.cgi in your web
site's cgi-bin directory. As
you type in the script, be sure to replace
insert key here
with your Google API key.


You'll need to have the
Algorithm::Permute Perl module for this
program to work correctly (http://search.cpan.org/search?query=algorithm%3A%3Apermute&mode=all).

#!/usr/local/bin/perl
# order_matters.cgi
# Queries Google for every possible permutation of up to 4 query keywords,
# returning result counts by permutation and top results across permutations.
# order_matters.cgi is called as a CGI with form input.

# Your Google API developer's key.
my $google_key='insert key here';

# Location of the GoogleSearch WSDL file.
my $google_wdsl = "./GoogleSearch.wsdl";

use strict;
use SOAP::Lite;
use CGI qw/:standard *table/;
use Algorithm::Permute;

# Print the page header and the query form.
print
  header( ),
  start_html("Order Matters"),
  h1("Order Matters"),
  start_form(-method=>'GET'),
  'Query:   ', textfield(-name=>'query'),
  '   ',
  submit(-name=>'submit', -value=>'Search'), br( ),
  '<font size="-2" color="green">Enter up to 4 query keywords or "quoted phrases"</font>',
  end_form( ), p( );

if (param('query')) {

  # Glean keywords, keeping "quoted phrases" (with an optional + or -) intact.
  my @keywords = grep !/^\s*$/, split /([+-]?".+?")|\s+/, param('query');

  # Bail out early on too many terms. (An if block is not a loop,
  # so "last" cannot be used to leave it.)
  if (@keywords > 4) {
    print '<font color="red">Only 4 query keywords or phrases allowed.</font>',
      end_html( );
    exit;
  }

  my $google_search = SOAP::Lite->service("file:$google_wdsl");

  print
    start_table({-cellpadding=>'10', -border=>'1'}),
    Tr([th({-colspan=>'2'}, ['Result Counts by Permutation' ])]),
    Tr([th({-align=>'left'}, ['Query', 'Count'])]);

  my $results = {}; # keep track of what we've seen across queries

  # Iterate over every possible permutation.
  my $p = new Algorithm::Permute( \@keywords );
  while (my $query = join(' ', $p->next)) {

    # Query Google.
    my $r = $google_search ->
      doGoogleSearch(
        $google_key,
        $query,
        0, 10, "false", "", "false", "", "latin1", "latin1"
      );

    print Tr([td({-align=>'left'}, [$query, $r->{'estimatedTotalResultsCount'}] )]);

    @{$r->{'resultElements'}} or next;

    # Assign a rank: 10 points for the first result, 9 for the second, and so on,
    # accumulated per URL across all permutations.
    my $rank = 10;
    foreach (@{$r->{'resultElements'}}) {
      $results->{$_->{URL}} = {
        title => $_->{title},
        snippet => $_->{snippet},
        seen => ($results->{$_->{URL}}->{seen}) + $rank
      };
      $rank--;
    }
  }

  print
    end_table( ), p( ),
    start_table({-cellpadding=>'10', -border=>'1'}),
    Tr([th({-colspan=>'2'}, ['Top Results across Permutations' ])]),
    Tr([th({-align=>'left'}, ['Score', 'Result'])]);

  # List results from highest to lowest accumulated score.
  foreach ( sort { $results->{$b}->{seen} <=> $results->{$a}->{seen} } keys %$results ) {
    print Tr(td([
      $results->{$_}->{seen},
      b($results->{$_}->{title}||'no title') . br( ) .
      a({href=>$_}, $_) . br( ) .
      i($results->{$_}->{snippet}||'no snippet')
    ]));
  }

  print end_table( );
}

print end_html( );
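
The "Glean keywords" step is the least obvious line in the script, so here is a small standalone sketch of it. The tokenize wrapper and the sample query are assumptions made for illustration; only the split pattern itself comes from the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# tokenize() is a hypothetical wrapper around the script's split/grep idiom.
sub tokenize {
    my ($query) = @_;
    # Because the pattern contains a capturing group, split returns each
    # quoted phrase (with an optional leading + or -) as a field of its own,
    # while plain whitespace separators are thrown away. Captures that did
    # not match come back undefined, so drop those along with any empty or
    # whitespace-only fields.
    return grep { defined($_) && !/^\s*$/ }
           split /([+-]?".+?")|\s+/, $query;
}

# Sample query mixing a plain word, quoted phrases, and a special syntax.
my @keywords = tokenize('applescript "google api" -"visual basic" site:oreilly.com');

print scalar(@keywords), " terms\n";
print "  [$_]\n" for @keywords;
```

Each quoted phrase survives as a single term with its quotes (and any leading + or -) attached, and a special syntax such as site:oreilly.com passes through as an ordinary word, so this sample counts as four terms.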


2.10.2. Running the Hack


Point your web browser at the CGI script
order_matters.cgi on your web server. Enter the
query you want to check (up to four words or phrases). The script
will first search for every possible permutation of the search words
and phrases, as shown in Figure 2-4.


Figure 2-4. Permutations for applescript google api


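The script will then display the top search results across all
permutations of the query, ordered by an accumulated score (10 points
for a first-place result in a permutation, down to 1 point for tenth
place), as shown in Figure 2-5.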


Figure 2-5. Top results for permutations of applescript google api
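
The scoring behind that second table can be tried offline. The sketch below mirrors the script's bookkeeping on made-up data; the score_results helper, the queries, and the example.com URLs are all hypothetical, and real result lists would come from the search API.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical result lists: the URLs returned, in rank order, for each
# permutation of a two-term query.
my %results_by_query = (
    'google api' => [ 'http://example.com/a', 'http://example.com/b', 'http://example.com/c' ],
    'api google' => [ 'http://example.com/b', 'http://example.com/a', 'http://example.com/d' ],
);

# score_results() is a hypothetical helper. For each query, the first
# result earns 10 points, the second 9, and so on; points are summed
# per URL across all queries.
sub score_results {
    my ($by_query) = @_;
    my %score;
    for my $query (sort keys %$by_query) {
        my $rank = 10;
        for my $url (@{ $by_query->{$query} }) {
            $score{$url} += $rank;
            $rank--;
        }
    }
    return %score;
}

my %score = score_results(\%results_by_query);

# Highest score first; ties are broken alphabetically by URL.
for my $url (sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score) {
    printf "%3d  %s\n", $score{$url}, $url;
}
```

A page that ranks well under every word order accumulates points from each permutation, while a page that appears for only one ordering scores low, which is what makes the combined table useful for spotting order-sensitive results.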


At first blush, this hack looks like a novelty with few practical
applications. But if you're a regular researcher or
a web wrangler, you might find it of interest.

If you're a regular researcher (that is, there
are certain topics that you research on a regular basis), you
might want to spend some time with this hack and see if you can
detect a pattern in how your regular search terms are impacted by
changing word order. You might need to revise your searching so that
certain words always come first or last in your query.

If you're a web wrangler, you need to know where
your page appears in Google's search results. If
your page loses a lot of ranking ground because of a shift in a query
arrangement, maybe you want to add some more words to your text or
shift your existing text.

