AdWords, the text ads that appear to the right of the regular search results, are delivered on a cost-per-click basis, and purchasers can set a ceiling on the amount of money they spend on an ad. An ad can drop out of rotation once its budget is spent, so even if you run a search for the same query word multiple times, you won't necessarily get the same set of ads each time.
If you're considering using Google AdWords to run ads, you might want to gather up and save the ads that are running for the query words that interest you. Google AdWords is not included in the functionality provided by the Google API, so you're left to a little scraping to get at that data.
Save this code to a text file named adwords.pl:
#!/usr/bin/perl
# usage: perl adwords.pl results.html
#
use strict;
use HTML::TokeParser;

die "I need at least one file: $!\n"
    unless @ARGV;

my @Ads;
for my $file (@ARGV) {
    # skip if the file doesn't exist
    # you could add more file testing here.
    # errors go to STDERR so they won't
    # pollute our csv file
    unless (-e $file) {
        warn "What??: $file -- $! \n-- skipping --\n";
        next;
    }

    # now parse the file
    my $p = HTML::TokeParser->new($file);
    while (my $token = $p->get_token) {
        # only look at <a> start tags whose id is "aw" plus one digit
        next unless $token->[0] eq 'S'
            and $token->[1] eq 'a'
            and $token->[2]{id} =~ /^aw\d$/;
        my $link = $token->[2]{href};
        my $ad;
        if ($link =~ /pagead/) {
            # the destination is carried in the adurl parameter
            my ($url) = $link =~ /adurl=([^\&]+)/;
            $ad->{href} = $url;
        }
        elsif ($link =~ m{^/url\?}) {
            # the destination is carried in the q parameter;
            # undo the few escapes we expect to see
            my ($url) = $link =~ /\&q=([^&]+)/;
            $url =~ s/%3F/\?/;
            $url =~ s/%3D/=/g;
            $url =~ s/%25/%/g;
            $ad->{href} = $url;
        }
        $ad->{adwords} = $p->get_trimmed_text('/a');
        $ad->{desc}    = $p->get_trimmed_text('/font');
        # the display URL is the last word of the description
        ($ad->{url}) = $ad->{desc} =~ /([\S]+)$/;
        push(@Ads, $ad);
    }
}

# header row; note that the loop above fills only the first four columns
print quoted( qw( AdWords HREF Description URL Interest ) );
for my $ad (@Ads) {
    print quoted( @$ad{qw( adwords href desc url )} );
}

# wrap each field in double quotes and join with commas
sub quoted {
    return join( ",", map { "\"$_\"" } @_ )."\n";
}
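The script undoes only three escapes (%3F, %3D, and %25) in the ad's destination URL. If the saved pages turn out to contain other escaped characters, a more general decoder might help. The following is a minimal sketch, not part of adwords.pl: the helper name unescape_url and the sample link are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of adwords.pl): decode every %XX escape
# in a single pass instead of handling %3F, %3D and %25 one at a time.
sub unescape_url {
    my ($url) = @_;
    return '' unless defined $url;
    $url =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $url;
}

# Made-up ad link, for illustration only.
my $link = '/url?sa=l&q=http://www.example.com/page%3Fid%3D42%26src%3Dad';

# Pull out the q parameter, then decode it.
my ($raw) = $link =~ /[?&]q=([^&]+)/;
print unescape_url($raw), "\n";
```

Because the substitution runs once over the string, an escaped percent sign (%25) is turned into a literal % and is not decoded a second time.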
Run the script from the command line, giving it the saved results page and redirecting its output to a CSV file:
% perl adwords.pl input.html > output.csv
input.html is the name of the Google results page that you've saved. output.csv is the name of the comma-delimited file to which you want to save your results. You can also provide multiple input files on the command line if you'd like:
% perl adwords.pl input.html input2.html > output.csv
The results will appear in a comma-delimited format that looks like this:
"AdWords","HREF","Description","URL","Interest" "Free Blogging Site","http://www.1sound.com/ix", " The ultimate blog spot Start your journal now ","www.1sound.com/ix","40" "New Webaga Blog","http://www.webaga.com/blog.php", " Fully customizable. Fairly inexpensive. ","www.webaga.com","24" "Blog this","http://edebates.e-thepeople.org/a-national/article/10245/view&", " Will online diarists rule the Net strewn with failed dotcoms? ", "e-thePeople.org","26" "Ford - Ford Cars","http://quickquote.forddirect.com/FordDirect.jsp", " Build a Ford online here and get a price quote from your local dealer! ", "www.forddirect.com","40" "See Ford Dealer's Invoice","http://buyingadvice.com/search/", " Save $1,400 in hidden dealership profits on your next new car. ", "buyingadvice.com","28" "New Ford Dealer Prices","http://www.pricequotes.com/", " Compare Low Price Quotes on a New Ford from Local Dealers and Save! ", "www.pricequotes.com","25"