Hack 16. Search a Particular Date Range

One feature of Google's search is the ability to search within a
particular date range.

Before delving into the actual use of date range searching, there are
a few things you should understand. The first is this: a date range
search has nothing to do with the creation date of the content and
everything to do with the indexing date of the content. If I create a
page on March 8, 1999, and Google doesn't get around to indexing it
until May 22, 2002, then for the purposes of a date range search, the
date in question is May 22, 2002.

The second thing is that Google can index a page several times, and
the date attached to the page can change each time. So don't count on
a date range search staying consistent from day to day. Whether the
daterange: timestamp actually changes when a page is indexed again
depends on whether the content of the page has changed.

Third, Google doesn't "stand behind" the results of a search done
using the date range syntaxes. So if you get a weird result, you
can't complain to them. Google would rather you use the date range
options on their Advanced Search page, but that page allows you to
restrict your search only to the last three months, six months, or
year.
1.28.1. The daterange: Syntax
Why would you want to search by
daterange:?
There are several reasons:

It narrows down your search results to fresher content. Google might
find some obscure, out-of-the-way page and index it only once. Two
years later, this obscure, never-updated page is still turning up in
your search results. Limiting your search to a more recent date range
will result in only the most current of matches.

It helps you dodge current events. Say John Doe sets a world record
for eating hot dogs and immediately afterward rescues a baby from a
burning building. Less than a week after that happens, Google's
search results are going to be filled with John Doe. If you're
searching for information on (another) John Doe, babies, or burning
buildings, you'll scarcely be able to get rid of him. However, you
can avoid Mr. Doe's exploits by setting the date range syntax to
before the hot dog contest. This also works well for avoiding recent,
heavily covered news events, such as a crime spree or a forest fire,
and annual events of at least national importance such as national
elections or the Olympics.

It allows you to compare results over a period of time. For example,
you might count occurrences of "Mac OS X" and "Windows XP" across
successive date ranges. Of course, a count like this isn't foolproof;
indexing dates change over time. But generally it works well enough
that you can spot trends.
Using the daterange: syntax is as simple as:

    daterange:startdate-enddate

The catch is that the dates must be expressed as Julian dates (read
the sidebar, "Understanding Julian Dates"). So, for example, July 8,
2002, is Julian date 2452463.5 and May 22, 1968, is 2439998.5.
Furthermore, Google isn't fond of decimals in its daterange: queries;
use only integers. For July 8, 2002, that means 2452463 or 2452464,
depending on whether you prefer to round down or up.
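The conversion and rounding described above can be sketched in a few lines of Python. This is only an illustration, not part of Google's interface: the function names are hypothetical, and the default of rounding down is an assumption chosen because it reproduces the integers used in this hack's Spice Girls example.

```python
from datetime import date

# Offset between Python's date ordinal (day 1 is January 1 of year 1)
# and the Julian date at 00:00 on the same Gregorian day.
ORDINAL_TO_JD_MIDNIGHT = 1721424.5


def julian_date_at_midnight(d: date) -> float:
    """Return the Julian date for 00:00 on the given Gregorian date."""
    return d.toordinal() + ORDINAL_TO_JD_MIDNIGHT


def julian_integer(d: date, round_up: bool = False) -> int:
    """Drop the trailing .5, rounding down by default (an assumption)."""
    jd = julian_date_at_midnight(d)
    return int(jd + 0.5) if round_up else int(jd)


def daterange_term(start: date, end: date) -> str:
    """Build a daterange:startdate-enddate term from two Gregorian dates."""
    return f"daterange:{julian_integer(start)}-{julian_integer(end)}"


print(julian_date_at_midnight(date(2002, 7, 8)))            # 2452463.5
print(daterange_term(date(1998, 5, 25), date(1998, 6, 4)))  # daterange:2450958-2450968
```

Passing round_up=True to julian_integer gives the other integer (2452464 for July 8, 2002) if you prefer to round up.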
Understanding Julian Dates

While date-based searching is fantastically useful, date-based
searching with Julian dates is annoying at best (for a human,
anyway). A Julian date is just one number. It's not broken up into
month, day, and year. It's the number of days that have passed since
January 1, 4713 B.C. Unlike Gregorian days (those on the calendar you
and I use every day), which begin at midnight, Julian days begin at
noon, making them useful for astronomers.

While problematic for humans, they're rather handy for computer
programming, because to change dates you simply have to add and
subtract from one number and not worry about month and year changes,
not to mention leap years and the differing number of days in each
month. Google's daterange: special syntax element employs Julian
dates.

If things weren't confusing enough, there is actually another date
format that is also known as a Julian date format: a five-digit
number, yyddd, where the first two digits represent the last two
digits of the year and the last three represent the day of the year,
with a value between 1 and 365 (or 366 in a leap year). Google's
daterange: syntax doesn't support the yyddd format.

There are plenty of places you can convert Julian dates online. There
are a couple of nice converters at the U.S. Naval Observatory
Astronomical Applications Department
(http://aa.usno.navy.mil/data/docs/JulianDatel) and Mauro Orlandini's
home page (http://www.tesre.bo.cnr.it/~mauro/JD/), the latter
converting either Julian to Gregorian or vice versa.

More about Julian dates, along with online converters, can be found
via a Google search for julian date
(http://www.google.com/search?q=julian+date).
The daterange: syntax can be combined with most other Google special
syntaxes, with the exception of the link: syntax, which doesn't mix
well ["Mixing Syntaxes" earlier in this chapter] with other special
syntaxes, and of other magic words (e.g., stocks: and phonebook:).

daterange: does wonders for narrowing your search results. Let's look
at a couple of examples. Geri
Halliwell left the Spice Girls around May 27, 1998. If you wanted to
get a lot of information about the breakup, you could try doing a
date search in a 10-day window, say, May 25 to June 4. That query
would look like this:

    "Geri Halliwell" "Spice Girls" daterange:2450958-2450968

At the time of this writing, you'll get about 16 results, including
several news stories about the breakup. If you wanted to find less
formal sources, search for Geri or Ginger Spice instead of Geri
Halliwell.

That example's a bit on the silly side, but you get
the idea. Any event that you can clearly divide into before and after
dates (an event, a death, an overwhelming change in circumstances)
can be reflected in a date range search.

You can also use an individual event's date to change the results of
a larger search. For example, former ImClone CEO Sam Waksal was
arrested on June 12, 2002. You don't have to search for the name Sam
Waksal to get a very narrow set of results for June 13, 2002:

    imclone daterange:2452439-2452439

Similarly, if you search for imclone before the date of 2452439,
you'll get very different results. As an interesting exercise, try a
search that reflects the arrest, but date it a few days before the
actual arrest:

    imclone investigated daterange:2452000-2452435

This is a good way to find information or analysis that predates the
actual event, but that provides background that might help explain
the event itself. (Unless you use the date range search, usually this
kind of information is buried underneath news of the event itself.)

If you'd prefer to perform Google date range
searches without all this nonsense about Julian date formats, use the
FaganFinder Google
interface (http://www.faganfinder.com/engines/google.shtml),
an alternative to the Google Advanced Search page, sporting
daterange: searching via a Gregorian (read:
familiar) date pull-down menu. In Figure 1-28,
we're using the FaganFinder on our Spice Girls
breakup example.
Figure 1-28. The FaganFinder Google interface with Gregorian-based date range searching
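If you would rather script the "a few days before the event" technique than use a form, the date arithmetic can be sketched as follows. The helper names, the default gap, and the choice to round Julian dates down are assumptions made for illustration; the URL pattern is the plain google.com/search?q= form seen elsewhere in this hack.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Adding this to a date's ordinal gives the Julian date at 00:00,
# rounded down to an integer (rounding direction is an assumption).
JD_OFFSET = 1721424


def to_julian(d: date) -> int:
    """Convert a Gregorian date to an integer Julian date."""
    return d.toordinal() + JD_OFFSET


def query_before(terms: str, event: date, days_back: int, gap_days: int = 1) -> str:
    """Build a query whose date range ends gap_days before the event."""
    end = event - timedelta(days=gap_days)
    start = end - timedelta(days=days_back)
    return f"{terms} daterange:{to_julian(start)}-{to_julian(end)}"


def search_url(query: str) -> str:
    """Wrap a query in a Google search URL."""
    return "http://www.google.com/search?" + urlencode({"q": query})


# Background coverage from the 30 days ending 3 days before the arrest.
q = query_before("imclone investigated", date(2002, 6, 12), days_back=30, gap_days=3)
print(q)
print(search_url(q))
```

Widening days_back trades a smaller, more focused result set for more background material.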

The daterange: syntax restricts results by when Google
ran across the content you're after, but what about
narrowing your search results based on content creation date?
1.28.2. Searching by Content Creation Date
Searching for materials based on content creation is difficult.
There's no standard date format (score one for
Julian dates), many people don't date their pages
anyway, some pages don't contain date information in
their header, and still other content management systems routinely
stamp pages with today's date, confusing things
still further.

I can offer few suggestions for searching by content creation date.
Try adding a string of common date formats to your query. If you
wanted something from May 2003, for example, you could try appending:

    ("May * 2003" | "May 2003" | 05/03 | 05/*/03)

A query like that uses up most of your 10-word limit, however, so
it's best to be judicious, perhaps by cycling through these formats
one at a time. If any one of these is giving you too many results,
try restricting your search to the title tag of the page.

If you're feeling really lucky, you can search for a full date, such
as May 9, 2003. Your decision then is whether you want to search for
the date in the format above or as one of many variations: 9 May
2003, 9/5/2003, 9 May 03, and so forth. Exact-date searching will
severely limit your results and should be used only as a last-ditch
option.

When using date range searching, you'll have to be
flexible in your thinking, more general in your search than you
otherwise would be (because the date range search will narrow your
results substantially), and persistent in your queries, because
different dates and date ranges will yield very different results.
That said, you'll be rewarded with smaller result
sets focused on very specific events and topics in time.
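Cycling through the date formats suggested for content creation date searches is tedious by hand, so here is a minimal sketch that generates them. The function names are hypothetical, and the formats are only the ones mentioned in this hack, not an exhaustive list of how pages write dates.

```python
from datetime import date
from typing import List

# Spelled out explicitly so the output does not depend on the system locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def month_year_group(year: int, month: int) -> str:
    """OR-group of month/year formats, in the style of the May 2003 example."""
    name = MONTHS[month - 1]
    mm = f"{month:02d}"
    yy = f"{year % 100:02d}"
    return f'("{name} * {year}" | "{name} {year}" | {mm}/{yy} | {mm}/*/{yy})'


def exact_date_variants(d: date) -> List[str]:
    """Spellings of one full date, to be tried one at a time."""
    name = MONTHS[d.month - 1]
    yy = f"{d.year % 100:02d}"
    return [
        f"{name} {d.day}, {d.year}",    # May 9, 2003
        f"{d.day} {name} {d.year}",     # 9 May 2003
        f"{d.day}/{d.month}/{d.year}",  # 9/5/2003
        f"{d.day} {name} {yy}",         # 9 May 03
    ]


print(month_year_group(2003, 5))
for variant in exact_date_variants(date(2003, 5, 9)):
    print(f'"{variant}"')
```

Because the full OR-group eats into the 10-word limit, it is usually more practical to loop over the individual variants and run one query per format.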