Python Cookbook 2nd Edition Jun 1002005 [Electronic resources] (text version)


David Ascher, Alex Martelli, Anna Ravenscroft



Recipe 3.6. Looking up Holidays Automatically


Credit: Anna Martelli Ravenscroft, Alex Martelli


Problem


Holidays vary by country, by
region, even by union within the same company. You want an automatic
way to determine the number of holidays that fall between two given
dates.


Solution


Between two dates, there may be movable holidays, such as Easter and
Labor Day (U.S.); holidays that are based on Easter, such as the day
after Easter (which this recipe calls Boxing Day, although Boxing Day
proper falls on December 26); holidays with a fixed date, such as
Christmas; and holidays that your company has designated (the CEO's
birthday). You can deal with all of them using datetime and the
third-party module dateutil.

A very flexible architecture is to factor out the various
possibilities into separate functions to be called as appropriate:

import datetime
import os
from dateutil import rrule, easter

def all_easter(start, end):
    # return the list of Easter dates within start..end
    easters = [easter.easter(y)
               for y in range(start.year, end.year+1)]
    return [d for d in easters if start<=d<=end]

def all_boxing(start, end):
    # return the list of "Boxing Day" dates within start..end
    # (computed here as the day after Easter)
    one_day = datetime.timedelta(days=1)
    boxings = [easter.easter(y)+one_day
               for y in range(start.year, end.year+1)]
    return [d for d in boxings if start<=d<=end]

def all_christmas(start, end):
    # return the list of Christmas Day dates within start..end
    christmases = [datetime.date(y, 12, 25)
                   for y in range(start.year, end.year+1)]
    return [d for d in christmases if start<=d<=end]

def all_labor(start, end):
    # return the list of Labor Day dates within start..end
    labors = rrule.rrule(rrule.YEARLY, bymonth=9, byweekday=rrule.MO(1),
                         dtstart=start, until=end)
    # no need to test for in-between here: rrule honors dtstart and until
    return [d.date() for d in labors]

def read_holidays(start, end, holidays_file='holidays.txt'):
    # return the list of dates from holidays_file within start..end
    try:
        infile = open(holidays_file)
    except IOError as err:
        print('cannot read holidays (%r):' % (holidays_file,), err)
        return []
    holidays = []
    for line in infile:
        # skip blank lines and comments
        if line.isspace() or line.startswith('#'):
            continue
        # try to parse the format: YYYY, M, D
        try:
            y, m, d = [int(x.strip()) for x in line.split(',')]
            date = datetime.date(y, m, d)
        except ValueError:
            # diagnose invalid line and just go on
            print("Invalid line %r in holidays file %r" % (
                line, holidays_file))
            continue
        if start<=date<=end:
            holidays.append(date)
    infile.close()
    return holidays

holidays_by_country = {
    # map each country code to a sequence of functions
    'US': (all_easter, all_christmas, all_labor),
    'IT': (all_easter, all_boxing, all_christmas),
}

def holidays(cc, start, end, holidays_file='holidays.txt'):
    # read applicable holidays from the file
    all_holidays = read_holidays(start, end, holidays_file)
    # add all holidays computed by applicable functions
    functions = holidays_by_country.get(cc, ())
    for function in functions:
        all_holidays += function(start, end)
    # eliminate duplicates
    all_holidays = list(set(all_holidays))
    # uncomment the following 2 lines to return a sorted list:
    # all_holidays.sort()
    # return all_holidays
    return len(all_holidays)    # comment this out if returning list

if __name__ == '__main__':
    test_file = open('test_holidays.txt', 'w')
    test_file.write('2004, 9, 6\n')
    test_file.close()
    testdates = [ (datetime.date(2004, 8, 1),
                   datetime.date(2004, 11, 14)),
                  (datetime.date(2003, 2, 28), datetime.date(2003, 5, 30)),
                  (datetime.date(2004, 2, 28), datetime.date(2004, 5, 30)),
                ]
    def test(cc, testdates, expected):
        for (s, e), expect in zip(testdates, expected):
            print('total holidays in %s from %s to %s is %d (exp %d)' % (
                cc, s, e, holidays(cc, s, e, test_file.name), expect))
        print()
    test('US', testdates, (1,1,1))
    test('IT', testdates, (1,2,2))
    os.remove(test_file.name)


Discussion


In one company I worked for, there were three different unions, and
holidays varied among the unions by contract. In addition, we had to
track any snow days or other release days in the same way as
"official" holidays. To deal with
all the potential variations in holidays, it's
easiest to factor out the calculation of standard holidays into their
own functions, as we did in the preceding example for
all_easter, all_labor, and so
on. Examples of different types of calculations are provided so
it's easy to roll your own as needed.
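
As one illustration of rolling your own, here is a minimal sketch of a function for U.S. Thanksgiving (the fourth Thursday of November). It follows the same start..end contract as the recipe's functions but, to stay dependency-free, uses only datetime rather than dateutil. The names nth_weekday and all_thanksgiving are our own inventions for this example and are not part of the recipe.

```python
import datetime

def nth_weekday(year, month, weekday, n):
    # hypothetical helper: return the n-th given weekday of a month,
    # with weekday numbered as in date.weekday() (Monday=0 ... Sunday=6)
    first = datetime.date(year, month, 1)
    # days from the 1st of the month to the first such weekday
    offset = (weekday - first.weekday()) % 7
    return first + datetime.timedelta(days=offset + 7 * (n - 1))

def all_thanksgiving(start, end):
    # return the list of Thanksgiving dates within start..end
    # (both bounds included, like the other all_* functions)
    days = [nth_weekday(y, 11, 3, 4)
            for y in range(start.year, end.year + 1)]
    return [d for d in days if start <= d <= end]
```

A function written this way can be added to a country's tuple in holidays_by_country without touching anything else.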

Although half-open intervals (with the lower bound included but the
upper one excluded) are the norm in Python (and for good reasons,
since they're arithmetically more malleable and tend
to induce fewer bugs in your computations!), this recipe deals with
closed intervals instead (both lower and upper bounds included).
Unfortunately, that's how specifications in terms of
date intervals tend to be given, and dateutil also
works that way, so the choice was essentially obvious.
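
The difference between the two conventions shows up only at the upper bound. The following sketch, with helper names of our own choosing, contrasts a closed test with a half-open one and shows how a closed interval can be expressed in half-open terms by pushing the upper bound forward one day.

```python
import datetime

def in_closed(d, start, end):
    # closed interval: both start and end count as "inside"
    return start <= d <= end

def in_half_open(d, start, end):
    # half-open interval: start is inside, end is not
    return start <= d < end

one_day = datetime.timedelta(days=1)
start = datetime.date(2004, 12, 1)
christmas = datetime.date(2004, 12, 25)

# a holiday that falls exactly on the upper bound
closed_hit = in_closed(christmas, start, christmas)
half_open_hit = in_half_open(christmas, start, christmas)
# the closed interval start..end equals the half-open start..end+1 day
shifted_hit = in_half_open(christmas, start, christmas + one_day)
```

Here closed_hit and shifted_hit are true, while half_open_hit is false, which is why a specification such as "December 1 through December 25" is most naturally coded as a closed interval.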

Each function is responsible for ensuring that it only returns
results that meet our criteria: lists of
datetime.date instances that lie between the dates
(inclusive) passed to the function. For example, in
all_labor, we coerce the
datetime.datetime results returned by
dateutil's
rrule into datetime.date
instances with the date method.
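
The coercion matters because a datetime.datetime and a datetime.date for the same day do not compare equal, so mixing the two types would defeat the duplicate elimination done later with a set. A small sketch, where as_date is a hypothetical helper of ours and from_rule merely stands in for the kind of value an rrule yields:

```python
import datetime

def as_date(value):
    # coerce a datetime.datetime to a plain datetime.date;
    # values that are already plain dates pass through unchanged
    if isinstance(value, datetime.datetime):
        return value.date()
    return value

from_file = datetime.date(2004, 9, 6)             # as parsed from a text file
from_rule = datetime.datetime(2004, 9, 6, 0, 0)   # stand-in for an rrule result

mixed = set([from_file, from_rule])                     # two distinct members
uniform = set([as_date(from_file), as_date(from_rule)]) # one member
```

With mixed types the same holiday would be counted twice; once every function returns datetime.date instances, the set collapses the duplicates as intended.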

A company may choose to set a specific date as a holiday (such as a
snow day) "just this once," and a
text file may be used to hold such unique instances. In our example,
the read_holidays function handles the task of
reading and processing a text file, with one date per line, each in
the format year, month, day. You could also choose to refactor this
function to use a "fuzzy" date
parser, as shown in Recipe 3.7.
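
To make the file format concrete, the sketch below parses a few sample lines in the year, month, day format. The sample data is invented for illustration, and parse_holiday_line is a hypothetical refactoring of the parsing step, not a function from the recipe; unlike read_holidays, it silently skips bad lines instead of printing a diagnostic.

```python
import datetime

# invented sample contents of a holidays file
SAMPLE = """\
# company holidays
2004, 9, 6

2005, 1, 17
2005, 2, 30
not a date
"""

def parse_holiday_line(line):
    # return a datetime.date for a 'YYYY, M, D' line, or None for
    # blank lines, comments, and lines that cannot be parsed
    if not line.strip() or line.startswith('#'):
        return None
    try:
        y, m, d = [int(x.strip()) for x in line.split(',')]
        return datetime.date(y, m, d)
    except ValueError:
        # wrong field count, non-numeric field, or impossible date
        return None

parsed = [parse_holiday_line(line) for line in SAMPLE.splitlines()]
dates = [d for d in parsed if d is not None]
```

Only the two well-formed lines survive; February 30 is rejected by datetime.date itself, which raises ValueError for impossible dates.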

If you need to look up holidays many times within a single run of
your program, you may apply the optimization of reading and parsing
the text file just once, then using the list of dates parsed from its
contents each time that data is needed. However,
"premature optimization is the root of all evil in
programming," as Knuth said, quoting Hoare: by
avoiding even this "obvious"
optimization, we gain clarity and flexibility. Imagine these
functions being used in an interactive environment, where the text
file containing holidays may be edited between one computation and
the next: by rereading the file each time, there is no need for any
special check about whether the file was changed since you last read
it!
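
For comparison, here is a rough sketch of what the read-once optimization might look like. The HolidayCache class is our own hypothetical design, not part of the recipe; note that it needs an explicit reload method, which is exactly the extra bookkeeping that rereading the file every time avoids.

```python
import datetime

class HolidayCache(object):
    # hypothetical helper: parse the holidays file once, on first use
    def __init__(self, path):
        self.path = path
        self._dates = None

    def reload(self):
        # (re)read and parse the file, replacing any cached dates
        dates = []
        with open(self.path) as infile:
            for line in infile:
                if not line.strip() or line.startswith('#'):
                    continue
                try:
                    y, m, d = [int(x) for x in line.split(',')]
                    dates.append(datetime.date(y, m, d))
                except ValueError:
                    continue    # skip lines that cannot be parsed
        self._dates = dates

    def between(self, start, end):
        # return the cached dates within start..end (bounds included)
        if self._dates is None:
            self.reload()
        return [d for d in self._dates if start <= d <= end]
```

If the file is edited after the first call to between, later calls keep returning the stale dates until reload is called explicitly.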

Since countries often celebrate different holidays, the recipe
provides a rudimentary holidays_by_country
dictionary. You can consult plenty of web sites that list holidays by
country to flesh out the dictionary for your needs. The important
part is that this dictionary allows a different group of
holidays-generating functions to be called, depending on which
country code is passed to the holidays function. If
your company has multiple unions, you could easily create a
union-based dictionary, passing the union-code instead of (or for
multinationals, in addition to) a country code to
holidays. The holidays function
calls the appropriate functions (including, unconditionally,
read_holidays), concatenates the results, eliminates
duplicates, and returns the length of the list. If you prefer, of
course, you can return the list instead, by simply uncommenting two
lines as indicated in the code.
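
A union-based lookup combined with the country lookup might be sketched as follows. The union codes, the all_new_year function, and the holiday_dates function are all invented for this illustration; the sketch returns a sorted list rather than a count and, to stay short, leaves out the text-file step.

```python
import datetime

def all_christmas(start, end):
    # return the list of Christmas Day dates within start..end
    days = [datetime.date(y, 12, 25)
            for y in range(start.year, end.year + 1)]
    return [d for d in days if start <= d <= end]

def all_new_year(start, end):
    # return the list of New Year's Day dates within start..end
    days = [datetime.date(y, 1, 1)
            for y in range(start.year, end.year + 1)]
    return [d for d in days if start <= d <= end]

holidays_by_country = {'US': (all_christmas,)}
# hypothetical union codes, each mapped to its contractual extras
holidays_by_union = {'LOCAL-1': (all_new_year,), 'LOCAL-2': ()}

def holiday_dates(cc, union, start, end):
    # combine the country's functions with the union's functions
    functions = (holidays_by_country.get(cc, ()) +
                 holidays_by_union.get(union, ()))
    found = set()
    for function in functions:
        found.update(function(start, end))
    return sorted(found)
```

Unknown country or union codes simply contribute no functions, mirroring the recipe's use of get with an empty tuple as the default.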


See Also


Recipe 3.7 for fuzzy parsing; the dateutil documentation; the
datetime documentation in the Library Reference.
