Python Cookbook 2nd Edition Jun 1002005 [Electronic resources] (text version)


David Ascher, Alex Martelli, Anna Ravenscroft

Recipe 14.6. Resuming the HTTP Download of a File


Credit: Chris Moffitt


Problem


You need to resume an HTTP download
of a file that has been partially transferred.


Solution


Downloads of large files are sometimes interrupted. However, a good
HTTP server that supports the Range header lets you resume the
download from where it was interrupted. The standard Python module
urllib lets you access this functionality almost
seamlessly: you just have to add the required header and intercept
the error code that the server sends to confirm that it will respond
with a partial file. Here is a function, with a little helper class,
to perform this task:

# Python 2 code: relies on urllib.FancyURLopener and the print statement
import urllib, os

class myURLOpener(urllib.FancyURLopener):
    """ Subclass to override err 206 (partial file being sent); okay for us """
    def http_error_206(self, url, fp, errcode, errmsg, headers, data=None):
        pass    # Ignore the expected "non-error" code

def getrest(dlFile, fromUrl, verbose=0):
    myUrlclass = myURLOpener( )
    if os.path.exists(dlFile):
        outputFile = open(dlFile, "ab")
        existSize = os.path.getsize(dlFile)
        # If the file exists, then download only the remainder
        myUrlclass.addheader("Range","bytes=%s-" % (existSize))
    else:
        outputFile = open(dlFile, "wb")
        existSize = 0
    webPage = myUrlclass.open(fromUrl)
    if verbose:
        for k, v in webPage.headers.items( ):
            print k, "=", v
    # If we already have the whole file, there is no need to download it again
    numBytes = 0
    webSize = int(webPage.headers['Content-Length'])
    if webSize == existSize:
        if verbose:
            print "File (%s) was already downloaded from URL (%s)" % (
                dlFile, fromUrl)
    else:
        if verbose:
            print "Downloading %d more bytes" % (webSize-existSize)
        # Copy the response to the local file in 8 KB chunks
        while True:
            data = webPage.read(8192)
            if not data:
                break
            outputFile.write(data)
            numBytes = numBytes + len(data)
    webPage.close( )
    outputFile.close( )
    if verbose:
        print "downloaded", numBytes, "bytes from", webPage.url
    return numBytes


Discussion


The HTTP Range header lets the web server know that you want only a
certain range of data to be downloaded, and this recipe takes
advantage of this header. Of course, the server needs to support the
Range header, but since the header is part of the HTTP 1.1
specification, it's widely supported. This recipe
has been tested with Apache 1.3 as the server, but I expect no
problems with other reasonably modern servers.
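As a rough illustration of the header exchange, the sketch below builds a Range request and parses the Content-Range value that a server sends back with a 206 response. It uses Python 3's urllib.request rather than the Python 2 urllib module the recipe targets, and the helper names range_request and parse_content_range, like the example URL, are made up for this example.

```python
import re
import urllib.request


def range_request(url, offset):
    """Build a request asking for everything from byte `offset` onward."""
    req = urllib.request.Request(url)
    req.add_header("Range", "bytes=%d-" % offset)
    return req


def parse_content_range(value):
    """Parse 'bytes first-last/total' into (first, last, total).

    total is None when the server reports the full length as '*'.
    """
    m = re.match(r"bytes (\d+)-(\d+)/(\d+|\*)$", value.strip())
    if m is None:
        raise ValueError("unrecognized Content-Range: %r" % value)
    first, last, total = m.groups()
    return int(first), int(last), None if total == "*" else int(total)


# Building the request does not touch the network.
req = range_request("http://example.com/big.iso", 1000)
print(req.get_header("Range"))                      # bytes=1000-
print(parse_content_range("bytes 1000-4999/5000"))  # (1000, 4999, 5000)
```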

The recipe lets urllib.FancyURLopener do all the
hard work of adding a new header, as well as the normal handshaking.
I had to subclass the standard class from urllib
only to make it known that the error 206 is not really an error in
this case, so you can proceed normally. In the function, I also
perform extra checks to quit the download if I've
already downloaded the entire file.
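For readers on Python 3, where urllib.FancyURLopener is no longer the usual route, the same idea might be sketched with urllib.request as shown here. This is not the recipe's code: the function name resume_download and its opener parameter are assumptions made for illustration, and treating a 416 response as "already complete" is a simplification. Because the Content-Length of a 206 response counts only the bytes in the returned range, the sketch decides what to do from the status code rather than by comparing sizes.

```python
import os
import urllib.error
import urllib.request


def resume_download(url, path, opener=urllib.request.urlopen, chunk=8192):
    """Download url to path, resuming from any partial file already there.

    Returns the number of bytes written by this call.
    """
    have = os.path.getsize(path) if os.path.exists(path) else 0
    req = urllib.request.Request(url)
    if have:
        # Ask only for the bytes we do not have yet.
        req.add_header("Range", "bytes=%d-" % have)
    try:
        resp = opener(req)
    except urllib.error.HTTPError as e:
        if e.code == 416 and have:
            # Requested range starts at or past the end of the resource;
            # assumed here to mean the local copy is already complete.
            return 0
        raise
    # 206: the server honored the Range header, so append to the file.
    # Any other success (typically 200) carries the whole file: start over.
    mode = "ab" if resp.status == 206 else "wb"
    written = 0
    try:
        with open(path, mode) as out:
            while True:
                data = resp.read(chunk)
                if not data:
                    break
                out.write(data)
                written += len(data)
    finally:
        resp.close()
    return written
```

A call such as resume_download("http://example.com/big.iso", "big.iso") (hypothetical URL and filename) can simply be repeated after an interruption. The opener parameter exists only so that the function can be exercised without a network connection.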

Check out the HTTP 1.1 RFC (2616) to learn more about the meaning of the
headers. You may find a header that is especially useful, and
Python's urllib lets you send any
header you want.


See Also


Documentation on the urllib standard library
module in the Library Reference and
Python in a Nutshell; the HTTP 1.1 RFC
(http://www.ietf.org/rfc/rfc2616.txt).

