Python Cookbook 2nd Edition Jun 1002005 [Electronic resources] (text version)


Recipe 14.6. Resuming the HTTP Download of a File

Credit: Chris Moffitt

Problem

You need to resume an HTTP download
of a file that has been partially transferred.

Solution

Downloads of large files are sometimes interrupted. However, a good
HTTP server that supports the Range header lets you resume the
download from where it was interrupted. The standard Python module
urllib lets you access this functionality almost
seamlessly: you just have to add the required header and intercept
the error code that the server sends to confirm that it will respond
with a partial file. Here is a function, with a little helper class,
to perform this task:

import urllib, os
class myURLOpener(urllib.FancyURLopener):
    """ Subclass to override err 206 (partial file being sent); okay for us """
    def http_error_206(self, url, fp, errcode, errmsg, headers, data=None):
        pass    # Ignore the expected "non-error" code
def getrest(dlFile, fromUrl, verbose=0):
    myUrlclass = myURLOpener( )
    if os.path.exists(dlFile):
        outputFile = open(dlFile, "ab")
        existSize = os.path.getsize(dlFile)
        # If the file exists, then download only the remainder
        myUrlclass.addheader("Range","bytes=%s-" % (existSize))
    else:
        outputFile = open(dlFile, "wb")
        existSize = 0
    webPage = myUrlclass.open(fromUrl)
    if verbose:
        for k, v in webPage.headers.items( ):
            print k, "=", v
    # If we already have the whole file, there is no need to download it again
    numBytes = 0
    webSize = int(webPage.headers['Content-Length'])
    if webSize == existSize:
        if verbose:
            print "File (%s) was already downloaded from URL (%s)" % (
                dlFile, fromUrl)
    else:
        if verbose:
            print "Downloading %d more bytes" % (webSize-existSize)
        while True:
            data = webPage.read(8192)
            if not data:
                break
            outputFile.write(data)
            numBytes = numBytes + len(data)
    webPage.close( )
    outputFile.close( )
    if verbose:
        print "downloaded", numBytes, "bytes from", webPage.url
    return numBytes

Discussion

The HTTP Range header lets the web server know that you want only a
certain range of data to be downloaded, and this recipe takes
advantage of this header. Of course, the server needs to support the
Range header, but since the header is part of the HTTP 1.1
specification, it's widely supported. This recipe
has been tested with Apache 1.3 as the server, but I expect no
problems with other reasonably modern servers.
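To make the header exchange concrete, here is a small sketch of the two strings involved: the Range value a client sends, and the Content-Range value a server returns with a 206 response. The helper names range_header_value and parse_content_range are hypothetical, not part of the recipe or of urllib. On a 206 response, Content-Length describes only the slice being sent, which is worth keeping in mind when comparing it with the size already on disk.

```python
import re

# Matches values such as "bytes 1000-4999/5000" or "bytes 1000-4999/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def range_header_value(offset):
    """Return a Range value asking for everything from offset onward."""
    return "bytes=%d-" % offset


def parse_content_range(value):
    """Split 'bytes first-last/total' into (first, last, total).

    total is None when the server reports the full size as '*'.
    """
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unrecognised Content-Range: %r" % value)
    first, last, total = match.groups()
    return int(first), int(last), (None if total == "*" else int(total))


first, last, total = parse_content_range("bytes 1000-4999/5000")
slice_length = last - first + 1   # bytes carried by this response
```

Here slice_length is 4000 while total is 5000: the response body holds only the missing tail of the file.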

The recipe lets urllib.FancyURLopener do all the
hard work of adding a new header, as well as the normal handshaking.
I had to subclass the standard class from urllib
only to make it known that the error 206 is not really an error in
this case, so you can proceed normally. In the function, I also
perform extra checks to quit the download if I've
already downloaded the entire file.
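For readers on Python 3, where urllib.FancyURLopener is deprecated, the same idea can be sketched with urllib.request. This is an adaptation under stated assumptions, not the recipe author's code: urllib.request.urlopen already accepts any 2xx status, so no subclass is needed for 206, and a 416 reply to a Range request is assumed to mean the local file is already complete. The function name resume_download and its opener parameter (there only so the network call can be swapped out) are hypothetical.

```python
import os
import urllib.error
import urllib.request


def resume_download(url, path, opener=urllib.request.urlopen, chunk_size=8192):
    """Fetch url into path, resuming if path already holds a partial copy.

    Returns the number of bytes written during this call.
    """
    exist_size = os.path.getsize(path) if os.path.exists(path) else 0
    request = urllib.request.Request(url)
    if exist_size:
        # Ask only for the bytes we do not have yet.
        request.add_header("Range", "bytes=%d-" % exist_size)
    try:
        response = opener(request)
    except urllib.error.HTTPError as err:
        if err.code == 416 and exist_size:
            # Range Not Satisfiable: assumed here to mean the local copy
            # already covers the whole resource.
            return 0
        raise
    # 206 means the server honoured the Range header, so append. A plain
    # 200 means it is sending the whole file, so start over.
    resumed = exist_size > 0 and response.status == 206
    written = 0
    with response, open(path, "ab" if resumed else "wb") as out:
        while True:
            data = response.read(chunk_size)
            if not data:
                break
            out.write(data)
            written += len(data)
    return written
```

Checking the status code instead of comparing Content-Length with the local size sidesteps the fact that a partial response reports only the length of the remaining slice.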

Check out the HTTP 1.1 RFC (2616) to learn more about the meaning of the
headers. You may find a header that is especially useful, and
Python's urllib lets you send any
header you want.
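As one illustration of sending extra headers, the sketch below pairs Range with If-Range, which RFC 2616 defines for asking a server to send the partial content only if the resource is unchanged. It uses Python 3's urllib.request.Request, not the urllib module of the recipe, and the function name build_resume_request and the sample ETag value are hypothetical.

```python
import urllib.request


def build_resume_request(url, offset, validator=None):
    """Build a request for the bytes from offset onward.

    validator, if given, is an ETag or Last-Modified value saved from the
    earlier, interrupted response. It is sent as If-Range.
    """
    request = urllib.request.Request(url)
    request.add_header("Range", "bytes=%d-" % offset)
    if validator is not None:
        request.add_header("If-Range", validator)
    return request


req = build_resume_request("http://example.invalid/big.iso", 4096, '"abc123"')
# Request stores header names capitalized, so If-Range is kept as "If-range";
# HTTP header names are case-insensitive, so the server treats them alike.
sent_headers = dict(req.header_items())
```

Nothing is sent over the network until the request is passed to urllib.request.urlopen.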
