Python Cookbook 2nd Edition Jun 1002005 [Electronic resources] (text version)

David Ascher, Alex Martelli, Anna Ravenscroft



Recipe 2.14. Rewinding an Input File to the Beginning


Credit: Andrew Dalke


Problem


You need to make an input file object
(with data coming from a socket or other input file handle)
rewindable back to the beginning so you can read it over.


Solution


Wrap the file object into a suitable class:

from cStringIO import StringIO

class RewindableFile(object):
    """ Wrap a file handle to allow seeks back to the beginning. """
    def __init__(self, input_file):
        """ Wraps input_file into a file-like object with rewind. """
        self.file = input_file
        self.buffer_file = StringIO()
        self.at_start = True
        try:
            self.start = input_file.tell()
        except (IOError, AttributeError):
            self.start = 0
        self._use_buffer = True
    def seek(self, offset, whence=0):
        """ Seek to a given byte position.
        Must be: whence == 0 and offset == self.start
        """
        if whence != 0:
            raise ValueError("whence=%r; expecting 0" % (whence,))
        if offset != self.start:
            raise ValueError("offset=%r; expecting %s" % (offset, self.start))
        self.rewind()
    def rewind(self):
        """ Simplified way to seek back to the beginning. """
        self.buffer_file.seek(0)
        self.at_start = True
    def tell(self):
        """ Return the current position of the file (must be at start). """
        if not self.at_start:
            raise TypeError("RewindableFile can't tell except at start of file")
        return self.start
    def _read(self, size):
        if size < 0:             # read all the way to the end of the file
            # drain the buffer first, so that the write below appends to it
            # instead of overwriting data that is still waiting to be replayed
            x = self.buffer_file.read()
            y = self.file.read()
            if self._use_buffer:
                self.buffer_file.write(y)
            return x + y
        elif size == 0:          # no need to actually read the empty string
            return ''
        x = self.buffer_file.read(size)
        if len(x) < size:
            y = self.file.read(size - len(x))
            if self._use_buffer:
                self.buffer_file.write(y)
            return x + y
        return x
    def read(self, size=-1):
        """ Read up to 'size' bytes from the file.
        Default is -1, which means to read to end of file.
        """
        x = self._read(size)
        if self.at_start and x:
            self.at_start = False
        self._check_no_buffer()
        return x
    def readline(self):
        """ Read a line from the file. """
        # Can we get it out of the buffer_file?
        s = self.buffer_file.readline()
        if s[-1:] == "\n":
            return s
        # No, so read a line from the input file
        t = self.file.readline()
        if self._use_buffer:
            self.buffer_file.write(t)
        self._check_no_buffer()
        return s + t
    def readlines(self):
        """read all remaining lines from the file"""
        return self.read().splitlines(True)
    def _check_no_buffer(self):
        # If 'nobuffer' has been called and we're finished with the buffer file,
        # get rid of the buffer, redirect everything to the original input file.
        if not self._use_buffer and \
               self.buffer_file.tell() == len(self.buffer_file.getvalue()):
            # for top performance, we rebind all relevant methods in self
            for n in 'seek tell read readline readlines'.split():
                setattr(self, n, getattr(self.file, n, None))
            del self.buffer_file
    def nobuffer(self):
        """tell RewindableFile to stop using the buffer once it's exhausted"""
        self._use_buffer = False


Discussion


Sometimes, data coming from a socket or other input file handle
isn't what it was supposed to be. For example,
suppose you are reading from a buggy server, which is supposed to
return an XML stream, but sometimes returns an unformatted error
message instead. (This scenario often occurs because many servers
don't handle incorrect input very well.)

This recipe's RewindableFile class helps you solve this problem.
r = RewindableFile(f) wraps the original input stream f into a
"rewindable file" instance r, which essentially mimics f's behavior
but also provides a buffer. Read requests to r are forwarded to f,
and the data thus read gets appended to a buffer, then returned to
the caller. The buffer contains all the data read so far.

r can be told to rewind, meaning to seek back to the start position.
After a rewind, read requests are served from the buffer until the
buffer is exhausted; from that point on, r gets the data from the
input stream again. The newly read data is also appended to the
buffer.
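
The buffering and rewinding just described can be sketched in a few
lines. The following is a minimal illustration, not the recipe's
class: it assumes Python 3 and byte streams (io.BytesIO rather than
cStringIO), the name TeeReader is hypothetical, and an in-memory
stream stands in for the socket.

```python
import io

class TeeReader:
    """Hypothetical, simplified stand-in for the recipe's class (Python 3, bytes only)."""
    def __init__(self, source):
        self.source = source          # the input stream we cannot seek on
        self.buffer = io.BytesIO()    # copy of everything read so far

    def read(self, size):
        # Serve as much as possible from the buffer first...
        data = self.buffer.read(size)
        missing = size - len(data)
        if missing > 0:
            # ...then fetch the rest from the source and remember it.
            # The buffer position is at its end here, so write() appends.
            fresh = self.source.read(missing)
            self.buffer.write(fresh)
            data += fresh
        return data

    def rewind(self):
        # Only the buffer is repositioned; the source is never touched.
        self.buffer.seek(0)

# An in-memory stream plays the role of the socket (assumption for the demo).
stream = TeeReader(io.BytesIO(b"<?xml version='1.0'?><doc/>"))
first = stream.read(5)       # comes from the source, copied to the buffer
stream.rewind()
replayed = stream.read(5)    # comes from the buffer
beyond = stream.read(8)      # buffer exhausted: back to the source
```

Only positive sizes are handled in this sketch; the recipe's class
also covers reading to end of file, readline, and readlines.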

When buffering is no longer needed, call the nobuffer method of r.
This tells r that, once it's done reading the buffer's current
contents, it can throw the buffer away. After nobuffer is called, the
behavior of seek is no longer defined.

For example, suppose you have a server that gives either an error
message of the form ERROR: cannot do that, or an XML data stream
starting with '<?xml':

import urllib2
import RewindableFile

infile = urllib2.urlopen("http://somewhere/")
infile = RewindableFile.RewindableFile(infile)
s = infile.readline()
if s.startswith("ERROR:"):
    raise Exception(s[:-1])
infile.seek(0)
infile.nobuffer()   # Don't buffer the data any more
# ... process the XML from infile ...
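
For comparison only, and not as part of this recipe: in Python 3, the
same check of the first bytes can often be made with the peek method
of io.BufferedReader, which looks ahead without consuming input. The
sketch below rests on assumptions: open_checked is a hypothetical
helper name, io.BytesIO stands in for the network stream, and peek
may return fewer bytes than requested on a real socket, so production
code would have to allow for a short result.

```python
import io

def open_checked(raw):
    """Return a reader still positioned at the start, or raise on an error reply."""
    reader = io.BufferedReader(raw)
    head = reader.peek(6)[:6]        # look ahead; nothing is consumed
    if head == b"ERROR:":
        # Consume the line only in order to build the exception message.
        line = reader.readline().decode("ascii", "replace")
        raise RuntimeError(line.rstrip("\n"))
    return reader

# A well-formed reply: the XML declaration is still there to be read.
good = open_checked(io.BytesIO(b"<?xml version='1.0'?>\n<doc/>\n"))
xml_bytes = good.read()

# An error reply: the helper raises with the server's message.
try:
    open_checked(io.BytesIO(b"ERROR: cannot do that\n"))
except RuntimeError as exc:
    message = str(exc)
```

Unlike RewindableFile, this approach can only look ahead as far as
the reader's internal buffer, so it suits short prefixes such as the
one tested here.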

One sometimes-useful Python idiom is not supported by the class in
this recipe: you can't reliably stash away the bound methods of a
RewindableFile instance. (If you don't know what bound methods are,
no problem, of course, since in that case you surely won't want to
stash them anywhere!) The reason for this limitation is that, once
nobuffer has been called and the buffer is exhausted, the
RewindableFile code reassigns the input file's read, readlines, etc.,
methods, as instance variables of self. This gives slightly better
performance, at the cost of not supporting the infrequently used
idiom of saving bound methods. See Recipe 6.11 for another example of
a similar technique, where an instance irreversibly changes its own
methods.
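
The limitation is easiest to see on a toy class. The sketch below is
an assumption-laden illustration in Python 3, not the recipe's code:
SelfReplacing is a hypothetical class that rebinds its own read the
first time it is called, and the b"[via wrapper]" marker exists only
to show which code path ran.

```python
import io

class SelfReplacing:
    """Hypothetical toy: after the first call, read is rebound to the wrapped file's read."""
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def read(self, size=-1):
        data = b"[via wrapper]" + self.wrapped.read(size)
        # The instance attribute set here shadows the class's method from now on.
        self.read = self.wrapped.read
        return data

obj = SelfReplacing(io.BytesIO(b"abcdef"))
stashed = obj.read       # bound method saved before any rebinding
first = obj.read(2)      # runs the wrapper's read, which then rebinds itself
second = obj.read(2)     # attribute lookup now finds the wrapped file's read
third = stashed(2)       # the stashed method is still the old wrapper code
```

The stashed method keeps running the old code. In the recipe itself
the consequence is harsher, since the old methods rely on the
buffer_file attribute, which _check_no_buffer deletes.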

The tell method, which gives the current location of a file, can be
called on an instance of RewindableFile only at the start position,
that is, right after wrapping and before any reading (or again right
after a rewind), to get the beginning byte location. When wrapping,
RewindableFile tries to get the real position from the wrapped file,
and uses that as the beginning location. If the wrapped file does not
support tell, then the RewindableFile implementation of tell just
returns 0.
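
The fallback for files without a usable tell can be shown in
isolation. This is a small Python 3 sketch under stated assumptions:
start_position and NoTell are hypothetical names, and OSError is
caught because IOError is an alias of it in Python 3.

```python
import io

def start_position(stream):
    """Return stream.tell() when available, else 0, as the recipe's __init__ does."""
    try:
        return stream.tell()
    except (OSError, AttributeError):
        return 0

class NoTell:
    """Hypothetical stream-like object that has no tell method at all."""
    def read(self, size=-1):
        return b""

seekable = io.BytesIO(b"0123456789")
seekable.read(4)                            # advance before "wrapping"
pos_seekable = start_position(seekable)     # real position of the wrapped file
pos_missing = start_position(NoTell())      # no tell: fall back to 0
```

The first call shows why the start need not be 0: a file that has
already been partly read reports its real position, and seek then
accepts only that offset.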


See Also


Site http://www.dalkescientific.com/Python/ for the latest version of
this recipe's code; Library Reference and Python in a Nutshell docs
on file objects and module cStringIO; Recipe 6.11 for another example
of an instance effecting an irreversible behavior change on itself by
rebinding its methods.
