7.23. Reading an Entire Line Without Blocking
7.23.1. Problem
You need to
read a line of data from a handle that select says
is ready for reading, but you can't use Perl's normal
<FH> operation (readline)
in conjunction with select because
<FH> may buffer extra data and
select doesn't know about those buffers.
7.23.2. Solution
Use the
following sysreadline function, like this:
    $line = sysreadline(SOME_HANDLE);
In case only a partial line has been sent, include a number of
seconds to wait:
    $line = sysreadline(SOME_HANDLE, TIMEOUT);
Here's the function to do that:
    use IO::Handle;
    use IO::Select;
    use Symbol qw(qualify_to_ref);

    # true once the string ends in a newline
    sub at_eol($) { $_[0] =~ /\n\z/ }

    sub sysreadline(*;$) {
        my($handle, $timeout) = @_;
        $handle = qualify_to_ref($handle, caller());
        # no timeout given, or a negative one, means wait forever
        my $infinitely_patient = (@_ == 1 || $timeout < 0);
        my $start_time = time();
        my $selector = IO::Select->new();
        $selector->add($handle);
        my $line = "";
    SLEEP:
        until (at_eol($line)) {
            unless ($infinitely_patient) {
                return $line if time() > ($start_time + $timeout);
            }
            # sleep only 1 second before checking again
            next SLEEP unless $selector->can_read(1.0);
    INPUT_READY:
            while ($selector->can_read(0.0)) {
                # read one byte at a time without blocking
                my $was_blocking = $handle->blocking(0);
    CHAR:       while (sysread($handle, my $nextbyte, 1)) {
                    $line .= $nextbyte;
                    last CHAR if $nextbyte eq "\n";
                }
                $handle->blocking($was_blocking);
                # if incomplete line, keep trying
                next SLEEP unless at_eol($line);
                last INPUT_READY;
            }
        }
        return $line;
    }
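The non-blocking, byte-at-a-time read is the core of the function. The following is a minimal sketch of that step in isolation, not part of the recipe itself: drain_line is a hypothetical helper name, a pipe stands in for a socket, and a Unix-like system is assumed (where a non-blocking sysread with no data fails with EAGAIN). Unlike the loop above, this sketch also reports end-of-file as a separate state.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: append bytes to $$line_ref until a newline arrives
# or no more data is available right now.
# Returns "line" (newline seen), "pending" (would block), or "eof".
sub drain_line {
    my ($fh, $line_ref) = @_;
    my $was_blocking = $fh->blocking(0);      # switch to non-blocking mode
    my $state = "pending";
    while (1) {
        my $got = sysread($fh, my $byte, 1);
        if (!defined $got) {
            # EAGAIN means "nothing to read yet"; anything else is a real error
            die "sysread: $!" unless $!{EAGAIN};
            last;
        }
        if ($got == 0) { $state = "eof"; last }
        $$line_ref .= $byte;
        if ($byte eq "\n") { $state = "line"; last }
    }
    $fh->blocking($was_blocking);             # restore the previous mode
    return $state;
}

pipe(my $reader, my $writer) or die "pipe: $!";
$writer->autoflush(1);

my $line = "";
print $writer "partial";                      # no newline yet
my $first_state = drain_line($reader, \$line);

print $writer " line\n";                      # the rest of the line arrives
my $second_state = drain_line($reader, \$line);

close $writer;
my $rest = "";
my $third_state = drain_line($reader, \$rest);
```

Because the partial text accumulates in the caller's variable, a later call picks up where the earlier one stopped, which is the same idea sysreadline relies on when it loops back to SLEEP.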
7.23.3. Discussion
As described in Recipe 7.22, to determine
whether the operating system has data on a particular handle for your
process to read, you can use either Perl's built-in
select function or the can_read
method from the standard IO::Select
module.

Although you can reasonably use functions like
sysread and recv to get data,
you can't use the buffered functions like readline
(that is, <FH>), read, or
getc. Also, even the unbuffered input functions
might still block. If someone connects and sends a character but
never sends a newline, your program will block in a
<FH>, which expects its input to end in a
newline—or in whatever you've assigned to the
$/ variable.

We circumvent this by setting the handle to non-blocking mode and
then reading in characters until we find "\n".
This removes the need for the blocking <FH>
call. The sysreadline function in the Solution
takes an optional second argument so you don't have to wait forever
in case you get a partial line and nothing more.

A far more serious issue is that select reports
only whether the operating system's low-level file descriptor is
available for I/O. It's not reliable in the general case to mix calls
to four-argument select with calls to any of the
buffered I/O functions listed in this chapter's Introduction (read,
<FH>, seek,
tell, etc.). Instead, you must use
sysread—and sysseek if
you want to reposition the filehandle within the file.

The reason is that select's response does not
reflect any user-level buffering in your own process's address space
once the kernel has transferred the data. But the <FH> operator
(really Perl's readline() function) still uses your underlying
buffered I/O
system. If two lines were waiting, select would
report true only once. You'd read the first line and leave the second
one in the buffer. But the next call to select
would block because, as far as the kernel is concerned, it's already
given you all of the data it had. That second line, now hidden from
your kernel, sits unread in an input buffer that's solely in user
space.
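The hidden-buffer problem is easy to reproduce. The following is a minimal sketch rather than anything from the recipe: it assumes a Unix-like system, uses a pipe in place of a socket, and sends two short lines in one write so that both reach the kernel together.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe: $!";
$writer->autoflush(1);
print $writer "first\nsecond\n";      # two lines, delivered together

my $selector = IO::Select->new($reader);

# The kernel holds data, so the handle is reported as readable.
my @ready_before = $selector->can_read(0);

# The buffered read pulls everything available into Perl's own buffer
# but returns only the first line.
my $first = <$reader>;

# The kernel now has nothing left to offer, so select reports nothing,
# even though "second\n" is still waiting in user space.
my @ready_after = $selector->can_read(0);

# This returns immediately from the buffer; select never knew about it.
my $second = <$reader>;
```

A program that waited on can_read with no timeout before every <FH> would stall after the first line here until more data arrived, which is why the Solution reads with sysread instead.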
7.23.4. See Also
The sysread function in
perlfunc(1) and in Chapter 29 of
Programming Perl; the documentation for the
standard modules Symbol, IO::Handle, and IO::Select (also in Chapter
32 of Programming Perl); Recipe 7.22