8.15. Reading Fixed-Length Records
8.15.1. Problem
You want to read a file whose records have a fixed length.
8.15.2. Solution
# $RECORDSIZE is the length of a record, in bytes.
# $TEMPLATE is the unpack template for the record
# FILE is the file to read from
# @FIELDS is an array, one element per field
until ( eof(FILE) ) {
    read(FILE, $record, $RECORDSIZE) == $RECORDSIZE
        or die "short read\n";
    @FIELDS = unpack($TEMPLATE, $record);
}
8.15.3. Discussion
Because the file in question is not a text file, you can't use
<FH> or IO::Handle's
getline method to read records. Instead, you must
simply read a particular number of bytes into a
variable. This variable contains one record's data, which you decode
using unpack with the appropriate format.
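As a minimal, self-contained sketch of that read-and-unpack loop, the following builds two records in memory and reads them back. The template "A8 N" (an 8-character name followed by a 32-bit big-endian count) and the sample data are assumptions made up for illustration, and an in-memory filehandle stands in for a real file:

```perl
use strict;
use warnings;

# Hypothetical layout: 8-char space-padded name, 32-bit big-endian count
my $TEMPLATE   = "A8 N";
my $RECORDSIZE = length pack($TEMPLATE, "", 0);

# Two sample records, concatenated as they would appear in a file
my $data = pack($TEMPLATE, "alice", 3) . pack($TEMPLATE, "bob", 42);

# An in-memory filehandle stands in for a real file here
open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
binmode($fh);                       # no newline translation on binary data

my @rows;
until (eof($fh)) {
    my $got = read($fh, my $record, $RECORDSIZE);
    die "read failed: $!" unless defined $got;
    die "short read\n"    unless $got == $RECORDSIZE;
    push @rows, [ unpack($TEMPLATE, $record) ];
}
close($fh);

printf "%-8s %d\n", @$_ for @rows;
```

With a real file, you would open it by name instead; the call to binmode matters on systems that distinguish text files from binary ones.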
For binary data, the catch is determining that format. When reading
data written by a C program, this can mean peeking at C include files
or manpages describing the structure layout, and this requires
knowledge of C. It also requires that you become unnaturally chummy
with your C compiler, because otherwise it's hard to predict field
padding and alignment (such as the x2 in the
format used in Recipe 8.24). If you're lucky
enough to be on a Berkeley Unix system or a system supporting
gcc, then you may be able to use the
c2ph tool distributed with Perl to cajole your C
compiler into helping you with this.
The tailwtmp program at the end of this chapter
uses the format described in utmp(5) under
Linux, and works on its /var/log/wtmp and
/var/run/utmp files. Once you commit to working
in binary format, machine dependencies creep in fast. The program probably
won't work unaltered on your system, but the procedure is still
illustrative. Here is the relevant layout from the C include file on
Linux:
#define UT_LINESIZE 12
#define UT_NAMESIZE 8
#define UT_HOSTSIZE 16
struct utmp {                       /* here are the pack template codes */
    short  ut_type;                 /* s for short, must be padded */
    pid_t  ut_pid;                  /* i for integer */
    char   ut_line[UT_LINESIZE];    /* A12 for 12-char string */
    char   ut_id[2];                /* A2, but need x2 for alignment */
    time_t ut_time;                 /* l for long */
    char   ut_user[UT_NAMESIZE];    /* A8 for 8-char string */
    char   ut_host[UT_HOSTSIZE];    /* A16 for 16-char string */
    long   ut_addr;                 /* l for long */
};
Once you figure out the binary layout, feed that (in this case,
"s x2 i A12 A2 x2 l A8 A16 l") to pack with an empty field
list to determine the record's size. Remember to check the return
value of read to make sure you got the number of
bytes you asked for.
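The size calculation might be sketched as follows, using the template derived from the struct above. The result depends on the platform, because "i" is the native integer size:

```perl
use strict;
use warnings;

# Template derived from the Linux utmp layout shown earlier
my $TEMPLATE = "s x2 i A12 A2 x2 l A8 A16 l";

# Packing an empty list yields one blank record; its length is the record size
my $RECORDSIZE = length pack($TEMPLATE, ());

print "record size: $RECORDSIZE bytes\n";
```

When pack runs out of arguments it treats the missing ones as empty strings, so the packed result is a zero-filled, space-padded record of the right length.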
If your records are text strings, use the "a" or
"A" unpack templates.
Fixed-length records are useful in that the nth
record begins at byte offset SIZE * (n-1) in
the file, where SIZE is the size of a single
record. See the indexing code in Recipe 8.8
for an example.
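That offset arithmetic makes random access simple. The helper below, read_record, is a hypothetical name used only for this sketch; it seeks to record n (counting from 1) and reads exactly one record, again using an in-memory filehandle with made-up data:

```perl
use strict;
use warnings;

# Hypothetical helper: fetch record number $n (counting from 1)
sub read_record {
    my ($fh, $n, $size) = @_;
    seek($fh, $size * ($n - 1), 0) or die "seek failed: $!";
    my $got = read($fh, my $record, $size);
    die "read failed: $!" unless defined $got;
    die "short read\n"    unless $got == $size;
    return $record;
}

# Three 4-byte records held in memory for demonstration
my $data = "AAAABBBBCCCC";
open(my $fh, '<', \$data) or die "can't open in-memory file: $!";
binmode($fh);

print read_record($fh, 2, 4), "\n";    # prints BBBB
```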
8.15.4. See Also
The unpack, pack, and
read functions in perlfunc(1)
and in Chapter 29 of Programming Perl; Recipe 1.1