Chapter 3: SQL Data Types
Overview
Previously we
defined database as an organized collection of information. Not only does that
mean that data have to be organized according to a company's business rules,
but also the database organization should reflect the nature of the
information. Databases can store dollar amounts, quantities, names, date and
time values, binary files, and more. These can be further classified by type,
which reflects the "nature" of the data: numbers, characters, dates,
etc.
Note | Data type is a characteristic of a database table column that determines what kind of data it can hold. |
One can ask: why do we need data types at
all? Wouldn't it be easier simply have one uniform data type and store
everything, let's say, in the form of character strings?There are many reasons why we don't do that.
Some of them are historical. For example, when relational databases were born
in the late twentieth century, hard disk space and memory storage were at
premium, so the idea was to store everything as efficiently as possible.
Already existing programming languages had some built-in rules for how to store
different types of data. For example, any English character (plus special
characters and digits) could be represented using its ASCII equivalent and the
necessary storage for it was one byte (more about ASCII later in this chapter).
Numbers are traditionally stored in the form of binary strings (native to
computer architecture). To represent a number from negative 32,768 to positive
32,767, two bytes (or sixteen bits) of storage are sufficient (216).
But if we used ASCII characters to represent numbers, we would need six bytes
to store any integer greater than 9,999 (five bytes for digits, and one for the
plus or minus sign), five bytes for whole numbers greater than 999 (four bytes
for digits, one byte for the sign), and so on. Now imagine — if we have a
million of records, we could save four million bytes (about 4M). Sounds like
almost nothing today, but back in the 1970s that was incredibly large storage
space. The principle of effectiveness is still in place, of course, but now we
are talking different scales.
Note | One byte consists of eight bits. Each bit is a binary number that can either be 0 or 1. All information is stored in memory or on the hard disk in form of ones and zeroes, representing the only two states computers understand: zero means no signal, and one indicates the presence of the signal. |
Another reason is logical consistency. Every
data type has its own rules, sort order, relations with other data types,
implicit conversion rules, and so on. It is definitely easier to work with sets
of like values, say dates, rather than with a mixture of dates, numbers, and
character strings. Try comparing library shelves where all materials are sorted
and classified (fiction is in one room, kids' literature in another, audio
books in their special area, videotapes somewhere else) with a pile of
chaotically mixed books, tapes, white papers, and CDs and think what would you
prefer for finding information.And the last thing to mention — some modern
data types (particularly movie files) are too large and too complicated to
store them in a traditional way. Now we are going to discuss existing SQL data
types in more details.