Picking Random Rows Some databases are so large, and queries on them so complex, that often it's impractical (and unnecessary) to retrieve all of the data relevant to a query. If you're interested in finding an overall trend or pattern, for example, an approximate answer within some margin of error usually will do. One way to speed such queries is to select a random sample of rows. An efficient sample can improve performance by orders of magnitude yet still yield accurate results.Standard SQL's TABLESAMPLE clause returns a random subset of rows. DB2 and SQL Server 2005 support TABLESAMPLE, and Oracle has something similar. For the other DBMSes, use a (nonstandard) function that returns a uniform random number between 0 and 1 (Table 9.1 ).Table 9.1. Randomization Features | DBMS | CLAUSE OR FUNCTION |
|---|
| Access | RND() function | | SQL Server 2000 | RAND() function | | SQL Server 2005 | TABLESAMPLE clause | | Oracle | SAMPLE clause or DBMS_RANDOM package | | DB2 | TABLESAMPLE clause | | MySQL | RAND() function | | PostgreSQL | RANDOM() function | Listing 9.42 randomly picks about 25% (0.25) of the rows from the sample-database table titles. If necessary, change RAND() to the function that appears in Table 9.1 for your DBMS. For Oracle, use Listing 9.42 . For SQL Server 2005 and DB2, use Listing 9.42 .Listing 9.42a. Select about 25% percent of the rows in the table titles at random. See Figure 9.42 for a possible result.
 Listing 9.42b. Select about 25% percent of the rows in the table titles at random (Oracle only). See Figure 9.42 for a possible result.
 Listing 9.42c. Select about 25% percent of the rows in the table titles at random (SQL Server 2005 and DB2 only). See Figure 9.42 for a possible result.
Limiting the Number of Rows Returned" earlier in this chapter.
title_id type sales -------- ---------- ----- T03 computer 25667 T04 psychology 13001 T11 psychology 94123
Tips- Randomizers take an optional seed argument or setting that sets the starting value for a random-number sequence. Identical seeds yield identical sequences (handy for testing). By default, the DBMS sets the seed based on the system time to generate different sequences every time.
Listing 9.42 won't run correctly on Microsoft Access or Microsoft SQL Server 2000 because the random-number function returns the same "random" number for each selected row. In Access, use Visual Basic to pick random rows. For SQL Server 2000, see SQLTeam.com's article "Returning Rows in Random Order" (www.sqlteam.com/item.asp?ItemID=217).
Selecting Every n th Row Instead of picking random rows, you can pick every n th row by using a modulo expression:- m MOD n (Microsoft Access )
- m % n (Microsoft SQL Server )
- MOD(m,n ) (other DBMSes )
This expression returns the remainder of m divided by n. For example, MOD(20,6) is 2 because 20 equals (3 x 6) + 2. MOD(a ,2) is 0 if a is an even number.The condition MOD(rownumber,n ) = 0 picks every n th row, where rownumber is a column of consecutive integers or row identifiers. This Oracle query picks every third row in a table, for example: SELECT * FROM table WHERE (ROWID,0) IN (SELECT ROWID, MOD(ROWNUM,3) FROM table );
Note that rownumber imposes a row order that doesn't exist implicitly in a relational-database table. |
 |