SELECT [select_options] select_expression, ... [INTO {OUTFILE | DUMPFILE} 'file_name' export_options] [FROM table_references [WHERE where_definition] [GROUP BY {col_name | col_pos } [ASC | DESC], ...] [HAVING where_definition] [ORDER BY {col_name | col_pos } [ASC | DESC], ...] [LIMIT [offset,] rows]]
SELECT is the most complex query in SQL, and is used to perform all data retrieval operations. This query supports the following select_options, which may be specified in any sensible combination by simply listing them separated by spaces:
STRAIGHT_JOIN
Forces MySQL to join multiple tables specified in the table_references argument in the order specified there. If you think MySQL’s query optimizer is doing it the ‘slow way’, this argument lets you override it. For more information on joins, see "Joins" below.
SQL_SMALL_RESULT
This option shouldn’t be needed in MySQL 3.23 or later; however, it remains available for compatibility. This option informs MySQL that you are expecting a relatively small result set from a query that uses the DISTINCT option or the GROUP BY clause, so it uses the faster, but more memory-intensive, method of generating a temporary table in memory to hold the result set as it is created.
SQL_BIG_RESULT
Along the same lines as SQL_SMALL_RESULT, this option informs MySQL that you are expecting a large number of results from a query that makes use of DISTINCT or GROUP BY. When it creates the result set, MySQL will create on disk, as needed, a temporary table in which the results are sorted. This is a quicker solution than generating an index on the temporary table, which would take longer to update for each result row in a large result set.
SQL_BUFFER_RESULT
This option forces MySQL to store the result set in a temporary table. This frees up the tables that were used in the query for use by other processes, while the result set is transmitted to the client.
HIGH_PRIORITY
This option does exactly what it says—it assigns a high priority to the SELECT query. Normally, if a query is waiting to update a table, all read-only queries (such as SELECT) must yield to it. A SELECT HIGH_PRIORITY, however, will go first.
DISTINCT | DISCTINCTROW | ALL
Any one of these three options may be used to specify the treatment of duplicate rows in the result set. ALL (the default) specifies that all duplicate rows appear in the result set, while DISTINCT and DISTINCTROW (they have the same effect) specify that duplicate rows should be eliminated from the result set.
select_expression defines a column of the result set to be returned by the query. Typically, this is a table column name, and may be specified as col_name, tbl_name.col_name, or db_name.tbl_name.col_name, depending on how specific you need to be for MySQL to know to which particular column you are referring. select_expressions need not refer to a database column—a simple mathematical formula such as 1 + 1 or a complex expression calculated with MySQL functions may also be used. Here’s an example of the latter, which will give the date one month from now in the form “January 1st, 2002”:
SELECT DATE_FORMAT( DATE_ADD(CURDATE(),INTERVAL 1 MONTH),'%M %D, %Y')
select_expressions may also contain an alias, or assigned name for the result column, if the expression is followed with [AS] alias (the AS is entirely optional). This expression must be used when referring to that column elsewhere in the query (e.g. in WHERE and ORDER BY clauses), as follows:
SELECT JokeDate AS JD FROM Jokes ORDER BY JD ASC
MySQL lets you use an INTO clause to output the results of a query into a file instead of returning them to the client. The most typical use of this is to export the contents of a table into a text file containing comma-separated values (CSV). Here’s an example:
SELECT * INTO OUTFILE '/home/user/myTable.txt' FIELDS TERMINATED BY ',' OPTIONALY ENCLOSED BY '"' LINES TERMINATED BY '\n' FROM myTable
The file to which the results are dumped must not exist beforehand, or this query will fail. This restriction prevents an SQL query from being used to overwrite critical operating system files. The created file will also be world-readable on systems that support file security, so consider this before you export sensitive data to a text file that anyone on the system can read.
DUMPFILE may be used instead of OUTFILE to write to the file only a single row, without row or column delimiters. It can be used, for example, to dump a BLOB stored in the table to a file (SELECT blobCol INTO DUMPFILE ...). For complete information on the INTO clause, see the MySQL Reference Manual. For information on reading data back from a text file, see "LOAD DATA INFILE".
The FROM clause contains a list of tables from which the rows composing the result set should be formed. At its most basic, table_references is a comma-separated list of one or more tables, which may be assigned aliases with or without using AS as described above for select_expression. If you specify more than one table name, you are performing a join. These are discussed in "Joins" below.
The where_definition in the WHERE clause sets the condition for a row to be included in the table of results sent in response to the SELECT query. This may be a simple condition (e.g. id=5) or a complex expression that makes use of MySQL functions and combines multiple conditions using Boolean operators (AND, OR, NOT).
The GROUP BY clause lets you specify one or more columns (by name, alias, or column position, where 1 is the first column in the result set) for which rows with equal values should be collapsed into single rows in the result set. This clause should usually be used in combination with the MySQL grouping functions such as COUNT, MAX, and AVG, described in "MySQL Functions", to produce result rows that give summary information about the groups produced. By default, the grouped results are sorted in ascending order of the grouped column(s); however, the ASC or DESC argument may be added following each column reference to explicitly sort the results in ascending or descending order for that column, respectively. Results are sorted by the first column listed, and then tying sets of rows are sorted by the second, and so on.
Note that the WHERE clause is processed before GROUP BY grouping occurs, so conditions in the WHERE clause may not refer to columns produced by the grouping operation. To impose conditions on the post-grouping result set, you should use the HAVING clause instead. This clause’s syntax is identical to that of the WHERE clause, except the conditions specified here are processed just prior to returning the set of results, and are not optimized in any way. For this reason, you should use the WHERE clause whenever possible. For more information on GROUP BY and the HAVING clause, see "Advanced SQL".
The ORDER BY clause lets you sort results according the values in one or more rows before they are returned. As for the GROUP BY clause, each column may be identified by a column name, alias, or position (where 1 is the first column in the result set), and each column may have an ASC or DESC argument to specify that sorting occurs in ascending or descending order, respectively (ascending is the default). Rows are initially sorted by the first column listed, and then tying sets of rows are sorted by the second, and so on.
The LIMIT clause instructs the query to return only a portion of the results it would normally generate. In the simple case, LIMIT n returns only the first n rows of the complete result set. You can also specify an offset by using the form LIMIT x, n. In this case, up to n rows will be returned, beginning from the xth row of the complete result set. The first row corresponds to x = 0, the second to x = 1 and so on.
As described above, the FROM clause of a SELECT query lets you specify the tables that are combined to create the result set. When multiple tables are combined in this way, it is called a join. MySQL supports several types of joins, as defined by the following supported syntaxes for the table_references component of the FROM clause above:
table_ref table_references, table_ref table_references [CROSS] JOIN table_ref table_references INNER JOIN table_ref join_condition table_references STRAIGHT_JOIN table_ref table_references LEFT [OUTER] JOIN table_ref join_condition { oj table_ref LEFT OUTER JOIN table_ref ON cond_expr } table_references NATURAL [LEFT [OUTER]] JOIN table_ref table_references RIGHT [OUTER] JOIN table_ref join_condition table_references NATURAL [RIGHT [OUTER]] JOIN table_ref
Where table_ref is defined as:
table_name [[AS] alias] [USE INDEX (key_list)] [IGNORE INDEX (key_list)]
and join_condition is defined as one of the following:
ON cond_expr USING (column_list)
Don't be disheartened by the sheer variety of join types; I'll explain how each of them works below.
The most basic type of join, an inner join, produces rows made up of all possible pairings of the rows from the first table with the second. You can perform an inner join in MySQL either by separating the table names with a comma (,) or with the words JOIN, CROSS JOIN, or INNER JOIN (these are all equivalent).
Normally, the WHERE clause of the SELECT query is used to specify a condition to narrow down which of the combined rows are actually returned (e.g. to match up a primary key in the first table with a column in the second); however, when the INNER JOIN syntax is used, the ON form of the join_condition can play this role as well. As a final alternative, the USING (column_list) form of join_condition lets you specify columns that must match between the two tables. For example,
SELECT * FROM t1 INNER JOIN t2 USING (aid)
is equivalent to
SELECT * FROM t1 INNER JOIN t2 ON t1.aid = t2.aid
and also
SELECT * FROM t1, t2 WHERE t1.aid = t2.aid
STRAIGHT_JOIN works the same as an inner join, except that the tables are processed in the order listed (left first, then right). Normally MySQL selects the order that will produce the shortest processing time, but if you think you know better, you can use a STRAIGHT_JOIN.
The second type of join is an outer join, which is accomplished in MySQL with LEFT/RIGHT [OUTER] JOIN (OUTER is completely optional, and has no effect). In a LEFT outer join, any row in the left-hand table that has no matching rows in the right-hand table (as defined by the join_condition), will be listed as a single row in the result set. NULL values will appear in all the columns that come from the right-hand table.
The { oj ... } syntax is equivalent to a standard left outer join; it is included for compatibility with other ODBC databases.
RIGHT outer joins work in the same way as LEFT outer joins, except in this case, it is the table on the right whose entries are always included, even if they do not have matching entries in the left-hand table. Since RIGHT outer joins are nonstandard, it is usually best to stick to LEFT outer joins for cross-database compatibility.
For some practical examples of outer joins and when they are useful, see "Advanced SQL".
Natural joins are kind of 'automatic', in that they will automatically match up rows based on column names that are found to match between the two tables. Thus, if a table called Jokes has an AID column that refers to entries in an Authors table whose primary key is another AID column, you can perform a join of these two tables on that column very simply (assuming there are no other columns with identical names in the two tables):
SELECT * FROM Jokes NATURAL JOIN Authors