Start with Data Analysis Prior to CodingWhen any data retrieval query is being run, it is compulsory to specify where the data is coming from. Specifying which tables are going to be accessed can be done with the FROM clause of the SELECT statement. FROM table lists do not necessarily have to be just standard tables. It is also valid to specify derived tables, joined tables, views, user-defined functions that return tables, and system functions that return tables.A derived table is a subquery that retrieves rows from the database. A SELECT clause within a FROM clause of another SELECT, referred to by an alias, produces a derived table. The resulting rows take on a construct similar to that of a standard table. Derived tables produce a performance hit in most situations, and where possible, a join should be coded to replace the derived table.The name of a view is often substituted for a table name. A view, stored as a query on the server, often provides necessary column restrictions and application considerations that makes it a beneficial data source.User-defined functions can generate recordset output and in that form be used in place of a table name. A function can declare an internal table variable, insert rows into the variable, and then return the variable as its return value. A class of user-defined functions known as in-line functions return the resultset of a SELECT statement as a variable of type table.System functions often generate a set of records that can be envisioned as a table. These rowset functions return an object that can be used in place of a table reference in a T-SQL statement.Data sources can be located throughout the organization. They can be located at many sites, be on different computers and are created using many data management tools. SQL Server has the functionality available to connect to and use data anywhere it resides. Of course the data in its raw format won't be presentable and will require formatting. After a data source has been determined, the coding can begin. Making Data PresentableOften data returned from the database is not presented in a manner that the end user can identify with and access in a reasonable manner. Many SQL Server functions allow for the alteration of the data type or more appropriate presentation to the user. If data is not in a type presentable to the user, the CONVERT() and CAST() functions will be the first ones used.Data needs to be converted from one form to another when you are using diverse functions or data types that don't match up with the type needed by a particular function. Data can be converted from one data type to another with the CONVERT() and CAST() functions. CAST() is based on the SQL-92 standard and is favored over CONVERT(), although both CONVERT() and CAST() have their particular strengths in handling individual types of data.You will need to supply both the expression that needs to be converted and the data type to convert the given expression to perform conversion operation. SQL Server automatically converts certain data from one data type to another. For example, if a smallint is compared to an int, the smallint is automatically converted to an int before the comparison proceeds. These are called implicit conversions because you don't have to use the CAST() or CONVERT() functions. When data is converted to a new data type and the data is too large for the new data type to handle, SQL Server displays an asterisk or identifies the problem with an error message, depending on the data types involved.The CONVERT() function transforms data from one data type to another.CONVERT() is also used extensively with date operations to format a date. The following example converts the Price column into a CHAR data type so that you can concatenate it with another string expression. Remember, concatenation can be done only using strings, so it must be converted: The CAST() function is similar to the CONVERT() function in that it converts data. It is preferred over the CONVERT() function because it's based on the SQL-92 standard.CAST() is usually preferred over CONVERT() for operations with more advanced processing requirements. |
When working in character strings, whether converting, concatenating, or applying string data to another application, you may need to trim off space at the beginning or end. Often data is space filled. Within T-SQL there are two functions for trimming spaces: RTRIM() and LTRIM(). If you want to trim all spaces, these can be combined, and though it looks awkward, it does accomplish the task.
SELECT 'The book costs ' + CAST(Price AS CHAR(5)) FROM Titles
By formatting data you're presenting the information so that more value is produced in the finished product. Formatting information can be as simple as placing output in the correct sequence; however, formatting can also mean the use of these procedures for the analysis and display of data.
LTRIM(String)
RTRIM(String)
LTRIM(RTRIM(String))
Functions Make Data Work
There are functions for making strings more usable. There are functions for performing mathematics. There are functions that work the calendar and functions that tell time, functions that move data around and functions that rhyme. Functions perform tasks against the data, they perform activities, and they produce results.All functions are either deterministic or nondeterministic. The specification of whether a function is deterministic or nondeterministic is called the determinism of the function. Deterministic functions always return the same result anytime they are called with a specific set of input values. Nondeterministic functions may return different results each time they are called with a specific set of input values.An example of a deterministic function would be DATEADD(), a built-in function that always returns the same result for any given set of argument values. GEtdATE() is not deterministic because it is always invoked with the same argument, yet the value it returns changes each time it is executed.
Be careful with the use of appropriate functions because nondeterministic functions cannot be used in a lot of processes. An index cannot be created on a computed column if the expression references any nondeterministic functions, and a clustered index cannot be created on a view if the view references any nondeterministic functions. |
Aggregate Collective Functions
Aggregate functions are functions that provide summary data about sets. Questions like "How many rows are in that table?" "How many widgets did we sell last week?" and "What is the average price we charged for a widget?" are all answered with aggregate functions. Some are more commonly used than others, and anyone familiar with working in a spreadsheet environment has seen these before.Aggregate functions take one of three types of arguments. Some take a simple wildcard (*). This means either that the operation doesn't apply to rows or that it should apply to all rows. In COUNT(*) the wildcard returns the number of rows in the table. The number of rows that are in the table is independent of any individual column. The function is applied across the entire group, regardless of the content of each row.All the functions take a column name as an argument, and then the aggregate applies only to that column. COUNT(PersonID) returns the number of non-null PersonIDs in the table. Some functions enable you to apply the function to distinct values. COUNT (DISTINCT PersonID) provides a count of the number of distinct IDs in the column, ignoring duplications.
The AVG() Function
The AVG() function returns the average value for a given column. It requires a column name, and optionally you can use DISTINCT() to get an average for just the distinct values in the table. Here's an example that determines the average size of an order from the sample Sales table:
SELECT AVG(QtyPurchased) FROM Sales
The COUNT() and COUNT_BIG() Functions
The COUNT() function returns an integer representing the number of rows in the table. The COUNT_BIG() function does the same thing, but it returns a number of type bigint, instead used for analyzing large datasets.
The MAX() and MIN() Functions
MAX() and MIN() are simple. They return the maximum or the minimum value in the set. The highest and lowest values within the set being evaluated will be returned in the data type of the column being used for the analysis.
The SUM() Function
The SUM() function is nearly as simpleit just returns the sum of all the values in the group. SELECT SUM(Quantity) FROM [Order Details] will provide a total for the number of products sold.
Functions Used for Deviation and Variance
Without delving too deep into statistical operations, we'll say that these sets of functions are useful in analyzing populations. The VAR() and VARP() functions calculate the variance of the group. The STDDEV() and STDDEVP() functions calculate the standard deviation of the group. Variance and standard deviation are used in statistics to determine the spread of a sample set. A sample set has a high variance and a high standard deviation if there is a wide range of values in the sample set. If there is a small range of values, the variance and standard deviation is smaller. The STDEVP() and STDEV() functions differ in that the STDEV() determines the sample standard deviation, whereas STDEVP() returns the population standard deviation. VAR() and VARP() are similar.
Checksum Operations
A checksum is a number generated from a data value that essentially summarizes the data value. If you store that value and then change the data, you can test the checksum to quickly notice this change. Checksums are used frequently in data storage to denote data corruption.
If you're working with large datasets, though, the capability to compare checksums saves an immense amount of time over comparing strings together. |
This returns a checksum for each row based on the entire content of the row. Now, if you want to use a checksum for the entire table, you can do this:
SELECT CHECKSUM(*) FROM
This gives you a checksum for the entire table. Now, look at a way to use this information. The key thing to remember about using a checksum is that it's a snapshot. To find differences, you've got to have a checksum before and a checksum after the changes occur.
SELECT CHECKSUM_AGG(CHECKSUM(*)) FROM [Order Details]
Of Calendars and Clocks
In T-SQL, as with many other programming languages, date and time manipulation have been given special importance by the many functions available. Dates and times are implemented as datetime and smalldatetime data types and need to be treated with special functions. To first examine dates, you must realize that dates cannot be added or subtracted using the regular operators. The date functions available in SQL Server enable you to change and manipulate dates easily.
The DATEADD() Function
The DATEADD() function adds a number to the part of the date you specify, such as month or day, and then returns the new datetime value. The function adds an interval to a date specified. The following example adds three days to the ShipDate column of a BackOrders table:
The portion of the date that you work with allows for several different entries, as presented in Table 6.1.
SELECT DATEADD(day, 3, ShipDate) FROM BackOrders
Date-Part | Abbreviations |
---|---|
Year | yy, yyyy |
Quarter | qq, q |
Month | mm, m |
Dayofyear | dy, y |
Day | dd, d |
Week | wk, ww |
Hour | hh |
Minute | mi, n |
Second | ss, s |
Millisecond | ms |
The DATEDIFF() Function
The DATEDIFF() function is used in a similar manner. This function, however, calculates the amount of time in date-parts between the second and first of two dates you specify. DATEDIFF() subtracts the start date from the end date to produce the result. Look at the following example:
The method of counting across boundaries, such as minutes, seconds, and milliseconds, makes the result given by DATEDIFF() consistent across all data types. The result is a signed integer value equal to the number of date-part boundaries crossed between the first and second date. For example, the number of weeks between Monday, March 5, and Monday, March 12, is 1.
SELECT PubDate,
DATEDIFF(year, PubDate, getdate()) AS 'Difference'
FROM Titles
The GETDATE() Function
To get the current date, use the GETDATE() function. This function can be useful when you are producing reports that need to be dated. The GETDATE() function returns the current system date and can be used in a SELECT statement. You can use GEtdATE() in designing a report to have the current date and time printed every time the report is produced. GETDATE() is also useful for functions such as logging the time a transaction occurred on an account. You can use GETDATE() anywhere to return the current system date.
The DATENAME() Function
The DATENAME() function can be used in determining part of a date in the form of a string. You may, for instance, need to know specifically the day or month of a publication. The DATENAME() function returns a character string representing the specified date-part from the date. Here's an example of using DATENAME() to find the name of the current month:
The DATEPART() and DATENAME() functions produce the specified part of a datetime value (the year, quarter, day, hour, and so on) as either an integer or a string. Because smalldatetime is accurate only to the minute, when a smalldatetime value is used with either of these functions, the seconds and milliseconds returned are zero.
Select datename(mm,getdate()) as 'month'
Using DATEPART()
DATEPART() is similar to DATENAME() except that it returns an integer value indicating the date rather than a character string. The following is an example of using DATEPART() to find the current month and display it numerically:
Many useful date functions give the developer flexibility in dealing with what is traditionally one of the more pesky data types: dates. With history reflecting unfavorably on the storage of date information, it is important to consider this data type to be infinite in that dates need to remain unique as time progresses.
SELECT datepart(mm,getdate()) AS 'month'
Using Mathematical Functions in T-SQL
In most programming languages numeric data types have always had useful functionality supplied with the coding environment. SQL Server is no exception with its adoption of a standard set of mathematical functions.A mathematical function performs a math operation on numeric expressions and returns the result of the operation. Math functions operate on any of the SQL Server numeric data types (decimal, integer, float, real, money, smallmoney, smallint, and tinyint). The initial precision of built-in operations on the float data type data is six decimal places.By default, a number passed to a mathematical function is interpreted as a decimal data type. The CAST() or CONVERT() functions can be used to change the data type to something else, such as a float.You can use the diverse set of math functions provided by T-SQL to fulfill many database needs. SQL Server provides numerous mathematical functions so that you can perform the most complex calculations possible. These numerous functions are provided in most programming language environments. Some of the more commonly used math functions are listed in Table 6.2.
Discussion of mathematical functions would be an involved endeavor for this medium. This book does not set out to make mathematical geniuses out of every SQL developer. To become fully versed in all mathematical functions would require a significant education in the science of mathematics.Character functions interact against alphanumeric data in a similar fashion as numeric data interacts with mathematical functions. Some of these were discussed within the realm of formatting data, but there are a few more worth looking into.
SELECT pi() AS 'PI' , abs(-44) AS 'Absolute', power(2,3) AS 'Power'
String It Up
SQL Server also provides a full array of character (string) functions. Character functions enable you to easily manipulate attributes that are character specific. Most of the functions provided in Table 6.3 are normally used, and therefore you should regularly practice using them.
The SUBSTRING() Function
It may be that you want to return a part of a string rather than the whole string. The SUBSTRING() function, which returns a part of a character or binary string, can be used for this task. The SUBSTRING() function, in a WHERE clause, may perform a table scan where an index was supposed to function (if an index was implemented). When the SUBSTRING() function does not include the first letter of the column being searched, a table scan is performed. This is not efficient processing. The following example illustrates the use of SUBSTRING() in a concatenation process:
All the string functions have their own place where they can be applied to assist with string manipulations. For the most part, you should find them very straightforward to use and not the focus of exam elements because of their ease of use. The final area of SQL Server functionality left to discuss is a series of functions that allow for various system-level interactions.
SELECT AU_FName + ' ' + AU_LName AS 'Full Name',
SUBSTRING (AU_FName, 1,1) + SUBSTRING (au_lname, 1,1) AS 'Initials'
FROM Authors
Working with System-Level Functions
As you have seen in the preceding two compilations, many functions are supplied to perform mathematical and character-manipulation operations. You can also use a third category of functions with the SELECT list known as system-specific functions. You can use system functions to retrieve special system or database information through T-SQL and the SELECT statement. Table 6.4 shows a compilation of the numerous system functions available in T-SQL.
System functions, information schema views, or the system stored procedures can be used to gain access to system information without querying the system tables directly. System tables can change significantly between versions of SQL Server.SQL Server provides system stored procedures or information schema views for obtaining information about the properties of data, such as the type of data in a column (numeric, text, and so on) or the length of a column. This type of information is called metadata and is maintained by SQL Server for all server and database objects. Metadata can be used to find out information about the structure of data, the contents of a server, or information that specifies the design of objects.Although it is possible to obtain data by querying any of the system tables directly, the system tables may not provide the information required in the future. It is recommended that system stored procedures, system functions, and information schema views be used because the contents of the system tables may change in future releases.
SELECT ISNULL (price, 0.0000), price FROM titles
Information Schema Views
Information schema views provide a method independent of the system tables to view metadata. These views enable applications to work properly even though significant changes may have been made to the system tables and more changes may be made in the future. An application that uses the views rather than a direct query against the system tables should function in the same manner in the future as it does in the current SQL Server release.The information schema views included in SQL Server conform to the SQL-92 Standard definition for the INFORMATION_SCHEMA. Names used in the SQL-92 standard for these views are different from those used by SQL Server, though the names from SQL Server can equivalently be mapped to those of the standard. The following list shows the SQL-92 names and the SQL Server equivalents:A SQL-92 "Catalog" is a SQL Server "Database.""Schema" in SQL-92 is an "Owner" in SQL Server."Object" is the same in both SQL-92 and SQL Server.A "Domain" in SQL-92 is a user-defined data type in SQL Server.When retrieving metadata from the information schema views, you must use a qualified name that includes the INFORMATION_SCHEMA in the position where you usually specify the username. For example:
For more information on the variety of metadata that can be obtained through the use of information schema views, use the Index tab of SQL Server Books Online. When you type information schema, the index shows links to all the appropriate views. Many system stored procedures can also be used to find information about server and database objects. With many of the procedures, however, you can also perform actions against the server, whereas information schema views are used solely to obtain metadata.
SELECT * FROM Northwind.INFORMATION_SCHEMA.TABLES
System Stored Procedures
Many administrative and informational activities in SQL Server can be accomplished through the use of SQL Server's many system stored procedures. System stored procedures are available to perform various activities, from obtaining information about server settings and objects to managing processes on the server to performing maintenance activities and much more.
It is not possible to cover all the procedures in this book, and SQL Server Books Online has full definitions and examples for these procedures. At various points throughout the book, references will be made to those procedures you are likely to find on the exam and others that will serve useful purposes in the future. |
The following are typical situations that are tested on the exam:Date conversion using appropriate date functionsString concatenation, including building and parsing functionsSystem application design, in which system functions are used to programmatically control or manipulate the software functionality |