Data Access with SQL and ADOAn Introduction to ActiveX Data Objects (ADO)ADO is the data access technology designed by Microsoft to make data access as simple and transparent as possible, regardless of how or where the data is stored. Previous data access technologies, most notably DAO and ODBC, are still in use. However, these technologies have been officially deprecated, meaning that Microsoft will no longer enhance them and may drop support for them at any time. For several years, as of this writing, ADO has been the sole designated data access technology for now and the future. Data Access Technology DefinedA data access technology such as ADO can be thought of as a connector between your application and the data-storage mechanism it uses. ADO allows your application to "talk" to databases in the VBA programming language. You do still need to understand the use of SQL, which will be discussed throughout the rest of this chapter, but you do not need to understand the low-level APIs required to translate SQL-based requests between VBA and the target data-storage application. ADO abstracts these low-level requirements into a set of common objects, properties and methods that are used in the same way no matter what data-storage application you're working with.Chapter 11 Interfaces.) Providers for the most commonly used databases are packaged with ADO.If you need to use ADO to access a database for which a provider has not been supplied, there are two potential options. You can obtain a native OLE DB provider for that database from a third-party vendor if one is available. If no third-party provider is available you can fall back on the OLE DB provider for ODBC. ODBC was an early industry standard low-level data access technology pioneered by Microsoft. Any database application that supports ODBC can be accessed by ADO using the OLE DB provider for ODBC. We discuss the topic of providers from an ADO point of view in more detail later in this section. ADO ObjectsADO is built on a very simple yet extremely flexible object model. For the vast majority of purposes you will require only three top-level objects: Connection, Command and Recordset, and three collections that belong to these top-level objects: Errors, Parameters and Fields. The top-level objects are not organized hierarchically. Rather they can each be created and used along with or independently of the others. As shown in Figure 13-18, the core ADO object model can be visualized as a triangle among the three top-level objects with the collections attached to the top-level object they belong to. Figure 13-18. The Core ADO Object Model![]() Top-Level ObjectsConnection The primary purpose of the Connection object is exactly what its name implies: connecting to a database. In keeping with the flexibility of ADO, however, a Connection object can be used to execute SQL statements or stored procedures directly.Command The Command object is typically used to execute SQL statements or stored procedures that do not return data and/or require parameters.Recordset The Recordset object is used to execute SQL statements or stored procedures that will return data to your application. CollectionsErrors The Errors collection is used by ADO to return any errors generated by the provider used to connect to the data source. These errors are not ADO errors. ADO objects generate runtime errors.Parameters The Parameters collection is used to store the variable arguments that will be passed to a SQL statement or stored procedure. This allows you to construct a reusable Command object containing all the static information to which you simply pass different parameters for each call.Fields The Fields collection is used to access the data contained in a Recordset object. A recordset can be thought of as containing a table in which only one row is active at a time. The Fields collection exposes the values in the columns corresponding to the active row. Connecting to Data SourcesBefore you can do anything with the data in a data source you must establish a connection to the data source. As with most tasks involving ADO, there is more than one way to do this. A connection can be specified by setting individual properties of an ADO Connection object or by grouping all the required properties into a connection string. We will demonstrate how to establish connections using connection strings.A connection string tells ADO what provider to use, what data source to connect to and any other details required to establish the connection (a user name and password, for example). A connection string consists of a series of property-value pairs, each followed by a semicolon. The first property specified in your connection string should be the Provider property. This tells ADO what provider you want to use. The provider determines the structure of the rest of the connection string. Each provider has a unique set of additional properties, some of which are required and some of which are optional.If no provider is specified, the default provider is the OLE DB provider for ODBC. This is for backward compatibility with old-style ODBC connection strings. This is rarely the provider you want to use, so always specify the provider in your connection strings. Sample connection strings for several common data sources are shown below. AccessThe OLE DB Provider for Microsoft Jet is used to connect to Access databases. The syntax for a typical connection string to an Access database is shown in Listing 13-1. Listing 13-1. An Access Database Connection StringAs you can see, the first argument specifies the provider you want to use. After you have specified the OLE DB Provider for Microsoft Jet, the only other property that is always required is the Data Source property. This tells OLE DB where the Access database you want to connect to is located. If your Access database is password-protected, you can pass it the user name and password required to open it using the User ID and Password properties.NOTEMicrosoft has deprecated the OLE DB Provider for Microsoft Jet. This means that it still works but is no longer actively distributed and that support for it will end at some point in the future. ADO version 2.5 was the last version of ADO with which this provider was distributed. If you or your users do not have this version of ADO installed, you can download it from the Microsoft Web site at [ http://msdn.microsoft.com/downloads/ ]. SQL ServerThe OLE DB Provider for SQL Server is used to connect to SQL Server databases. The base connection string syntax for this provider is shown in Listing 13-2. Listing 13-2. Base Connection String Syntax for SQL ServerThe Data Source property specifies the server and instance you want to connect to. SQL Server 2000 and later can have multiple server instances, each containing multiple databases. In prior versions of SQL Server, you would just specify the server name. In either case, the server name is typically the same as the name of the computer on which SQL Server is installed. The Initial Catalog property is used to specify the database you want to connect to.There are two common syntax variations used when connecting to a SQL Server database. The variation you need to use depends on how the login security has been configured on the SQL Server you are connecting to. Unlike Access, SQL Server always requires some sort of login credentials. The two types of SQL Server security are standard security and Windows integrated security. Standard security means that you log in to SQL Server with a user name and password specific to SQL Server. Integrated security means your Windows network login credentials are used to validate your connection to SQL Server. The connection string used for standard security is shown in Listing 13-3. The connection string used for integrated security is shown in Listing 13-4. Listing 13-3. Connection String for SQL Server with Standard Security
Listing 13-4. Connection String for SQL Server with Integrated Security
ExcelYou can connect to an Excel workbook using either the OLE DB Provider for ODBC or the OLE DB Provider for Microsoft Jet. We demonstrate the use of the latter. The syntax for this is shown in Listing 13-5. Listing 13-5. Connection String for an Excel WorkbookNote the Extended Properties property. This property consists of two values. The first tells the Jet provider you are connecting to an Excel workbook. The second tells the provider how to interpret the data in the sheet or range you specify in your SQL statements. A value of HDR=YES indicates the first row of data contains the column names. If this value is omitted or if it is specified as HDR=NO, the provider will assume all values in the table are data and there are no column names. Using the Connection StringAfter you have built your connection string, you can assign it to the ConnectionString property of the ADO Connection object or pass it directly to various properties and methods of other top-level ADO objects. Listing 13-6 demonstrates how to open a connection to an Access database using the Connection object.NOTETo use ADO from your Excel VBA project, you must set a reference to the Microsoft ActiveX Data Objects 2.x library, where x represents the highest numbered library available on your system. Listing 13-6. Opening a Connection to an Access DatabaseConnecting to other data sources works exactly the same way. We show other ways that the connection string can be used in later sections.Note that our ADO Connection object was declared as a global variable. This is required to enable a very important ADO feature called connection pooling. Creating and tearing down a connection to a database is a resource-intensive process. If your application will be performing a large number of database operations, you do not want ADO to have to create and destroy a connection for each one of them. To enable connection pooling, you must do the following:Declare your ADO Connection object as a global or module-level variable.Create your ADO connection on application startup and then close it.Each time a procedure needs to use the connection it should open the connection, use it and then close it.Do not set the Connection object to Nothing until your application is shutting down. By following these procedures you will allow ADO (or more accurately, the underlying OLE DB provider) to hold the database connection open behind the scenes and provide your procedures with an existing connection on request rather than forcing it to create a new connection each time one is needed. Error Handling ConnectionsWhen talking theoretically about making database connections, the connection attempts always succeed and you continue along your merry way. In the real world, this is not always the case. When you are attempting to access a database that is not located on the same computer as your code, any number of things can cause the connection attempt to fail.Chapter 12 VBA Error Handling. Listing 13-7. Error Handling a Connection AttemptThe error indicating a connection failure is a provider error, so we must examine the Connection object's Errors collection in order to determine whether a connection attempt failed. The Errors collection may contain multiple errors, but the first item in the collection is almost always the one that describes the root cause of the error. Unlike most collections that you'll encounter in VBA, the Connection object Errors collection is indexed beginning with zero.The process of retrying connection attempts can be lengthy. Therefore we've added status bar messages to keep our users updated. Otherwise they may think the application is frozen and attempt to end it with Ctrl+Alt+Del. Data Access TechniquesThere are four fundamental data manipulation operations you can perform on the data in a data source. You can retrieve existing data, you can add new data, you can modify existing data and you can delete existing data. In this section we demonstrate how to perform these operations using SQL and ADO.Any data-manipulation operation can be performed on any data source by passing plain-text SQL statements directly to the data source using ADO. Relational databases also allow you to store prefabricated SQL statements, known as stored queries or stored procedures. When working with client-server databases especially, if you have the option you should always prefer stored procedures. They are far more efficient than plain-text SQL statements and you can create far more complex data access logic within them.For the plain-text SQL statements we demonstrate below we will show only the most common syntax elements. Actual SQL statements can be much more complex and will have database-specific extensions available that are only valid when you are working with that database. To keep this section focused, all the examples shown use Access as the data source. The nice thing about ADO is that its usage is almost identical regardless of the data source you are working with. However, please see our previous book, the Excel 2002 VBA Programmer's Reference, for detailed examples of data access using SQL Server, Excel and even text files. Retrieving DataIn this section we demonstrate how to retrieve data using SQL and ADO. Excel also provides a number of automated features for retrieving and analyzing data. These will be covered in Chapter 14 Data Manipulation Techniques. The SQL SELECT statement is used to accomplish data retrieval. The basic syntax of the SELECT statement is shown in Listing 13-8. Listing 13-8. The SQL SELECT StatementWe've placed each clause of the SELECT statement on a separate line for readability purposes. In reality, white space has no effect on a SQL statement and you can arrange it on a single line or multiple lines as you see fit. Just make sure that at least one space separates each element in the statement from the next and the clauses appear in the order shown above. The meaning of each clause is the following:SELECT This clause is a comma-delimited list of columns you want to retrieve.FROM This clause contains the names of one or more tables that contain the data you want to retrieve. If multiple tables are specified, you will need to perform a join. There are various types of joins available in SQL and a discussion of them all is beyond the scope of this chapter. However, the most common join type is the INNER JOIN, which uses the following syntax: We demonstrate an INNER JOIN in the example below.WHERE This clause contains any criteria that restrict what data should be returned. It is typically expressed as a Boolean condition in the form of column_name = value. The WHERE clause is optional and can be omitted if you just want to retrieve all the data in a table.ORDER BY This clause indicates which columns you want the data sorted by. The ORDER BY clause is optional if you are not concerned about the order in which the data is returned to you. For our SELECT example, we will use our normalized BillableHours table shown in Figure 13-8 and the related Consultants table shown in Figure 13-4. We will assume the database is located in the same directory as the workbook that is calling it. Listing 13-9 shows a plain-text SQL query that retrieves all of the billable hours records for consultant Rob Bovey. Listing 13-9. Retrieving Data from AccessThere are several interesting techniques to point out in Listing 13-9. First examine the SQL used to specify the data we want to retrieve. This shows a little bit of the flexibility available in the SQL SELECT statement. In the SELECT clause, we show how you can concatenate the values of multiple database columns into a single result column.We have also joined two tables in the FROM clause. We need to do this because even though the data we want is located in the BillableHours table, the consultant's name that we want to restrict our data to is located in the Consultants table. In the FROM clause, we link the two tables on their common column, the ConsultantID column. Recall from our previous discussion on relationships that the ConsultantID column is the primary key of the Consultants table and a foreign key in the BillableHours table.Next, notice we have used multiple restriction conditions in the WHERE clause. We need to specify both the value of the FirstName and LastName columns to uniquely identify the consultant whose data we want to retrieve. To do this, we just link the two Boolean conditions with the SQL AND operator.Last, notice how we make use of the connection string in this example. Rather than creating an ADO Connection object, we simply pass the connection string directly to the ADO Recordset object's Open method. This is a very useful technique when you will only be performing a single query and you don't need advanced features of the ADO Connection object such as connection pooling.After we've opened the recordset, we check to see whether we got any data. This is accomplished by examining the recordset's EOF property. EOF stands for end of file. If no data was returned, the recordset's row pointer will be pointing to the end of the recordset and the EOF property will be True, otherwise the row pointer will be pointing to the first record in the recordset and the value of EOF will be False.If we successfully retrieved the data we asked for, we dump it onto Sheet1 using the CopyFromRecordset method of the Excel Range object. The CopyFromRecordset method provides an extremely fast method for extracting the data from a recordset onto a worksheet. The CopyFromRecordset method returns only the data, not the column names. An alternative method that adds the column names and then adds the data by looping the recordset one row at a time is shown in the code fragment in Listing 13-10. Listing 13-10. Looping a Recordset by Rows
Inserting DataInserting new data into a database is accomplished using the SQL INSERT statement. The basic syntax of the INSERT statement is shown in Listing 13-11. An explanation of each clause of the statement follows. Listing 13-11. The SQL INSERT StatementINSERT INTO This clause includes the name of the table into which you are inserting the data and a comma-delimited list of columns that will be receiving data enclosed in parenthesis. You are not required to list the columns if you will be supplying data in the VALUES clause for every required column in the table in the same order in which they appear in the table. Certain types of columns cannot be included in the INSERT clause. These include any AutoNumber column in an Access database table.VALUES This clause includes a comma-delimited list of values to be inserted into the table enclosed in parenthesis. When a list of columns is provided in the INSERT INTO clause, there must be a corresponding value for each of these columns in the VALUES clause. To illustrate the INSERT statement, we'll add a new consultant to our Consultants table. The code to accomplish this is shown in Listing 13-12. Listing 13-12. Inserting Data into AccessIn this example we've demonstrated how you can execute SQL statements using the ADO Connection object alone. The Connection object has an Execute method that we've used to insert our new record into the Consultants table. The first argument to the Execute method is the SQL statement we want to execute. The second argument is a return value that tells us the number of records affected after the method has completed. We don't need this because we know our INSERT statement only inserts a single record. If the insert fails for some reason a runtime error will be generated by ADO.The last argument tells the Execute method what we are giving it in the first argument as well as how to process the call. adCmdText is an enumeration member that tells the Execute method we are passing it a SQL text string. This is the same value we passed to the last argument of the Recordset.Open method in Listing 13-9 above. If this value is not specified, ADO will determine for itself what is being passed in the first argument, but telling it to begin with saves time. adExecuteNoRecords is an enumeration member that tells the Execute method how to process the call. In this case, it means the Execute method should not return a recordset. The Execute method will always return a recordset unless told to do otherwise, even if a recordset it not logically required and is therefore empty. Telling the Execute method we don't need a recordset returned saves the time and resources required to create and return the unnecessary empty recordset. Updating DataUpdating data in a database is accomplished using the SQL UPDATE statement. The basic syntax of the UPDATE statement is shown in Listing 13-13. An explanation of each clause of the statement follows. Listing 13-13. The SQL UPDATE StatementUPDATE This clause contains the name of the table that holds the data to be updated.SET This clause provides the column name to be updated and the value it will be updated with.WHERE This clause contains the criterion that identifies the row to be updated. It is expressed as a Boolean condition in the form of column_name = value. The WHERE clause is technically optional, but beware. If you do not supply a WHERE clause in the UPDATE statement then every row in the table will be updated with the specified value. This is rarely what you want to do and it is impossible to reverse. To illustrate the UPDATE statement, we'll use an Access stored parameter query to modify the name of one of the clients in our Clients table from Figure 13-6. The Access stored query is called qryUpdateClient and its contents are shown in Listing 13-14. Listing 13-14. The qryUpdateClient Parameter QueryAs you can see, this query takes two parameters. The CName parameter is used to modify the client name and the ID parameter is used to uniquely identify the client whose name we want to update. The code to execute this query is shown in Listing 13-15. Listing 13-15. Updating Access DataWe are again bypassing the ADO Connection object, this time assigning the connection string directly to the Command object's ActiveConnection property. Also notice that instead of a plain-text SQL statement we are now using the name of the Access stored query. Where the type of the plain-text queries we used previously was adCmdText, the type of a stored query is adCmdStoredProc.This example illustrates the use of the ADO Command object's Parameters collection. For each parameter in the Access parameter query, we must create a Parameter object and add it to the Parameters collection. These parameters must be created and added to the collection in exactly the same order as they appear in the SQL of the Access parameter query. We create and store the parameters in a single line of code by passing the result of the Command object's CreateParameter method to the Parameters collection Append method. We use four arguments of the CreateParameter method:Name This is the name of the parameter. It must be the same name that appears in the Access parameter query for the parameter.Type This is an enumeration member specifying the data type of the parameter. adVarChar means a Text parameter and adInteger means a Long Integer parameter.Direction This is an enumeration member indicating which direction the parameter is used to pass data. adParamInput indicates the parameter will be used to pass data from our code to the database.Size This value indicates the size of the parameter. It is only required for Text data types. In the case of our CName parameter, the column being updated has a maximum width of 50 characters. For numeric data types like the ID parameter, you can simply pass zero to this argument. After we have created the parameters and added them to our Command object's Parameters collection, we load them with the values we want to send to the database and then execute the stored query. The first argument to the Command object's Execute method is a return value indicating the number of records affected. We are using the lAffected variable to retrieve this value and checking it to ensure that exactly one record was updated. As in our insert example, we do not require a recordset to be returned, so we're using the third argument of the Command object to prevent this from happening. Deleting DataDeleting data in a database is accomplished using the SQL DELETE statement. The basic syntax of the DELETE statement is shown in Listing 13-16. An explanation of each clause of the statement follows. Listing 13-16. The SQL DELETE StatementDELETE FROM This clause contains the name of the table that holds the data to be deleted.WHERE This clause contains the criterion that identifies the row to be deleted. It is expressed as a Boolean condition in the form of column_name = value. The WHERE clause is technically optional, but beware. If you do not supply a WHERE clause in the DELETE statement, then every row in the table will be deleted. This is rarely what you want to do and it is impossible to reverse. To illustrate the DELETE statement, we'll remove the consultant we added to the Consultants table with our insert example in Listing 13-12. The code to accomplish this is shown in Listing 13-17. Listing 13-17. Deleting Access DataAll of the techniques used here should be familiar from previous examples so we will not cover them in any detail. |