Macro-OptimizationThe vast majority of the performance improvement will come from restructuring the code to use a more efficient algorithm. This part of the chapter highlights some of the things to look for and provides alternative suggestions for doing the same thing. Whether the suggestions shown here are better or worse than your existing code will depend very much on the situation, particularly on the amount and type of data being processed. The slowest parts of a procedure will invariably involve either external data retrieval or repeatedly looping through sets of data. Large loops are an opportunity for optimization; any improvement (however minor) that can be made inside a loop is a gain many times over. The performance of external data retrieval is usually dependant on the server and network performance and is something we have little control over. One thing we can do to minimize the effect of poor database performance is to load data (such as static lookup lists and so forth) when the application starts instead of when it is required. Users often find it more acceptable to have a longer startup time than sluggish performance after the application is loaded. PreprocessBefore reading the rest of this paragraph, start Excel and write the fastest possible VBA routine you can to calculate how many 1s there are in the binary representation of the numbers 0 to 255. How did you do it? Did you use the Dec2Bin function from the Analysis Toolpak? Did you use a recursive routine? Did you repeatedly divide by 2 and check whether the result was odd or even? Or did you work out the numbers yourself, hard-code the results in a VBA array and just read them at runtime, as shown in Listing 17-4? Listing 17-4. How Many 1s Are There in a Binary Number?Function CountTheOnes(ByVal iValue As Integer) As Integer Static vaOneCount As Variant 'Initialize the array If IsEmpty(vaOneCount) Then vaOneCount = Array(0, 1, 1, 2, 1, ... , 7, 8) End If 'Read the result CountTheOnes = vaOneCount(iValue) End Function By doing as much processing as possible when developing the application, our applications don't need to do the processing at runtime. Check the OrderThe best routines are those whose performance doesn't vary significantly with the volume of data being processed. For example, a routine that copied a set of data from a Variant array into a worksheet range, calculated the worksheet and returned a result would take approximately the same time whether there were ten or a thousand elements in the array. Such routines are said to have an order of 1 and are very hard to achieve in practice. The next best are those that vary linearly with the volume of data, such as one or more sequential For...Next loops through the data. These routines have an order of n, so if we have ten times as much data, the routine is likely to take approximately ten times as long. With a little thought and work, most routines can be reduced to order n. Nested loops result in routines that are very sensitive to the volume of data being processed. A routine with two nested loops has an order n2 and each extra level of nesting adds an extra order to the routine. If these routines are given ten times as much data to process, they're likely to take 100 or 1,000 times as long. If such a routine normally takes 1 second to complete, it might take 15 minutes to process 10 times as much data. In most cases, nested loops are just a quick and easy way to code an algorithm that could be redesigned as multiple sequential loops through the data. Note that nested loops will often be spread over many different procedures, for example where ProcedureA loops through an array and calls ProcedureB for each element, which itself loops through another array to process the element. As an example, consider the routine in Listing 17-5, which compares two arrays and processes any items that are in both. Listing 17-5. Compare Two ArraysSub ProcessLists(asArray1() As String, asArray2() As String) Dim lIndex1 As Long Dim lIndex2 As Long 'Loop through the first array For lIndex1 = LBound(asArray1) To UBound(asArray1) 'Loop through the second array For lIndex2 = LBound(asArray2) To UBound(asArray2) 'Do they match? If asArray1(lIndex1) = asArray2(lIndex2) Then 'Yes, so process it End If Next Next End Sub Without thinking too hard about how to improve this routine, we might be tempted to just add an Exit For to jump out of the inner loop after we've found a match, but that still leaves the routine essentially of order n2. If the two arrays are sorted, we can reduce this to order n by looping through both arrays within the same loop, as shown in Listing 17-6. Listing 17-6. Process Both Arrays Within One LoopSub ProcessLists(asArray1() As String, asArray2() As String) Dim lIndex1 As Long Dim lIndex2 As Long Dim iComp As Integer lIndex1 = LBound(asArray1) lIndex2 = LBound(asArray2) 'Loop through both arrays together Do 'Compare the elements from both arrays iComp = StrComp(asArray1(lIndex1), asArray2(lIndex2)) If iComp = 0 Then 'A match, so process it Debug.Print asArray1(lIndex1) 'And advance in both arrays lIndex1 = lIndex1 + 1 lIndex2 = lIndex2 + 1 ElseIf iComp = -1 Then 'Item in array1 is before item in array2, 'so move down array1 and check again lIndex1 = lIndex1 + 1 ElseIf iComp = 1 Then 'Item in array1 is after item in array2, 'so move down array2 and check again lIndex2 = lIndex2 + 1 End If 'Stop when we reach the end of one of the arrays Loop Until lIndex1 > UBound(asArray1) Or _ lIndex2 > UBound(asArray2) End Sub If the arrays are not sorted, it will probably be quicker to sort them both beforehand, then use the above routine. If the output has to be in a specific order (preventing us from sorting both arrays), we could sort asArray2 and use a binary search routine to see whether the string exists, or use a Dictionary object (see later for an example of each). Tighten the LoopHaving replaced the nested loops with more efficient algorithms, the next task is to make the code within the remaining loop as tight as possible. As well as implementing all the micro-optimizations shown later in this chapter, the primary goal is to ensure that in each iteration, we only execute the minimum amount of code possible. Returning to the question above "Does B have to follow A?" it is common to see loops that contain code to calculate intermediate results, followed by some tests to check whether the intermediate result should be used (because this reflects the order in which we originally thought about the routine). If we turn the routine on its head, we can do the tests first and only calculate the intermediate results for those elements we know we'll be using. Fast VBA AlgorithmsQuickSortThe QuickSort routine is one of the fastest sorting algorithms and should be used whenever you want to sort an array. It works by doing the following: Select one element from the array, typically taken from the middle Scan through the array moving everything that should come before the selected element to the bottom of the array and everything that should come after the selected element to the top of the array Call itself to sort the bottom half Call itself to sort the top half For best performance, you should have a number of QuickSort routines for specific data types, such as that shown in Listing 17-7 for one-dimensional string arrays. Listing 17-7. A QuickSort Routine for One-Dimensional String Arrays'************************************************************** '* '* FUNCTION NAME: QUICKSORT STRING ARRAY - 1D '* '* DESCRIPTION: Sorts the passed array into required order. '* The array must be a 1D string array. '* '* PARAMETERS: '* asArray A 1D string array of values to sort '* bSortAscending True = ascending order. '* iLow1 The first item to sort between '* iHigh1 The last item to sort between '* '************************************************************** Sub QuickSortString1D(asArray() As String, _ Optional bSortAscending As Boolean = True, _ Optional iLow1, Optional iHigh1) 'Dimension variables Dim iLow2 As Long, iHigh2 As Long Dim sKey As String Dim sSwap As String On Error GoTo PtrExit 'If not provided, sort the entire array If IsMissing(iLow1) Then iLow1 = LBound(asArray) If IsMissing(iHigh1) Then iHigh1 = UBound(asArray) 'Set new extremes to old extremes iLow2 = iLow1 iHigh2 = iHigh1 'Get value of array item in middle of new extremes sKey = asArray((iLow1 + iHigh1) \ 2) 'Loop for all the items in the array between the extremes Do While iLow2 < iHigh2 If bSortAscending Then 'Find the first item that is greater than the mid-point Do While asArray(iLow2) < sKey And iLow2 < iHigh1 iLow2 = iLow2 + 1 Loop 'Find the last item that is less than the mid-point Do While asArray(iHigh2) > sKey And iHigh2 > iLow1 iHigh2 = iHigh2 - 1 Loop Else 'Find the first item that is less than the mid-point Do While asArray(iLow2) > sKey And iLow2 < iHigh1 iLow2 = iLow2 + 1 Loop 'Find the last item that is greater than the mid-point Do While asArray(iHigh2) < sKey And iHigh2 > iLow1 iHigh2 = iHigh2 - 1 Loop End If 'If the two items are in the wrong order, swap them If iLow2 < iHigh2 Then sSwap = asArray(iLow2) asArray(iLow2) = asArray(iHigh2) asArray(iHigh2) = sSwap End If 'If the pointers are not together, do the next item If iLow2 <= iHigh2 Then iLow2 = iLow2 + 1 iHigh2 = iHigh2 - 1 End If Loop 'Recurse to sort the lower half of the extremes If iHigh2 > iLow1 Then QuickSortString1D asArray, bSortAscending, iLow1, iHigh2 End If 'Recurse to sort the upper half of the extremes If iLow2 < iHigh1 Then QuickSortString1D asArray, bSortAscending, iLow2, iHigh1 End If PtrExit: End Sub Binary SearchA binary search is a very quick way to locate an item within a sorted array. It works by doing the following: Compare the item to look for with the element in the middle of the array. If they match, we found it. If the item to look for is lower than the middle of the array, throw away the top half. If the item to look for is higher than the middle of the array, throw away the bottom half. Repeat 15, cutting the array in half each time until we find the item or run out of array. A binary search routine, such as that shown in Listing 17-8, is practically insensitive to the size of the array passed in, as doubling the size of the array results in only one extra iteration of the routine. Listing 17-8. A Binary Search Algorithm'************************************************************** '* Function Name: BinarySearchString '* '* Inputs: '* sLookFor - The string to search for in the array '* asArray - An array of strings, sorted in ascending order '* lCompareMethod - either vbBinaryCompare or vbTextCompare '* Defaults to vbTextCompare '* lNotFound - The value to return if the text isn't found '* Defaults to -1 '* '* Outputs: The position in the array, or -1 if not found '* '* Purpose: Uses a binary search algorithm to quickly locate a '* string within a sorted array of strings '* '************************************************************** Function BinarySearchString(ByRef sLookFor As String, _ ByRef asArray() As String, _ Optional ByVal lMethod As VbCompareMethod = vbTextCompare, _ Optional ByVal lNotFound As Long = -1) As Long Dim lLow As Long, lMid As Long, lHigh As Long Dim iComp As Integer On Error GoTo PTR_Exit 'Assume we didn't find it BinarySearchString = lNotFound 'Get the starting positions lLow = LBound(asArray) lHigh = UBound(asArray) Do 'Find the midpoint of the array lMid = (lLow + lHigh) \ 2 'Compare the mid-point element to the string 'being searched for iComp = StrComp(asArray(lMid), sLookFor, lMethod) If iComp = 0 Then 'We found it, so return the location and quit BinarySearchString = lMid Exit Do ElseIf iComp = 1 Then 'The midpoint item is bigger than us-- 'throw away the top half lHigh = lMid - 1 Else 'The midpoint item is smaller than us-- 'throw away the bottom half lLow = lMid + 1 End If 'Continue until our pointers cross Loop Until lLow > lHigh PTR_Exit: End Function Sort and ScanThe combination of a QuickSort and BinarySearch gives us a very efficient way of comparing two arrays, to process the elements in common, as shown in Listing 17-9. Listing 17-9. Combining a Sort and Binary SearchSub ProcessLists(asArray1() As String, asArray2() As String) Dim lIndex As Long 'Sort the second array QuickSortString1D asArray2 'Loop through the first array For lIndex = LBound(asArray1) To UBound(asArray1) 'Use the binary search routine to 'check if the element is in the second array If BinarySearchString(asArray1(lIndex), asArray2) <> -1 Then 'A match, so process it Debug.Print asArray1(lIndex) End If Next End Sub This is not quite as efficient as the example shown previously that relied on both lists being sorted, but is a very efficient and easy to understand alternative for use when the initial array must be left in its original order. The SORTSEARCH_INDEX udtWhen dealing with large 2D arrays, arrays of objects or multiple keys, it is usually more efficient to create a new indexing array and sort and search that than to try to sort and search the original array. An index array is an array of the SORTSEARCH_INDEX udt, which is defined as follows: Public Type SORTSEARCH_INDEX Key As String Index As Long End Type The key is the string used for sorting and searching, which is typically the value from the first column in a large 2D array, the name of an object or a concatenation of the values to sort an array by multiple columns. The index is the row number in the original array. When the udt array is sorted, we can loop through the elements in the sorted order, using the Index property to identify the appropriate row from the original array, as in Listing 17-10. Listing 17-10. Using the SORTSEARCH_INDEX udtSub UseIndexSort() Dim vaArray As Variant Dim lRow As Long Dim auIndex() As SORTSEARCH_INDEX 'Assume vaArray is a multicolumn, multirow Variant array 'e.g. as read from the worksheet vaArray = Selection.Value 'Create an index array of the same size ReDim auIndex(LBound(vaArray) To UBound(vaArray)) 'Populate the index array with the original row number 'and sort key For lRow = LBound(vaArray) To UBound(vaArray) auIndex(lRow).Index = lRow auIndex(lRow).Key = vaArray(lRow, 1) Next 'Sort the index array QuickSortIndex auIndex 'Loop through the sorted array For lRow = LBound(auIndex) To UBound(auIndex) 'The .Index element of the sorted udt is the row 'in the original array Debug.Print vaArray(auIndex(lRow).Index, 2) Next End Sub QuickSortIndex is a version of the QuickSort algorithm for arrays of the SORTSEARCH_INDEX user defined type and can be found on the CD in the workbook \Concepts\Ch17Optimizing VBA Performance\Algorithms.xls. The workbook also contains a version of the binary search algorithm, BinarySearchIndex, which searches for a string in the index array and returns the row number in the original array. |