6.3 Adding Global Image Functions

You have probably noticed that our apImage<> class does not include the image processing functions. We decided that the image class should only contain the absolute essentials. At first we were only going to offer set() and make the operations, such as add(), be functions rather than methods. However, set() and add() are so similar in appearance that we decided to keep them in the class interface.
6.3.1 Copying an Image

There are three ways that images can be copied:

Method 1: The source image is specified and the copy function creates the destination image.
Method 2: The source image and destination image are specified. The pixel types are the same.
Method 3: The source image and destination image are specified. The pixel types are different.
There is a more complicated case we need to consider. What happens if the destination image is not null? In other words, what should the behavior of copy() be if both the source and destination images are valid? Most imaging software packages will probably discard whatever is stored in the destination image and replace it with a copy of the source image. Doing so makes the implementation of copy() easier, but it does not address what the user is asking for. Our interpretation of the problem is quite different:

- If the source image is null, the destination image is also null.
- If no destination image is specified, create an image with the same boundary and alignment as the source image.
- If a destination image is specified, the copy operation is restricted to those pixels in common between the source and destination image. If there is no overlap in pixels, the destination image is set to null.

You might wonder if all this is really necessary for a copy() function. The answer is yes, because we want to offer this type of functionality for any image processing function that looks like:
If we offer this functionality for some arbitrary function func(), we should have the same functionality for copy(), since it takes the identical form.

Computing the overlap between two images is easy. Mathematically, we want to compute the intersection between the source image boundary and the destination image boundary. The result is an apRect that specifies which pixels should be processed. apRect has an intersect() method that will compute the overlap for us:
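A minimal, self-contained sketch of this overlap computation follows. Rect is a hypothetical stand-in for apRect (whose real interface is richer), and the free function intersect() below is an assumption made for illustration rather than the framework's own method.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for apRect: an upper-left corner plus a size.
// A rectangle with zero width or height is treated as null.
struct Rect {
    int x0, y0;
    unsigned width, height;
    Rect() : x0(0), y0(0), width(0), height(0) {}
    Rect(int x, int y, unsigned w, unsigned h) : x0(x), y0(y), width(w), height(h) {}
    bool isNull() const { return width == 0 || height == 0; }
    int x1() const { return x0 + static_cast<int>(width); }   // one past the right edge
    int y1() const { return y0 + static_cast<int>(height); }  // one past the bottom edge
};

// Overlap of two rectangles, or a null Rect if they share no pixels.
inline Rect intersect(const Rect& a, const Rect& b) {
    if (a.isNull() || b.isNull()) return Rect();
    const int left = std::max(a.x0, b.x0);
    const int top = std::max(a.y0, b.y0);
    const int right = std::min(a.x1(), b.x1());
    const int bottom = std::min(a.y1(), b.y1());
    if (right <= left || bottom <= top) return Rect();
    return Rect(left, top, static_cast<unsigned>(right - left),
                static_cast<unsigned>(bottom - top));
}
```

For two 640x480 images whose origins are (0,0) and (100,100), this yields a 540x380 region at (100,100); only those pixels would be processed.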
As you can see, we determine the pixels that need to be processed by computing the image windows (roi1 and roi2) that contain the intersection of the two images. Although this code is fairly simple, we do not want to duplicate it in every image processing function. Instead, we create a small framework to off-load most of the work, so new image processing functions can be added without having to worry about all the math. This framework also provides a consistent interface for each function.

We are calling this a framework because we need more than one class to add this capability. Each class handles a specific type of image processing function, also referred to as an image filtering function. Usually this corresponds to the number of image arguments required by the function, but it can also include other parameters. We almost chose to use function objects, but decided on our own implementation to maximize code reuse between similar classes of image processing functions. For information on function objects, see [Stroustrup00].

6.3.2 Processing Single Source Images

Single source image processing operations take a single source image and produce a single destination image. We provide a number of single source image processing operations, including:

- copy()
- operator+=
- operator-=
- duplicate()

operator+= and operator-= might not look like single source operations, but these operators take a single image and add it to, or subtract it from, the destination image.

Single source image processing operations have the following form:
SINGLE SOURCE PROCESSING CLASS

We provide a general class, apFunction_s1d1, that you can use to easily add your own single source image processing functions. (Note that we have chosen to make this class name descriptive, rather than concise, because it is used internally by developers. We reuse this naming convention when we provide a similar class, named apFunction_s1s2d1, for two source image processing functions.)

apFunction_s1d1 lets us logically divide the processing operations, each as a separate method. We have made some of those methods virtual so that we can derive new classes from apFunction_s1d1 to handle custom requirements.

In general, we cannot assume that the same intersection region is applied to both the source and destination images, so we keep these regions separate. We use the apIntersectRects structure, as shown:
The apFunction_s1d1 class is shown here.
apFunction_s1d1 has five template parameters. Four of these parameters are present because there are two images, each requiring two parameters. We have reordered the template parameters because many have default values that applications can simply use. This means that there are really only three parameters we need to consider:
Function looks a lot like the function that you will actually write (see our func() definition on page 206). The big difference is that the function you write can safely ignore issues such as image overlap or constructing the destination image if none is specified. Our first argument is a placeholder for the data type for intermediate computations. We would have preferred to specify the first parameter using explicit template instantiation, but the compiler does not accept this.

EXECUTE()

The run() method is the main entry point of apFunction_s1d1, but it only calls the virtual function, execute(). The execute() method constructs the intersection and performs the image processing operation. execute() is only overridden if the standard rules for computing the image windows change. We will soon see how image processing operations, such as convolution, require a new definition for execute(). The definition for execute() is shown here.
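The control flow of run() and execute() can be sketched as follows. This is a simplified illustration, not the apFunction_s1d1 listing: Rect, Image, SingleSourceOp, and CountingOp are hypothetical stand-ins, the images carry only a boundary, and the locking that the real class performs is left out.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for apRect; zero width or height means null.
struct Rect {
    int x0, y0;
    unsigned width, height;
    Rect() : x0(0), y0(0), width(0), height(0) {}
    Rect(int x, int y, unsigned w, unsigned h) : x0(x), y0(y), width(w), height(h) {}
    bool isNull() const { return width == 0 || height == 0; }
};

inline Rect intersect(const Rect& a, const Rect& b) {
    if (a.isNull() || b.isNull()) return Rect();
    const int left = std::max(a.x0, b.x0), top = std::max(a.y0, b.y0);
    const int right = std::min(a.x0 + static_cast<int>(a.width), b.x0 + static_cast<int>(b.width));
    const int bottom = std::min(a.y0 + static_cast<int>(a.height), b.y0 + static_cast<int>(b.height));
    if (right <= left || bottom <= top) return Rect();
    return Rect(left, top, static_cast<unsigned>(right - left), static_cast<unsigned>(bottom - top));
}

// Hypothetical image stand-in: only the boundary matters for this sketch.
struct Image {
    Rect bounds;
    bool isNull() const { return bounds.isNull(); }
};

class SingleSourceOp {
public:
    virtual ~SingleSourceOp() {}
    void run(const Image& src, Image& dst) { execute(src, dst); }
    Rect roi1, roi2;  // windows on the source and destination, set by execute()

protected:
    virtual void execute(const Image& src, Image& dst) {
        if (src.isNull()) { dst = Image(); return; }      // null in, null out
        if (dst.isNull()) createDestination(src, dst);    // no destination given
        const Rect overlap = intersect(src.bounds, dst.bounds);
        if (overlap.isNull()) { dst = Image(); return; }  // nothing in common
        roi1 = overlap;
        roi2 = overlap;
        try {
            process();
        } catch (...) {
            throw;  // no special handling, just rethrow
        }
    }
    virtual void createDestination(const Image& src, Image& dst) { dst.bounds = src.bounds; }
    virtual void process() = 0;
};

// Example operation that only records how often process() ran.
struct CountingOp : SingleSourceOp {
    int calls;
    CountingOp() : calls(0) {}
    void process() { ++calls; }
};
```

A derived class that needs different window rules would override execute() or createDestination() and leave the rest untouched.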
In broad terms, execute() does the following:

- Provides lock/unlock access to the image to prevent problems in multithreaded applications.
- Returns a null destination image if the source image is null. We can take advantage of the sNull definition available in every apImage object.
- Creates the destination image if none was specified. This is performed by the virtual function createDestination(). The default definition creates an image of the same size and alignment as the source image.
- Computes the intersection between the two images, creates an image window for each one, and stores the image windows in roi1_ and roi2_. We use the term roi, meaning region of interest, which aptly describes what these images represent. roi1_ will be identical to src1 if no destination is specified, or if the destination was the same size or larger than the source image. roi1_ and roi2_ are stored as member variables to keep the object as generic as possible. We thought about passing the computed images as parameters to process(), but derived classes might require other arguments as well so we decided against it.
- Calls process() to perform the image processing operation, which occurs inside a try block to catch any exceptions that might be thrown. The catch block does no special processing, other than to rethrow the error.

INTERSECT()

The intersection() method does nothing but call a global intersect() function. We added numerous intersect() functions to the global name space to encourage developers to use them for other purposes. The intersect() function is shown here.
PROCESS()

We provide the process() function to allow derived classes to define their own processing behavior, if necessary. We create a placeholder variable so that the compiler will call function_ with the appropriate arguments, as shown:
We have kept this object, and the other objects like it, in a separate file to promote their use. The actual image processing operations are kept in separate files, based on what the functions do.

COPY()

Let's get back to our copy() example and look at how we handle its implementation. We start by designing the actual image processing operation, making sure that its function prototype matches the Function definition in apFunction_s1d1. We show two implementations of our copy function. Note that neither function requires the template parameter, R, so this parameter is ignored.

ap_copy() defines the generic copy function and uses assignment to copy pixels from the source image to the destination image, as shown here.

ap_copyfast() makes the assumption that memcpy() can be used to duplicate pixels, as long as source and destination images share the same data type. ap_copyfast() is slightly more complicated, because it uses typeid() to determine if the source and destination image share the same pixel type. To properly use typeid(), make sure that any compiler flags that enable Run-Time Type Information (RTTI) are set. The ap_copyfast() function is shown here.
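The typeid() dispatch can be illustrated on a single row of pixels. The sketch below is an assumption-laden stand-in, not ap_copyfast() itself: copyRowSketch is a hypothetical name, it works on raw pointers instead of apImage<>, and it adds a std::is_trivially_copyable check (C++11) so that memcpy() is only chosen for types where a raw byte copy is valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <typeinfo>

// Copy count pixels from src to dst. When both pixel types are the same
// and can be duplicated byte by byte, use memcpy(); otherwise fall back
// to per-pixel assignment with a conversion.
template <class T1, class T2>
void copyRowSketch(const T1* src, T2* dst, std::size_t count) {
    const bool sameType = (typeid(T1) == typeid(T2));  // requires RTTI
    if (sameType && std::is_trivially_copyable<T1>::value) {
        std::memcpy(dst, src, count * sizeof(T1));
    } else {
        for (std::size_t i = 0; i < count; ++i) {
            dst[i] = static_cast<T2>(src[i]);
        }
    }
}
```

Since both types are known at compile time, std::is_same would serve equally well here; typeid() is used only to mirror the description above.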
The assumption that memcpy() can be used to duplicate pixels is usually, but not always, valid. For example, what if you had an image of std::string objects? It may sound absurd, but it demonstrates that blindly copying memory is not always appropriate.

Our final version of copy(), written using the generic ap_copy(), is shown here.
As implemented, copy() addresses the issues raised when the source and destination images are specified and the pixel types may or may not be different. Note that T2 is passed as a parameter, R, which defines all of the template parameters; however, the copy function can ignore it. In addition, copy() offers improved performance when the pixel types match. (See the earlier discussion in Section 6.3.1 on page 204.)

In the case where the source image is specified and the copy should create the destination image, we can create an overloaded version of copy() to take advantage of ap_copyfast(), as shown.
To demonstrate that we can use the STL generic algorithms with apImage<>, we rewrite copy() using std::copy(), as shown. The destination image must be allocated before calling std::copy(), since apImage<> does not support insert iterators.
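The need to allocate first follows from how std::copy() works: it writes through the destination iterator and never grows the destination. A minimal sketch, using std::vector as an assumed stand-in for an image's pixel storage (copyStlSketch is a hypothetical name, not the copy_stl() listing):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Copy src into a new container of pixel type T2. The destination is
// sized up front because std::copy() only assigns through dst.begin().
template <class T2, class T1>
std::vector<T2> copyStlSketch(const std::vector<T1>& src) {
    std::vector<T2> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}
```

The destination pixel type is given explicitly, for example copyStlSketch<int>(pixels).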
While this function may be easier to write, it is also much slower. On our 2.0 GHz Intel Pentium 4 processor-based machine, allocating and copying a 1024x1024 8-bit monochrome image takes 4 milliseconds with copy() and 16 milliseconds with copy_stl(). Both functions produce the identical image as output.

6.3.3 Processing Two Source Images

Many image processing functions operate on two source images and produce a single output image. We provide the following two source image processing operations:

- intersect()
- add()
- operator+
- sub()
- operator-

Let's look at a few of these operations. We can use apFunction_s1d1 as a template to add a new object that computes the intersection of three images (the two source images and the destination image, if any). There are two images that supply source pixels to the image processing function. If a valid destination image is specified, its boundary information helps determine which source pixels to use in the image processing routine.

Our new object, apFunction_s1s2d1, takes on a slightly more complicated form, because there are now two additional template parameters to refer to the additional image used by these operations. This brings the total number of parameters to seven, but for most uses only four are needed. The apFunction_s1s2d1 object is shown here.
INTERSECT()

Let's look at the intersect() method, which is shown here. Note that we have removed a few virtual functions, compared to apFunction_s1d1, because we do not expect derived classes to be necessary.
ADD() AND OPERATOR+

Now let's look at the add() operation. As we demonstrated with copy(), add() uses the apFunction_s1s2d1 class to add the contents of two images and store the result in a third. ap_add2() is the function that performs this operation. And like copy(), we can ignore the intermediate storage specifier (R in this case).
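As a rough sketch of such a row-by-row addition, the code below routes every pixel through a small add function that can be overloaded per pixel type. The names add2Sketch, addRowsSketch, and Clamped are hypothetical; Clamped<T> only loosely imitates the idea of a clamped pixel type and assumes an integer T.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Generic pixel addition: plain operator+, so small unsigned types wrap.
template <class T1, class T2, class T3>
void add2Sketch(const T1& a, const T2& b, T3& out) {
    out = static_cast<T3>(a + b);
}

// Hypothetical clamped pixel wrapper for integer types.
template <class T>
struct Clamped {
    T value;
};

// More specialized overload: saturate instead of wrapping.
template <class T>
void add2Sketch(const Clamped<T>& a, const Clamped<T>& b, Clamped<T>& out) {
    const long long sum = static_cast<long long>(a.value) + static_cast<long long>(b.value);
    const long long lo = static_cast<long long>(std::numeric_limits<T>::lowest());
    const long long hi = static_cast<long long>(std::numeric_limits<T>::max());
    out.value = static_cast<T>(sum < lo ? lo : (sum > hi ? hi : sum));
}

// Add two width x height pixel buffers (stored row by row) into dst.
template <class T1, class T2, class T3>
void addRowsSketch(const T1* s1, const T2* s2, T3* dst,
                   std::size_t width, std::size_t height) {
    for (std::size_t y = 0; y < height; ++y) {
        const T1* r1 = s1 + y * width;
        const T2* r2 = s2 + y * width;
        T3* rd = dst + y * width;
        for (std::size_t x = 0; x < width; ++x) {
            add2Sketch(r1[x], r2[x], rd[x]);
        }
    }
}
```

With plain unsigned char pixels, 200 + 100 wraps to 44; with Clamped<unsigned char> the same sum saturates at 255.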
ap_add2() iterates row by row, and uses our generic add2() function to construct each destination pixel. We use add2() so that we can easily handle overflow detection (if we use apClampedTmpl<T> as parameters), or other optimizations based on pixel type.

The user-callable functions are now easy to write. The only assumption that we make is with operator+, where the destination image is given the same pixel type and alignment as the first image specified. The implementation of add() and operator+ is shown here.
6.3.4 Processing Images with Neighborhood Operators

A neighborhood operator is one where the content of many source pixels affects the contents of a single destination pixel. One of the most common types of neighborhood operations is called convolution. In convolution, the value of a pixel in the resulting filtered image is computed as a weighted sum of its neighboring pixels. The matrix of weights used in the summing operation is called a kernel.

We provide the following convolution kernels:

- Low-pass kernel for eliminating noise
- Laplacian kernel for sharpening edges
- High-pass kernel for sharpening edges
- Gaussian kernel for smoothing edges

Low-Pass Kernel for Noise Elimination

Noise in an image may come from such phenomena as photographic grain. Noise often appears as random pixels interspersed throughout the image that have a very different pixel value than the pixels immediately surrounding them (the neighboring pixels).

Figure 6.6 illustrates the application of a noise-smoothing filter.

Figure 6.6. Low-Pass Averaging Filter

There are many different algorithms for smoothing noise in an image. Noise in an image generally has a higher spatial frequency because of its seeming randomness. We use a simple low-pass spatial filter to reduce the effects of noise through an averaging operation. Each pixel is sequentially examined and, in the 3x3 kernel we use, the pixel value is determined by averaging the pixel value with its eight surrounding neighbors.

Given a point (x,y) we average nine pixels in a 3x3 neighborhood surrounding this point. This 3x3 kernel is shown here.

This kernel is sequentially centered over each pixel. The value of each pixel and of its surrounding neighbors is multiplied by the corresponding kernel value, and the products are summed. Finally, the average value is computed using the following formula:
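Written out under the description above (nine equal weights, followed by division by nine), the averaging can be expressed as follows. S denotes the source image and D the destination image; the symbol D and the matrix layout are notation assumed here for illustration.

```latex
K = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix},
\qquad
D(x,y) = \frac{1}{9} \sum_{j=-1}^{1} \sum_{i=-1}^{1} K(i,j)\, S(x+i,\; y+j)
       = \frac{1}{9} \sum_{j=-1}^{1} \sum_{i=-1}^{1} S(x+i,\; y+j)
```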
Figure 6.7 shows how this kernel is used to reduce the noise on a single pixel in an image.

Figure 6.7. Low-Pass Filter

This function is easy to write, until you consider the boundary conditions. Consider a source image with an origin of (0,0). In Figure 6.7, the origin has a pixel value of 215. To compute the destination point in the filtered image at (0,0), our equation shows that we need information from pixels that do not exist (for example, S(-1,-1)). We cannot compute a pixel when the data does not exist.

This boundary condition can be handled in a number of different ways. One very common solution is to set all boundary pixels to 0 (black). We recommend a different, more generalized solution that has several advantages. By determining exactly which pixels are needed to compute the destination (or filtered) image, our solution allows developers to ignore complicated boundary handling.

Here's how it works. We compute the intersection of the source and destination image, taking the size of the kernel into account. Our intersection() function computes which pixels to process. In our example, the kernel size is 3 (using the 3x3 kernel above). The intersection() function assumes the kernel size is odd, making it possible to center the kernel over the pixels to process. The function is as follows:
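A self-contained sketch of a kernel-aware intersection is given below. It is an illustration under assumptions, not the framework's intersection() method: Rect stands in for apRect, Regions stands in for the structure holding the two regions, and kernelRegions is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for apRect; zero width or height means null.
struct Rect {
    int x0, y0;
    unsigned width, height;
    Rect() : x0(0), y0(0), width(0), height(0) {}
    Rect(int x, int y, unsigned w, unsigned h) : x0(x), y0(y), width(w), height(h) {}
    bool isNull() const { return width == 0 || height == 0; }
};

inline Rect intersect(const Rect& a, const Rect& b) {
    if (a.isNull() || b.isNull()) return Rect();
    const int left = std::max(a.x0, b.x0), top = std::max(a.y0, b.y0);
    const int right = std::min(a.x0 + static_cast<int>(a.width), b.x0 + static_cast<int>(b.width));
    const int bottom = std::min(a.y0 + static_cast<int>(a.height), b.y0 + static_cast<int>(b.height));
    if (right <= left || bottom <= top) return Rect();
    return Rect(left, top, static_cast<unsigned>(right - left), static_cast<unsigned>(bottom - top));
}

// Source pixels to read and destination pixels that can be computed.
struct Regions {
    Rect src, dst;
};

inline Regions kernelRegions(const Rect& src, const Rect& dst, unsigned ksize) {
    Regions out;  // both regions null by default
    if (ksize < 1 || src.isNull() || dst.isNull()) return out;
    assert(ksize % 2 == 1);  // the kernel must have a centre pixel
    const int e = static_cast<int>((ksize - 1) / 2);
    // Dilate the destination: these are the source pixels it would like to read.
    const Rect wanted(dst.x0 - e, dst.y0 - e, dst.width + 2 * e, dst.height + 2 * e);
    // Keep only the pixels the source actually has.
    const Rect avail = intersect(wanted, src);
    if (avail.isNull() || avail.width < ksize || avail.height < ksize) return out;
    // Erode: only pixels with a full neighbourhood can be computed.
    out.src = avail;
    out.dst = Rect(avail.x0 + e, avail.y0 + e, avail.width - 2 * e, avail.height - 2 * e);
    return out;
}
```

For two 640x480 images at (0,0) and a 3x3 kernel this gives a 638x478 destination region at (1,1). For a 50x50 destination at (100,100) inside the same source, the whole destination can be computed.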
As you can see, this function is very different from the simple intersection operations we have written so far. Let's apply this function using an example with our 3x3 kernel. Assume that both the source and destination images have an origin at (0,0) and are 640x480 pixels in size as shown in Figure 6.8.

Figure 6.8. Source and Destination Images

1. Determine if the kernel size is too small, and therefore no intersection exists. This is a degenerate case. In our example, the kernel size of 3 is fine.

2. Determine which pixels the destination image needs in order to compute an output value for every pixel in the image. To do this, we increase the size of the destination region to show the pixels that are needed from the source image to fill the destination. For our 3x3 kernel, this amounts to expanding the size of the destination region by one pixel in all directions, as shown in Figure 6.9.

   Figure 6.9. Find the Destination Region

   This is also called a dilation operation. The destination region has an origin at (0,0) and is 640x480 pixels in size. The expanded region has an origin at (-1,-1) and is 642x482 pixels in size.

3. Intersect this expanded destination region with the source region to find out exactly which pixels are available in the source image. This produces an intersection region at (0,0) of size 640x480 pixels, as shown in Figure 6.10.

   Figure 6.10. Find the Available Source Image Pixels

4. Return a null apRect if there is no intersection or if the intersection is too small.

5. Compute the actual destination pixels that will be manipulated. We determine this by reducing the size of the source region by one pixel in all directions, as shown in Figure 6.11.

   Figure 6.11. Find the Available Destination Image Pixels

   This is also called an erosion operation.
Eroding this region shows which pixels in the destination region can be computed: a region with an origin at (1,1) that is 638x478 pixels in size.

This says what we already know: if the source and destination images are the same size, there is a one pixel border surrounding the destination image that cannot be computed. Under common operating conditions, these lengthy calculations produce a simple result. The approach has much more utility when you need to process a region of interest of a larger image. With larger images, the destination image can often be filled with valid results, since the source image contains all of the necessary pixels.

Our neighborhood processing is similar to the one source image, one destination image case we discussed earlier. Our image processing class, apFunction_s1d1Convolve, is derived from apFunction_s1d1 to take advantage of our modular design. Besides taking additional parameters, this object overrides the member functions that compute the intersection and create a destination if none was specified.

We can write a general purpose convolution routine by writing our processing function to take an array of kernel values and a divisor. For example, the following kernel is what our image framework uses to compute a low-pass averaging filter:
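To make the shape of such a routine concrete, here is a simplified sketch that takes a kernel array and a divisor. It is not ap_convolve_generic: convolveSketch and clampTo are hypothetical names, the image is a plain std::vector stored row by row, and only pixels with a complete neighborhood are produced, so the output is smaller than the input. The averaging kernel used in the example is nine ones with a divisor of 9, as implied by the averaging described earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Clamp a wide intermediate value into the range of pixel type T
// (hypothetical helper; R is assumed to be at least as wide as T).
template <class T, class R>
T clampTo(R v) {
    const R lo = static_cast<R>(std::numeric_limits<T>::lowest());
    const R hi = static_cast<R>(std::numeric_limits<T>::max());
    return static_cast<T>(v < lo ? lo : (v > hi ? hi : v));
}

// Convolve src (width x height) with a ksize x ksize kernel of char
// weights, accumulate in R, divide by divisor, and clamp the result.
template <class R, class T>
std::vector<T> convolveSketch(const std::vector<T>& src, std::size_t width, std::size_t height,
                              const char* kernel, std::size_t ksize, int divisor) {
    assert(ksize % 2 == 1 && divisor != 0 && src.size() == width * height);
    if (width < ksize || height < ksize) return std::vector<T>();
    const std::size_t outW = width - ksize + 1;
    const std::size_t outH = height - ksize + 1;
    std::vector<T> dst(outW * outH);
    for (std::size_t y = 0; y < outH; ++y) {          // destination rows
        for (std::size_t x = 0; x < outW; ++x) {      // destination columns
            R sum = 0;
            for (std::size_t ky = 0; ky < ksize; ++ky) {      // kernel rows
                for (std::size_t kx = 0; kx < ksize; ++kx) {  // kernel columns
                    sum += static_cast<R>(kernel[ky * ksize + kx]) *
                           static_cast<R>(src[(y + ky) * width + (x + kx)]);
                }
            }
            if (divisor != 1) sum /= static_cast<R>(divisor);  // skip the divide when it is a no-op
            dst[y * outW + x] = clampTo<T>(sum);
        }
    }
    return dst;
}
```

Calling convolveSketch<int>(pixels, 3, 3, ones, 3, 9) on a 3x3 image holding the values 1 through 9 produces a single pixel with the value 5.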
As you can guess, convolution is a fairly slow process, at least when compared with simple point processing routines. This function is somewhat dense and needs some further explanation.

- There are four loops in this function. The outer two loops step pixel by pixel in the destination image. The inner two loops perform the neighborhood operation on the source image, by multiplying a kernel value by the source pixel value and accumulating this term in sum.
- R is the datatype used to represent intermediate computations. sum is a variable of type R that is used during the computation of each destination pixel. If we did not have the forethought to add this template parameter, R, then sum would have been of type T2 (the destination pixel type) and would have likely caused pixel overflows.
- When you call a convolution function, you must explicitly specify the datatype of R.
- Once sum is computed, it is scaled by the divisor, which is 9 in our example, to create the output pixel value. Some convolution kernels have a divisor of 1, and we can achieve much higher performance by making this a special case. For example, we saw a 10% performance improvement for a 1024x1024 image when we added this optimization.
- apLimit<> is used to prevent pixel overflows. Unlike many of our image processing functions, where the user can select special data types to prevent overflow (by using apClampedTmpl<>), convolution always enforces this constraint.
- Kernel values are expressed as a char. This is sufficient for most convolution kernels. However, some large kernels, especially Gaussian filters, may have values that do not fit. If this is the case, you will need your own convolve() function that defines the kernel as a larger quantity.

Fortunately, all of these details are hidden. To perform convolution, you can simply call the convolve() function and explicitly specify R. Its definition is shown here.
Our example using an averaging low-pass filter now looks like the following:
R must be specified explicitly, as in convolve<Pel32>(). If you call convolve() without specifying a value for R, the compiler will generate an error to remind you to add one.

Laplacian Kernel for Sharpening Edges

The edge of an object is indicated by a change in pixel value. Typically, there are two parameters associated with edges: strength and angle. The strength of an edge is the amount of change in pixel value when crossing the edge. Figure 6.12 illustrates strength by the length of the arrows. The angle is the angle of the line as drawn perpendicular to the edge. Figure 6.12 illustrates angle by the direction of the arrows.

Figure 6.12. Edge Definition

There are many methods for sharpening edges. A very effective and simple image processing technique is to ignore the angle and use the strength to sharpen the edges. You can accomplish edge sharpening by using a Laplacian mask (or kernel) in a convolution operation on the image. The Laplacian kernel generates peaks where edges are found. Our framework provides the following Laplacian kernel:

If we sum all the values in the kernel, we see that they sum to zero. This means that when this kernel is run over a constant, or slowly varying image, the output will be zero or close to zero. However, when the kernel is run over a region where strong edges exist (the center pixel tends to be brighter or darker than surrounding pixels), the output can be very large. Figure 6.13 illustrates the application of an edge sharpening filter.

Figure 6.13. Laplacian Filter

We can write a function that is similar to ap_convolve_generic, but uses this specific Laplacian kernel, as shown.
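The following sketch shows what an unrolled 3x3 Laplacian can look like. It assumes one common zero-sum Laplacian kernel (8 at the center, -1 for each of the eight neighbors); the kernel the framework actually provides may differ. laplacian3x3Sketch and clampTo are hypothetical names, and the image is a plain std::vector stored row by row.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Clamp a wide intermediate value into the range of pixel type T
// (hypothetical helper; R is assumed to be signed and wider than T).
template <class T, class R>
T clampTo(R v) {
    const R lo = static_cast<R>(std::numeric_limits<T>::lowest());
    const R hi = static_cast<R>(std::numeric_limits<T>::max());
    return static_cast<T>(v < lo ? lo : (v > hi ? hi : v));
}

// Unrolled 3x3 Laplacian over every pixel that has a full neighbourhood.
// Returns a (width-2) x (height-2) image, or an empty one if src is too small.
template <class R, class T>
std::vector<T> laplacian3x3Sketch(const std::vector<T>& src,
                                  std::size_t width, std::size_t height) {
    assert(src.size() == width * height);
    if (width < 3 || height < 3) return std::vector<T>();
    const std::size_t outW = width - 2;
    const std::size_t outH = height - 2;
    std::vector<T> dst(outW * outH);
    for (std::size_t y = 0; y < outH; ++y) {
        const T* p1 = &src[y * width];  // row above the centre
        const T* p2 = p1 + width;       // centre row; width acts as the pitch
        const T* p3 = p2 + width;       // row below the centre
        for (std::size_t x = 0; x < outW; ++x) {
            R sum = static_cast<R>(8) * static_cast<R>(p2[x + 1]);
            sum -= static_cast<R>(p1[x]) + static_cast<R>(p1[x + 1]) + static_cast<R>(p1[x + 2]);
            sum -= static_cast<R>(p2[x]) + static_cast<R>(p2[x + 2]);
            sum -= static_cast<R>(p3[x]) + static_cast<R>(p3[x + 1]) + static_cast<R>(p3[x + 2]);
            dst[y * outW + x] = clampTo<T>(sum);
        }
    }
    return dst;
}
```

A constant image produces 0 everywhere, while an isolated bright pixel produces a large value that is clamped to the pixel range.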
Many of the function arguments are ignored. By keeping the arguments identical for any filtering routine, we can reuse our framework with only the expense of a few wasted parameters. Note that this function still works for arbitrary pixel types. Although we have hard-coded the kernel operator into the function, we have made no additional assumptions about the pixel type.

The function works as follows:

- It unrolls the two inner loops that are inside ap_convolve_generic and explicitly computes the summation of the kernel using the source pixels.
- It uses the variable pitch to specify the number of pixels to skip after we process one line of input pixels to get to the start of the next line. Precomputing this value allows us to quickly skip from one line to the next.

While this function efficiently processes monochrome data types, it is slower for color images. To address this issue, we can take advantage of template specialization and define a special version of ap_convolve_3x3laplacian that works with apRGB (an 8-bit RGB image). To do this, we not only unroll the two inner loops, but we also explicitly compute the RGB values. This function is not difficult to write and it produces a dramatic increase in speed, as shown here.
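The mechanics of such a specialization can be shown on a much smaller function. The sketch below uses a one-dimensional [-1 2 -1] edge filter instead of the 3x3 Laplacian, and every name in it (RGB, RGB32, clamp8, edge3) is hypothetical; RGB and RGB32 merely play the roles that apRGB and its signed 32-bit intermediate type play in the text.

```cpp
#include <cassert>

// Hypothetical 8-bit RGB pixel and its signed 32-bit intermediate type.
struct RGB {
    unsigned char r, g, b;
};
struct RGB32 {
    int r, g, b;
};

inline unsigned char clamp8(int v) {
    return static_cast<unsigned char>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Generic version: applies [-1 2 -1] centred on p[1], accumulating in R.
// The result is clamped to 0..255, so it is meant for 8-bit pixel types.
template <class R, class T>
T edge3(const T* p) {
    const R sum = static_cast<R>(2) * static_cast<R>(p[1]) -
                  static_cast<R>(p[0]) - static_cast<R>(p[2]);
    return static_cast<T>(clamp8(static_cast<int>(sum)));
}

// Full specialization for RGB pixels: each channel is computed explicitly.
// "template <>" marks the specialization; "inline" keeps it header-safe.
template <>
inline RGB edge3<RGB32, RGB>(const RGB* p) {
    RGB32 sum;
    sum.r = 2 * p[1].r - p[0].r - p[2].r;
    sum.g = 2 * p[1].g - p[0].g - p[2].g;
    sum.b = 2 * p[1].b - p[0].b - p[2].b;
    RGB out;
    out.r = clamp8(sum.r);
    out.g = clamp8(sum.g);
    out.b = clamp8(sum.b);
    return out;
}
```

The caller still names the intermediate type, as in edge3<RGB32>(pixels), and the compiler selects the specialization because both template arguments match it exactly.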
The template<> prefix to the function tells the compiler that this is a specialization. You have to pay careful attention to the arguments, since you are replacing generic parameter types with explicit ones. You will still have to specify the template parameter, R, although this value is hard-coded as apRGBPel32s in the function. It is important that this value is signed, because the Laplacian kernel contains both positive and negative values.

There is one more small change to our template specialization for ap_convolve_3x3laplacian. As we discussed in "class Versus typename" on page 25, we cannot use the keyword typename in our specialization without generating an error. The line from our generic template definition:
must be changed to:
To use the Laplacian filter, you can simply call the laplacian3x3() function and, as with convolve(), explicitly specify the R template parameter. The definition of laplacian3x3() is shown here.
Table 6.1 shows the performance results when computing the Laplacian of a 1024x1024 apRGB image, with and without specialization, on our Intel Pentium 4 processor-based test platform, running at 2.0 GHz.
High-Pass Kernel for Sharpening Edges

Another way to sharpen edges, especially those in scanned photographs, is to use a convolution operation with a high-pass kernel. High-pass kernels enhance pixel differences and effectively sharpen edges. Our framework provides the following high-pass kernel:

If we sum all the values in the kernel, we see that they sum to one. This means that when this kernel is run over a constant, or slowly varying image, the output will be very close to the original pixel values. In areas where edges are found (i.e., the pixel values vary), the output values are magnified. Figure 6.14 illustrates the application of a high-pass edge sharpening filter.

Figure 6.14. High-Pass Filter

Gaussian Kernel for Smoothing Edges

You can use a convolution operation with a Gaussian kernel to smooth the edges in your image. This technique usually produces a superior result to the low-pass kernel we presented on page 219. Our framework provides the following Gaussian kernel:

Like our general convolution kernel, the Gaussian kernel uses summing and averaging to assign new values to pixels in the filtered image. The effect is that the strong edge differences are reduced, giving the filtered image a softer or blurred appearance. This is useful for reducing contrast or smoothing the image to eliminate such undesired effects as noise and textures. Figure 6.15 illustrates the application of a Gaussian edge smoothing filter.

Figure 6.15. Gaussian Filter

6.3.5 Generating Thumbnails

We could not end a section on image processing routines without reviewing our global thumbnail() function in its final form. This is a stand-alone function. thumbnail() always computes the destination image. Figure 6.16 illustrates the application of the thumbnail() function.

Figure 6.16. Thumbnail Function

The thumbnail() function is as shown.
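This section does not spell out the algorithm, so the sketch below rests on an assumption: that a thumbnail is produced by averaging each factor x factor block of source pixels into one destination pixel. Thumbnail and thumbnailSketch are hypothetical names, the image is a plain std::vector stored row by row, and any partial blocks at the right and bottom edges are dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the reduced image and its dimensions.
template <class T>
struct Thumbnail {
    std::size_t width, height;
    std::vector<T> pixels;
};

// Reduce src (width x height) by an integer factor. Each output pixel is
// the average of a factor x factor block, accumulated in the wider type R.
// The destination is always created by this function.
template <class R, class T>
Thumbnail<T> thumbnailSketch(const std::vector<T>& src, std::size_t width,
                             std::size_t height, std::size_t factor) {
    assert(factor > 0 && src.size() == width * height);
    Thumbnail<T> out;
    out.width = width / factor;
    out.height = height / factor;
    out.pixels.resize(out.width * out.height);
    for (std::size_t ty = 0; ty < out.height; ++ty) {
        for (std::size_t tx = 0; tx < out.width; ++tx) {
            R sum = 0;
            for (std::size_t dy = 0; dy < factor; ++dy) {
                for (std::size_t dx = 0; dx < factor; ++dx) {
                    sum += static_cast<R>(src[(ty * factor + dy) * width + (tx * factor + dx)]);
                }
            }
            out.pixels[ty * out.width + tx] = static_cast<T>(sum / static_cast<R>(factor * factor));
        }
    }
    return out;
}
```

As with the convolution routines, the intermediate type is named explicitly, for example thumbnailSketch<int>(pixels, 4, 4, 2).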