Image-specific processing functions in computer vision

Computer vision pipelines require various kinds of image preprocessing, such as resizing images, rotating them by some angle, or blurring them with some kernel. We refer to resizing, rotating, and blurring as the processing functions, and to size, angle, and kernel as the corresponding parameters. Sometimes different images need to be processed with the same function but with different parameters. To accomplish this, SAS has developed image-specific processing functions in the Image and BiomedImage action sets in SAS Visual Data Mining and Machine Learning. First, I will describe the syntax of the image-specific functions. Second, I will describe how they can be used for image augmentation. Finally, I will present the performance improvement obtained by using an image-specific function.

Image-specific function versus global function

Before the image-specific functions were introduced, the Image and BiomedImage action sets offered only global image processing functions. Unlike an image-specific function, a global function processes all input images using the same parameter values. For example, the RESIZE function in the processImages action resizes all input images to the same size.

The syntax of the image-specific functions is very similar to the syntax of the global functions. So, if you have used the global functions in the past, you will find the image-specific functions intuitive and easy to use.

Figure 1: Global versus image-specific function

Figure 1 shows the syntax of the RESIZE global function and the related image-specific function, RESIZE_SPECIFIC. Note that the syntax of the two functions is very similar except for the width and height parameters.

In the RESIZE function, the width and height parameters are of numeric type. In the example, width=200 and height=400, so the RESIZE function resizes all input images to 200x400.

In the RESIZE_SPECIFIC function, the width and height parameters are of string type. These parameter values are the column names in the input CAS table. In the example, the width is set to “rwidth”, and the height is set to “rheight”. The RESIZE_SPECIFIC function resizes each input image to the width and height values specified in these columns. Because each input image can have a different value in the “rwidth” and “rheight” columns, the RESIZE_SPECIFIC function resizes each image with image-specific width and height values.
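Because Figure 1 is an image, here is a minimal sketch of how the two step definitions might look as Python dictionaries in the steps list of processImages. The helper names are hypothetical, and the 'resize' step type string for the global function is an assumption based on the function name. The 'resize_specific' form follows the code shown later in this post.

```python
def resize_global_step(width, height):
    """Global RESIZE: numeric values that apply to every input image."""
    if not all(isinstance(v, int) for v in (width, height)):
        raise TypeError("global resize expects numeric width and height")
    # Assumption: 'resize' is the stepType string of the global function.
    return dict(step=dict(stepType='resize', width=width, height=height))


def resize_specific_step(width_col, height_col):
    """Image-specific RESIZE: names of columns that hold per-image values."""
    if not all(isinstance(v, str) for v in (width_col, height_col)):
        raise TypeError("image-specific resize expects column names")
    return dict(step=dict(stepType='resize_specific',
                          width=width_col, height=height_col))


# Same shape, different parameter types: numbers versus column names.
global_step = resize_global_step(200, 400)
specific_step = resize_specific_step('rwidth', 'rheight')
```

The only difference between the two dictionaries is the step type and whether width and height carry values or column names, which is the point the figure makes.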

Image augmentation using image-specific functions

Deep convolutional neural network models require a large amount of training data to avoid overfitting, and that much data is often not available. In such cases, image augmentation is used to increase the size of the training data set. It involves artificially creating many images by applying image transformations such as rotation, blurring, and cropping to a set of available training images. This section demonstrates the use of image-specific functions for image augmentation.

Figure 2 shows the approach for using image-specific functions for image augmentation. Initially, n images are loaded in a training image CAS table. The CAS table is appended p times to generate an augmented image table containing n x p images. Effectively, the augmented image table contains p copies of each of the training images. Then the image-specific functions are applied to the augmented image table using randomly generated parameter values. As a result, the images in the augmented image table are randomly transformed.

Figure 2: Image augmentation using image-specific functions
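The approach in Figure 2 can be sketched in plain Python, without a CAS server, to make the table shapes concrete. This is only an illustration under stated assumptions: the function name, the row layout, and the image_id and copy fields are hypothetical, and random.randint stands in for the rand function. The parameter ranges are the ones used in the code later in this section.

```python
import random

# Parameter ranges taken from the computedvarsprogram in this post.
PARAM_RANGES = {
    "kernelwidth": (1, 10),
    "kernelheight": (1, 10),
    "angle": (-15, 15),
    "width": (100, 600),
    "height": (50, 400),
}


def build_augmented_rows(image_ids, p, seed=None):
    """Return p copies of each image, each with its own random parameters."""
    rng = random.Random(seed)
    rows = []
    for copy_index in range(p):          # append the training table p times
        for image_id in image_ids:
            row = {"image_id": image_id, "copy": copy_index}
            for name, (low, high) in PARAM_RANGES.items():
                row[name] = rng.randint(low, high)  # inclusive on both ends
            rows.append(row)
    return rows


rows = build_augmented_rows(["img_a", "img_b"], p=3, seed=42)
print(len(rows))  # n x p = 2 x 3 = 6
```

Each row carries its own width, height, kernel size, and angle, which is what lets a single image-specific call transform every copy differently.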

The following code applies the image-specific functions to the augmented image table. Specifically, the code uses the RESIZE_SPECIFIC, GAUSSIAN_FILTER_SPECIFIC, and MUTATIONS_SPECIFIC functions. We use the computedvars and computedvarsprogram features of a CAS table to randomly generate and assign the parameter values. For example, width=rand('integer', 100, 600) assigns a random integer between 100 and 600 to the width column using rand, a pseudo-random generator function. The RESIZE_SPECIFIC function then reads its width parameter from that column.

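   # Assumes s is an open CAS session, augment_tbl describes the output table,
   # and augment_tbl_name is the name of the augmented input table.
   r = s.image.processimages(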
      casout=augment_tbl,
      steps=[
         dict(step=dict(stepType='resize_specific', 
            width='width', 
            height='height')),
         dict(step=dict(stepType='gaussian_filter_specific', 
            kernelwidth='kernelwidth', 
            kernelheight='kernelheight')),
         dict(step=dict(stepType='mutations_specific', 
            type='rotate_right',
            angle='angle', 
            paddingmethod='constant',
            b=128,
            g=128,
            r=128))
      ],
      table=dict(name=augment_tbl_name,
         computedvars=['kernelwidth', 'kernelheight', 'angle', 'width', 'height'],
         computedvarsprogram='''
            kernelwidth=rand('integer', 1, 10); /* Random integer between 1 and 10 */
            kernelheight=rand('integer', 1, 10);/* Random integer between 1 and 10 */
            angle=rand('integer', -15, 15);     /* Random integer between -15 and 15 */
            width=rand('integer', 100, 600);    /* Random integer between 100 and 600 */
            height=rand('integer', 50, 400);    /* Random integer between 50 and 400 */
         '''
         )
   )

Figure 3 shows a sample of the images that are generated using the preceding code. Observe that the generated images are rotated, blurred, and resized randomly.

Figure 3: Sample of images from the augmented image table

Further, Figure 4 shows an original training image from the training table and the corresponding images from the augmented image table. In this case, we generated three augmented images for each training image, so p = 3.

Figure 4: Sample training and augmented images

Performance comparison between global function and image-specific function

Although a global processing function can be used for processing different images using different parameters, it is very inefficient compared to an image-specific function. We’ll demonstrate this in the following performance study.

In this study, we use a document rotation use case. A document image might be generated by scanning a document or by taking its picture. In either case, the document might not appear straight in the image. The image might need to be rotated to straighten the document. Figure 5 shows a few examples of such images. In a data set of such document images, each image might need to be rotated by a different angle.

In this post, we assume the rotation angle to straighten a document is available as input. The approach for computing the rotation angle is out of scope.

Figure 6 shows the global and image-specific approaches for rotating the document images.

Figure 6: Global and image-specific approaches for document rotation

Global rotation

In this approach, the document images are loaded in a CAS table. An image-angle map file containing the rotation angle for each image is loaded in a pandas DataFrame. For each image, the global rotation function is called using the corresponding angle value from the image-angle map. The resulting rotated image is appended to a results table.

Image-specific rotation

The first two steps in this approach are the same as in the global rotation approach. Next, the images table and the image-angle map table are merged using the SAS Viya FedSQL action. The merged table contains the document images and the corresponding rotation angles. Finally, the image-specific rotation function is executed using the merged table as input. The output of the function is a table with rotated images.

Note that the global rotation approach is difficult to implement because it requires the user to code the logic for looping over the images, reading and specifying the parameters for each image, and appending the action output to a results table. Because the image-specific approach does not require such coding, it is a lot simpler to implement.
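The difference in control flow between the two approaches can be sketched in plain Python. Nothing here calls CAS: fake_rotate_all and fake_rotate_by_column are hypothetical stand-ins for the global and image-specific rotation functions, the list-of-dicts tables and the name join key are assumptions, and the sketch does not model execution time.

```python
calls = {"global": 0, "specific": 0}


def fake_rotate_all(rows, angle):
    """Stand-in for the global function: one angle for all rows passed in."""
    calls["global"] += 1
    return [dict(r, rotated_by=angle) for r in rows]


def fake_rotate_by_column(rows, angle_col):
    """Stand-in for the image-specific function: angle read from a column."""
    calls["specific"] += 1
    return [dict(r, rotated_by=r[angle_col]) for r in rows]


def rotate_global(images, angle_map):
    """Global approach: loop, look up each angle, append each result."""
    results = []
    for row in images:
        results.extend(fake_rotate_all([row], angle=angle_map[row["name"]]))
    return results


def rotate_specific(images, angle_map):
    """Image-specific approach: merge once, then make a single call."""
    merged = [dict(row, angle=angle_map[row["name"]]) for row in images]
    return fake_rotate_by_column(merged, angle_col="angle")


images = [{"name": "doc1.png"}, {"name": "doc2.png"}, {"name": "doc3.png"}]
angle_map = {"doc1.png": 5, "doc2.png": -12, "doc3.png": 30}

global_out = rotate_global(images, angle_map)
specific_out = rotate_specific(images, angle_map)
print(calls)  # {'global': 3, 'specific': 1}
```

Both paths apply the same angle to each image, but the global path issues one call per image while the image-specific path issues one call in total.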

We ran each of the two approaches on input tables containing 500 to 3,500 images, in steps of 500. Each run was repeated five times, and the execution time was collected for each run.

Figure 7 shows the average execution times. The X-axis is the image count in the input table. The Y-axis is the average execution time. The figure shows both global and image-specific execution times grow linearly with the image count. More importantly, the image-specific execution time is significantly smaller than the global execution time. For an input table with 3,500 images, the global execution time is around 3,200 seconds, whereas the image-specific execution time is around 82 seconds. This shows the image-specific function is about 40x faster than the global function.

Figure 7: Average execution time by input table image count
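As a quick check of the arithmetic, the following snippet recomputes the speedup and the per-image cost from the approximate 3,500-image timings quoted above. The variable names are illustrative only.

```python
# Approximate average execution times for the 3,500-image table.
global_seconds = 3200
specific_seconds = 82
image_count = 3500

speedup = global_seconds / specific_seconds
per_image_global = global_seconds / image_count
per_image_specific = specific_seconds / image_count

print(f"speedup: {speedup:.1f}x")                           # speedup: 39.0x
print(f"global: {per_image_global:.3f} s/image")            # global: 0.914 s/image
print(f"image-specific: {per_image_specific:.3f} s/image")  # image-specific: 0.023 s/image
```

The ratio of about 39 is what the text rounds to 40x.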

Image-specific processing functions conclusion

This post demonstrates that the image-specific functions are not only simpler to use but are also highly performant. As a user of the Image or BiomedImage action set, if your use case demands processing different images with different parameters, you should use an image-specific function. On the other hand, if your use case demands processing all images with the same parameters, then you should use a global function.

We also discussed the syntax of the image-specific functions and showed how they can be used for image augmentation. Further, we presented the results of a performance comparison which show the image-specific approach for rotation is significantly (about 40 times) faster than the global rotation approach.

We are planning to roll out more image-specific functions in future SAS Viya releases.

