SAS performance tuning - A little bit goes a long way


Over the past 37 years I've had the good fortune to be able to attend and present at hundreds of in-house, local, regional, special-interest and international SAS events. I am a conference junkie. I've not only attended thousands of presentations, Hands-On Workshops, tutorials, breakout sessions, quick tips, posters, breakfasts, luncheons, mixers and more, but have had the privilege of hearing, seeing and networking with thousands of like-minded SAS users and presenters as they share valuable tips, techniques, advice, and suggestions on how to best use the SAS software.

For me, attending, volunteering and participating at SAS conferences and events has not only brought personal satisfaction like nothing else, it has allowed me to grow myself professionally and make many life-long friends. One of my objectives while attending a conference is to identify and learn at least three new things I didn't already know about SAS software. These three new things could consist of "cool" programming tips, unique coding techniques, "best" practice conventions, or countless other SAS-related nuggets.

At the upcoming 2016 MidWest SAS Users Group (MWSUG) Educational Forum and Conference, I'll be presenting several topics near and dear to my heart including "Top Ten SAS Performance Tuning Techniques." This 50-minute presentation highlights my personal top ten list of performance tuning techniques for SAS users to apply in their programs and applications. If you are unable to attend, here are a couple of programming tips and techniques from each performance area to consider.

CPU Techniques

1. Use IF-THEN / ELSE or SELECT-WHEN / OTHERWISE in the DATA step, or a CASE expression in PROC SQL, to conditionally process data.

2. CPU time and elapsed time can be reduced by using the SASFILE statement to process the same data set multiple times.
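As a rough illustration of the two CPU tips above, here is a minimal sketch that assumes the SASHELP.CLASS sample data set is available. The output data set names and the age cut-offs are made up for the example and are not taken from the presentation.

```sas
/* Tip 1: with ELSE, SAS stops testing as soon as one condition is true. */
data work.age_groups;
   set sashelp.class;
   length age_group $8;
   if age < 13 then age_group = 'Child';
   else if age < 16 then age_group = 'Teen';
   else age_group = 'Older';
run;

/* The same logic as a CASE expression in PROC SQL. */
proc sql;
   create table work.age_groups_sql as
   select name, age,
          case
             when age < 13 then 'Child'
             when age < 16 then 'Teen'
             else 'Older'
          end as age_group length=8
   from sashelp.class;
quit;

/* Tip 2: load a data set into memory once, then read it several times. */
data work.class_copy;          /* hypothetical working copy */
   set sashelp.class;
run;

sasfile work.class_copy load;  /* hold the data set in memory */

proc means data=work.class_copy;
   var height;
run;

proc freq data=work.class_copy;
   tables sex;
run;

sasfile work.class_copy close; /* release the memory */
```

SASFILE is only likely to pay off when the data set fits comfortably in available memory and is read more than once, so it is worth comparing the log timings with and without it.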

I/O Techniques

1. Consider using data compression for large data sets.

2. Build and use indexed data sets to improve access to data subsets.
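The two I/O tips above might look like the following sketch, which assumes the SASHELP.CARS sample data set. The output data set names are hypothetical. Compression trades extra CPU for fewer I/Os, and on small data sets it may not help at all, so treat this as something to benchmark rather than a default.

```sas
/* Ask SAS to report index usage in the log. */
options msglevel=i;

/* Tip 1: create a compressed copy; the log reports the change in size. */
data work.cars_compressed (compress=yes);
   set sashelp.cars;
run;

/* Tip 2: build a simple index on MAKE. */
proc datasets library=work nolist;
   modify cars_compressed;
   index create make;
quit;

/* A WHERE clause on the indexed variable lets SAS consider the index. */
data work.toyota;
   set work.cars_compressed;
   where make = 'Toyota';
run;
```

SAS decides for itself whether using the index is cheaper than a sequential read, so the log messages are the place to confirm what actually happened.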

Data Storage Techniques

1. Use data compression strategies to reduce the amount of storage used to store data sets.

2. Use KEEP= or DROP= data set options to retain desired variables.
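For the KEEP= and DROP= tip, a minimal sketch using the SASHELP.CARS sample data set follows. The output data set names are invented for illustration.

```sas
/* KEEP= on the input data set: only these variables are read in. */
data work.cars_slim;
   set sashelp.cars (keep=make model msrp);
run;

/* DROP= on the output data set: every variable is available during
   the step, but the listed ones are not written out. */
data work.cars_no_dims (drop=weight wheelbase length);
   set sashelp.cars;
run;
```

Placing the option on the input data set generally saves the most, because unwanted variables never enter the step at all.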

Memory Techniques

1. Use memory-resident DATA step constructs like Hash objects to take advantage of available memory and memory speeds.

2. Use the MEMSIZE= system option to control memory usage with the SUMMARY procedure.
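To make the memory tips above a little more concrete, here is a small sketch of a hash object lookup. It assumes the SASHELP.CLASS sample data set; the lookup table WORK.LOOKUP and the variable SEX_DESC are hypothetical names created only for this example.

```sas
/* A tiny lookup table (hypothetical). */
data work.lookup;
   length sex $1 sex_desc $6;
   sex = 'F'; sex_desc = 'Female'; output;
   sex = 'M'; sex_desc = 'Male';   output;
run;

data work.class_labeled;
   if _n_ = 1 then do;
      if 0 then set work.lookup;   /* defines SEX_DESC without reading data */
      declare hash h(dataset: 'work.lookup');  /* load the table into memory */
      h.defineKey('sex');
      h.defineData('sex_desc');
      h.defineDone();
   end;
   set sashelp.class;
   /* FIND returns 0 on a match; clear the value when there is none. */
   if h.find() ne 0 then call missing(sex_desc);
run;

/* Display the current MEMSIZE setting in the log. */
proc options option=memsize;
run;
```

Note that MEMSIZE= is set when SAS starts (on the command line or in a configuration file) rather than with an OPTIONS statement, so the last step only reports the value in effect.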

Want to learn more SAS tips, techniques and shortcuts like these? Please join me at the MidWest SAS Users Group Conference October 9 – 11 at the Hyatt Regency in downtown Cincinnati, Ohio. Register now for three days of great educational opportunities, 100+ presentations, training, workshops, networking and more.

I look forward to meeting and seeing you there!


About Author

Kirk Paul Lafler

Consultant and founder of Software Intelligence Corporation

Kirk Paul Lafler, consultant and founder of Software Intelligence Corporation, has been a SAS user since 1979. As a SAS Certified Professional, Kirk provides IT consulting services and training to SAS users around the world. He is the author of PROC SQL: Beyond the Basics Using SAS, Second Edition. He’s also written more than 100 peer-reviewed technical articles and writes the popular SAS tips column, "Kirk's Korner," that appears in several SAS Users Group newsletters. Kirk is a frequent speaker at SAS Users Group meetings.


7 Comments

  1. Leonid Batkhan

    I would use caution recommending data compression in relation to "performance tuning".
    According to the SAS documentation "Techniques for Optimizing I/O," compression means that more CPU time is needed to decompress the observations as they are made available to SAS. But if your concern is I/O and not CPU usage, compressing your data might improve the I/O performance of your application.

    • Hi Leonid,

      Thank you for your comment!

      I agree with you entirely. But, I've always found the optimization process to be one where trade-offs between one resource area and another often occur. In these situations, I recommend examining and understanding the resource needs of your organization and then implementing the techniques that positively impact affected areas. My paper goes into much more detail than what I'm able to do here.

      Thanks again!

      Best Regards,

      Kirk

      • You are right, of course. Sorry about the mixed-up message.
        And to complement my point:
        1- When I/Os are divided by 10, the load is much more balanced between the CPU and I/O resources. Indeed, whenever I use SPDE, I almost always set option compress=binary.
        2- The only case where this might not be beneficial is when CPU is at a premium, for example on mainframes, and where storage space is not an issue.

    • Hi Chris,

      I found your post interesting. I will definitely have to run some benchmarks. When you ran your tests did you set or modify any system options? Also, were the results comparable for all data set sizes (e.g., small, medium, large)?

      Thanks for sharing the link. I'll follow up once I've had time to run some benchmarks.

      Kirk
