There are five short, simple, quick rules to help you take advantage of indexes. An index lets you retrieve information from SAS data sets or tables. And according to Kirk Paul Lafler, if you keep these rules in mind, your information retrieval will be more efficient.
Lafler is the Senior Consultant, Application Developer, Trainer and Author at Software Intelligence Corporation. He calls his best practices ‘rules’ because “they keep certain things – simple things - in mind. They make it a lot easier to create solid, workable indexes – indexes that optimize your work.”
During his Midwest SAS Users Groups conference presentation, Lafler began by reviewing indexes. He explained that an index is triggered by WHERE clause processing. And subset creation is the purpose of WHERE clause creation and partly the purpose of the index – “to speed up processing to get your individual rows of data - your subsets.”
According to Lafler, you must always keep this process in mind because as you begin to decide on the columns you want when creating the index, the subset should be the smallest subset possible.
Keep these rules in mind:
- If the table is small, sequential processing may be just as fast, or faster, than processing with an index.
- Avoid creating more indexes than are absolutely necessary.
- If the page count, as displayed in the CONTENTS procedure, is less than 3 pages, an index may provide little or no value.
- If the data subset for the index is large, sequential access may be more efficient than using the index.
- If the percentage of matches is 15% or less (15% rule) of the overall population then an index may be beneficial.
- An index that provides little or no value should be removed because it uses resources. The costs associated with maintaining an index can outweigh its performance value, because an index is updated each time the rows in a table are added, deleted, or modified.
Remember that there is nothing wrong with dropping an index. “There is always a cost associated with maintaining an index,” said Lafler. “CPU costs go up, storage costs go up. So only use an index when you really need it and choose your columns well.”