This week's SAS tip is from Barry de Ville and his book Decision Trees for Business Intelligence and Data Mining: Using SAS Enterprise Miner. Barry is a technical and analytical consultant at SAS. To learn more about Barry and the forthcoming new edition of his book, visit his author page after reading this week's excerpt.
The following excerpt is from SAS Press author Barry de Ville and his book "Decision Trees for Business Intelligence and Data Mining: Using SAS Enterprise Miner." Copyright © 2006, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. (Please note that results may vary depending on your version of SAS software.)
The Basics of Decision Trees
The goal of this section is to provide a comprehensive and detailed overview of the process of growing a decision tree. Many of the most common decision tree options and approaches are covered. These options and approaches have their roots in the original AID algorithm, as well as successor algorithms, such as CHAID, ID3, and CRT. The decision tree component of SAS Enterprise Miner incorporates and extends these options and approaches. It includes the popular features of CHAID and CRT and incorporates the decision tree algorithm refinements of the machine learning community (including the methods developed by Quinlan in ID3 and its successors).
The SAS Enterprise Miner decision tree supports both interactive (manual) and automatic growth approaches. Adjustable defaults are provided in both interactive and automatic approaches to help identify the best decision tree models for the analyst’s purpose.
The decision tree growing process can be broken down into a number of subprocesses, as shown in Figure 3.1.
These steps are performed in sequence as each layer of branches (or level) of the decision tree is developed. The decision tree growing process (steps 4 and 5) is iterative: once the steps have been applied to the main set of data, which forms the root node of the decision tree, they can be reapplied recursively to any descendants of the root node.
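The recursive idea described above can be illustrated with a small Python sketch. This is not the SAS Enterprise Miner implementation. It assumes a CRT-style binary split on Gini impurity purely for illustration, and the names `gini`, `best_split`, `grow`, and `max_depth` are hypothetical. The first call to `grow` receives the full data set (the root node), and each resulting branch is passed back into the same function.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send data down both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best


def grow(rows, labels, depth=0, max_depth=3):
    """Grow a tree recursively; the first call holds the root node's data."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        # No useful split (or depth limit reached): this node becomes a leaf.
        return {"leaf": Counter(labels).most_common(1)[0][0], "n": len(labels)}
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    # Reapply the same steps to each descendant of the current node.
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


# Toy data: one numeric input and a two-class target.
tree = grow([[1], [2], [3], [4]], ["a", "a", "b", "b"])
```

On the toy data the root is split once on the single input, and both descendants are already pure, so the recursion stops at the second level. Algorithms such as CHAID use multiway splits and significance tests rather than Gini impurity; the recursion itself follows the same pattern.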
Step 6 (complete the form and content of the final decision tree) is subject to both formal and informal shaping methods. These methods are often used to terminate tree construction before the mechanical components of the tree-growing algorithms would otherwise stop.
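As a rough illustration of what a formal shaping method can look like, the sketch below checks a few commonly used stopping rules before a node is split further. The rule names, default values, and the `should_stop` helper are assumptions made for this example, not settings taken from SAS Enterprise Miner or from the book. Informal shaping, such as an analyst deciding by hand that a branch is not useful, has no equivalent in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoppingRules:
    """Hypothetical limits an analyst might set before growing a tree."""
    max_depth: int = 6        # deepest level allowed
    min_node_size: int = 20   # fewest observations a node may hold and still split
    min_gain: float = 0.01    # smallest improvement worth a further split


def should_stop(depth, node_size, best_gain, rules=StoppingRules()):
    """Return the name of the rule that halts growth, or None to keep splitting."""
    if depth >= rules.max_depth:
        return "max_depth"
    if node_size < rules.min_node_size:
        return "min_node_size"
    if best_gain < rules.min_gain:
        return "min_gain"
    return None


# A node at depth 2 with 100 observations and a worthwhile split keeps growing.
decision = should_stop(depth=2, node_size=100, best_gain=0.5)
```

Rules like these halt growth even when the splitting search could still find further partitions, which is the sense in which tree construction ends before the algorithm runs out of splits to try.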