Data Mining 101 Q&A


My colleagues Tapan Patel, Wayne Thompson & Chris Stephens hosted over 550 live attendees on June 16th for our Data Mining 101 live webinar, part 3 of the Applying Business Analytics Webinar Series. Folks joined from all over North America as well as 32 other countries around the world. Since many people had similar questions during the webinar, I thought it was worth sharing some of the Q&A with you! Enjoy the read:

Q: What is the rate of false hits in using this kind of discovery tool, e.g. in detecting criminal activity?
A: Unfortunately, there is no easy answer because the rate depends on several factors. These include the modeling techniques used in SAS Enterprise Miner and any data preparation applied (imputation, transformations, etc.). It is also very data dependent.

Q: How can we use scoring for collection agencies?
A: Scoring is just the process of applying the model to data from new customers. You could build a model to predict which accounts are most likely to be "un-collectable" and then apply that model to new accounts to decide how much effort to put in.
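
To make the idea of scoring concrete, here is a minimal Python sketch. It is not SAS Enterprise Miner score code: the coefficients, the input names (days_past_due, balance_thousands) and the account records are all hypothetical, and it simply assumes a logistic regression model has already been fitted elsewhere.

```python
import math

# Hypothetical coefficients from an already-fitted logistic regression model.
INTERCEPT = -2.0
COEFFICIENTS = {"days_past_due": 0.03, "balance_thousands": 0.1}


def score_account(account):
    """Return the predicted probability that an account is uncollectable."""
    linear = INTERCEPT + sum(
        COEFFICIENTS[name] * account[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear))


# New accounts that were not part of the data used to build the model.
new_accounts = [
    {"id": "A1", "days_past_due": 10, "balance_thousands": 1.0},
    {"id": "A2", "days_past_due": 120, "balance_thousands": 5.0},
]

# Sort from lowest to highest predicted risk.
scored = sorted(
    ((acct["id"], score_account(acct)) for acct in new_accounts),
    key=lambda pair: pair[1],
)

for account_id, probability in scored:
    print(f"{account_id}: {probability:.3f}")
```

A collection agency could then decide how much effort each account deserves based on where it falls in that ranking.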

Q: Can we achieve the goals for credit scoring using Base SAS instead of SAS Enterprise Miner?
A: Yes, it's possible if you write the SAS code yourself. However, you will not get some of the automation benefits of SAS Enterprise Miner, especially the automated generation of score code, which greatly facilitates putting the model into a production environment.

Q: Is optimized binning possible in Base SAS? I believe it is just possible in SAS Enterprise Miner.
A: That's correct. It's only available in SAS Enterprise Miner.

Q: Does the data have to be complete for logistic regression? What if we have a lot of missing values?
A: You do need to account for missing values. SAS Enterprise Miner provides tools for imputing missing values as part of the data mining process flow.
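
As a rough illustration of what imputation means, the Python sketch below fills missing values with the mean of the observed values and keeps a flag recording which values were filled in. Mean imputation is just one common choice and is used here as an assumption; the function name and the income data are hypothetical, and this is not a description of how SAS Enterprise Miner implements it.

```python
from statistics import mean


def impute_mean(values):
    """Replace None with the mean of the observed values.

    Returns the imputed list and a 0/1 flag list marking imputed positions,
    so a model can still learn from the fact that a value was missing.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    imputed = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return imputed, was_missing


# Hypothetical input column with two missing entries.
income = [40.0, None, 60.0, None, 50.0]
imputed_income, income_missing_flag = impute_mean(income)

print(imputed_income)
print(income_missing_flag)
```

Without a step like this, many regression procedures simply drop any row that has a missing input, which can discard a large share of the data.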

Q: How would one incorporate a model built to run via SAS Enterprise Guide into this software?
A: You can use the Import Model Node. It can import and assess a model that was not created by one of the SAS Enterprise Miner modeling nodes.

Q: Would the assessments be model performance recommendations or model design recommendations?
A: The assessment statistics are based on the model created in Base SAS, SAS Enterprise Guide, etc. It will calculate Lift, ROC, Gini, etc. It does not recommend any alternate designs.
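
For readers unfamiliar with these statistics, the following Python sketch shows one way they can be computed from actual outcomes and model scores. The labels and scores are made-up illustration data, the function names are hypothetical, and the Gini coefficient is taken as 2 x AUC - 1, a common convention that is assumed here rather than taken from the webinar.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: the share of (positive, negative) pairs
    in which the positive case has the higher score (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def gini(labels, scores):
    """Gini coefficient derived from the AUC."""
    return 2.0 * roc_auc(labels, scores) - 1.0


def lift_at(labels, scores, fraction):
    """Event rate in the top-scored fraction divided by the overall rate."""
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no events")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(round(len(ranked) * fraction)))
    top_rate = sum(y for _, y in ranked[:top_n]) / top_n
    return top_rate / overall_rate


# Made-up outcomes (1 = event) and model scores for ten cases.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.5, 0.05]

print(f"AUC:  {roc_auc(labels, scores):.3f}")
print(f"Gini: {gini(labels, scores):.3f}")
print(f"Lift in top 20%: {lift_at(labels, scores, 0.2):.3f}")
```

These numbers describe how well an existing model ranks cases; they say nothing about which alternative model design might do better.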

Q: Will the SAS add-in work for Excel 2010 and allow for interface with Power Pivot?
A: The SAS add-in does support Excel 2010. I'm not familiar with Power Pivot, but I don't believe that SAS provides any specific integration with that feature.

Q: Can models be stored that are created by hand (SAS code) rather than by GUI?
A: Yes.

Q: Is it easy to import data? Can you show examples of importing data from different file types?
A: Yes. There is a file import wizard that enables you to import data from various sources. However, I don't believe that Wayne will be showing that today.

Q: Is there any source (maybe your website) that could show me how robust SAS capabilities are?
A: There are a number of sources. To ensure we best address what you mean by robustness, I suggest you contact us directly. You might also check out our support site.

Q: I noticed that several of the input variables did not have a normal distribution. Do you need to use the Transformation node to normalize inputs before using in models? How do you treat interval, nominal and ordinal data differently in transformation proc?
A: You don't have to, but you should. The best practice would be to normalize the inputs. We have several different transformations specific to interval and non-interval models. Most of the time, the transformed input variables make for a more robust model.
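
As a small illustration of why a transformation can help, the Python sketch below measures the skewness of a right-skewed input before and after a log transform. The log transform is only one possible choice and is an assumption made for this example; the data and function name are hypothetical, and this is not the code the Transformation node runs.

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for symmetric data)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("skewness is undefined for constant data")
    return mean([((v - mu) / sigma) ** 3 for v in values])


# Hypothetical right-skewed interval input, e.g. purchase amounts.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]

# log1p computes log(1 + x), which stays defined when a value is zero.
transformed = [math.log1p(v) for v in raw]

skew_before = skewness(raw)
skew_after = skewness(transformed)

print(f"skewness before: {skew_before:.2f}")
print(f"skewness after:  {skew_after:.2f}")
```

The transformed values are still skewed, but much less so, which tends to reduce the influence of a few extreme cases on the fitted model.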

Q: Is SAS Model Manager sold separately from SAS Enterprise Miner or is it bundled?
A: You can buy them separately or in a bundle. SAS Enterprise Model Manager is the name of the bundle.

Q: What is the storage space consumed when creating the different models?
A: The storage space required in SAS Model Manager is minimal since you are storing the model logic and not the data associated with the model.

Q: Can these do Bayesian data mining?
A: SAS Enterprise Miner does not provide built-in capabilities for Bayesian data mining. However, SAS does provide capabilities for Bayesian analyses through the SAS/STAT module.

So, keep them coming! If you watch any of the Analytics On Demand webinars and have some questions that you need answers to, please ask away! We hope to see you next month on July 21st at Forecasting 101!


About Author

Kristine Vick

Principal Marketing Specialist

Kristine is an energetic, innovative, results-focused marketing practitioner. She strives to share great analytical stories and successes. Kristine helps others see the big picture while taking care of details and thinking of creative ways to get more done!
