How to use Suggestions in SAS® Data Studio

Got SAS® Data Preparation? With the release of SAS® Viya® 3.5, you now have a Suggestions feature in SAS® Data Studio.

The Suggestions feature uses machine learning models to analyze your data and suggest transforms based on the type of data found in your data set. You can add the suggested transforms to your SAS Data Studio plan to address data quality issues in your data set. (Note: For best results with the new Suggestions feature, you need the SAS® Quality Knowledge Base for Contact Information 31 or later registered and set as the default in SAS Viya.)

Register models for use in the Suggestions feature

Before you begin using the Suggestions feature in SAS Data Studio, you must first register the models used by the service. A default Models caslib is provided for your default CAS server during the installation process. To register the Suggestions models, open SAS Data Studio and select an input data set.

Select Register Models to register the supplied suggestion models. Registration may take a few minutes. After the models have been registered, they are listed in this window.

Review the list of registered models. Note: Currently, you cannot add custom models to this list for Suggestions analysis. Select Close to close the window.

Working with Suggestions

The Display option sets the maximum number of suggestions to show. The default is 100. You can also choose to show suggestions for Columns and table, Columns only, or Table only. The default is Columns and table. Finally, you can select the columns to analyze for suggestions. You can select all or a subset of the table’s columns.

Once you have made your selections, select OK to save the settings.
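To make the effect of these settings concrete, here is a minimal Python sketch of how a display limit, a scope choice, and a column selection could narrow a list of suggestions. It is only an illustration of the settings described above, not SAS Data Studio's implementation or API. The `Suggestion` class, the `apply_settings` function, the scope strings, and the sample transform names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Suggestion:
    """Hypothetical stand-in for one suggested transform."""
    transform: str
    column: Optional[str]  # None means a table-level suggestion


def apply_settings(
    suggestions: Sequence[Suggestion],
    max_display: int = 100,              # mirrors the Display default of 100
    scope: str = "columns_and_table",    # or "columns_only" / "table_only"
    columns: Optional[Sequence[str]] = None,  # None means all columns
) -> list:
    """Filter suggestions by scope and column selection, then cap the count."""
    kept = []
    for s in suggestions:
        is_table_level = s.column is None
        if is_table_level and scope == "columns_only":
            continue
        if not is_table_level and scope == "table_only":
            continue
        if not is_table_level and columns is not None and s.column not in columns:
            continue
        kept.append(s)
    return kept[:max_display]


# Illustrative data only; the transform names are made up for this sketch.
sample = [
    Suggestion("Standardize", "Name"),
    Suggestion("Standardize", "State"),
    Suggestion("Remove duplicates", None),
]
columns_only = apply_settings(sample, scope="columns_only", columns=["State"])
```

With the defaults, all three sample suggestions are kept. Restricting the scope to columns only and the column selection to State leaves a single suggestion.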

Select Get Suggestions to begin

Select Get Suggestions to start the analysis. The contents of the selected columns are analyzed using the models, and a list of suggested transforms is provided based on your settings. This is especially useful for data sets that have generic column names (e.g., Var1, Var2, etc.) where you are unsure what kind of data is in the fields.

You can review the transforms added by selecting the step in the Plan pane.

Notice that the suggestion for standardizing the Name field is using the Name definition from the QKB with the default locale. In my case that is the English – United States locale.

The suggestion for standardizing the State field is using the State/Province (Full Name) definition from the QKB with the default locale.

I changed this step to use the State/Province (Abbreviation) definition instead.
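As a rough illustration of what switching from a full-name to an abbreviation standardization means for the data, the Python sketch below maps state names to two-letter codes. This is not how the QKB works: QKB definitions are far richer and locale-aware. The tiny lookup table and the `standardize_state` function are hypothetical and exist only for this example.

```python
# Hypothetical, deliberately tiny lookup table used only for this sketch.
STATE_ABBREVIATIONS = {
    "north carolina": "NC",
    "new york": "NY",
    "california": "CA",
    "texas": "TX",
}


def standardize_state(value: str) -> str:
    """Return a two-letter abbreviation when the value is recognized.

    Unrecognized values are returned unchanged.
    """
    # Collapse repeated whitespace and ignore case before the lookup.
    key = " ".join(value.split()).lower()
    if key in STATE_ABBREVIATIONS:
        return STATE_ABBREVIATIONS[key]
    # Values that are already a known abbreviation are just upper-cased.
    if key.upper() in STATE_ABBREVIATIONS.values():
        return key.upper()
    return value


cleaned = [standardize_state(v) for v in ["North Carolina", "new  york", "tx", "Ontario"]]
```

The point of the sketch is that differently formatted inputs for the same state end up with one consistent value, which is the effect the abbreviation definition has on the State field.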

You can select Get Suggestions again to refresh the list and remove the already run suggestions.

To Learn More | Visit SAS Documentation

About Author

Mary Kathryn Queen

Principal Technical Training Consultant

Mary Kathryn Queen is a Principal Technical Training Consultant in the Global Enablement and Learning (GEL) Team within SAS R&D's Global Technical Enablement Division. Her primary focus is on SAS Data Management technologies, particularly data quality, data preparation, and data governance.
