Using an alternate Quality Knowledge Base (QKB) in a DataFlux Data Management Studio data job


In DataFlux Data Management Studio, the predominant component of the SAS Data Quality bundle, the data quality nodes in a data job use definitions from the SAS Quality Knowledge Base (QKB). The QKB supports over 25 languages and provides a set of pre-built rules, definitions and reference data that perform operations such as parsing, standardizing and fuzzy matching to help you cleanse your data. The QKB comes with pre-built definitions for both customer and product data, and it can be customized and extended with rules for new data types and for requirements specific to your business. (See the QKB documentation to learn more.)

Sometimes, within the same data job, you may want to use an alternate QKB installation that contains different definitions. For example, your default QKB may be the Contact Information QKB; however, in your data flow you may want to use a definition that exists in the Product Data QKB. The data quality nodes expose the BF_PATH attribute in their Advanced Properties, which lets you do this.
Note: You must have the alternate QKB data installed and be licensed for any QKB locales that you plan to use in your data job.

Here is an example data job that uses the BF_PATH Advanced property to call the Brand/Manufacturer Extraction definition from the Product Data QKB. It also calls Standardization definitions from the default QKB, Contact Information. Notice that from the data flow perspective, it is one seamless flow.


For the Extraction node that uses the Brand/Manufacturer definition from the Product QKB, the BF_PATH Advanced property contains the path where the Product Data QKB is installed.
Note: The path can be supplied through a macro variable instead of being typed in directly. You can find the path in the QKB registration information in the Administration riser bar in Data Management Studio.
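
To make the macro-variable idea concrete, here is a minimal Python sketch of the indirection involved. It is only an illustration, not how Data Management Studio resolves macros internally. The `%%NAME%%` reference syntax is assumed here, and the macro name `PRODUCT_QKB_PATH` and the path `C:\QKB\Product` are hypothetical.

```python
import re

# Hypothetical macro table. In practice the value would be defined in
# Data Management Studio's macro configuration, not in Python.
macros = {"PRODUCT_QKB_PATH": r"C:\QKB\Product"}  # hypothetical install path


def resolve(value, table):
    """Replace each %%NAME%% reference in value with its entry in table."""
    def lookup(match):
        name = match.group(1)
        if name not in table:
            raise KeyError(f"undefined macro: {name}")
        return table[name]

    # The %%NAME%% reference form is an assumption made for this sketch.
    return re.sub(r"%%(\w+)%%", lookup, value)


# The node stores the macro reference, not the literal path.
node_properties = {"BF_PATH": "%%PRODUCT_QKB_PATH%%"}
resolved = {key: resolve(val, macros) for key, val in node_properties.items()}
print(resolved["BF_PATH"])
```

The benefit of the indirection is that the data job itself never changes when the alternate QKB is moved or when the job is promoted to another environment; only the macro value does.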


Once the BF_PATH Advanced property is set, you can no longer use the user interface to select a definition. You need to know the definition name and any other relevant information for the node so that you can enter it using the appropriate Advanced properties.

In my example, I need to set the definition and token fields for the Brand/Manufacturer definition using Advanced properties: PARSE_DEF for the definition, and the corresponding Advanced property for the tokens.


Note: I found the needed information by viewing the Product Data QKB information in the Administration riser bar in Data Management Studio.


Here is the user interface for the Extraction node that is using the Brand/Manufacturer definition from the Product QKB and its data preview.

Note: The definition cannot be displayed because it is not in the active QKB. You can still select the Extraction field and Additional Output information on the user interface.


In conclusion, the BF_PATH Advanced property is useful when you want to use an alternate QKB installation in your data job. For more information on Advanced properties for nodes in DataFlux Data Management Studio, refer to the topic “Advanced Properties” in the DataFlux Data Management Studio 2.7: User’s Guide. For more information on the SAS Quality Knowledge Base (QKB), refer to its documentation. Finally, to learn more about SAS Data Quality, visit sas.com/dataquality.


About Author

Mary Kathryn Queen

Principal Technical Training Consultant

Mary Kathryn Queen is a Principal Technical Training Consultant in the Global Enablement and Learning (GEL) Team within SAS R&D's Global Technical Enablement Division. Her primary focus is on SAS Data Management technologies, particularly data quality, data preparation, and data governance.
