With SAS Data Management, you can set up SAS Data Remediation to manage and correct data issues. SAS Data Remediation allows user- or role-based access to data exceptions.
When a data issue is discovered, it can be sent automatically or manually to a remediation queue, where it can be corrected by designated users. The issue can be fixed from within SAS Data Remediation without needing to go to the affected source system. For more efficiency, the remediation process can also be linked to a purpose-designed workflow.
Setting up a remediation process that lets you correct data issues from within SAS Data Remediation involves a few steps:
- Set up Data Management jobs to retrieve and correct data in remediation.
- Set up a Workflow to control the remediation process.
- Register the remediation service.
Set up Data Management jobs to retrieve and correct data in remediation
To correct data issues from within Data Remediation, we need two real-time data management jobs: a retrieve job that reads the record in question to populate its data in the remediation UI, and a send job that writes the corrected data back to the data source, or to a staging area first.
Retrieve and send jobs
If the following remediation fields are available in the retrieve or send job’s External Data Provider node, data will be passed to the fields. The field values can be used to identify and work with the correct record:
REM_KEY (internal field to store issues record id)
REM_USERNAME (the current remediation user)
The “retrieve” action occurs when the issue is opened in SAS Data Remediation. Data Remediation will only pass REM_ values to the data management job if the fields are present in the External Data Provider node. The REM_ values are the only way the data management job can communicate with SAS Data Remediation, but they are not all required: you can declare only the fields you need in the External Data Provider node.
The job’s output fields are displayed in the Remediation UI as edit fields for correcting the issue record. It is best to use a Field Layout node as the last job node to pass out the desired fields with appropriate labels.
Note: The retrieve job should only return one record.
A simple example of a retrieve job would be to have the issue record id coming from REM_KEY into the data management job to select the record from the source system.
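As a rough illustration, this retrieve logic can be sketched in Python (a hypothetical stand-in only; the real implementation is a Data Management real-time job, and the SOURCE table and field names here are invented for the example):

```python
# Stand-in for the source system: record id -> record fields.
# In a real job this would be a database query node, not a dict.
SOURCE = {
    "C100": {"KEY": "C100", "NAME": "Ann Smith", "DATE_OF_BIRTH": "1980-04-02"},
}

def retrieve_job(rem_fields: dict) -> dict:
    """Sketch of a retrieve job: REM_KEY identifies the issue record."""
    record = SOURCE.get(rem_fields.get("REM_KEY"), {})
    # The retrieve job should only return one record; its output
    # fields become the edit fields shown in the Remediation UI.
    return record
```

The point of the sketch is simply that REM_KEY arrives as an input field and the selected record's fields go out as the job's output.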
The “send” action occurs when pressing the “Commit Changes” button in the Data Remediation UI. All REM_ values in addition to the output fields of the retrieve job (the issue record edit fields) are passed to the send job. The job will receive values for those fields present in the External Data Provider node.
The send job can now work with the remediation record and save it to a staging area or submit it to the data source directly.
Note: Only one row will be sent as an input to the send job. Any data values returned by the send job will be ignored by Data Remediation.
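In the same spirit, the send side can be sketched like this (again a hypothetical Python stand-in; STAGING and the field names are invented for illustration):

```python
STAGING = []  # stand-in for a staging area table

def send_job(fields: dict) -> None:
    """Sketch of a send job: it receives the REM_ values plus the
    edited issue-record fields and writes one row to staging."""
    STAGING.append({
        "key": fields["REM_KEY"],
        "edited_by": fields.get("REM_USERNAME"),
        # Everything that is not a REM_ field is edited record data.
        "data": {k: v for k, v in fields.items() if not k.startswith("REM_")},
    })
    # Any values returned here would be ignored by Data Remediation.
```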
Move jobs to the Data Management Server
When both jobs are written and tested, you need to move them to the Data Management Server, into a Real-Time Data Services sub-directory, so that Data Remediation can call them.
When Data Remediation calls the jobs, it uses the credentials of the person logged on to Data Remediation. Therefore, you need to make sure that the jobs on the Data Management Server have been granted the right permissions.
Set up a Workflow to control the remediation process
Although you don’t need to involve a workflow in SAS Data Remediation, using one can improve efficiency.
You can design your own workflow using SAS Workflow Studio, or you can use one of the prepared workflows that come with Data Remediation. You need to make sure that the desired workflow is loaded onto the Workflow Server before you can link it to the Data Remediation service.
Using SAS Workflow will help you to better control Data Remediation issues.
Register the remediation service
We can now register our remediation service in SAS Data Remediation. To do so, we go to the Data Remediation Administrator and choose “Add New Client Application.”
Under Properties we supply an ID, which can be the name of the remediation service as long as it is unique, and a Display name, which is the name showing in the Remediation UI.
Next we set up the edit UI for the issue record. Under Issue User Interface we go to: Use default remediation UI…. Using Data Management Server:
The Server address is the fully qualified address for Data Management Server including the port it is listening on. For example: http://dmserver.company.com:21036.
The Real-time service to retrieve item attributes and Real-time service to send item attributes fields need to point to the retrieve and send jobs, respectively, on the Data Management Server, including the job suffix .ddf as well as any directories under Real-Time Data Services where the jobs are stored.
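For example, if the two jobs are stored in a sub-directory named remediation (a hypothetical name) under Real-Time Data Services, the entries could look like:

```
Server address:                                http://dmserver.company.com:21036
Real-time service to retrieve item attributes: remediation/retrieve_issue_record.ddf
Real-time service to send item attributes:     remediation/send_issue_record.ddf
```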
Under the tab Subject Area, we can register different subject categories for this remediation service. When calling the remediation service we can categorize different remediation issues by setting different subject areas.
Under the tab Issues Types, we can register issue categories. This enables us to categorize the different remediation issues.
At Task Templates/Select Templates you can set the workflow to be used for a particular issue type.
After saving the remediation service, it is ready to use. You can now assign data issues to the remediation service to efficiently correct the data and improve your data quality from within SAS Data Remediation.
I have implemented your solution in a project, but I ran into a problem. I have a Remediation Application with approximately 20 Issue Types, and each Issue Type has different fields. For example, issue 1 has the fields KEY and Date_Of_Birth, and issue 2 has the fields KEY and Taxcode. In the Data Remediation Administration tab, I can only specify one real-time data service, which has one specific output, let's assume KEY and Date_Of_Birth. So if I open a record of issue type 2, I can't retrieve the item attributes because Date_Of_Birth doesn't exist. Could you give me a way to work around this problem?
Thank you in advance,
You can either create a data job that accepts the fields for all issue types, even though not every field is needed for every issue type. In the job you can then check the data supplied and react accordingly.
Or you can have different data jobs for different data issues and register them separately through Data Remediation Administration, having one Client Application per Data Job.
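The first option can be sketched as follows (a hypothetical Python stand-in for a single retrieve job; the field names match the examples in the question above, and the branching logic is invented for illustration):

```python
# Union of fields across all issue types (hypothetical example:
# issue 1 uses DATE_OF_BIRTH, issue 2 uses TAXCODE).
ALL_FIELDS = ["KEY", "DATE_OF_BIRTH", "TAXCODE"]

def retrieve_all_types(supplied: dict) -> dict:
    """One retrieve job declaring the union of all issue-type fields;
    fields unused by a given issue type simply stay empty."""
    record = {name: supplied.get(name, "") for name in ALL_FIELDS}
    # React according to which optional fields were actually supplied.
    if record["DATE_OF_BIRTH"]:
        record["ISSUE"] = "birth-date issue"   # issue type 1 data present
    elif record["TAXCODE"]:
        record["ISSUE"] = "tax-code issue"     # issue type 2 data present
    return record
```

Because every registered field exists in the job's External Data Provider node, no lookup fails; the job just sees empty values for the fields that don't apply.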