In my last post, I spoke about the importance of dropping the reactive approach to compliance that is increasingly commonplace. In this two-part series, I want to go a step further by exploring some useful starting points. As ever, the first tentative steps of a new compliance initiative are the biggest obstacle to overcome.
Keep it simple - Start with the 4F’s
If you look at some of the recommendations from the various regulatory bodies, it’s clear that an essential requirement is for organisations to understand fully, and account for, all the data that they process. I’m sure you would agree that is a fair request. But the first challenge for many organisations is understanding what data is bound by a compliance obligation, where it can be found, and what is done with it!
I find the 4F’s useful here: Function, Flow, Form, Foster. In this post, we'll look at the first two.
Function
Compliance directives (such as GDPR and BCBS 239) are interested in understanding how well you manage specific types of data. For example, GDPR is focused on personal data. If a bank is mashing up my personal data with all kinds of non-compliant third-party social media data (so they can spam me across multiple platforms), then the local data protection regulator would like to know.
Likewise, you may have significant amounts of equipment data with critical attributes such as voltage, safe handling procedures, maintenance periods, etc.
Each type of data has different compliance obligations against it. So the functions that take place on your data are important to know – and these functions aren't always obvious from merely examining a data set.
For example, in the past, I’ve mined access logs, reviewed user account permissions and used a range of different sniffing tactics to see who has been CRUD’ing (creating, reading, updating or deleting) a data set. That's because it’s not always obvious who is utilising a data set for a particular function.
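To make the log-mining idea concrete, here is a minimal sketch in Python. It assumes the audit records have already been parsed into (user, operation, data set) tuples; the record layout, the sample entries and the `crud_matrix` helper are all hypothetical and are not taken from any particular tool.

```python
from collections import defaultdict

# Hypothetical audit records: (user, operation, data set). In practice these
# would be parsed from database audit logs or application access logs.
AUDIT_LOG = [
    ("alice", "SELECT", "customers"),
    ("bob", "UPDATE", "customers"),
    ("alice", "INSERT", "customers"),
    ("carol", "SELECT", "equipment"),
    ("bob", "DELETE", "customers"),
    ("alice", "SELECT", "customers"),
]

# Map SQL-style verbs onto CRUD letters (an assumed log vocabulary).
CRUD = {"INSERT": "C", "SELECT": "R", "UPDATE": "U", "DELETE": "D"}


def crud_matrix(records, dataset):
    """Return {user: CRUD letters} for everyone who touched one data set."""
    usage = defaultdict(set)
    for user, operation, target in records:
        if target == dataset and operation in CRUD:
            usage[user].add(CRUD[operation])
    # Present the letters in the conventional C-R-U-D order.
    return {
        user: "".join(sorted(ops, key="CRUD".index))
        for user, ops in usage.items()
    }


print(crud_matrix(AUDIT_LOG, "customers"))
```

Even a rough summary like this can show that a data set has more writers, or more readers, than its documented function suggests.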
Flow
What is the lineage and provenance of your data? Who supplied it? Where does it go?
Data compliance requires a detailed understanding of the flow of your data. And as you can probably guess, flow and function are tightly coupled.
Using data discovery tools, and some of the techniques mentioned in the function section above, I’ve been able to find a lot of uncharted routes for data in the past. (If you want to see the full extent of the problem, just see what comes out of a "data source amnesty" exercise when you’re performing a data migration. Prepare to be shocked!)
So, as you’ve no doubt experienced in your organisation, data flows between applications, departments, partners and clients. It is these flows and data junctions that need to be understood, because every uncharted lineage path pushes your compliance risk meter into the red.
You can bet that every laptop stolen from an employee that contained a spreadsheet with customer details was not on any approved data flow path.
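One way to picture the gap between approved and actual flows is to treat each flow as a (source, destination) pair and compare the two sets. The sketch below is illustrative only: the system names, both flow lists and the helper functions are assumptions, and a real exercise would populate the observed flows from discovery tooling or log analysis.

```python
# Hypothetical approved data flows, as (source, destination) pairs.
APPROVED_FLOWS = {
    ("crm", "billing"),
    ("crm", "marketing"),
    ("billing", "data_warehouse"),
}

# Hypothetical flows observed through discovery tooling or log analysis.
OBSERVED_FLOWS = {
    ("crm", "billing"),
    ("crm", "employee_laptop"),
    ("billing", "data_warehouse"),
    ("data_warehouse", "partner_sftp"),
}


def uncharted_flows(observed, approved):
    """Flows seen in practice that are not on any approved path."""
    return sorted(observed - approved)


def downstream(flows, start):
    """Every system reachable from `start` by following the flows."""
    reached, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for src, dst in flows:
            if src == node and dst not in reached:
                reached.add(dst)
                frontier.append(dst)
    return reached


for src, dst in uncharted_flows(OBSERVED_FLOWS, APPROVED_FLOWS):
    print(f"uncharted: {src} -> {dst}")

print(sorted(downstream(OBSERVED_FLOWS, "crm")))
```

The `downstream` helper is there because lineage is transitive: data that leaves the CRM for billing can end up on a partner server two hops later, and that final destination is what a regulator will ask about.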
In the next post, I'll close out the 4F framework with the remaining two activities that underpin the data compliance activity. Please share your views on what I've written so far. What else would you add to a kick-off itinerary for compliance?