In Part 1 of this series, I defined data security and privacy, explaining that data privacy is the act of allowing access and use of any data – based on privacy requirements, guidelines or procedures. In Part 2, I'll focus on data security.
I see data security as the physical act of implementing the data privacy policies, procedures or guidelines, and securing the data over time. This, of course, includes reporting and a variety of auditing activities that are required by laws like the General Data Protection Regulation.
You may have experienced the continued demand for analysis of data on other platforms (Hadoop, cloud, MongoDB, etc.). This requires you to think about the flow of the data within your environment and how it is used. Each platform can experience its own types of security issues. For example, let's say you're going to stream live data from an online application onto your big data platform for immediate analysis. This data also ends up in the data warehouse further down the data flow from the application system. Besides the source system, you now have two other systems to secure – the big data platform and the data warehouse. Chances are these are all different database management systems, which may not necessarily have the same mechanisms for securing data.
Secure the data – everywhere
Securing data on your big data platform may require you to create views for access. That's not a bad way to secure access to the data. However, what really stops a process from reading the data on the big data platform and propagating it to another platform? For that matter, the same issue can arise in a relational database or any other database. This is why we have to know how, where and when the data is used, and who used it – and be able to report on it.
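To make the idea of a view concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, view and column names are hypothetical, and SQLite has no user-level permissions, so this only illustrates how a view narrows what a reader sees. On a real platform you would also grant access to the view and revoke it on the base table.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a base table and a restricted view."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical base table that includes a sensitive column (ssn).
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, name TEXT, email TEXT, ssn TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (name, email, ssn) VALUES (?, ?, ?)",
        [
            ("Ada", "ada@example.com", "000-00-0001"),
            ("Grace", "grace@example.com", "000-00-0002"),
        ],
    )
    # The view exposes only the non-sensitive columns.
    conn.execute("CREATE VIEW customers_public AS SELECT id, name FROM customers")
    conn.commit()
    return conn


def view_columns(conn, relation):
    """Return the column names a reader sees when selecting from a relation."""
    cursor = conn.execute(f"SELECT * FROM {relation} LIMIT 0")
    return [column[0] for column in cursor.description]


conn = build_demo_db()
print(view_columns(conn, "customers_public"))
```

Note that the view does nothing to stop a process from copying whatever it is allowed to read, which is why usage reporting still matters.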
There is a certain amount of automation that you can apply to auditing the data on any of these platforms. Seeing who or what process READS the data may be the most troublesome. But without the right people – who understand access patterns and have the authority to escalate issues – the process won't be as effective as you thought it would be upon implementation.
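As a rough illustration of the kind of automation that can be applied to read auditing, the sketch below summarizes a list of access events and flags readers that are not on an approved list. The event format, the dataset names and the approved-reader mapping are all assumptions made for this example. A real platform would supply its own audit log format.

```python
from collections import Counter


def summarize_reads(events, approved_readers):
    """Count reads per (actor, dataset) and flag actors that are not approved.

    events: list of dicts with hypothetical keys "actor", "action", "dataset".
    approved_readers: dict mapping a dataset name to a set of approved actors.
    """
    counts = Counter(
        (event["actor"], event["dataset"])
        for event in events
        if event["action"] == "read"
    )
    flagged = sorted(
        {
            actor
            for (actor, dataset) in counts
            if actor not in approved_readers.get(dataset, set())
        }
    )
    return counts, flagged


events = [
    {"actor": "etl_job", "action": "read", "dataset": "orders"},
    {"actor": "etl_job", "action": "read", "dataset": "orders"},
    {"actor": "unknown_script", "action": "read", "dataset": "orders"},
    {"actor": "etl_job", "action": "write", "dataset": "warehouse"},
]
counts, flagged = summarize_reads(events, {"orders": {"etl_job"}})
print(flagged)
```

A report like this only surfaces candidates. Someone who understands the access patterns still has to decide whether a flagged reader is a problem.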
With the data secure at the lowest level (the database), how do you allow people to update and secure their own data? For example, what if one of the requirements is to allow me to update my own data or decide who gets access to my data? Designing this type of data access will not be an easy task. I would suggest that you don’t allow direct updates to the enterprise database, but instead allow updates to an alternate database. This enables you to update the enterprise database later, after validation and rules have been applied to consumers' update and access requests. Another option would be to copy the data that's going to be updated to another database before allowing the update process to occur. Either way, you have to be able to report on how, where and when the data got changed, updated, accessed or copied – and who was involved in those processes.
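The first option, staging updates and applying them only after validation, might look something like the sketch below. It uses Python's built-in sqlite3 module and, for brevity, keeps the staging table, the enterprise table and the audit log in one in-memory database. In practice the alternate database would be a separate system. The table names, the allowed fields and the validation rule are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

ALLOWED_FIELDS = {"email", "phone"}  # hypothetical list of self-service fields


def init_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, phone TEXT);
        CREATE TABLE pending_updates (
            id INTEGER PRIMARY KEY, customer_id INTEGER, field TEXT,
            new_value TEXT, requested_by TEXT, status TEXT DEFAULT 'pending');
        CREATE TABLE audit_log (
            id INTEGER PRIMARY KEY, at TEXT, actor TEXT, action TEXT,
            customer_id INTEGER, field TEXT);
        """
    )
    return conn


def log_event(conn, actor, action, customer_id, field):
    """Record who did what, to which record and field, and when (UTC)."""
    conn.execute(
        "INSERT INTO audit_log (at, actor, action, customer_id, field) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), actor, action, customer_id, field),
    )


def stage_update(conn, customer_id, field, new_value, requested_by):
    """Store the requested change in the staging table, not the enterprise table."""
    conn.execute(
        "INSERT INTO pending_updates (customer_id, field, new_value, requested_by) "
        "VALUES (?, ?, ?, ?)",
        (customer_id, field, new_value, requested_by),
    )
    log_event(conn, requested_by, "staged", customer_id, field)
    conn.commit()


def is_valid(field, new_value):
    """Deliberately simple, hypothetical validation rule."""
    if field not in ALLOWED_FIELDS:
        return False
    if field == "email":
        return "@" in new_value
    return bool(new_value.strip())


def apply_pending(conn):
    """Apply valid staged changes to the enterprise table and reject the rest."""
    rows = conn.execute(
        "SELECT id, customer_id, field, new_value, requested_by "
        "FROM pending_updates WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for pending_id, customer_id, field, new_value, actor in rows:
        if is_valid(field, new_value):
            # The field name is safe to interpolate: it was checked against ALLOWED_FIELDS.
            conn.execute(
                f"UPDATE customers SET {field} = ? WHERE id = ?",
                (new_value, customer_id),
            )
            status = "applied"
        else:
            status = "rejected"
        conn.execute(
            "UPDATE pending_updates SET status = ? WHERE id = ?", (status, pending_id)
        )
        log_event(conn, actor, status, customer_id, field)
    conn.commit()
```

Every staged, applied or rejected request leaves a row in the audit log, which gives you the raw material for reporting on how, when and by whom the data was changed.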
Obviously, it's important to have a process to help govern your data – but you also need to make sure the right people are involved. Those people should understand the importance of identifying the data and data flows, as well as protecting the data. You need a team that understands today’s needs while always keeping future needs in mind. Data security and privacy will continue to be an issue in our fast-moving world. It's up to us to keep the data that has been entrusted to us as secure and private as possible.