Analytical Identification of Hidden Risks

January 28th was international Data Protection Day, and it (nearly) seems as if the tougher EU regulation will go into effect the week after. At least that’s the impression you get when you talk to data protection experts or catch sight of their efforts to cope with the EU’s new General Data Protection Regulation (GDPR). What needs to be protected? All of the personal information you’re aware of, of course. But more importantly, the stuff you’re not aware of.

Grandpa was a bank clerk. Grandpa also had a bulletproof safe at home, where he kept all of his personal information safe and sound. He inventoried and documented each piece of information he felt was important, filing it away for future reference. A task that was straightforward and thus enjoyable. Back then, the telephone was screwed into place in the hallway, the scoring engine was called “local council,” and master data was stored on microfilm, as an aperture card in the steel cabinet.

“Dodging” fines with more of the same?
As a data protection officer in Germany, you, too, have registers of processing operations in 2017 – subject to lots of red tape, based on the Federal Data Protection Act (BDSG) of the last century. That’s all very well, but the cosy times will end in May of next year. Come to think of it, they’re already over.
Big-data analytics and fast-paced digitalization are a reality. So the idea of clearly explaining those data-siphoning algorithms in writing, updating each and every source field listed, and then being able to establish proof of all the appropriate data protection measures with documentation seems a bit too ambitious.

How can an analytical search tool help?

There’s not enough time to search through, check, and recheck thousands of fields. An automatic tool needs to “skim through” them instead: using rules, it identifies any patterns in the text that contain “something personal” and simply keeps count of them.

It doesn’t have to be perfect or all-encompassing, just transparent and expandable. Such a scan works like a virus scan across all disks: suspicious databases are selected and one sample is taken from each table, weighed, and measured. For instance: “What percentage of field X looks like an e-mail address, an IBAN, or an ID number?”
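To make the “weigh and measure” step concrete, here is a minimal sketch in Python. It is an illustration only, not the method of any particular product: the rule set `PATTERNS`, the helper names `share_matching` and `scan_sample`, and the sample data are all hypothetical, and the regular expressions are deliberately simple (they check shape, not validity, so no IBAN checksum is verified).

```python
import re

# Hypothetical, deliberately simple rules. They only test the shape of a value;
# a real scanner would add validation (e.g. the IBAN checksum) and more rules.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,4})?"),
}


def share_matching(values, pattern):
    """Return the fraction of non-empty sampled values that fully match the pattern."""
    cleaned = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not cleaned:
        return 0.0
    hits = sum(1 for v in cleaned if pattern.fullmatch(v))
    return hits / len(cleaned)


def scan_sample(sample):
    """Map each field name to the share of its sampled values matching each rule."""
    return {
        field: {rule: share_matching(values, pat) for rule, pat in PATTERNS.items()}
        for field, values in sample.items()
    }


# Made-up sample: a few values drawn from each field of one table.
sample = {
    "contact": ["anna@example.com", "bob@example.org", "n/a", "carl@example.net"],
    "account": ["DE89 3704 0044 0532 0130 00", "GB82WEST12345698765432", "", None],
    "comment": ["call back Monday", "invoice sent"],
}

report = scan_sample(sample)
for field, shares in report.items():
    print(field, shares)
```

Because the rules live in a plain dictionary, the scan stays transparent and expandable: adding a rule for an ID number format would mean adding one more entry, under the assumption that the format can be described by a pattern at all.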

The result is a list for evaluating the risk. It can be distributed as a report to the relevant staff, stapled to the register of processing activities, or used initially as an inventory for estimating the amount of work this tedious GDPR will bring in 2017. The “personal data sniffer” is accompanied by data flow and metadata analyses, including text mining algorithms or artificial intelligence.

This promotes legal certainty, particularly for data researchers engaged in “profiling.”

Too soon for a summary? Or is it already “high time”?

“The millennium was better!” insists the senior consultant behind me at the gate. He probably means the expensive days, when tracking down two-digit numbers in COBOL programs cost 200 Deutschmarks – per hour! This blog, on the other hand, appears on the website of a software provider. And the message is plain and simple: There’s an app for that! Analytical tools sniff out various bits of personal information in your extensive volumes of data. That you’re not aware of. Which you’re therefore not protecting. But need to. And should, as eventually Murphy (of Murphy’s law) or Axel (Springer) or your boss will come up with the idea on their own.

Also watch the video “The new EU Data Protection Regulation - What does it mean?” by Casper Pedersen.
