GDPR – sounding the death knell for self-learning algorithms?

In just a few short months the European General Data Protection Regulation becomes enforceable. This regulation enshrines in law the rights of EU citizens to have their personal data treated in accordance with their wishes. The regulation applies to any organisation which is processing EU citizens’ data, and the UK government has indicated that, irrespective of Brexit, it will implement the regulation in full.

Businesses need a clear view of how customer data is used

The GDPR is not just about where personal data is stored and the ability to opt out of spam email. Article 15 of the regulation specifically mentions the right of individuals to obtain meaningful information about the logic involved in certain automated decisions concerning them, as well as the significance and the envisaged consequences of such processing for that individual.

Furthermore, Article 22 establishes the right of individuals to not be subject to an automated decision-making process where those decisions have “a legal effect” or “a similar, significant effect” on the individual.

Therefore, for certain types of automated decision-making, organisations will need to be able to:

Provide clear information on how analytical processing is applied at an individual level.
Ensure that certain individuals can object to or opt out of that processing.

The scope of this provision is currently unclear. The data protection authorities will be providing guidance on which types of organisations and which types of automated decision-making fall within the scope of this article. In any case, even if the stricter rules of Article 22 do not apply, all automated decision-making is now regulated by the general provisions of the GDPR; this includes the general right of individuals to information about the processing of their personal data.

For those organisations that provide credit (e.g. mortgages, personal loans, credit cards) this is nothing new; there are already regulations in place to prevent discrimination and enforce clarity in data usage. The impact may be greater on other organisations, for example in the healthcare sector, where more personalised medicine requires profiling based on personal data.

A "black box" problem

These regulations have profound implications for data-driven organisations. Clearly, automated decisions need to be made within a structured framework so that HOW each decision is made can be understood. The analytical model needs to be sufficiently interpretable to allow an explanation of WHY a decision was made about an individual, and WHAT the implications were for that individual. Organisations will also need to be able to explain what data was used to reach that decision and, in the case of important decisions, whether or not the overall decision-making process is properly controlled.
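One way to picture an interpretable model of this kind is a simple points-based scorecard, where each input contributes a visible number of points to the final score. The sketch below is illustrative only: the feature names, weights, base score and threshold are invented for the example and are not taken from any real credit model or SAS product.

```python
# Hypothetical scorecard: every name and number here is an assumption
# made up for illustration.
WEIGHTS = {
    "years_at_address": 5,       # points per year
    "missed_payments": -20,      # points per missed payment
    "has_current_account": 15,   # points if 1, none if 0
}
BASE_SCORE = 50
THRESHOLD = 60


def score_with_explanation(applicant):
    """Return the decision together with each input's contribution to it."""
    contributions = {
        name: weight * applicant[name] for name, weight in WEIGHTS.items()
    }
    total = BASE_SCORE + sum(contributions.values())
    return {
        "score": total,
        "approved": total >= THRESHOLD,
        "contributions": contributions,
    }


applicant = {"years_at_address": 3, "missed_payments": 1, "has_current_account": 1}
result = score_with_explanation(applicant)

# List the contributions from most negative to most positive, so the
# factors that counted against the applicant appear first.
for name, points in sorted(result["contributions"].items(), key=lambda kv: kv[1]):
    print(f"{name}: {points:+d}")
print("score:", result["score"], "approved:", result["approved"])
```

Because every contribution is recorded alongside the outcome, the same structure answers both which data was used and why the decision went the way it did.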

The issue is more complex where “black box” systems are deployed. These are systems in which it is not apparent how the data is being used; self-learning algorithms are one example (a self-learning algorithm is one which adjusts its own parameters on the basis of new data and feedback with no human intervention).

Another issue with black box systems is that they may inadvertently become discriminatory. For example, if a group of postcodes is used as a factor in an automated decision-making algorithm, this may divide groups along ethnic lines, whether intentionally or not. A transparent “white box” approach would include a review process to ensure that this type of issue does not occur.
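As a rough sketch of what one step in such a review process might look like, the code below compares approval rates between groups in a set of past decisions. It assumes the reviewer has a group label for each decision (here just "A" and "B") and uses invented sample data; the function names are hypothetical, and a large gap is only a prompt for further investigation, not proof of discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Compute the share of approved decisions per group.

    `decisions` is an iterable of (group_label, was_approved) pairs.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest approval rate to the highest (1.0 means equal)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved in any group, so rates are equal
    return min(rates.values()) / highest


# Invented sample data for illustration only.
sample = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = approval_rates(sample)
ratio = rate_ratio(rates)
print(rates, round(ratio, 2))
```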

More transparent, sophisticated analytical solutions, which SAS has been offering to its customers for many years in areas such as credit-scoring and customer loyalty programs, do not face these challenges.

Best approach checklist

The best approach is to ensure the following are true of your decision making environment:

It's clear which data has been used to make a decision on each individual.
Analytical models are open and interpretable.
The process of deploying models into production is clear, so that which model is in use at any one time is well understood – this implies clear version control on analytical models and a careful testing process before deployment.
A history of decisions and how they have been made is available through an audit trail.
Decisions are consistent across channels (for example web, email and SMS all provide the same offer).

All of the above are perfectly achievable and are standard practice in many organisations. It’s important not to lose sight of the fact that an analytical approach has been proven to significantly increase revenues; moving away from analytics because of GDPR would be a self-destructive overreaction. There are also significant benefits to a controlled model development process, including reducing the time to deploy models and increasing the productivity of analysts.
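To make the audit trail and version control items in the checklist more concrete, here is a minimal sketch of a decision record that stores the model version, the inputs and the outcome together. The field names, the in-memory list used as a log and the example values are all assumptions for illustration; a production system would use durable, access-controlled storage.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One automated decision, with enough context to explain it later."""
    customer_id: str
    model_name: str
    model_version: str   # ties the decision to the exact model deployed
    inputs: dict         # the data actually used for this decision
    outcome: str
    channel: str         # e.g. "web", "email", "sms"
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_record(log_lines, record):
    """Append the record to the log as one JSON line."""
    log_lines.append(json.dumps(asdict(record), sort_keys=True))


def decisions_for(log_lines, customer_id):
    """Return every logged decision for one customer, oldest first."""
    parsed = (json.loads(line) for line in log_lines)
    return [rec for rec in parsed if rec["customer_id"] == customer_id]


# Hypothetical usage with invented values.
audit_log = []
append_record(audit_log, DecisionRecord(
    "c-001", "loyalty_offer", "1.4.0", {"tenure_months": 18}, "offer_a", "web"))
append_record(audit_log, DecisionRecord(
    "c-002", "loyalty_offer", "1.4.0", {"tenure_months": 2}, "no_offer", "email"))

history = decisions_for(audit_log, "c-001")
print(history[0]["model_version"], history[0]["outcome"])
```

Recording the channel alongside each decision also makes it possible to check that the same customer was given the same offer across web, email and SMS.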

The highest risks to business in this new world come from uncontrolled software development and deployment processes which may introduce inadvertent effects upon consumers.

As the risks of non-compliance include fines of up to four per cent of annual global turnover or €20 million (whichever is the greater), plus the associated reputational damage, it’s clear that the control and clarity of an analytical modelling process are of greater value than ever.
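The arithmetic behind that upper limit can be sketched in a few lines. The function name and the example figures below are invented for illustration, and the result is only the ceiling: the actual amount of any fine is decided by the regulator case by case and may be far lower.

```python
FLAT_CAP_EUR = 20_000_000  # the €20 million figure quoted above


def max_fine_eur(annual_turnover_eur):
    """Upper limit of a fine: the greater of 4% of the annual figure or €20M.

    Integer arithmetic (whole euros) avoids floating-point rounding.
    """
    four_per_cent = annual_turnover_eur * 4 // 100
    return max(four_per_cent, FLAT_CAP_EUR)


large_company = max_fine_eur(1_000_000_000)  # 4% is 40M, above the flat cap
small_company = max_fine_eur(100_000_000)    # 4% is 4M, so the flat cap applies
print(large_company, small_company)
```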

Find out more about GDPR and how SAS can help.

This post was co-authored by Iain Brown.
