Dance hall rules: data science ethics today will impact artificial intelligence tomorrow


There has been much discussion about the relationship between data science and artificial intelligence. It can become a complicated dance when applied data science is partnered with emerging artificial intelligence technologies. Who takes the lead? How do we keep the beat? Can we make sure neither party steps on the other's toes?

I like to think of data science at least partially as being an application of artificial intelligence. Academics (and to some extent even practitioners) create algorithms while data scientists cull data and apply these algorithms. As these algorithms develop more abilities to learn, machines will become more intelligent.

Learning to dance with this new partner will be a delicate balance of directing the algorithms (through informed feature selection, feedback loops, manual model parameter selection and business rule encoding) and letting them lead (through autotuning, optimization techniques and deep learning). These considerations will undoubtedly grow in importance as data science and automated decisioning expand into every corner of the organization and into our daily lives.

However, one area where we, as data scientists, most definitely need to take the lead is in developing and using ethical frameworks. Computers are getting better at simulating intelligence, but so far they lag in simulating human values.

Serious folks like Stephen Hawking, Bill Gates and Elon Musk are investing considerable time and energy in spreading the word about the perceived threat that unconstrained (i.e., without ethical frameworks) AI presents. As chief artificial intelligence practitioners, data scientists need to take the lead in setting up ethical frameworks which will keep AI on the right track.

Let’s start with a reasonable assumption: by the end of the century, machines will be a great deal more self-sufficient (indeed, self-driving cars, target-seeking military drones and smart coffee machines are already scientific FACT). Self-sufficient in this sense means self-learning and self-modifying, and eventually identifying and acting in contexts for which the machine was not originally designed. If you extrapolate this progression of self-sufficiency to a natural conclusion, it could also mean that not even the hard-coding of back doors in such systems would be enough to completely eliminate the chance of undesirable machine behaviors.

If we assume that the best-planned coding or programming might not be enough to ensure human-favorable outcomes (for now, we’ll stay away from the very tricky discussion of WHAT precisely IS a human-favorable outcome), we should now be discussing and implementing a combination of ethical frameworks and even AI behavioral training.

And this, for me, is where the real brain-stretching starts. What precisely would AI behavioral and morality training look like?

Dan Curtis proposed an eventual need for Artificial Psychology as early as 1963, but realized the current state of technology was not yet ready for it. Even today, the Wikipedia entry on Artificial Psychology ends with the sentence, "As of 2015, the level of artificial intelligence does not approach any threshold where any of the theories or principles of artificial psychology can even be tested, and therefore, artificial psychology remains a largely theoretical discipline."

But is that entirely so? Can we not already start today with programmatic approaches that take ethics into account? After that, do we need to create virtual environments where AI machines can learn, including rewards for correct moral choices?

Follow ethical guidelines

We should be applying ethical rigor to current machine learning processes. How does that apply to our daily work as data scientists? Well, Google tries a simple mantra, "Don't be evil," which echoes the older principle of "do no harm." Likewise, any business contexts that go against our own personal credos should be vetoed.

How do we as data scientists ensure that we are training ethically responsible, strategically aligned models that will generate good behaviors from our programs? Are there existing techniques for this? Do we need to develop new ones? How far can data SCIENCE take us before we need to establish and encode a data ETHOS?

These are important questions and we need to begin sensibly and concretely.

For example, when training predictive models, is it enough to simply use historical cases to build a model? Or is it up to us to consider morality? What is the model behavior we want to encourage? What types of customer or other entity behaviors do we want to reward? Do we stress the rewards of certain outcomes? This is where human rules complement machine learning when we move to deploy. And this isn’t just some abstract morality question. It’s also a question of making models behave in line with organizational strategic objectives AND ethics.
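One way to picture human rules complementing a model at deployment is a thin decision layer that sits on top of the score. The sketch below is only an illustration: the `vulnerable` flag, the 0.7 threshold, and the rule itself are hypothetical stand-ins for whatever an organization's own ethical framework would specify.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    approved: bool
    reason: str


def decide(score: float, vulnerable: bool, threshold: float = 0.7) -> Decision:
    """Combine a model score with a human-authored rule.

    Assumption: `vulnerable` is a hypothetical flag meaning the organization
    has decided this customer must never be targeted, whatever the score.
    """
    # The human rule is checked first, so the model cannot override it.
    if vulnerable:
        return Decision(False, "blocked by ethics rule: vulnerable customer")
    # Otherwise the model's score decides.
    if score >= threshold:
        return Decision(True, "model score above threshold")
    return Decision(False, "model score below threshold")


if __name__ == "__main__":
    print(decide(0.95, vulnerable=True))
    print(decide(0.95, vulnerable=False))
```

Keeping the rule outside the model makes it visible and auditable, which is arguably the point: the ethical choice is written down where a reviewer can read it rather than buried in training data.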

Outcomes management

Following this logic, an essential part of effective model management really becomes outcomes assessment: not just whether the model is accurate, but whether we are ethically in line and producing ethical outcomes.
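As a rough sketch of what assessing outcomes beyond accuracy could look like, the snippet below reports accuracy alongside one simple outcome check: the gap in positive-prediction rates between groups. The choice of metric, the group labels, and the 0.2 tolerance are assumptions made for illustration, not an established standard.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the observed outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def selection_rate_gap(y_pred, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    totals, positives = {}, {}
    for pred, group in zip(y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if pred == 1 else 0)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def assess(y_true, y_pred, groups, max_gap=0.2):
    """Report accuracy together with the outcome check.

    Assumption: `max_gap` is a hypothetical tolerance the team has agreed on.
    """
    gap = selection_rate_gap(y_pred, groups)
    return {
        "accuracy": accuracy(y_true, y_pred),
        "selection_rate_gap": gap,
        "within_tolerance": gap <= max_gap,
    }


if __name__ == "__main__":
    report = assess(
        y_true=[1, 0, 1, 0],
        y_pred=[1, 0, 1, 1],
        groups=["a", "a", "b", "b"],
    )
    print(report)
```

A model can score well on accuracy and still fail the second check, which is why the two numbers are worth reporting side by side.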

We have a realistic chance to teach desired behaviors in our predictive models. In this age of analytics sandboxes and playpens, what kind of structures can we envision? How about experimental approaches? Online testing, A/B or multi-armed bandit? Constructed cases?
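For readers unfamiliar with the bandit approach mentioned above (usually called a multi-armed bandit), here is a minimal epsilon-greedy sketch. It assumes each "arm" is a candidate model or treatment and that the reward is some outcome score the team defines, which could fold in ethical checks as well as business results; both of those are illustrative assumptions.

```python
import random


class EpsilonGreedyBandit:
    """Minimal epsilon-greedy bandit over a fixed number of arms."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how often each arm was tried
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon...
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # ...otherwise exploit the arm with the best mean reward so far.
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


if __name__ == "__main__":
    bandit = EpsilonGreedyBandit(n_arms=2, epsilon=0.1, seed=0)
    true_rates = [0.3, 0.6]  # hypothetical reward probabilities per arm
    for _ in range(1000):
        arm = bandit.select_arm()
        reward = 1.0 if bandit.rng.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    print(bandit.counts, bandit.values)
```

Unlike a fixed A/B split, the bandit shifts traffic toward whichever option is scoring better as evidence accumulates, so how the reward is defined largely determines which behaviors get reinforced.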

If a model’s purpose is to select the right customers, how do we build promotion and recognition concepts into models transparently? Surely certain algorithms do apply these concepts mathematically, which is a good thing, but is it enough? A great short book about hyperparameter tuning and other model testing considerations is Evaluating Machine Learning Models: A Beginner’s Guide to Key Concepts and Pitfalls, from O’Reilly (while not written from an ethics perspective, anyone is free to read it with ethics in mind).

These are tough issues for anyone, but especially for data scientists. Science looks to eliminate bias, but aren’t ethics and morality really a type of (necessary?) bias? Could a next (evolutionary?) step be that AI discovers better behaviors or even optimal morals? At what point should we feel safe in subjecting ourselves to such a paradigm?

I would be very interested in hearing your views on these topics.

And get in on the dance...


About Author

Andrew Pease

Principal Business Solutions Manager

After 14 years in various roles at SAS, Andrew is currently responsible for advanced analytics in the Center of Excellence. Andrew helps financial institutions, major retailers, pharmaceuticals, manufacturers, utilities and public sector to understand and use powerful analytic techniques such as decision management, predictive modelling, time-series forecasting, optimization, and text mining.


  1. Herman Bruyninckx

    You are overly concerned about the "ethical problems" of AI, because there is no need for "new" ethics.

    First of all, an AI-driven system is just as much an engineered system as any other actual system, hence its designers should apply the same ethical concerns and legislation as they are (hopefully...) applying now already.

    More in particular, independently of the amount of data that is being processed automatically, it is, and still will be, the human engineer who writes the code that turns the outcome of the AI algorithms into decisions. This code will (have to) remain as simple as it is now, and that is a realistic expectation since the number of decisions to be taken does not grow in the same way as the amount of data that has to be processed. Not at all.

    The biggest misinterpretation about "machine learning", "deep learning" or whatever the latest hype is called, is that it is "learning" anything. It isn't. The only thing that has been coming out of machine learning, and the only thing that is going to come out of it in the future, is "data reduction": the system reduces a huge amount of data into a small set of classes, and hopefully, these classifications make sense to humans.

    That also means that those engineers must continue to implement the same ethical reflexes they (hopefully...) are implementing now: don't make any decision until you're sure that it is backed up with enough interpretable data about the context. If you let your system come up with its own classifications, and you base your decision making on those classes without being sure about how to interpret the new classes, _you_ are behaving unethically, not your software.

    In summary, the only ethical danger that I see (and which is, I admit, a huge one) is that the people responsible for projects will hire AI engineers who are not knowledgeable in the application domain that their software is going to be used in, and let those people decide what decision logic to implement.

    The other ethical aspect about "big data" is privacy. And also in that domain, the current legislation is very clear (and restrictive!). The pragmatic problem, though, is that many corporations do not follow the legal regulations, and violate that privacy all the time. Again, this is not a problem with the AI software, but with the humans using it.

    • Andrew Pease

      Thanks for your insights Herman. I take your point that the 'learning' in machine learning is an oft-misunderstood misnomer and that such analytic methods ultimately equate to intelligent filtering of data. That's an important definition of terms for having a constructive dialogue here.

      However, I think you'll agree that currently, AI engineers, or data scientists, don't always apply ethical concerns as rigorously as they really should. As they create, manage and use AI software, they have both legal and ethical obligations, which in the rush to results, may sometimes grow fuzzy. Ethical frameworks should be a constant consideration, be built into systems, and above all else strive toward the goal that AI 'does no harm', even though that can be a very difficult concept to precisely and continually define.

      • Herman Bruyninckx

        > However, I think you'll agree that currently, AI engineers,
        > or data scientists, don't always apply ethical concerns
        > as rigorously as they really should.

        I fully agree with this observation. So, the resulting line of action should _not_ be to introduce ethical guidelines for AI software, but just to (re)train the AI engineers about the common sense ethics around technology that have existed for decades already 🙂
        Ethical progress in technology was, is, and probably will be, all about training people. Which is, unfortunately, not what we see in the big IT moguls like Facebook, Uber, Apple, and other companies that are commonly depicted in the common press as "industry leaders" and "role models for new companies". These evolutions show the risk of not educating our public fast and well enough to be able to think for itself, and see through the PR and financial "success stories" of technology...

        • Andrew Pease

          Agree 100% that educating the public to be informed consumers of analytics, AI, and statistics is DEFINITELY essential. Also essential is continually promoting 'data awareness' so that individuals monitor and take ownership of their data footprint.

          I'm a bit skeptical though that 'just (re)training the AI engineers around the common sense ethics around technology' will be enough. Ethical guidelines and frameworks can play a critical role in reducing the grey area between the euphoric rush to technological breakthrough, and the values and interests of society and the individual.
